(syllabus and calendar)

Ch. 10. Using I/O
pp. 365-106

week10-io-2014-fall.zip

 

Session 10


quiz


If there is time remaining, we can look at extras

Agenda and Evaluation Forms

http://www.write-technical.com/126581/session10/

I/O - (input/output) overview

When we first learned how to read, we started with our 'A', 'B', 'C's, the names of the individual letters of the alphabet (bytes, chars). Soon, we read individual words in isolation, then sentences containing multiple words, then paragraphs, then entire articles in the Sports section. To read an entire article, we still have to deal with its paragraphs. Consider a newspaper article to be a file, and a paragraph to be the chunk of text we absorb as a page (or screen). Our minds can hold a paragraph of information in short-term memory, which we might consider to be a buffer. Just as we process a two-page article paragraph by paragraph, so Java processes a file by using a buffer.


Review of Scanner, lightweight IO

java.util.Scanner

http://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html

Java 5 introduced the Scanner class, which can be more convenient than previous IO APIs for processing input. Here, we create a scanner object to work with integers.

This scanner gets lines of text, and can parse for a boolean value.

This scanner works with regular expressions.
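The three uses of Scanner described above can be sketched in one small class. This is only an illustration, not the course's example files: the class name ScannerSketch, its method names, and the sample strings are assumptions, and the scanner reads from a String so that the sketch runs without keyboard input.

```java
import java.util.Scanner;

class ScannerSketch {
    // Adds up every integer token and skips tokens that are not integers.
    static int sumInts(String text) {
        int sum = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                if (sc.hasNextInt()) {
                    sum += sc.nextInt();
                } else {
                    sc.next(); // discard a non-integer token
                }
            }
        }
        return sum;
    }

    // Gets one line of text, then parses the first token of that line as a boolean.
    static boolean firstLineAsBoolean(String text) {
        try (Scanner lines = new Scanner(text)) {
            String line = lines.nextLine();
            try (Scanner tokens = new Scanner(line)) {
                return tokens.nextBoolean();
            }
        }
    }

    // Returns the first match of the regular expression in the first line, or null.
    static String firstMatch(String text, String regex) {
        try (Scanner sc = new Scanner(text)) {
            return sc.findInLine(regex);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInts("3 apples 4 pears 5"));
        System.out.println(firstLineAsBoolean("true\nignored"));
        System.out.println(firstMatch("call 555-1234 now", "\\d{3}-\\d{4}"));
    }
}
```

To read from the keyboard instead, construct the scanner with new Scanner(System.in).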

A scanner can read from a file - https://docs.oracle.com/javase/8/docs/api/index.html?overview-summary.html - if it has an input stream from that file, which requires the use of the java.io package.

However, a scanner cannot write to a file. A scanner can read from the console and, if the class also uses the java.io package, the class can write the input the scanner receives to a file. See ScannerWithFileWriter.java.


Creating a file or a directory

Scanner cannot create a new file or a new directory, but java.io.FileWriter can create a file, and java.io.File can create a directory with its mkdir() method.

However, you could use Scanner to get from the user the name that FileWriter uses to create the file.
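A minimal sketch of that division of labor, assuming the user supplies a file name on the first line and the text on the second. The class name ScannerFileWriterSketch and the method createFromInput are hypothetical; this is not the course's ScannerWithFileWriter.java.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Scanner;

class ScannerFileWriterSketch {
    // The scanner supplies the name and the text; the FileWriter creates the file.
    static File createFromInput(Scanner in, File directory) throws IOException {
        String name = in.nextLine().trim();
        String text = in.hasNextLine() ? in.nextLine() : "";
        File target = new File(directory, name);
        try (FileWriter writer = new FileWriter(target)) { // creates or overwrites the file
            writer.write(text);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Enter a file name, then one line of text:");
        try (Scanner console = new Scanner(System.in)) {
            if (console.hasNextLine()) {
                File created = createFromInput(console, new File("."));
                System.out.println("Wrote " + created.getPath());
            }
        }
    }
}
```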


Input and output is about movement or transportation. Just as intermodal containers are fixed blocks for transporting with ships, trains, and trucks,

so, for computers, a disk "block" or "sector" might have a capacity of 512, 1024, 2048, or 4096 bytes, and a block of data might be moved from one network or device location to another.

How do we fill a container? One way is atomically, each individual piece, one-by-one. Another way is with a buffer. What is a buffer? A buffer is a mechanism for TEMPORARY absorption that provides efficiency.

A funnel is a buffer that prevents spilling and waste:

In fluid dynamics, a buffer also provides efficiency and supports steady processing. This input buffer must be filled from below before it will output from above:

   

Video streams are buffered so that the flow of data is steady, even if the internet connection is a bit less steady.

A balloon buffers the input stream of air. During this buffer's output or purge operation, the output stream acts like a jet, and can carry the balloon upward.

Here, the buffer, although big, is filled in an efficient manner, but the output might not be as efficient as the original input operation.
Is this buffer similar to a queue (first-in, first-out "FIFO") or a stack (last-in, first-out "LIFO")?

The mouth is a fairly small buffer insofar as it cannot contain an entire pie in one operation, but can only process in "bits" and "bytes".


It is more efficient to reduce input/output operations by grouping the individual elements together for processing. Another analogy might be using a big spoon to eat a set of corn flakes, instead of individual flake by flake, which would be a lot of input operations. Note that many people prefer eating potato chips one-by-one (the fun of "finger-food"), instead of crushing them into a big spoon. Potato chips are IO-intensive because sometimes we favor the savor and flavor of  inefficiency.

 

Cattle are efficient in their input and productive in their output, as a hike on many trails makes evident. Cows prefer to chomp on entire large mouthfuls of grass rather than pick at individual blades of grass (or individual potato chips). Similarly, dogs are IO-efficient, filling the entire buffer of their maw at once, unlike cats, who nibble in a more dainty, IO-intensive manner.


Stream

Another way to process food is with a stream. IO-efficient folks slurp up long spaghetti instead of cutting each individual spaghetto. Some call it uncouth, others call it IO-efficient.

A straw is an IO-efficient way to stream the contents of a beverage into a mouthful (a buffer).

Each big gulp absorbs in a batch operation what has been streamed gradually into the mouth, and thus we avoid having to make constant swallowing operations for each micro-liter of fluid.

Similarly, in Java, we can use a buffer for the input stream to get (or "consume") a large quantity in a single operation.

The Java reader for an input stream converts the bytes associated with keyboard events to characters of the current charset. http://download.oracle.com/javase/8/docs/api/java/io/InputStreamReader.html

InputStreamReader isr = new InputStreamReader(System.in);

The input stream reader is "wrapped" in a buffered reader for efficiency, that is, the reduction of input/output operations:

BufferedReader buffer = new BufferedReader(isr);

or, the shorter version that wraps a constructor call within a constructor call:

BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

In this case, the newly created input stream reader does not need an identifier because it is the runtime argument to the constructor of a buffered reader.

(In constructor wrapping, we use the inner constructor one time, and do not need a variable (or handle) to the object it initializes.)

Think of the straw as System.in, the fluid moving through the straw as an input stream, and the throat receiving a buffered mouthful/gulp.

An input stream is like an input straw, and a buffered reader is like a big gulp that only occurs from time to time.
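The wrapping can be tried out with a small sketch. The class name WrapSketch and the method firstLine are assumptions for illustration; the method accepts any InputStream, so System.in is just one possible straw.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

class WrapSketch {
    // Wraps the byte stream in a reader (bytes to chars), then wraps the reader in a buffer.
    static String firstLine(InputStream source) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(source));
        return br.readLine(); // null if the stream is already at its end
    }

    public static void main(String[] args) throws IOException {
        System.out.print("Type a line: ");
        System.out.println("You typed: " + firstLine(System.in));
    }
}
```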

Similarly, if you were going to send an email message, you would not do it as many emails, one letter per email. Instead, you collect a bunch or batch of letters, words, sentences, paragraphs into a single message and send the whole collection at once.
import java.io.*;
FileOutputStream fout = new FileOutputStream(args[1]);
BufferedOutputStream bos = new BufferedOutputStream(fout);

Flushing the buffer

The Panama canal has "locks" that empty to flush out water, thus lowering the level of a ship so it can continue its journey (or output).

Flushing the buffer means to empty the buffer immediately, even if it is not full.
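A small sketch can make the effect of flush() visible. It assumes the data is much smaller than the buffer, and it writes to an in-memory ByteArrayOutputStream rather than a file so that the number of bytes delivered can be counted; the class name FlushSketch is hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class FlushSketch {
    // Returns {bytes delivered before flush, bytes delivered after flush}.
    static int[] sizesAroundFlush(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream bos = new BufferedOutputStream(sink);
        bos.write(data);           // a small write stays in the buffer
        int before = sink.size();
        bos.flush();               // empty the buffer now, even though it is not full
        int after = sink.size();
        bos.close();
        return new int[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFlush("hello".getBytes());
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

Note that close() also flushes, so an explicit flush() matters mostly when the stream stays open.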
Use cases:


The input and output facilities of Java include:


Abstract Classes are the Foundation of the IO Package

From the very first release of Java, java.io.InputStream and java.io.OutputStream were abstract classes to support streams of bytes from sources such as keyboard input or a file on disk.

http://docs.oracle.com/javase/8/docs/api/java/io/InputStream.html

http://docs.oracle.com/javase/8/docs/api/java/io/OutputStream.html

 

For character streams, the second release of Java added java.io.Reader and java.io.Writer, which are abstract classes for code reuse by their subclasses.

http://docs.oracle.com/javase/8/docs/api/java/io/Reader.html

http://docs.oracle.com/javase/8/docs/api/java/io/Writer.html

The java.io package has two sides: one for bytes, and one for characters. For example, PrintStream is for bytes  http://docs.oracle.com/javase/8/docs/api/java/io/PrintStream.html and PrintWriter is for characters http://docs.oracle.com/javase/8/docs/api/java/io/PrintWriter.html.


The abstract class, java.io.InputStream http://java.sun.com/javase/6/docs/api/java/io/InputStream.html, provides implemented methods for managing the bytes in a stream, such as read(byte[] b), mark(int readlimit), reset(), skip(long n), and close(). Therefore, subclasses, such as AudioInputStream and FileInputStream, have the choice to reuse the implementation or override it. The advantage of having an abstract class is that it provides default functionality (unlike an interface) but also allows the flexibility of customization. In this case, using abstract classes makes the workflow more efficient within the developer teams of Sun Microsystems.
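To see the reuse in action, here is a sketch of a hypothetical subclass, AlphabetInputStream, which is not part of the Java API. It implements only the abstract read() method and inherits skip(long n) and read(byte[] b) from InputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical subclass: supplies the bytes 'a' through 'z' once, then signals end of stream.
class AlphabetInputStream extends InputStream {
    private int next = 'a';

    @Override
    public int read() { // the only abstract method in InputStream
        return next <= 'z' ? next++ : -1;
    }
}

class AbstractStreamSketch {
    // Uses only inherited methods, which call our read() behind the scenes.
    static String skipTwoThenReadFive() throws IOException {
        try (InputStream in = new AlphabetInputStream()) {
            in.skip(2);                      // inherited: discards 'a' and 'b'
            byte[] chunk = new byte[5];
            int count = in.read(chunk);      // inherited: fills the array
            return new String(chunk, 0, count, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(skipTwoThenReadFive());
    }
}
```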

The out field

Somewhat like System.out.println(), the outfielder in baseball whose long-range throwing strength outputs something visible.

The java.io package ( http://java.sun.com/javase/6/docs/api/java/io/package-summary.html ) "Provides for system input and output through data streams, serialization, and the file system". The automatically imported package, java.lang, provides access to the basic I/O functionality.

For example, the System class ( http://java.sun.com/javase/6/docs/api/java/lang/System.html ) has a static final field, out, the "standard output stream",

http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#out

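that enables us to call System.out.println(). So, the java.lang package gives us access to the println() method through the out field, although the method itself is defined in java.io.PrintStream. Other classes, such as java.io.PrintWriter, also provide a println() method.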

In this case, for convenience, the language breaks the general rule of encapsulating a specific type of functionality, such as input/output, in the package dedicated to that functionality.

The home base for the output stream is the io package, which includes the PrintStream class ( http://docs.oracle.com/javase/7/docs/api/java/io/PrintStream.html ), the type of the out field. PrintStream has the println() method that supports sending a stream of bytes to a device (such as the console) or a file. Java follows UNIX insofar as standard error, the static err field, goes by default to the same device as standard out and is also a print stream, so error reporting automatically takes advantage of the same methods.

The in field

Similarly, we get System.in.read() to read from the console from java.io.InputStream http://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html. So the standard input, output, and error streams of java.io are made available through java.lang.System. Standard input is given by System.in - http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#in.

We can think of a baseball pitcher as an in-fielder who reads signals from the catcher, one little bit at a time because the pitching/catching cycle is "io-intensive".

  


binary data and unicode characters

To work with low-level binary data, use a byte stream.

At a higher level, to work with unicode characters, use a character stream. (This is similar to the file-transfer-protocol (ftp) toggle, binary, which sets the transfer operation to work with bytes instead of characters.)

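A Java char is a 16-bit Unicode code unit, whereas a byte is 8 bits. For plain ASCII text, each character fits in a single byte, and that's why we can often convert char to byte and byte to char. For other text, an encoding (charset) defines how characters map to bytes. Similarly, a character stream is a convenience built on top of a byte stream.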
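The difference between the two views of the same data can be sketched as follows. The class name ByteVersusCharSketch is hypothetical, and the sketch assumes UTF-8 as the encoding; the sample word contains one accented letter, written as a Unicode escape.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ByteVersusCharSketch {
    // Byte view: how many bytes the text occupies when encoded as UTF-8.
    static int countBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Character view: a character stream layered on top of the same bytes.
    static int countChars(String text) throws IOException {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(raw), StandardCharsets.UTF_8)) {
            int count = 0;
            while (reader.read() != -1) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        String word = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println(countBytes(word) + " bytes, " + countChars(word) + " chars");
    }
}
```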

Concrete use case for an abstract class is java.io.Writer

The class java.io.Writer - http://docs.oracle.com/javase/7/docs/api/java/io/Writer.html - is abstract to provide both some functionality and some customizability.

This class

For example, within the Java APIs, a BufferedWriter, a StringWriter, and a PrintWriter might use different offsets, ways to flush, and close.

If you want to handle ALL the possible I/O exceptions, catch java.io.IOException, ( http://java.sun.com/javase/6/docs/api/java/io/IOException.html ), a direct subclass of java.lang.Exception. This subclass is the superclass of many specialized exceptions, such as java.io.FileNotFoundException.

Because IOException is not a subclass of RuntimeException, the compiler checks to see that IO exceptions are handled or the method declares it throws the exception. The architects of Java thereby encourage us to be proactive in handling possible IO problems.
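As a sketch of handling the specialized exception before the general one, assuming a hypothetical class CheckedExceptionSketch and a file name supplied by the caller:

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

class CheckedExceptionSketch {
    // Describes what happened when trying to open the file and read its first character.
    static String describe(String fileName) {
        try (FileReader reader = new FileReader(fileName)) {
            int first = reader.read();
            return first == -1 ? "empty file" : "first char: " + (char) first;
        } catch (FileNotFoundException e) { // the specialized subclass must come first
            return "not found";
        } catch (IOException e) {           // any other I/O problem
            return "I/O problem: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(args.length > 0 ? args[0] : "missing-file.txt"));
    }
}
```

If the two catch blocks were swapped, the compiler would reject the second one as unreachable, because IOException already covers FileNotFoundException.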


Synchronization and buffered strings: java.lang.StringBuffer and java.lang.StringBuilder:

http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html provides a high-performance way to have a vector-like resizable and mutable structure for a String. However, if your application is multi-threaded, and it is possible that two threads might corrupt the structure, you should use a StringBuffer instead because the methods of the StringBuffer class are synchronized. (Both classes implement the Serializable interface.) http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuffer.html


Byte array and System.in.read()

To read bytes from the console, use a byte array. When we want to display the characters in the byte array, we cast to char (line 19). Otherwise, a set of chars like "hello" would look like 1041011081081111310000.

The compiler "checks" for IOException, so we must do one of the following:

Note that the byte array is of a fixed size, but this is fine for many use cases, such as reading in a credit card number, which can later be parsed as integers using the parseInt method - http://docs.oracle.com/javase/8/docs/api/java/lang/Integer.html#parseInt-java.lang.String-
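A minimal sketch of reading into a fixed-size byte array and casting each byte to char. The class name ByteArraySketch and the size of 16 are assumptions, and the cast is only meaningful for single-byte (ASCII) input.

```java
import java.io.IOException;
import java.io.InputStream;

class ByteArraySketch {
    // Reads up to 16 bytes and converts each one to a char by casting.
    static String readAsChars(InputStream in) throws IOException {
        byte[] data = new byte[16];             // fixed size
        int count = in.read(data);              // number of bytes read, or -1 at end of stream
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append((char) data[i]);          // without the cast, the numeric value is appended
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print("Type up to 16 characters: ");
        System.out.println("Read: " + readAsChars(System.in).trim());
    }
}
```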

Buffered Reader

A common way to work with user input is with a buffered reader. This program parses the user input to extract a numeric value. The user can choose to enter an arbitrary number of integers.
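A sketch of that idea, under the assumption that the user types one integer per line and ends with a blank line (or end of input). The class name SumIntegersSketch is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

class SumIntegersSketch {
    // Reads lines until a blank line or end of input; adds each line that parses as an int.
    static int sum(BufferedReader br) throws IOException {
        int total = 0;
        String line;
        while ((line = br.readLine()) != null && !line.trim().isEmpty()) {
            try {
                total += Integer.parseInt(line.trim());
            } catch (NumberFormatException e) {
                System.out.println("Not an integer, skipped: " + line);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Enter integers, one per line; a blank line ends the input.");
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("Sum: " + sum(br));
    }
}
```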

Throws declaration versus try / catch blocks

If the caller has a try / catch mechanism, the caller does not need to declare it might throw a checked exception:

The compiler prevents you from writing unreachable code. Main's catch block would be unreachable if the method that main calls already catches the only possible exception:


File I/O with FileInputStream

When we work with arrays, we have the limitation of needing to know the size of the array necessary to store the data. To read in an entire file (and not have to know its size beforehand), read in until the end-of-file (EOF) marker, negative 1 (-1). The read method of java.io.InputStream returns a single byte (as an int) each time it is called, until it returns -1 because there is nothing more to read in the file. That negative one signal might be analogous to that loud sound from a straw you are sipping on when there is no more liquid in the glass.

http://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html#read%28%29

Note that this marker, -1, is not a character, but rather an int.

Because the read method is on an instance of type FileInputStream, line 14 calls the constructor for a file input stream. This class is a subclass of  the more general InputStream, which also has a read method. Lines 21-16 define a catch block that provides more guidance to the user than the default exception handling of the JVM. In this case, the catch block indicates the proper usage of the application.
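The read-until-negative-one loop can be sketched as follows. The class name ShowFileSketch is hypothetical, the line numbers mentioned above refer to the course's program rather than to this sketch, and casting each byte to char assumes single-byte (ASCII) text.

```java
import java.io.FileInputStream;
import java.io.IOException;

class ShowFileSketch {
    // Reads the whole file one byte at a time until read() returns -1.
    static String contents(String fileName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (FileInputStream fin = new FileInputStream(fileName)) {
            int b;                               // an int, so that -1 is distinct from any data byte
            while ((b = fin.read()) != -1) {
                sb.append((char) b);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("Usage: java ShowFileSketch fileName");
            return;
        }
        try {
            System.out.print(contents(args[0]));
        } catch (IOException e) {
            System.out.println("Could not read " + args[0] + ": " + e.getMessage());
        }
    }
}
```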


Console class

Use the Console class so end-users can keep their password hidden from people near their computer screen.  http://download.oracle.com/javase/6/docs/api/java/io/Console.html
We do not have to construct the Console. Instead, we call the static console method of the System class, which is final. http://download.oracle.com/javase/6/docs/api/java/lang/System.html#console%28%29

For security, the readPassword method hides the password from display - http://docs.oracle.com/javase/7/docs/api/java/io/Console.html#readPassword%28%29

The program compares the two passwords on Line 28. Each password is stored in an array of chars (the return type of readPassword), and a static method of java.util.Arrays performs the comparison. This method is overloaded, so it can deal with an array of chars or an array of objects such as Strings (other overloads too) - http://download.oracle.com/javase/6/docs/api/java/util/Arrays.html#equals%28char[],%20char[]%29 or more likely http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#equals%28java.lang.Object[],%20java.lang.Object[]%29
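A sketch of the password check, with the hypothetical class name PasswordSketch. It assumes the program is started from a terminal; System.console() returns null when no interactive console is attached (for example, inside some IDEs).

```java
import java.io.Console;
import java.util.Arrays;

class PasswordSketch {
    // Compares the two entries element by element.
    static boolean sameEntry(char[] first, char[] second) {
        return Arrays.equals(first, second);
    }

    public static void main(String[] args) {
        Console console = System.console();
        if (console == null) {
            System.out.println("No console available; run from a terminal.");
            return;
        }
        char[] first = console.readPassword("Password: ");   // typing is not echoed
        char[] second = console.readPassword("Again: ");
        System.out.println(sameEntry(first, second) ? "Match" : "No match");
        // Overwrite the characters so the password does not linger in memory.
        if (first != null) Arrays.fill(first, ' ');
        if (second != null) Arrays.fill(second, ' ');
    }
}
```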


FileInputStream and FileOutputStream

A common use for file IO is to update text, such as a name. Here, we use a file input stream to demonstrate substitution. Lines 64-65 replace each blank space with a hyphen and write this substitution to the destination file. To run this program, type java Hypen mySourceFile myDestinationFile


Compare two files

We can compare two files, byte-by-byte, using the read method on a file input stream.

http://download.oracle.com/javase/6/docs/api/java/io/FileInputStream.html#read%28%29

Lines 48-49 call this method for the content of each file, and line 50 performs the comparison. 

If I run: java FileComparison myLetters.txt myLetters2.txt
The output is: original says: g but second file says: z
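A byte-by-byte comparison might be sketched like this. The class name FileComparisonSketch and its output format are assumptions; it is not the course's FileComparison.java, so the line numbers above do not apply to it.

```java
import java.io.FileInputStream;
import java.io.IOException;

class FileComparisonSketch {
    // Returns the zero-based position of the first differing byte, or -1 if the files are identical.
    static long firstDifference(String fileA, String fileB) throws IOException {
        try (FileInputStream a = new FileInputStream(fileA);
             FileInputStream b = new FileInputStream(fileB)) {
            long position = 0;
            while (true) {
                int x = a.read();
                int y = b.read();
                if (x != y) return position;   // also covers one file ending before the other
                if (x == -1) return -1;        // both files ended at the same point
                position++;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Usage: java FileComparisonSketch file1 file2");
            return;
        }
        long where = firstDifference(args[0], args[1]);
        System.out.println(where == -1 ? "Files are identical" : "First difference at byte " + where);
    }
}
```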


FileReader and BufferedReader

Here the input stream is well matched to the input buffer.

Here, the buffer is too small for efficient input:

A small buffer can still work, but it takes almost bit-by-bit patience and persistence:

BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

Think of the straw as System.in, the fluid moving through the straw as an input stream, and the throat receiving a buffered mouthful/gulp. (An input stream is almost an input straw.)

If you know that the file to input or output is not binary but contains characters, use an instance of FileReader, which works with Unicode and supports all the major human languages, including Chinese, Japanese, and Arabic. A FileReader is a subclass of InputStreamReader ( http://java.sun.com/javase/6/docs/api/java/io/InputStreamReader.html ), which works with characters instead of bytes. The FileReader  can be constructed from a file name (see Line 7 in the example below).

In the example below, line 8 wraps the FileReader inside a BufferedReader, which holds a significant number of characters in a single buffer (8192 chars by default in the OpenJDK implementation; the API documentation says only that the default is large enough for most purposes). The size of the buffer matters in some circumstances. For example, a small device, such as a cell phone, might need a smaller buffer than, say, a powerful server running servlets that processes large amounts of data rapidly. Java provides two constructor signatures for a buffered reader, one of which allows us to set a custom size:

BufferedReader(Reader in)

Creates a buffering character-input stream that uses a default-sized input buffer. http://download.oracle.com/javase/6/docs/api/java/io/BufferedReader.html#BufferedReader%28java.io.Reader%29

BufferedReader(Reader in, int sz)

Creates a buffering character-input stream that uses an input buffer of the specified size.
http://download.oracle.com/javase/6/docs/api/java/io/BufferedReader.html#BufferedReader%28java.io.Reader,%20int%29

Typically, a server running servlets has a buffer of 8 kilobytes, but, in performance tuning, some might use a 16 kilobyte buffer - http://java.sun.com/developer/technicalArticles/Servlets/servletapi/ - but note that the Servlet API is part of the Enterprise Edition, not the Standard Edition.

The new IO package of the fourth version of Java, java.nio, added new classes for buffering - http://download.oracle.com/javase/6/docs/api/java/nio/package-summary.html. For example, java.nio provides a class for each primitive type (except boolean), and these are subclasses of Buffer - http://download.oracle.com/javase/6/docs/api/java/nio/Buffer.html

Buffering provides efficiency because we can read in, say, 4000 Unicode characters (2 bytes each) in one operation instead of 80 characters. Without buffering, each invocation of read() or readLine() could cause bytes to be read from the file, converted into characters, and then returned, which can be very inefficient. Assuming you want more than one or two characters, a buffered reader is useful for reading characters from the console or a file. Efficiency: reducing network roundtrips is useful because network latency can easily exceed JVM processing time. The throughput might increase 5000%.

This version includes try, catch, and finally blocks. The typical use case for finally is to clean up resources, as we do here by closing the file reader.

This version has more robust exception handling and also indicates which classes it uses from the java.io package instead of importing java.io.*

The ready() method

The java.io.Reader class provides a ready() method, which can be used to ensure that time is not wasted trying to read an empty buffer.
http://download.oracle.com/javase/6/docs/api/java/io/Reader.html#ready%28%29

Note that the InputStream class does not have a ready() method and is not a superclass of an InputStreamReader:
http://download.oracle.com/javase/6/docs/api/java/io/InputStream.html


Buffered reader and System.in

Use input and output to echo input as output to screen:

Allow the input-output cycle to shut itself off when the user says to stop.


Getting a numeral from the keyboard

This example uses the constructor of java.lang.Character in Line 12 to get a single character object, then a string, then tests whether the string representation of the character is itself a representation of an integer. The parseInt method accepts a String as input, and outputs the corresponding integer value. The Character class is an object wrapper for a primitive char - http://download.oracle.com/javase/6/docs/api/java/lang/Character.html

    Note: Chapter 12 explains autoboxing, which allows you to avoid explicit use of an object wrapper for a primitive.

A more sophisticated use of a buffered reader with integer parsing:


Try with resources

a feature discussed at http://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html

This program is a useful utility: it writes a text file that contains the names of the files in a zip archive.
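A sketch of such a utility using try-with-resources. The class name ZipListingSketch and the command-line arguments are assumptions; both the zip file and the writer are declared in the try header, so they are closed automatically.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class ZipListingSketch {
    // Writes the name of each entry in the zip archive to a text file, one name per line.
    static void writeListing(String zipName, String listingName) throws IOException {
        try (ZipFile zip = new ZipFile(zipName);
             PrintWriter out = new PrintWriter(new FileWriter(listingName))) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                out.println(entries.nextElement().getName());
            }
        } // both resources are closed here, in reverse order, even if an exception occurs
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Usage: java ZipListingSketch archive.zip listing.txt");
            return;
        }
        writeListing(args[0], args[1]);
    }
}
```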



FileWriter and write()

You can also write a Unicode file from the lines the user types at the console by creating an instance of java.io.FileWriter (Line 17) and calling the signature of the write method that takes a String as input (Line 33) - http://download.oracle.com/javase/6/docs/api/java/io/Writer.html#write%28java.lang.String%29

This program uses the compareTo() method of the String class - http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#compareTo%28java.lang.String%29


Copying a File to Another File: Stream versus Channel

Two ways to copy a file are using streams and using channels. Streams are part of the standard io package. This program uses the java.io.FileOutputStream.write method (http://download.oracle.com/javase/6/docs/api/java/io/FileOutputStream.html#write%28byte[],%20int,%20int%29) to output a buffer of 1024 bytes to the FileOutputStream until there are no more bytes to write. The program knows there are no more bytes when the read method of the file input stream returns -1.

What user error might we expect to see in relation to line 27? What is the usage requirement to run the program successfully?

Channels are part of the java.nio (new I/O) package - http://download.oracle.com/javase/6/docs/api/. Channels were added for:
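The stream-based copy loop can be sketched as follows. The class name CopyWithStreamSketch is hypothetical (it is not the course's CopyFileWithStream.java, so line numbers in these notes do not refer to it); the 1024-byte buffer follows the description above.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class CopyWithStreamSketch {
    // Copies the source file to the destination and returns the number of bytes copied.
    static long copy(String source, String destination) throws IOException {
        long total = 0;
        try (FileInputStream in = new FileInputStream(source);
             FileOutputStream out = new FileOutputStream(destination)) {
            byte[] buffer = new byte[1024];
            int count;
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);   // write only the bytes actually read
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Usage: java CopyWithStreamSketch source destination");
            return;
        }
        System.out.println("Copied " + copy(args[0], args[1]) + " bytes");
    }
}
```

The third argument to write matters on the last pass, when the buffer is usually only partly filled.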

If you work with databases, this advanced topic might be valuable with large tables. Classes in this package can also take advantage of a memory mapped file outside the JVM, which means the operating system directly uses hard disk space for paging a large file as if it were loaded into RAM.
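For comparison with the stream approach, a channel-based copy might look like the following sketch. The class name CopyWithChannelSketch is hypothetical; FileChannel.transferTo asks the operating system to move the bytes, and the loop is there because a single call may transfer fewer bytes than requested.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

class CopyWithChannelSketch {
    // Copies the source file to the destination through channels; returns the bytes copied.
    static long copy(String source, String destination) throws IOException {
        try (FileChannel in = new FileInputStream(source).getChannel();
             FileChannel out = new FileOutputStream(destination).getChannel()) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Usage: java CopyWithChannelSketch source destination");
            return;
        }
        System.out.println("Copied " + copy(args[0], args[1]) + " bytes");
    }
}
```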


Parsing out new line and carriage return

Lines 33 to 47 are a do while loop that does not do anything except allow us to read past the newline character and the carriage-return character. This way, we get the character that represents the letter key the user typed before hitting the Enter key. Different operating systems represent the user typing Enter or Return in different ways:

CR+LF Windows, DOS
CR MacOS up to OS-9
LF UNIX
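A minimal sketch of such a do while loop, assuming a hypothetical class SkipLineEndSketch. It treats both CR ('\r') and LF ('\n') as characters to skip, so it behaves the same for all three conventions in the table.

```java
import java.io.IOException;
import java.io.InputStream;

class SkipLineEndSketch {
    // Returns the next character that is neither CR nor LF, or -1 at end of input.
    static int nextKey(InputStream in) throws IOException {
        int c;
        do {
            c = in.read();
        } while (c == '\r' || c == '\n');
        return c;
    }

    public static void main(String[] args) throws IOException {
        System.out.print("Type a letter and press Enter: ");
        int key = nextKey(System.in);
        System.out.println(key == -1 ? "No input" : "You typed: " + (char) key);
    }
}
```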


Quiz: Session 10

  1. Java's built-in stream for output  ____________.__________ is in the package java.________, and we used it in Week 1, "HelloWorld.java".

  2. System.out.println() and System.out.print() write to standard output. What is the equivalent for reading from standard input?

  3. java.io.Writer is an abstract class that is extended by java.io.BufferedWriter and java.io.FileWriter. Why should java.io.Writer be an abstract class instead of an interface? Instead of a regular class?

  4. What does read() return at the end of the file?

  5. A finally block is useful for closing a connection to a database, but is it also useful for closing a File?

  6. What is the fully-qualified name of the Exception most general to the io package?

  7. java.io.BufferedReader and java.io.Console have a method that allows you to read an entire line of text from the keyboard, which is more efficient than reading each character one-by-one. The name of this method is: ________________

  8. The following code
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    converts keyboard bytes into a non-buffered_____________ and finally into a buffered _____________

  9. Is it possible to compare two files, character-by-character, by using a non-buffered FileInputStream? (See FileComparison.java)

  10. Can write(byte[] b, int offset, int lenOfByteArray) on a FileOutputStream be used to write out an entire file? (See CopyFileWithStream.java)

  11. Why is System.out static?

  12. Why is System.out final?

  13. The architecture of the java.io is dualistic, the 1.0 part for ________ and the 1.1 part for __________.

=====================

  1. Java's built-in stream for output System.out is of type PrintStream, which is in the package java.io (the System class itself is in java.lang), and we used it in Week 1, "HelloWorld".
  2. Standard input is the keyboard. Standard output is the monitor. System.out.println() and System.out.print() write to standard output. What is the equivalent for reading from standard input?
    System.in.read()
  3. java.io.Writer is an abstract class that is extended by java.io.BufferedWriter and java.io.FileWriter. Why should java.io.Writer be an abstract class instead of an interface?
    To provide some default functionality, such as methods for append, close, flush, write, while allowing subclasses to have specialization for close, flush, and write. Closing a file might be different than closing a buffer. If it were a regular class, we would not have the architect's guidance for the abstract methods, and subclasses might not follow any standard. Consider that it is a subclass of java.io.Reader that knows how to read from an audiofile.
  4. What does read() return at the end of the file?
    If no byte is available because the end of the stream has been reached, the value -1 is returned. This value is an int, not a valid character, so test for it before casting to char.
  5. A finally block is useful for closing a connection to a database, but is it also useful for closing a File? Yes.
  6. What is the fully-qualified name of the Exception most general to the io package?
    java.io.IOException
  7. java.io.BufferedReader and java.io.Console have a method that allows you to read an entire line of text from the keyboard, which is more efficient than reading each character one-by-one. The name of this method is _______.
     readLine. For details, see http://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html#readLine()
  8. The following code
     BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    converts keyboard bytes into a non-buffered stream and finally into a buffered stream. This sequence of embedding is commonly called w__________.
    wrapping one constructor within another constructor.
  9. Is it possible to compare two files, character-by-character, by using a non-buffered FileInputStream? (See FileComparison.java)
    Yes.
  10. Can write(byte[] b, int offset, int lenOfByteArray) on a FileOutputStream be used to write out an entire file? (See CopyFileWithStream.java)
    Yes.
  11. Why is System.out static?
    So that it is available for use without having to call a constructor.
  12. Why is System.out final?
    So that its functionality is known and the field cannot be reassigned directly (only through System.setOut()), which makes its use predictable and safe.
  13. The architecture of the java.io is dualistic, the 1.0 part for ___bytes_____ and the 1.1 part for ____characters______.