Reading files
Opening a file, reading it line by line, and why every file operation can fail.
Last session you read what a person typed at the keyboard. But most of the data a program needs is not typed live by a human sitting there. It is already sitting on disk in a file: a list of names, yesterday's sales, a saved game. This session is about opening such a file and reading it, one line at a time.
In this session you will:
- Open a text file with a FileReader wrapped in a BufferedReader.
- Read it with readLine() and stop at the end using its null signal.
- See why file operations can fail with an IOException.

You should already know null from object references. That is everything this session stands on. We will read text only, and we will not yet learn the full machinery for handling errors; we save that for a session of its own.

A Scanner on the keyboard is fine when a person is present to answer. But think about a program that prints a report from ten thousand sales records. Nobody is going to type ten thousand numbers. Those records live in a file, written there earlier, perhaps by another program. Reading from a file means your program can work with data that already exists and that outlives any single run.
A plain text file is just a sequence of characters with newline markers splitting it into lines. When you open notes.txt in any editor and see three lines, the file on disk is really one long ribbon of characters with two invisible “new line” marks in it. Reading the file means walking along that ribbon from the start to the end.
If you have used Python, you may have written open("notes.txt") and then looped with for line in f:. Java has the same idea, but it is more explicit. You build the reader by hand, you ask for each line, and you are responsible for closing the file. That extra ceremony is the price Java pays for being clear about exactly what is happening.

To read a text file in Java you typically use two classes stacked together. A FileReader knows how to open one named file and pull characters out of it. A BufferedReader wraps around a FileReader and adds two gifts: it reads in efficient chunks, and it gives you a method that hands back a whole line at a time. You wrap one inside the other:
    import java.io.FileReader;
    import java.io.BufferedReader;
    import java.io.IOException;

    public class Main {
        // throws IOException is needed for this to compile; it is explained later in this session.
        public static void main(String[] args) throws IOException {
            // Open the file, then wrap it for line-by-line reading.
            FileReader file = new FileReader("notes.txt");
            BufferedReader br = new BufferedReader(file);
            // ... read from br here ...
        }
    }

(setup only, nothing printed yet)
Read that wrapping out loud: “a buffered reader, reading from a file reader, reading from notes.txt”. The FileReader is the part touching the disk. The BufferedReader is the friendly layer on top that you actually talk to. People usually collapse the two lines into one:
    BufferedReader br = new BufferedReader(new FileReader("notes.txt"));

(setup only, nothing printed yet)
The method you call on a BufferedReader is readLine(). Each call returns the next line of the file as a String, without the newline mark on the end. Call it again and you get the line after that. The important part is what happens when there are no more lines: readLine() returns null. That null is the file's way of saying “I am empty now, the end has been reached”. This is the end of file, often shortened to EOF.

readLine() gives you the next line as a String, or null when the file is used up. That null is your signal to stop reading.
Reading a fixed three-line file by calling readLine() three times makes the behaviour concrete. Suppose notes.txt contains exactly these three lines:
    Buy milk
    Walk the dog
    Call Maya

(this is the input file, not a program)

    import java.io.FileReader;
    import java.io.BufferedReader;
    import java.io.IOException;

    public class Main {
        public static void main(String[] args) throws IOException {
            BufferedReader br = new BufferedReader(new FileReader("notes.txt"));
            System.out.println("First line: " + br.readLine());
            System.out.println("Second line: " + br.readLine());
            System.out.println("Third line: " + br.readLine());
            System.out.println("Fourth call: " + br.readLine());
            br.close();
        }
    }

Output:

    First line: Buy milk
    Second line: Walk the dog
    Third line: Call Maya
    Fourth call: null
The first three calls hand back the three lines. The fourth call has nothing left to give, so it returns null, and printing null shows the literal word null. Notice two new pieces of ceremony that we will explain next: the strange throws IOException after the method header, and the br.close() at the bottom.
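A related edge case is a blank line in the middle of the file. The sketch below is one way to see how it differs from the end of the file. It assumes a StringReader as a stand-in for a real file (a class this session does not cover), and the class name BlankLineDemo and the helper describe are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class BlankLineDemo {
    // Hypothetical helper: says in words what one readLine() call returned.
    static String describe(String line) {
        if (line == null) {
            return "end of file";
        }
        if (line.isEmpty()) {
            return "blank line";
        }
        return "text: " + line;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: this string plays the role of a file whose second line is blank.
        String contents = "Buy milk\n\nCall Maya\n";
        BufferedReader br = new BufferedReader(new StringReader(contents));
        for (int call = 1; call <= 4; call++) {
            System.out.println("Call " + call + ": " + describe(br.readLine()));
        }
        br.close();
    }
}
```

Under these assumptions the second call reports a blank line and only the fourth call reports the end of the file, so a loop that stops on an empty string would stop too early.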
A blank line in the middle of the file makes readLine() return "", the empty string, which is a real value, not null. Only true end of file gives you null. So you stop on null, never on "".

When you open a file, the operating system hands your program a kind of ticket: a live connection to that file, called a file handle. The system has only a limited number of these tickets. If you open files and never give the tickets back, your program leaks them, and eventually the system refuses to open any more. Closing a reader returns the ticket. close() is how you say “I am done with this file”.
There is a second reason closing matters, and it bites harder when writing than reading: a buffer may still hold data that has not been pushed all the way through. Closing flushes it out. For now, the habit to build is simple. If you open it, you must close it.
Putting br.close() as the last line looks tidy, but it has a hole. If anything goes wrong on a line above it, the program may jump away and never reach that close() at all. The file stays open. The next section fixes exactly this.

Java has a dedicated shape for “use this thing, then close it no matter what”. It is called try-with-resources. You declare the reader inside parentheses right after the word try, and Java guarantees the reader is closed for you when the block ends, whether the block finishes normally or something fails partway through.
    import java.io.FileReader;
    import java.io.BufferedReader;
    import java.io.IOException;

    public class Main {
        public static void main(String[] args) throws IOException {
            try (BufferedReader br = new BufferedReader(new FileReader("notes.txt"))) {
                System.out.println("First line: " + br.readLine());
            }
            // By the time we reach here, br has already been closed for us.
        }
    }

Output:

    First line: Buy milk
Look closely at the first line of the try. The reader is created inside the parentheses try ( ... ), not inside the body. That placement is the whole trick. It tells Java “this is a resource I am borrowing; close it the moment this block is done”. You write no close() yourself. There is no way to forget it and no way for a failure to skip it.
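The guarantee can be checked with a small experiment. This is only a sketch: TrackingReader is a hypothetical helper that remembers whether close() was called, a StringReader stands in for a real file, and the catch clause borrows syntax that belongs to the later session on exception handling.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class CloseDemo {
    // Hypothetical helper: a reader that records whether close() was called on it.
    static class TrackingReader extends StringReader {
        boolean closed = false;

        TrackingReader(String contents) {
            super(contents);
        }

        @Override
        public void close() {
            closed = true;
            super.close();
        }
    }

    // Fails on purpose inside the try block, then reports whether the reader was closed.
    static boolean closedAfterFailure() {
        TrackingReader source = new TrackingReader("Buy milk\n");
        try (BufferedReader br = new BufferedReader(source)) {
            br.readLine();
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException | IOException e) {
            // Java closed br (and the reader it wraps) before control arrived here.
        }
        return source.closed;
    }

    public static void main(String[] args) {
        System.out.println("Closed after failure: " + closedAfterFailure());
    }
}
```

Running it should print Closed after failure: true, because closing a BufferedReader also closes the reader it wraps, and try-with-resources does that closing even when the body fails.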
In order: Java opens br, runs the body of the try, where you read from br, and then calls br.close(). You never call br.close() yourself at the end. Put the reader in try ( ... ) and you are safe.

Declare a reader inside try ( ... ) and Java closes it for you, every time, no matter what. Prefer this shape for every file you open.
Reading a keyboard rarely surprises you. Reading a file is different, because the file lives outside your program and outside your control. The file might not exist. It might exist but you might not have permission to read it. The disk might be unplugged mid-read. Java does not let you pretend these cannot happen. Any operation that touches a file is officially allowed to fail with an IOException, where IO stands for input and output.
Because of that, Java forces you to acknowledge the possibility. In every example so far you saw the same small piece of syntax on the method header:
    public static void main(String[] args) throws IOException {
        // file reading lives here
    }

(syntax shape only)

The phrase throws IOException is a promise label. It says “the code inside might run into an IO problem, and I am choosing not to deal with it here; I am passing that responsibility upward”. For now, writing throws IOException on main is enough to make a file-reading program compile and run. You also need the line import java.io.IOException; at the top so Java knows what that name means.
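“Passing that responsibility upward” is easier to see with two methods. In this sketch the helper firstLine and the class name ThrowsDemo are hypothetical, and the two lines that create a sample file use Files.createTempFile and Files.write, which are scaffolding outside this session's scope.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class ThrowsDemo {
    // This helper touches a file, so it declares that it may fail with an IOException.
    static String firstLine(String fileName) throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            return br.readLine();
        }
    }

    // main calls firstLine and does not handle the failure, so it passes it upward too.
    public static void main(String[] args) throws IOException {
        // Scaffolding only (not covered in this session): create a small sample file.
        Path sample = Files.createTempFile("notes", ".txt");
        Files.write(sample, Arrays.asList("Buy milk", "Walk the dog", "Call Maya"));

        System.out.println("First line: " + firstLine(sample.toString()));
    }
}
```

If throws IOException is removed from either header, the program should stop compiling, which is Java insisting that somebody acknowledge the possible failure.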
throws is the quick option, not the careful one. The careful way is to catch the failure and respond to it, perhaps printing “sorry, that file is missing” instead of crashing. That whole topic, try/catch and the family of exceptions, gets a full session very soon: exception handling. The try-with-resources shape you learned here will fit neatly into it. Just before that, you will use these same readers in reverse to write files. For today, throws IOException on main is all you need.

Now the real goal. You almost never want exactly three lines; you want every line, however many there are, and the file decides how many. That is a loop, and the stopping condition is the null from readLine(). The classic pattern reads a line, checks it against null, and only enters the loop body if there was really a line:
    import java.io.FileReader;
    import java.io.BufferedReader;
    import java.io.IOException;

    public class Main {
        public static void main(String[] args) throws IOException {
            try (BufferedReader br = new BufferedReader(new FileReader("notes.txt"))) {
                int count = 0;
                String line = br.readLine();
                while (line != null) {
                    count = count + 1;
                    if (line.length() > 10) {
                        System.out.println(line);
                    }
                    line = br.readLine();
                }
                System.out.println("The file notes.txt has " + count + " lines.");
            }
        }
    }

Output:

    Walk the dog
    The file notes.txt has 3 lines.
Walk through it with the same three-line notes.txt. Before the loop, line is set to the first line, "Buy milk". The loop checks: not null, so enter. count becomes 1. "Buy milk" is 8 characters, not more than 10, so nothing prints. Then the last line of the body reads the next line, "Walk the dog". Loop again: not null, count becomes 2, and "Walk the dog" is 12 characters, more than 10, so it prints. Read again: "Call Maya". Loop: count becomes 3, 9 characters, no print. Read again: null. The loop condition is now false, so we fall out and print the count, 3.
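The walk-through above can also be produced by the program itself. The sketch below runs the same loop but prints one trace line per pass. It assumes a StringReader as a stand-in for notes.txt, and the names LoopTrace and traceLoop are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class LoopTrace {
    // Runs the read-check-advance loop, printing a trace line per pass, and returns the line count.
    static int traceLoop(BufferedReader br) throws IOException {
        int count = 0;
        String line = br.readLine();              // first read, before the loop
        while (line != null) {
            count = count + 1;
            boolean printed = line.length() > 10;
            System.out.println("pass " + count + ": \"" + line + "\" has "
                    + line.length() + " characters, printed: " + printed);
            line = br.readLine();                 // advance to the next line
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: this string plays the role of the three-line notes.txt.
        String contents = "Buy milk\nWalk the dog\nCall Maya\n";
        try (BufferedReader br = new BufferedReader(new StringReader(contents))) {
            System.out.println("count: " + traceLoop(br));
        }
    }
}
```

With the three sample lines, the trace should report lengths 8, 12 and 9, with only the second pass marked as printed, and a final count of 3.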
Notice that readLine() appears twice: once just before the loop, and once at the bottom of the loop body. The first read primes the pump so the very first check has something to look at. The read at the bottom advances to the next line for the next check. Skip the first and the loop has nothing to check; skip the second and it never moves forward and runs forever.

Every call to readLine() moves forward and hands back the next line; there is no peeking. A subtle bug is calling it an extra time inside the loop, for instance writing if (br.readLine() != null) to test and then br.readLine() again to use the line. That reads two different lines and quietly skips every other one. Read each line exactly once, into a variable, and work with that variable.

Strip away the example and a file-reading program is always the same four moves.

1. Open the reader inside try ( ... ) so it closes itself: new BufferedReader(new FileReader(name)).
2. Loop with readLine(), doing something with each line.
3. Stop when readLine() returns null; that is the end of the file.
4. Mark the method throws IOException so Java knows file work can fail.

That pattern reads a configuration file, a list of names, a log, a saved score table: anything that is text, one record per line. The data changes; the shape does not.
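As one more instance of the same shape, here is a sketch that finds the longest line in a file. The method longestLine and the class name are hypothetical, and the sample file is created with Files.createTempFile and Files.write purely as scaffolding that this session does not cover.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class LongestLine {
    // Declares throws IOException because file work can fail.
    static String longestLine(String fileName) throws IOException {
        // Open the reader inside try ( ... ) so it closes itself.
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            String longest = "";
            String line = br.readLine();
            // Stop when readLine() returns null, the end of the file.
            while (line != null) {
                // Do something with each line: here, keep the longest seen so far.
                if (line.length() > longest.length()) {
                    longest = line;
                }
                line = br.readLine();
            }
            return longest;
        }
    }

    public static void main(String[] args) throws IOException {
        // Scaffolding only (not covered in this session): create a small sample file.
        Path sample = Files.createTempFile("notes", ".txt");
        Files.write(sample, Arrays.asList("Buy milk", "Walk the dog", "Call Maya"));

        System.out.println("Longest line: " + longestLine(sample.toString()));
    }
}
```

Only the line inside the loop body is specific to this task; swapping it for a different test or calculation gives a different program with the same skeleton.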
Try each one yourself first, then open the answer.
1. What does readLine() return when it reaches the end of the file, and how is that different from reading a blank line in the middle of the file?
2. In new BufferedReader(new FileReader("notes.txt")), which of the two readers actually touches the file on disk, and what does the other one add?
3. Why is declaring the reader in try ( ... ) safer than calling br.close() as the last line of the method?
4. Given the three-line notes.txt, a program opens it in a try-with-resources block and calls br.readLine() exactly twice, printing each result. What does it print, and what happens to the file afterward?
5. Why does a method that reads a file need throws IOException on its header, when a method that only reads the keyboard with Scanner often does not?

Take these away. They continue exactly what we just did.
1. Trace the line-counting loop by hand on a file that contains a blank line. Write down count after each pass of the loop, and explain why the blank line still counts.
2. Write a program that prints every line of a file that contains the word milk. Use a String method you already know to test each line. Decide what your program should print if the file has no such line.
3. Write a program that reads scores.txt, where every line is a single whole number, and prints the largest number found. Assume the file has at least one line. Describe in one sentence why you must convert each line from a String before comparing the numbers.
4. For one of the programs above, explain why main has throws IOException on it, and what the better alternative will be once you have learned the next-but-one topic.