Lecture 5A File processing Richard Gesick
Outline
  Objectives
  Defining File Streams
  Reading Data Files
  Generating a Data File
  Problem Solving Applied: Data Filters – Modifying an HTML File
  Error Checking
  Numerical Technique: Linear Modeling*
  Problem Solving Applied: Ozone Measurements*
Objectives
Develop problem solutions in C++ that:
  Open and close data files for input and output.
  Read data from files using common looping structures.
  Check the state of an input stream.
  Recover from input stream errors.
  Apply the numerical technique of linear modeling.
Copyright © 2012 Pearson Education, Inc. Defining File Streams
Standard Input and Output
C++ defines the cin and cout objects to represent the standard input (keyboard) and standard output (console). C++ also defines the standard error stream, cerr. cin is an object of the general input stream class (istream); cout and cerr are objects of the general output stream class (ostream).
Other Kinds of Streams
The standard C++ library also defines special kinds of streams for file input (ifstream) and file output (ofstream). Both classes are declared in a single header:
    #include <fstream>
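Disk Files for I/O
[Diagram, with #include <fstream>: input data flows from the disk file "myInfile.dat" through your variable of type ifstream into the executing program; output data flows from the executing program through your variable of type ofstream to the disk file "myOut.dat".]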
Disk I/O
To use disk I/O:
  Include the file stream header with #include <fstream>.
  Choose valid identifiers for your file streams and declare them.
  Open the files and associate them with disk names.
Disk I/O, cont...
  Use your file stream identifiers in your I/O statements (using >> and <<, manipulators, get, ignore).
  Close the files.
Stream Object Hierarchy
The ifstream Class
The ifstream (input file stream) class is derived from istream (input stream class). Thus ifstream inherits the input operator (>>) and member functions such as eof() and fail() from istream. The ifstream class also defines specialized methods specific to working with files.
Opening a File
Opening a file for input places a file reading marker at the very beginning of the file, pointing to the first character in the file.
Input File Streams
Each data file used for input must have an ifstream object associated with it:
    ifstream sensor1;
    sensor1.open("sensor1.dat");
Or just:
    ifstream sensor1("sensor1.dat");
Avoiding Bugs
If opening the named file fails, the fail error bit is set, and all subsequent statements that read from the file are ignored. NO error message is generated, and the program continues to execute. Always check that the open was successful: after a failed open, the fail() method of the stream returns true.
Input File Example
    ifstream sensor1;
    sensor1.open("sensor1.dat");
    if ( sensor1.fail() ) //open failed
    {
        cerr << "File sensor1.dat could not be opened" << endl;
        exit(1); //end execution of the program (exit requires <cstdlib>)
    }
The ofstream Class
The ofstream (output file stream) class is derived from ostream (output stream class). Thus ofstream inherits the output operator (<<) and the member functions of ostream. The ofstream class also defines specialized methods specific to working with files.
Output File Streams
Each data file used for output must have an ofstream object associated with it:
    ofstream sensor1;
    sensor1.open("balloon.dat");
Or just:
    ofstream sensor1("balloon.dat");
Output File Modes
By default, opening a file for output in this way will either create the file if it doesn’t already exist or overwrite a previously existing file. If you wish to append new content to the previously existing file instead, open the file in append mode:
    sensor1.open("balloon.dat", ios::app);
More on File Objects
The close() method of an input or output stream object should be called when the program is done with the file. If close() is not called explicitly, the file is closed automatically when the stream object is destroyed. When you use a string object to represent the file name in code written for versions of C++ before C++11, you must use the c_str() method of string to provide the appropriate type for the open/constructor methods. From C++11 on, these methods also accept a string directly.
Reading Data Files
File Formats
To read a file, some knowledge about the contents of the file is needed:
  Name of the file.
  Order and data types of the values stored in the file.
Three common structures:
  First line in the file contains the number of lines/records in the file.
  Trailer signal/sentinel signal used to indicate the last line/record in the file.
  End-of-file used directly to indicate the last line/record in the file.
Specified Number of Lines
sensor1.dat:
    9
    0.0 132.5
    0.1 147.2
    0.2 148.3
    0.3 157.3
    0.4 163.2
    0.5 158.2
    0.6 169.3
    0.7 148.2
    0.8 137.6
Code:
    //open input file
    ifstream sensor1("sensor1.dat");
    //read the number of data entries
    int numEntries;
    sensor1 >> numEntries;
    //read every row
    double t, y;
    for (int i = 0; i < numEntries; i++)
    {
        sensor1 >> t >> y;
        //do something with the data
    }
Trailer/Sentinel Signal
sensor2.dat:
    0.0 132.5
    0.1 147.2
    0.2 148.3
    0.3 157.3
    0.4 163.2
    0.5 158.2
    0.6 169.3
    0.7 148.2
    0.8 137.6
    -99 -99
Code:
    //open input file
    ifstream sensor2("sensor2.dat");
    //read every row
    double t, y;
    do
    {
        sensor2 >> t >> y;
        if (t < 0 || y < 0) break; //sentinel row reached
        //do something with the data
    } while (! sensor2.eof());
Run Time File Name Entry
    #include <string>
    // Contains conversion function c_str
    ifstream inFile;
    string fileName;
    // Prompt:
    cout << "Enter input file name: " << endl;
    cin >> fileName;
    // Convert string fileName to a C string type
    inFile.open(fileName.c_str());
EOF-based File Input
sensor3.dat:
    0.0 132.5
    0.1 147.2
    0.2 148.3
    0.3 157.3
    0.4 163.2
    0.5 158.2
    0.6 169.3
    0.7 148.2
    0.8 137.6
Code:
    //open input file
    ifstream sensor3("sensor3.dat");
    //read every row
    double t, y;
    while (! sensor3.eof())
    {
        sensor3 >> t >> y;
        if (sensor3.fail()) break; //no complete row was read
        //do something with the data
    }
Generating a Data File
Writing Files
After opening an output file, writing to a file is no different from writing to standard output.
You must decide on a file format:
  Sentinels can be hard to choose so that they do not conflict with valid data.
  Knowing in advance how many lines/records are in the file is sometimes difficult or impractical.
  It is usually best to use an eof-style file format, where the file contains only valid information.
Error Checking
Opening a File
Opening a file associates the C++ identifier for your file with the physical (disk) name for the file.
  If the input file does not exist on disk, the open is not successful.
  If the output file does not exist on disk, a new file with that name is created.
  If the output file already exists, it is erased.
Stream Fail State
When a stream enters the fail state:
  Further I/O operations using that stream have no effect at all.
  The computer does not automatically halt the program or give any error message.
Stream Fail State
Possible reasons for entering the fail state include:
  Invalid input data (often the wrong type).
  Opening an input file that doesn’t exist.
  Opening an output file on a disk that is already full or is write-protected.
Error Checking
In addition to indicating if an error occurs when opening a file, fail() may also be used to detect other types of errors. Errors can occur when trying to read from an input stream when the data on the input stream isn’t of the appropriate type.
Whitespace
Whitespace is defined as blank, tab, newline, form feed, and carriage return characters. It is used to separate data when reading from an input stream. For example, to read 2 integers from an input stream, there must be whitespace between the two numbers. C++ skips the whitespace to read the numbers. If a character that cannot begin a number is encountered where a number is expected, the error (fail) state is set.
Input Stream Errors
Input stream errors do not generate any notification! They only set the error state. While the error state is set, reads will not affect the state of the input buffer or alter variables. However, the program will continue executing! You must “clear” the error state before additional input may be read (using istream’s clear() method).
Stream State
Indicated by a set of state flags: badbit, failbit, eofbit, goodbit.
Events may alter the stream state (event: flag set):
  Initialization of a stream: goodbit
  Failure to open a file: failbit
  Unexpected data encountered: failbit
  End of file encountered: eofbit (and failbit, if the read that reached the end of the file extracted no data)
Stream Class Methods
Method: Description
  bool bad(): Returns true iff badbit is set.
  bool eof(): Returns true iff eofbit is set.
  bool fail(): Returns true iff failbit or badbit is set.
  bool good(): Returns true iff none of badbit, failbit, and eofbit is set.
  void clear(iostate flag = goodbit): Sets the state flags.
  iostate rdstate(): Returns the value of the state flags.
Problem Solving Applied: Data Filters – Modifying an HTML File
Numerical Technique: Linear Modeling*
Linear Modeling
Linear modeling is the name given to the process that determines the linear equation (y = mx + b) that is the best fit to a set of data points. Linear regression does this by minimizing the sum of the squared vertical distances between the line and the data points.
Example
Solving for Linear Model Parameters
Problem Solving Applied: Ozone Measurements*