Text and Binary File Processing
  Programming (程式設計), 潘仁義, CCU COMM
C File I/O Overview (1/3)
  [Figure: User Program -> C Library -> OS File System -> File]
C File I/O Overview (3/3): file-related functions in stdio.h
  Open a file
    FILE *fopen(const char *filename, const char *mode);
  Access a file
    int fgetc(FILE *fp);
    int fputc(int c, FILE *fp);
    int fprintf(FILE *fp, const char *format, …);
    int fscanf(FILE *fp, const char *format, …);
    size_t fread(void *ptr, size_t size, size_t n, FILE *fp);
    size_t fwrite(const void *ptr, size_t size, size_t n, FILE *fp);
    char *fgets(char *s, int n, FILE *fp);
    int fputs(const char *s, FILE *fp);
    int fseek(FILE *fp, long offset, int whence);
    long ftell(FILE *fp);
    int feof(FILE *fp);
    …
  Close a file
    int fclose(FILE *fp);
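The open / access / close sequence can be sketched as follows. This is a minimal illustration, not code from the slides; the function name write_then_read and its parameters are made up for the example.

```c
#include <stdio.h>

/* Write `text` to the file `path`, then reopen the file and read the
   first line back into `buf`. Returns 0 on success, -1 on any failure.
   (write_then_read is an illustrative name, not a library function.) */
int write_then_read(const char *path, const char *text, char *buf, int bufsize)
{
    FILE *fp = fopen(path, "w");      /* 1. open (creates or truncates) */
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {     /* 2. access */
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF)            /* 3. close (flushes buffered output) */
        return -1;

    fp = fopen(path, "r");            /* open again, this time for reading */
    if (fp == NULL)
        return -1;
    char *got = fgets(buf, bufsize, fp);
    fclose(fp);
    return got != NULL ? 0 : -1;
}
```

A caller might use it as write_then_read("demo.txt", "hello file\n", buf, sizeof buf) with a char buf[64]; the file name is arbitrary.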
12.1 Input and Output Files, Review
  text file: a named collection of characters saved in secondary storage
  input (output) stream: a continuous stream of character codes representing textual input (or output) data
  stdin (FILE *): system file pointer for the keyboard's input stream
  stdout, stderr (FILE *): system file pointers for the screen's output streams
TABLE 12.2 Placeholders for printf Format Strings
  Placeholder   Used for Output of
  %c            a single character
  %s            a string
  %d            an integer (in base 10)
  %o            an integer (in base 8)
  %x            an integer (in base 16)
  %f            a floating-point number
  %e, %E        a floating-point number in scientific notation
  %%            a single % sign
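To see the placeholders side by side, the sketch below formats one value per placeholder into a buffer. The function name format_placeholders and the sample values are chosen only for illustration.

```c
#include <stdio.h>

/* Format one sample value per placeholder from Table 12.2 into `out`,
   separated by '|'. Returns what snprintf returns: the number of
   characters the full result needs (excluding the terminating '\0').
   (format_placeholders is an illustrative name.) */
int format_placeholders(char *out, size_t n)
{
    return snprintf(out, n, "%c|%s|%d|%o|%x|%f|%e|%%",
                    'A',      /* %c: a single character      */
                    "hi",     /* %s: a string                */
                    255,      /* %d: base 10 -> 255          */
                    255,      /* %o: base 8  -> 377          */
                    255,      /* %x: base 16 -> ff           */
                    3.14159,  /* %f: six digits after the point by default */
                    1234.5);  /* %e: scientific notation     */
}
```

With a 64-byte buffer this produces A|hi|255|377|ff|3.141590|1.234500e+03|% on a conforming C99 implementation.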
TABLE 12.1 Meanings of Common Escape Sequences
  Escape Sequence   Meaning
  '\n'              new line
  '\t'              tab
  '\f'              form feed (new page)
  '\r'              return (go back to column 1 of current output line)
  '\b'              backspace
TABLE 12.4 Comparison of I/O with Standard Files and I/O with User-Defined File Pointers
  Line  Functions That Access stdin and stdout   Functions That Can Access Any Text File
  1     scanf("%d", &num);                       fscanf(infilep, "%d", &num);
  2     printf("Number=%d\n", num);              fprintf(outfilep, "Number=%d\n", num);
  3     ch = getchar();                          ch = getc(infilep);
  4     putchar(ch);                             putc(ch, outfilep);
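The right-hand column of the table can be exercised with any open stream, including stdin and stdout themselves. A small sketch, with echo_number and echo_char as illustrative helper names:

```c
#include <stdio.h>

/* Read one integer from infilep and print it to outfilep
   (lines 1 and 2 of the table). Returns 1 if a number was read, else 0. */
int echo_number(FILE *infilep, FILE *outfilep)
{
    int num;
    if (fscanf(infilep, "%d", &num) != 1)
        return 0;
    fprintf(outfilep, "Number=%d\n", num);
    return 1;
}

/* Copy one character from infilep to outfilep (lines 3 and 4 of the
   table). Returns the character, or EOF if none could be read. */
int echo_char(FILE *infilep, FILE *outfilep)
{
    int ch = getc(infilep);   /* int, not char, so that EOF fits */
    if (ch != EOF)
        putc(ch, outfilep);
    return ch;
}
```

Calling echo_number(stdin, stdout) behaves like the scanf/printf pair in the left-hand column.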
Figure 12.1 Program to Make a Backup Copy of a Text File
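Figure 12.1 Program to Make a Backup Copy of a Text File (cont'd)
  Can the end-of-file test be replaced with feof(inp)? No, the behavior is slightly different: feof(inp) becomes true only after a read has already failed at end of file.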
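The difference can be made concrete with two copy loops. This is a sketch under the assumption that the copy is done character by character; copy_stream and copy_stream_feof are illustrative names, and the second function is deliberately faulty.

```c
#include <stdio.h>

/* Copy every character from inp to outp, testing the value returned
   by getc. Returns the number of characters copied. */
long copy_stream(FILE *inp, FILE *outp)
{
    long count = 0;
    int ch;
    while ((ch = getc(inp)) != EOF) {
        putc(ch, outp);
        count++;
    }
    return count;
}

/* Faulty variant: tests feof(inp) BEFORE reading. The end-of-file
   indicator is set only after a read has failed, so the final getc
   returns EOF and that value is written out as one stray byte. */
long copy_stream_feof(FILE *inp, FILE *outp)
{
    long count = 0;
    while (!feof(inp)) {
        int ch = getc(inp);   /* returns EOF on the last pass */
        putc(ch, outp);       /* writes (unsigned char)EOF    */
        count++;
    }
    return count;
}
```

For a three-character input the first loop copies 3 characters and the second writes 4. The second loop also never terminates if a read error occurs, because feof stays false in that case.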
Figure 12.2 Input and Output Streams for File Backup Program
File Open Mode
12.2 Binary Files
  Formatted text files
    contain variable-length records
    must be accessed sequentially: to reach a particular record, all records from the start of the file are processed
  Binary files (random access files)
    contain binary numbers that are the computer's internal representation of each file component
    contain fixed-length records
    can be accessed directly, going straight to the record that is required
    are appropriate for online transaction processing systems, e.g. airline reservation, order processing, banking systems
  sizeof
    operator that finds the number of bytes used for storage of a data type
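The contrast between variable-length text and fixed-length binary storage can be checked with a few lines of C. The helper names text_bytes and binary_bytes are illustrative.

```c
#include <stdio.h>

/* Number of characters needed to store n as decimal text.
   snprintf with a NULL buffer and size 0 only computes the length. */
int text_bytes(int n)
{
    return snprintf(NULL, 0, "%d", n);
}

/* Number of bytes needed to store any int in its internal (binary)
   representation: the same for every value. */
size_t binary_bytes(void)
{
    return sizeof(int);
}
```

text_bytes(7) is 1 and text_bytes(1234567) is 7, while binary_bytes() is the same for both values. This fixed size is what allows the position of a record in a binary file to be computed directly.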
The Data Hierarchy
  Bit: smallest data item; value of 0 or 1
  Byte: 8 bits; used to store a character (decimal digits, letters, and special symbols)
  Field: group of characters conveying meaning. Example: your name
  Record: group of related fields; represented as a struct or a class. Example: in a payroll system, a record for a particular employee containing his/her identification number, name, address, etc.
  File: group of related records. Example: payroll file
  Database: group of related files
In a Random Access File …
  Data is unformatted (stored as "raw bytes")
  All data of the same type (ints, for example) use the same amount of memory
  All records of the same type have a fixed length
  Data is not human readable
Random Access
  Access individual records without searching through other records
  Instant access to records in a file
  Data can be inserted without destroying other data
  Data previously stored can be updated or deleted without overwriting other data
  Implemented using fixed-length records
  Sequential files do not have fixed-length records
Random Access a File -- fread()
  fread transfers bytes from a file to a location in memory
  Function fread requires four arguments:
    ret = fread(buffer, size, num, myptr);
    ret: the number of objects actually read
    buffer: address of the first memory cell to fill
    size: size of one value
    num: maximum number of elements to copy from the file into memory
    myptr: file pointer to a binary file opened in mode "rb" using function fopen
  How to distinguish an error from EOF? Use feof() and ferror()
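A short read from fread can mean either end of file or an error. The sketch below shows one way to tell them apart; the enum and the function name read_ints are illustrative.

```c
#include <stdio.h>

/* Illustrative status codes for read_ints. */
enum read_status { READ_FULL, READ_EOF, READ_ERROR };

/* Read up to `num` ints from myptr into buffer. Returns the number of
   ints actually read and reports through *status why reading stopped. */
size_t read_ints(FILE *myptr, int *buffer, size_t num, enum read_status *status)
{
    size_t ret = fread(buffer, sizeof(int), num, myptr);
    if (ret == num)
        *status = READ_FULL;      /* got everything that was asked for */
    else if (ferror(myptr))
        *status = READ_ERROR;     /* short read caused by an I/O error */
    else
        *status = READ_EOF;       /* short read because the file ended */
    return ret;
}
```

The stream passed in is assumed to have been opened in mode "rb" (or another mode that allows binary reading).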
Random Access a File -- fwrite()
  fwrite transfers bytes from a location in memory to a file
    fwrite(&number, sizeof(int), 1, myPtr);
    &number: location to transfer bytes from
    sizeof(int): number of bytes in one element
    1: number of elements to transfer (for arrays); in this case, "one element" of an array is being transferred
    myPtr: file to transfer to
Random Access a File -- fwrite() (II)
  Writing structs
    fwrite(&myObject, sizeof(struct myStruct), 1, myPtr);
    sizeof returns the size in bytes of the object in parentheses
  To write several array elements
    pass a pointer to the array as the first argument
    pass the number of elements to write as the third argument
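Both cases, one struct and several array elements, can be sketched together. The fields of struct myStruct are hypothetical, since the slides do not define them, and write_structs is an illustrative name.

```c
#include <stdio.h>

/* Hypothetical record layout; the slides name the type but not its fields. */
struct myStruct {
    int    id;
    double balance;
};

/* Write one struct, then `count` elements of an array of structs.
   Returns 0 if everything was written, -1 otherwise. */
int write_structs(FILE *myPtr, const struct myStruct *myObject,
                  const struct myStruct *array, size_t count)
{
    /* a single struct: third argument is 1 */
    if (fwrite(myObject, sizeof(struct myStruct), 1, myPtr) != 1)
        return -1;
    /* several elements: array pointer first, element count third */
    if (fwrite(array, sizeof(struct myStruct), count, myPtr) != count)
        return -1;
    return 0;
}
```

Checking the return value of fwrite against the requested count is how a failed or partial write is detected.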
Figure 12.3 Creating a Binary File of Integers
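The figure's listing is not reproduced in this text. As a rough idea of what creating a binary file of integers can look like, here is a minimal sketch; the choice of even numbers, the function name write_evens, and its parameters are assumptions for illustration, not the figure's program.

```c
#include <stdio.h>

/* Write the even integers 2, 4, ..., limit to a binary file, one int
   per fwrite call. Returns how many were written, or -1 on failure.
   (write_evens and its parameters are illustrative.) */
long write_evens(const char *filename, int limit)
{
    FILE *binaryp = fopen(filename, "wb");   /* "wb": write, binary */
    if (binaryp == NULL)
        return -1;

    long written = 0;
    for (int i = 2; i <= limit; i += 2) {
        if (fwrite(&i, sizeof(int), 1, binaryp) != 1) {
            fclose(binaryp);
            return -1;
        }
        written++;
    }
    if (fclose(binaryp) == EOF)
        return -1;
    return written;
}
```

The resulting file holds the ints in the machine's internal representation, so its size is written * sizeof(int) bytes and it is not readable in a text editor.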
Access Data Randomly in a Random Access File
  fseek: sets the file position pointer to a specific position
    fseek(myPtr, offset, symbolic_constant);
    myPtr: pointer to file
    offset: number of bytes to move from the origin (with SEEK_SET, 0 is the first location)
    symbolic_constant: specifies the origin from which the offset is measured
      SEEK_SET: seek starts at beginning of file
      SEEK_CUR: seek starts at current location in file
      SEEK_END: seek starts at end of file
  ftell: returns the current position in a stream
    ftell(myptr)
  [Figure: a file with the origins SEEK_SET, SEEK_CUR (current access position) and SEEK_END marked, and the position reached after moving by offset]
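With fixed-length records, the byte offset of record number index is index * sizeof(record), which is what makes direct access possible. A sketch for a file of ints, with read_nth_int and write_nth_int as illustrative names:

```c
#include <stdio.h>

/* Read the int at 0-based position `index` of a binary file of ints.
   Returns 0 on success, -1 on failure. */
int read_nth_int(FILE *myPtr, long index, int *value)
{
    /* jump straight to the record, measured from the start of the file */
    if (fseek(myPtr, index * (long)sizeof(int), SEEK_SET) != 0)
        return -1;
    return fread(value, sizeof(int), 1, myPtr) == 1 ? 0 : -1;
}

/* Overwrite the int at 0-based position `index` without touching the
   records around it. Returns 0 on success, -1 on failure. */
int write_nth_int(FILE *myPtr, long index, int value)
{
    if (fseek(myPtr, index * (long)sizeof(int), SEEK_SET) != 0)
        return -1;
    return fwrite(&value, sizeof(int), 1, myPtr) == 1 ? 0 : -1;
}
```

The stream is assumed to be open in a binary update mode such as "rb+" or "wb+". After read_nth_int(fp, 7, &v) succeeds, ftell(fp) reports 8 * sizeof(int), the position just past the record that was read.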
12.3 SEARCHING A DATABASE
  database: a vast electronic file of information that can be quickly searched using subject headings or keywords
  Study this section on your own; you may refer to it when doing the homework.
12.4 COMMON PROGRAMMING ERRORS (1/2)
  Remember to declare a file pointer variable (type FILE *) for each file you want to process
  fscanf, fprintf, getc and putc must be used for text I/O only
  fread and fwrite are applied exclusively to binary files
  fscanf, fprintf and getc take the file pointer as their first argument
  putc, fread and fwrite take the file pointer as their last argument
12.4 COMMON PROGRAMMING ERRORS (2/2)
  Opening a file for output by calling fopen with a second argument of "w" or "wb" typically results in the loss of any existing file whose name matches the first argument
  Binary files cannot be created, viewed, or modified using an editor or word processor program
  Depending on the OS, on a file opened for both reading and writing a read cannot directly follow a write, and a write cannot directly follow a read, unless a call such as fseek(fp, 0, SEEK_CUR) comes between them
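The last point can be illustrated with a stream opened for update. The sketch below reads a character, writes over it, and reads again, with a positioning call at each change of direction; capitalize_first is an illustrative name.

```c
#include <stdio.h>
#include <ctype.h>

/* On a stream opened for update (e.g. "r+"), replace the character at
   the current position with its uppercase form, then return the
   character that follows it (or EOF on failure or end of file). */
int capitalize_first(FILE *fp)
{
    long pos = ftell(fp);                /* remember where we are */
    if (pos < 0)
        return EOF;

    int ch = getc(fp);                   /* read */
    if (ch == EOF)
        return EOF;

    if (fseek(fp, pos, SEEK_SET) != 0)   /* reposition before switching to writing */
        return EOF;
    if (putc(toupper(ch), fp) == EOF)    /* write */
        return EOF;

    if (fseek(fp, 0L, SEEK_CUR) != 0)    /* required before switching back to reading */
        return EOF;
    return getc(fp);                     /* read again */
}
```

The fseek(fp, 0L, SEEK_CUR) call does not move the position; it is there only to satisfy the rule that a positioning call must separate a write from the following read.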
Ch. 12 Extras
  Lazy C
    You learn the reason only after a read has failed (other languages do not necessarily wait for a failed read)
    getc() returns EOF; fread returns 0
    feof() and ferror() tell whether it was end of file or a read error
  Want to know the size of a file?
    fseek() + ftell()
    stat(), fstat(), struct stat {... st_size;…};
  The pitfall of "+" modes
    Interleaving reads and writes may produce strange behavior
    man fopen: Reads and writes may be intermixed on read/write streams in any order, and do not require an intermediate seek as in previous versions of stdio. This is not portable to other systems, however; ANSI C requires that a file positioning function intervene between output and input, unless an input operation encounters end-of-file.
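The fseek() + ftell() approach to finding a file's size might look like the sketch below. file_size is an illustrative name. The result is reliable for binary streams on common systems, though the C standard does not guarantee that SEEK_END is meaningful for every binary stream.

```c
#include <stdio.h>

/* Return the size of an open file in bytes, or -1 on failure.
   The original file position is restored before returning. */
long file_size(FILE *fp)
{
    long saved = ftell(fp);              /* where we are now */
    if (saved < 0)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)    /* jump to the end */
        return -1;
    long size = ftell(fp);               /* position at end = size in bytes */
    if (fseek(fp, saved, SEEK_SET) != 0) /* go back */
        return -1;
    return size;
}
```

On POSIX systems, stat() or fstat() gives the same information in the st_size field without moving the file position.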