© Original by Donald Acton; Changes by George Tsiknis Unix I/O Related Text Sections: 2nd Ed: 10.1 - 10.3 and 10.5-10.10 1st Ed: 11.1 - 11.3 and 11.5-11.10.


1 © Original by Donald Acton; Changes by George Tsiknis Unix I/O Related Text Sections: 2nd Ed: 10.1 - 10.3 and 10.5-10.10 1st Ed: 11.1 - 11.3 and 11.5-11.10

2 Unit 9: Learning Outcomes
At the end of this unit you should be able to:
- write a program that transfers data from standard input to a file, or from a file to standard output, using Unix I/O calls
- write a program that transfers data from one file to another using Unix I/O calls
- write the same programs using C's standard I/O functions
- identify situations in which performance improves if we replace Unix I/O calls with C standard I/O functions, and vice versa

3 The Role of Unix I/O
I/O: the process of copying data between memory and external devices.
- The file system works at the block level.
- Applications work at the byte level.
- Unix I/O converts byte-level access into block-level operations.
Why study it?
- It helps explain how the I/O functions provided by programming languages work.
- The high-level I/O functions provided by programming languages are not suitable for some applications.
- It helps explain other system concepts.
Figure: File System Layering (Application, Unix I/O, File System, Disk Drive, top to bottom).

4 Unix I/O API
Some of the most common Unix I/O API functions used by applications are:
- open()
- close()
- read()
- write()
- lseek()

5 Opening Files
Opening a file informs the kernel that an application wants to access the file, which allows the kernel to set aside resources for it.
Call: int open(const char *filename, int flags)
  (a third argument, mode_t mode, is required when flags includes O_CREAT)
Example:
    char *path;   // file name
    ...
    int source_fd;
    if ((source_fd = open(path, O_RDONLY)) < 0) {
        perror("Open source failed");   // perror appends ": " and the error text itself
        exit(2);
    }

6 Opening (cont'd)
Flags indicating the access type:
- O_RDONLY: read only
- O_WRONLY: write only
- O_RDWR: read/write
- O_CREAT: create the file if it doesn't exist
- O_APPEND: every write goes to the end of the file
- etc.
The options can be combined by bitwise-inclusive OR'ing them, e.g. O_WRONLY | O_APPEND
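
As a minimal sketch of combining flags, the helper below opens a file so that every write lands at its end, creating the file if needed. The function name open_for_append and the 0644 permission bits are assumptions chosen for illustration; the point is that O_CREAT makes the third (mode) argument of open() mandatory.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open `path` so that every write lands at the end.
 * Returns a file descriptor, or -1 on error (errno is set by open()). */
int open_for_append(const char *path)
{
    /* O_CREAT requires a third argument: permission bits for a new file. */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        perror("open");
    return fd;
}
```

Opening the same path twice with this helper and writing two bytes each time should leave a four-byte file, because O_APPEND repositions to the end before each write.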

7 Opening (cont'd)
- open() returns a small integer called a file descriptor.
- A file descriptor is used by Unix to identify files that are opened by the current process.
- The application passes this value back to the kernel in subsequent requests to work with the file.
Each process normally starts with three open files and descriptors:
- 0: standard input (stdin)
- 1: standard output (stdout)
- 2: standard error (stderr)
<unistd.h> contains the constants STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO for them.
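
A small sketch of using the three standard descriptors directly, without opening anything. The helper name put_str is hypothetical; it only wraps write().

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a C string to an already-open descriptor.
 * Returns whatever write() returns (bytes written, or -1 on error). */
ssize_t put_str(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}
```

For example, put_str(STDOUT_FILENO, "result\n") sends text to standard output, while put_str(STDERR_FILENO, "warning\n") sends it to standard error, so the two can be redirected separately by the shell.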

8 Closing Files
Call: int close(int fd)
Closing a file tells the kernel it may free the resources associated with managing the file.
close returns 0 if OK, -1 on error.
Example:
    int rc;
    if ((rc = close(source_fd)) < 0) {
        perror("close");
        exit(10);
    }

9 Reading Files
Call: ssize_t read(int fd, void *buf, size_t n)
- Each open file has a notion of a current position in the stream of bytes.
- read() copies at most n bytes from the current file position to buf and updates the file position.
- read() returns the number of bytes read:
  - returns -1 on error
  - returns 0 if end-of-file (EOF) occurs
- read() may return fewer bytes than requested (short reads).

10 Read Example
    char buf[512];
    ssize_t chars_read;
    chars_read = read(source_fd, buf, sizeof(buf));
    while (chars_read > 0) {
        // Do something with the chars_read bytes in buf
        chars_read = read(source_fd, buf, sizeof(buf));
    }
    if (chars_read < 0) {
        perror("Reading error");
        exit(5);
    }
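
Because read() may return fewer bytes than requested, code that needs an exact amount has to loop. The following is a minimal sketch of such a loop; the name read_fully is hypothetical, and retrying on EINTR is a common convention rather than something the slides prescribe.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until n bytes have arrived,
 * EOF is reached, or an error occurs.
 * Returns the number of bytes read (less than n only at EOF), or -1. */
ssize_t read_fully(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t got = read(fd, p, left);
        if (got < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        if (got == 0)
            break;                 /* EOF: return what we have so far */
        p += got;                  /* short read: advance and ask again */
        left -= (size_t)got;
    }
    return (ssize_t)(n - left);
}
```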

11 Writing Files
Call: ssize_t write(int fd, const void *buf, size_t n)
- write() copies at most n bytes from buf to the current file position and updates the position.
- Returns the number of bytes written:
  - returns -1 on error
- In some cases fewer bytes are written than requested (short writes) without an error having occurred:
  - for example when writing to a network socket (later)

12 Writing Example
    while (chars_read > 0) {
        // write() takes a file descriptor (STDOUT_FILENO), not the FILE * stdout
        if (write(STDOUT_FILENO, buf, chars_read) < chars_read) {
            perror("Write problems");
            exit(4);
        }
        // Do another read and work
    }
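
The example above treats a short write as an error. Where short writes are expected (for instance on sockets), a loop can finish the job instead. This is a sketch under that assumption; write_all is a hypothetical name.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all n bytes are accepted.
 * Returns n on success, or -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t put = write(fd, p, left);
        if (put < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p += put;                  /* short write: send the remainder */
        left -= (size_t)put;
    }
    return (ssize_t)n;
}
```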

13 Seek
Call: off_t lseek(int fd, off_t offset, int whence)
Changes the logical position in the file, i.e. where the next read or write will start.
whence determines how the position changes:
- SEEK_SET: the position is set to offset bytes from the start of the file.
- SEEK_CUR: the position is set to its current location plus offset.
- SEEK_END: the position is set to the size of the file plus offset.
- SEEK_HOLE: the position is set to the start of the next hole at or after offset (a hole is a range of the file with no data stored, which reads back as zeros).
- SEEK_DATA: the position is set to the start of the next non-hole region at or after offset.
lseek returns the resulting position measured from the start of the file, or -1 on error. SEEK_HOLE and SEEK_DATA are not available on every system or file system.

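14 Seek example
    off_t new_offset;
    new_offset = lseek(fd, 2346, SEEK_CUR);   // 2346 bytes forward from the current position
    new_offset = lseek(fd, 10, SEEK_SET);     // 10 bytes from the start of the file
    new_offset = lseek(fd, 25, SEEK_END);     // 25 bytes past the end of the file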
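
One common use of lseek() is to find a file's size (seek to offset 0 from SEEK_END) and then jump to a computed position. The sketch below reads the last n bytes of a file that way; read_tail is a hypothetical helper, not part of the Unix API.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to n bytes from the end of `path` into buf.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_tail(const char *path, char *buf, size_t n)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t got = -1;
    off_t size = lseek(fd, 0, SEEK_END);   /* offset 0 from the end == file size */
    if (size >= 0) {
        /* Start n bytes before the end, or at the beginning for short files. */
        off_t start = (size > (off_t)n) ? size - (off_t)n : 0;
        if (lseek(fd, start, SEEK_SET) >= 0)
            got = read(fd, buf, n);
    }
    close(fd);
    return got;
}
```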

15 Unix I/O Example
A simple program that copies the contents of the file named by argument 1 to the file named by argument 2, or to standard output if argument 2 is omitted (i.e. a simplified cp command):
    cs213copy fname1 [fname2]

16 Pseudo Code
    open argument 1 for input
    if argument 2 is present then
        open argument 2 for output
        connect stdout to this file
    read from input
    while read succeeds
        write to stdout
        read from input

17 Example: Unix Copy Command
    #include <fcntl.h>    // open, O_* flags
    #include <stdio.h>    // printf, perror
    #include <stdlib.h>   // exit
    #include <unistd.h>   // read, write, close, dup2

    int main(int argc, char **argv) {
        if (argc <= 1) {
            printf("Usage: cs213cp source_file [destination_file]\n");
            exit(1);
        }
        int source_fd;
        if ((source_fd = open(argv[1], O_RDONLY)) < 0) {
            perror("Open source failed");
            exit(2);
        }

18 Example (cont'd)
        int dest_fd;
        if (argc > 2) {
            // O_TRUNC discards any old contents if the destination already exists
            if ((dest_fd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0) {
                perror("Destination open failed");
                int rc;
                if ((rc = close(source_fd)) < 0) {
                    perror("close");
                    exit(10);
                }
                exit(3);
            }
            dup2(dest_fd, STDOUT_FILENO);   // stdout now refers to the destination file
        }

19 Example (cont'd)
        char buf[512];
        ssize_t chars_read;
        chars_read = read(source_fd, buf, sizeof(buf));
        while (chars_read > 0) {
            if (write(STDOUT_FILENO, buf, chars_read) < chars_read) {
                perror("Write problems");
                exit(4);
            }
            chars_read = read(source_fd, buf, sizeof(buf));
        }
        if (chars_read < 0) {
            perror("Reading error");
            exit(5);
        }
        return 0;
    }

20 Getting File Metadata
In Linux, most of the information kept in a file's inode is made available to applications in a structure called struct stat:
    struct stat {
        dev_t           st_dev;      /* Device. */
        ino_t           st_ino;      /* File serial number. */
        mode_t          st_mode;     /* File mode. */
        nlink_t         st_nlink;    /* Link count. */
        uid_t           st_uid;      /* User ID of the file's owner. */
        gid_t           st_gid;      /* Group ID of the file's group. */
        dev_t           st_rdev;     /* Device number, if device. */
        off_t           st_size;     /* Size of file, in bytes. */
        blksize_t       st_blksize;  /* Optimal block size for I/O. */
        blkcnt_t        st_blocks;   /* Number of 512-byte blocks allocated. */
        struct timespec st_atim;     /* Time of last access. */
        struct timespec st_mtim;     /* Time of last modification. */
        struct timespec st_ctim;     /* Time of last status change. */
    };

21 Getting File Metadata (cont'd)
To retrieve information about a file, an application can use the following functions, which are declared in <sys/stat.h>:
    int stat(const char *filename, struct stat *buf);
or
    int fstat(int fd, struct stat *buf);
Both functions fill in buf with the file's information. With stat the file is given by its file name, while fstat accepts a file descriptor. Both return 0 on success and -1 on error.
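
As a short illustration of both calls, the sketch below uses fstat() to get a file's size from a descriptor and stat() to test whether a path names a directory. The helper names file_size and is_directory are hypothetical; S_ISDIR is the standard macro for testing st_mode.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: size in bytes of the file behind fd, or -1 on error. */
off_t file_size(int fd)
{
    struct stat sb;
    if (fstat(fd, &sb) < 0)
        return -1;
    return sb.st_size;
}

/* Hypothetical helper: 1 if path names a directory, 0 if not, -1 on error. */
int is_directory(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) < 0)
        return -1;
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```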

22 Unix I/O and Devices
By making everything appear to be a file, the kernel can provide a single simple interface for performing I/O to a variety of devices.
Recall that the basic operations are:
- Opening and closing files: open() and close()
- Changing the current file position: lseek()
- Reading and writing files: read() and write()

23 Adding Other Devices
Most devices are producers or consumers of streams of data, and so fit the Unix I/O API model described above.
    Device         Role
    Mouse          producer
    Joystick       producer
    Keyboard       producer
    Display        consumer
    Audio device   consumer
    Tape           both

24 New Devices
Figure: the layering diagram with the Application on top of UNIX I/O, which sits above the File System (and its Disk Drive) and the new devices: Keyboard, Terminal, Tape, Audio.

25 The Kernel's View of Open Files
- Calls to routines like open(), socket(), pipe(), etc. return file descriptors.
- A file descriptor is just a small integer that Unix uses to identify files that are opened by the current process.
- When this integer is passed back to the kernel via calls like read() or write(), the kernel manipulates the opened "file" the descriptor corresponds to.

26 The Kernel's View of Open Files (cont'd)
- Each process has its own separate descriptor table.
- The file descriptor is just an index into this table.
- Each entry in the descriptor table identifies an entry in a shared, system-wide open file table.
- Each time open() succeeds, it creates an entry in the open file table.

27 Open File Table
Shared by all processes. Each entry in the open file table contains:
- a pointer to the entry in the v-node table that corresponds to that file
- the current position in the file, and
- a reference count of its usage (close() decrements the count)
v-node: virtual inode, i.e. a cache of an inode
- contains the information in the file's stat structure
- may contain pointers to buffers/caches for the file/device
- identifies the legal operations on the file/device

28 The Kernel View of Open Files
Figure (adapted from Computer Systems: A Programmer's Perspective): the descriptor table (one table per process) with entries fd 0 (stdin), fd 1 (stdout), fd 2 (stderr), fd 3, fd 4; the open file table (shared by all processes), whose entry for File A holds the file position and refcnt=1; and the v-node table (shared by all processes), whose entry holds the file access, file size, and file type. Each entry in the open file table is one struct.

29 Sharing Files
At this point we have:
- File descriptors
- The open file table
- V-nodes
It is relatively easy to explain what happens when file sharing results from:
- open calls in the same process
- open calls in different processes
- fork()
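
For the fork() case, the child receives a copy of the parent's descriptor table, so both processes refer to the same open file table entry and therefore share one file position. The sketch below makes that visible; the function name position_after_child_read is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: open `path`, fork, let the child read n bytes, and
 * return the parent's file position afterwards (-1 on error).
 * Because the position is shared, the result is n when the file holds
 * at least n bytes. */
off_t position_after_child_read(const char *path, size_t n)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {                        /* child: inherits fd */
        char byte;
        for (size_t i = 0; i < n; i++)
            if (read(fd, &byte, 1) != 1)
                _exit(1);
        _exit(0);
    }

    waitpid(pid, NULL, 0);                 /* wait until the child has read */
    off_t pos = lseek(fd, 0, SEEK_CUR);    /* query position without moving it */
    close(fd);
    return pos;
}
```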

30 Actions on open()
Figure (adapted from Computer Systems: A Programmer's Perspective): the state after fd = open("B",…). The descriptor table (one per process) holds fd 0 (stdin), fd 1 (stdout), fd 2 (stderr), fd 3, fd 4. The open file table (shared by all processes) now has separate entries for File A and File B, each with its own file position and refcnt=1, and each pointing to its own v-node table entry (file access, file size, file type).

31 Same File, Different Process
Figure (adapted from Computer Systems: A Programmer's Perspective): two processes, each with its own descriptor table (fd 0 stdin, fd 1 stdout, fd 2 stderr, fd 3, fd 4). When the second process executes fd = open("A",…), it gets its own open file table entry with its own file position and refcnt=1, while both entries point to the same v-node table entry for File A (file access, file size, file type). File B keeps its own entries.

32 Same File, Same Process
Figure (adapted from Computer Systems: A Programmer's Perspective): one process calls fd = open("A",…); on a file it already has open. Two descriptors in the same descriptor table now refer to two distinct open file table entries, each with its own file position and refcnt=1, and both entries point to the single v-node table entry for File A.
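
Since each successful open() creates its own open file table entry, two descriptors obtained by opening the same file twice have independent positions. The sketch below checks this; second_position_after_first_read is a hypothetical name.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: open `path` twice, read one byte through the first
 * descriptor, and return the position of the second descriptor.
 * Expected result: 0, because the two opens do not share a position.
 * Returns -1 on error. */
off_t second_position_after_first_read(const char *path)
{
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);
    off_t pos = -1;
    char byte;

    if (fd1 >= 0 && fd2 >= 0 && read(fd1, &byte, 1) == 1)
        pos = lseek(fd2, 0, SEEK_CUR);   /* query fd2's position only */

    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return pos;
}
```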

33 close()
Figure: the state after close(4). Entry fd 4 in the descriptor table (one per process) becomes empty, and the refcnt of the open file table entry it referred to drops from 1 to 0. The other entries in the open file table and the v-node table (both shared by all processes) are unchanged.

34 I/O Redirection
    deas:> ls > /tmp/out
The above causes standard output (file descriptor 1) to be set to /tmp/out.
Figure (adapted from Computer Systems: A Programmer's Perspective): the process file descriptor table, in which fd 1 now refers to the open file table entry for /tmp/out (refcnt=1) instead of the entry for the terminal, whose refcnt drops from 4 to 3.

35 dup2
The Unix system call dup2, which has the form dup2(fd, newfd), copies descriptor table entry fd to entry newfd. If newfd was already open, it is closed first.
Figure (adapted from Computer Systems: A Programmer's Perspective): before dup2(4,1), fd 1 refers to entry a and fd 4 refers to entry b; afterwards both fd 1 and fd 4 refer to b.

36 dup2 example
    int fd = open("/tmp/out",…);
    dup2(fd,1);
    close(fd);
Figure: the process file descriptor table (fd 0 to fd 4) and the open file table entries for the terminal and /tmp/out, annotated with their reference counts as each call executes (labels in the figure: refcnt=1, refcnt=0, refcnt=2).
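
The same open/dup2/close pattern can be wrapped so that the redirection is temporary. This is a sketch only: write_redirected is a hypothetical helper, and saving the original descriptor with dup() so it can be restored is an addition that the slide's example does not perform.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: point descriptor 1 at `path`, write `len` bytes of
 * `msg` through it, then restore the original standard output.
 * Returns 0 on success, -1 on error. */
int write_redirected(const char *path, const char *msg, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int saved = dup(STDOUT_FILENO);          /* remember the original stdout */
    if (saved < 0) {
        close(fd);
        return -1;
    }

    int rc = -1;
    if (dup2(fd, STDOUT_FILENO) >= 0) {      /* descriptor 1 now refers to the file */
        if (write(STDOUT_FILENO, msg, len) == (ssize_t)len)
            rc = 0;
        if (dup2(saved, STDOUT_FILENO) < 0)  /* put the original stdout back */
            rc = -1;
    }
    close(fd);                               /* the file stays open via fd 1 only while redirected */
    close(saved);
    return rc;
}
```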

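37 Pipe and fork
Figure: the parent's and the child's descriptor tables (fd 0 stdin, fd 1 stdout, fd 2 stderr, fd 3, fd 4) and the v-node table entries for the Keyboard, the Terminal, and the two ends of the pipe (pipe1, pipe2). After pipe() and fork(), each process uses dup2() to connect one end of the pipe to a standard descriptor and close() to drop the descriptors it no longer needs, so that data flows from one process to the other. Calls shown in the figure: pipe(), fork(), dup2(3,1), close(4), dup2(3,0), close(3), close(4), close(3).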
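
A minimal sketch of the pipe-and-fork pattern, written with the array returned by pipe() rather than literal descriptor numbers. Here the child is the producer and the parent the consumer; that assignment, and the name read_from_child, are assumptions for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child whose standard output feeds a pipe,
 * and collect up to `cap` bytes of what it writes.
 * Returns the number of bytes collected, or -1 on error. */
ssize_t read_from_child(const char *msg, size_t len, char *out, size_t cap)
{
    int pfd[2];                        /* pfd[0]: read end, pfd[1]: write end */
    if (pipe(pfd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }
    if (pid == 0) {                    /* child: producer */
        close(pfd[0]);                 /* the child never reads from the pipe */
        if (dup2(pfd[1], STDOUT_FILENO) < 0)
            _exit(1);
        close(pfd[1]);                 /* fd 1 now refers to the write end */
        if (write(STDOUT_FILENO, msg, len) < 0)
            _exit(1);
        _exit(0);
    }

    close(pfd[1]);                     /* parent must close the write end to see EOF */
    size_t total = 0;
    ssize_t got = 0;
    while (total < cap && (got = read(pfd[0], out + total, cap - total)) > 0)
        total += (size_t)got;
    close(pfd[0]);
    waitpid(pid, NULL, 0);
    return got < 0 ? -1 : (ssize_t)total;
}
```

Closing the unused ends matters: if the parent kept its copy of the write end open, its read loop would never see end-of-file.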

38 How Can We Improve Performance?
Given what we know, are there interesting things we can do at the application layer to speed things up?
A system call is far more expensive than an ordinary function call, often by orders of magnitude.
Figure: File System Layering (Application, Unix I/O, File System, Disk Drive).

39 Caching in the Application
Applications can use caching to improve performance, just like the kernel.
Most I/O has both:
- Spatial locality
- Temporal locality
Application-level functions in the Standard I/O library of C take advantage of this and define I/O as a stream. They:
- use a buffer to keep the recently used part of the file
- update the buffer when needed
All these functions are declared in the header stdio.h.
We'll call this standard I/O (stdio) or stream I/O to distinguish it from Unix I/O.
Figure: File System Layering with a Buffered I/O layer between the Application and Unix I/O, above the File System and Disk Drive.

40 STDIO
Each Unix I/O call has a corresponding stdio call, which is a standard C library function:
- open() -> fopen()
- close() -> fclose()
- read() -> fread()
- write() -> fwrite()
Instead of returning a file descriptor, fopen() returns a FILE *.
The FILE struct contains:
- the actual file descriptor
- a pointer to a buffer (called the stream buffer)
- the position in the buffer
- other bookkeeping information
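
To show the correspondence in practice, here is a sketch of the copy loop written with fread() and fwrite() instead of read() and write(). The function name copy_stream is hypothetical; the 512-byte buffer simply mirrors the size used in the Unix I/O example.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from `in` to `out` using stream I/O.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    char buf[512];
    size_t got;
    long total = 0;

    /* fread() returns the number of items read; 0 means EOF or error. */
    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, got, out) != got)
            return -1;
        total += (long)got;
    }
    /* Distinguish EOF from a read error, and push buffered output out. */
    if (ferror(in) || fflush(out) != 0)
        return -1;
    return total;
}
```

Calling copy_stream(stdin, stdout) gives the standard-input-to-standard-output case; passing streams from fopen() covers file-to-file copies.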

41 How it works: writes
When fwrite() is called, bytes are copied to the stream buffer.
If the stream buffer fills during the fwrite():
- write() is called to "write" the stream buffer
- the stream buffer is cleared
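
The effect of the stream buffer can be observed by asking the kernel for the file's size before and after a flush. The sketch below assumes a fully buffered stream (requested explicitly with setvbuf) and a write much smaller than the buffer; buffered_write_demo and size_on_disk are hypothetical names, and the exact moment data reaches the kernel is up to the C library.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file at `path` as the kernel sees it, or -1 on error. */
static long size_on_disk(const char *path)
{
    struct stat sb;
    return stat(path, &sb) == 0 ? (long)sb.st_size : -1;
}

/* Hypothetical demo: fwrite() `len` bytes, then report the kernel-visible
 * file size before and after fflush(). Returns 0 on success, -1 on error. */
int buffered_write_demo(const char *path, const char *data, size_t len,
                        long *before_flush, long *after_flush)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IOFBF, 4096);        /* ask for a fully buffered stream */

    if (fwrite(data, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }
    *before_flush = size_on_disk(path);     /* bytes are still in the stream buffer */

    if (fflush(fp) != 0) {                  /* forces the underlying write() */
        fclose(fp);
        return -1;
    }
    *after_flush = size_on_disk(path);

    return fclose(fp) == 0 ? 0 : -1;
}
```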

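42 fwrite()
Figure: fwrite() copies bytes into the stream buffer, which is tracked by a buffer offset and the fd; write() carries the buffer contents across the kernel boundary into the cached file block.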

43 How it works: reads
When fread() is called, bytes are copied from the stream buffer to the location designated by the application.
If the stream buffer empties during the fread():
- read() is called to refill the stream buffer
- the position in the stream buffer is reset
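
The read side can be sketched with a toy stream type. This is not how any particular C library implements FILE; struct stream, stream_init, and stream_getc are hypothetical names, and the refills counter exists only to make the number of read() calls visible.

```c
#include <unistd.h>

#define SBUF_SIZE 4096

/* Toy stand-in for FILE: a descriptor plus a stream buffer. */
struct stream {
    int fd;
    char buf[SBUF_SIZE];
    size_t pos;              /* next unread byte in buf */
    size_t len;              /* number of valid bytes in buf */
    unsigned long refills;   /* how many times read() was called */
};

void stream_init(struct stream *s, int fd)
{
    s->fd = fd;
    s->pos = 0;
    s->len = 0;
    s->refills = 0;
}

/* Return the next byte (0..255), or -1 on EOF or error. */
int stream_getc(struct stream *s)
{
    if (s->pos == s->len) {                       /* buffer empty: refill it */
        ssize_t got = read(s->fd, s->buf, sizeof s->buf);
        s->refills++;
        if (got <= 0)
            return -1;
        s->len = (size_t)got;
        s->pos = 0;                               /* position in buffer is reset */
    }
    return (unsigned char)s->buf[s->pos++];
}
```

Reading a five-byte input one byte at a time through stream_getc costs one read() call for the data plus one more to discover end-of-file, instead of one call per byte.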

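44 fread()
Figure: read() brings data across the kernel boundary from the cached file block into the stream buffer, which is tracked by a buffer offset and the fd; fread() then copies bytes from the stream buffer to the application.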

45 Analysis
Costs compared with making the system calls directly:
- Extra buffer space is needed.
- There is one extra set of copies.
- Bookkeeping is needed to keep the stream buffer consistent with the real file position.
- I/O to random locations can be inefficient.
Advantages over making the system calls directly:
- If each application I/O request is much smaller than the underlying buffer, buffering greatly reduces the number of system calls.
- System calls are very expensive.
Common practice:
- Use standard I/O everywhere except with sockets (networks).
- Use Unix I/O for network connections.
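
The reduction in system calls can be counted with a toy write buffer. Everything here is a sketch: struct wstream and its functions are hypothetical, the 4096-byte buffer is an assumed size, and /dev/null is used only as a convenient sink.

```c
#include <fcntl.h>
#include <unistd.h>

#define WBUF_SIZE 4096

/* Toy write-side stream: a descriptor, a buffer, and a call counter. */
struct wstream {
    int fd;
    char buf[WBUF_SIZE];
    size_t len;              /* bytes waiting in buf */
    unsigned long writes;    /* number of write() calls issued */
};

/* Hand all buffered bytes to the kernel. Returns 0 on success, -1 on error. */
int wstream_flush(struct wstream *s)
{
    size_t off = 0;
    while (off < s->len) {
        ssize_t put = write(s->fd, s->buf + off, s->len - off);
        s->writes++;
        if (put < 0)
            return -1;
        off += (size_t)put;
    }
    s->len = 0;                                   /* stream buffer is cleared */
    return 0;
}

/* Queue one byte, flushing first if the buffer is already full. */
int wstream_putc(struct wstream *s, char c)
{
    if (s->len == sizeof s->buf && wstream_flush(s) < 0)
        return -1;
    s->buf[s->len++] = c;
    return 0;
}

/* Send nbytes one byte at a time to /dev/null through the buffer.
 * Returns the number of write() calls that were needed, or 0 on error. */
unsigned long count_write_calls(size_t nbytes)
{
    struct wstream s = { .fd = open("/dev/null", O_WRONLY), .len = 0, .writes = 0 };
    if (s.fd < 0)
        return 0;
    for (size_t i = 0; i < nbytes; i++) {
        if (wstream_putc(&s, 'x') < 0) {
            close(s.fd);
            return 0;
        }
    }
    int rc = wstream_flush(&s);                   /* final partial buffer */
    close(s.fd);
    return rc == 0 ? s.writes : 0;
}
```

With these assumptions, 10,000 one-byte requests need 3 write() calls (two full buffers and one partial), where unbuffered code would issue 10,000.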

46 Summary
Unix provides a small number of system calls for applications to deal with files and devices:
- open(), close(), read(), write(), lseek()
- the functions stat() and fstat() for file information
Unix uses three data structures to maintain open files:
- a descriptor table in each process
- the open file table, shared by all processes
- the v-node table, shared by all processes
These structures allow easy file sharing and redirection.
The standard I/O library is implemented on top of Unix I/O and provides buffered I/O, which is good for most applications except networking.

