Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS 346 – Chapter 12 File systems –Structure –Information to maintain –How to access a file –Directory implementation –Disk allocation methods  efficient.

Similar presentations


Presentation on theme: "CS 346 – Chapter 12 File systems –Structure –Information to maintain –How to access a file –Directory implementation –Disk allocation methods  efficient."— Presentation transcript:

1 CS 346 – Chapter 12 File systems –Structure –Information to maintain –How to access a file –Directory implementation –Disk allocation methods  efficient use, quick access –Managing free space –Efficiency, performance –Recovery

2 Structure File system is usually built on top of disks –Medium can be rewritten in place –Relatively easy to move to another place on disk Purpose of file system –Provide a user interface to access files –Define a mapping between logical files and space on a secondary storage device FS have several levels/layers of abstraction & functionality, e.g. 4 –Logical file system –File organization module –Basic file system –I/O control

3 Layers Logical file system –Maintain file’s metadata: inside a “file control block” aka “inode” –Directory data structure File-organization module –Translates between logical and physical data blocks of a file. In other words it knows everybody’s real address. –e.g. logical block numbers might always start 0 Basic file system –Manipulate specific sectors on disk. –Maintain buffers for file I/O I/O control –Device drivers give machine-language commands to device to accomplish the file I/O. –(Different file systems can use the same device drivers.)

4 FS information On disk: Boot (control) block = first block on a volume. Give inst on how to load the OS. Volume control block = “superblock” –Statistics about the volume: # of blocks, their size, how many are free and which ones Directory data structure: point to each file File control block (inode) for each file (contains what info?) In memory: Which volumes are currently mounted Cache of recently accessed directories (faster access) Which files are currently open –Per process –System-wide Buffers holding currently processing file I/O

5 Opening file open( ) system call passes file name to logical FS See if anyone else already has this file opened. –How? –What if it is? If not already open, search the directory If found, –copy file control block (inode) to system-wide open file table –Set pointer in process’ open file table (Why not the inode?) –Also in process’ table: dynamic stuff like initialize current location within file, whether opened for read or write, etc. Should we copy this to inode also? open( ) returns file descriptor (i.e. pointer to per-process table entry). Use this for future I/O on this file.

6 Multiple file systems We generally don’t have 1 file system in charge of the entire disk Disks usually have partitions… Raw partition –Where you don’t want/need to have files –Ex. Swap space; information related to backups Boot partition – should be treated special / separate –Contains program to ask user which OS to boot –Multiple OS can give rise to different FS Use “virtual file system” to manage multiple FS –Hide from user the fact  > 1 FS

7 Virtual FS See Figure 12.4 Purpose: act as an interface between the logical FS the user interacts with, and the actual local/remote file system Defines essential object types, for example –File metadata, e.g. inode –Info about an open file –Superblock: info about an entire file system –Directory entries For each type of object, set of operations defined, to be implemented by individual FS –Ex. For a file: open, read, write, …

8 Directory rep’n A question of which data structure to use Linear list? –Essentially an array of pointers (we point to the data blocks) –Advantage / Disadvantage? Other data structures are possible: any good? –Sorted list –Binary search tree; B-tree –Hash table

9 1. Contiguous allocation Advantage – few seeks needed on disk –Ex. We would like a file to reside entirely in one track if possible If you know disk address of first block, and length of file, you know where entire file is. Problems –Where to put a new file: dynamic storage allocation: best fit, worst fit, first fit –External fragmentation –Can’t predict a file deletion that would give you a better fit –Don’t know size of brand new file –Preallocating extra space: internal fragmentation Can compact (defragment) files. Tedious operation. File “extents”: a modification to contiguous scheme

10 2. Linked allocation File is linked list of disk blocks (Figure 11.6) File’s directory entry points to first & last blocks (in addition to maintaining other file attributes) Avoids disadvantages of contiguous allocation –No fragmentation, don’t need to know size in advance, … Criticism –Linked list inefficient to access data “directly” as opposed to sequentially. Ex. Editor requests to go to 3 millionth line. –What if 1 of the pointers becomes damaged? –Minor overhead from the pointer in each block. Can define “clusters” of the file to be contiguous blocks, but this suffers some fragmentation.

11 File allocation table Located at start of disk Table has an entry for each disk block –Has fixed size and is indexed by disk number –Purpose of these table entries is to point to next block in a file, like emulating a linked list with an array File’s directory entry contains the starting block number. –See Figure 11.7 Performance problem: –Need to do 2 seek operations every time you go to a new block in the file. Why? Direct access with FAT is faster than pure linked allocation. Why?

12 3. Indexed allocation The file on disk begins with an index block Index block contains pointers to the various disk blocks containing the actual data of the file. When file first created, all pointers set to null. One by one, they get initialized as file grows. File’s directory entry contains block number of the index block. See Figure 11.8 If all blocks on disk are exactly 1KB in size, how big of a file can we support using this scheme?

13 Bigger files Linked indexed allocation: Can continue pointers in another index block. In general, can have a linked list of index blocks. –How big a file can we have with 2 linked index blocks? Multilevel indexed allocation –Begin with a first-level index block. Entries contain addresses of second-level index blocks. –Each second-level index block has pointers to actual file data. –How big a file can we have? Direct & indirect indexed allocation –File’s directory entry can hold several block numbers itself. –Followed by: single indirect block, double indirect block, triple indirect block. See figure 11.9

14 Free space As long as volume is mounted, system maintains free- space “list” of unused blocks. Question is how to represent this info. Bit vector: how much space? Keep it in memory? Collect all free blocks into linked list. –We don’t typically traverse this list. Just grab/insert one. Grouping technique –First “free” block used to store addresses of n – 1 actual free blocks. Last address stores location of another indirect block of free addresses. Counting: store address along with number of contiguous free blocks

15 Efficiency Where should inodes be on the disk? All in one place or scattered about? Using linked allocation (treating data blocks as a linked list on disk) –How to keep a lid on the number of nodes in a list? –How to reduce internal fragmentation? File metadata may include the last time file accessed –How expensive is this operation? Response/alternative? Size of pointers (locations holding address) Should the system’s global tables (process, open files) be fixed or variable length?

16 Performance Some techniques to optimize disk usage Disk controller: store contents of a whole track –Why? What is necessary to accomplish this? Buffer cache and page cache –“Cache” a file’s data blocks / physical pages of virtual memory Caching the pages may be more efficient: –Pages can be individually larger than individual data blocks –Fewer computational steps to do virtual memory access than interfacing with the file system. Not too efficient to employ both kinds of caches –“Double caching problem” with memory-mapped I/O: data first arrives into a page cache because the device is paged… and then copied to/from buffer cache

17 Performance (2) Do you want writes to be synchronous or asynchronous? –Pass a parameter to open( ) system call to specify which you want. –Which is better typically? –When one is preferred over the other? Page replacement: default policy like LRU may be bad in some situations –Consider sequential access to a file: Remove a page as soon as the next one is in use. Request the next few pages in advance.

18 Recovery Need to protect from –Loss of data –Inconsistency (corruption) of data – resulting from what? Consistency checking –Scan file metadata, see if it all makes sense –See if data blocks match a file correctly: traverse all pointers –Check free block list –What if something is wrong? –Is some information more critical than others? What extra protection to give? Log transactions: tell which operations are pending, not complete Full vs. incremental backups

19 Network file system Principles –Each machine has its own file system –Client-server relationships may appear anywhere –Sharing only affects the client To access remote directory, mount it –A remote directory is inserted in place of an existing (empty) directory whose contents now becomes hidden. –It will then look & behave like part of your local file system –Supports user mobility: access your files anywhere in the network Protocol –Server has list of valid file systems that can be made available, and access rights (for each possible client)


Download ppt "CS 346 – Chapter 12 File systems –Structure –Information to maintain –How to access a file –Directory implementation –Disk allocation methods  efficient."

Similar presentations


Ads by Google