File Systems (1). Readings r Silbershatz et al: 10.1,10.2, 11.1-11.5.

File Systems (1)

Readings r Silbershatz et al: 10.1,10.2, 11.1-11.5

Files r Named collection of related information recorded on secondary storage m Logical unit of storage on a device m e.g., helloworld.c, resume.doc r Can contain programs (source, binary) or data r Files have attributes: m Name, type, location, size, protection, creation time etc

File Naming r Files are named r Even though files are just a sequence of bytes, programs can impose structure on them m Files with a certain standard structure imposed can be identified using an extension to their name m Application programs may look for specific file extensions to indicate the file’s type m But as far as the operating system is concerned its just a sequence of bytes

File Types – Name, Extension

File Attributes Possible file attributes

Basic File Operations r Create r Write r Read r Delete r Others m A reposition within the file, append, rename

Basic File Operations r For write/read operations, the operating system needs to keep a file position pointer for each process r All these operations require that the directory structure be first searched for the target system

Directories r File systems use directories to keep track of files r Directory operations m File search m File creation m File deletion m Directory listing m File renaming m File system traversal r Directories are just another type of file.

Logical to Physical View of Files r We just briefly described the logical (user) view of files and directories r Files are stored on secondary storage. r The file system is the mapping from the logical view to the files on secondary storage r File systems require their own algorithms and data structures to support the mapping

File System r There are many file systems r Unix uses the UNIX file system (UFS) r Windows uses FAT, FAT32 and NTFS (Windows NT File system) r Linux file system is known as the extended file system with versions denoted by ext2 and ext3 r Google created its own file system to meet its needs.

File System Structures

File System Overview r File system requires several on-disk and in- memory structures for its implementation r The structures vary depending on the OS and the file system but there appears to be similar general principles

File Control Block r Information about a file may be maintained in File control block (FCB): r One FCB per file r A FCB is associated with a unique identifier r Consists of details about a file

A Typical File Control Block

Directory r A directory structure is associated with each file system r This is used to organize files

Directory Implementation r Linear list of file names with pointer to the data blocks. m simple to program m time-consuming to execute r Hash Table – linear list with hash data structure. m decreases directory search time m collisions – situations where two file names hash to the same location m fixed size

Memory Structures r Directory-Structure Cache: This holds directory information about recently accessed directories r System-wide open file table: Contains a copy of the FCB of each open file r Per-process open-file table: Contains a pointer to the appropriate entry in the system-wide open-file table r Buffers for holding file-system blocks when they are being read from disk or written to disk

File Creation r To create a new file, an application uses the file system r Creation requires the allocation of a new File Control Block (FCB) r The system reads the appropriate directory into memory and updates it with the new file name.

In-Memory File System Structures r The figure is not just for the read file operation but also for other operations e.g., open, write

File Opening r What happens when an open() is called in an application program? r The OS searches the system-wide open- file table to see if the process is in use by another process

File Opening r If the file is in use by another process? m A per-process open-file table entry is created pointing to the existing system-wide open-file table entry r If the file is not in use by another process? m Search the directory structure for the given file name m When found the FCB is copied into a system- wide open-table m A per-process open-file table entry is created pointing to the existing system-wide open-file table entry

File Opening r The open() call returns a pointer (file descriptor) to the appropriate entry in the per-process file-system table r All file operations are performed via this pointer r The entry in the process file-system table may consist of additional information e.g., m Read, write information

File Opening r When a process closes the file, the per- process table entry is removed r A count associated with the system-wide entry’s open count is decremented r In some operating systems, the same structures are used for network programming m Unix: Sockets r Now let us look a file-system layout

File System Layout r File systems usually are stored on disks r Most disks can be divided up into partitions m Independent file systems (for different operating systems) on each partition r Sector 0 of the disk is the Master Boot Record (MBR) r The end of the MBR contains the partition table m Gives the start and end of each partition m One of the partitions is marked as active

File System Layout r The MBR program reads in and executes the code in the MBR r This first thing that is determined is the active partition m The first block of the active partition is read in (boot block) m This program loads the OS contained in that partition

File System Layout Unix File System The Unix Superblock has the root directory

Allocation Methods

r Many files are stored on the same disk r The main problem is how to allocate space so that m Disk space is used effectively and m Files can be access quickly

Allocation Methods r Files are sequences of bytes m Granularity of file I/O is bytes r Disks are arrays of sectors (512 bytes) m Granularity of disk I/O is sectors m File data must be stored in sectors

Allocation Methods r File systems may also define a block size m Block size usually consists of a number of sectors m Contiguous sectors are allocated to a block r File systems view the disk as an array of blocks m Must allocate blocks to file m Must manage free space on disk

Allocation Methods r Approaches to allocating blocks to a file m Contiguous Allocation m Linked List Allocation m Linked List Allocation using Index m Index Allocation

Contiguous Allocation Start and length are stored in the FCB

Contiguous Allocation r Store each file on a contiguous set of disk blocks r Assume a disk of 1-KB blocks m 50 KB file is allocated 50 consecutive blocks r The file’s FCB consists of m Disk address of the first block m Number of blocks in the file r The directory entry often includes file name and the disk address of the first block

Contiguous Allocation r Sequential file access m File system remembers the disk address of the last block reference m When necessary the next block is read r Direct access m For direct access to block i of a file that starts with block b, access block b+i

Contiguous Allocation r Advantages: m Easy to implement m Read performance is excellent

Contiguous Allocation (a) Contiguous allocation of disk space for 7 files (b) State of the disk after files D and E have been removed

Contiguous Allocation r Disadvantages m Fragmentation m Will need periodic compaction (time-consuming) m Will need to manage free lists m If new file is put at end of disk No problem m If new file is put into a “hole”... Have to know a file’s maximum possible size... at the time it is created!

Contiguous allocation r Good for CD-ROMs, DVDs m All file sizes are known in advance m Files are never deleted

Linked List Allocation

r With linked allocation, each file is a linked list of disk blocks r The disk blocks may be scattered anywhere on the disk r The FCB contains pointers to the first and last blocks of the file r Each block contains a pointer to the next block

Linked List Allocation r Advantage m No fragmentation (except internal fragmentation) r Disadvantage m Random access is slow To get to block n, the operating system has to start at the beginning and read the n-1 blocks prior to it

Linked List Allocation Using Index r Take the pointer word from each disk block and put it in a table in memory r The chains are terminated with a special marker (-1). r The table in main memory is called a FAT (File Allocation Table)

Linked List Allocation Using Index File-Allocation Table

Linked List Allocation using Index Linked list allocation using a file allocation table in RAM Fast Random Access

Linked List Allocation using Index

Linked List Allocation Using Index r Advantage m Chain must still be followed to find a given offset within the file, but the chain is entirely in memory No disk references are needed r Disadvantage m Entire table must be in memory all the time m What if you have a 20-GB disk and a 1-KB block size? The table needs 20 million entries, one for each of the 20 million disk blocks Each entry is a minimum of 3 bytes which means 60 MB of memory r This technique was used in MS-DOS and Windows- 98

Indexed Allocation r Linked allocation does not support efficient direct access m Pointers to the blocks are scattered with the blocks themselves r Indexed allocation addresses this problem with bringing all pointers together into an index block

Indexed Allocation r Each file has its own index block r An index block is an array of disk-block addresses r The i th entry in the index block points to the i th block of the file r The directory structure contains the addresses of index blocks

Example of Indexed Allocation

Index Allocation r Every file must have an index block m Index block should be as small as possible m Too small  may not be able to hold enough pointers for a large file m Too large  Waste space

Index Allocation Mechanisms r Linked List m Link several index blocks m An index block might contain: A small header giving the name Set of the first 100 disk-block addresses The next address is nil or a pointer to another index block

Index Allocation Mechanisms r Multilevel index m Use a first-level index block to point to a set of second-level of index blocks m To access a block, the OS uses the first-level index to find a second-level index block m The second-level index block is used to find the desired data block

Index Allocation Mechanisms r Combined m First index block Contains N pointers to direct blocks – contain addresses of disk blocks that contain file data –Good for small files Contains 3 pointers to indirect blocks –Single indirect block: index block with addresses of blocks with data –Double indirect block: Contains the address of a block that contains the addresses of blocks that contain pointers to the actual data blocks –Triple indirect block

Combined Scheme: UNIX (4K bytes per block) FCB – in Unix this is called i-Node

Index Allocation Mechanisms r Combined m First index block Contains N pointers to direct blocks – contain addresses of disk blocks that contain file data –Good for small files Contains 3 pointers to indirect blocks –Single indirect block: index block with addresses of blocks with data –Double indirect block: Contains the address of a block that contains the addresses of blocks that contain pointers to the actual data blocks –Triple indirect block

Index Allocation r What happens when a file needs more blocks? m Reserve the last disk address for the address of a block containing more disk block addresses

I-node An I-node is Unix’s FCB

The UNIX I-node entries Structure of an I-Node: Attributes and addresses of disk blocks

Indexed Allocation r Advantage m Only the index block needs to be in memory when the corresponding file is open r Disadvantage m Updating the structure is more complex

Entry Lookup r Opening a file requires that the OS needs the pathname r The OS uses the pathname supplied by the user to locate the directory entry r How can we locate the root directory (the start of all paths)?

Entry Lookup r In Unix systems m The superblock (among other things) has the location of the i-node which represents the root directory r Once the root directory is located a search through the directory tree finds the desired directory entry r The directory entry provides the information needed to find the disk blocks for the requested file

Entry Lookup r This discussion focuses on Unix-related file systems r When a file is opened, the file system must take the file name supplied and locate its disk blocks r Let’s see how this is done for the path name /usr/ast/mbox

Looking up for an entry The steps in looking up /usr/ast/mbox

Entry Lookup r First the system locates the root directory m There is information for each file and directory within the root directory m A directory entry consists of the file name and i-node number r The file system looks up the first component of the path, usr, to find the i- node for /usr which is i-node 6 r From this i-node the system locates the directory for /usr/ which is in block 132

Looking up for an entry The steps in looking up /usr/ast/mbox

Entry Lookup r The system then searches for ast within the /usr directory which is block 132 r The entry gives the i-node for /usr/ast/ r From this i-node the system can find the directory itself and lookup mbox m The i-node for this file is then read into memory and kept there until the file is closed

Free Space Management

r Limited disk space m Need to reuse the space from deleted files r Bitmap m One bit for each block on the disk m If the block is free the bit is 1 else the bit is 0 m Makes it easy to find a contiguous group of free blocks m Large disks make for large bitmaps that must be kept in memory m Used in Apple machines

Free Space Management r Linked List m Link together all the free disk blocks m Keep a pointer to the first free block in a special location on the disk and cache it in the memory m Not efficient since traversal means each block should be read; this requires substantial I/O m However, traversal is not done that often m Optimization: Stores the addresses of N free blocks in the first free block;

Free Space Management r Indexing m Treat free blocks as a file m This allows for indexing r Use a combination

Summary r We have examined how files are implemented and how one particular implementation is used to support some of the file operations

File Systems (1). Readings r Silbershatz et al: 10.1,10.2, 11.1-11.5.

Similar presentations

Presentation on theme: "File Systems (1). Readings r Silbershatz et al: 10.1,10.2, 11.1-11.5."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

File Systems (1). Readings r Silbershatz et al: 10.1,10.2, 11.1-11.5.

Similar presentations

Presentation on theme: "File Systems (1). Readings r Silbershatz et al: 10.1,10.2, 11.1-11.5."— Presentation transcript:

Similar presentations

About project

Feedback