Chapter 11: Implementing File Systems


1 Chapter 11: Implementing File Systems
File-System Structure
File-System Implementation
Directory Implementation
Allocation Methods
Free-Space Management
Efficiency and Performance
Recovery
(skip 11.8, 11.9)
Objectives
  To describe the details of implementing local file systems and directory structures
  To discuss block allocation and free-block algorithms and trade-offs

2 11.1 File-System Structure
Disk characteristics for storing multiple files
  Can be rewritten in place
  Can directly access any block of information it contains
File structure
  Logical storage unit
  Collection of related information
File system resides on secondary storage (disks)
  Allows the data on disk to be stored, located, and retrieved easily
  How the file system should look to the user
  How to map the logical file system to the physical secondary storage devices
File system organized into layers
File control block (FCB) – storage structure consisting of information about a file

3 Layered File System
[Figure: layered file system, annotated as follows]
  System calls like create(), open(), close()
  Manages metadata information
  Translates logical block addresses to physical block addresses
  Device driver

4 11.2 File-System Implementation
On-disk and in-memory structures are used to implement a file system
On disk
  A boot control block
  A volume control block
  A per-file FCB: file permissions, ownership, size, and location of the data blocks
    In the UNIX File System, it is called an inode
    In Windows NTFS, it is stored as a record in the master file table
  A directory structure
    In the UNIX File System, this includes file names and associated inode numbers
    In NTFS, it is stored in the master file table

5 11.2 File-System Implementation
In-memory information is used for file-system management and performance improvement via caching
  In-memory mount table
  In-memory directory-structure cache
  System-wide open-file table
    A copy of the FCB for each open file
  Per-process open-file table
    A pointer to the appropriate entry in the system-wide open-file table
    In UNIX, it is called a file descriptor
    In Windows, it is called a file handle

6 A Typical File Control Block

7 In-Memory File System Structures
The following figure illustrates the necessary file system structures provided by the operating systems. Figure 11-3(a) refers to opening a file.

8 In-Memory File System Structures
Figure 11-3(b) refers to reading a file.

9 Partitions and Mounting
A disk can be sliced into multiple partitions. A volume can span multiple partitions on multiple disks (RAID, Section 12.7)
Raw disk: no file system. Used for UNIX swap space and by database management systems
Boot information has its own format, and is usually a sequential series of blocks, loaded as an image into memory
  Allows dual booting when multiple OSes are installed
The root partition, containing the OS kernel and other system files, is mounted at boot time
Other volumes can be automatically mounted at boot time or manually mounted later
OS maintains a mount table for mounted file systems

10 11.3 Directory Implementation
Linear list of file names with pointers to the data blocks
  Simple to program but time-consuming to execute
  To create a new file, the directory must be searched to be sure that no existing file has the same name
  To delete a file, we search the directory for the named file, then release the space allocated to it
  To reuse the directory entry, there are several options
    Mark the entry as unused, by assigning it a special name or with a used-unused bit
    Attach it to a list of free directory entries
    Copy the last entry in the directory into the freed location and decrease the length of the directory
  Disadvantage: finding a file requires a linear search
  Keeping the list sorted would speed up searching but complicate creating and deleting files
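
The third reuse option can be sketched as follows. This is only an illustration: the directory is modeled as a Python list of (name, first_block) pairs, and the function names are hypothetical.

```python
def create_entry(directory, name, first_block):
    """Add a file after a linear search confirms the name is unused."""
    if any(entry_name == name for entry_name, _ in directory):
        raise FileExistsError(name)
    directory.append((name, first_block))


def delete_entry(directory, name):
    """Delete by copying the last entry into the freed slot and shrinking."""
    for i, (entry_name, _first_block) in enumerate(directory):  # linear search
        if entry_name == name:
            directory[i] = directory[-1]  # overwrite the freed location
            directory.pop()               # decrease the directory length
            return True
    return False
```

Both operations pay for the linear search, which is the disadvantage noted on the slide.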

11 Directory Implementation
Hash table – a linear list stores the directory entries, and a hash table indexes it
  The hash table takes a value computed from the file name and returns a pointer to the file name in the linear list
  Decreases directory search time
  Some provision must be made for collisions – situations where two file names hash to the same location
  Difficulties
    Fixed size (because it is a table)
    The dependence of the hash function on that size
  Alternatively, a chained-overflow hash table can be used
    Each hash entry is a linked list instead of an individual value
    Collisions are resolved by adding the new entry to the linked list
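
A minimal sketch of the chained-overflow variant. The class and method names are hypothetical, and the hash (sum of the name's bytes modulo the table size) is a toy chosen so that collisions are easy to produce.

```python
class HashedDirectory:
    def __init__(self, table_size=8):
        self.entries = []                               # linear list of (name, first_block)
        self.buckets = [[] for _ in range(table_size)]  # each bucket is a chain of list indices

    def _bucket(self, name):
        # Toy hash: depends on the table size, as the slide notes.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def lookup(self, name):
        """Search only the chain for this name's bucket."""
        for index in self._bucket(name):
            if self.entries[index][0] == name:
                return self.entries[index][1]
        return None

    def add(self, name, first_block):
        if self.lookup(name) is not None:
            raise FileExistsError(name)
        self.entries.append((name, first_block))
        self._bucket(name).append(len(self.entries) - 1)  # collision: extend the chain
```

With a table of size 4, the names "ab" and "ba" land in the same bucket and are both still found through the chain.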

12 11.4 Allocation Methods
An allocation method refers to how disk blocks are allocated for files.
How to allocate space to these files so that disk space is utilized effectively and files can be accessed quickly
Three major methods
  Contiguous allocation
  Linked allocation
  Indexed allocation

13 Contiguous Allocation
Each file occupies a set of contiguous blocks on the disk
Simple – only starting location (block #) and length (number of blocks) are required
Random access (next page)
Problems
  Dynamic storage-allocation problem
    First-fit, best-fit, worst-fit
    Repacking off-line or on-line
  Determining how much space is needed for a file when it is created
    If we allocate too little space to a file, it cannot be extended
    Pre-allocation may be inefficient
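
The three hole-selection strategies named above can be compared with a short sketch. The free-hole list of (start_block, length) pairs and the function name are assumptions made for illustration.

```python
def choose_hole(holes, request, strategy="first"):
    """Return the start block of the hole chosen for `request` blocks, or None."""
    candidates = [hole for hole in holes if hole[1] >= request]
    if not candidates:
        return None
    if strategy == "first":   # first hole that is big enough
        return candidates[0][0]
    if strategy == "best":    # smallest hole that is big enough
        return min(candidates, key=lambda hole: hole[1])[0]
    if strategy == "worst":   # largest hole
        return max(candidates, key=lambda hole: hole[1])[0]
    raise ValueError(strategy)
```

For holes [(0, 5), (10, 3), (20, 8)] and a request of 3 blocks, first-fit picks block 0, best-fit picks block 10, and worst-fit picks block 20.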

14 Contiguous Allocation
Mapping from logical to physical
  LA / 512: quotient Q, remainder R
  Block to be accessed = Q + starting address
  Displacement into block = R
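
The mapping is a single division. A small sketch, assuming 512-unit blocks as on the slide; the function name and the bounds check are additions for illustration.

```python
BLOCK_SIZE = 512  # units per block, as on the slide


def contiguous_map(logical_address, start_block, length_blocks):
    """Map a logical address to (physical block, displacement)."""
    q, r = divmod(logical_address, BLOCK_SIZE)
    if not 0 <= q < length_blocks:
        raise ValueError("address outside the file")
    return start_block + q, r
```

For a file starting at block 14, logical address 1300 gives Q = 2 and R = 276, so the access goes to block 16 at displacement 276.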

15 Contiguous Allocation of Disk Space

16 Extent-Based Systems
Many newer file systems (e.g., Veritas File System) use this modified contiguous allocation scheme
Extent-based file systems allocate disk blocks in extents
  A contiguous chunk of space is allocated initially
  If that amount is not large enough later, another chunk of contiguous space, called an extent, is added
A file consists of one or more extents.
  The location of a file's blocks is recorded as a location and a block count, plus a link to the first block of the next extent

17 Linked Allocation
Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.
The directory contains, for each file, a pointer to the first and last blocks of the file
[Figure: each block holds a pointer to the next block, followed by the contents of the block]

18 Linked Allocation (Cont.)
Simple – need only starting address
Free-space management system – no waste of space
No random access
Mapping (each 512-word block holds a one-word pointer followed by 511 words of data)
  LA / 511: quotient Q, remainder R
  Block to be accessed is the Q-th block in the linked chain of blocks representing the file
  Displacement into block = R + 1
File-allocation table (FAT) – disk-space allocation used by MS-DOS and OS/2
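
A sketch of this mapping, assuming the pointer occupies the first word of each 512-word block. The dictionary that stands in for the on-disk pointers and the function name are hypothetical.

```python
DATA_WORDS = 511  # 512-word block minus one word for the pointer


def linked_map(logical_address, first_block, next_block):
    """Follow the chain Q times, then return (block, displacement)."""
    q, r = divmod(logical_address, DATA_WORDS)
    block = first_block
    for _ in range(q):            # no random access: one hop per block
        block = next_block[block]
        if block is None:
            raise ValueError("address past end of file")
    return block, r + 1           # skip the pointer word
```

With the chain 9 -> 16 -> 1 -> 10 -> 25, logical address 1030 gives Q = 2 and R = 8, so the access lands in block 1 at displacement 9, after two pointer hops.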

19 Linked Allocation
Disadvantages:
1. Can be used effectively only for sequential-access files
2. The space required for the pointers. Solution: collect blocks into clusters, and allocate clusters rather than blocks. Cost is increased internal fragmentation.
3. Reliability: disaster if pointers are lost or damaged. Solution: doubly linked lists, or store the file name and relative block number in each block.


21 File-Allocation Table (MS-DOS and OS/2)
Unused block: a 0 table value
Allocating a new block to a file:
  Find the first 0-valued table entry and replace the previous end-of-file value with the address of the new block
  The 0 table entry is then replaced by the end-of-file value
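
The allocation step can be sketched on an in-memory table. The end-of-file marker value, the reserved entry 0, and the function names are assumptions made for illustration, not the actual MS-DOS encoding.

```python
FREE = 0   # unused block, as on the slide
EOF = -1   # hypothetical end-of-file marker


def fat_append_block(fat, last_block):
    """Allocate the first free block and link it after the file's last block."""
    for block in range(1, len(fat)):   # entry 0 is treated as reserved here
        if fat[block] == FREE:
            fat[last_block] = block    # old end-of-file entry now points to the new block
            fat[block] = EOF           # the 0 entry becomes the new end-of-file
            return block
    raise OSError("no free blocks")


def fat_chain(fat, first_block):
    """List a file's blocks by following table entries to end-of-file."""
    chain, block = [], first_block
    while block != EOF:
        chain.append(block)
        block = fat[block]
    return chain
```

Because the whole chain lives in the table, following it does not require reading the data blocks themselves.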

22 Indexed Allocation
Brings all pointers together into the index block
Each file has its own index block
[Figure: logical view of the index table]

23 Example of Indexed Allocation

24 Indexed Allocation
Need index table
Random access
Dynamic access without external fragmentation, but with the overhead of an index block
With only 1 block for the index table and a block size of 512 words, the maximum file size is 256K (= 512 * 512) words. Mapping from logical to physical:
  LA / 512: quotient Q, remainder R
  Q = displacement into index table
  R = displacement into block

25 Indexed Allocation
If the index block is too small, it will not be able to hold enough pointers for a large file. Mechanisms to handle this issue:
Linked scheme
  The last word in the index block is nil (for a small file) or is a pointer to another index block (for a large file)
Multilevel index
  Use a first-level index block to point to a set of second-level index blocks, which point to the file blocks
Combined scheme
  Example: in the UNIX File System, for the 15 pointers of the index block in the file's inode
    The first 12 point to data of the file
    The next three pointers point to (single, double, triple) indirect blocks

26 Indexed Allocation – Mapping (Cont.)
Mapping from logical to physical in a file of unbounded length (block size of 512 words).
Linked scheme – link blocks of index table (no limit on size).
  LA / (512 x 511): quotient Q1, remainder R1
  Q1 = block of index table
  R1 is used as follows:
  R1 / 512: quotient Q2, remainder R2
  Q2 = displacement into block of index table
  R2 = displacement into block of file
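
The two divisions can be checked with a short sketch. Each index block is taken to hold 511 data pointers plus one link word, which is where the factor 511 comes from; the function name is hypothetical.

```python
BLOCK_WORDS = 512
POINTERS_PER_INDEX_BLOCK = 511  # one word is reserved for the link


def linked_index_map(la):
    """Return (Q1, Q2, R2) for logical address `la` under the linked scheme."""
    q1, r1 = divmod(la, BLOCK_WORDS * POINTERS_PER_INDEX_BLOCK)
    q2, r2 = divmod(r1, BLOCK_WORDS)
    return q1, q2, r2
```

For example, LA = 263175 = 1 * (512 x 511) + 3 * 512 + 7 maps to index block 1, entry 3, displacement 7.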

27 Indexed Allocation – Mapping (Cont.)
Two-level index (maximum file size is 512^3 words)
  LA / (512 x 512): quotient Q1, remainder R1
  Q1 = displacement into outer index
  R1 is used as follows:
  R1 / 512: quotient Q2, remainder R2
  Q2 = displacement into block of index table
  R2 = displacement into block of file
[Figure: outer index -> index table -> file]
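
A sketch of the two-level mapping with 512-word blocks. The function name and the range check are additions for illustration.

```python
BLOCK_WORDS = 512
MAX_FILE_WORDS = BLOCK_WORDS ** 3  # 512 outer entries x 512 inner entries x 512 words


def two_level_map(la):
    """Return (Q1, Q2, R2) for logical address `la` under a two-level index."""
    if not 0 <= la < MAX_FILE_WORDS:
        raise ValueError("address beyond maximum file size")
    q1, r1 = divmod(la, BLOCK_WORDS * BLOCK_WORDS)
    q2, r2 = divmod(r1, BLOCK_WORDS)
    return q1, q2, r2
```

The last addressable word, 512^3 - 1, maps to (511, 511, 511), which confirms the stated maximum file size.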

28 Combined Scheme: UNIX (4K bytes per block)
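
A rough estimate of the largest file the combined scheme can address with 4K-byte blocks. The 4-byte pointer size is an assumption (the slide gives only the block size), and the function name is hypothetical.

```python
BLOCK_BYTES = 4096
POINTER_BYTES = 4  # assumption: not stated on the slide


def max_file_bytes(direct=12):
    """Bytes addressable via 12 direct pointers plus single, double, and triple indirect blocks."""
    per_block = BLOCK_BYTES // POINTER_BYTES  # pointers held by one indirect block
    blocks = direct + per_block + per_block ** 2 + per_block ** 3
    return blocks * BLOCK_BYTES
```

Under these assumptions the triple indirect block dominates: it alone covers 1024^3 blocks, or 4 TiB, while the 12 direct pointers cover only 48 KB.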

29 Performance
Before selecting an allocation method, we need to know how the system will be used
  Contiguous allocation requires only one access to get a disk block.
  Linked allocation is only good for sequential access.
  Some systems support both, but require the type of access to be declared when the file is created.
Keeping the index block in memory requires considerable space.
The performance of indexed allocation depends on the index structure (how many levels), on the size of the file, and on the position of the block desired.
Some systems use contiguous allocation for small files and automatically switch to indexed allocation if the file grows large
Many other optimizations are in use
  It is reasonable to add (hundreds of) thousands of instructions to save a few disk-head movements

30 11.5 Free-Space Management
Bit vector (n blocks, numbered 0, 1, 2, ..., n-1)
  bit[i] = 1 if block[i] is free
  bit[i] = 0 if block[i] is occupied
To find the first free block, scan for the first non-0 word (one word may be 8, 16, or 32 bits), then scan that word for its first 1 bit.
Block number = (number of bits per word) * (number of 0-value words) + offset of first 1 bit
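
The block-number calculation can be sketched as follows. The bit map is modeled as a list of integers, and the sketch assumes the most significant bit of each word corresponds to the lowest-numbered block; both choices and the function name are illustrative.

```python
def first_free_block(words, bits_per_word=32):
    """Return the number of the first free block (first 1 bit), or None."""
    for zero_words, word in enumerate(words):   # zero_words = count of 0-value words skipped
        if word != 0:
            # Offset of the first 1 bit, counting from the most significant bit.
            offset = bits_per_word - word.bit_length()
            return bits_per_word * zero_words + offset
    return None
```

With 8-bit words [0b00000000, 0b00110000], one 0-value word is skipped and the first 1 bit is at offset 2, so the first free block is 8 * 1 + 2 = 10.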

31 Free-Space Management
Bit map requires extra space
Example 1:
  block size = 2^9 bytes = 512 bytes
  disk size = 2^30 bytes (1 gigabyte)
  n = 2^30 / 2^9 = 2^21 bits (or 2^18 bytes = 2^8 KB = 256 KB)
  Clustering the blocks in groups of four reduces this number to 64 KB
Example 2:
  block size = 1024 bytes = 2^10 bytes
  disk size = 40 GB = 10 * 2^32 bytes
  n = 10 * 2^32 / 2^10 = 10 * 2^22 bits (or 10 * 2^19 bytes = 5 * 2^20 bytes = 5 MB)
Easy to get contiguous files
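
The arithmetic can be checked with a few lines. The function name and the `cluster` parameter (blocks per cluster) are hypothetical.

```python
def bitmap_bytes(disk_bytes, block_bytes, cluster=1):
    """Size in bytes of a bit map with one bit per block (or per cluster of blocks)."""
    n_bits = disk_bytes // (block_bytes * cluster)
    return (n_bits + 7) // 8  # round up to whole bytes
```

A 1 GB disk with 512-byte blocks needs 256 KB, or 64 KB with clusters of four blocks; a 40 GB disk with 1024-byte blocks needs 5 MB.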

32 Free-Space Management
Linked list (free list)
  Link together all the free disk blocks, keeping a pointer to the first free block in a special location on the disk and caching it in memory
  The first block contains a pointer to the next free disk block, and so on
  Cannot get contiguous space easily
  No waste of space

33 Free-Space Management
Grouping
  A modification of the linked free list
  Stores the addresses of n free blocks in the first free block
    The first n-1 of these blocks are actually free
    The last block contains the addresses of another n free blocks, and so on
  The addresses of a large number of free blocks can now be found quickly
Counting
  Keeps the address of the first free block and the number n of free contiguous blocks that follow the first block
  Each entry in the free-space list then consists of a disk address and a count
  Useful when space is allocated with the contiguous-allocation algorithm or through clustering
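
A sketch of how a counting list could be built from a set of free block numbers. Following the wording above, each count is the number of free blocks that follow the first one; the function name and the sample block numbers are illustrative.

```python
def to_counted_runs(free_blocks):
    """Collapse free block numbers into (first_block, following_count) entries."""
    runs = []
    for block in sorted(free_blocks):
        if runs and runs[-1][0] + runs[-1][1] + 1 == block:
            runs[-1][1] += 1          # block extends the current contiguous run
        else:
            runs.append([block, 0])   # start a new run with nothing following yet
    return [tuple(run) for run in runs]
```

Fifteen free blocks (2-5, 8-13, 17-18, 25-27) collapse into four entries: (2, 3), (8, 5), (17, 1), (25, 2).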

34 Free-Space Management (extra material)
Need to protect:
  Pointer to free list
  Bit map
    Must be kept on disk
    Copy in memory and disk may differ
    With bit[i] = 1 meaning free (as defined for the bit vector), cannot allow block[i] to be marked occupied in memory (bit[i] = 0) while it is still marked free on disk (bit[i] = 1)
Solution, when allocating block[i]:
  Set bit[i] = 0 on disk
  Allocate block[i]
  Set bit[i] = 0 in memory

35 11.6 Efficiency and Performance
Efficiency depends on:
  Disk allocation and directory algorithms
  Types of data kept in a file's directory entry
In UNIX, the file system's performance is improved by pre-allocating the inodes and spreading them across the volume, and by keeping a file's data blocks near that file's inode block to reduce seek time.
BSD UNIX varies the cluster size as a file grows to reduce internal fragmentation.
Consider which data are kept in a file's directory entry
  Last write date, last access date
The size of the pointers in the allocation list limits the length of a file
  Need to plan for changing technology (FAT from 12 to 16 to 32 bits)
  Early Solaris needed a reboot to change system table sizes

36 Efficiency and Performance
Most disk controllers include local memory to form an on-board cache that is large enough to store entire tracks at a time
Buffer cache – separate section of main memory for blocks that will be used shortly
A page cache caches pages rather than disk blocks, using virtual memory techniques
Free-behind and read-ahead – techniques to optimize sequential access
  Free-behind removes a page from the buffer as soon as the next page is requested
  With read-ahead, a requested page and several subsequent pages are read and cached
Improve PC performance by dedicating a section of memory as a virtual disk, or RAM disk

37 Page Cache
Memory-mapped I/O uses a page cache
Routine I/O through the file system uses the buffer (disk) cache
Unified virtual memory: many OSes use page caching to cache both process pages and file data
This leads to the following figure

38 I/O Without a Unified Buffer Cache
A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file-system I/O
Without it, memory-mapped file data is cached twice (double caching)
[Figures: I/O without a unified buffer cache; I/O using a unified buffer cache]

39 Issues
Whether writes to the file system occur synchronously or asynchronously
  Synchronous writes: the calling routine must wait for the data to reach the disk drive before it can proceed
  Asynchronous writes: done the majority of the time. Data are stored in the cache, and control returns to the caller.
  Metadata writes can be synchronous.
Interactions among the page cache, the file system, and the disk drivers
  When data are written to a disk file, the pages are buffered in the cache, and the disk driver sorts its output queue according to disk address, to minimize disk-head seeks and to write data at times optimized for disk rotation
  Thus, output to the disk through the file system is often faster than input for large transfers
(Skip: lines 3 to 17 of p. 485)

40 11.7 Recovery
Consistency checking – compares data in the directory structure with data blocks on disk, and tries to fix inconsistencies
  fsck in UNIX, chkdsk in MS-DOS
  A special program is run at reboot time to check for and correct disk inconsistencies
  The allocation and free-space management algorithms dictate what types of problems the checker can find and how successfully it can fix them
Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape, other magnetic disk, optical)
Full backup and incremental backup
  Make a full backup, then each day back up all files that have changed since the full backup
  A full backup may have to be saved forever
Recover a lost file or disk by restoring data from backup

