Published by Clarence Cannon. Modified over 9 years ago.
1
File System Implementation 1 Chapter 9. File System Implementation
Introduction
System V File System
Berkeley Fast File System
Temporary File System
Special-purpose File Systems
Old Buffer Cache
2
File System Implementation 2 Introduction
Two local general-purpose file systems
–System V file system (s5fs)
–Berkeley fast file system (FFS)
S5fs
–original UNIX file system
FFS
–introduced in 4.2BSD
Vnode/vfs
–the version of FFS integrated with the vnode/vfs interface is known as the UNIX file system (ufs)
3
File System Implementation 3 System V File System
On-disk layout: | B | S | inode list | data blocks | (B = boot area, S = superblock)
Boot area
–contains code required to bootstrap the system
Superblock
–contains attributes and metadata of the file system
4
File System Implementation 4 System V File System (cont)
Inode list
–linear array of inodes
–one inode for each file
–size of inode is 64 bytes
–inode list has a fixed size, which limits the maximum number of files the partition can contain
5
File System Implementation 5 S5fs Directories
Contains fixed-size records of 16 bytes
First two bytes: inode number
Next fourteen bytes: filename
Limits
–65535 files per disk partition
–14 characters per filename
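The 16-byte record layout above can be sketched with Python's `struct` module. This is only an illustration: the little-endian byte order and the helper names `pack_entry` and `parse_directory` are assumptions, not something the slides specify.

```python
import struct

# Assumed layout: 2-byte inode number (little-endian here) + 14-byte name.
S5_DIRENT = struct.Struct("<H14s")

def pack_entry(inode_number, name):
    """Build one 16-byte directory record (hypothetical helper)."""
    raw = name.encode("ascii")
    if len(raw) > 14:
        raise ValueError("s5fs file names are limited to 14 characters")
    # struct pads the 14-byte name field with NUL bytes.
    return S5_DIRENT.pack(inode_number, raw)

def parse_directory(block):
    """Split a directory block into (inode_number, name) pairs."""
    entries = []
    for offset in range(0, len(block) - S5_DIRENT.size + 1, S5_DIRENT.size):
        ino, raw = S5_DIRENT.unpack_from(block, offset)
        entries.append((ino, raw.rstrip(b"\0").decode("ascii")))
    return entries
```

Because the inode number occupies two bytes, no record can name an inode above 65535, which is where the per-partition file limit comes from.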
6
File System Implementation 6 S5fs Inodes
On-disk inode and in-core inode
–struct dinode, struct inode
struct dinode
Field      Size (bytes)  Description
di_mode    2             file type, permissions, etc.
di_nlinks  2             number of hard links to file
di_uid     2             owner UID
di_gid     2             owner GID
di_size    4             size in bytes
di_addr    39            array of block addresses
di_gen     1             generation number
di_atime   4             time of last access
di_mtime   4             time file was last modified
di_ctime   4             time inode was last changed
7
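File System Implementation 7 S5fs Inodes (cont)
di_mode: | type (4 bits) | u g s | r w x | r w x | r w x |
–u = suid, g = sgid, s = sticky, followed by the permissions for owner, group, and others
Block array in inode: entries 0 1 2 ... 10 11 12
–entries 0 to 9 point directly to disk blocks
–entry 10: indirect, entry 11: double indirect, entry 12: triple indirect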
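How a logical block number maps onto the inode's block array can be sketched as follows. It assumes entries 0 to 9 are direct and entries 10, 11, and 12 are the single, double, and triple indirect blocks; the figure of 256 block numbers per indirect block (1024-byte blocks, 4-byte block numbers) is an assumption for illustration, and `classify_block` is a hypothetical name.

```python
NDIRECT = 10          # entries 0..9 of the inode block array are direct
PTRS_PER_BLOCK = 256  # assumption: 1024-byte blocks holding 4-byte block numbers

def classify_block(lbn):
    """Return (level, indices) for logical block number lbn.

    level 0 = direct, 1 = indirect, 2 = double indirect, 3 = triple indirect.
    indices[0] is the slot in the inode's block array; any further values
    index into successive indirect blocks.
    """
    if lbn < 0:
        raise ValueError("negative block number")
    if lbn < NDIRECT:
        return 0, [lbn]
    lbn -= NDIRECT
    span = PTRS_PER_BLOCK                 # blocks reachable at this level
    for level in (1, 2, 3):
        if lbn < span:
            indices = []
            for _ in range(level):        # peel off one index per indirect block
                indices.append(lbn % PTRS_PER_BLOCK)
                lbn //= PTRS_PER_BLOCK
            return level, [NDIRECT + level - 1] + indices[::-1]
        lbn -= span
        span *= PTRS_PER_BLOCK
    raise ValueError("block number beyond triple indirect range")
```

For example, `classify_block(3)` stays in the direct range, while `classify_block(10)` is the first block reached through the single indirect block.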
8
File System Implementation 8 S5fs Superblock
Metadata about the file system
–The kernel reads the superblock when mounting the file system and stores it in memory until the file system is unmounted
Contains the following information
–size in blocks of the file system
–size in blocks of the inode list
–number of free blocks and inodes
–free block list, free inode list
  the superblock holds only part of each free list, not the complete list
9
File System Implementation 9 S5fs Kernel Organization
In-core inodes
–struct inode
–contains all the fields of the on-disk inode, and some additional fields, such as
–vnode
  the i_vnode field of the inode contains the vnode of the file
–Device ID of the partition containing the file
–Inode number of the file
10
File System Implementation 10 S5fs Kernel Organization (cont)
–Flags for synchronization and cache management
–Pointers to keep the inode on a free list
–Pointers to keep the inode on a hash queue
  The kernel hashes inodes by their inode numbers, so as to locate them quickly when needed
–Block number of last block read
11
File System Implementation 11 S5fs Kernel Organization (cont)
[Figure: in-core inodes linked on hash queues 0 to 3 and on the inode free list. Inodes shown: i_number = 40, 268, 1056, 8, 73, 17, 593, 11, 199, 27, 103, 86]
12
File System Implementation 12 S5fs Inode Lookup
lookuppn( )
–in the file-system-independent layer
–performs pathname parsing
–parses one component at a time, invoking the VOP_LOOKUP operation
–when searching an s5fs directory, this translates to a call to the s5lookup( ) function
s5lookup( )
–Checks the directory name lookup cache
  In case of a cache miss, it reads the directory one block at a time, searching the entries for the specified file name
13
File System Implementation 13 S5fs Inode Lookup (cont)
–If the directory contains a valid entry for the file, s5lookup( ) obtains the inode number from the entry
–It then calls iget( ) to locate that inode and initialize the vnode
–Finally, iget( ) returns a pointer to the inode to s5lookup( ), which in turn returns a pointer to the vnode to lookuppn( )
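The lookup flow can be sketched in a few lines. This is a simplified model, not the kernel code: directories are represented as lists of blocks of (inode number, name) pairs, the name cache is a plain dictionary, inode number 0 is assumed to mark an unused entry, and the function names are hypothetical. The iget( ) and vnode steps are left out.

```python
def lookup_component(dir_ino, name, name_cache, read_dir_blocks):
    """Return the inode number for name in directory dir_ino, or None."""
    key = (dir_ino, name)
    if key in name_cache:                    # directory name lookup cache hit
        return name_cache[key]
    for block in read_dir_blocks(dir_ino):   # cache miss: scan one block at a time
        for ino, entry_name in block:
            if ino != 0 and entry_name == name:   # assumed: ino 0 = unused slot
                name_cache[key] = ino
                return ino
    return None

def lookup_path(path, root_ino, name_cache, read_dir_blocks):
    """Resolve an absolute path one component at a time."""
    ino = root_ino
    for component in path.strip("/").split("/"):
        if not component:
            continue
        ino = lookup_component(ino, component, name_cache, read_dir_blocks)
        if ino is None:
            raise FileNotFoundError(path)
    return ino
```

A second lookup of the same component is answered from `name_cache` without touching the directory blocks, which is the point of the name lookup cache.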
14
File System Implementation 14 S5fs File I/O
read and write system calls
–accept a file descriptor (the index returned by open)
File descriptor
–used as an index into the descriptor table to obtain the pointer to the open file object (struct file)
–the kernel obtains the vnode pointer from the file structure
Before starting I/O
–the kernel invokes the VOP_RWLOCK operation to serialize access to the file
15
File System Implementation 15 S5fs File I/O (cont)
The kernel then invokes the VOP_READ or VOP_WRITE operation
–This results in a call to s5read( ) or s5write( )
In case of s5read( )
–s5read( ) translates the starting offset to the logical block number
–it then reads the data one page at a time by mapping the block into the kernel virtual address space and calling uiomove( ) to copy the data into user space
16
File System Implementation 16 S5fs File I/O (cont)
  uiomove( ) calls the copyout( ) routine to perform the actual data transfer
  if the page is not in memory, copyout( ) will generate a page fault
  the page fault handler will invoke the VOP_GETPAGE operation on the file's vnode
  in s5fs, VOP_GETPAGE is implemented by s5getpage( )
  the calling process sleeps until the I/O completes
–s5read( ) returns when all data has been read
–the file-system-independent code unlocks the vnode, advances the offset pointer in the file structure, and returns to the user
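The offset-to-block translation at the start of a read can be illustrated with a small planning function. It is a sketch under assumptions: the 1024-byte chunk size and the name `read_plan` are chosen for the example, and the function only computes which pieces would be transferred rather than performing any I/O.

```python
BLOCK_SIZE = 1024  # assumption for illustration

def read_plan(offset, count, file_size, block_size=BLOCK_SIZE):
    """List (logical_block, offset_in_block, nbytes) chunks for a read request."""
    end = min(offset + count, file_size)    # never read past end of file
    plan = []
    while offset < end:
        lbn = offset // block_size           # logical block number
        within = offset % block_size         # starting byte inside that block
        nbytes = min(block_size - within, end - offset)
        plan.append((lbn, within, nbytes))
        offset += nbytes                     # the file offset advances per chunk
    return plan
```

A request that straddles a block boundary becomes two chunks, and a request starting beyond end of file yields an empty plan.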
17
File System Implementation 17 Allocating and Reclaiming Inodes
An inode remains active as long as its vnode has a non-zero reference count
When the count drops to zero, the file-system-independent code invokes the VOP_INACTIVE operation, which frees the inode
When an inode becomes inactive, the kernel puts it on the free list, but does not invalidate it
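The interplay of hash queues, reference counts, and the free list can be modelled with a toy cache. Everything here is an assumption made for illustration: the modulo hash function, the dictionary-based inode, and the method names `iget`, `iput`, and `reclaim` as used in this class.

```python
from collections import OrderedDict

class InodeCache:
    """Toy in-core inode table: hash queues plus a free list of inactive inodes."""

    def __init__(self, nqueues=4):
        self.queues = [dict() for _ in range(nqueues)]  # each queue: {i_number: inode}
        self.free = OrderedDict()                        # inactive inodes, oldest first

    def _queue(self, i_number):
        return self.queues[i_number % len(self.queues)]  # assumed hash function

    def iget(self, i_number, read_from_disk):
        inode = self._queue(i_number).get(i_number)
        if inode is None:                     # not cached: read it from disk
            inode = {"i_number": i_number, "refs": 0,
                     "data": read_from_disk(i_number)}
            self._queue(i_number)[i_number] = inode
        self.free.pop(i_number, None)         # active inodes leave the free list
        inode["refs"] += 1
        return inode

    def iput(self, inode):
        inode["refs"] -= 1
        if inode["refs"] == 0:                # inactive: contents stay valid
            self.free[inode["i_number"]] = inode

    def reclaim(self):
        """Recycle the oldest inactive inode and drop it from its hash queue."""
        i_number, inode = self.free.popitem(last=False)
        del self._queue(i_number)[i_number]
        return inode
```

Because an inactive inode stays on its hash queue until it is reclaimed, reopening a recently closed file does not need another disk read.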
18
File System Implementation 18 Analysis of s5fs
Simple design introduces problems in
–reliability, performance, functionality
Reliability
–superblock contains vital information about the entire file system; there is only one copy, so corrupting it makes the file system unusable
Performance
–s5fs groups all inodes together at the beginning of the file system
  accessing a file requires reading the inode and then the file data, which causes a long seek on the disk
  e.g. ls -l causes a random disk access pattern
19
File System Implementation 19 Analysis of s5fs (cont)
–Disk block allocation is also suboptimal
  After the file system has been used for a while, the order of blocks in the free block list becomes completely random
  This slows down sequential access operations on files, since logically consecutive blocks may be very far apart on the disk
Functionality
–Restriction of file names to 14 characters
20
File System Implementation 20 Berkeley Fast File System
Addresses many limitations of s5fs
Hard disk structure
–platter, disk head, track, sector, cylinder
–head seek, rotational latency
FFS on-disk organization
–FFS divides the partition into one or more cylinder groups, each containing a small set of consecutive cylinders
  This allows UNIX to store related data in the same cylinder group to minimize disk head movement
21
File System Implementation 21 Berkeley FFS (cont)
–Superblock is divided into two structures
  The FFS superblock contains information about the entire file system; it does not change unless the file system is rebuilt
  Each cylinder group has a data structure describing summary information about that group, including the free inode and free block lists
–Each cylinder group contains a duplicate copy of the superblock
  FFS maintains these duplicates at different offsets in each cylinder group, in such a way that no single track, cylinder, or platter contains all copies of the superblock
22
File System Implementation 22 FFS Blocks
Blocks and Fragments
–FFS allows each block to be divided into one or more fragments
–The number of fragments per block may be set to 1, 2, 4, or 8, giving a lower bound of 512 bytes on the fragment size, the same as the disk sector size
–An FFS file is composed entirely of complete blocks, except for the last block, which may contain one or more consecutive fragments
–This scheme reduces space wastage, but requires occasional recopying of file data
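The space saving from fragments can be worked through with a small calculation. It is a sketch only: the 4096-byte default block size and the name `ffs_allocation` are assumptions, and the model ignores any additional rules a real FFS applies to large files.

```python
def ffs_allocation(file_size, block_size=4096, frags_per_block=8):
    """Return (full_blocks, tail_fragments, wasted_bytes) for a file."""
    if frags_per_block not in (1, 2, 4, 8):
        raise ValueError("fragments per block must be 1, 2, 4, or 8")
    frag_size = block_size // frags_per_block
    full_blocks, tail = divmod(file_size, block_size)
    tail_frags = -(-tail // frag_size)       # ceiling division for the last piece
    allocated = full_blocks * block_size + tail_frags * frag_size
    return full_blocks, tail_frags, allocated - file_size
```

With these defaults a 10000-byte file takes two full blocks plus four 512-byte fragments and wastes 240 bytes; with one fragment per block (that is, no fragments) the same file wastes 2288 bytes.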
23
File System Implementation 23 FFS Disk Allocation
Allocation policies
–FFS aims to colocate related information on the disk and optimize sequential access
–1. Attempt to place the inodes of all files of a single directory in the same cylinder group
–2. Create each new directory in a different cylinder group from its parent, so as to distribute data uniformly over the disk
–3. Try to place the data blocks of the file in the same cylinder group as the inode
24
File System Implementation 24 FFS Disk Allocation (cont)
–4. To avoid filling an entire cylinder group with one large file, change the cylinder group when the file size reaches 48 Kbytes and again at every megabyte
–5. Allocate sequential blocks of a file at rotationally optimal positions
  Rotational optimization tries to determine the number of sectors to skip so that the desired sector is under the disk head when the read is initiated
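The idea behind rotational optimization can be shown as simple arithmetic: while the system prepares the next request, the disk keeps turning, so the next block should be placed a few sectors further along the track. The sketch below assumes a constant rotation speed and a fixed number of sectors per track; the function name and the sample figures are illustrative, not values from the slides.

```python
import math

def sectors_to_skip(rpm, sectors_per_track, delay_ms):
    """Sectors that pass under the head during delay_ms of processing time."""
    ms_per_rotation = 60_000 / rpm
    ms_per_sector = ms_per_rotation / sectors_per_track
    return math.ceil(delay_ms / ms_per_sector)
```

For a hypothetical 3600 rpm disk with 32 sectors per track, each sector takes about 0.52 ms to pass, so a 2 ms delay calls for skipping 4 sectors.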
25
File System Implementation 25 FFS Functionality Enhancements
Long file names
–maximum size of the filename is 255 characters
Symbolic links, and atomic rename( )
[Figure: FFS directory. Each entry holds an inode number, an allocation size, a name length, and the name plus extra space (padding).
(a) initial state: | 7 | 4 | 2 | 'f' '1' 0 0 | and | 14 | 8 | 5 | 'f' 'i' 'l' 'e' '2' 0 0 0 |
(b) after deleting file2: | 7 | 24 | 2 | 'f' '1' 0 0 | padding |]
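The deletion shown in the figure, where the freed space is absorbed into the preceding entry, can be sketched as follows. The 8-byte header, the 4-byte name padding, the handling of the first entry, and the function names are assumptions for illustration; the sizes computed here are not meant to reproduce the numbers in the figure.

```python
HEADER = 8  # assumed: 4-byte inode number + 2-byte allocation size + 2-byte name length

def entry_size(name):
    """Bytes needed for an entry: header plus NUL-terminated name padded to 4 bytes."""
    return HEADER + (len(name) + 1 + 3) // 4 * 4

def delete_entry(entries, name):
    """entries is a list of dicts with keys ino, alloc, name.

    Deleting an entry adds its allocation size to the previous entry
    instead of moving the rest of the directory block.
    """
    for i, entry in enumerate(entries):
        if entry["name"] == name:
            if i == 0:
                entry["ino"] = 0             # assumed: first entry is only marked unused
            else:
                entries[i - 1]["alloc"] += entry["alloc"]
                del entries[i]
            return True
    return False
```

After deleting the second of two entries, the first entry's allocation size covers both records, so a later create can reuse the slack.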
26
File System Implementation 26 Analysis of FFS
Substantial performance gains
–read throughput
  29 Kbytes/sec in s5fs, 221 Kbytes/sec in FFS
  CPU utilization: 11% in s5fs, 43% in FFS
–write throughput
  48 Kbytes/sec in s5fs, 142 Kbytes/sec in FFS
  CPU utilization: 29% in s5fs, 43% in FFS
Disk space wastage
–on average half a block per file in s5fs
–on average half a fragment per file in FFS
  more space is required to monitor the free blocks and fragments
27
File System Implementation 27 Analysis of FFS (cont)
Modern SCSI disks do not have fixed-size cylinders
–FFS is oblivious to this
Overall, FFS provides great benefits
–wide acceptance
4.3BSD added two types of caching to speed up name lookups
28
File System Implementation 28 Temporary File Systems
Basic concepts
–Many utilities and applications make extensive use of temporary files to store results of intermediate phases of execution
–Synchronous updates are unnecessary for temporary files, because they are not meant to be persistent
–Addressed by using RAM disks, which provide file systems that reside entirely in physical memory (at the cost of dedicating a large amount of memory)
–RAM disks are implemented by a device driver that emulates a disk
29
File System Implementation 29 Temporary File Systems (cont)
Two implementations
–Memory File System (mfs)
–tmpfs File System
mfs
–Developed by UC Berkeley
–Entire file system is built in the virtual address space of the process that handled the mount operation
–This process does not return from the mount call, but remains in the kernel, waiting for I/O requests to the file system
30
File System Implementation 30 Temporary File Systems (cont)
–Each mfsnode, which is the file-system-dependent part of the vnode, contains the PID of the mount process, which now functions as an I/O server
–The pages of the mfs files compete with all other processes for physical memory
–Using a separate process to handle all I/O requires two context switches for each operation
–The file system still resides in a separate address space, which means we still need extra in-memory copy operations
31
File System Implementation 31 Temporary File Systems (cont)
tmpfs file system
–Developed by Sun Microsystems
–Combines the powerful facilities of the vnode/vfs interface and the new VM architecture
–tmpfs is implemented entirely in the kernel
–All file metadata is stored in non-paged memory, dynamically allocated from the kernel heap
–The data blocks are in paged memory and are represented using the anonymous pages facility in the VM subsystem
32
File System Implementation 32 Temporary File Systems (cont)
–Each page is mapped by an anonymous object (struct anon), which contains the location of the page in physical memory or on the swap space
–The tmpnode, which is the file-system-dependent object for each file, has a pointer to the anonymous map (struct anon_map) for the file
–Pages can be swapped out by the paging system and compete for physical memory
33
File System Implementation 33 Temporary File Systems (cont)
–Advantages of tmpfs
  does not use a separate I/O server and thus avoids wasteful context switches
  holding the metadata in unpaged kernel memory eliminates the memory-to-memory copies and some disk I/O
  the support for memory mapping allows fast, direct access to file data
34
File System Implementation 34 Locating tmpfs pages
[Figure: struct vnode, struct tmpnode, struct anon_map, and struct anon are linked in that order; each struct anon refers either to a page in memory or to a location in the swap area on disk]
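The chain from a tmpfs file to one of its pages can be modelled with a few small classes. This is a sketch only: the Python classes stand in for the kernel structures named on the slide, and their fields (`frame`, `swap_slot`, `slots`, `amap`), the 4096-byte page size, and the function `locate_page` are assumptions rather than the real structure members.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Anon:                              # stands in for struct anon
    frame: Optional[int] = None          # physical page frame, if resident
    swap_slot: Optional[int] = None      # slot in the swap area, if paged out

@dataclass
class AnonMap:                           # stands in for struct anon_map
    slots: Dict[int, Anon] = field(default_factory=dict)  # page index -> Anon

@dataclass
class Tmpnode:                           # file-system-dependent object for a file
    amap: AnonMap = field(default_factory=AnonMap)

def locate_page(node, offset, page_size=4096):
    """Follow tmpnode -> anon_map -> anon to find where a page currently lives."""
    anon = node.amap.slots.get(offset // page_size)
    if anon is None:
        return ("unallocated", None)
    if anon.frame is not None:
        return ("memory", anon.frame)
    return ("swap", anon.swap_slot)
```

The file itself never records whether a page is resident; that information lives in the per-page anonymous object, which is why the paging system can move pages without involving tmpfs.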
35
File System Implementation 35 Special-Purpose File Systems
The specfs file system
–Provides a uniform interface to device files
–The primary purpose of specfs is to intercept I/O calls to device files and translate them to calls to the appropriate device driver routines
The /proc file system
–Provides an elegant and powerful interface to the address space of any process
The processor file system
–Provides an interface to the individual processors on a multiprocessor machine
36
File System Implementation 36 Old Buffer Cache
Background
–Traditional UNIX systems use a dedicated area in memory, called the block buffer cache, to cache blocks accessed through the file system
–The backing store of a cache is the persistent location of the data
–A cache can be write-through or write-behind
–write-through: modified data is written out to the backing store immediately
–write-behind: modified blocks are simply marked as dirty, and written to the disk at a later time
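The difference between the two write policies can be seen in a toy cache. The class below is a sketch under simplifying assumptions: the backing store is a plain dictionary standing in for the disk, there is no eviction or locking, and the names `BlockCache` and `sync` are chosen for the example.

```python
class BlockCache:
    """Toy block cache with a selectable write policy."""

    def __init__(self, backing, write_through):
        self.backing = backing            # dict: block number -> data ("the disk")
        self.write_through = write_through
        self.buffers = {}                 # cached blocks
        self.dirty = set()                # blocks modified but not yet written back

    def read(self, blkno):
        if blkno not in self.buffers:     # miss: fetch from the backing store
            self.buffers[blkno] = self.backing[blkno]
        return self.buffers[blkno]

    def write(self, blkno, data):
        self.buffers[blkno] = data
        if self.write_through:
            self.backing[blkno] = data    # written out immediately
        else:
            self.dirty.add(blkno)         # write-behind: only marked dirty

    def sync(self):
        """Flush all dirty blocks to the backing store."""
        for blkno in sorted(self.dirty):
            self.backing[blkno] = self.buffers[blkno]
        self.dirty.clear()
```

With `write_through=False`, anything still in `dirty` when the system stops has not reached the backing store, which is the data-loss risk of a write-behind cache.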
37
File System Implementation 37 Old Buffer Cache (cont)
Advantages
–Reduces disk traffic and eliminates unnecessary disk I/O
–Synchronizes access to disk blocks through the locked and wanted flags
Disadvantages
–The write-behind nature of the cache means the data may be lost if the system crashes
–Reducing disk access greatly improves performance, but the data must be copied twice
  from disk to buffer, then from buffer to user address space
–Further problems, e.g. cache wiping