Chapter 11.2 File System Implementation – Part 2.


1 Chapter 11.2 File System Implementation – Part 2

2 11.2/40 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts
Chapter 11: File System Implementation
Chapter 11.1: File-System Structure, File-System Implementation, Directory Implementation
Chapter 11.2: Allocation Methods
Chapter 11.3: Free-Space Management, Recovery, Log-Structured File System

3 11.4 Allocation Methods
An allocation method refers to how the disk blocks that store file data (records) are arranged. There are three primary approaches: contiguous allocation, linked allocation, and indexed allocation.

4 Contiguous Allocation of Disk Space
Each file occupies a set of contiguous blocks on the disk. The blocks are linearly ordered, so reading a file moves the disk head only to the next sector on the track or to the next track within the cylinder. The number of disk seeks is therefore minimal, since all blocks are kept together.
The directory entry needs only the address of the first block and the number of blocks.
File access is straightforward. For sequential access, the file system keeps track of the last block referenced and can readily read the next block (see the FCB format). For random access, if the file starts at block b and we want its block i, we can go directly to block b + i.
Biggest problem: file growth. Does growth require a totally new space, or some other mechanism? Extents may help, as discussed ahead, but this remains a significant problem. Let's see what a file might look like.
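
The b + i calculation above can be sketched in a few lines of Python. This is only an illustration: the `ContiguousEntry` class and the `block_address` function are hypothetical names, not part of any real file system, and the example values are those of the 'mail' file on the next slide.

```python
from dataclasses import dataclass


@dataclass
class ContiguousEntry:
    """Directory entry for a contiguously allocated file (hypothetical layout)."""
    start: int   # disk address of the file's first block (b)
    length: int  # number of blocks the file occupies


def block_address(entry: ContiguousEntry, i: int) -> int:
    """Return the disk block that holds logical block i (0-based) of the file."""
    if not 0 <= i < entry.length:
        raise IndexError(f"logical block {i} is outside the file")
    return entry.start + i


# The 'mail' file from the next slide: starts at block 19, six blocks long.
mail = ContiguousEntry(start=19, length=6)
print(block_address(mail, 0))  # first block of the file
print(block_address(mail, 5))  # last block of the file
```

No disk reads are needed to locate a block; the address is computed from the directory entry alone, which is why both sequential and direct access are cheap here.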

5 Contiguous Allocation of Disk Space - Visual
The directory shows the starting block number and the number of blocks for each file. The file 'count' starts at block 0 on the disk; 'mail' starts at block 19 and runs for six blocks. All allocations are contiguous. Note that there are holes between files. This picture is simplified, however.

6 Contiguous Allocation of Disk Space
Finding space (allocation schemes): both first fit and best fit work fairly well, with first fit generally a bit faster. (We will see ahead how the system keeps track of available blocks.) Worst fit is undesirable in terms of both time and storage utilization.
All contiguous allocation schemes suffer from external fragmentation, which may be a major or a minor problem in managing the overall disk resource.
Downside: most installations schedule downtime during periods of low system usage, in which the disk is compacted and the external fragments are brought together. Compaction can be done off-line, which is generally best. Users get a warning of impending non-availability (at 3 a.m., for example): save your files, the system will not be available for a while. Disks can be reorganized and garbage collected as part of periodic maintenance, system saves, and compaction. More later.
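
To make the three placement policies concrete, here is a minimal Python sketch. It assumes the free space is available as a list of holes, each a (starting block, size) pair; that representation and the function names are assumptions made for illustration, since the slides cover free-space tracking later.

```python
from typing import List, Optional, Tuple

Hole = Tuple[int, int]  # (starting block, number of free blocks)


def first_fit(holes: List[Hole], need: int) -> Optional[int]:
    """Return the start of the first hole that is large enough, or None."""
    for start, size in holes:
        if size >= need:
            return start
    return None


def best_fit(holes: List[Hole], need: int) -> Optional[int]:
    """Return the start of the smallest hole that is large enough, or None."""
    fitting = [(size, start) for start, size in holes if size >= need]
    return min(fitting)[1] if fitting else None


def worst_fit(holes: List[Hole], need: int) -> Optional[int]:
    """Return the start of the largest hole that is large enough, or None."""
    fitting = [(size, start) for start, size in holes if size >= need]
    return max(fitting)[1] if fitting else None


# Made-up free list: holes of 3, 8, and 4 blocks.
holes = [(0, 3), (10, 8), (25, 4)]
print(first_fit(holes, 4), best_fit(holes, 4), worst_fit(holes, 4))
```

First fit can stop at the first match, while best fit and worst fit must examine every hole (unless the list is kept sorted by size), which is one reason first fit tends to be quicker.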

7 Extent-Based Systems
How much space is needed for a file? Often we do not know, and often a file cannot be extended in place. So what can be done?
We can take the system offline, allocate more space, move the data, and then restart the system. This is very costly in run time.
We can overestimate the required space, which is wasteful if the requested space is never actually used.
We can find a larger space, copy the file into it, and release the old space. But this involves downtime, possibly rerunning a process, and other management considerations.
Some systems use extent-based file systems, which allocate disk blocks in extents. An extent is a contiguous set of disk blocks. A file consists of a basic allocation plus one or more extents.
IBM uses a SPACE parameter: a process requests an original allocation of, say, 10 tracks and 2 possible extents of one track each. Ten tracks are allocated, and two are held in reserve and used if needed. Extents are linked in as needed.
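
As a rough sketch of how a file made of a basic allocation plus extents might be addressed, the Python below maps a logical block number to a disk block by walking the list of extents. The list-of-pairs representation, the `locate` name, and the block numbers are assumptions for illustration; the slide's IBM example counts tracks rather than blocks.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (starting block, number of contiguous blocks)


def locate(extents: List[Extent], i: int) -> int:
    """Map logical block i (0-based) to a disk block across the file's extents."""
    if i < 0:
        raise IndexError("logical block number must be non-negative")
    remaining = i
    for start, length in extents:
        if remaining < length:
            return start + remaining  # the block lies inside this extent
        remaining -= length           # skip this extent and keep looking
    raise IndexError(f"logical block {i} is beyond the last extent")


# Made-up file: a basic allocation of 10 units plus two extents of 1 unit each.
file_extents = [(100, 10), (250, 1), (400, 1)]
print(locate(file_extents, 9))   # still inside the basic allocation
print(locate(file_extents, 10))  # first unit of the first extent
```

Within each extent the arithmetic is the same as for contiguous allocation; only the short walk over the extent list is new.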

8 Linked Allocation of Disk Space
Linked allocation avoids the problems of the contiguous allocation scheme. Each file is a linked list of disk blocks, and the blocks may be scattered anywhere on the disk. The directory points to the first block, and each block points to the next block. (The links take up some of the space in each block.)
For a new file, create a new entry in the directory; no final size is needed. The pointer is initially set to null, and each request for space requires the space management system to find a free block and link it in.
There is no external fragmentation, and the file can grow. The disk need not be compacted with this kind of allocation.
Major disadvantage: it is effective only for sequential access. To reach a given block, we must follow the pointers from the start of the file, so it is not efficient when a direct-access capability is needed. The pointers also take up space, which adds up.
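
The cost of direct access under linked allocation can be illustrated with a small Python model. The disk is simulated as a dictionary from block number to a (data, next block) pair; that model, the `read_logical_block` name, and the block numbers are assumptions chosen for the example.

```python
from typing import Dict, Optional, Tuple

# Simulated disk: block number -> (data, number of next block or None at end).
Disk = Dict[int, Tuple[str, Optional[int]]]


def read_logical_block(disk: Disk, first: int, i: int) -> Tuple[str, int]:
    """Return (data, disk reads used) for logical block i (0-based) of a file."""
    if i < 0:
        raise IndexError("logical block number must be non-negative")
    block: Optional[int] = first
    reads = 0
    for _ in range(i):
        _, block = disk[block]  # must read the block just to learn the next pointer
        reads += 1
        if block is None:
            raise IndexError(f"file has fewer than {i + 1} blocks")
    data, _ = disk[block]
    reads += 1
    return data, reads


# Made-up file of five blocks scattered over the disk, starting at block 9.
disk = {9: ("a", 16), 16: ("b", 1), 1: ("c", 10), 10: ("d", 25), 25: ("e", None)}
print(read_logical_block(disk, 9, 3))
```

Reaching logical block i costs i + 1 block reads in this model, compared with a single read under contiguous allocation.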

9 Linked Allocation of Disk Space - Clusters
Often clusters of blocks are allocated rather than single blocks. The pointers then occupy much less space, and efficiency improves because the blocks of a cluster are in contiguous locations. The cost is increased internal fragmentation, because more space is wasted when a cluster is only partially full. Clusters are nevertheless used in most systems.
Linked allocation has inherent dangers, chiefly a lost or damaged pointer. A bad pointer could link into a protected area, link into some other file, or simply lose your data.
Potential solution 1 (often used): a doubly linked list.
Potential solution 2: store the file name and relative block number in each block. But this requires more space, and these overheads add up.
So there are issues with linked allocation. Let's see what linked allocation looks like.

10 Linked Allocation - Visual
Note: only the starting location is stored in the directory; everything else is linked. Question: why might the directory store the last link in addition to the starting link?

11 Linked Allocation with File Allocation Table
Many systems use a FAT (File Allocation Table), a data structure stored on disk at the beginning of each volume. The directory has one entry per file, and this entry holds the number of the file's first block, which is used to index into the FAT. (The FAT is indexed by block number.)
Each FAT entry contains the block number of the next block in the file. The entry for the final block of the file holds a special end-of-file mark. (See next slide.) Remember: this is still linked allocation, so the chain must be followed in order.
Unused blocks have a table value of 0. When more space is needed for a file, the file management system finds an available block (value 0 in the FAT), replaces the file's previous end-of-file value with the new block's number, and marks the new block's entry as end of file. (It is simply a singly linked list.)
Downside: this scheme may result in a lot of disk head movement, since the head must move to the start of the volume to read the FAT and then to the block itself, which slows things down. Solution: cache the FAT.
Advantage: random access is greatly improved, because the location of any block can be found by reading only the FAT, particularly if the FAT is cached.
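
The FAT bookkeeping described above can be sketched as follows. This is a simplified model, not the on-disk format of any real FAT file system: the table is a Python list indexed by block number, 0 marks a free block as on the slide, `EOF = -1` is an assumed end-of-file marker, and block 0 is treated as reserved so that 0 is never a valid "next block" value.

```python
from typing import List

FREE = 0   # table value for an unused block (as on the slide)
EOF = -1   # assumed end-of-file marker for this sketch


def chain(fat: List[int], first: int) -> List[int]:
    """Return the file's blocks in order by following the FAT from `first`."""
    blocks = []
    b = first
    while b != EOF:
        if fat[b] == FREE:
            raise ValueError(f"chain runs into free block {b}")
        blocks.append(b)
        b = fat[b]
    return blocks


def append_block(fat: List[int], first: int) -> int:
    """Grow the file by one block and return the new block's number."""
    # Find any free block; block 0 is skipped because it is reserved here.
    new = next((b for b in range(1, len(fat)) if fat[b] == FREE), None)
    if new is None:
        raise RuntimeError("no free blocks left")
    last = chain(fat, first)[-1]
    fat[last] = new   # the old end-of-file entry now points at the new block
    fat[new] = EOF    # the new block becomes the end of the file
    return new


# Made-up 8-block volume holding one file: 2 -> 5 -> 3.
fat = [FREE] * 8
fat[2], fat[5], fat[3] = 5, 3, EOF
print(chain(fat, 2))
append_block(fat, 2)
print(chain(fat, 2))
```

Finding logical block i still means following i links, but every link is in the table, so with the FAT cached in memory no data blocks need to be read along the way.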

12 File-Allocation Table - Visual
The FAT is indexed by block number.

13 Indexed Allocation of Disk Space
Linked allocation has neither the external fragmentation problem nor the size declaration problem, but without a FAT it also has no efficient direct access, because the pointers are stored within the blocks themselves and must be retrieved one block at a time.
Indexed allocation brings all pointers (links) together into one location: the index block. Each file has its own index block, built as an array of disk block addresses. To access block i of the file, we use entry i of the index, which holds the disk location of that block.

14 Indexed Allocation of Disk Space
Indexed allocation supports direct access with no external fragmentation. Any free block will suffice when a block needs to be added to the file.
Pointer overhead is generally greater than in linked allocation, because the index is kept separately from the data and occupies at least one whole block of disk storage, even for a small file. (The index can be cached during use, and generally is.)
So how large should the index block be? We want it small, since every indexed file has one, but we also want enough entries to support large files. For large files, we might need to link several index blocks. There are several implementations of this, as we shall see.
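
A single-level index block can be modeled as a fixed-size array of block addresses. The Python below is a sketch under that assumption; the names `lookup` and `add_block`, the use of `None` for unused entries, and the block numbers are all illustrative.

```python
from typing import List, Optional

IndexBlock = List[Optional[int]]  # entry i holds the disk block of logical block i


def lookup(index: IndexBlock, i: int) -> int:
    """Return the disk block for logical block i (0-based) in one index access."""
    if not 0 <= i < len(index) or index[i] is None:
        raise IndexError(f"logical block {i} is not allocated")
    return index[i]


def add_block(index: IndexBlock, free_block: int) -> int:
    """Record any free disk block as the file's next logical block."""
    for i, entry in enumerate(index):
        if entry is None:
            index[i] = free_block
            return i
    raise RuntimeError("index block is full; a linked or multilevel index is needed")


# Made-up index block with eight entries, five of them in use.
index = [9, 16, 1, 10, 25, None, None, None]
print(lookup(index, 3))
print(add_block(index, 30))
```

The unused entries show the overhead trade-off: a small file still pays for the whole index block, while a file that fills it needs one of the larger index structures.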

15 Example of Indexed Allocation - Visual
Shows the records in block 19 as well as the unused space.

16 Structure of the Index Block
Linked scheme: an index is usually one block long, but several index blocks can be linked together for particularly large files.
Multilevel index: the first-level index block holds only a set of pointers to second-level index blocks, which in turn point to the data blocks. IBM uses this organization for its indexed sequential files, which it calls Key Sequenced Data Sets (KSDS). It calls the outermost level the index set, followed by the sequence set, followed by the data themselves, organized into what it calls control areas and control intervals. Note: with 4 KB blocks and 4-byte pointers, a two-level index allows a file size of up to 4 GB.
Combined scheme (used by Unix): keeps the first set of pointers of the index block in the file's inode. This scheme involves a number of direct and indirect blocks, and we will not spend time on it.
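
The 4 GB figure follows from simple arithmetic, sketched below. The 4-byte pointer size is an assumption (the slide states only the 4K block size), and `max_file_size` is a hypothetical helper name.

```python
def max_file_size(block_size: int, pointer_size: int, levels: int) -> int:
    """Largest file, in bytes, addressable through a multilevel index."""
    pointers_per_block = block_size // pointer_size
    # Each index level multiplies the number of reachable data blocks.
    data_blocks = pointers_per_block ** levels
    return data_blocks * block_size


# 4 KB blocks and (assumed) 4-byte pointers give 1024 pointers per index block.
one_level = max_file_size(4096, 4, levels=1)
two_level = max_file_size(4096, 4, levels=2)
print(one_level // 2**20, "MB with a single index block")
print(two_level // 2**30, "GB with a two-level index")
```

With 1,024 pointers per block, one index block reaches 1,024 data blocks (4 MB), and a two-level index reaches 1,024 × 1,024 data blocks (4 GB).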

17 Indexed Allocation - Mapping (Cont.)
[Figure: an outer index points to index tables, which point to the blocks of the file.]
General mappings may use multiple indices. Some systems have 'coarse' indices followed by 'fine' indices, and so on.

18 [Figure: an index component (index set, sequence set) and a data component (control areas made up of control intervals).]

19 [Figure: example with index set entries I1 and I2, sequence set entries S1 to S3, and data entries D1 to D4, with labels for the control intervals, control areas, index set, and sequence sets. Key values extremely exaggerated.]

20 Performance
The choice of an allocation method depends largely on how the data need to be accessed.
Contiguous allocation: requires only one access to get a data block. Keep the initial address in memory and calculate disk addresses from there.
Linked allocation: keep the address of the next block in memory and read it directly. Major disadvantage: no efficient random access; reaching a specific block might well require multiple reads.
Some systems use contiguous allocation for files that require direct access and linked allocation for files accessed sequentially. The type of access must be declared when the file is created. Sequential files will be linked. Direct-access files will be contiguous and can support both direct and sequential access, as indexed sequential file organizations do.

21 Performance - 2
Indexed allocation: if the index is in memory, accesses are quick. Retaining the index in memory requires space, but it is often kept in cache. If space is not available, reading a data block requires two I/Os, one for the index and one for the data, which is not desirable. With multiple index blocks, more reads might be needed.
Performance using indexed allocation depends on the index structure, the size of the file, and the position of the desired block. Caching the index block(s) helps significantly if space is available.
There are a number of other approaches to optimization. Your book notes that it is often not unreasonable to add thousands of extra instructions to the operating system to save just a few disk-head movements. "Furthermore, this disparity is increasing over time, to the point where hundreds of thousands of instructions reasonably could be used to optimize head movements." Discuss.

22 End of Chapter 11.2

