Chapter 11: File-System Implementation (Pgs 461-499)

File System Structure
• Files are predominantly stored on disks, which:
  1. Can be rewritten in place
  2. Have all blocks directly accessible (cf. CD)
• But really, what matters is:
  A. Persistence
  B. Accessibility
  C. Writeability
  D. Access time

Layered Systems (top layer to bottom layer)
• Application(s)
• Files, Directories: File System
• OS: File Manager
• Device Driver, Interrupt Handlers
• Device + Hardware

File Representation
• FCB (File Control Block): the OS representation of a file
• Analogous to the PCB representation of a process
• Inode: the FCB in Unix

Disk Organisation
• Boot control block: typically block 1, sector 1, track 1, platter 1; holds boot information
• Volume control block (superblock): partition details (block size, number of blocks, number of free blocks, location of free block list)
• Directory structure: root directory ("/" on Unix, "\" on Windows) in a known location
• FCBs: inodes/data for each actual file
• Data blocks: contents of the files

OS Data
• Mount table: which partitions are mounted?
• Directory cache
• Open file table (system-wide)
• Per-process open file table
• I/O buffers
Aside: Many OSs/FSs treat a directory as another kind of file
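
The split between the system-wide and per-process open file tables can be illustrated with a toy model. This is only a sketch under assumptions: the class name, the fields, and the choice to key the shared table by path are hypothetical, and a real OS stores a cached FCB in the shared entry rather than just a counter.

```python
from itertools import count


class OpenFileTables:
    """Toy model: one system-wide table plus one table per process."""

    def __init__(self):
        self.system_wide = {}   # path -> number of opens across all processes
        self.per_process = {}   # pid -> {fd: {"path": str, "offset": int}}

    def open(self, pid, path):
        # The shared entry is created on first open and reference-counted.
        self.system_wide[path] = self.system_wide.get(path, 0) + 1
        fds = self.per_process.setdefault(pid, {})
        fd = next(n for n in count() if n not in fds)  # lowest unused descriptor
        fds[fd] = {"path": path, "offset": 0}          # per-process position
        return fd

    def close(self, pid, fd):
        path = self.per_process[pid].pop(fd)["path"]
        self.system_wide[path] -= 1
        if self.system_wide[path] == 0:
            del self.system_wide[path]  # last close: drop the shared entry


tables = OpenFileTables()
fd_a = tables.open(pid=1, path="/etc/hosts")
fd_b = tables.open(pid=2, path="/etc/hosts")
print(tables.system_wide)  # {'/etc/hosts': 2}
```

Each process keeps its own offset, while the open count lives in one shared place, so the shared entry is only discarded when the last process closes the file.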

Disks
• Are divided into sections called partitions or volumes
• Partitions may contain a file system ("cooked") or store "raw" data directly
• E.g., a page swap partition has no file system
• The boot sector typically stores the boot loader
• The boot loader accesses the root partition (of the selected OS), which contains the OS and its root (always mounted) file system
• Other partitions are mounted as required

Logical File System
• Model of the file system managed by the OS and visible to programs/programmers
• Example: Linux components
  - inode: an individual file
  - file: an open file
  - superblock: a file system
  - dentry: a directory entry
• Directories may be implemented as:
  - Lists (sorted or unsorted) or B-trees
  - Hash tables

Contiguous Allocation
• Disk blocks are linearly ordered
• Files occupy contiguous sets of blocks
• Problems occur when files are deleted, shortened, or moved, leaving gaps on the disk (external fragmentation)
• Exactly the same issue as fitting a process into memory (best fit, first fit, etc.)
• Compaction removes the gaps, but creates extra work
• Generally a bad idea for general-purpose file systems, but may be useful for specialised OSs
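
As a rough illustration of the placement problem, the sketch below picks a hole for a new file using first fit and best fit. The representation of holes as (start_block, length) pairs and the function names are assumptions made for this example only.

```python
def first_fit(holes, size):
    """Return the start block of the first hole with at least `size` blocks."""
    for start, length in holes:
        if length >= size:
            return start
    return None  # no hole is large enough


def best_fit(holes, size):
    """Return the start block of the smallest hole that still fits `size` blocks."""
    candidates = [(length, start) for start, length in holes if length >= size]
    if not candidates:
        return None
    return min(candidates)[1]


# Hypothetical free holes as (start_block, length) pairs.
holes = [(0, 5), (10, 3), (20, 8)]
print(first_fit(holes, 3))  # 0
print(best_fit(holes, 3))   # 10
```

First fit stops at the first adequate hole and leaves a 2-block fragment behind; best fit scans every hole but fills the 3-block hole exactly.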

Linked Allocation
• Files are assigned a (potentially scattered) set of available disk blocks
• A tiny portion of each block is used to store a pointer to (the address of) the next block
• Need a second pointer to support "rewind" in a file
• Slow file access, because each block must be read before its pointer can be used to schedule the next read
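
A minimal sketch of why access is sequential: the only way to find block n+1 is to read block n first. The in-memory `disk` dictionary and the block numbers are made up for illustration.

```python
# Hypothetical disk: block number -> (data, next_block); None ends the chain.
disk = {
    7: ("He", 2),
    2: ("ll", 9),
    9: ("o", None),
}


def read_file(disk, first_block):
    """Follow next-block pointers from first_block; return the data and the blocks read."""
    data, visited = [], []
    block = first_block
    while block is not None:
        payload, next_block = disk[block]  # one disk read per block
        data.append(payload)
        visited.append(block)
        block = next_block                 # known only after the read completes
    return "".join(data), visited


print(read_file(disk, 7))  # ('Hello', [7, 2, 9])
```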

Indexed Allocation
• Use the first block to store a list (an "index") of all the blocks used
• The index may waste space, but data blocks do not need pointers
• Multi-level or linked approaches can be used for large files that need more than one index block
• Access is faster than linked allocation, but still requires reads from many different disk locations
• Indices can be cached in memory to improve performance
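
With an index block, any byte offset can be mapped to a disk block by arithmetic instead of by following a chain. The sketch below assumes a single-level index and a 4096-byte block size; both are illustrative choices, as are the names.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def locate(index_block, offset):
    """Map a byte offset within the file to (disk_block, offset_within_block)."""
    logical = offset // BLOCK_SIZE  # which entry of the index to use
    if logical >= len(index_block):
        raise IndexError("offset is past the end of the file")
    return index_block[logical], offset % BLOCK_SIZE


# Hypothetical index block: the file's blocks, in order.
index_block = [19, 4, 33, 12]
print(locate(index_block, 5000))  # (4, 904)
```

If the index is cached in memory, the lookup itself costs no disk reads, which is the performance point made on the slide.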

Free-Space Management
• Generally need to know whether a disk block is in use or available
• Could use a bitmap stored on the disk, with one bit per block
• A 1 TB disk (with 4 KB blocks) requires a 32 MB bitmap
• Relatively fast and simple
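
The 32 MB figure follows from 2^40 bytes / 2^12 bytes per block = 2^28 blocks, and 2^28 bits / 8 = 2^25 bytes. The sketch below checks that arithmetic and shows a toy bitmap allocator. It assumes a set bit means "in use" (the slides do not say which polarity is used), and the class and method names are hypothetical.

```python
def bitmap_bytes(disk_bytes, block_bytes):
    """Size in bytes of a one-bit-per-block bitmap."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8  # round up to whole bytes


class Bitmap:
    """Toy free-space bitmap; a set bit means the block is in use (assumed)."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Mark and return the lowest-numbered free block, or None if full."""
        for block in range(self.n_blocks):
            if not self.is_used(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        return None

    def free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF


print(bitmap_bytes(2**40, 4096) // 2**20)  # 32
```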

Other Approaches
• Linked list of free blocks
• Can use the empty blocks themselves to store the pointers to the next empty block
• Very space efficient: only need to store one pointer, to the first empty block
• Very simple, but time consuming to allocate large numbers of blocks
• Can "group" the pointers into a single block for efficiency, with the last pointer in the block pointing to the next group of empty blocks
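
A small simulation of the plain (ungrouped) free list shows where the time goes: each allocation has to read the block being handed out to learn where the next free block is. The `next_free` dictionary stands in for the pointer stored inside each free block; all names are assumptions for this sketch.

```python
class FreeList:
    """Toy linked free list; only `head` would be stored outside the free blocks."""

    def __init__(self, free_blocks):
        self.next_free = {}  # block -> next free block (pointer kept in the block)
        self.head = None
        for block in reversed(free_blocks):
            self.next_free[block] = self.head
            self.head = block
        self.reads = 0       # simulated disk reads

    def allocate(self):
        if self.head is None:
            return None
        block = self.head
        self.head = self.next_free.pop(block)  # must read the block for its pointer
        self.reads += 1
        return block

    def release(self, block):
        self.next_free[block] = self.head      # freed block becomes the new head
        self.head = block


fl = FreeList([3, 8, 5])
print([fl.allocate() for _ in range(3)], fl.reads)  # [3, 8, 5] 3
```

Allocating n blocks costs n reads here; the grouping variant from the last bullet would fetch many free-block addresses with a single read.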

Compression
• In run-length compression, we store a value followed by the number of occurrences of that value; this saves lots of space if long "runs" exist
• We can compress the free-space map by storing pairs of values: a free block, and the number of consecutive free blocks that follow it
• This compressed version has as many entries as there are holes (runs of free blocks) on the disk
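
The pairs described above can be derived from a bitmap in one pass. In this sketch a 1 in the map marks a free block (an assumed convention), and each pair is (first free block, number of free blocks following it), matching the wording of the slide.

```python
def free_runs(free_map):
    """Compress a free-space map into (first_free_block, following_free_blocks) pairs."""
    runs = []
    start = None  # start of the run currently being scanned, if any
    for i, is_free in enumerate(free_map):
        if is_free and start is None:
            start = i
        elif not is_free and start is not None:
            runs.append((start, i - start - 1))
            start = None
    if start is not None:  # map ended in the middle of a run
        runs.append((start, len(free_map) - start - 1))
    return runs


# 1 = free, 0 = in use (assumed convention for this example)
print(free_runs([0, 1, 1, 1, 0, 0, 1, 0, 1, 1]))  # [(1, 2), (6, 0), (8, 1)]
```

Ten map entries collapse to three pairs, one per hole; a badly fragmented disk would gain much less.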

Efficiency and Performance
• We generally want a file system to be as small and fast as possible
• However, what works best often depends on how it will be used and on factors such as:
  - Disk size
  - Other physical properties (heads, platters, etc.)
  - Average file size
  - Read:write ratio
  - Number of I/O buffers
  - Amount of RAM available for caching tables and indices; use of one cache for disk blocks as well as pages (unified virtual memory)
  - Synchronous vs. asynchronous access requirements
  - Viability of "read-ahead"
  - Redundancy requirements

Recovery
• Data lost from RAM (except newly generated data not yet saved to disk) is usually recoverable after errors, bugs, power failures, etc.: reload it from disk
• Disk data must be better protected so that errors and failures can be recovered from

Causes
• Memory contents lost (power failure, crash) before the disk can be updated, particularly with cached index or free-space tables
• Disk block failure (hardware fault)
• Write failure (power loss, system crash)
• Bugs in the OS; corruption of the FS by applications

Consistency Checking
• fsck (Unix) and chkdsk (Windows) check all the tables and structures on a disk for consistency
• I.e., does the free space plus the used space indicated by the directories equal all the available space?
• Can be run at mount, at boot, via cron, etc.
• Can be supplemented with change flags stored on disk, access/update timestamps, etc.
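
The "free + used = all" question can be sketched as a block-accounting pass. This is a simplified illustration, not how fsck or chkdsk are implemented; the function name and the input shapes (a file-to-blocks mapping and a free list) are assumptions.

```python
from collections import Counter


def check_blocks(total_blocks, files, free_blocks):
    """Report blocks that are unaccounted for, claimed twice, or both used and free."""
    used = Counter(block for blocks in files.values() for block in blocks)
    free = set(free_blocks)
    every_block = set(range(total_blocks))
    return {
        "missing": sorted(every_block - set(used) - free),  # neither used nor free
        "duplicated": sorted(b for b, n in used.items() if n > 1),
        "used_and_free": sorted(set(used) & free),
    }


# Hypothetical 8-block disk with three deliberate inconsistencies.
report = check_blocks(
    total_blocks=8,
    files={"a": [0, 1, 2], "b": [2, 5]},
    free_blocks=[3, 4, 5, 7],
)
print(report)  # {'missing': [6], 'duplicated': [2], 'used_and_free': [5]}
```

A consistent disk yields three empty lists.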

Journalled (logged) FS
• All disk transactions are written first to a log
• The log may be stored on a different disk for redundancy
• The log tends to store a considerable amount of data for a non-trivial time period
• If an inconsistency is found, each log entry is checked to see whether it was performed
• Of course, if the log is corrupted, then we are still in trouble
• Uses database transaction techniques
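
One common way to use such a log after a crash is redo recovery: writes belonging to transactions that reached their commit record are reapplied, and writes from uncommitted transactions are ignored. The sketch below assumes that scheme and a made-up tuple format for log entries; real journalling file systems differ in detail.

```python
def recover(disk, log):
    """Redo every write belonging to a committed transaction; ignore the rest."""
    committed = {entry[1] for entry in log if entry[0] == "commit"}
    for entry in log:
        if entry[0] == "write" and entry[1] in committed:
            _, _txn, block, data = entry
            disk[block] = data  # reapplying the same write is harmless (idempotent)
    return disk


# Hypothetical log: ("write", txn_id, block, data) or ("commit", txn_id).
log = [
    ("write", 1, 10, "inode v2"),
    ("write", 1, 11, "data v2"),
    ("commit", 1),
    ("write", 2, 10, "inode v3"),  # crash happened before transaction 2 committed
]
disk = {10: "inode v1", 11: "data v1"}
print(recover(disk, log))  # {10: 'inode v2', 11: 'data v2'}
```

Because transaction 2 never committed, block 10 ends at "inode v2" rather than a half-applied "inode v3".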

Duplication Techniques
• Modems split their EPROM in half and duplicate things so there are two copies
• If one is corrupted, the other is used
• Can use similar approaches with disks, but this is very wasteful of space
• Can also do limited duplication and avoid overwriting data until the disk is full
• Complete duplication to another disk is the only possible backup in the event of a hardware failure that renders the disk inoperable

NFS
• The location of a file system shouldn't really matter to the user (except that non-local data may take longer to access)
• Various different protocols are available
• File storage in the "Cloud" is really just a trendy term for a networked file system on a WAN (usually the Internet)

Networked File Systems
• Require:
  - Mount protocol
  - Access protocol (for specific FS items)
  - Naming protocol: to allow local vs. non-local paths to be mapped
• Possible format changes to facilitate local hardware and OS needs, but this is often seen as an application-level concern

To Do:
• Finish Assignment 2 (due next week)
• Complete Lab 6 (last required lab)
• Read Chapter 11 (pgs 461-499; this lecture)
