1
Filesystems – Metadata, Paths, & Caching Vivek Pai Princeton University
2
2 Diskgedanken Assuming you back up and restore files, what factors affect the time involved? How are these factors changing? What issues affect the rates of change? How is total backup time changing over the years? What is Occam’s razor?
3
3 Today’s Overview Quiz recap Finish up metadata, reliability A little discussion of mounting, etc Move on to performance
4
4 Quiz 1 Observations I’m disappointed Quizzes not yet graded, but… Most people did poorly on question 1 Lots of dimensional analysis Lots of sleepers, chatting, weird faces Very little (too little) feedback in general Open question – looking for a methodical approach
5
5 Occam’s Razor From William of Occam (philosopher) “entities should not be multiplied unnecessarily” Often reduced to other statements “one should not increase, beyond what is necessary, the number of entities required to explain anything” “Make as few assumptions as possible” “once you have eliminated all other possible explanations, what remains must be the answer”
6
6 A Reasonable Approach Disk size: 40GB (20-80GB common) File size: 10KB (5-20KB common) Access time: 10ms (5-20ms common) Assume 1 seek per file (reasonable) 100 files = 1MB, each access 0.01 sec, so about 1MB/s So, 40GB at 1MB/s = 40K sec = 11+ hours
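The estimate above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal units (1 KB = 1000 bytes) and that seek time dominates transfer time; the function name and parameters are illustrative, not from the slides.

```python
def backup_time_seconds(disk_bytes, file_bytes, access_seconds):
    """Estimate file-by-file backup time, assuming one seek per file
    and that the seek dominates the time to read each small file."""
    num_files = disk_bytes / file_bytes
    return num_files * access_seconds

# The slide's typical values: 40 GB disk, 10 KB files, 10 ms per access.
seconds = backup_time_seconds(40e9, 10e3, 0.010)
print(f"{seconds:.0f} s = {seconds / 3600:.1f} hours")
```

Plugging in the low and high ends of the "common" ranges shows how sensitive the estimate is to file size and access time.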
7
7 Changes Over Time Disk density doubling each year Seek time dropping < 10% File size growing slowly Results # of files grows faster than access time reduction Backup time increases
8
8 Most Common Answer Disk size / maximum transfer rate In other words, read sectors, not files Can this be done? Yes, if you have access to “raw” disk Which means that you have “root” permission And that the system has raw disk support Faster than file-based dump/restore No concept of files, however What happens if you restore to a disk with a different geometry?
9
9 Linked Files (Alto) File header points to 1st block on disk Each block points to next Pros Can grow files dynamically Free list is similar to a file Cons random access: horrible unreliable: losing a block means losing the rest [Figure: file header pointing to a chain of blocks that ends in null]
10
10 Contiguous Allocation Request in advance for the size of the file Search bit map or linked list to locate a space File header: first sector in file, number of sectors Pros Fast sequential access Easy random access Cons External fragmentation Hard to grow files
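Searching a bitmap for a contiguous run can be sketched as a first-fit scan. The bitmap encoding (True = in use) and the function name are assumptions made for illustration.

```python
def first_fit(bitmap, n):
    """Return the index of the first run of n free blocks, or None.
    bitmap[i] is True when block i is in use (assumed encoding)."""
    run_start, run_len = 0, 0
    for i, used in enumerate(bitmap):
        if used:
            run_len = 0              # run broken by an allocated block
        else:
            if run_len == 0:
                run_start = i        # a new free run starts here
            run_len += 1
            if run_len == n:
                return run_start
    return None

bitmap = [True, False, False, True, False, False, False]
print(first_fit(bitmap, 3))
print(first_fit(bitmap, 4))
```

The second call fails even though five blocks are free in total, which is the external fragmentation listed under Cons.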
11
11 Single-Level Indexed Files or Extent-based Filesystems A user declares max size A file header holds an array of pointers to point to disk blocks Pros Can grow up to a limit Random access is fast Cons Clumsy to grow beyond limit Periodic cleanup of new files Up-front declaration a real pain [Figure: file header with pointers to disk blocks]
12
12 File Allocation Table (FAT) Approach A section of disk for each partition is reserved One entry for each block A file is a linked list of blocks A directory entry points to the 1st block of the file Pros Simple Cons Always go to FAT Wasting space [Figure: directory entry foo holds block number 217; the FAT shows entries for blocks 0, 217, 399, and 619, with an EOF marker ending the chain]
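Following a file through the FAT can be sketched as below. The table contents, the EOF sentinel value, and the order of the chain are hypothetical; only the idea (directory entry gives the first block, each FAT entry gives the next) comes from the slide.

```python
EOF = -1  # assumed sentinel marking the last block of a file

def file_blocks(fat, first_block):
    """Follow the FAT chain from a file's first block until EOF."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]       # FAT entry for a block names the next block
    return blocks

# Hypothetical FAT: only the entries belonging to one file are shown.
fat = {217: 619, 619: 399, 399: EOF}
directory = {"foo": 217}         # directory entry holds the first block number

print(file_blocks(fat, directory["foo"]))
```

Reaching the k-th block still means walking k table entries, but the walk stays inside the FAT rather than touching each data block as in the linked-file scheme.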
13
13 Multi-Level Indexed Files (Unix) 13 pointers in a header 1-10: direct pointers 11: 1-level indirect 12: 2-level indirect 13: 3-level indirect Pros & Cons In favor of small files Can grow Limit is 16G and lots of seeks What does it take to reach blocks 23, 5, 340? [Figure: header pointers 1, 2, ... pointing to data blocks; pointers 11, 12, 13 pointing to indirect blocks]
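The question about blocks 23, 5, and 340 can be worked through with a small sketch. The slide does not state the block size or pointer size; this assumes 1 KB blocks and 4-byte pointers (256 pointers per indirect block), which is consistent with the 16G limit. Block numbers are taken as 0-based, and the function name is illustrative.

```python
NUM_DIRECT = 10
PTRS_PER_BLOCK = 256   # assumption: 1 KB blocks, 4-byte pointers

def locate(block_no):
    """Map a logical block number to (level, indices).
    level 0 = direct pointer; levels 1..3 = indirect blocks to read first.
    indices are the slots to follow, outermost first."""
    if block_no < NUM_DIRECT:
        return 0, [block_no]
    n = block_no - NUM_DIRECT
    span = PTRS_PER_BLOCK            # blocks reachable at the current level
    for level in (1, 2, 3):
        if n < span:
            indices = []
            for _ in range(level):
                indices.append(n % PTRS_PER_BLOCK)
                n //= PTRS_PER_BLOCK
            return level, indices[::-1]
        n -= span
        span *= PTRS_PER_BLOCK
    raise ValueError("block number beyond maximum file size")

for b in (23, 5, 340):
    print(b, locate(b))
```

Under these assumptions block 5 is direct, block 23 needs one extra read, and block 340 needs two, which is where the "lots of seeks" cost comes from for large files.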
14
14 Reliability In Disk Systems Make sure certain actions have occurred before function completes Known as “synchronous” operation Ex: make sure new inode is on disk & that the directory has been modified before declaring a file creation is complete Drawback: speed Some ops easily asynchronous: access time Some filesystems don’t care: Linux ext2fs
15
15 Recovery After Failure Need to ensure consistency Does free bitmap match tree walk? Do reference counts in inodes match directory entries? Do blocks appear in multiple inodes? This kind of recovery grows with disk size Clean shutdown – mark as such, no recovery
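The three consistency questions above can be sketched against a toy in-memory image. The data layout (dictionaries for inodes and directories, a set for the free bitmap) is invented for illustration and is not how any real checker stores its state.

```python
from collections import Counter

def check(inodes, directories, free_blocks, total_blocks):
    """Return a list of inconsistencies in a toy filesystem image.

    inodes:      {inode_no: {"links": int, "blocks": [block_no, ...]}}
    directories: {dir_name: {entry_name: inode_no}}
    free_blocks: set of block numbers the free bitmap marks as free
    """
    problems = []

    # Do blocks appear in multiple inodes?
    owners = Counter(b for ino in inodes.values() for b in ino["blocks"])
    for block, count in sorted(owners.items()):
        if count > 1:
            problems.append(f"block {block} claimed {count} times")

    # Does the free bitmap match the tree walk?
    used = set(owners)
    for block in range(total_blocks):
        if block in used and block in free_blocks:
            problems.append(f"block {block} in use but marked free")
        elif block not in used and block not in free_blocks:
            problems.append(f"block {block} unused but marked allocated")

    # Do reference counts in inodes match directory entries?
    refs = Counter(i for d in directories.values() for i in d.values())
    for ino_no, ino in sorted(inodes.items()):
        if ino["links"] != refs[ino_no]:
            problems.append(
                f"inode {ino_no} links={ino['links']} but {refs[ino_no]} entries")
    return problems
```

The loop over every block is one reason this style of recovery grows with disk size.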
16
16 Reducing Synchronous Times Write to a faster storage Nonvolatile memory – expensive, requires some additional OS/firmware support Write to a special disk or section – logging Only have to examine log when recovering Eventually have to put information in place Some information dies in the log itself Write in a special order Write metadata in a way that is consistent but possibly recovers less
17
17 Challenges Unix filesystem has great flexibility Extent-based filesystems have speed Seeks kill performance – locality Bitmaps show contiguous free space Linked lists easy to search How do you perform backup/restore?
18
18 Bigger, Faster, Stronger Making individual disks larger is hard Throw more disks at the problem Capacity increases Effective access speed may increase Probability of failure also increases Use some disks to provide redundancy Generally assume a fail-stop model Fail-stop versus Byzantine failures
19
19 RAID (Redundant Array of Inexpensive Disks) Main idea Store the error correcting codes on other disks General error correcting codes are too powerful Use XORs or single parity Upon any failure, one can recover the entire block from the spare disk (or any disk) using XORs Pros Reliability High bandwidth Cons The controller is complex [Figure: RAID controller computing XOR across the disks]
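Single-parity recovery with XOR can be sketched in a few lines. Block contents and sizes here are toy values, and the helper name is illustrative.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"disk", b"raid", b"xors"]   # one block per data disk (toy sizes)
parity = xor_blocks(data)            # what the parity disk would store

# Suppose the second disk fails: XOR the survivors with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

This works because XOR-ing a value twice cancels it, so the parity of the surviving blocks and the parity block leaves exactly the missing block. It only covers a single failed disk whose identity is known, matching the fail-stop assumption.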
20
20 Synopsis of RAID Levels RAID Level 0: Non-redundant (JBOD) RAID Level 1: Mirroring RAID Level 2: Bit-interleaved, ECC RAID Level 3: Byte-interleaved, parity RAID Level 4: Block-interleaved, parity RAID Level 5: Block-interleaved, distributed parity
21
21 Did RAID Work? Performance: yes Reliability: yes Cost: no Controller design complicated Fewer economies of scale High-reliability environments don’t care Now also software implementations
22
22 RAID’s Real Benefit Partly addresses the failure problem Backup/restore less of an issue Failed disk “rebuilt” at sector level Lower performance during rebuild, but system still on-line Still not perfect Geographic problems Failure during rebuild
23
23 Namespace Basically, the filesystem hierarchy Provides a convenient way of accessing things Files Devices Pseudo-“filesystems” In Unix, a nice, consistent namespace No “drive names”
24
24 A Sample File Tree [Figure: directory tree rooted at /, showing bin/, boot/, proc/, usr/, home/, local/, mariah/, vivek/]
25
25 What If You Have Two Disks? [Figure: the same directory tree rooted at /, showing bin/, boot/, proc/, usr/, home/, local/, mariah/, vivek/]
26
26 As Mariah’s Files Grow? [Figure: the same directory tree rooted at /, showing bin/, boot/, proc/, usr/, home/, local/, mariah/, vivek/]
27
27 Mount Points [Figure: the same directory tree rooted at /, showing bin/, boot/, proc/, usr/, home/, local/, mariah/, vivek/]
28
28 Mount Points Original directories get “hidden” Traversal is transparent to user OS keeps track of various disks (devices) But what happens with big disks? Partition (split) them into several logical devices – easier to manage, safer, etc Home directories in one partition, startup-related files/programs in another, etc
29
29 Paths Each process has “current directory” Convenient shorthand Paths that start with “/” are absolute Paths that do not start with “/” are relative to current directory Path lookup is potentially expensive It’s also repetitive Amenable to caching Metadata cache from assigned reading
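Why path lookup is amenable to caching can be sketched with a toy name cache keyed on (directory inode, component name). The directory table, inode numbers, and class name are all made up for illustration; this is not the metadata cache from the assigned reading.

```python
# Toy directory table: {directory_inode: {name: inode}}. All numbers are made up.
DIRS = {
    2: {"usr": 10, "bin": 11},        # inode 2 plays the role of the root
    10: {"home": 20, "local": 21},
    20: {"vivek": 30, "mariah": 31},
}
ROOT = 2

class Resolver:
    def __init__(self, cwd=ROOT):
        self.cwd = cwd
        self.cache = {}       # (directory_inode, name) -> inode
        self.dir_reads = 0    # counts simulated directory reads

    def lookup(self, dir_ino, name):
        key = (dir_ino, name)
        if key not in self.cache:
            self.dir_reads += 1               # would be a disk access
            self.cache[key] = DIRS[dir_ino][name]
        return self.cache[key]

    def resolve(self, path):
        # Absolute paths start at the root; all others start at the cwd.
        ino = ROOT if path.startswith("/") else self.cwd
        for name in path.split("/"):
            if name:                          # skip empty components
                ino = self.lookup(ino, name)
        return ino

r = Resolver()
r.resolve("/usr/home/vivek")
r.resolve("/usr/home/mariah")
print(r.dir_reads)
```

The second lookup shares the `/usr/home` prefix with the first, so only its last component misses the cache. The sketch ignores `.`, `..`, and cache invalidation.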
30
30 Finding Paths In Unix, directory contains inode # If two directories contain same #, file is accessible via different paths (and names) Adding another name into the filespace is called “linking” (via ‘ln’ command) But the directory is a file What happens if a directory gets linked?
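Two names sharing one inode can be observed from user space. This is a small sketch using Python's `os.link`, which creates a hard link as `ln` does; the file names are arbitrary, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile

def link_demo():
    """Create a file, add a second name with os.link, and compare inodes."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "notes.txt")
        alias = os.path.join(d, "alias.txt")
        with open(original, "w") as f:
            f.write("hello")
        os.link(original, alias)              # like `ln notes.txt alias.txt`
        same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
        link_count = os.stat(original).st_nlink
        os.remove(original)                   # removes one name, not the data
        with open(alias) as f:
            survived = f.read()
        return same_inode, link_count, survived

print(link_demo())
```

Both directory entries carry the same inode number, and the data stays reachable through the remaining name after the first one is removed.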
31
31 Consider The Following [Figure: the same directory tree rooted at /, showing bin/, boot/, proc/, usr/, home/, local/, mariah/, vivek/]
32
32 Various Solutions Only allow “root” to link to directory Can still be useful Hopefully root knows when to do it Limit the number of iterations Pick some “large” maximum Terminate traversal after that Detect loops Cost? Utility?
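Two of the options above, a fixed iteration limit and explicit loop detection, can be compared on a toy namespace in which a directory has been linked under its own descendant. The inode numbers and names are hypothetical.

```python
# Toy directories: {inode: {name: inode}}. Entry "loop" links back to inode 2,
# so the namespace contains a cycle. All numbers are made up.
DIRS = {
    1: {"home": 2},
    2: {"vivek": 3},
    3: {"loop": 2},
}

def walk_limited(ino, max_depth):
    """Depth-first walk that gives up below max_depth levels.
    Returns the number of directory visits, repeats included."""
    if max_depth < 0:
        return 0
    return 1 + sum(walk_limited(child, max_depth - 1)
                   for child in DIRS.get(ino, {}).values())

def walk_detecting(ino, seen=None):
    """Walk that remembers inode numbers and visits each directory once.
    Returns the set of inodes reached."""
    seen = set() if seen is None else seen
    if ino in seen:
        return seen
    seen.add(ino)
    for child in DIRS.get(ino, {}).values():
        walk_detecting(child, seen)
    return seen

print(walk_limited(1, 10))
print(sorted(walk_detecting(1)))
```

The limited walk terminates but repeats work on every trip around the cycle; the detecting walk visits each directory once at the cost of memory proportional to the number of directories seen.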
33
33 Does It “Do What You Want”? I create ~vivek/work/cal/now/mtgs I create a link to it via ~vivek/mtgs The month advances, and ~vivek/work/cal/now/mtgs becomes ~vivek/cal/Sep01/mtgs Create new ~vivek/work/cal/now/mtgs To what does ~vivek/mtgs point?
34
34 Symbolic Link Created via “ln -s” command Dynamically interpreted each use Does not create a standard directory entry for the target Instead, the link is a file containing the target’s path May be stored in inode if link is short Standard looping rules apply
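The difference between a hard link and a symbolic link when the target is renamed and recreated can be sketched as follows. It uses plain files rather than directories, and the names (`now`, `Aug01`, `hard`, `soft`) are illustrative; `os.link` and `os.symlink` correspond to `ln` and `ln -s`.

```python
import os
import tempfile

def link_vs_symlink():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "now")
        hard = os.path.join(d, "hard")
        soft = os.path.join(d, "soft")
        with open(target, "w") as f:
            f.write("august")
        os.link(target, hard)        # hard link: a second name for the inode
        os.symlink(target, soft)     # symbolic link: stores the path text

        os.rename(target, os.path.join(d, "Aug01"))   # the month advances
        with open(target, "w") as f:                  # create a new "now"
            f.write("september")

        with open(hard) as f:
            via_hard = f.read()
        with open(soft) as f:
            via_soft = f.read()
        return via_hard, via_soft, os.readlink(soft) == target

print(link_vs_symlink())
```

The hard link keeps following the original inode, while the symbolic link is reinterpreted on each use and so reaches whatever currently lives at the stored path.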