Storage Systems CSE 598d, Spring 2007 Lecture 13: File Systems March 8, 2007.


2 Storage Systems CSE 598d, Spring 2007 Lecture 13: File Systems March 8, 2007

3 What is a file system?
Overloaded term
  – Related set of data
  – Software that manages this data
      Storage, retrieval, consistency, reliability, …
Typically part of OS but not always
A virtualization layer with the following LCD (lowest common denominator)
  – Notion of file and directory
  – File: A named data store
  – Directory: A collection of files and directories
  – A set of operations to manipulate files and directories
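The file and directory abstraction above can be sketched in a few lines. This is only an illustrative in-memory model, not how any particular file system stores its data; the class names `File` and `Directory` and the methods `add` and `lookup` are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class File:
    """A named data store."""
    name: str
    data: bytes = b""


@dataclass
class Directory:
    """A collection of files and directories, keyed by name."""
    name: str
    entries: Dict[str, Union[File, "Directory"]] = field(default_factory=dict)

    def add(self, node):
        """Insert a file or directory; names must be unique within a directory."""
        if node.name in self.entries:
            raise FileExistsError(node.name)
        self.entries[node.name] = node
        return node

    def lookup(self, path):
        """Resolve a '/'-separated path relative to this directory."""
        node = self
        for part in filter(None, path.split("/")):
            if not isinstance(node, Directory) or part not in node.entries:
                raise FileNotFoundError(path)
            node = node.entries[part]
        return node


root = Directory("/")
home = root.add(Directory("home"))
home.add(File("notes.txt", b"lecture 13"))
found = root.lookup("home/notes.txt")
```

Here `add` and `lookup` stand in for the "set of operations to manipulate files and directories"; a real interface would also cover removal, renaming, reading and writing.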

4 File System Basics
Sits between user applications and device driver(s)
Meta-data: Additional data used for describing the real data
Specific to disk-based storage
  – Volume, partition
  – Superblock
      Part of a volume where file system stores some volume-wide meta-data


6 File System Goals
Ease of use
  – Conforming to desired/popular user interface
  – Hiding hardware ugliness from user
Efficiency and fairness
  – Both temporal and spatial
  – Scheduling, caching, data layout
Robustness
  – Reliability
  – Consistency
Security
  – Simple: ACLs
  – More: Encryption, …
Longevity
Power!


8 Robustness
What/where are the points of vulnerability in the I/O path/hierarchy?
  – The disk itself
      Unrecoverable failures happen eventually
        – Back up important data regularly; not a file system concern
      ECC, RAID, etc. help identify/recover from some errors
      Power failure can erase contents of disk cache
        – Battery powered disk cache
  – Rest of the system
      Power failure: Volatile RAM
      Software crash during an update to the disk
        – Only updates at block granularity are atomic
If a partial update occurs, file system needs to bring the system to a correct state after a re-boot
  – Data structures are consistent

9 Ensuring correctness after a crash
Tedious
  – Dependent on the size of the file system
  – Can take hours!
Not always possible!
[Figure: timeline for Some_file with FS and On Disk views; labels: file size; data, add a byte; Crash!! Are we OK?]
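The "add a byte" case can be made concrete with a toy model. The sketch below assumes a disk where each block write is atomic but nothing ties two writes together; `Disk`, `CrashError`, `append_byte`, `is_consistent` and the `writes_until_crash` knob are all hypothetical names invented for illustration.

```python
class CrashError(Exception):
    """Raised to simulate power loss or a software crash."""


class Disk:
    """Toy disk: each block write is atomic, but there is no atomicity across writes."""

    def __init__(self):
        self.blocks = {"inode": {"size": 0}, "data": b""}
        self.writes_until_crash = None  # None means never crash

    def write(self, block, value):
        if self.writes_until_crash is not None:
            if self.writes_until_crash == 0:
                raise CrashError(block)
            self.writes_until_crash -= 1
        self.blocks[block] = value


def append_byte(disk, byte):
    """Appending touches two blocks: the inode (file size) and the data block."""
    new_size = disk.blocks["inode"]["size"] + 1
    disk.write("inode", {"size": new_size})
    disk.write("data", disk.blocks["data"] + byte)


def is_consistent(disk):
    """The invariant a post-crash check would verify for this toy file."""
    return disk.blocks["inode"]["size"] == len(disk.blocks["data"])


disk = Disk()
append_byte(disk, b"a")          # both writes land
ok_before = is_consistent(disk)

disk.writes_until_crash = 1      # allow the inode write, crash on the data write
try:
    append_byte(disk, b"b")
except CrashError:
    pass
ok_after = is_consistent(disk)   # the inode claims 2 bytes, the data block holds 1
```

After the simulated crash the recorded size and the stored data disagree, which is the kind of inconsistency a post-crash scan has to find by checking every such invariant across the whole file system.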

10 Ensuring correctness after a crash
Tedious
  – Dependent on the size of the file system
  – Can take hours!
Not always possible!
[Figure: timeline for Some_file with FS and On Disk views; labels: last mod time; data, change a byte; Crash!! Are we OK?]
Solution: Journaling

11 Journaling
Based on the concept of a transaction
  – Developed in the database community
  – ACID properties
      Atomicity: all or none
      Consistency: transaction does not break integrity constraints
      Isolation: appears isolated from all other operations; serializable
      Durability: Once done, a transaction persists; will survive system failures
Key idea: Ensure the following is a transaction:
  – The complete set of modifications made to on-disk data structures as part of an operation
  – Most important property: Ensure the above is atomic with respect to failures
How to achieve this?
  – Write-ahead logging
      Introduce an additional step where all changes comprising the atomic operation are written to a log or journal before getting propagated to the disk
  – The log could itself be on the disk
  – Question: Can the log be in-memory?
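A minimal sketch of the write-ahead ordering, under heavy simplifying assumptions: two Python containers stand in for the on-disk journal and the blocks' final locations, and the record format (`"begin"`, `"write"`, `"commit"` tuples) is invented for this example rather than taken from any real journaling file system. The class `JournaledStore` and its methods are hypothetical names.

```python
class JournaledStore:
    """Toy write-ahead log: changes reach the journal before their home locations."""

    def __init__(self):
        self.journal = []    # stands in for the on-disk log
        self.home = {}       # stands in for the blocks' final on-disk locations
        self.next_txid = 1

    def begin(self):
        txid = self.next_txid
        self.next_txid += 1
        self.journal.append(("begin", txid))
        return txid

    def log_write(self, txid, block, value):
        """Record the intended change in the journal only."""
        self.journal.append(("write", txid, block, value))

    def commit(self, txid):
        """The commit record is what makes the whole transaction count."""
        self.journal.append(("commit", txid))

    def checkpoint(self):
        """Propagate the writes of committed transactions to their home locations."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "write" and rec[1] in committed:
                _, _, block, value = rec
                self.home[block] = value


store = JournaledStore()
tx = store.begin()
store.log_write(tx, "inode_7", {"size": 1})
store.log_write(tx, "data_42", b"a")
store.checkpoint()                        # no commit record yet: nothing propagates
after_uncommitted_checkpoint = dict(store.home)
store.commit(tx)
store.checkpoint()                        # now both writes reach their home locations
```

The point of the ordering is that the home locations only ever receive changes belonging to a transaction whose commit record is already in the journal, so the set of modifications appears all at once or not at all.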

12 Journaling (example)
Typical Steps
  – Do modifications in buffer cache: not necessary
      Why do this?
  – Then write them in the log: necessary step
  – Then write them where they are supposed to be
Example: Creation of a file
  – What happens upon a crash depending upon when it occurred
      A ---- C: Transaction is not written to log: as if never happened
      C ---- D: Partially written transaction, considered incomplete: as if never happened
      D ---- F: Replay log: transaction will succeed after restart
        – Key requirement: idempotence
      After F: irrelevant with respect to this transaction
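The three crash windows can be sketched as a recovery routine run at restart. This is an assumption-laden toy: the journal is a Python list of tuples in an invented format (`"begin"`, `"write"`, `"commit"`), the block names `dir_block` and `inode_7` are made up, and `recover` is a hypothetical helper, not the routine of any real file system.

```python
def recover(journal, home):
    """Replay committed transactions from the journal into the home locations.

    Records are tuples: ("begin", txid), ("write", txid, block, value),
    ("commit", txid). Transactions without a commit record are ignored,
    as if they never happened.
    """
    committed = {rec[1] for rec in journal if rec[0] == "commit"}
    for rec in journal:
        if rec[0] == "write" and rec[1] in committed:
            _, _, block, value = rec
            home[block] = value   # stores an absolute value, so replay is idempotent
    return home


# Journal for creating a file: update the directory block and a new inode.
full_journal = [
    ("begin", 1),
    ("write", 1, "dir_block", ["newfile"]),
    ("write", 1, "inode_7", {"size": 0}),
    ("commit", 1),
]

# Crash before the transaction reached the log (A to C on the slide).
case_not_logged = recover([], {})
# Crash while the transaction was being logged, no commit record (C to D).
case_partial = recover(full_journal[:3], {})
# Crash after commit but before the home locations were updated (D to F).
case_committed = recover(full_journal, {})
# Replaying the same log a second time changes nothing.
case_twice = recover(full_journal, recover(full_journal, {}))
```

Idempotence comes from logging the new block contents rather than an operation such as "increment size": applying the same record twice leaves the block in the same state, so a crash during recovery itself is also harmless.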

13 Journaling file systems
The devil is in the details!
What is the impact on performance?
Will read papers next week
Other related upcoming topics
  – We will move on to distributed/networked file systems next
  – Transactions become tricky
      Recall two-phase (2-P) and three-phase (3-P) commit protocols
  – Consistency semantics are another important issue
      What are the right semantics and how should they be implemented?

14 Consistency Issues
What is consistency? Why/When is it of concern?
Consistency concerns arise in any computing system where multiple versions of a data item may exist
  – Spatial: Caching, replication
  – Temporal: Multiple entities modifying a data item simultaneously
Consistency is concerned with defining and realizing the allowable ways of manipulating data
Why is consistency of interest in the domain of file systems?
  – A hierarchy of caches/buffers exists
  – A file system may allow multiple entities (say processes) to manipulate a file simultaneously
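How caching creates multiple versions of a data item can be sketched with two clients sharing one server. Everything here is hypothetical and deliberately simplistic: `Server`, `CachingClient`, write-back buffering until `flush`, and explicit `invalidate` are assumptions made for illustration, not a description of any specific networked file system.

```python
class Server:
    """Holds the authoritative copy of each file."""

    def __init__(self):
        self.files = {"f": "v1"}


class CachingClient:
    """Client that caches reads and buffers writes until flush() (write-back)."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.dirty = set()

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.files[name]
        return self.cache[name]

    def write(self, name, value):
        self.cache[name] = value
        self.dirty.add(name)

    def flush(self):
        """Push buffered writes to the server."""
        for name in self.dirty:
            self.server.files[name] = self.cache[name]
        self.dirty.clear()

    def invalidate(self, name):
        """Drop the cached copy so the next read goes to the server."""
        self.cache.pop(name, None)
        self.dirty.discard(name)


server = Server()
a, b = CachingClient(server), CachingClient(server)
b.read("f")            # b now caches "v1"
a.write("f", "v2")     # visible only in a's cache
a.flush()              # the server now holds "v2"
stale = b.read("f")    # b still sees its cached "v1"
b.invalidate("f")
fresh = b.read("f")    # b refetches and sees "v2"
```

Whether the stale read is acceptable, and when `b` must be told to invalidate, is exactly what a consistency model has to spell out.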

15 Consistency Model
The consistency model specifies a contract between programmer and system, wherein the system guarantees that if the programmer follows the rules, memory will be consistent and the results of memory operations will be predictable
First defined in the context of memory but generalizable to file systems, databases, Web, …
Unfortunately, there is no single hierarchy that could be used to classify the strictness of a consistency model
  – A good paper to read is by Mosberger titled “Memory Consistency Models”; uploaded on the course page
      OPTIONAL BUT HIGHLY RECOMMENDED READING

16 Next Class
Consistency models: Brief Overview
Network storage introduction
  – NAS vs SAN
  – DAFS
  – Some relevant technology and systems innovations
      FC, Smart NICs, RDMA, …
Log-structured file systems in detail
Have a good break!

