Bridging the Information Gap in Storage Protocol Stacks


1 Bridging the Information Gap in Storage Protocol Stacks
Timothy E. Denehy, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau
University of Wisconsin, Madison

2 State of Affairs
- File system: namespace, files, metadata, layout, free space
- Interface: block-based read/write
- Storage system: parallelism, redundancy

3 Problem
The information gap may cause problems:
- Poor performance: partial stripe write operations
- Duplicated functionality: logging in both the file system and the storage system
- Reduced functionality: the storage system lacks knowledge of files
It is time to re-examine the division of labor.

4 Our Approach
Enhance the storage interface:
- Expose performance and failure information (Exposed RAID)
- Use that information to provide new functionality (Informed LFS): on-line expansion, dynamic parallelism, flexible redundancy

5 Outline
- ERAID Overview
- I·LFS Overview
- Functionality and Evaluation: on-line expansion, dynamic parallelism, flexible redundancy, lazy redundancy
- Conclusion

6 ERAID Goals Backwards compatibility
Block-based interface Linear, concatenated address space Expose information to the file system above Regions Performance Failure Allow file system to utilize semantic knowledge

7 ERAID Regions Region Regions can be added to expand the address space
Contiguous portion of the address space Regions can be added to expand the address space Region composition RAID: One region for all disks Exposed: Separate regions for each disk Hybrid ERAID

8 ERAID Performance Information
Exposed on a per-region basis Queue length and throughput Reveals Static disk heterogeneity Dynamic performance and load fluctuations ERAID

9 ERAID Failure Information
Exposed on a per-region basis Number of tolerable failures Reveals Static differences in failure characteristics Dynamic failures to file system above ERAID X RAID1
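
To make the exposed interface concrete, here is a minimal sketch of the per-region information an ERAID volume might hand to the file system above. The struct name, field names, and the eraid_get_region() query are illustrative assumptions; the paper does not specify this exact interface.

```c
#include <stdint.h>

/*
 * Hypothetical sketch of per-region information exposed by an
 * ERAID volume. Names and fields are assumptions for illustration.
 */
struct eraid_region {
    uint64_t start_block;        /* first block of this region */
    uint64_t nblocks;            /* contiguous blocks in the region */

    /* performance information, updated dynamically */
    uint32_t queue_length;       /* outstanding requests */
    uint32_t throughput_kbps;    /* recently observed bandwidth */

    /* failure information */
    uint32_t tolerable_failures; /* 0 = exposed disk, 1 = mirrored, ... */
    int      failed;             /* nonzero once the region has failed */
};

/* Hypothetical query: fill in the info for region 'idx' of a volume. */
int eraid_get_region(int volume_fd, unsigned idx, struct eraid_region *out);
```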

10 Outline
- ERAID Overview
- I·LFS Overview
- Functionality and Evaluation: on-line expansion, dynamic parallelism, flexible redundancy, lazy redundancy
- Conclusion

11 I·LFS Overview
Log-structured file system:
- Transforms all writes into large sequential writes
- All data and metadata are written to a log
- The log is a collection of segments; a segment table describes each segment (sketched below)
- A cleaner process produces empty segments
Why use LFS for an informed file system?
- The write-anywhere design provides flexibility
- The ideas are applicable to other file systems
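
To make the log structure concrete, here is a minimal sketch of segment-table bookkeeping in the spirit of an LFS segment usage table. All names and fields are assumptions for illustration, not the actual I·LFS code.

```c
#include <stdint.h>

/* Illustrative per-segment state flags; names are assumptions. */
#define SEG_EMPTY  0x1   /* segment holds no live data, may be reused */
#define SEG_ACTIVE 0x2   /* segment is currently being written */

struct seg_entry {
    uint32_t live_bytes;   /* bytes of live data remaining */
    uint32_t flags;        /* SEG_EMPTY, SEG_ACTIVE, ... */
    uint64_t write_time;   /* when the segment was last written */
};

struct seg_table {
    uint32_t nsegs;          /* segments currently usable */
    uint32_t max_segs;       /* oversized to leave room for expansion */
    struct seg_entry *segs;  /* one entry per segment */
};
```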

12 I·LFS Goals
- Improve performance, functionality, and manageability
- Minimize system complexity
I·LFS exploits ERAID information to provide:
- On-line expansion
- Dynamic parallelism
- Flexible redundancy
- Lazy redundancy

13 I·LFS Experimental Platform
- NetBSD 1.5
- 1 GHz Intel Pentium III Xeon, 128 MB RAM
- Four fast disks: Seagate Cheetah 36XL, 21.6 MB/s each
- Four slow disks: Seagate Barracuda 4XL, 7.5 MB/s each

14 I·LFS Baseline Performance
- Four slow disks: 30 MB/s
- Four fast disks: 80 MB/s

15 Outline
- ERAID Overview
- I·LFS Overview
- Functionality and Evaluation: on-line expansion, dynamic parallelism, flexible redundancy, lazy redundancy
- Conclusion

16 I·LFS On-line Expansion
- Goal: expand storage incrementally, in both capacity and performance
- Ideal: instant disk addition, minimizing downtime and simplifying administration
- I·LFS supports on-line addition of new disks

17 I·LFS On-line Expansion Details
- ERAID provides an expandable address space
- Expansion is equivalent to adding empty segments
- Start with an oversized segment table, then activate the new portion of the table when a disk is added (see the sketch below)

18 I·LFS On-line Expansion Experiment
I·LFS immediately takes advantage of each extra disk

19 I·LFS Dynamic Parallelism
- Goal: perform well on heterogeneous storage, with both static performance differences and dynamic performance fluctuations
- Ideal: maximize the throughput of the storage system
- I·LFS writes data in proportion to each region's performance

20 I·LFS Dynamic Parallelism Details
- ERAID provides dynamic performance information
- Most file system routines are unchanged: they see only the ERAID linear address space, which reduces file system complexity
- Only the segment selection routine is aware of ERAID regions and performance; it chooses the next segment based on current performance (see the sketch below)

21 I·LFS Static Parallelism Experiment
- Simple striping is limited by the rate of the slowest disk
- I·LFS provides the full throughput of the system

22 I·LFS Dynamic Parallelism Experiment
I·LFS adjusts to the performance fluctuation

23 I·LFS Flexible Redundancy
- Goal: offer new redundancy options to users
- Ideal: a range of mechanisms and granularities
- I·LFS provides mirrored per-file redundancy

24 I·LFS Flexible Redundancy Details
- ERAID exposes region failure characteristics
- Separate files are used for redundancy: the original file at even inode N, its redundant copy at odd inode N+1 (see the sketch below)
- Original and redundant data live in different sets of regions, with flexible data placement within the regions
- Recursive vnode operations implement the redundant files, leveraging existing routines to reduce complexity

25 I·LFS Flexible Redundancy Experiment
I·LFS provides a throughput and reliability tradeoff

26 I·LFS Lazy Redundancy
- Goal: avoid the performance penalty of replication
- Ideal: replicate data immediately before a failure
- I·LFS offers redundancy with delayed replication, avoiding the replication penalty for short-lived files

27 I·LFS Lazy Redundancy Details
- ERAID exposes region failure characteristics
- Segments needing replication are flagged
- The cleaner acts as the replicator: it locates flagged segments, checks data liveness and lifetime, and generates redundant copies of files (see the sketch below)
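
A sketch of the replication pass the cleaner might run, building on the seg_table sketch from earlier. SEG_NEEDS_COPY, the age threshold, and replicate_live_data() are hypothetical stand-ins, not the actual I·LFS interface.

```c
/* Illustrative replication pass piggybacked on the cleaner. */
#define SEG_NEEDS_COPY 0x4

extern void replicate_live_data(uint32_t segno);  /* copy live blocks
                                                     to a redundant region */

static void
ilfs_replicate_pass(struct seg_table *st, uint64_t now, uint64_t min_age)
{
    for (uint32_t i = 0; i < st->nsegs; i++) {
        struct seg_entry *se = &st->segs[i];

        if (!(se->flags & SEG_NEEDS_COPY))
            continue;                        /* nothing flagged here */
        if (se->live_bytes == 0) {
            se->flags &= ~SEG_NEEDS_COPY;    /* data already deleted:
                                                replication avoided */
            continue;
        }
        if (now - se->write_time < min_age)
            continue;                        /* young data may still be
                                                short-lived; defer */
        replicate_live_data(i);
        se->flags &= ~SEG_NEEDS_COPY;
    }
}
```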

28 I·LFS Lazy Redundancy Experiment
I·LFS avoids performance penalty for short-lived files

29 Outline
- ERAID Overview
- I·LFS Overview
- Functionality and Evaluation: on-line expansion, dynamic parallelism, flexible redundancy, lazy redundancy
- Conclusion

30 Comparison with Traditional Systems
Can traditional systems provide the same functionality?
- On-line expansion: yes
- Dynamic parallelism (heterogeneous storage): yes, but with duplicated functionality
- Flexible redundancy: no; the storage system is not aware of file composition
- Lazy redundancy: no; the storage system is not aware of file deletions

31 Conclusion
- Introduced ERAID and I·LFS
- Extra information enables new functionality that is difficult or impossible in traditional systems
- Minimal complexity: a 19% increase in code size
- It is time to re-examine the division of labor

32 Questions?

