A Case for Redundant Arrays of Inexpensive Disks (RAID) -1988

David A. Patterson, Garth Gibson, and Randy H. Katz
Presenter: Craig Bergstrom

Outline
Motivation: The Pending I/O Crisis
A Potential Solution, Caveats
A Better Solution, RAID Levels 1-5
Conclusion and Evaluation

Assumptions
The only failures of concern are failures of the disk-head assembly during normal operation.
Times-to-failure are independent and exponentially distributed.
The minimum user request is one sector, and a sector is small relative to a track.

Problem: The Pending I/O Crisis
CPU speeds are increasing very quickly (Joy’s law).
Primary memory speeds are increasing very quickly.
Secondary storage capacity is growing fast enough, but its speed, limited by seek time, is not keeping pace with the rest of the system.

Hard Disk Architecture

Secondary I/O Access Time
Seek time: the time it takes to position the head over the track to be read or written (measured in ms).
Rotational latency: the time from when the read/write head is over the right track until data begins transferring from the magnetic disk (for 7200 rpm, worst case ~8.33 ms).
Internal and external data transfer times also contribute, but they are very fast.
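The 8.33 ms figure is simply the time for one full rotation at 7200 rpm. A minimal sketch of that arithmetic follows; the function name is illustrative, and the "average is half a rotation" line assumes the target sector is equally likely to be anywhere on the track.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm  # 60,000 ms per minute divided by rotations per minute


# Worst case: the wanted sector has just passed under the head.
worst_case_ms = rotation_time_ms(7200)
# Assumed average case: half a rotation.
average_ms = worst_case_ms / 2

print(f"worst case: {worst_case_ms:.2f} ms, average: {average_ms:.2f} ms")
```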

Latency vs. Bandwidth
Figure taken from: “Latency Lags Bandwidth” by David A. Patterson, Communications of the ACM, October 2004.

Amdahl’s Law
S = initial time / improved time = relative speedup
F = fraction of time spent in the improved components
K = speedup achieved in the improved components
S = 1 / ((1 - F) + F/K)
If current systems spend 10% of their time in I/O and everything else speeds up by a factor of 10, the effective speedup is only about 5x. If everything else speeds up by 100x, the effective speedup is only about 10x.
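The two speedup figures can be checked directly from Amdahl's law, S = 1 / ((1 - F) + F/K), with F = 0.9 because only the non-I/O 90% of the time is improved. This is a small sketch; the function name is illustrative.

```python
def amdahl_speedup(f: float, k: float) -> float:
    """Overall speedup when a fraction f of the time is sped up by factor k."""
    return 1.0 / ((1.0 - f) + f / k)


# 10% of the time is I/O, so the improved fraction is 0.9.
for k in (10, 100):
    print(f"K = {k:>3}: S = {amdahl_speedup(0.9, k):.2f}")
```

The results land a little above 5 and a little below 10, which is why the slide rounds them to 5x and 10x. However large K gets, S can never exceed 1 / (1 - F) = 10.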

Solution: Arrays of Disks
Motivation: Consumer disks offer a high performance/price ratio.

Problem: Reliability
MTTFarray = MTTFdisk / N, where N is the number of disks in the array.
This means that the MTTF of an array of 75 inexpensive Conners CP3100s will be 30,000 hours / 75 = 400 hours.
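The reliability arithmetic can be reproduced in a few lines. This sketch assumes, as the deck does, independent and exponentially distributed failures, so that the array fails as soon as any one disk fails; the function name is illustrative.

```python
def array_mttf_hours(disk_mttf_hours: float, n_disks: int) -> float:
    """MTTF of a non-redundant array: any single disk failure loses data."""
    return disk_mttf_hours / n_disks


mttf = array_mttf_hours(30_000, 75)
print(f"{mttf:.0f} hours, or about {mttf / 24:.1f} days")
```

In other words, a single disk rated for more than three years becomes an array that is expected to lose data in under three weeks.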

Solution: Redundant Arrays
We add redundancy so the array can recover when a single component fails.
Break the disks into groups, each with extra check disks that hold redundant information.
When one disk in a group fails, we use the redundant information to reconstruct what was lost.
Failed disks are replaced periodically; MTTR = Mean Time To Repair.
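MTTR matters because data is lost only if a second disk in the same group fails before the first is repaired. The sketch below uses the closed form given in the original paper, MTTF_RAID = MTTF_disk^2 / ((D + C * nG) * (G + C - 1) * MTTR), which relies on the deck's assumptions of independent, exponentially distributed failures. D (total data disks), G (data disks per group), C (check disks per group), and nG = D / G follow the paper's notation; the function name and the one-hour MTTR are illustrative choices.

```python
def raid_mttf_hours(disk_mttf: float, d: int, g: int, c: int, mttr: float) -> float:
    """Approximate MTTF of a redundant array that tolerates one failure per group.

    d: total data disks, g: data disks per group,
    c: check disks per group, mttr: mean time to repair in hours.
    """
    n_groups = d / g
    total_disks = d + c * n_groups
    # A group loses data only if one of its other (g + c - 1) disks fails
    # during the repair window of the first failure.
    return disk_mttf ** 2 / (total_disks * (g + c - 1) * mttr)


HOURS_PER_YEAR = 24 * 365

# Mirroring (g = 1, c = 1) versus one check disk per ten data disks.
for label, g, c in (("mirrored pairs", 1, 1), ("groups of 10 + 1", 10, 1)):
    hours = raid_mttf_hours(30_000, d=100, g=g, c=c, mttr=1.0)
    print(f"{label}: {hours:,.0f} hours (~{hours / HOURS_PER_YEAR:.0f} years)")
```

With a short repair time, the redundant array is expected to outlast a single disk by orders of magnitude, even though it contains many more disks.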

RAID Level 1: Mirroring
RAID level 1 mirrors data across two disks, creating a redundant copy.
Tandem doubles the number of controllers for fault tolerance.
Every write to a data disk is also a write to a check disk (increased overhead).
A read costs essentially the same as with a single disk.

RAID Level 2: Hamming Code for ECC
Computer architects added redundant chips to correct single errors and detect double errors in groups of memory chips.
The same principles can be applied to secondary storage by bit-interleaving data across disks and adding enough check disks to correct errors.
Poor performance on small reads and writes.
10 data disks require 4 check disks; 25 data disks require 5 check disks.
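The check-disk counts follow from the standard bound for a single-error-correcting Hamming code: r check bits can cover m data bits when 2^r >= m + r + 1. A short sketch, assuming one bit per disk as in bit-interleaving (the function name is illustrative):

```python
def hamming_check_disks(data_disks: int) -> int:
    """Smallest r with 2**r >= data_disks + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_disks + r + 1:
        r += 1
    return r


for m in (10, 25):
    r = hamming_check_disks(m)
    print(f"{m} data disks -> {r} check disks ({100 * r / m:.0f}% overhead)")
```

Larger groups amortize the check disks better, which is why the overhead drops from 40% to 20% between the two cases on the slide.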

RAID Level 3: Single Check Disk Per Group
Most check disks in level 2 are used to determine which disk failed.
Often, disk controllers already provide that information.
If we know which disk failed, then we only need one check disk per group.
Poor performance on small reads and writes.
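One check disk is enough because, once the failed disk is known, its contents are the XOR of everything that survives. The sketch below illustrates this with byte strings standing in for disks; the helper name and sample data are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data "disks" and one check disk holding their parity.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
check = xor_blocks(data)

# The controller reports that disk 1 failed; rebuild it from the survivors.
survivors = [data[0], data[2], check]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
print("rebuilt disk 1:", rebuilt.hex())
```

Parity alone cannot say which disk is wrong, only what the missing value must be, so this scheme depends on the controller identifying the failed disk.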

RAID Level 4: Parallel Reads
Interleave sectors instead of bits.
Sector-by-sector interleaving lets us read multiple sectors at a time, since each small read touches only one disk.
However, every write still requires exclusive access to the group's single check disk, which results in poor write performance.
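The write bottleneck is easier to see from how a small write updates parity: the new check sector is the old check sector XOR the old data XOR the new data, so each small write reads and rewrites the check disk. A sketch with byte strings as sectors (helper names and sample values are illustrative):

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Parity after a small write, without reading the other data disks."""
    return xor_bytes(old_parity, xor_bytes(old_data, new_data))


# One sector on each of three data disks, plus the check sector.
sectors = [b"\x01\x02", b"\x10\x20", b"\xa0\xb0"]
parity = xor_bytes(xor_bytes(sectors[0], sectors[1]), sectors[2])

# Overwrite the sector on disk 1: two reads (old data, old parity), two writes.
new_sector = b"\xff\x00"
parity = updated_parity(sectors[1], new_sector, parity)
sectors[1] = new_sector

# The shortcut agrees with recomputing parity from every data disk.
assert parity == xor_bytes(xor_bytes(sectors[0], sectors[1]), sectors[2])
```

Two writes to different data disks could otherwise proceed in parallel, but both must update the same check disk, so they are serialized there.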

RAID Level 5: No Single Check Disk
To avoid serializing writes on one check disk, we spread the check data across all of the disks in the group.
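One way to picture the spreading is to rotate the check sector from disk to disk, stripe by stripe. The rotation rule below (stripe number modulo disk count) is a hypothetical layout chosen for illustration; the slides do not specify one, and real layouts vary.

```python
def parity_disk(stripe: int, n_disks: int) -> int:
    """Disk holding the check sector for a stripe (hypothetical rotation rule)."""
    return stripe % n_disks


def layout(n_stripes: int, n_disks: int):
    """One row per stripe: 'P' marks the check sector, 'D' a data sector."""
    return [
        ["P" if disk == parity_disk(stripe, n_disks) else "D" for disk in range(n_disks)]
        for stripe in range(n_stripes)
    ]


for row in layout(5, 5):
    print(" ".join(row))
```

Because consecutive stripes keep their check sectors on different disks, small writes to different stripes can update parity in parallel instead of queuing on one disk.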

RAID Levels 2-5 Performance
All figures for a group size of 10 disks.

Performance Relative To SLED

Conclusions
RAID is a cost-effective option to achieve improved size and performance for secondary storage I/O.

Evaluation
Overall, this was an excellent argument supporting the use of RAID instead of Single Large Expensive Disks (SLEDs).
The credibility of the argument is slightly weakened because faults related to cabling and packaging are ignored.
Some of the particulars may be outdated, since the paper was published in 1988.

Citations
Patterson, David A., Garth Gibson, and Randy H. Katz. “A Case for Redundant Arrays of Inexpensive Disks (RAID)”, Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data. (1988).
Patterson, David A. “Latency Lags Bandwidth”, Communications of the ACM. Volume 47. (2004).
Chen, Peter M., et al. “RAID: High-Performance, Reliable Secondary Storage”, ACM Computing Surveys. Oct 29,
