Presentation is loading. Please wait.

Presentation is loading. Please wait.

Redundant Array of Inexpensive Disks aka Redundant Array of Independent Disks (RAID) Modified from CCT slides.

Similar presentations


Presentation on theme: "Redundant Array of Inexpensive Disks aka Redundant Array of Independent Disks (RAID) Modified from CCT slides."— Presentation transcript:

1 Redundant Array of Inexpensive Disks aka Redundant Array of Independent Disks (RAID) Modified from CCT slides

2 The challenge  Disk transfer rates are improving, but CPU performance is increasing faster  Hungry for data  Use many disks in parallel to improve data rate  bandwidth not latency  Striping  Striping reduces reliability  10 disks have 1/10th the MTBF (mean time between failures) of one disk  Must address reliability too 2

3 Reliability  Multiple failures at same time are extremely rare  SATA MTBF ~ 600,000 hours (up to ~1.5 Million)  To improve reliability, add error correction data  Adds redundancy too  So:  Performance from striping  Reliability from error correction  which steals back a bit of the performance gain 3

4 RAID  A RAID = Redundant Array of Inexpensive Disks  or Redundant Array of Independent Disks  Disks are small and cheap, so it’s easy to put lots of disks (10s to 100s) in one box for increased storage, performance, and availability  Data plus some redundant information is striped across the disks in some way  How striping is done is key to performance and reliability  Animated tutorial 1, Animated Tutorial 2 Animated tutorial 1Animated Tutorial 2 9/14/20154

5 Some RAID tradeoffs  Granularity  fine-grained: stripe each file over all disks  high throughput for the file  limits transfer to 1 file at a time  course-grained: stripe each file over only a few disks  limits throughput for 1 file  allows concurrent access to multiple files  Redundancy  uniformly distribute redundancy information on disks  avoids load-balancing problems  concentrate redundancy information on a small number of disks  partition the disks into data disks and redundancy disks 9/14/20155

6 RAID Level 0: Non-Redundant Striping  RAID Level 0 is a non-redundant disk array  Files are striped across disks, no redundant info  High (single file) read throughput  Best write throughput (no redundant info to write)  Any disk failure  data loss  What is lost?  Is this concept used elsewhere? 9/14/20156

7 RAID Level 1: Mirrored Disks  Files are striped across half the disks, and mirrored to the other half  2x space expansion  Reads:  Read from either copy  Writes:  Write both copies  On failure, just use the surviving disk 9/14/20157 What is the effect on performance? How many simultaneous disk failures can be tolerated?

8 RAID Levels 2, 3, and 4: Striping + Parity Disk  RAID 2, 3, and 4 use Error Correcting Code (ECC) aka parity  parity calculated per byte each byte (XOR parity)  A large read accesses all the data disks  A single block read accesses only one disk (RAID 4)  A write updates one or more data disks plus the parity disk  Recovers from single disk failures (How?)  Better ECC  higher failure resistance  more parity disks 9/14/20158

9 Refresher: What’s parity?  To each byte, add a bit set so that the total number of 1’s is even  Any single missing bit can be reconstructed  Why does memory parity not work quite this way?  Think of ECC as just being similar but fancier (more capable) 9/14/20159 101101101

10 RAID 3 Parity  XP = X3 (+) X2 (+) X1 (+) X0  (+) = XOR  Assume X1 fails  Detect  Regenerate missing bit  X1 = X3 (+) X2 (+) XP (+) X0  Replace drive 1 and regenerate whole drive 9/14/2015 © 2007 Gribble, Lazowska, Levy, Zahorjan 10 How detect?

11 Raid 3?  Looks great in theory  “In theory, there is no difference between theory and practice. In practice there is.”  Requires synchronously rotating disks  And a specialized controller  In practice implement an approximation  But not much used 9/14/2015 © 2007 Gribble, Lazowska, Levy, Zahorjan 11

12 RAID Level 5  RAID Level 5 uses block interleaved distributed parity  Like parity scheme, but distribute the parity info (as well as data) over all disks  for each block, one disk holds the parity, and the other disks hold the data  Significantly better performance  parity disk is not a hot spot 9/14/201512...

13 RAID Performance comparison RAID type Common Name DescriptionRel. Cost Disks Relative Data Availability Large Read Data Transfer Speed 1 Large Write Data Transfer Speed Random Read Request Rate Random Write Request Rate 0Data Striping User data distributed across the disks in the array. No check data Nlower than single disk very high 1MirroringUser data duplicated on N separate disks. (N is usually 2). Check data is second copy (as in Figure 20) Nhigher than RAID Level 3, 4, 5; lower than RAID 2, 6 higher than single disk (up to 2x) Slightly lower than single disk up to N x single disk similar to single disk 0+1Striped Mirrors User data striped across M separate pairs of mirrored disks. Check data is second copy 2Mhigher than RAID Level 3, 4, 5; lower than RAID 2, 6 much higher than single disk higher than single disk much higher than single disk higher than single disk 2User data striped across N disks. Hamming code check data distributed across m disks, (m is determined by N). N+mhigher than RAID Level 3, 4, 5 highest of all listed types highest of all listed types approximately 2x single disk 3RAID 3, Parallel Transfer Disks with Parity Synchronized disks Each user data block distributed across all data disks. Parity check data stored on one disk. N+1much higher than single disk; comparable to RAID 2, 4, 5 highest of all listed types approximately 2x single disk 4Independent disks User data distributed as with striping. Parity check data stored on one disk. N+1much higher than single disk; comparable to RAID 2, 3, 5 similar to disk striping slightly lower than disk striping similar to disk striping significantly lower than single disk 5RAID 5, “RAID” Independent disks User data distributed as with striping; Parity check data distributed across disks. N+1much higher than single disk; comparable to RAID 2, 3, 4 slightly higher than disk striping slightly lower than disk striping Slightly higher than disk striping Significantly lower than single disk; higher than RAID 4 6RAID 6As RAID Level 5, but with additional independently computed distributed check data. N+2highest of all listed types slightly higher than RAID 5 lower than RAID 5 slightly higher than RAID 5 lower than RAID 5

14 Raid – learn more  http://www.aaa-datarecovery.com/raid_tutorial.htm http://www.aaa-datarecovery.com/raid_tutorial.htm  Homework: 1. Describe clearly in your own words the difference between Raid 1-0 and Raid 0-1 2. Compare Raid 10 to Raid 5 In your answers address performance (read and write), costs, reliability 3. Calculate the XOR parity bit for 10110101 (show work) 9/14/2015 © 2007 Gribble, Lazowska, Levy, Zahorjan 14

15 Typical Implementation 9/14/201515 Disks Controller OS

16 9/14/201516 RAID 0,1 0+1 vs 1+0? Which one more robust? Rebuild?

17 9/14/201517 RAID 5

18 Final Issues  If you’re running a RAID level with redundancy, do you need backup?  What’s the difference between RAID and backup?  Does RAID provide “sufficient” reliability?  What if you’re Amazon.com? 9/14/201518

19 Heading for 5 nines  5 nines = 99.999% uptime = ?? Minutes per year down? (Bell telephone)  Tier I Single path for power and cooling distribution, no redundant components, 99.671% availability  Tier II Single path for power and cooling distribution, redundant components, 99.741% availability  Tier III Multiple power and cooling distribution paths, but only one path active, redundant components, concurrently maintainable, 99.982% availability  Tier IV Multiple active power and cooling distribution paths, redundant components, fault tolerant, 99.995% availability 9/14/2015 © 2007 Gribble, Lazowska, Levy, Zahorjan 19

20 9/14/2015 © 2001 The Uptime Institute 20 Uptime Institute (link)link


Download ppt "Redundant Array of Inexpensive Disks aka Redundant Array of Independent Disks (RAID) Modified from CCT slides."

Similar presentations


Ads by Google