1 Sean Traber CS-147 Fall 2009

2  7.9 RAID  7.9.1 RAID Level 0  7.9.2 RAID Level 1  7.9.3 RAID Level 2  7.9.4 RAID Level 3  7.9.5 RAID Level 4  7.9.6 RAID Level 5  7.9.7 RAID Level 6  7.9.8 RAID DP (Double Parity)  7.9.9 Hybrid RAID Systems

3  Originally stood for Redundant Arrays of Inexpensive Disks.  The term “inexpensive” was ambiguous and was often assumed to refer to cost, which was not the intent.  It actually referred to the slight performance hit required to make the data storage reliable.  The accepted acronym now uses “independent” instead of “inexpensive”.

4  Based on the 1988 paper “A Case for Redundant Arrays of Inexpensive Disks,” by David Patterson, Garth Gibson, and Randy Katz of U.C. Berkeley.  They coined the term RAID and defined five types of RAID (called levels).  These were the definitions of the levels numbered 1 through 5.  Definitions for levels 0, 6, and DP came later.  Many vendors offer enterprise-class storage systems that are not protected by RAID; these systems are often referred to as JBOD, or Just a Bunch Of Disks.

5  Not a true RAID implementation; it lacks the “R” (redundancy).  Very fast read/write performance.  Not a good way to store sensitive data.  If one disk fails, all the data is lost.  Less reliable than single-disk systems.  An array of 5 disks, each with a design life of ~50,000 hours (about 6 years), gives the entire system an expected design life of 50,000/5 = 10,000 hours (about 14 months).  As the number of disks increases, the probability of failure increases until it reaches near certainty.
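
A minimal sketch (not from the slides) of the design-life arithmetic above, in Python, assuming independent drives with identical design lives:

    # With no redundancy, the failure of any one drive loses the whole array,
    # so the expected design life is roughly the single-drive design life
    # divided by the number of drives.
    def raid0_design_life(drive_life_hours: float, num_drives: int) -> float:
        return drive_life_hours / num_drives

    print(raid0_design_life(50_000, 5))   # 10000.0 hours, about 14 months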

6  Best used for data that is non-critical, or data that rarely changes and is backed up frequently.

7  Also known as disk mirroring.  Writes all data twice: once to the main drives, and once to a set of drives called a mirror set or shadow set.  Inefficient use of space; it requires twice as much space as the amount of data stored.  Highest fault tolerance of any RAID system.  Read performance can as much as double on multi-threaded systems that support “split seeks”.  Reads go to the disk arm that is closest to the target sector.
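
A minimal sketch of the mirrored-write and “split seek” read behaviour described above, using hypothetical drive objects (with read, write, and arm_position members) rather than any real controller API:

    # RAID-1 illustration: every write goes to both copies; a read is served
    # by whichever drive's arm is currently closer to the target sector.
    class Raid1:
        def __init__(self, primary, mirror):
            self.drives = [primary, mirror]

        def write(self, sector, data):
            for drive in self.drives:        # data is written twice
                drive.write(sector, data)

        def read(self, sector):
            closest = min(self.drives,
                          key=lambda d: abs(d.arm_position - sector))
            return closest.read(sector)      # "split seek"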

8  Best suited for transaction-oriented, high-availability environments and applications requiring a high fault tolerance, such as accounting and payroll.

9  Cuts down on the number of extra drives needed to protect the data.  Takes data striping to the extreme, writing only 1 bit per strip, instead of in arbitrary size blocks.  This requires a minimum of 8 surfaces to write to.  Generates Hamming code for data protection and stores it on the extra drives.  Cuts the number of extra drives needed from n to log(n).
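
A minimal sketch of the check-drive arithmetic, assuming a standard single-error-correcting Hamming code with one bit per drive (an illustration, not the exact scheme from the original paper):

    # The smallest number of Hamming check bits r for m data bits satisfies
    # 2**r >= m + r + 1.  With one bit per drive, m data drives need only
    # about log2(m) check drives, versus m extra drives for full mirroring.
    def hamming_check_drives(data_drives: int) -> int:
        r = 0
        while 2 ** r < data_drives + r + 1:
            r += 1
        return r

    print(hamming_check_drives(8))   # 4 check drives for 8 data drives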

10  Hamming code generation is time consuming, and therefore RAID-2 is too slow for most commercial implementations.  RAID-2 is best known just for serving as the theoretical bridge between RAID-1 and RAID-3.

11  Like RAID-2, stripes data one bit at a time across all the data drives.  Uses only one extra drive to store simple parity bits.  Parity is calculated as follows (for even parity): Parity = b0 XOR b1 XOR b2 XOR b3 XOR b4 XOR b5 XOR b6 XOR b7, or equivalently Parity = (b0 + b1 + b2 + b3 + b4 + b5 + b6 + b7) MOD 2.  If a drive fails, the data on the failed drive can be re-created by using the above equation and substituting the parity bit in for the failed bit.
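
A minimal sketch of the parity calculation and the rebuild step, using whole bytes instead of single bits for readability (a simplification, not part of the slide):

    from functools import reduce
    from operator import xor

    # Even parity over a stripe: XOR of all the data values.
    def parity(stripe):
        return reduce(xor, stripe, 0)

    # Rebuild a failed drive's value: XOR the parity with the surviving
    # values; the missing value is whatever makes the total XOR zero.
    def rebuild(surviving, parity_value):
        return reduce(xor, surviving, parity_value)

    stripe = [0b1010, 0b0110, 0b1111, 0b0001]
    p = parity(stripe)                                      # 0b0010
    # Pretend the drive holding 0b1111 failed:
    print(rebuild([0b1010, 0b0110, 0b0001], p) == 0b1111)   # True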

12  Another “theoretical” RAID level; like RAID-2, it would offer poor performance if implemented.  Uses one extra parity disk, like RAID-3, but writes data in blocks of uniform size instead of one bit at a time.  Let’s take a stripe spanning 5 drives (4 data, 1 parity).  To write to strip 3 we have to: 1. Read the data currently occupying strip 3, and the parity strip. 2. XOR the old data with the new data, then XOR the result with the old parity to get the new parity. 3. Write the new data to strip 3. 4. Update the parity strip.  If there were write requests pending for strips 1 & 4, each would have to wait until the previous write completed, since each needs the updated state of the parity.  Previous levels could do multiple writes concurrently, as no strip’s data was dependent upon the data on any other strip.
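
A minimal sketch of the read-modify-write sequence above, with hypothetical read_strip/write_strip helpers standing in for the actual disk I/O:

    # Small-write parity update for a single strip in a RAID-4 stripe.
    def update_strip(strip_index, new_data, parity_index, read_strip, write_strip):
        old_data   = read_strip(strip_index)            # 1. read the old data
        old_parity = read_strip(parity_index)           #    ... and the old parity
        new_parity = old_parity ^ old_data ^ new_data   # 2. recompute the parity
        write_strip(strip_index, new_data)              # 3. write the new data
        write_strip(parity_index, new_parity)           # 4. update the parity strip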

13  This essentially turns the parity drive into a large bottleneck, which negates the performance gains of multiple disk systems.  Due to its expected poor performance, there are no commercial implementations of RAID-4.

14  Similar to RAID-4, but the parity strips are spread throughout the array. This allows for concurrent writes as long as the requests involve different sets of disk arms for both parity and data.  This doesn’t guarantee that writes can be performed concurrently, but RAID-5’s write performance is considered to be “acceptable”.  This design offers the best read performance of all the parity models, but also requires the most complex disk controllers of all the levels.
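
A minimal sketch of one common way to rotate the parity strip across the drives (the exact rotation is implementation-specific; this left-symmetric style is just one example):

    # Place each stripe's parity on a different disk so that no single
    # drive becomes the parity bottleneck it is in RAID-4.
    def parity_disk(stripe_index: int, num_disks: int) -> int:
        return (num_disks - 1 - stripe_index) % num_disks

    for stripe in range(5):
        print(stripe, parity_disk(stripe, 5))
    # stripe 0 -> disk 4, stripe 1 -> disk 3, ..., stripe 4 -> disk 0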

15  RAID-5 has been a commercial success because it offers the best protection for the least cost, and is more commonly used than any of the other RAID systems.

16  The previous levels of RAID could tolerate at most 1 disk failure at a time.  However, disk failures tend to come in clusters for 2 reasons: 1. If they were manufactured at approximately the same time, they will reach the end of their expected useful life at around the same time. 2. Drive failures are often caused by events such as power surges, which hit all drives at the same instant, causing the weakest one to fail first, then the next weakest, and so on.  The larger the drives are, the longer it takes to rebuild one from the parity, which leaves the system vulnerable to a second drive crashing and rendering the entire array useless.

17  RAID-6 attempts to protect against multiple-disk failure by using two sets of error-correction strips for each row of drives.  One strip holds the parity, and the other stores Reed-Solomon error-correcting codes.  Although RAID-6 offers good protection against the entire array becoming useless while a single failed disk is rebuilt, it is not often used commercially because it has 2 performance drawbacks. 1. There is sizeable overhead involved in generating Reed-Solomon code. 2. It requires twice as many read/write operations to update the error-correcting codes.
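
A minimal sketch of the dual-syndrome idea, assuming the common GF(2^8) construction in which P is plain XOR parity and Q weights each data byte by a power of a generator; this is only an illustration of the math, not any vendor's implementation:

    # Multiply two bytes in GF(2^8) using the polynomial x^8+x^4+x^3+x^2+1 (0x11D).
    def gf_mul(a: int, b: int) -> int:
        p = 0
        for _ in range(8):
            if b & 1:
                p ^= a
            carry = a & 0x80
            a = (a << 1) & 0xFF
            if carry:
                a ^= 0x1D
            b >>= 1
        return p

    # P and Q for one byte position across the data drives.
    def pq_syndromes(data_bytes):
        p = q = 0
        g = 1                     # generator powers 1, 2, 4, 8, ... in GF(2^8)
        for d in data_bytes:
            p ^= d                # ordinary parity
            q ^= gf_mul(g, d)     # Reed-Solomon weighted parity
            g = gf_mul(g, 2)
        return p, q

    print(pq_syndromes([0x12, 0x34, 0x56, 0x78]))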

18  Most commonly the DP stands for Double Parity, sometimes called Diagonal Parity.  Protects against two-disk failures by making sure that each data block is protected by two linearly independent parity functions.

19  RAID DP can offer much more reliable protection of arrays containing many disks than the simple parity protection of RAID-5 can.  The second parity gives much better performance than the Reed-Solomon correction of RAID-6.  The write performance is still poorer than that of RAID-5 due to the need for dual reads and writes, so there is a tradeoff of performance for reliability.


21  Most systems have different needs for different operations, and therefore benefit from using a variety of RAID levels.  For example, operating system files are very important, so you may be willing to use the extra space to keep a full mirror copy of those files using RAID-1.  RAID-5 is sufficient for most data files, and temp files can benefit from the speed of RAID-0.  RAID schemes can be combined to form a “new” RAID level. RAID-10 combines the striping of RAID-0 with the mirroring of RAID-1 to give the best possible read performance while providing the best possible availability, but it is extremely expensive.
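
A minimal sketch of how a RAID-10 layout might map a logical block onto mirrored pairs (the exact mapping is implementation-specific; this is only an illustration):

    # RAID-10: data is striped across mirrored pairs.  A logical block is
    # assigned to a pair (striping, as in RAID-0) and then written to both
    # members of that pair (mirroring, as in RAID-1).
    def raid10_targets(logical_block: int, num_pairs: int):
        pair = logical_block % num_pairs    # striping across the pairs
        return [(pair, 0), (pair, 1)]       # both mirrors receive the write

    print(raid10_targets(7, 3))   # block 7 -> pair 1, both members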

