RAID.


1 RAID

2 Not all exponential growth is created equal
RAM capacities have quadrupled every 2-3 years. CPU performance has doubled every 2 years. Disk capacity has improved by 50% a year. Disk transfer rates have improved by 20% a year. Disk access times have improved by 10% a year. Solid State Drives (SSDs) have improved the performance of storage, but storage still hasn't kept up with the improvement in the rest of computer technology.
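To see how quickly these rates diverge, the sketch below compounds each one over a number of years. It is a back-of-the-envelope illustration only: the function names are hypothetical, and "quadrupled every 2-3 years" is assumed to mean every 2.5 years.

```python
def annual_factor(multiple, years):
    """Convert 'grows by `multiple` every `years` years' into a per-year factor."""
    return multiple ** (1.0 / years)

# Per-year growth factors taken from the rates quoted on the slide.
# Assumption: RAM quadrupling "every 2-3 years" is modeled as every 2.5 years.
RATES = {
    "RAM capacity": annual_factor(4, 2.5),
    "CPU performance": annual_factor(2, 2),
    "Disk capacity": 1.50,
    "Disk transfer rate": 1.20,
    "Disk access time": 1.10,
}

def growth_over(years):
    """Total improvement factor for each technology after `years` years."""
    return {name: factor ** years for name, factor in RATES.items()}

if __name__ == "__main__":
    for name, total in growth_over(10).items():
        print(f"{name:20s} x{total:,.1f} over 10 years")
```

Over ten years the gap between RAM capacity and disk access time is about two orders of magnitude under these assumptions.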

3 What electronic technology has lagged even worse than disk performance?
1. Screen Resolution 2. Battery Capacity 3. Network Transfer Rates 4. Programmer Attention Spans

4 Redundant Arrays of Inexpensive Disks
RAID is a combination of software and hardware meant to improve disk performance. Nowadays, the letter I in RAID is sometimes read as "independent". The main goal of RAID is to even out the widely different rates of performance improvement between disks on one hand and memory and microprocessors on the other.

5 Data Striping
Data striping means breaking up contiguous data that would normally go on a single disk. The data is distributed across many disks, either by byte (a) or by block (b), as shown in the slide's figure. With each read or write, all the disks participate, so data access can be parallelized across disks. To the OS (and DBMS), the storage looks like a single, huge, superfast disk.
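Block-level striping can be sketched as a round-robin mapping from logical blocks to disks. This is a minimal illustration that models disks as in-memory lists; the function names and the round-robin layout are assumptions, not something the slides specify.

```python
def locate_block(logical_block, num_disks):
    """Round-robin striping: return (disk index, block index on that disk)."""
    return logical_block % num_disks, logical_block // num_disks

def stripe(data, num_disks, block_size):
    """Split `data` into fixed-size blocks and deal them out to the disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        disk, _ = locate_block(offset // block_size, num_disks)
        disks[disk].append(data[offset:offset + block_size])
    return disks

def unstripe(disks):
    """Reassemble the data by reading one block from each disk in turn."""
    out = []
    depth = max(len(d) for d in disks)
    for row in range(depth):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)

if __name__ == "__main__":
    disks = stripe(b"ABCDEFGHIJ", num_disks=3, block_size=2)
    print(disks)
    print(unstripe(disks))
```

Because consecutive blocks land on different disks, a large sequential read touches every disk, which is where the parallelism comes from.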

6 Mirroring / Shadowing
To make your life confusing, the concept of using replicates (copies of the data) is called mirroring or shadowing in the RAID literature. They mean exactly the same thing. Why have disk replicates? The mean-time-to-failure (approximate lifespan) of a hard drive is about 20 years. But if you have 1000 drives in your datacenter and they fail independently, the mean-time-to-failure for at least one drive is about 20 years / 1000, or roughly a week. You probably don't want to lose important data every week. So you have redundancy / replicates / mirrors / shadows / copies, so that you can recover in the event of failure.
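As a back-of-the-envelope check on figures like these, assume drives fail independently at a constant rate (a simplification the slides do not state). Then the expected time to the first failure among N drives is the single-drive mean-time-to-failure divided by N. The function names below are hypothetical.

```python
import math

DAYS_PER_YEAR = 365

def first_failure_mttf_days(disk_mttf_years, num_disks):
    """Expected days until the first failure among `num_disks` drives.

    Assumes independent failures with a constant (exponential) failure rate.
    """
    return disk_mttf_years * DAYS_PER_YEAR / num_disks

def prob_failure_within(days, disk_mttf_years, num_disks):
    """Probability that at least one drive fails within `days` days."""
    rate_per_day = num_disks / (disk_mttf_years * DAYS_PER_YEAR)
    return 1.0 - math.exp(-rate_per_day * days)

if __name__ == "__main__":
    print(first_failure_mttf_days(20, 1000))   # days until first expected failure
    print(prob_failure_within(30, 20, 1000))   # chance of a failure within a month
```

Real drives do not fail independently, so a figure computed this way is an estimate rather than a guarantee.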

7 Using an external hard drive for full disk backup (e.g. Apple's Time Machine) is which of the following?
1. Replication 2. Mirroring 3. Shadowing 4. Fanboying

8 Detecting Failure and Recovering
In order to detect failure (especially failures smaller than an entire disk exploding), many systems employ error-correcting codes involving parity bits or specialized codes, like Hamming codes. The implementation of such systems is beyond the scope of this class. In the event of a failure, the damaged system is often physically replaced, and the new system is filled with the "lost" data, which is recovered from the other replicates of the data.

9 Load Balancing
Where do you put your replicates? The simplest option is to replicate the entire disk onto another entire disk. However, spreading out the replicates across many disks is better for load balancing (if many requests are for the same data).
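One way to picture the second option is to cut a disk's replicate into chunks and place the chunks on the other disks in turn. The placement rule below (round-robin, skipping the primary) and the function names are assumptions made for illustration; the slides do not prescribe a scheme.

```python
def place_replica_chunks(primary_disk, num_chunks, num_disks):
    """Map each chunk of the primary's replicate to one of the other disks."""
    others = [d for d in range(num_disks) if d != primary_disk]
    return {chunk: others[chunk % len(others)] for chunk in range(num_chunks)}

def disks_able_to_serve(placement, primary_disk):
    """Disks that hold some copy of the primary's data and can share its reads."""
    return {primary_disk} | set(placement.values())

if __name__ == "__main__":
    placement = place_replica_chunks(primary_disk=0, num_chunks=6, num_disks=4)
    print(placement)
    print(disks_able_to_serve(placement, primary_disk=0))
```

With whole-disk mirroring only two disks can answer requests for popular data; with the chunks spread out, every disk in this small example can serve part of it.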

10 Why would distributing a replicate across multiple disks improve performance?
1. It doesn't 2. If the data is popular, many disks could serve up different parts. 3. It is safer to have a replicate broken into pieces 4. Don't do it, splinching might result.

11 RAID Levels
Different RAID organizations are defined by combinations of two factors: the granularity of data interleaving (striping) and the pattern used to compute redundant information.
RAID level 0 has no redundant data and hence has the best write performance, at the risk of data loss; it uses striping to improve read/write performance.
RAID level 1 uses mirrored disks.
RAID level 2 uses memory-style redundancy based on Hamming codes, which contain parity bits for distinct overlapping subsets of components. Level 2 includes both error detection and correction.
RAID level 3 uses byte-level data striping and a single parity disk, relying on the disk controller to figure out which disk has failed.
RAID levels 4 and 5 use block-level data striping, with level 5 distributing data and parity information across all disks.
RAID level 6 applies the so-called P + Q redundancy scheme, using Reed-Solomon codes to protect against up to two disk failures with just two redundant disks.
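The single-parity idea behind levels 3 to 5 can be sketched with bytewise XOR: the parity block is the XOR of the data blocks, so any one missing block equals the XOR of everything that survives. This is a simplified in-memory illustration with hypothetical function names; it assumes the controller already knows which disk failed and does not model the Hamming or Reed-Solomon codes of levels 2 and 6.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of a list of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild(surviving_blocks, parity):
    """Recover the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]      # one block per data disk
    parity = xor_blocks(data)               # stored on the parity disk
    # Suppose the disk holding data[1] fails.
    print(rebuild([data[0], data[2]], parity))
```

The same XOR works whether parity lives on one dedicated disk or is spread across all disks; only the placement differs.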

12 RAID Level Comparison
Different RAID organizations are used in different situations.
RAID level 1 (mirrored disks) is the easiest for rebuilding a disk from other disks. It is used for critical applications like logs.
RAID level 2 uses memory-style redundancy based on Hamming codes, which contain parity bits for distinct overlapping subsets of components. Level 2 includes both error detection and correction.
RAID level 3 (a single parity disk, relying on the disk controller to figure out which disk has failed) and level 5 (block-level data striping) are preferred for large-volume storage, with level 3 giving higher transfer rates.
The most popular uses of RAID technology currently are level 0 (striping), level 1 (mirroring), and level 5 (striping with parity, at the cost of one extra drive's worth of capacity).
Design decisions for RAID include the level of RAID, the number of disks, the choice of parity scheme, and the grouping of disks for block-level striping.
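One design trade-off, how much capacity each level leaves for data, can be tabulated in a few lines. The sketch below counts usable disks for an array of identical disks; it assumes two-way mirroring for level 1 and leaves out level 2, whose overhead depends on the Hamming code used. The function names are hypothetical.

```python
def usable_disks(level, num_disks):
    """Number of disks' worth of capacity left for data at a given RAID level."""
    if level == 0:
        return num_disks            # striping only, no redundancy
    if level == 1:
        return num_disks / 2        # assumption: every disk has exactly one mirror
    if level in (3, 4, 5):
        return num_disks - 1        # one disk's worth of parity
    if level == 6:
        return num_disks - 2        # two disks' worth of redundancy (P + Q)
    raise ValueError(f"level {level} is not covered by this sketch")

def guaranteed_tolerated_failures(level):
    """Disk failures the array is guaranteed to survive."""
    return {0: 0, 1: 1, 3: 1, 4: 1, 5: 1, 6: 2}[level]

if __name__ == "__main__":
    for level in (0, 1, 5, 6):
        print(level, usable_disks(level, 8), guaranteed_tolerated_failures(level))
```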

13 Popular RAID Levels
The most common RAID levels in use are RAID Level 1 (complete mirroring of disks), shown in (a), and RAID Level 5 (striping of data across disks with distributed parity), shown in (b). It is also common to nest and combine RAID levels.

14 Weakness of RAID
Non-independent disk failures - If disks are more likely to fail with use, from wear-and-tear, or from factors that also affect other disks, then many failures may occur at the same time, overwhelming the RAID system.
Disk capacity increasing faster than disk transfer rates - When a disk fails, it can take many hours (even days) to copy the data to a replacement disk. During this time, the RAID system's performance is reduced and, depending on the amount of redundancy, so is its ability to recover from further failures.
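The second weakness follows from simple arithmetic: the minimum rebuild time is capacity divided by sustained transfer rate, and if capacity grows 50% a year while transfer rates grow 20% a year (the figures from slide 2), rebuild time grows by 1.5 / 1.2 = 1.25x each year. The disk size and speed in the usage example are hypothetical round numbers, not measurements.

```python
def rebuild_hours(capacity_bytes, transfer_bytes_per_sec):
    """Lower bound on the time to copy a full disk, in hours."""
    return capacity_bytes / transfer_bytes_per_sec / 3600

def rebuild_time_growth(years, capacity_growth=1.50, transfer_growth=1.20):
    """Factor by which rebuild time grows after `years` years."""
    return (capacity_growth / transfer_growth) ** years

if __name__ == "__main__":
    # Hypothetical disk: 4 TB capacity, 150 MB/s sustained transfer rate.
    print(f"{rebuild_hours(4e12, 150e6):.1f} hours to rebuild")
    print(f"x{rebuild_time_growth(10):.1f} longer after 10 years")
```

A real rebuild usually takes longer than this bound, because the array keeps serving normal requests while it copies.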

15 What RAID Level corresponds to Complete Replication across sites?
1. RAID 0 (striping) 2. RAID 1 (mirroring) 3. RAID 2-6 (striping and mirroring) 4.

16 What RAID Level corresponds to Partitioning across sites?
1. RAID 0 (striping) 2. RAID 1 (mirroring) 3. RAID 2-6 (striping and mirroring) 4.

17 What RAID Level corresponds to Partial Replication across sites?
1. RAID 0 (striping) 2. RAID 1 (mirroring) 3. RAID 2-6 (striping and mirroring) 4.

