The concept of RAID in Databases By Junaid Ali Siddiqui.

1 The concept of RAID in Databases By Junaid Ali Siddiqui

2 Some Key terms
Concatenated array
- An array in which multiple disk drives or arrays are logically connected together, end to end.
Data drive
- A disk drive that is dedicated to storing data, as opposed to parity, Hamming code, or a hot standby.

3 Logical Disk
- This is what a RAID array is. Although the RAID array consists of multiple disks, it appears to the operating system as a single disk.
Physical Disk
- A physical disk is an actual disk drive. The term is sometimes used to distinguish it from a logical disk.

4 FIGURE 1-1 Logical Drive Including Multiple Physical Drives

5 Segment size
- The number of blocks (sometimes expressed in bytes) that are written to one disk drive before moving on to the next disk drive in the array.
Stripe size
- Similar to segment size, except that it is only valid for RAID-0 arrays. Many manufacturers use this term when they mean segment size.
Stripe width
- The number of blocks that must be written to the array so that every data drive has had a complete segment written, that is, the segment size multiplied by the number of data drives.
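The relationship between segment size and stripe width can be sketched as simple arithmetic. This is only an illustration: the function name `stripe_width_blocks` and the example numbers are assumptions, not part of the original slides.

```python
def stripe_width_blocks(segment_blocks: int, data_drives: int) -> int:
    """Blocks that must be written so every data drive holds one full segment."""
    if segment_blocks <= 0 or data_drives <= 0:
        raise ValueError("segment size and drive count must be positive")
    return segment_blocks * data_drives


# Hypothetical example: 128-block segments across 4 data drives.
print(stripe_width_blocks(128, 4))
```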

6 What is RAID?
Redundant Array of Inexpensive Disks (RAID) is a storage technology used to improve the I/O performance and reliability of storage systems. It is designed to provide reliability in disk array systems and to take advantage of the performance gains that an array of multiple disks offers over single-disk storage.
RAID's two primary underlying concepts are:
- Distributing data over multiple hard drives improves performance.
- Using multiple drives properly allows any one drive to fail without loss of data and without system downtime.

7 Types or levels of RAID
- RAID 0
- RAID 1
- RAID 2
- RAID 3
- RAID 4
- RAID 5
- Compound RAID levels

8 RAID 0
In RAID Level 0 (also called striping), each segment is written to a different disk, until all drives in the array have been written to.
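As a minimal sketch of striping, the mapping below turns a logical block number into a disk index and a block offset on that disk. It assumes segments are handed out to disks in simple round-robin order; the function name `raid0_locate` is illustrative, and real controllers may lay data out differently.

```python
def raid0_locate(logical_block: int, segment_blocks: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    segment_index = logical_block // segment_blocks    # which segment overall
    offset_in_segment = logical_block % segment_blocks
    disk = segment_index % num_disks                   # segments rotate across disks
    stripe = segment_index // num_disks                # full stripes written before this one
    return disk, stripe * segment_blocks + offset_in_segment


# Hypothetical array: 3 disks, 4 blocks per segment.
for block in (0, 4, 8, 12):
    print(block, raid0_locate(block, segment_blocks=4, num_disks=3))
```

With these assumed parameters, blocks 0, 4 and 8 start segments on disks 0, 1 and 2, and block 12 wraps back to disk 0 at offset 4.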

9 Using RAID 0
Advantages
- The I/O performance of a RAID-0 array is significantly better than that of a single disk. This holds for small I/O requests, because several can be processed simultaneously, and for large requests, because multiple disk drives can take part in the operation.
Disadvantages
- This is the only RAID level with no redundancy. If one disk in the array fails, data is lost.

10 RAID 1
In RAID Level 1 (also called mirroring), each disk is an exact duplicate of all other disks in the array. When a write is performed, it is sent to all disks in the array. When a read is performed, it is sent to only one disk. This is the least space-efficient of the RAID levels.

11 Advantages
- RAID-1 arrays with multiple mirrors are often used to improve performance when multiple programs read the data on the disks at the same time. Reading from several mirrors in parallel increases data throughput. The most common use of RAID-1 with multiple mirrors is to improve the performance of databases.
- The read performance of RAID-1 is no worse than that of a single drive. If the RAID controller is intelligent enough to send read requests to alternate disk drives, RAID-1 can significantly improve read performance.
- Also described as a "mirrored set without parity" or simply "mirroring", RAID-1 provides fault tolerance against disk errors and against failure of all but one of the drives. Two (or more) disks each store exactly the same data at all times, and data is not lost as long as one disk survives.
Disadvantage
- This is the least space-efficient of the RAID levels. The total capacity of the array is simply the capacity of one disk.
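The mirroring behaviour described above (writes go to every disk, reads go to one disk, and an intelligent controller alternates reads) can be sketched as follows. The class `Raid1`, its in-memory dictionaries standing in for disks, and the round-robin read policy are all assumptions made for illustration.

```python
class Raid1:
    """Toy mirror set: each 'disk' is a dict mapping block number to data."""

    def __init__(self, num_mirrors: int):
        self.disks = [dict() for _ in range(num_mirrors)]
        self.failed = set()   # indices of disks marked as failed
        self._next = 0        # next disk to try for a read

    def write(self, block: int, data: bytes) -> None:
        # A write is sent to every surviving disk in the array.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block] = data

    def read(self, block: int) -> tuple[int, bytes]:
        # A read is served by one disk only; alternate disks round-robin.
        n = len(self.disks)
        for _ in range(n):
            i = self._next
            self._next = (self._next + 1) % n
            if i not in self.failed:
                return i, self.disks[i][block]
        raise IOError("all mirrors have failed")


array = Raid1(num_mirrors=3)
array.write(0, b"row-1")
print(array.read(0))  # served by disk 0
print(array.read(0))  # served by disk 1
```

Marking disks as failed (for example `array.failed.update({0, 2})`) leaves reads working as long as one mirror survives, which matches the fault-tolerance claim on the slide.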

12 RAID 2
RAID Level 2 is an intellectual curiosity and has never been widely used. It is more space-efficient than RAID-1, but less space-efficient than the other RAID levels.
Instead of using simple parity to validate the data, it uses a much more complex algorithm, called a Hamming code.

13 Advantages
- A Hamming code is larger than a parity, so it takes up more disk space, but with proper code design it is capable of recovering from multiple drives being lost. Apart from RAID-1 with multiple mirrors, RAID-2 is the only simple RAID level that can retain data when multiple drives fail.
Disadvantages
- The primary problem with this RAID level is that the amount of CPU power required to generate the Hamming code is much higher than is required to generate parity.
- In general, all data blocks in the stripe modified by the write must be read in and used to generate new Hamming code data. Also, on large writes, the CPU time to generate the Hamming code is much higher than to generate parity, thus possibly slowing down even large writes.
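To give a feel for why a Hamming code costs more than parity, here is a sketch of the textbook Hamming(7,4) code. It is an assumption that this particular code is representative: it stores three check bits for every four data bits and corrects only a single flipped bit, so recovering from multiple lost drives, as the slide mentions, would need a stronger code design. The function names are illustrative.

```python
def hamming74_encode(d: list[int]) -> list[int]:
    """Encode 4 data bits into 7 bits laid out as positions 1..7."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(c: list[int]) -> list[int]:
    """Fix at most one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # 0 means no error detected
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                      # simulate one bad bit (one bad "drive")
print(hamming74_correct(word))    # recovers the original data bits
```

Compared with parity, which needs one XOR per data bit, each write here computes three separate check bits, which illustrates the extra CPU and space overhead described above.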

14 RAID 3
- RAID Level 3 is defined as bytewise (or bitwise) striping with parity. Every I/O to the array will access all drives in the array, regardless of the type of access (read/write) or the size of the I/O request.
- During a write, RAID-3 stores a portion of each block on each data disk. It also computes the parity for the data, and writes it to the parity drive.
- In some implementations, when the data is read back in, the parity is also read and compared to a newly computed parity, to ensure that there were no errors.
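The write and read paths described above can be sketched with bytewise striping and XOR parity. This is a simplified model under stated assumptions: the block length is a multiple of the number of data drives, byte i goes to drive i modulo the drive count, and the names `raid3_write` and `raid3_read` are hypothetical.

```python
from functools import reduce
from operator import xor


def raid3_write(block: bytes, data_drives: int) -> tuple[list[bytes], bytes]:
    """Split a block bytewise across the data drives and compute the parity."""
    if len(block) % data_drives:
        raise ValueError("block length must be a multiple of the data drive count")
    portions = [block[i::data_drives] for i in range(data_drives)]
    parity = bytes(reduce(xor, column) for column in zip(*portions))
    return portions, parity


def raid3_read(portions: list[bytes], parity: bytes) -> bytes:
    """Reassemble the block, checking parity as some implementations do."""
    recomputed = bytes(reduce(xor, column) for column in zip(*portions))
    if recomputed != parity:
        raise IOError("parity mismatch: data or parity is corrupt")
    return bytes(b for column in zip(*portions) for b in column)


portions, parity = raid3_write(b"ABCDEFGH", data_drives=4)
print(portions)                      # every drive holds part of the block
print(raid3_read(portions, parity))  # original block restored
```

Because every drive holds a slice of every block, both functions touch all drives on every call, which is the behaviour the slide describes.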

15 RAID 3

16 Advantages
- RAID-3 provides a similar level of reliability to RAID-4 and RAID-5.
- RAID-3 is also described as a striped set with dedicated parity, bit-interleaved parity, or byte-level parity. This mechanism provides improved performance and fault tolerance similar to RAID 5, but with a dedicated parity disk.
- One minor benefit of the dedicated parity disk is that if the parity drive fails, operation continues without parity and without a performance penalty.
Disadvantages
- RAID-3 has configuration limitations. The number of data drives in a RAID-3 configuration must be a power of two. The most common configurations have four or eight data drives.
- It is not possible to perform multiple operations on the array at the same time, because all drives are involved in every operation.

17 RAID 4
RAID Level 4 is defined as blockwise striping with parity. The parity is always written to the same disk drive. This can create a great deal of contention for the parity drive during write operations.

18 Advantages
- For reads and large writes, RAID-4 performance will be similar to that of a RAID-0 array containing an equal number of data disks.
- Error detection is achieved through dedicated parity, which is stored on a separate, single disk unit.
Disadvantages
- For small writes, performance will decrease considerably. To understand the cause, consider a one-block write as an example.
A write request for one block is issued by a program.
- The RAID software determines which disks contain the data and the parity, and which block they are in.
- The disk controller reads the data block from disk.
- The disk controller reads the corresponding parity block from disk.
- The data block just read is XORed with the parity block just read, which removes the old data's contribution from the parity.
- The data block to be written is XORed with the resulting parity block, giving the updated parity.
- The data block and the updated parity block are both written to disk.
It can be seen from this example that a one-block write results in two blocks being read from disk and two blocks being written to disk.
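The one-block write sequence above can be sketched in a few lines. Disks are modelled as in-memory lists of equally sized byte blocks, and the function name `small_write` and its parameters are assumptions made for the example.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(disks: list[list[bytes]], parity_disk: int, data_disk: int,
                block_no: int, new_data: bytes) -> tuple[int, int]:
    """Read-modify-write of one block; returns (blocks read, blocks written)."""
    old_data = disks[data_disk][block_no]        # read 1: old data block
    old_parity = disks[parity_disk][block_no]    # read 2: old parity block
    partial = xor_blocks(old_data, old_parity)   # remove old data from the parity
    new_parity = xor_blocks(new_data, partial)   # fold the new data in
    disks[data_disk][block_no] = new_data        # write 1: new data block
    disks[parity_disk][block_no] = new_parity    # write 2: updated parity block
    return 2, 2


# Hypothetical array: three data disks plus one parity disk, one block each.
data = [[b"\x01\x02"], [b"\x10\x20"], [b"\x0f\xf0"]]
parity = xor_blocks(xor_blocks(data[0][0], data[1][0]), data[2][0])
disks = data + [[parity]]
print(small_write(disks, parity_disk=3, data_disk=1, block_no=0, new_data=b"\xaa\x55"))
```

Note that the other data disks are never read: the old data and old parity are enough to compute the new parity, which is why the cost is two reads and two writes regardless of array size.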

19 RAID 5
RAID Level 5 is defined as blockwise striping with parity. It differs from RAID-4 in that the parity data is not always written to the same disk drive.

20 Advantages
RAID-5 has all the performance issues and benefits that RAID-4 has, except as follows:
- Since there is no dedicated parity drive, there is no single point where contention is created. This speeds up multiple small writes.
- Multiple small reads are slightly faster, because data resides on all drives in the array and all drives can take part in read operations.
- Distributed parity requires all drives but one to be present to operate. A failed drive must be replaced, but the array is not destroyed by a single drive failure.
Disadvantages
- The array will lose data in the event of a second drive failure, and it remains vulnerable until the data that was on the failed drive has been rebuilt onto a replacement drive.
- A single drive failure reduces the performance of the entire set until the failed drive has been replaced and rebuilt.
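Two ideas from this slide, rotating parity and rebuilding a failed drive, can be sketched as follows. The rotation rule in `raid5_parity_disk` is only one assumed layout (real controllers use various parity placements), and both function names are hypothetical.

```python
def raid5_parity_disk(stripe: int, num_disks: int) -> int:
    """Assumed rotation: parity starts on the last disk and moves left each stripe."""
    return (num_disks - 1 - stripe) % num_disks


def rebuild_block(surviving: list[bytes]) -> bytes:
    """XOR of all surviving blocks in a stripe yields the missing block."""
    result = bytes(len(surviving[0]))
    for block in surviving:
        result = bytes(x ^ y for x, y in zip(result, block))
    return result


# Parity placement for the first five stripes of a hypothetical 4-disk array.
print([raid5_parity_disk(s, 4) for s in range(5)])

# One stripe: three data blocks plus their parity.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\xf0"
parity = rebuild_block([d0, d1, d2])
# The disk holding d1 fails; its block is recovered from the rest.
print(rebuild_block([d0, d2, parity]) == d1)
```

The rebuild reads every surviving drive for each stripe, which is one way to see why a degraded array is slower and why a second failure during the rebuild loses data.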

21 Compound RAID levels
- There are times when more than one type of RAID must be combined in order to achieve the desired effect. In general, this consists of RAID-0 combined with another RAID level (often RAID-1, RAID-3, or RAID-5).
- The primary reason for combining multiple RAID architectures is to get either a very large or a very fast logical disk.

22 Any questions? Junaid_upesh@yahoo.com

23 Message from the presenter
We spend our days waiting for the ideal path to appear in front of us, but what we forget is that paths are made by walking, not waiting. So always keep yourself on the right path.
Thank you for your attention
References:
- http://www.accs.com/p_and_p/RAID/Recovery.html
- http://www.wikipedia.com
- http://docs.sun.com/source/817-3711-10/ch01_basics.html#14202

