
1 Lecture 3: A Case for RAID (Part 1) Prof. Shahram Ghandeharizadeh Computer Science Department University of Southern California

2 Terminology RAID stands for Redundant Array of Inexpensive Disks. Supercomputer applications are today's scientific applications. Disk drive: seek time, rotational latency, transfer time. DRAM: Dynamic Random Access Memory. WORM: Write Once Read Many. MTTF: Mean Time To Failure, the basic measure of reliability for non-repairable systems; a statistical value, meant to be the mean over a long period of time and a large number of units.

3 Motivation Scientific applications read a large volume of data, perform computation, and write a large volume of data. Examples include simulation of earthquakes, material science, and others. While CPUs are becoming faster, I/O has not kept pace. What is the impact of improving the performance of some pieces of a problem while leaving others the same?

4 Example Consider a scientific task that spends 10% of its time performing I/O and the remaining 90% computing.

5 Motivation (Cont…) [Bar chart: with a 1X CPU, 10% of the time is spent on I/O and 90% on computation.]

6 Motivation (Cont…) [Bar chart: with a 2X CPU, computation drops from 90% to 45% of the original time while I/O stays at 10%.] Speedup is 1.8X (not 2X).

7 Motivation (Cont…) [Bar chart: with a 4X CPU, computation drops to 22.5% of the original time while I/O stays at 10%.] Speedup is 3X (not 4X).

8 Motivation (Cont…) [Bar chart: with a 10X CPU, computation drops to 9% of the original time while I/O stays at 10%.] Speedup is 5X (not 10X).

9 Motivation (Cont…) Amdahl’s law: S = 1 / ((1 - f) + f/k), where S = observed speedup, f = fraction of work in faster mode, and k = speedup while in faster mode.

10 Motivation (Cont…) Amdahl’s law with f = 0.9 and k = 10: S = 1 / (0.1 + 0.9/10) = 1 / 0.19 ≈ 5.26, so the speedup is about 5X, not 10X.
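To make the speedup arithmetic on the preceding slides concrete, here is a minimal Python sketch of Amdahl's law; the function name and the printed table are illustrative, not part of the lecture.

```python
def amdahl_speedup(f, k):
    """Amdahl's law: overall speedup when a fraction f of the work
    runs k times faster and the remaining (1 - f) is unchanged."""
    return 1.0 / ((1.0 - f) + f / k)

# Reproduce the slides: 90% computation sped up by 2X, 4X, and 10X,
# while the 10% spent on I/O stays the same.
for k in (2, 4, 10):
    print(f"{k}X CPU -> overall speedup {amdahl_speedup(0.9, k):.2f}X")
# 2X -> 1.82X, 4X -> 3.08X, 10X -> 5.26X (the 1.8X, 3X, and 5X on the slides)
```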

11 Summary At some point, the I/O bottleneck will dominate and must be addressed. How? Two alternative approaches: 1. expensive disk drives that are faster, or 2. redundant arrays of inexpensive disks. Why RAID? Less expensive than approach 1, higher performance, scalable, lower power consumption, more reliable.

12 Target Workload Scientific applications that read/write/read-modify-write large amounts of data (megabytes and gigabytes). Service time is dominated by transfer time; they need a high data rate. OLTP applications that read/write/read-modify-write 4 Kilobyte blocks. Service time is dominated by the number of disk arms, seek and rotational delays; they need a high I/O rate.

13 RAID 1: Disk Mirroring Contents of disks 1 and 2 are identical. Redundant paths keep data available in the presence of either a controller or disk failure. A write operation by a CPU is directed to both disks. A read operation is directed to one of the disks. Each disk might be reading different sectors simultaneously. [Diagram (Tandem’s architecture): CPU 1 connects through Controller 1 and Controller 2 to Disk 1 and Disk 2.]

14 RAID 1: Disk Mirroring (Cont…) Space overhead is 100% because data is replicated. Key performance metrics: service time, response time, and throughput.

15 RAID 1: Disk Mirroring (Cont…) Most expensive option because space overhead is 100%, as data is replicated. Key performance metrics: Service time is a function of the unit of transfer with size B: S(B) = Seek Time + Rotational Latency + (B / Transfer Rate). Response time: RT = S(B) + queuing delays. Throughput: the number of requests serviced per unit of time. This paper uses throughput as a comparison yardstick. It assumes there are always requests waiting to be processed.
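As an illustration of these metrics, the sketch below evaluates S(B) for a hypothetical drive; the seek time, rotational latency, and transfer rate are made-up values chosen only to show that small requests are dominated by positioning and large requests by transfer time.

```python
def service_time(block_bytes, seek_ms=12.0, rotational_latency_ms=8.3,
                 transfer_rate_mb_s=2.0):
    """S(B) = seek time + rotational latency + B / transfer rate.
    The default drive parameters are illustrative only."""
    transfer_ms = block_bytes / (transfer_rate_mb_s * 1e6) * 1000.0
    return seek_ms + rotational_latency_ms + transfer_ms

# A 4 KB OLTP request is dominated by positioning (seek + rotation) ...
print(service_time(4 * 1024))        # ~22 ms, mostly seek and rotation
# ... while a 1 MB scientific transfer is dominated by transfer time.
print(service_time(1 * 1024 * 1024)) # ~545 ms, mostly transfer
```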

16 RAID 1: Large Requests What are the possible ways to process a read request for a large data item A? [Diagram: CPU 1 connects through Controller 1 and Controller 2 to Disk 1 and Disk 2; both disks hold a copy of A.]

17 RAID 1: Large Requests 1. Process the request using one of the drives. What is the disadvantage? [Diagram: both Disk 1 and Disk 2 hold a copy of A.]

18 RAID 1: Large Requests 1. Process the request using one of the drives. What is the disadvantage? Consider two requests, a read and a write, that arrive back to back. The write request must wait for the read request to finish on Disk 1. [Diagram: Read A is serviced by Disk 1 while Write B, which must update both disks, waits behind it.]

19 RAID 1: Large Requests 2. Retrieve ½ of A from each disk drive: this distributes requests more evenly, minimizing queuing delays. Observed seek and rotational latencies will be larger than the expected averages. Introduce a slow-down factor S, 1 ≤ S ≤ 2. [Diagram: A is split into halves A1 and A2 stored on both disks; Disk 1 services Read A1 and Disk 2 services Read A2, each followed by Write B.]

20 RAID 1: Throughput With large reads, two requests are serviced simultaneously. When compared with 1 disk, the throughput of a mirrored pair is twice as high. With D data disks, the throughput of RAID 1 is 2D times that of one disk; with the slow-down factor, it is 2D/S. [Diagram: CPU 1 connects through Controller 1 and Controller 2 to Disk 1 and Disk 2.]
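To see why splitting a large read helps despite the slow-down factor, here is a rough sketch that reuses the service-time formula from slide 15; the drive parameters, the value of S, and applying S to the whole half-request are simplifying assumptions, not numbers from the paper.

```python
def service_time(block_bytes, seek_ms=12.0, rotation_ms=8.3, rate_mb_s=2.0):
    # S(B) = seek + rotational latency + B / transfer rate (illustrative numbers)
    return seek_ms + rotation_ms + block_bytes / (rate_mb_s * 1e6) * 1000.0

B = 1 * 1024 * 1024   # a 1 MB request for data item A
S = 1.3               # assumed slow-down factor, 1 <= S <= 2

one_disk = service_time(B)          # read all of A from one disk
mirrored = S * service_time(B / 2)  # read A1 and A2 in parallel, inflated positioning
print(f"one disk: {one_disk:.0f} ms, mirrored split: {mirrored:.0f} ms")
```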

21 RAID 1: Small Reads With small reads, partitioning the request across the disks wastes bandwidth because of seek and rotational delays. Direct the request to one disk. The factor of improvement relative to 1 disk is D. [Diagram: both disks hold A; the small Read A is directed to Disk 1 only.]

22 RAID 1: Writes All write operations are directed to both disks in a mirrored pair. When compared with 1 disk, the throughput of a mirrored pair is identical. With D disks, when compared with 1 disk, the throughput is D times higher. With large writes, because the write is partitioned across the disks, the rotational delay and seek time are higher than average. With D disks, when compared with 1 disk, the throughput is D/S times higher.

23 RAID 1: R-M-W With Read-Modify-Write operations, compare RAID 1 with 1 disk for small writes: in the time required to process 3 requests with 1 disk, RAID 1 processes 4. Throughput is 4/3 times higher with RAID 1; with D disks, it will be 4D/3 times higher. [Timeline: a single disk services R1 W1 R2 W2 R3 W3; in the mirrored pair, Disk 1 services R1 W1 W2 R3 W3 W4 while Disk 2 services R2 W1 W2 R4 W3 W4, with the modify steps overlapped.]
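The relative throughput factors quoted on slides 20 through 23 can be collected in one place; the sketch below simply restates them as a function of D and S (the function name and example values are illustrative).

```python
def raid1_relative_throughput(D, S):
    """Throughput of RAID 1 with D data disks relative to a single disk,
    using the factors stated on the preceding slides."""
    return {
        "large reads":             2 * D / S,   # halves read in parallel, slowed by S
        "small reads":             D,           # each read directed to one disk
        "small writes":            D,           # every write goes to both disks of a pair
        "large writes":            D / S,       # partitioned writes see higher delays
        "small read-modify-write": 4 * D / 3,   # 4 requests in the time 1 disk does 3
    }

print(raid1_relative_throughput(D=10, S=1.3))
```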

24 RAID 1: Summary When compared with 1 disk: large reads are 2D/S times higher, large writes D/S, small reads D, small writes D, and small read-modify-writes 4D/3.

25 RAID 1: Disk Failures When Disk 2 fails, Disk 1 services all read and write requests. To repair Disk 2: 1. replace it with a new disk drive, and 2. copy data from Disk 1 to the new one: Eager versus Lazy. MTTR is the time to perform both steps. During the copy process, the performance of RAID 1 might be lower than that of one disk. If Disk 1 fails while Disk 2 is in the process of being reconstructed, then data is lost. The likelihood of this is very small. [Diagram: CPU 1 connects through Controller 1 and Controller 2 to Disk 1 and Disk 2.]

26 RAID 2 Ignore RAID 2 because it is only a stepping stone towards describing RAID levels 3 to 5.

27 RAID 3 to 5 Computers represent data as a sequence of 0s and 1s (bits): …01011110101010000001101001111…. Eight consecutive bits form a byte. Disks store data in 512-byte sectors. The unit of transfer is greater than one sector. Disk controllers detect failed disks.

28 RAID 3 A fixed number of data disks with one parity disk. Bits of a unit of transfer are interleaved across the disks. A parity bit is computed for the G disks in a group and stored on the parity disk. Ideal for scientific applications with large transfers. Very low performance for applications with small units of transfer, e.g., 4 KB.

29 RAID 3 E.g., consider a block consisting of 01011110101010000001101001111… [Diagram: the bits are interleaved across the group — Disk 1: 0101, Disk 2: 1111, Disk 3: 0101, Disk 4: 1010 — with the parity disk not yet filled in.]

30 RAID 3 E.g., consider a block consisting of 01011110101010000001101001111… [Diagram: Disk 1: 0101, Disk 2: 1111, Disk 3: 0101, Disk 4: 1010, Parity: 0101 — each parity bit is the XOR of the corresponding data bits.]
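As a sketch of the parity computation this figure illustrates, the snippet below XORs the bit strings on the four data disks and reproduces the 0101 on the parity disk; the bit strings are taken from the figure.

```python
# Bit-interleaved data from the figure, one string of bits per data disk.
disks = ["0101", "1111", "0101", "1010"]

# Parity bit per position: XOR of the bits in that position across the group,
# i.e. the bit that makes the number of 1s even.
parity = "".join(
    str(int(d1) ^ int(d2) ^ int(d3) ^ int(d4))
    for d1, d2, d3, d4 in zip(*disks)
)
print(parity)  # -> "0101", matching the parity disk in the figure
```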

31 RAID 3: Large Block Reads A reference to a large block activates all 4 data disks in a group, reads all the bits, and reassembles them to reconstruct the block. When compared with one disk, throughput with D data disks is D times higher. [Diagram: Disk 1: 0101, Disk 2: 1111, Disk 3: 0101, Disk 4: 1010, Parity: 0101.]

32 RAID 3: Large Block Reads When Disk 3 fails, a reference to a large unit of transfer activates the remaining data disks and the parity disk. The parity bits are used to reconstruct the missing bits from Disk 3. [Diagram: Disk 1: 0101, Disk 2: 1111, Disk 3: failed, Disk 4: 1010, Parity: 0101.]
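Reconstruction after a failure works the same way: XORing the surviving data disks with the parity disk yields the failed disk's contents. A minimal sketch using the bit strings from the figure:

```python
from functools import reduce

surviving = ["0101", "1111", "1010"]   # Disk 1, Disk 2, Disk 4 (Disk 3 has failed)
parity    = "0101"

# Each missing bit is the XOR of the surviving bits and the parity bit
# in the same position.
disk3 = "".join(
    str(reduce(lambda a, b: a ^ b, (int(bit) for bit in column)))
    for column in zip(*surviving, parity)
)
print(disk3)  # -> "0101", the contents of the failed Disk 3
```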

33 RAID 3: Small Block Reads Bad news: a small read of less than the group size requires reading the whole group. E.g., a read of one sector requires reading 4 sectors. One parity group has a read rate identical to that of one disk. [Diagram: Disk 1: 0101, Disk 2: 1111, Disk 3: 0101, Disk 4: 1010, Parity: 0101.]

34 RAID 3: Small Block Reads Given a large number of disks, say D=12, enhance performance by constructing several parity groups, say 3. With G=4 disks per group and D=8 data disks, the number of small read requests supported by RAID 3, when compared with one disk, is the number of groups (2). The number of groups is D/G. [Diagram: Parity Group 1 = Disks 1–4 plus Parity 1; Parity Group 2 = Disks 5–8 plus Parity 2; …]

35 RAID 3: Small Writes A small write requires: 1. reading all sectors in a group, 2. computing the new parity using the old and new data, and 3. writing the new sectors in the group along with the new parity block. When compared with one disk, the throughput of RAID 3 with D data disks for small writes is D/2G. For Read-Modify-Writes, read all sectors in a group, modify the group in memory, and write it back; throughput with D disks is D/G times higher than one disk.
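A minimal in-memory sketch of the three-step small-write procedure listed above; the group dictionary and the xor_parity and raid3_small_write helpers are illustrative names, not from the paper.

```python
from functools import reduce

def xor_parity(sectors):
    """Byte-wise XOR of the data sectors in a group (the RAID 3 parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*sectors))

def raid3_small_write(group, disk_index, new_sector):
    """Small write to one data disk of a RAID 3 parity group."""
    data = list(group["data"])          # 1. read all sectors in the group
    data[disk_index] = new_sector       #    apply the incoming small write
    new_parity = xor_parity(data)       # 2. compute the new parity from old + new data
    group["data"] = data                # 3. write the new sector ...
    group["parity"] = new_parity        #    ... along with the new parity
    return group

# Usage: a toy group of 4 data "disks", each holding a one-byte sector.
group = {"data": [b"\x05", b"\x0f", b"\x05", b"\x0a"], "parity": None}
group["parity"] = xor_parity(group["data"])
raid3_small_write(group, disk_index=2, new_sector=b"\x03")
print(group["parity"])  # b'\x03' -- the parity now reflects the updated group
```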

36 RAID 3: Summary When compared with 1 disk: large reads are D times higher, small reads D/G, small writes D/2G, and read-modify-writes D/G.

