Lecture 3: A Case for RAID (Part 1) Prof. Shahram Ghandeharizadeh Computer Science Department University of Southern California.

Terminology
- RAID stands for Redundant Array of Inexpensive Disks.
- Supercomputer applications, in the paper's terminology, are today's scientific applications.
- Disk drive service time components: seek time, rotational latency, and transfer time.
- DRAM: Dynamic Random Access Memory.
- WORM: Write Once Read Many.
- MTTF: Mean Time To Failure. The basic measure of reliability for non-repairable systems; a statistical value, meant to be the mean over a long period of time and a large number of units.

Motivation
- Scientific applications read a large volume of data, perform computation, and write a large volume of data. Examples include simulation of earthquakes, material science, and others.
- While CPUs are becoming faster, I/O has not kept pace.
- What is the impact of improving the performance of some pieces of a problem while leaving others the same?

Example
Consider a scientific task that spends 10% of its time performing I/O and the remaining 90% computing.

Motivation (Cont…)
[Figure: with a 1X CPU, I/O takes 10% of the time and computation takes 90%.]

Motivation (Cont…)
[Figure: with a 2X CPU, computation shrinks from 90% to 45% of the original time, while I/O remains 10%.]
Speedup is 1.8X (not 2X).

Motivation (Cont…)
[Figure: with a 4X CPU, computation shrinks to 22.5% of the original time, while I/O remains 10%.]
Speedup is 3X (not 4X).

Motivation (Cont…)
[Figure: with a 10X CPU, computation shrinks to 9% of the original time, while I/O remains 10%.]
Speedup is 5X (not 10X).

Motivation (Cont…)
Amdahl's law: S = 1 / ((1 - f) + f/k), where
- S = observed speedup,
- f = fraction of work in faster mode,
- k = speedup while in faster mode.

Motivation (Cont…)
Amdahl's law with f = fraction of work in faster mode = 0.9 and k = speedup while in faster mode = 10:
S = 1 / ((1 - 0.9) + 0.9/10) = 1 / 0.19 ≈ 5.3.
Speedup is 5X (not 10X).
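
Below is a minimal sketch (Python; not part of the original slides) that plugs the lecture's numbers into Amdahl's law:

    def amdahl_speedup(f, k):
        """Amdahl's law: overall speedup when a fraction f of the work
        runs k times faster and the rest is unchanged."""
        return 1.0 / ((1.0 - f) + f / k)

    # The lecture's example: 90% computation sped up, 10% I/O unchanged.
    for k in (2, 4, 10):
        print(f"{k}X CPU -> speedup {amdahl_speedup(0.9, k):.2f}X")
    # 2X CPU -> speedup 1.82X
    # 4X CPU -> speedup 3.08X
    # 10X CPU -> speedup 5.26X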

Summary
At some point, the I/O bottleneck will dominate and must be addressed. How? Two alternative approaches:
1. Faster, more expensive disk drives, or
2. Redundant arrays of inexpensive disks.
Why RAID? Less expensive than approach 1, higher performance, scalable, lower power consumption, and more reliable.

Target Workload
- Scientific applications that read/write/read-modify-write large amounts of data (megabytes and gigabytes). Service time is dominated by transfer time; the need is for a high data rate.
- OLTP applications that read/write/read-modify-write 4-kilobyte blocks. Service time is dominated by the number of disk arms and by seek and rotational delays; the need is for a high I/O rate.

RAID 1: Disk Mirroring
- Contents of Disks 1 and 2 are identical.
- Redundant paths keep data available in the presence of either a controller or a disk failure.
- A write operation by a CPU is directed to both disks. A read operation is directed to one of the disks; each disk might be reading different sectors simultaneously.
- This is Tandem's architecture.
[Figure: CPU 1 connected to Controllers 1 and 2; each controller has a path to both Disk 1 and Disk 2.]

RAID 1: Disk Mirroring (Cont…)
Space overhead is 100% because data is replicated. Key performance metrics: service time, response time, and throughput.

RAID 1: Disk Mirroring (Cont…)
Most expensive option because space overhead is 100%, as data is replicated. Key performance metrics:
- Service time is a function of the unit of transfer with size B: S(B) = seek time + rotational latency + B / transfer rate.
- Response time: RT = S(B) + queuing delays.
- Throughput: number of requests serviced per unit of time. The paper uses throughput as its comparison yardstick, assuming there are always requests waiting to be processed.
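
A minimal sketch (Python; the drive parameters are illustrative assumptions, not values from the paper) of the service-time model above:

    def service_time_ms(block_bytes, seek_ms=8.0, rotational_ms=4.0,
                        transfer_bytes_per_s=50e6):
        """S(B) = seek time + rotational latency + B / transfer rate.
        Default parameters are made-up but plausible drive numbers."""
        transfer_ms = block_bytes / transfer_bytes_per_s * 1e3
        return seek_ms + rotational_ms + transfer_ms

    def response_time_ms(block_bytes, queuing_delay_ms):
        """RT = S(B) + queuing delays."""
        return service_time_ms(block_bytes) + queuing_delay_ms

    # A 4 KB OLTP read is dominated by seek + rotation; a 10 MB
    # scientific read is dominated by transfer time.
    print(service_time_ms(4 * 1024))       # ~12.1 ms
    print(service_time_ms(10 * 1024**2))   # ~221.7 ms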

RAID 1: Large Requests
What are the possible ways to process a large read request referencing data item A?
[Figure: A stored on both Disk 1 and Disk 2.]

RAID 1: Large Requests
Process the request using one of the drives. What is the disadvantage?

RAID 1: Large Requests
Process the request using one of the drives. What is the disadvantage? Consider two requests, a read and a write, that arrive back to back. The write request must wait for the read request to finish on Disk 1.

RAID 1: Large Requests
Alternative: retrieve ½ of A from each disk drive.
- Distributes requests more evenly, minimizing queuing delays.
- The request completes only when the slower of the two disks finishes, so the observed seek and rotational latencies will be larger than the expected averages (see the sketch below).
- Introduce a slow-down factor S, 1 ≤ S ≤ 2.
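
A minimal Monte Carlo sketch (Python; the uniform positioning-delay model is an assumption for illustration) of why a split request sees worse-than-average latency: it must wait for the slower of the two independent disks.

    import random

    def positioning_delay_ms():
        """Seek + rotational delay, drawn uniformly in [0, 10] ms
        (an illustrative assumption, not a real drive model)."""
        return random.uniform(0.0, 10.0)

    N = 100_000
    one_disk = [positioning_delay_ms() for _ in range(N)]
    split = [max(positioning_delay_ms(), positioning_delay_ms())
             for _ in range(N)]

    avg_one = sum(one_disk) / N    # ~5.0 ms: the expected average
    avg_split = sum(split) / N     # ~6.7 ms: expectation of the max of two
    print(avg_split / avg_one)     # observed slow-down factor S ~ 1.33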

RAID 1: Throughput
With large reads, two requests are serviced simultaneously, one per disk in a mirrored pair. When compared with 1 disk, the throughput of RAID 1 is twice as high. With D data disks, the throughput of RAID 1 is 2D; with the slow-down factor, it is 2D/S.

RAID 1: Small Reads
With small reads, partitioning the request across the disks wastes bandwidth because of seek and rotational delays. Instead, direct each request to one disk. The factor of improvement relative to 1 disk is D.

RAID 1: Writes
All write operations are directed to both disks in a mirrored pair, so when compared with 1 disk, the write throughput of a mirrored pair is identical. With D data disks, when compared with 1 disk, the throughput is D times higher. With large writes, because each write is partitioned across the pair, the observed rotational delay and seek time are higher than average: with D disks, when compared with 1 disk, the throughput is D/S times higher.

RAID 1: R-M-W
With read-modify-write operations, compare RAID 1 with 1 disk for small writes: in the time required to process 3 requests with 1 disk, RAID 1 processes 4. Throughput is 4/3 times higher with RAID 1; with D disks, it will be 4D/3 times higher.
[Figure: schedules over six service slots; the modify step happens in memory between each read and its write:
  One disk: R1 W1 R2 W2 R3 W3
  Disk 1:   R1 W1 W2 R3 W3 W4
  Disk 2:   R2 W1 W2 R4 W3 W4]

RAID 1: Summary
When compared with 1 disk, with D data disks and slow-down factor S (factors from the preceding slides):
- Large reads: 2D/S
- Small reads: D
- Large writes: D/S
- Small writes: D
- Small read-modify-writes: 4D/3
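
A minimal sketch (Python; factors assembled from the preceding slides) that tabulates RAID 1 throughput relative to one disk:

    def raid1_relative_throughput(d, s=1.0):
        """Throughput of RAID 1 with D data disks relative to one disk;
        S is the slow-down factor (1 <= S <= 2) that applies when a
        large request is partitioned across a mirrored pair."""
        return {
            "large reads":  2 * d / s,
            "small reads":  d,
            "large writes": d / s,
            "small writes": d,
            "small r-m-w":  4 * d / 3,
        }

    for kind, factor in raid1_relative_throughput(d=4, s=1.3).items():
        print(f"{kind}: {factor:.2f}x")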

RAID 1: Disk Failures
- When Disk 2 fails, Disk 1 services all read and write requests.
- To repair Disk 2, replace it with a new disk drive and copy data from Disk 1 to the new one (eagerly versus lazily). MTTR (Mean Time To Repair) is the time to perform both steps.
- During the copy process, the performance of RAID 1 might be slower than one disk.
- If Disk 1 fails while Disk 2 is in the process of being reconstructed, then data is lost. The likelihood of this is very small.

RAID 2
We ignore RAID 2 because it is only a stepping stone towards describing RAID levels 3 to 5.

RAID 3 to 5
- Computers represent data as a sequence of 0s and 1s (bits).
- Eight consecutive bits form a byte.
- Disks store data in 512-byte sectors. The unit of transfer is greater than one sector.
- Disk controllers detect failed disks.

RAID 3
- A fixed number of data disks with one parity disk.
- Bits of a unit of transfer are interleaved across the disks.
- A parity bit is computed across the G disks in a group and stored on the parity disk.
- Ideal for scientific applications with large transfers. Very low performance for applications with small units of transfer, e.g., 4 KB.

RAID 3
E.g., consider a block consisting of a string of bits.
[Figure: the block's bits interleaved across Disk 1 through Disk 4, with a parity bit for each group of four bits stored on the Parity disk.]
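
A minimal sketch (Python; the bit values are made up for illustration, since the slide's figure is not reproduced here) of how the parity bit is computed for a group of G = 4 data disks:

    # Bits of one stripe, one bit per data disk (illustrative values).
    stripe = [0, 1, 1, 0]   # Disk 1 .. Disk 4

    # Even parity: XOR of the data bits, stored on the parity disk.
    parity = 0
    for bit in stripe:
        parity ^= bit
    print(parity)  # 0, the XOR of 0, 1, 1, 0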

RAID 3: Large Block Reads
A reference to a large block activates all 4 data disks in a group, reads all the bits, and assembles them to reconstruct the block. When compared with one disk, the throughput with D data disks is D times higher.

RAID 3: Large Block Reads
When Disk 3 fails, a reference for a large unit of transfer activates the remaining data disks and the parity disk. The parity bit is used to reconstruct the missing bit from Disk 3, as sketched below.
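
A minimal sketch (Python, reusing the illustrative bit values from above) of the reconstruction: the missing bit is the XOR of the surviving data bits and the parity bit.

    from functools import reduce
    from operator import xor

    stripe = [0, 1, 1, 0]               # Disk 1 .. Disk 4 (illustrative)
    parity = reduce(xor, stripe)        # bit stored on the parity disk

    failed = 2                          # Disk 3 (index 2) has failed
    survivors = [b for i, b in enumerate(stripe) if i != failed]
    recovered = reduce(xor, survivors) ^ parity
    assert recovered == stripe[failed]  # the lost bit is reconstructed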

RAID 3: Small Block Reads
Bad news: a small read of less than the group size requires reading the whole group. E.g., a read of one sector requires reading 4 sectors. One parity group has a read rate identical to one disk.

RAID 3: Small Block Reads
Given a large number of disks, say D = 12, enhance performance by constructing several parity groups, say 3. With G (4) disks per group and D (say 8), the number of concurrent read requests supported by RAID 3 when compared with one disk is the number of groups (2). The number of groups is D/G.
[Figure: Parity Group 1 = Disks 1-4 plus Parity 1; Parity Group 2 = Disks 5-8 plus Parity 2.]

RAID 3: Small Writes
A small write requires:
1. Reading all sectors in a group,
2. Computing the new parity using the old and new data,
3. Writing the new sectors in the group along with the new parity block.
When compared with one disk, the throughput of RAID 3 with D data disks for small writes is D/2G: each of the D/G groups is busy for a read pass and a write pass per request. For read-modify-writes, read all sectors in a group, modify the group in memory, and write it back; throughput with D disks is D/G times higher than one disk. A sketch of these factors follows.
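
A minimal sketch (Python; factors taken from the slides above) of RAID 3 small-request throughput relative to one disk:

    def raid3_small_request_factors(d, g):
        """Throughput relative to one disk, per the slides: a group
        (G data disks plus parity) behaves like a single disk for small
        requests, so only D/G independent requests can proceed at once."""
        groups = d / g
        return {
            "small reads":  groups,      # D/G: one read busies a whole group
            "small writes": groups / 2,  # D/2G: read pass plus write pass
            "small r-m-w":  groups,      # D/G: read group, modify, write back
        }

    print(raid3_small_request_factors(d=8, g=4))
    # {'small reads': 2.0, 'small writes': 1.0, 'small r-m-w': 2.0}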

RAID 3: Summary
When compared with 1 disk, with D data disks and G disks per group (factors from the preceding slides): large reads are D times higher; small reads are D/G; small writes are D/2G; small read-modify-writes are D/G.