CSE521: Introduction to Computer Architecture Mazin Yousif I/O Subsystem RAID (Redundant Array of Independent Disks)

MSY F02 - RAID

Improvement in microprocessor performance (~50%) widely exceeds that of disk access time (~10%), which is limited by mechanical components. Improvement in magnetic media density has also been slow (~20%).
Solution: disk arrays, which use parallelism between multiple disks to improve aggregate I/O performance.
–Disk arrays stripe data across multiple disks and access them in parallel
–Capacity penalty to store redundant data
–Bandwidth penalty to update it

Positive aspects of disk arrays:
–Higher data transfer rate on large data accesses
–Higher I/O rates on small data accesses
–Uniform load balancing across all the disks - no hot spots (hopefully)
Negative aspects of disk arrays:
–Higher vulnerability to disk failures - need to employ redundancy in the form of an error-correcting code to tolerate failures
Several data striping and redundancy schemes exist. Sequential access generates the highest data transfer rate with minimal head positioning; random access generates high I/O rates with lots of head positioning.

Data is striped for improved performance:
–Distributes data over multiple disks to make them appear as a single fast, large disk
–Allows multiple I/Os to be serviced in parallel: multiple independent requests can be serviced by different disks, and a single block request may be serviced in parallel by multiple disks
Data is redundant for improved reliability:
–A large number of disks in an array lowers the reliability of the array: MTTF of N disks = MTTF of 1 disk / N
Example: 50,000 hours / 70 disks ≈ 700 hours, so the disk system MTTF drops from about 6 years to about 1 month
–Arrays without redundancy are too unreliable to be useful
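The MTTF arithmetic above can be checked with a short script (a sketch assuming independent disk failures, which is what the linear 1/N model implies; the 50,000-hour disk MTTF and 70-disk array are the slide's example figures):

```python
# Array MTTF falls linearly with disk count under the slide's model:
# MTTF_array = MTTF_disk / N (independent failures, no redundancy).
def array_mttf(disk_mttf_hours, n_disks):
    return disk_mttf_hours / n_disks

mttf = array_mttf(50_000, 70)      # ~714 hours, roughly one month
years = 50_000 / (24 * 365)        # a single disk: roughly 5.7 years
print(mttf, years)
```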

RAID 0 (Non-redundant)
–Stripes data, but does not employ redundancy
–Lowest cost of any RAID level
–Best write performance - no redundant information to update
–Any single disk failure is catastrophic
–Used in environments where performance is more important than reliability

RAID 0 layout (each cell is one stripe unit; one row of units across all disks is a stripe):

            Disk 1   Disk 2   Disk 3   Disk 4
Stripe 0:     D0       D1       D2       D3
Stripe 1:     D4       D5       D6       D7
Stripe 2:     D8       D9       D10      D11
Stripe 3:     D12      D13      D14      D15
Stripe 4:     D16      D17      D18      D19
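The layout above can be sketched as a tiny address-mapping function (hypothetical helper names; round-robin placement as in the D0..D19 diagram):

```python
# Map a logical stripe-unit number to (disk index, stripe number)
# for a 4-disk RAID 0 array: units are placed round-robin across disks.
N_DISKS = 4

def raid0_locate(unit):
    """Return (0-based disk index, stripe number) for logical unit `unit`."""
    return unit % N_DISKS, unit // N_DISKS

# D5 lives on Disk 2 (index 1) in stripe 1, matching the diagram:
assert raid0_locate(5) == (1, 1)
```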

RAID 1 (Mirrored)
–Uses twice as many disks as a non-redundant array - 100% capacity overhead - two copies of the data are maintained
–Data is simultaneously written to both copies
–Data is read from the copy with the shorter queuing, seek, and rotational delays - best read performance
–When a disk fails, the mirrored copy is still available
–Used in environments where availability and performance (I/O rate) are more important than storage efficiency

RAID 2 (Memory-Style ECC)
–Uses a Hamming code - parity for distinct overlapping subsets of the data
–The number of redundant disks is proportional to the log of the total number of disks - better for large numbers of disks - e.g., 4 data disks require 3 redundant disks
–If a disk fails, the other data in its subsets is used to regenerate the lost data
–Multiple redundant disks are needed to identify which disk is faulty

RAID 3 (Bit-Interleaved Parity)
–Data is bit-wise interleaved over the data disks
–Uses a single parity disk to tolerate disk failures - overhead is 1/N
–Logically a single high-capacity, high-transfer-rate disk
–Reads access the data disks only; writes access both data and parity disks
–Used in environments that require high bandwidth (scientific computing, image processing, etc.) rather than high I/O rates
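The parity scheme behind RAID 3 (and 4/5) is plain XOR: the parity disk holds the XOR of the data disks, so any single lost disk is the XOR of all the survivors. A minimal sketch (byte-level for readability, whereas RAID 3 interleaves at the bit level):

```python
# XOR a list of equal-length byte blocks together.
def xor_bytes(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data = [b"\x0f\xf0", b"\x33\xcc", b"\xaa\x55"]   # three data "disks"
parity = xor_bytes(data)                          # parity "disk"

# Disk 1 fails; rebuild its contents from the surviving disks plus parity.
rebuilt = xor_bytes([data[0], data[2], parity])
assert rebuilt == data[1]
```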

RAID 4 (Block-Interleaved Parity)
–Similar to the bit-interleaved parity disk array, except data is block-interleaved (striping units)
–Read requests smaller than one striping unit access only one striping unit
–Write requests update the data block and the parity block
–Generating parity requires 4 I/O accesses (read-modify-write)
–The parity disk is updated on every write - a bottleneck

RAID 5 (Block-Interleaved Distributed Parity)
–Eliminates the parity-disk bottleneck of RAID 4 - distributes parity among all the disks
–Data is distributed among all disks
–All disks participate in read requests - better read performance than RAID 4
–Write requests update the data block and the parity block
–Generating parity requires 4 I/O accesses (read-modify-write)
–Left-symmetric vs. right-symmetric parity placement - left-symmetric placement allows each disk to be traversed once before any disk is traversed twice

RAID 5 layout (P is the parity unit for its stripe; parity rotates across the disks):

            Disk 1   Disk 2   Disk 3   Disk 4   Disk 5
Stripe 0:     D0       D1       D2       D3       P
Stripe 1:     D4       D5       D6       P        D7
Stripe 2:     D8       D9       P        D10      D11
Stripe 3:     D12      P        D13      D14      D15
Stripe 4:     P        D16      D17      D18      D19
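The rotation in the diagram above can be expressed as a one-line placement rule (a sketch with a hypothetical helper name; this matches a 5-disk layout where parity starts on the last disk and moves one disk left per stripe):

```python
# Which disk holds the parity unit for a given stripe, for N = 5 disks?
# Parity rotates: stripe 0 -> disk index 4, stripe 1 -> index 3, ...
N = 5

def parity_disk(stripe):
    return (N - 1 - stripe) % N

assert [parity_disk(s) for s in range(5)] == [4, 3, 2, 1, 0]
```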

RAID 5 small write (read-modify-write), updating data block D0 to D0' in a stripe D0 D1 D2 D3 P:
1. Read old data D0
2. Read old parity P
3. Write new data D0'
4. Write new parity P' = D0 xor D0' xor P
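The four-access update above in code: because XOR cancels, the new parity can be computed from just the old data, new data, and old parity, without reading the rest of the stripe. A minimal sketch (single-byte blocks for brevity):

```python
# Step 4 of the read-modify-write: P' = old_parity ^ old_data ^ new_data.
def updated_parity(old_parity, old_data, new_data):
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

d0, d1, d2 = b"\x01", b"\x02", b"\x04"
p = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))   # full-stripe parity

d0_new = b"\x08"
p_new = updated_parity(p, d0, d0_new)   # touches only 2 disks, 4 I/Os total

# Same result as recomputing parity over the whole new stripe:
assert p_new == bytes(a ^ b ^ c for a, b, c in zip(d0_new, d1, d2))
```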

RAID 6 (P + Q Redundancy)
–Uses Reed-Solomon codes to protect against up to 2 disk failures
–Data is distributed among all disks
–Two sets of parity, P and Q
–Write requests update the data block and both parity blocks
–Generating parity requires 6 I/O accesses (read-modify-write) - both P and Q must be updated
–Used in environments with stringent reliability requirements

Comparisons
–Read/write performance
RAID 0 provides the best write performance
RAID 1 provides the best read performance
–Cost - total number of disks for N data disks
RAID 1 is the most expensive - 100% capacity overhead - 2N disks
RAID 0 is the least expensive - N disks - no redundancy
RAID 2 needs N + ceiling(log2 N) + 1 disks
RAID 3, RAID 4, and RAID 5 need N + 1 disks
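The disk counts above can be turned into a quick consistency check (a sketch; `disks_needed` is a hypothetical helper and `n` is the number of data disks):

```python
import math

# Total disks required per RAID level for n data disks, per the slide.
def disks_needed(level, n):
    return {
        "RAID0": n,                                  # no redundancy
        "RAID1": 2 * n,                              # full mirror
        "RAID2": n + math.ceil(math.log2(n)) + 1,    # Hamming check disks
        "RAID3": n + 1,                              # one parity disk
        "RAID4": n + 1,
        "RAID5": n + 1,
    }[level]

# RAID 2's example from the earlier slide: 4 data disks need 3 redundant.
assert disks_needed("RAID2", 4) - 4 == 3
```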

Preferred environments
–RAID 0: performance and capacity are more important than reliability
–RAID 1: high I/O rate, high-availability environments
–RAID 2: large I/O data transfers
–RAID 3: high-bandwidth applications (scientific computing, image processing, ...)
–RAID 4: high-bandwidth applications
–RAID 5 and RAID 6: mixed applications


Performance:
–What metric? IOPS? Bytes/sec? Response time? IOPS per dollar? A hybrid?
–Application dependent:
Transaction processing: IOPS per dollar
Scientific applications: bytes/sec per dollar
File servers: both IOPS and bytes/sec
Time-sharing applications: user capacity per dollar

The table below shows throughput per dollar relative to RAID 0, assuming G drives per error-correcting group:

RAID Level   Small Reads   Small Writes    Large Reads   Large Writes   Storage Efficiency
RAID 0       1             1               1             1              1
RAID 1       1             1/2             1             1/2            1/2
RAID 3       1/G           1/G             (G-1)/G       (G-1)/G        (G-1)/G
RAID 5       1             max(1/G, 1/4)   1             (G-1)/G        (G-1)/G
RAID 6       1             max(1/G, 1/6)   1             (G-2)/G        (G-2)/G

* RAID 3 performance/cost is always <= RAID 5 performance/cost

Performance issues - improving small-write performance for RAID 5:
–Small writes need 4 I/O accesses, so the overhead is emphasized: response time increases by a factor of 2 and throughput decreases by a factor of 4. In contrast, RAID 1 writes require two concurrent writes - latency may increase, but throughput decreases only by a factor of 2.
Three techniques improve RAID 5 small-write performance. The first is buffering and caching:
–A disk cache (write buffering) acknowledges the host before the data is written to disk
–Under high load, write-backs increase and response time goes back to 4 times that of RAID 0
–During write-back, group sequential writes together
–Keeping a copy of the old data before writing reduces the cost to 3 I/O accesses
–Keeping the new parity and new data in the cache means any later update requires only 2 I/O accesses

The second technique is floating parity:
–Shortens the read-modify-write of small writes to an average of about 1 I/O access
–Clusters parity into cylinders, each containing a track of free blocks
–When a parity block needs updating, the new parity is written to the closest unallocated block following the old parity, so a parity update costs approximately one read plus 1 ms
–Overhead: directories for unallocated blocks and parity blocks, kept in a cache in the RAID adapter - megabytes of memory
–Floating data? It would require larger directories, and sequential data may become discontiguous on disk

The third technique is parity logging:
–Delay writing the new parity
–Create an "update image" - the difference between the old and new parity - and store it in a log file in the RAID adapter
–Hopefully, several parity blocks can be grouped together when writing back
–The log file is stored in NVRAM, which can be extended to disk space
–Although this may mean more I/Os, it is efficient since large chunks of data are processed at once
–Logging reduces the I/O accesses for small writes from 4 to possibly 2+
–Overhead: NVRAM, extra disk space, and memory when applying the parity update image to the old parity
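The idea above can be sketched in a few lines (an illustrative sketch, not the paper's implementation: the "update image" for a write is old_data xor new_data, and a batch of logged images is applied to the old parity in one pass):

```python
# Parity logging sketch: log parity deltas instead of writing parity now.
def update_image(old_data, new_data):
    """The parity delta caused by overwriting old_data with new_data."""
    return bytes(o ^ n for o, n in zip(old_data, new_data))

def apply_log(old_parity, log):
    """Fold a batch of logged deltas into the parity in one pass."""
    parity = bytearray(old_parity)
    for image in log:
        for i, byte in enumerate(image):
            parity[i] ^= byte
    return bytes(parity)

# Two small writes touching the same parity block, applied together later:
old_parity = b"\x07"
log = [update_image(b"\x01", b"\x02"), update_image(b"\x04", b"\x00")]
assert apply_log(old_parity, log) == b"\x00"
```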

Hardware vs. software RAID - RAID can be implemented in the OS:
–In RAID 1, hardware RAID allows 100% mirroring. OS-implemented mirroring must distinguish between master and slave drives: only the master drive has the boot code, so if it fails you can continue working, but booting is no longer possible. Hardware mirroring does not have this drawback.
–Since software RAIDs implement standard SCSI, repair functions such as support for spare drives and hot plugging have not been implemented; in contrast, hardware RAID implements various repair functions.
–Hardware RAID improves system performance with its caching system, especially under high load, and with synchronization.
–Microsoft Windows NT implements RAID 0 and RAID 1.

Which RAID for which application?
–Fast workstation: caching is important to improve the I/O rate. If large files are installed, then RAID 0 may be necessary. It is preferable to put the OS and swap files on drives separate from the user drives, to minimize head movement between the swap-file area and the user area.
–Small server: RAID 1 is preferred.
–Mid-size server: if more capacity is needed, then RAID 5 is recommended.
–Large server (e.g., database servers): RAID 5 is preferred. Separate different I/O streams onto mechanically independent arrays; place database index and data files on different arrays.