© 2006 Hitachi Data Systems. RAID Concepts. A. Ian Vogelesang, Tools Competency Center (TCC), Hitachi Data Systems. Hitachi Data Systems WebTech Series.
2 RAID Concepts. Who should attend: – Systems and Storage Administrators – Storage Specialists & Consultants – IT Team Leads – System and Network Architects – IT Staff – Operations and IT Managers – Others who are looking for storage management techniques. Hitachi Data Systems WebTech Educational Seminar Series.
3 How RAID type impacts cost. The factors we will examine: – Disk drive capacity vs. disk drive IOPS capability – The impact of RAID level on disk drive activity. Topics to cover along the way: – RAID concepts (RAID-1 vs. RAID-5 vs. RAID-6) – The 30-second elevator pitch on data flow through the subsystem. The conclusion will be that the I/O access pattern, rather than storage capacity in GB, is very often the determining factor.
4 Growth in recording density drives $/GB. [Chart: areal density (megabits/in²) by production year, from the IBM RAMAC (the first hard disk drive) through the first MR head, the first GMR head, and perpendicular recording, with compound annual growth rates ranging from roughly 40% to 100% per year.]
5 Areal density growth will continue. [Chart: areal density (Gb/in²) over time for longitudinal recording, perpendicular recording, bit-patterned media, and thermally-assisted writing, projected to reach 10,000 Gb/in² = 10 Tb/in² – enough for a 50 TB 3.5-inch drive, a 12 TB 2.5-inch drive, or a 1 TB 1-inch drive; more than a 50-million-fold increase in areal density over 50 years.]
6 Here's the problem. Drive capacities keep doubling every 1.5 years or so. If you take the data that used to be on two disk drives and put it onto one drive that's twice as big, you will also be combining the I/O activity that was on the original two drives onto the one double-size drive. The problem is that as drive capacity keeps increasing, the number of I/Os per second (IOPS) that a drive can handle has not been increasing. – An I/O operation consists of a seek, ½ turn of latency, and a data transfer. – Data transfer for a 4K block is now down to around 1% of a rotation. – Positioning the head takes over 1 rotation (seek + ½ turn latency). – IOPS capability is ALL about mechanical positioning.
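The arithmetic behind this slide can be sketched in Python. This is not from the original deck; the seek time and transfer rate below are illustrative assumptions for a 15K RPM drive, chosen only to show why mechanical positioning dominates:

```python
# Back-of-the-envelope IOPS estimate for a random 4K I/O.
# Service time = average seek + half a rotation of latency + data transfer.

def random_iops(rpm, avg_seek_ms, transfer_mb_s, block_kb=4):
    half_rotation_ms = 0.5 * 60_000 / rpm              # ms per half revolution
    transfer_ms = block_kb / 1024 / transfer_mb_s * 1000
    service_ms = avg_seek_ms + half_rotation_ms + transfer_ms
    return 1000 / service_ms                           # I/Os per second at 100% busy

# Illustrative 15K RPM drive: ~3.5 ms average seek, ~80 MB/s media rate
iops_100 = random_iops(rpm=15_000, avg_seek_ms=3.5, transfer_mb_s=80)
print(f"~{iops_100:.0f} IOPS at 100% busy, ~{iops_100 / 2:.0f} at 50% busy")
```

With these numbers the 4K transfer takes about 0.05 ms against a 4 ms rotation, so positioning (seek plus latency) accounts for nearly all of the service time, and IOPS capability barely changes as capacity grows.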
7 4K random IOPS capability at 50% busy by drive type. [Chart: read and write IOPS* for 10K73, 10K146, 10K300, 15K73, 15K146, and SATA 7K400 drives. *Write figures include a read-verify-after-write.] Note that IOPS capability is the same for different drive capacities with the same RPM. These are "green zone" upper limits per drive for back-end I/O, including RAID-penalty I/Os.
8 Access density capability. Combining the data that used to be on two drives onto one double-size drive also combines (doubles) the I/O activity on the bigger drive; this illustrates that for a given workload there is a certain amount of I/O activity per GB of data. This activity per GB is called the access density of the workload, and is measured in IOPS per GB. Over the last few decades, as disk drive storage capacity has become much cheaper, it became economic to store first graphics, then audio, and now video. – The introduction of these new data types has reduced typical access densities by about a factor of 10 over the last 20 years. However, access density is going down more slowly than disk drive capacity is going up. – Typical access densities are reported in the 0.6 to 1.0 IOPS per GB range.
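Access density directly caps how much of a drive you can actually use. A minimal sketch (not from the deck; the 90-IOPS green-zone limit and 146 GB capacity are illustrative assumptions):

```python
# A drive runs out of IOPS headroom before it runs out of GB when the
# workload's access density (IOPS per GB) exceeds the drive's own ratio.

def max_usable_gb(drive_iops_limit, access_density_iops_per_gb):
    """Capacity you can fill before the drive becomes IOPS-bound."""
    return drive_iops_limit / access_density_iops_per_gb

# Hypothetical 146 GB drive good for ~90 back-end IOPS at 50% busy,
# serving a workload at 0.8 IOPS/GB (within the 0.6-1.0 typical range):
usable = max_usable_gb(drive_iops_limit=90, access_density_iops_per_gb=0.8)
print(f"Usable: {usable:.1f} GB of the 146 GB drive")  # IOPS-bound before full
```

At 0.8 IOPS/GB this hypothetical drive hits its IOPS ceiling at about 112 GB, well short of its 146 GB capacity, which is the effect the following slides quantify per drive type.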
9 Random read IOPS capability by drive type. This chart shows what access density each drive type can handle if you fill it up with data. [Chart: each drive type's green-zone upper limit at 50% busy; position from left to right shows the maximum access density the drive can comfortably handle – SATA 7K400, 10K300, 15K300, 10K146, 15K146, 10K73, 15K73.]
10 RAID makes the access density problem worse. The basic idea behind RAID is to make sure that you don't lose any data when a single drive fails. What this means is that whenever a host writes data to the subsystem, at least two disks need to be updated. The amount of extra disk drive I/O activity needed to handle write activity is the key factor in determining the lowest-cost solution as a combination of disk drive RPM, disk drive capacity, and RAID type. – That's why we will look at how the different RAID levels work. It is very rare that the access density is so low that you can completely fill up the cheapest drive. – Only for things like a home PVR will a 750 GB SATA drive make the smallest dent in your wallet while getting the job done.
11 30-second elevator pitch on subsystem data flow. Random read hits are stripped off by cache and do not reach the back end. Random read misses go through cache unaltered and go straight to the appropriate back-end disk drive. – This is the only type of I/O operation where the host always sees the performance of the back-end disk drive. Random writes: – The host sees random writes complete at electronic speed; the host only sees a delay if too many pending writes build up. – Each host random write is transformed going through cache into a multiple-I/O pattern that depends on RAID type. Sequential I/O: – Host sequential I/O runs at electronic speed. – Cache acts like a holding tank. – The back end puts [removes] back-end buckets of data into [out of] the tank to keep the tank at an appropriate level.
12 What is RAID? A 1993 paper by a group of researchers at UC Berkeley – http://www.eecs.berkeley.edu/Pubs/TechRpts/1993/CSD .pdf. Redundant Array of Inexpensive Disks. – The original idea was to use cheap (i.e., PC) disk drives arranged in a RAID to give you mainframe reliability. – Now most call it Redundant Array of Independent Disks. A RAID is an arrangement of data on disk drives such that if a disk drive fails, you can still get the data back somehow from the remaining disks. – RAID-1 is mirroring – just keep two copies. – RAID-5 uses parity – recovers from single drive failures. – RAID-6 uses dual parity – recovers from double drive failures.
13 RAID-1 random reads / writes. Also called mirroring: two copies of the data, requiring 2x the number of disk drives. [Diagram: data blocks mirrored across Copy #1 and Copy #2.] For reads, the data can be read from either disk drive; read activity distributed over both copies reduces disk drive busy (due to reads) to ½ of what it would be reading from a single (non-RAID) disk drive. For writes, a copy must be written to both disk drives: two parity-group disk drive writes for every host write. We don't care what the previous data was – just overwrite it with the new data.
14 RAID-1 sequential read. [Diagram: 2 sets of parallel I/O operations, each set reading 4 data chunks (2 MB) across the parity group; chunks 1–8 shown.] Parity group data MB/s = 4 x drive MB/s.
15 RAID-1 sequential write. [Diagram: 4 sets of parallel I/O operations, each writing 2 data chunks (1 MB) plus the 2 mirror (parity) chunks; chunks 1–8 shown.] Parity group data MB/s = 2 x drive MB/s.
16 RAID-1 comments. Since RAID-1 requires doubling the number of disk drives to store the data, people tend to think of RAID-1 as the most expensive type of RAID. However, due to the intensity of host access, in RAID subsystems one often cannot completely fill up a disk drive with data, because the disk drive would become too busy. RAID-1 offers the lowest RAID penalty – only two disk drive I/Os per random write, compared to four for RAID-5 and six for RAID-6. For this reason, when the workload is sufficiently active and has a lot of random writes, RAID-1 will be the cheapest RAID type, because it has the fewest disk drive I/O operations per random write.
17 RAID-1's RAID penalty. Penalty in space: – Double the number of disk drives required. Penalty in disk drive utilization (disk drive % busy): – Twice the number of I/O operations required for all writes. – No penalty for read operations; reads are distributed over twice the number of drives.
18 RAID-5 parity concept. Each parity bit indicates whether or not there is an odd number of 1 bits in that bit position across the whole parity group (odd parity). If you add more data drives, you don't add any more parity. [Diagram: data bits and the (odd) parity bit per position.] 0 XOR 1 XOR 0 = 1: there is an odd number of 1s in this bit position, so the parity bit is 1. 1 XOR 1 XOR 0 = 0: with an even number of 1s in this bit position, the parity bit is set to 0.
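The parity concept on this slide is just a bitwise XOR across the data chunks in a stripe, which a few lines of Python (not part of the original deck) can demonstrate:

```python
# RAID-5 parity: XOR every data chunk together. A parity bit is 1 exactly
# when an odd number of data bits in that position are 1.

def parity(chunks):
    """XOR a list of equal-length byte strings into one parity chunk."""
    p = bytes(len(chunks[0]))
    for c in chunks:
        p = bytes(a ^ b for a, b in zip(p, c))
    return p

d1, d2, d3 = b"\x00", b"\x01", b"\x00"    # low bits 0, 1, 0 -> odd count of 1s
print(parity([d1, d2, d3]))                # parity bit is 1
# Adding a fourth data drive just XORs one more chunk into the SAME parity
# chunk -- more data drives never require more parity.
```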
19 RAID-5 – if the drive containing parity fails. You still have the data. Better to reconstruct the parity on a spare disk drive right away, just in case a second drive fails. [Diagram: data drives intact, parity drive failed.]
20 RAID-5 – if a drive containing data fails. If a drive that had data on it fails, you can reconstruct the missing data. Read the corresponding chunk from all the remaining data drives, and see how many 1 bits there are in each position. By comparing the number of 1 bits in each bit position on the remaining disk drives with what the parity says there originally was, you can reconstruct the data. Better to reconstruct the missing data on a spare disk drive right away, just in case a second drive fails. [Diagram: a parity bit of 1 says there originally was an odd number of 1 data bits in this position across the data drives; since the remaining data disks now hold an even number of 1 bits, the missing data bit must be a 1.]
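The reconstruction the slide describes falls out of the XOR property: XOR-ing the surviving chunks with the parity chunk yields exactly the lost chunk. A minimal sketch with made-up byte values:

```python
# Rebuild a failed drive's chunk from the survivors plus parity:
# missing = (XOR of surviving data chunks) XOR parity.

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

d1, d2, d3 = b"\x0a", b"\x0b", b"\x0c"     # three data chunks in one stripe
p = xor(xor(d1, d2), d3)                   # parity written at stripe time

# The drive holding d2 fails; rebuild its chunk from d1, d3, and parity:
rebuilt = xor(xor(d1, d3), p)
assert rebuilt == d2                       # lost data recovered exactly
print("rebuilt d2:", rebuilt)
```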
21 RAID-5 random read hit. Read hits operate at electronic speed – just transfer the data from cache. [Diagram: host reads data #3; a copy of data #3 is already in cache.]
22 RAID-5 random read miss. Read misses are the ONLY operation that sees the speed of the disk drive during normal (not overloaded) operation – i.e., read misses are the only type of host I/O operation that does not complete at electronic speed with just an access to cache. [Diagram: host reads data #1; cache misses and fetches a copy of data #1 from the back-end drive.]
23 RAID-5 random write. 1) Read old data, read old parity. 2) Remove the old data from the old parity, giving partial parity (parity for the rest of the row). 3) Add the new data into the partial parity to generate the new parity. 4) Write the new data and new parity to disk. [Diagram: new data #2 arrives from the host; old data and old parity are read into cache, the partial parity is formed, then the new data and new parity are written back.]
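Steps 1–4 above are the classic read-modify-write parity update, and because XOR is its own inverse they reduce to new_parity = old_parity XOR old_data XOR new_data. A sketch with illustrative byte values (not from the deck):

```python
# RAID-5 small-write parity update: the other data chunks in the row are
# never re-read -- XOR-ing the old data out of the old parity leaves the
# "partial parity" covering the rest of the row.

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

row = [b"\x11", b"\x22", b"\x44"]          # data chunks; drive 2 will be rewritten
old_parity = xor(xor(row[0], row[1]), row[2])

old_data, new_data = b"\x22", b"\x5a"
partial = xor(old_parity, old_data)        # steps 1-2: remove old data
new_parity = xor(partial, new_data)        # step 3: add new data

# Sanity check: same answer as recomputing parity over the updated row
assert new_parity == xor(xor(b"\x11", b"\x5a"), b"\x44")
print("4 drive I/Os: read old data + old parity, write new data + new parity")
```

This is why a single host random write costs four back-end drive I/Os on RAID-5: two reads and two writes, regardless of how wide the parity group is.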
24 RAID-5 sequential read. The subsystem detects that the host is reading sequentially after a few sequential I/Os – (the first few are treated as random reads). The subsystem performs sequential pre-fetch to load stripes of data from the parity group into cache in advance of when the host will request the data. The subsystem can usually easily keep up with the host, as transfers from the parity group are performed in parallel.
25 RAID-5 sequential read example. In parallel, read a chunk from each drive in the parity group. [Diagram: a 3+1 parity group; 3 sets of parallel I/O operations read 12 data chunks (6 MB), with the parity chunks for rows 1–3, 4–6, 7–9, and 10–12 rotated across the drives.] Parity group MB/s = 4 x drive MB/s.
26 RAID-5 sequential write. First compute the parity chunk for a row, then write the row to disk. [Diagram: a 3+1 parity group; 4 sets of parallel I/O operations write 12 data chunks (6 MB) plus the 4 parity chunks for rows 1–3, 4–6, 7–9, and 10–12.] Parity group data MB/s = 3 x drive MB/s.
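The full-stripe write shown here is why sequential RAID-5 writes escape the 4x small-write penalty: with every data chunk of the row in cache, parity is computed directly and nothing needs to be read first. A sketch with illustrative chunk values:

```python
# Full-stripe RAID-5 write on a 3+1 group: compute the row's parity from
# the data chunks already in cache, then write all four chunks in parallel.

def xor_all(chunks):
    out = bytes(len(chunks[0]))
    for c in chunks:
        out = bytes(a ^ b for a, b in zip(out, c))
    return out

row = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]   # 3 data chunks for one row
stripe_writes = row + [xor_all(row)]             # 3 data + 1 parity chunk
print("drive I/Os per row:", len(stripe_writes), "writes, 0 reads")
```

Four writes for three chunks of host data is the 33% overhead quoted on the next slide for 3+1 sequential writes, versus 300% extra I/Os for the random-write read-modify-write path.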
27 RAID-5 comments. For sequential reads and writes, RAID-5 is very good. – It's very space efficient (smallest space for parity), and sequential reads and writes are efficient, since they operate on whole stripes. For low access density (light activity), RAID-5 is very good. – The 4x RAID-5 write penalty is (nearly) invisible to the host, because it's non-synchronous. For workloads with higher access density and more random writes, RAID-5 can be throughput-limited due to all the extra parity group I/O operations needed to handle the RAID-5 write penalty.
28 RAID-5 RAID penalty. Penalty in space: – For 3+1, 33% extra space for parity. – For 7+1, 14% extra space for parity. Penalty in disk drive utilization (disk drive % busy): – Random writes: four times the number of I/O operations (300% extra I/Os). – Sequential writes: for 3+1, 33% extra I/Os; for 7+1, 14% extra I/Os.
29 RAID-6. RAID-6 is an extension of the RAID-5 concept which uses two separate parity-type fields, usually called P and Q. The mathematics are beyond a basic course*, but RAID-6 allows data to be reconstructed from the remaining drives in a parity group when any one or two drives have failed. *The math is the same as for the ECC used to correct errors in DRAM memory or on the surface of disk drives. Each RAID-6 host random write turns into 6 parity group I/O operations: – Read old data, read old P, read old Q. – (Compute new P, Q.) – Write new data, write new P, write new Q. RAID-6 parity group sizes usually start at 6+2. – This has the same space efficiency as RAID-5 3+1. [Diagram: a 6D + 2P parity group – D1 D2 D3 D4 D5 D6 P Q.]
30 RAID-6 RAID penalty. 6+2 penalty in space: – 33% extra space for parity. 6+2 penalty in disk drive utilization (disk drive % busy): – Random writes: six times the number of I/O operations (500% extra I/Os). – Sequential writes: 33% extra I/Os.
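The per-write penalties from these slides (2 for RAID-1, 4 for RAID-5, 6 for RAID-6, with reads costing 1 back-end I/O each) can be rolled into a small back-end-load calculator. The 1,000-IOPS / 30%-write workload below is an illustrative assumption, not a figure from the deck:

```python
# Back-end disk I/Os generated per host I/O mix, using the random-write
# penalties stated on the RAID-1/5/6 penalty slides.

WRITE_PENALTY = {"RAID-1": 2, "RAID-5": 4, "RAID-6": 6}

def backend_iops(host_iops, write_fraction, raid):
    reads = host_iops * (1 - write_fraction)       # 1 back-end I/O per read miss
    writes = host_iops * write_fraction            # penalty I/Os per host write
    return reads + writes * WRITE_PENALTY[raid]

# Hypothetical workload: 1000 host IOPS, 30% random writes
for raid in WRITE_PENALTY:
    print(raid, backend_iops(1000, 0.30, raid))
# RAID-1 1300.0, RAID-5 1900.0, RAID-6 2500.0
```

Dividing each figure by a drive's green-zone IOPS limit gives the minimum drive count per RAID type, which is how the write mix, rather than the GB stored, ends up deciding the cheapest configuration.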
31 RAID-1 vs RAID-5 vs RAID-6 summary. The concept of RAID with parity groups permits data to be recovered even upon a single drive failure for RAID-1 and RAID-5, or a double drive failure for RAID-6. RAID-1 trades off greater space usage for a lower RAID penalty on writes and lower degradation after a drive failure. – RAID-1 can be cheaper (require fewer disk drives) than RAID-5 where there is concentrated random write activity. RAID-5 achieves redundancy with less parity space overhead, but at the expense of a higher RAID penalty for random writes and a larger performance degradation upon a drive failure.
32 30-second elevator pitch on subsystem data flow. Random read hits are stripped off by cache and do not reach the back end. Random read misses go through cache unaltered and go straight to the appropriate back-end disk drive. – This is the only type of I/O operation where the host always sees the performance of the back-end disk drive. Random writes: – The host sees random writes complete at electronic speed; the host only sees a delay if too many pending writes build up. – Each host random write is transformed going through cache into a multiple-I/O pattern that depends on RAID type. Sequential I/O: – Host sequential I/O runs at electronic speed. – Cache acts like a holding tank. – The back end puts [removes] back-end buckets of data into [out of] the tank to keep the tank at an appropriate level.
33 RAID-5 can often be more expensive. [Chart: back-end disk drive % busy for comparable RAID-1 and RAID-5 configurations; the RAID-5 drives are much busier, all due to random writes (solid blue).] In this case, the RAID-1 configuration was cheaper, because fewer disk drives were needed to handle the back-end I/O activity: the RAID-1 drives could be completely filled, whereas the RAID-5 drives could only be filled to 55% of their capacity.
34 Conclusions – factors driving lowest cost. The lowest-cost configuration in terms of disk drive RPM, disk drive capacity, and RAID type depends strongly on the access density and the read:write ratio. If there is even moderate access density with significant random write activity, RAID-1 will often turn out to be the lowest-cost total solution, due to being able to fill up more of each drive's capacity with data. Where access densities are higher, 15K RPM drives will often turn out to offer the lowest-cost overall solution. SATA drives, due to their low IOPS capability, can only be filled if the data has very low access density, and are therefore rarely the cheapest.
35 Upcoming WebTech Sessions: 19 September – Enterprise Data Replication Architectures that Work: Overview and Perspectives. 17 October – 10 Steps To Determine if SANs Are Right For You.
Questions/Discussion