Reliability of Disk Systems. Reliability So far, we looked at ways to improve the performance of disk systems. Next, we will look at ways to improve the.


Reliability of Disk Systems

Reliability
So far, we looked at ways to improve the performance of disk systems. Next, we will look at ways to improve the reliability of disk systems.
What is reliability?
– Essentially, it is the availability of data when there is a disk failure of some sort.
– This is achieved at the cost of some redundancy, in data and/or disks.

Intermittent Failures
In an intermittent failure, we may get several bad reads, for example, but with repeated attempts we may eventually get a good one.
Disk sectors are stored with some redundant bits that tell us whether an I/O operation was successful.
For writes, we may also want to check the status afterwards:
– We can, of course, re-read the sector and compare it to the original.
– But this is expensive.
– Instead, we simply re-read the sector and check the status bits.

Checksums for failure detection
A useful tool for status validation is the checksum:
– One or more bits that, with high probability, verify the correctness of the operation.
– The checksum is written by the disk controller.
A simple form of checksum is the parity bit:
– A bit is added to the data so that the number of 1s among the data bits plus the parity bit is always even.
– A disk read (per sector) returns status = good if the bit string has an even number of 1s; otherwise, status = bad.
[Figure: one example bit string labelled Good and one labelled Bad]
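The parity check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the 7-bit data string are assumptions, not something taken from the slides or from a real disk controller.

```python
def parity_bit(data_bits: str) -> str:
    """Return the bit that makes the total number of 1s even."""
    return "1" if data_bits.count("1") % 2 else "0"


def read_status(stored_bits: str) -> str:
    """Classify a stored string (data bits + parity bit) as good or bad."""
    return "good" if stored_bits.count("1") % 2 == 0 else "bad"


data = "0110100"                  # hypothetical data bits (three 1s)
stored = data + parity_bit(data)  # "01101001", four 1s
print(read_status(stored))        # good

flipped = "1" + stored[1:]        # simulate a single corrupted bit
print(read_status(flipped))       # bad
```

Any odd number of flipped bits changes the status to bad; an even number of flips goes unnoticed.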

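(Interleaved) Parity bits
It is possible that more than one bit in a sector is corrupted, in which case the error(s) may not be detected.
Suppose bits are corrupted at random: the probability of an undetected error (i.e. an even number of 1s) is then 50%. (Why?)
Let's have 8 parity bits, one per bit position: parity bit i covers bit i of every byte in the sector, so the parity bits form one extra byte.
[Figure: the bytes of the sector stacked above a byte of parity bits]
Probability of an undetected error is 1/2^8 = 1/256.
With n parity bits, the probability of undetected error = 1/2^n.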
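A minimal sketch of interleaved parity, assuming a toy 3-byte "sector" with made-up contents. XOR-ing all the bytes together yields exactly the byte of parity bits, because bit i of the result is the even-parity bit for bit i of every byte. The function names are illustrative.

```python
from functools import reduce


def parity_byte(sector: bytes) -> int:
    """XOR all bytes: bit i of the result is the parity of bit i of every byte."""
    return reduce(lambda acc, b: acc ^ b, sector, 0)


def undetected_error_probability(n_parity_bits: int) -> float:
    """Chance that random corruption still passes all n parity checks."""
    return 1 / 2 ** n_parity_bits


sector = bytes([0b11110000, 0b10101010, 0b00111000])  # hypothetical contents
check = parity_byte(sector)
print(f"{check:08b}")                   # 01100010
print(undetected_error_probability(8))  # 0.00390625, i.e. 1/256

# A sector stored together with its parity byte XORs to zero when intact.
print(parity_byte(sector + bytes([check])) == 0)  # True
```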

Recovery from disk crashes
Mean time to failure (MTTF) = the time by which 50% of the disks have crashed, typically 10 years.
Simplified (assuming this happens linearly):
– In the 1st year = 5%,
– In the 2nd year = 5%,
– …
– In the 20th year = 5%
However, the mean time to a disk crash doesn't have to be the same as the mean time to data loss; there are solutions.

Redundant Array of Independent Disks, RAID
RAID 1: Mirror each disk (data/redundant disks). If a disk fails, restore it using the mirror.
Assume:
– 5% failure per year; MTTF = 10 years (for disks).
– 3 hours to replace and restore a failed disk.
If one disk fails, then the other had better not fail in the next three hours.
Probability of that second failure = 5% × 3/(24 × 365) = 1/58,400.
If one disk fails every 10 years (10 × 5% = 50%), then one of two will fail every 5 years (5 × (5% + 5%) = 50%).
One in 58,400 of those failures results in data loss; MTTF = 292,000 years (5 × 58,400 = 292,000). This is the mean time to failure for data.
Drawback: We need one redundant disk for each data disk.
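The arithmetic on this slide can be checked with exact fractions. The variable names below are illustrative, and the model is the slide's own simplification (5% of disks fail per year, a 3-hour repair window).

```python
from fractions import Fraction

annual_failure_rate = Fraction(5, 100)  # 5% of disks fail per year
repair_hours = 3                        # time to replace and restore a disk
hours_per_year = 24 * 365

# Chance that the mirror also fails during the 3-hour repair window.
p_second_failure = annual_failure_rate * Fraction(repair_hours, hours_per_year)
print(p_second_failure)                 # 1/58400

# One of the two disks in a pair fails every 5 years on average.
years_between_pair_failures = 5
mttf_data_years = years_between_pair_failures / p_second_failure
print(mttf_data_years)                  # 292000
```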

RAID 4
RAID 4: One redundant disk only.
n data disks & 1 redundant disk (for any n).
We'll refer to the expression x ⊕ y as the modulo-2 sum of x and y (XOR).
– E.g. [operands not preserved in this transcript]
Now, each block in the redundant disk holds the modulo-2 sum of the corresponding blocks in the other disks.
[Example: the i-th blocks of Disk 1, Disk 2, Disk 3 and the redundant disk; bit values not preserved in this transcript]
In effect this is just a distributed form of the block-interleaved parity discussed earlier.

Properties of XOR (⊕):
– Commutativity: x ⊕ y = y ⊕ x
– Associativity: x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
– Identity: x ⊕ 0 = 0 ⊕ x = x (0 is the vector 00…0)
– Self-inverse: x ⊕ x = 0
As a useful consequence, if x ⊕ y = z, then we can add x to both sides and get y = x ⊕ z.
More generally, if 0 = x_1 ⊕ … ⊕ x_(n+1), then adding x_i to both sides gives:
x_i = x_1 ⊕ … ⊕ x_(i-1) ⊕ x_(i+1) ⊕ … ⊕ x_(n+1)
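These properties are easy to confirm by brute force. The sketch below checks them for every 4-bit vector, using Python's `^` operator on integers as the modulo-2 sum; the function name and the choice of 4 bits are assumptions made for illustration.

```python
from itertools import product


def xor_properties_hold(bits: int = 4) -> bool:
    """Exhaustively check the XOR laws on all vectors of the given width."""
    values = range(2 ** bits)
    for x, y, z in product(values, repeat=3):
        if x ^ y != y ^ x:              # commutativity
            return False
        if x ^ (y ^ z) != (x ^ y) ^ z:  # associativity
            return False
        if x ^ 0 != x or x ^ x != 0:    # identity and self-inverse
            return False
        if x ^ y == z and y != x ^ z:   # consequence: add x to both sides
            return False
    return True


print(xor_properties_hold())  # True
```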

Failure recovery in RAID 4
We must be able to restore whatever disk crashes.
Just compute the modulo-2 sum of the corresponding blocks of the other disks, using the equation x_i = x_1 ⊕ … ⊕ x_(i-1) ⊕ x_(i+1) ⊕ … ⊕ x_(n+1).
Example: [i-th blocks of Disk 1, Disk 2, Disk 3 and the redundant disk; bit values not preserved in this transcript]
Disk 2 crashes. Compute it by taking the modulo-2 sum of the rest.
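A sketch of this recovery, assuming three data disks whose i-th blocks are one byte each. The block values are made up for illustration, since the slide's own values are not available, and `xor_blocks` is a hypothetical helper name.

```python
def xor_blocks(blocks):
    """Bitwise modulo-2 sum of equal-length byte blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Hypothetical i-th blocks of data disks 1, 2 and 3.
disk1 = bytes([0b11110000])
disk2 = bytes([0b10101010])
disk3 = bytes([0b00111000])
redundant = xor_blocks([disk1, disk2, disk3])

# Disk 2 crashes: rebuild it from the surviving disks and the redundant disk.
recovered = xor_blocks([disk1, disk3, redundant])
print(recovered == disk2)  # True
```

The same call rebuilds the redundant disk itself if that is the one that crashes.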

RAID 4 (Cont'd)
Maintaining RAID 4 is relatively easy:
Reading: as usual.
– Interesting possibility: if we want to read from disk i, but it is busy and all other disks are free, then we can instead read the corresponding blocks from all the other disks and modulo-2 sum them.
Writing:
– Write the block.
– Update the redundant block.

How do we get the value for the redundant block?
Naively: read the n-1 corresponding blocks of the other data disks and recompute the sum. That costs n+1 disk I/Os: n-1 block reads, 1 data block write, 1 redundant block write.
Better: How?

How do we get the value for the redundant block?
Better writing: to write block j of data disk i (new value = v):
– Read the old value of that block, say o.
– Read the j-th block of the redundant disk, value = r.
– Compute w = v ⊕ o ⊕ r.
– Write v in block j of disk i.
– Write w in block j of the redundant disk.
Total: 4 disk I/Os (true for any number of data disks).
Problem: why does this work?
– Intuition: v ⊕ o is the change to the parity.
– The redundant disk must change to compensate.
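The four-I/O write can be sketched as follows and compared against the naive recomputation. Blocks are modelled as small integers, the values are made up, and `small_write` and `naive_parity` are hypothetical names.

```python
from functools import reduce
from operator import xor


def naive_parity(data_blocks):
    """Recompute the redundant block from all data blocks."""
    return reduce(xor, data_blocks, 0)


def small_write(data_blocks, redundant, i, v):
    """Write v to data disk i using two reads and two writes."""
    o = data_blocks[i]          # I/O 1: read the old data block
    r = redundant               # I/O 2: read the old redundant block
    w = v ^ o ^ r               # new redundant value
    new_data = list(data_blocks)
    new_data[i] = v             # I/O 3: write the new data block
    return new_data, w          # I/O 4: write the new redundant block


data = [0b11110000, 0b10101010, 0b00111000]  # hypothetical blocks
r = naive_parity(data)
new_data, w = small_write(data, r, 1, 0b11001100)
print(f"{w:08b}")                 # 00000100
print(w == naive_parity(new_data))  # True
```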

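Example
i-th block of Disk 1: x1
i-th block of Disk 2: x2 = o
i-th block of Disk 3: x3
i-th block of redundant disk: r
Suppose we change x2 from o into v. Then w = v ⊕ o ⊕ r is the new r.
If done the naïve way, we get the same value: since r = x1 ⊕ o ⊕ x3, we have x1 ⊕ v ⊕ x3 = v ⊕ o ⊕ r = w.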

RAID 5
RAID 4 problem: the redundant disk is involved in every write. Bottleneck!
Solution is RAID 5: vary the redundant disk for different blocks.
– Example: n+1 disks, numbered 0 through n.
– Cylinder j is redundant on disk i if i = j mod (n+1).
Example: n = 3, so there are 4 disks.
– The first disk, numbered 0, is the redundant disk for cylinders numbered 0, 4, 8, 12, etc. (because they leave remainder 0 when divided by 4).
– The disk numbered 1 is the redundant disk for cylinders numbered 1, 5, 9, 13, and so on.
[Figure: Disks 0 to 3, each drawn with its cylinders]
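The rotation rule i = j mod (n+1) can be sketched directly. The function names and the choice of 16 cylinders are assumptions for illustration.

```python
def redundant_disk(cylinder: int, n: int) -> int:
    """Disk (0..n) that holds the redundant blocks of the given cylinder."""
    return cylinder % (n + 1)


def redundant_cylinders(disk: int, n: int, cylinder_count: int):
    """Cylinders for which the given disk plays the redundant role."""
    return [j for j in range(cylinder_count) if redundant_disk(j, n) == disk]


n = 3  # three data disks' worth of capacity, four disks in total
for disk in range(n + 1):
    print(disk, redundant_cylinders(disk, n, 16))
# 0 [0, 4, 8, 12]
# 1 [1, 5, 9, 13]
# 2 [2, 6, 10, 14]
# 3 [3, 7, 11, 15]
```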

RAID 5 (Cont'd)
The reading/writing load for each disk is the same.
In one block write, what's the probability that a given disk is involved?
– Each disk has probability 1/(n+1) of holding the block being written.
– If not, i.e. with probability n/(n+1), it has a 1/n chance of holding the redundant block for that block number.
– So each disk is involved in 1/(n+1) × 1 + (n/(n+1)) × (1/n) × 1 = 2/(n+1) of the writes; with four disks (n = 3), that is half of them.
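The 2/(n+1) figure can be cross-checked by enumeration. The sketch below assumes every (cylinder, data disk) write within one full rotation of n+1 cylinders is equally likely; that uniformity is an assumption of this illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def involvement_by_formula(n: int) -> Fraction:
    """Probability from the slide: data role plus redundant role."""
    return Fraction(1, n + 1) + Fraction(n, n + 1) * Fraction(1, n)


def involvement_by_enumeration(n: int, disk: int = 0) -> Fraction:
    """Fraction of single-block writes that touch the given disk."""
    writes = [
        (j, d)
        for j in range(n + 1)      # cylinders in one rotation
        for d in range(n + 1)      # disk receiving the data block
        if d != j % (n + 1)        # the redundant disk holds no data here
    ]
    hits = sum(1 for j, d in writes if disk in (d, j % (n + 1)))
    return Fraction(hits, len(writes))


print(involvement_by_formula(3))      # 1/2
print(involvement_by_enumeration(3))  # 1/2
```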

RAID 6 - for multiple disk crashes
Let's focus on recovering from two disk crashes.
Setup: 7 disks, numbered 1 through 7. The first 4 are data disks, and disks 5 through 7 are redundant.
The relationship between data and redundant disks is summarized by a 3 x 7 matrix of 0's and 1's:

         Data disks    Redundant disks
Disk:    1  2  3  4    5  6  7
         1  1  1  0    1  0  0
         1  1  0  1    0  1  0
         1  0  1  1    0  0  1

– The columns for the redundant disks have a single 1.
– All columns are different. There is no all-0s column.
The disks with 1 in a given row of the matrix are treated as if they were the entire set of disks in a RAID level 4 scheme.

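RAID 6 - example
[Contents of disks 1) through 7), with data disks 1 to 4 and redundant disks 5 to 7; bit values not preserved in this transcript]
– Disk 5 is the modulo-2 sum of disks 1, 2, 3.
– Disk 6 is the modulo-2 sum of disks 1, 2, 4.
– Disk 7 is the modulo-2 sum of disks 1, 3, 4.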

RAID 6 Failure Recovery
Why is it possible to recover from two disk crashes?
Let the failed disks be a and b. Since all columns of the redundancy matrix are different, we must be able to find some row r in which the columns for a and b are different.
– Suppose that a has 0 in row r, while b has 1 there. Then we can compute the correct b by taking the modulo-2 sum of corresponding bits from all the disks other than b that have 1 in row r.
– Note that a is not among these, so none of them have failed.
Having done so, we must recompute a, with all other disks available, using any row in which a has a 1 (one exists because no column is all 0s).
[Figure: redundancy matrix with columns a and b and row r marked]
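A sketch of this two-crash recovery for the 7-disk example. The matrix rows follow the three sums given for disks 5, 6 and 7; the data values, the 0-based disk indices and the function names are assumptions made for illustration.

```python
from itertools import combinations

# Column d describes disk d+1 (the code uses 0-based indices).
MATRIX = [
    [1, 1, 1, 0, 1, 0, 0],  # disk 5 = disks 1 + 2 + 3 (mod 2)
    [1, 1, 0, 1, 0, 1, 0],  # disk 6 = disks 1 + 2 + 4 (mod 2)
    [1, 0, 1, 1, 0, 0, 1],  # disk 7 = disks 1 + 3 + 4 (mod 2)
]


def encode(data):
    """Append the three redundant blocks to four data blocks (ints)."""
    disks = list(data)
    for row in MATRIX:
        parity = 0
        for d in range(4):
            if row[d]:
                parity ^= data[d]
        disks.append(parity)
    return disks


def recover(disks, failed):
    """Rebuild the blocks at the failed indices (at most two)."""
    disks = list(disks)
    remaining = set(failed)
    while remaining:
        for row in MATRIX:
            lost_in_row = [d for d in remaining if row[d]]
            if len(lost_in_row) == 1:   # this row isolates one failed disk
                target = lost_in_row[0]
                disks[target] = 0
                for d, bit in enumerate(row):
                    if bit and d != target:
                        disks[target] ^= disks[d]
                remaining.remove(target)
                break
        else:
            raise ValueError("no row isolates a single failed disk")
    return disks


def all_pairs_recoverable(data):
    """Crash every pair of disks in turn and check the rebuild."""
    original = encode(data)
    for a, b in combinations(range(7), 2):
        damaged = list(original)
        damaged[a] = damaged[b] = None
        if recover(damaged, (a, b)) != original:
            return False
    return True


print(all_pairs_recoverable([0b11110000, 0b10101010, 0b00111000, 0b01000001]))  # True
```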

Example: [Figure: disk contents before failure and after failure]

RAID 6 – How many redundant disks?
The number of disks can be one less than any power of 2, say 2^k – 1.
Of these disks, k are redundant, and the remaining 2^k – 1 – k are data disks, so the redundancy grows roughly as the logarithm of the number of data disks.
For any k, we can construct the redundancy matrix by writing all possible columns of k 0's and 1's, except the all-0's column.
– The columns with a single 1 correspond to the redundant disks, and the columns with more than one 1 are the data disks.
Note finally that we can combine RAID 6 with RAID 5 to reduce the performance bottleneck on the redundant disks.
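The construction can be sketched as a short function that enumerates all non-zero columns of k bits. Placing the data columns first and the redundant columns last is one arbitrary ordering chosen here for illustration; any ordering of distinct non-zero columns would do. The function name is hypothetical.

```python
from itertools import product


def redundancy_matrix(k: int):
    """Return a k x (2^k - 1) matrix: data columns first, redundant last."""
    columns = [c for c in product((0, 1), repeat=k) if any(c)]
    data = sorted((c for c in columns if sum(c) > 1), reverse=True)
    redundant = sorted((c for c in columns if sum(c) == 1),
                       key=lambda c: c.index(1))
    ordered = data + redundant
    return [[col[row] for col in ordered] for row in range(k)]


for row in redundancy_matrix(3):
    print(row)
# [1, 1, 1, 0, 1, 0, 0]
# [1, 1, 0, 1, 0, 1, 0]
# [1, 0, 1, 1, 0, 0, 1]

wide = redundancy_matrix(4)
print(len(wide[0]), len(wide))  # 15 4
```

With k = 4 this gives 15 disks, 4 of them redundant and 11 holding data.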

Exercises

RAID 4
[i-th blocks of the data disks and of the redundant disk; bit values not preserved in this transcript]
Now suppose that Disk 1 crashed. Recover it.

RAID 6
[Contents of disks 1) through 7), data disks followed by redundant disks; bit values not preserved in this transcript]
Now suppose that Disk 2 and Disk 5 crash. Recover them.

RAID 6 - exercise
Find a RAID level 6 scheme using 15 disks, 4 of which are redundant.

In-Class exercise
Suppose we have four disks: 1 and 2 are data disks, 3 and 4 are redundant.
– Disk 3 is a mirror of disk 1.
– Disk 4 holds the parity check bits for disks 2 and 3.
Which combinations of simultaneous 2-disk failures can we recover from?