Erasure Correcting Codes for Highly Available Storage

Presentation transcript:

Erasure Correcting Codes for Highly Available Storage Thomas Schwarz, S.J.

Error Control Codes Use redundancy to correct errors. Designed for: ease of encoding and decoding (calculation of syndrome / location of error); error correction power (burst errors / low redundancy).

Error Control Codes Block Codes: Information Symbols + Parity Symbols (i1 i2 i3 i4 i5 i6 i7 i8 p1 p2 p3)

Error Control Codes Typical Applications: Communication: deep space (“a match made in heaven”), telephone, computer networks. Streaming audio, video (CD, DVD). Storage (main memory, magnetic & optical devices).

Error Correcting Codes Most applications use hardware implemented encoding and decoding.

Erasure Correcting Codes Protect against erasure of data. Simplest Erasure Correcting Code: Parity. Code word: i1 i2 i3 i4 i5 i6 i7 i8 p, where p = i1 ⊕ i2 ⊕ i3 ⊕ i4 ⊕ i5 ⊕ i6 ⊕ i7 ⊕ i8.
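As a minimal sketch of this parity code (the byte values and the helper name `parity` are illustrative assumptions, not taken from the slides): the parity symbol is the XOR of the eight information symbols, and any single erased symbol is the XOR of the survivors.

```python
from functools import reduce
from operator import xor

def parity(symbols):
    """XOR of all symbols; symbols are small integers such as bytes."""
    return reduce(xor, symbols, 0)

# Eight hypothetical information symbols i1..i8 and their parity p.
info = [0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0]
p = parity(info)

# Erase i3 (index 2) and rebuild it from the seven survivors plus p.
erased = 2
survivors = info[:erased] + info[erased + 1:]
rebuilt = parity(survivors + [p])
```

Because XOR is its own inverse, the same function serves for encoding and for reconstruction.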

Erasure Correcting Codes Some applications implement encoding and decoding in hardware (e.g. RAIDs). Software implementation is much more feasible than for error correcting codes, because the decoding problem is simpler: the locations of the erasures are known.

Erasure Correcting Codes Ideal Properties: Systematic: data is stored explicitly, and data updates do not change other data. MDS: only as much parity data is created as is necessary to reconstruct from the maximum number of failures. Simple encoding and decoding.

Parity Based Codes Only use parity of data (XOR operation) for ease of coding and decoding.

Parity Based Codes History: Protection for Multitrack Magnetic Recording. Prusinkiewicz & Budkowski 1976: [Figure: five parallel tracks, in order Parity 1, Data 1, Data 2, Data 3, Parity 2.] Horizontal and diagonal parity.

Parity Based Codes Extend the scheme by using lines of different slopes. Patel 1985: horizontal + 2 diagonals (slopes 0,1,-1) However, the code is optimal only if the data band is infinite. If not, there is (slightly) more parity than data.

Parity Based Array Codes Idea: Break up data into m symbols. Arrange the symbols in columns. Use horizontal and vertical lines to calculate parity. 1st column: horizontal parity, 2nd column: vertical parity

Parity Based Array Codes But it is not so simple! The array on this slide (figure not reproduced) is a legitimate code word.

Parity Based Array Codes But indistinguishable from the zero code word after failure of columns 1 and 3.

Parity Based Array Codes Number of Data Columns needs to be prime.

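EvenOdd Better version of array codes for two parity columns. Code words are two-dimensional (m−1) by m arrays with two additional parity columns.

EvenOdd The EvenOdd code has as code words the (m−1) by (m+2) arrays of symbols a_{i,j} such that the last two columns hold the horizontal and the diagonal parity of the first m columns.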
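In formulas, and assuming the slide follows the usual EVENODD definition (which agrees with the value of S computed in the encoding example for m = 5), the parity columns can be written as follows. The all-zero "imaginary" row a_{m-1,j} = 0 and the notation ⟨x⟩_m for x mod m are conventions assumed here.

```latex
\begin{aligned}
a_{i,m}   &= \bigoplus_{j=0}^{m-1} a_{i,j},
            && 0 \le i \le m-2 \quad \text{(horizontal parity)} \\
a_{i,m+1} &= S \oplus \bigoplus_{j=0}^{m-1} a_{\langle i-j \rangle_m,\, j},
            && 0 \le i \le m-2 \quad \text{(diagonal parity)} \\
S         &= \bigoplus_{j=1}^{m-1} a_{m-1-j,\, j},
            && a_{m-1,j} = 0 \text{ for all } j .
\end{aligned}
```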

EvenOdd Encoding Set m=5. Start with an arbitrary 4 by 5 data array.

EvenOdd Encoding Fill in the horizontal parity lines and calculate S = a_{3,1} + a_{2,2} + a_{1,3} + a_{0,4} = 0 + 1 + 0 + 0 = 1 (addition is XOR).

EvenOdd Encoding
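The encoding steps can be sketched as below. This is a sketch under assumptions: the function name `evenodd_encode` is hypothetical, the parity equations are the usual EVENODD ones with an imaginary all-zero row m−1, and the 4 by 5 data array is made up (the array from the slide is not in the transcript); it is chosen only so that S = 0 + 1 + 0 + 0 = 1 as in the example.

```python
def evenodd_encode(data, m):
    """Encode an (m-1) x m data array; m must be an odd prime.

    Returns (code, S) where code has two extra columns per row:
    column m is the horizontal parity, column m+1 the diagonal parity.
    """
    padded = [list(row) for row in data] + [[0] * m]  # imaginary zero row m-1
    S = 0
    for j in range(1, m):
        S ^= padded[m - 1 - j][j]          # the diagonal through a[m-1][0]
    code = []
    for i in range(m - 1):
        horizontal, diagonal = 0, S
        for j in range(m):
            horizontal ^= padded[i][j]
            diagonal ^= padded[(i - j) % m][j]
        code.append(list(data[i]) + [horizontal, diagonal])
    return code, S

m = 5
data = [  # hypothetical data, not the array shown on the slide
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
code, S = evenodd_encode(data, m)
```

Symbols may be bits as here or wider integers such as bytes, since only XOR is used.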

EvenOdd Decoding Assume that the last two data columns have failed.

EvenOdd Decoding Use the parity columns to calculate S.

EvenOdd Decoding Use S=1 and the magenta diagonal to find the data symbol in the last column.

EvenOdd Decoding Then use the horizontal parity for one more symbol.

EvenOdd Decoding The blue diagonal now can be exploited.
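The decoding walk above (recover S from the parity columns, then alternate between a diagonal and a horizontal parity line) can be sketched as follows for two failed data columns. The function name, the imaginary zero row, and the sample code word are assumptions for illustration; the code word is a hypothetical one for m = 5, not the array from the slides.

```python
def evenodd_recover(code, m, r, s):
    """Rebuild erased data columns r < s of an EvenOdd code word (m odd prime).

    The contents of columns r and s in `code` are ignored.
    """
    a = [list(row) for row in code] + [[0] * (m + 2)]  # imaginary zero row m-1
    for i in range(m):
        a[i][r] = a[i][s] = 0
    S = 0
    for i in range(m - 1):
        S ^= a[i][m] ^ a[i][m + 1]         # parity columns alone determine S
    horiz = [0] * m                        # horiz[u] = a[u][r] ^ a[u][s]
    diag = [0] * m                         # diag[u]  = XOR of the two lost
    for u in range(m):                     #            symbols on diagonal u
        horiz[u] = a[u][m]
        diag[u] = S ^ a[u][m + 1]
        for j in range(m):
            horiz[u] ^= a[u][j]
            diag[u] ^= a[(u - j) % m][j]
    t = (m - 1 - (s - r)) % m              # row whose diagonal partner is imaginary
    while t != m - 1:
        a[t][s] = diag[(t + s) % m] ^ a[(t + s - r) % m][r]   # diagonal step
        a[t][r] = horiz[t] ^ a[t][s]                          # horizontal step
        t = (t - (s - r)) % m
    return a[:m - 1]

m = 5
code = [  # hypothetical EvenOdd code word with S = 1
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 0],
]
damaged = [row[:3] + [None, None] + row[5:] for row in code]  # last two data columns lost
repaired = evenodd_recover(damaged, m, 3, 4)
```

The loop visits every row exactly once because m is prime, which is one way to see why the primality requirement matters.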

EvenOdd EvenOdd requires that m is a prime. Hence, for a given number n of data lines, choose m to be the smallest prime ≥ n. Set the superfluous data columns to zero:

EvenOdd Encoding and decoding only use XOR operations. The given formulae suggest an iterative procedure, but the equations can easily be expanded to calculate the symbols in parallel.

Higher Array Codes There exist array codes using only XOR operations that can correct up to m erasures. The decoding process involves the solution of a linear equation.

Algebraic Block Codes Interpret symbols (larger than bits) as elements of a Galois Field. Calculate parity symbols as linear combinations of the data symbols.

Galois Fields Only GF(2^f) for simplicity’s sake. Elements: Bit strings of length f. Addition: XOR. Multiplication: Much more complicated.

Galois Field Multiplication For GF(2^8). Elements are bytes. Method 1: Identify a byte with a binary polynomial, e.g. (0100 1001) = x^6 + x^3 + 1. Multiply two polynomials as polynomials modulo a generator polynomial, e.g. modulo 1 0001 1101 = x^8 + x^4 + x^3 + x^2 + 1.

Galois Field Multiplication Combination of XORs and shifts!
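A minimal sketch of Method 1, assuming the generator polynomial 1 0001 1101 (0x11D) from the previous slide; the function name `gf256_mul` is hypothetical. Each step shifts one operand (multiplication by x) and reduces with an XOR whenever the degree reaches 8.

```python
POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1

def gf256_mul(x, y):
    """Multiply two bytes as binary polynomials modulo POLY."""
    result = 0
    while y:
        if y & 1:          # current bit of y set: add (XOR) the shifted x
            result ^= x
        y >>= 1
        x <<= 1            # multiply x by the polynomial "x"
        if x & 0x100:      # degree 8 reached: reduce modulo POLY
            x ^= POLY
    return result
```

For example, `gf256_mul(0x80, 0x02)` computes x^7 · x = x^8, which reduces to x^4 + x^3 + x^2 + 1 = 0x1D.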

Galois Field Multiplication This multiplication gives a field structure to GF(2^f). The multiplicative group is cyclic: there are elements α such that all nonzero elements can be written as α^i, i = 0, 1, …, 2^f − 2.

Galois Field Multiplication For each non-zero element x ∈ GF(2^f) define log(x) = i iff α^i = x. Define antilog(i) = α^i. Calculate x·y = antilog(log(x) + log(y)) if x ≠ 0 and y ≠ 0; x·y = 0 if x = 0 or y = 0.

Galois Field Multiplication Can be implemented with two tables, two zero comparisons, four additions, and three memory accesses: 9 elementary operations in a processor with sufficient L1 cache to store 3·(2^f − 1) entries.
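A sketch of the table method for f = 8, assuming α = x (the byte 0x02) as generator and the polynomial 0x11D; the names `LOG`, `ANTILOG` and `gf_mul` are hypothetical. The antilog table is stored twice in a row so that log(x) + log(y) never needs a modulo reduction, which gives roughly the 3·(2^f − 1) entries mentioned above.

```python
F = 8
ORDER = (1 << F) - 1          # 255 non-zero field elements
POLY = 0x11D                  # x^8 + x^4 + x^3 + x^2 + 1

LOG = [0] * (ORDER + 1)       # LOG[0] is unused: log(0) is undefined
ANTILOG = [0] * (2 * ORDER)   # doubled so that indices up to 2*ORDER - 2 are valid

value = 1
for i in range(ORDER):
    ANTILOG[i] = ANTILOG[i + ORDER] = value
    LOG[value] = i
    value <<= 1               # multiply by alpha = x
    if value & (1 << F):
        value ^= POLY

def gf_mul(x, y):
    """x * y in GF(2^8) using two table lookups for the logs and one for the antilog."""
    if x == 0 or y == 0:
        return 0
    return ANTILOG[LOG[x] + LOG[y]]
```

Division follows the same pattern with a subtraction of logarithms, which is one reason the table form is convenient for the matrix inversions used later.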

Linear Erasure Correcting Block Codes m data symbols u = (u_0, u_1, u_2, …, u_{m−1}). [Figure: data buckets 0 to 3; bucket j holds the symbols u_j, u_j’, u_j’’, u_j’’’, …; the code word u’’ takes the symbol u_j’’ from every bucket.]

Linear Erasure Correcting Block Codes Add k = n − m parity symbols for code word a. [Figure: data buckets 0 to 3 as before, plus parity buckets 0 to k−1; parity bucket l holds the symbols p_l, p_l’, p_l’’, p_l’’’, ….]

Linear Erasure Correcting Block Codes Calculate the parity symbols as a linear combination of the data symbols: a = u · G, with “Generator Matrix” G (m rows, n columns).

Properties of a Good Generator Matrix Systematic: Left m by m matrix is identity matrix. MDS: All matrices formed from m different columns of G are invertible. Thus: Any m coordinates of code word a suffice to calculate data word u.

Generation of Generator Matrices Find the largest rectangular matrix with MDS property. Multiply from left with the inverse of the matrix formed by the first m columns. Result is still MDS and now systematic.

Large MDS Matrices There are known families of matrices with the MDS property: Cauchy: m + n = 2^f. Vandermonde: n = 2^f − 1. Twice extended Vandermonde: n = 2^f + 1.

Vandermonde Matrix

Vandermonde Generator Matrix

Vandermonde Generator Matrix Write column m as a linear combination of the first m columns. Multiply column i (i = 0, 1, …, m − 1) with its coefficient, which is non-zero according to Cramer’s Rule. (This preserves MDS.) Multiply from the left with A^-1, where A is the matrix consisting of columns 0 to m − 1. Column m of the result then consists of ones, so the first parity symbol is the XOR of the data symbols.

Vandermonde Generator Matrix
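The construction can be sketched as follows over GF(2^8). Everything named here is an assumption for illustration: the helper names, the polynomial 0x11D, the evaluation points 0, 1, …, n − 1, and the small sizes m = 3, n = 6. The sketch starts from a plain Vandermonde matrix rather than the twice extended one, scales the first m columns as described, multiplies from the left with the inverse of the first m columns, and then checks the systematic and MDS properties by brute force.

```python
from itertools import combinations

POLY = 0x11D
EXP, LOG = [0] * 510, [0] * 256
value = 1
for i in range(255):
    EXP[i] = EXP[i + 255] = value
    LOG[value] = i
    value <<= 1
    if value & 0x100:
        value ^= POLY

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def mat_mul(A, B):
    out = [[0] * len(B[0]) for _ in A]
    for i, row in enumerate(A):
        for k, a in enumerate(row):
            for j, b in enumerate(B[k]):
                out[i][j] ^= mul(a, b)
    return out

def mat_inv(A):
    """Gauss-Jordan inversion over GF(2^8); raises ValueError if singular."""
    n = len(A)
    M = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            raise ValueError("singular matrix")
        M[col], M[pivot] = M[pivot], M[col]
        scale = EXP[255 - LOG[M[col][col]]]          # multiplicative inverse
        M[col] = [mul(scale, v) for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [v ^ mul(f, w) for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def columns(M, idx):
    return [[row[j] for j in idx] for row in M]

def generator(m, n):
    V = [[1] * n]                                    # row i holds x_j ** i
    for _ in range(m - 1):
        V.append([mul(v, p) for v, p in zip(V[-1], range(n))])
    A_inv = mat_inv(columns(V, range(m)))
    c = [row[0] for row in mat_mul(A_inv, columns(V, [m]))]  # column m = sum c[i] * column i
    scaled = [[mul(v, c[j]) if j < m else v for j, v in enumerate(row)] for row in V]
    return mat_mul(mat_inv(columns(scaled, range(m))), scaled)

def is_mds(G):
    m, n = len(G), len(G[0])
    for idx in combinations(range(n), m):
        try:
            mat_inv(columns(G, idx))
        except ValueError:
            return False
    return True

G = generator(3, 6)
```

The brute-force `is_mds` check is only practical for small m and n; it is there to make the property visible, not as part of encoding.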

RS Erasure Correcting Codes The generator matrix is that of a twice extended, generalized Reed-Solomon code. Large number of parity symbols: If symbols are bytes, then code length is 257.

RS Erasure Correcting Codes Encoding: Generation of a parity symbol costs m multiplications with known coefficients and m − 1 XOR operations, i.e. 7m − 1 elementary operations.

RS Erasure Correcting Codes Change of one data symbol in a data word: Calculate the difference d = u_i^new − u_i^old. Send d to the site maintaining the parity symbol. Multiply with coefficient g_{i,l} of G. Add to existing parity. 7 elementary operations per parity site. 1 elementary operation at data site. 1 message.
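The update protocol can be sketched as below. The coefficient column, the data values and all function names are hypothetical; the multiplication uses the shift-and-XOR form modulo 0x11D. In GF(2^f) subtraction and addition are both XOR, so the data site computes d with one XOR and the parity site folds g · d into its stored symbol.

```python
POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1

def gf_mul(x, y):
    """Shift-and-XOR multiplication in GF(2^8)."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        y >>= 1
        x <<= 1
        if x & 0x100:
            x ^= POLY
    return result

def parity_symbol(data, coeffs):
    """Full recomputation: p = sum over i of u_i * g_i."""
    p = 0
    for u, g in zip(data, coeffs):
        p ^= gf_mul(u, g)
    return p

def delta(old, new):
    """Data site: d = new - old, which is XOR in GF(2^f)."""
    return old ^ new

def apply_delta(parity, d, g):
    """Parity site: add g * d to the stored parity symbol."""
    return parity ^ gf_mul(g, d)

coeffs = [0x01, 0x02, 0x04, 0x08]   # hypothetical column of G for one parity site
data = [0x10, 0x20, 0x30, 0x40]     # hypothetical data word
p = parity_symbol(data, coeffs)

i, new_value = 2, 0x99
d = delta(data[i], new_value)        # computed at the data site, sent as one message
p_updated = apply_delta(p, d, coeffs[i])
data[i] = new_value
```

The parity site never needs the other data symbols, which is what keeps the update cost independent of m.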

RS Erasure Correcting Codes Erasure Correction: Typical cases: Parity site has failed. Regenerate parity from the data sites. Data site has failed. Use column m to regenerate the data from the other data sites and the XOR stored at this first parity site.

RS Erasure Correcting Codes Erasure Correction, General Case: Collect m survivors among data and parity sites. Invert the matrix consisting of the corresponding columns of G. Each replacement site uses this matrix and G in order to calculate a decoding matrix H.

RS Erasure Correcting Codes Send surviving data to all replacement sites. Use decoding matrix in order to regenerate the lost data or parity.
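The general case can be sketched as follows. The generator here is a systematic matrix derived from a plain Vandermonde matrix over GF(2^8), which is an assumption made for brevity rather than the twice extended matrix of the slides; helper names, sizes and data values are likewise hypothetical. With G_S the columns of G at the m survivors and G_L the columns at the lost sites, the decoding matrix is taken to be H = G_S^-1 · G_L, so that the lost symbols equal (surviving symbols) · H.

```python
POLY = 0x11D
EXP, LOG = [0] * 510, [0] * 256
value = 1
for i in range(255):
    EXP[i] = EXP[i + 255] = value
    LOG[value] = i
    value <<= 1
    if value & 0x100:
        value ^= POLY

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def mat_mul(A, B):
    out = [[0] * len(B[0]) for _ in A]
    for i, row in enumerate(A):
        for k, a in enumerate(row):
            for j, b in enumerate(B[k]):
                out[i][j] ^= mul(a, b)
    return out

def mat_inv(A):
    """Gauss-Jordan inversion over GF(2^8); raises ValueError if singular."""
    n = len(A)
    M = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            raise ValueError("singular matrix")
        M[col], M[pivot] = M[pivot], M[col]
        scale = EXP[255 - LOG[M[col][col]]]
        M[col] = [mul(scale, v) for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [v ^ mul(f, w) for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def columns(M, idx):
    return [[row[j] for j in idx] for row in M]

def systematic_generator(m, n):
    V = [[1] * n]                                   # Vandermonde rows x_j ** i
    for _ in range(m - 1):
        V.append([mul(v, p) for v, p in zip(V[-1], range(n))])
    return mat_mul(mat_inv(columns(V, range(m))), V)

m, n = 4, 7
G = systematic_generator(m, n)
u = [0x41, 0x42, 0x43, 0x44]                        # data word
a = mat_mul([u], G)[0]                              # code word a = u * G

lost = [1, 2, 5]                                    # k = n - m = 3 failed sites
survivors = [j for j in range(n) if j not in lost][:m]
H = mat_mul(mat_inv(columns(G, survivors)), columns(G, lost))
rebuilt = mat_mul([[a[j] for j in survivors]], H)[0]
```

Each replacement site only needs its own column of H, so the surviving symbols can be broadcast once and every site computes a single dot product per code word.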

Measurements XOR update: 0.45 sec. EvenOdd update: 0.48 sec. RS update: 1.27 sec. Record group with 4 records of 100 bytes. One record is changed. Measured is the time to update the parity symbol. Used a 700 MHz Pentium 3 machine.