
1 Erasure Codes for Reading and Writing
Mario Vodisek (joint work with AG Schindelhauer)
HEINZ NIXDORF INSTITUTE, University of Paderborn, Algorithms and Complexity

2 Agenda
- Erasure (Resilient) Codes in storage networks
- The Read-Write-Coding-System
  – A Lower Bound and Perfect Codes
  – Requirements and Techniques

3 Erasure (Resilient) Coding
- $n$-symbol message $x$ with symbols from alphabet $\Sigma$
- $m$-symbol encoding $y$ with symbols from $\Sigma$ ($m > n$)
- erasure coding provides a mapping $\Sigma^n \to \Sigma^m$ such that
  – reading any $r$ symbols of $y$ ($n \le r < m$) is sufficient for recovery
  – (mostly $r = n$ ⇒ optimal for reading)
- advantages:
  – $\lfloor m - r \rfloor$ erasures can be tolerated
  – storage overhead is a factor of $m/n$
Generally, erasure codes are used to guarantee information recovery for data transmission over unreliable channels (RS-, Turbo-, LT-Codes, …)
Lots of research in code properties such as
  – scalability
  – encoding/decoding speed-up
  – rateless-ness
Attractive also to storage networks: downloads (P2P) and fault-tolerance

4 Erasure Codes for Storage (Area) Networks
SANs require:
- high system availability
  – disks fail or are blocked (probability ↔ size)
- efficient modification handling
  – slow devices ⇒ expensive I/O operations
Properties:
- a fixed set $E$ of existing errors can be considered at encoding time
- $E$ can have changed to $E'$ at decoding time
Additional requirements to erasure codes:
- tolerate a certain number of erasures
- ensure modification of the codeword even if erasures occur
- consider $E$ at encoding time and $E'$ at decoding time

5 The Read-Write-Coding-System
An $(n, r, w, m)_b$-Read-Write-Coding System (RWC) is defined as follows:
- The base $b$: a $b$-symbol alphabet $\Sigma_b$ as the set of all used items
- $n \ge 1$ blocks of information $x_1, \dots, x_n \in \Sigma_b$
- $m \ge n$ code blocks $y_1, \dots, y_m \in \Sigma_b$
- any $r$ code words ($n \le r \le m$) are sufficient to read the information
- any $w$ code words ($n \le w \le m$) are sufficient to change the information by $\delta_1, \dots, \delta_n$
(In the language of Coding Theory): given $m, n, r, w$, our RW-Codes provide a (linear) code of dimension $n$ and block length $m$ such that for $n \le r, w \le m$:
  – the minimum distance of the code is at least $m - r + 1$
  – any two codewords $y_1, y_2$ are within a distance of at most $w$ from one another
  – $\mathrm{distance}(x, y) := |\{1 \le i \le m : x_i \ne y_i\}|$

6 A Lower Bound for RW-Codes
Theorem: For $r + w < n + m$ and any base $b$ there does not exist any $(n, r, w, m)_b$-RWC system!
We know: $n \le r$, $w \le m$
Assume: $r = w = n$ and $m \ge n + 1$; consider a write and a subsequent read
Proof:
- Index sets $(W, R)$ with $|W| = w$, $|R| = r$, $S = W \cap R$, $|S| \in \{n, n-1\}$
- Assume $|S| = n$ ⇒ there are $b^n$ possible change vectors to be encoded by `write` into $S$; $S$ is the only basis for reading with $r = n$ (notice: the code words in $R \setminus S$ remain unchanged)
- Assume $|S| < n$, i.e. $|S| = n - 1$ ⇒ at most $b^{n-1}$ possible change vectors for $S$ can be encoded by `write` ⇒ `read` will produce faulty output

7 Codes at Lower Bound: Perfect Codes
- In the best case, $(n, r, w, m)_b$-RWC have parameters $r + w = n + m$ (perfect codes)
- Unfortunately, perfect RWC do not always exist!
  – E.g. there is no $(1, 2, 2, 3)_2$-RWC, but there exists a $(1, 2, 2, 3)_3$-RWC!
- But: all perfect RW-Codes exist if the alphabet is sufficiently large!
Notice to RAID:
- The definition of parity RAID (RAID 4/5) corresponds to an $(n, n, n+1, n+1)_2$-RWC
- From the lower bound it follows: there is no $(n, n, n, n+1)_2$-RWC ⇒ there is no RAID system with improved access properties!
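The RAID remark can be checked by plugging the two parameter sets into the condition $r + w \ge n + m$ from the lower bound. The short calculation below is only a worked instance of that condition; it adds no new result.

```latex
\begin{align*}
\text{parity RAID: } & (n, r, w, m) = (n,\ n,\ n+1,\ n+1)
  && \Rightarrow\ r + w = 2n + 1 = n + m \quad \text{(perfect)} \\
\text{improved variant: } & (n, r, w, m) = (n,\ n,\ n,\ n+1)
  && \Rightarrow\ r + w = 2n < 2n + 1 = n + m \quad \text{(excluded by the lower bound)}
\end{align*}
```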

8 The Model: Operations
Given:
- $X = x_1, \dots, x_n$: the $n$-symbol information vector over a finite alphabet $\Sigma$
- $Y = y_1, \dots, y_m$: the $m$-symbol code over $\Sigma$, with $b = |\Sigma|$
- $P(M)$: the power set of $M$, $P_k(M) := \{S \in P(M) : |S| = k\}$
- Define $[m] := \{1, \dots, m\}$
An $(n, r, w, m)_b$-RWC-system consists of the following operations:
- Initial state: $X_0 \in \Sigma^n$, $Y_0 \in \Sigma^m$
- Read function: $f : P_r([m]) \times \Sigma^r \to \Sigma^n$
- Write function: $g : P_r([m]) \times \Sigma^r \times P_w([m]) \times \Sigma^n \to \Sigma^w$
- Differential write function: $\delta : P_w([m]) \times \Sigma^n \to \Sigma^w$

9 Initialization: Compute the Encoding $Y_0$
Given (in general):
- the information vector $X = x_1, \dots, x_n \in \Sigma_b$
- the encoded vector $Y = y_1, \dots, y_m \in \Sigma_b$
- internal variables $V = v_1, \dots, v_k$ for $k = m - w = r - n$, carrying no particular information
- a set of functions $M = M_1, \dots, M_m$ for encoding
Compute $y_i$ from $X$ and $V$ by function $M_i$; define $M_i$ as a linear combination of $X$ and $V$:
$$y_i = M_i(x_1, \dots, x_n, v_1, \dots, v_k) = \sum_{j=1}^{n} x_j M_{i,j} + \sum_{l=1}^{k} v_l M_{i,n+l}$$
⇒ Define $M$ as some $m \times r$ matrix with the $M_i$ as rows. It follows: $M \cdot (X \mid V)^T = Y$
RW-Codes are closely related to Reed-Solomon codes!
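The encoding step $M \cdot (X \mid V)^T = Y$ can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes arithmetic in the prime field F[257] instead of a general alphabet, and the matrix `M`, the data `x` and the slack value `v` are made-up sample values for $(n, r, w, m) = (2, 3, 3, 4)$.

```python
P = 257  # assumed prime modulus; all arithmetic is done in the field F[257]

def encode(M, x, v, p=P):
    """Return Y = M * (X|V)^T over F[p]; M has one row per code symbol."""
    xv = list(x) + list(v)              # concatenate information and slack variables
    return [sum(a * b for a, b in zip(row, xv)) % p for row in M]

# Hypothetical sample instance with n = 2, k = r - n = 1, m = 4.
M = [
    [1, 1, 1],
    [1, 2, 4],
    [1, 3, 9],
    [1, 4, 16],
]
x = [5, 7]   # information vector X
v = [9]      # slack vector V (its content carries no information)
y = encode(M, x, v)
print(y)
```

Each code symbol `y[i]` depends only on row `i` of `M`, so each storage device can be associated with one row.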

10 The Matrix Approach: $(n, r, w, m)_b$-RWC
Consider:
- the information vector $X = x_1, \dots, x_n \in \Sigma_b$
- the encoded vector $Y = y_1, \dots, y_m \in \Sigma_b$
- internal slack variables $V = v_1, \dots, v_k$ for $k = m - w = r - n$
Further:
- an $m \times r$ generator matrix $M$ with $M_{i,j} \in \Sigma_b$
- the submatrix $(M_{i,j})_{i \in [m],\, j \in \{n+1, \dots, r\}}$ is called the variable matrix
$$M \cdot (X \mid V)^T = Y$$

11 Efficient Encoding: $\Sigma_b = F[b]$ (Finite Fields)
RWC requires efficient arithmetic on elements of $\Sigma_b$ for encoding ⇒ set $\Sigma_b = F[b]$ (finite field with $b$ elements, formerly GF($b$))
- $b = p^n$ for some prime number $p$ and integer $n$ ⇒ $F[p^n]$ always exists
- Computation on binary words of length $v$: $b = 2^v$, $F[2^v] = \{0, \dots, 2^v - 1\}$
Features:
- $F[b]$ is closed under addition and multiplication ⇒ exact computation on field elements ⇒ no more than $v$ bits are needed to represent results
- Addition and subtraction via XOR (avoids rounding, no carry-over)
- Multiplication and division via mapping tables (analogous to logarithm tables for real numbers)
  – T: table mapping an integer to its logarithm in $F[2^v]$
  – IT: table mapping an integer to its inverse logarithm in $F[2^v]$
  ⇒ multiply or divide by adding or subtracting the logs and then taking the inverse log
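The table-based arithmetic described above can be sketched for $v = 8$ as follows. The slides do not name a field polynomial; the polynomial `0x11D` ($x^8 + x^4 + x^3 + x^2 + 1$) and the generator 2 are assumptions, as are the function names. `T` and `IT` play the roles of the logarithm and inverse-logarithm tables.

```python
V_BITS = 8
FIELD_SIZE = 1 << V_BITS      # b = 2^v = 256 field elements
PRIM_POLY = 0x11D             # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1

def build_tables():
    """Build T (element -> log) and IT (log -> element) for generator 2."""
    log = [0] * FIELD_SIZE              # log[0] is unused: 0 has no logarithm
    antilog = [0] * (FIELD_SIZE - 1)
    value = 1
    for exponent in range(FIELD_SIZE - 1):
        antilog[exponent] = value
        log[value] = exponent
        value <<= 1                     # multiply by the generator
        if value & FIELD_SIZE:          # degree overflow: reduce modulo PRIM_POLY
            value ^= PRIM_POLY
    return log, antilog

T, IT = build_tables()

def gf_add(a, b):
    """Addition and subtraction are both bitwise XOR in F[2^v]."""
    return a ^ b

def gf_mul(a, b):
    """Multiply by adding logarithms and taking the inverse logarithm."""
    if a == 0 or b == 0:
        return 0
    return IT[(T[a] + T[b]) % (FIELD_SIZE - 1)]

def gf_div(a, b):
    """Divide by subtracting logarithms; division by zero is undefined."""
    if b == 0:
        raise ZeroDivisionError("division by zero in F[2^v]")
    if a == 0:
        return 0
    return IT[(T[a] - T[b]) % (FIELD_SIZE - 1)]
```

Every result fits in `V_BITS` bits, which is the closure property the slide refers to.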

12 The Vandermonde Matrix
Consider $M$ as an $m \times r$ Vandermonde matrix with $M_{i,j} = i^{\,j-1}$:
- $X, Y, V \in F[b]$
- $M_{i,j} \in F[b]$ and all elements are different
- The Vandermonde matrix is non-singular ⇒ invertible
- Any $k' \times k'$ submatrix $M'$ is also invertible
Consider: each device $i$ in the SAN corresponds to a row of $M$ and to the element $y_i$
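The property that reading relies on, namely that any $r$ rows of the matrix are linearly independent, can be checked by brute force for a small instance. The sketch below makes two assumptions: it works in the prime field F[257] instead of $F[2^v]$, and it builds row $i$ as $(1, i, i^2, \dots, i^{r-1})$. The helper names are illustrative only.

```python
from itertools import combinations

P = 257  # assumed prime modulus, standing in for the slides' F[2^v]

def vandermonde(m, r, p=P):
    """m x r matrix whose row i (1-based) is (1, i, i^2, ..., i^(r-1)) mod p."""
    return [[pow(i, j, p) for j in range(r)] for i in range(1, m + 1)]

def det_mod_p(rows, p=P):
    """Determinant of a square matrix over F[p] via Gaussian elimination."""
    a = [[val % p for val in row] for row in rows]
    size = len(a)
    det = 1
    for col in range(size):
        pivot = next((i for i in range(col, size) if a[i][col]), None)
        if pivot is None:
            return 0                        # no pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det = det * a[col][col] % p
        inv = pow(a[col][col], p - 2, p)    # inverse via Fermat's little theorem
        for i in range(col + 1, size):
            factor = a[i][col] * inv % p
            for j in range(col, size):
                a[i][j] = (a[i][j] - factor * a[col][j]) % p
    return det % p

m, r = 6, 3
M = vandermonde(m, r)
all_invertible = all(
    det_mod_p([M[i] for i in subset]) != 0
    for subset in combinations(range(m), r)
)
print(all_invertible)
```

The check enumerates all row subsets, so it is only practical for small $m$ and $r$.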

13 Reading (or Recovery)
Read: Given any $r$ code entries from $Y$, compute $X$
- Rearrange the rows of $M$ and $Y$ such that the first $r$ entries of $Y$ are the available ones: $M \to M'$ and $Y \to Y'$ (any $r$ rows of $M$ are linearly independent in a Vandermonde matrix)
- The first $r$ rows of $M'$ form an invertible $r \times r$ matrix $M''$
- $X$ is computed by: $(X \mid V)^T = (M'')^{-1} \, Y''$, where $Y''$ denotes the first $r$ entries of $Y'$
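The read operation amounts to solving an $r \times r$ linear system built from the rows of the available devices. The sketch below assumes the prime field F[257] and a sample $4 \times 3$ matrix; `solve_mod_p` and `read` are hypothetical helper names. It solves the system by Gauss-Jordan elimination instead of forming $(M'')^{-1}$ explicitly, which gives the same result.

```python
P = 257  # assumed prime modulus; the slides use F[2^v] instead

def solve_mod_p(A, y, p=P):
    """Solve A z = y over F[p] for a square invertible A (Gauss-Jordan)."""
    size = len(A)
    aug = [[val % p for val in row] + [rhs % p] for row, rhs in zip(A, y)]
    for col in range(size):
        pivot = next(i for i in range(col, size) if aug[i][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], p - 2, p)          # modular inverse of the pivot
        aug[col] = [val * inv % p for val in aug[col]]
        for i in range(size):
            if i != col and aug[i][col]:
                factor = aug[i][col]
                aug[i] = [(a - factor * b) % p for a, b in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

def read(M, available, n, p=P):
    """Recover X from r available code symbols.

    available maps a 0-based device index to the code symbol stored there.
    """
    idx = sorted(available)
    M2 = [M[i] for i in idx]            # rows of the available devices (M'')
    y2 = [available[i] for i in idx]    # the matching code symbols
    xv = solve_mod_p(M2, y2, p)         # (X | V)
    return xv[:n]                       # drop the slack variables

# Sample instance: n = 2, r = 3, m = 4; the second device is erased.
M = [[1, 1, 1], [1, 2, 4], [1, 3, 9], [1, 4, 16]]
available = {0: 21, 2: 107, 3: 177}
x = read(M, available, n=2)
print(x)
```

Any other choice of three available devices yields the same information vector, since every set of three rows of this matrix is invertible.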

14 Differential Write
Given:
- The change vector $\delta = \delta_1, \dots, \delta_n$ and $w$ code entries from $Y$
- $X' = X + \delta$ is the new information vector ⇒ change $X$ without reading entries (XOR)
- Compute the difference for the $w$ code entries of $Y$
Further:
- Only choices $w < r$ make sense
- Rearrange the $m \times r$ matrix $M$ and $Y$ such that the $w$ code entries to be written come first: $y_1, \dots, y_w$ (denote the results $M'$ and $Y'$)
- $k = r - n$ (slack vector $V$)

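15 Differential Write (con't)
Define the following sub-matrices of $M'$:
- $M^{\leftarrow\uparrow} = (M'_{i,j})_{i \in [w],\, j \in [n]}$
- $M^{\uparrow\rightarrow} = (M'_{i,j})_{i \in [w],\, j \in \{n+1, \dots, r\}}$
- $M^{\leftarrow\downarrow} = (M'_{i,j})_{i \in \{w+1, \dots, m\},\, j \in [n]}$
- $M^{\downarrow\rightarrow} = (M'_{i,j})_{i \in \{w+1, \dots, m\},\, j \in \{n+1, \dots, r\}}$
$$M' = \begin{pmatrix} M^{\leftarrow\uparrow} & M^{\uparrow\rightarrow} \\ M^{\leftarrow\downarrow} & M^{\downarrow\rightarrow} \end{pmatrix} \quad \text{(rows } 1..w \text{ and } w+1..m\text{; columns } 1..n \text{ and } n+1..r\text{)}$$
$M^{\downarrow\rightarrow}$ is a $k \times k = (m - w) \times (r - n)$ matrix ⇒ $M^{\downarrow\rightarrow}$ is invertible
The vector $Y$ can then be updated by a vector $\gamma = \gamma_1, \dots, \gamma_w$, where $\delta$ is the change vector of $X$:
$$\gamma = \left( M^{\leftarrow\uparrow} - M^{\uparrow\rightarrow} \, (M^{\downarrow\rightarrow})^{-1} \, M^{\leftarrow\downarrow} \right) \cdot \delta$$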

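16 Differential Write: Proof
Use:
- Vector $\rho = \rho_1, \dots, \rho_k$: the change of vector $V$
- Vector $\gamma = \gamma_1, \dots, \gamma_w$: the change of vector $Y$
- $X' = X + \delta$, $V' = V + \rho$, $Y' = Y + \gamma$
Correctness follows by combining:
$$\begin{pmatrix} M^{\leftarrow\uparrow} & M^{\uparrow\rightarrow} \\ M^{\leftarrow\downarrow} & M^{\downarrow\rightarrow} \end{pmatrix} \begin{pmatrix} X' \\ V' \end{pmatrix} = M \begin{pmatrix} X \\ V \end{pmatrix} + M \begin{pmatrix} \delta \\ \rho \end{pmatrix} = Y + \begin{pmatrix} \gamma \\ 0 \end{pmatrix}$$
This equation is equivalent to:
$$M^{\downarrow\rightarrow} \rho + M^{\leftarrow\downarrow} \delta = 0, \qquad M^{\leftarrow\uparrow} \delta + M^{\uparrow\rightarrow} \rho = \gamma$$
Since $\delta$ is given, $\rho$ is obtained as follows:
$$\rho = (M^{\downarrow\rightarrow})^{-1} \, (-M^{\leftarrow\downarrow}) \cdot \delta$$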
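The differential write can be sketched as follows: first solve for the slack change from the rows that must stay untouched, then compute the update for the $w$ written entries. This is a minimal illustration, assuming the prime field F[257] and a sample $4 \times 3$ matrix with $k = 1$. The names `rho` and `gamma` (changes of $V$ and $Y$), `delta` (change of $X$) and all helper functions are illustrative choices.

```python
P = 257  # assumed prime modulus; the slides use F[2^v] instead

def matvec(A, x, p=P):
    """Matrix-vector product over F[p]."""
    return [sum(a * b for a, b in zip(row, x)) % p for row in A]

def solve_mod_p(A, y, p=P):
    """Solve A z = y over F[p] for a square invertible A (Gauss-Jordan)."""
    size = len(A)
    aug = [[val % p for val in row] + [rhs % p] for row, rhs in zip(A, y)]
    for col in range(size):
        pivot = next(i for i in range(col, size) if aug[i][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], p - 2, p)
        aug[col] = [val * inv % p for val in aug[col]]
        for i in range(size):
            if i != col and aug[i][col]:
                factor = aug[i][col]
                aug[i] = [(a - factor * b) % p for a, b in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

def differential_write(M, write_idx, delta, n, p=P):
    """Return (gamma, rho): updates for the written code entries and for V."""
    rest = [i for i in range(len(M)) if i not in write_idx]
    up_left = [M[i][:n] for i in write_idx]     # rows to write, columns of X
    up_right = [M[i][n:] for i in write_idx]    # rows to write, columns of V
    down_left = [M[i][:n] for i in rest]        # untouched rows, columns of X
    down_right = [M[i][n:] for i in rest]       # untouched rows, columns of V
    # Untouched rows must not change: down_right * rho = -(down_left * delta).
    rhs = [(-val) % p for val in matvec(down_left, delta, p)]
    rho = solve_mod_p(down_right, rhs, p) if rest else []
    gamma = [(a + b) % p
             for a, b in zip(matvec(up_left, delta, p), matvec(up_right, rho, p))]
    return gamma, rho

# Sample instance: n = 2, r = 3, w = 3, m = 4, so k = m - w = r - n = 1.
M = [[1, 1, 1], [1, 2, 4], [1, 3, 9], [1, 4, 16]]
x, v = [5, 7], [9]
y = matvec(M, x + v)
delta = [1, 2]
write_idx = [0, 1, 2]            # the fourth device is not written
gamma, rho = differential_write(M, write_idx, delta, n=2)
y_new = list(y)
for i, g in zip(write_idx, gamma):
    y_new[i] = (y_new[i] + g) % P
x_new = [(a + d) % P for a, d in zip(x, delta)]
v_new = [(a + d) % P for a, d in zip(v, rho)]
print(y_new == matvec(M, x_new + v_new))
```

Note that the update is computed from `delta` alone: neither the old information `x` nor the old code entries are read to obtain `gamma`.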

17 Thank you for your attention!
Heinz Nixdorf Institute & Computer Science Institute
University of Paderborn
Fürstenallee 11
33102 Paderborn, Germany
Tel.: +49 (0) 52 51/60 64 51
Fax: +49 (0) 52 51/62 64 82
E-Mail: vodisek@upb.de
http://www.upb.de/cs/ag-madh

