Published by Holden Urton. Modified over 2 years ago.

1
236601 - Coding and Algorithms for Memories, Lecture 12

2
Array Codes and Distributed Storage

3
Large Scale Storage Systems
– Big Data players: Facebook, Amazon, Google, Yahoo, …
– [Figure: cluster of machines running Hadoop at Yahoo! (Source: Yahoo!)]
– Failures are the norm

4
Node failures at Facebook
[Figure: node failures at Facebook, plotted by date]
Source: M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis, R. Vadali, S. Chen, and D. Borthakur, "XORing Elephants: Novel Erasure Codes for Big Data," VLDB 2013

5
Problem Setup
Disks are stored together in a group (rack), and disk failures must be tolerated.
Requirements:
– Support as many disk failures as possible
– And yet…
  – Optimal and fast recovery
  – Low complexity

6
Problem Setup
Here α, β, γ and α′, β′, γ′ denote the coefficients of the second and third parity disks.
Question 1: How many extra disks are required to support a single disk failure?
Disks: A, B, C, A+B+C
$\{(x_1,x_2,x_3,x_4) : x_1+x_2+x_3+x_4=0\} = \{(x_1,x_2,x_3,x_4) : H_1\cdot(x_1,x_2,x_3,x_4)^T=0\}$
$H_1 = (1,1,1,1)$
Question 2: How many extra disks are required to support two disk failures?
Disks: A, B, C, A+B+C, αA+βB+γC
$\{(x_1,\dots,x_5) : x_1+x_2+x_3+x_4=0,\ \alpha x_1+\beta x_2+\gamma x_3+x_5=0\} = \{(x_1,\dots,x_5) : H_2\cdot(x_1,\dots,x_5)^T=0\}$
$H_2 = (1,1,1,1,0;\ \alpha,\beta,\gamma,0,1)$
Question 3: How many extra disks are required to support d disk failures?
Disks: A, B, C, A+B+C, αA+βB+γC, α′A+β′B+γ′C
$\{(x_1,\dots,x_6) : x_1+x_2+x_3+x_4=0,\ \alpha x_1+\beta x_2+\gamma x_3+x_5=0,\ \alpha' x_1+\beta' x_2+\gamma' x_3+x_6=0\} = \{(x_1,\dots,x_6) : H_3\cdot(x_1,\dots,x_6)^T=0\}$
$H_3 = (1,1,1,1,0,0;\ \alpha,\beta,\gamma,0,1,0;\ \alpha',\beta',\gamma',0,0,1)$
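For Question 1, a minimal Python sketch of the single-parity case may help: one extra disk holding A+B+C (bytewise XOR) is enough to rebuild any one lost disk. The helper name `xor_blocks` and the sample disk contents are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Three data disks (illustrative contents) and one parity disk A+B+C.
A, B, C = b"disk", b"data", b"here"
parity = xor_blocks([A, B, C])

# Disk B fails: XOR the surviving disks with the parity disk to rebuild it.
recovered_B = xor_blocks([A, C, parity])
```

The four disks always XOR to zero, which is the parity-check condition H_1 ∙ x^T = 0 written over bytes.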

7
Reed-Solomon Codes

8
Reed-Solomon Codes
Advantages:
– Support the maximum number of disk failures
– Are very common in practice and have relatively efficient encoding/decoding schemes
Disadvantages:
– Require arithmetic over large fields
– Need to read all the disks to recover even a single disk failure, so rebuild is inefficient

9
Reed-Solomon Codes
Advantages:
– Support the maximum number of disk failures
– Are very common in practice and have relatively efficient encoding/decoding schemes
Disadvantages:
– Require arithmetic over large fields
  Solution: EVENODD Codes
– Need to read all the disks to recover even a single disk failure, so rebuild is inefficient
  Solution: ZigZag Codes
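To make the rebuild cost concrete, here is a small Reed-Solomon-style sketch in Python. It is an illustration under assumptions, not the lecture's construction: it works over the prime field of size 257 for brevity (practical systems typically use a binary extension field), and the names `lagrange_eval` and `rs_encode` are hypothetical. Each disk holds one evaluation of a polynomial of degree less than k, so rebuilding a single lost disk requires reading k surviving disks.

```python
P = 257  # assumed prime field size, chosen only to keep the arithmetic simple

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # Fermat inverse of den modulo the prime P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def rs_encode(data, n):
    """Systematic encoding: disk x stores the polynomial's value at point x."""
    points = list(enumerate(data))
    parity = [lagrange_eval(points, x) for x in range(len(data), n)]
    return list(data) + parity

codeword = rs_encode([10, 20, 30], 5)   # k = 3 data disks, 2 parity disks

# Disk 1 is lost: any k = 3 surviving disks are read in full to rebuild it.
helpers = [(x, codeword[x]) for x in (0, 3, 4)]
rebuilt = lagrange_eval(helpers, 1)
```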

10
EVENODD Codes
Designed by Mario Blaum, Jim Brady, Jehoshua Bruck, and Jai Menon
Goal: Construct array codes correcting 2 disk failures using only binary XOR operations
– No need for calculations over extension fields
Code construction:
– Every disk is a column
– The array size is (m-1)x(m+2), where m is prime
– The last two columns are used for parity
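The construction can be sketched in Python as follows. This is a hedged illustration that assumes the indexing convention of the original EVENODD paper (diagonals taken modulo m, an imaginary all-zero row, and an adjuster bit S); the function name `evenodd_encode` is hypothetical, and the column ordering may differ from the figure on the next slide.

```python
def evenodd_encode(data, m):
    """Return (row_parity, diag_parity) for an (m-1) x m bit array, m prime."""
    rows = m - 1
    assert len(data) == rows and all(len(r) == m for r in data)
    # Append an imaginary all-zero row so every diagonal has m cells.
    a = [list(r) for r in data] + [[0] * m]

    # First parity column: XOR of each row.
    row_parity = [0] * rows
    for i in range(rows):
        for t in range(m):
            row_parity[i] ^= a[i][t]

    # Adjuster S: XOR along the diagonal with (row + column) = m-1 mod m.
    s = 0
    for t in range(1, m):
        s ^= a[(m - 1 - t) % m][t]

    # Second parity column: S XOR the diagonal with (row + column) = i mod m.
    diag_parity = [s] * rows
    for i in range(rows):
        for t in range(m):
            diag_parity[i] ^= a[(i - t) % m][t]
    return row_parity, diag_parity

# Information bits with m = 5 (four rows, five data columns).
data = [[0, 1, 1, 0, 1],
        [0, 0, 1, 1, 0],
        [0, 0, 0, 1, 1],
        [1, 1, 0, 1, 0]]
row_parity, diag_parity = evenodd_encode(data, 5)
```

Only XOR is used, which is the point of the construction: no extension-field multiplication is needed.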

11
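EVENODD Codes
Information bits (4 rows of 5 bits, m = 5):
01101
00110
00011
11010
Array as shown on the slide (4 rows of 7 bits, followed by an all-zero row):
0101101
0000110
1000011
0111010
0000000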

12
The Repair Problem
[Figure: disks 1 to 10 and parity disks P1 to P4]
– A disk is lost: repair job starts
– Access, read, and transmit data of disks!
– Overuse of system resources during a single repair
– Goal: reduce the repair cost of a single disk repair
Facebook's storage scheme (RS code):
– 10 data blocks
– 4 parity blocks
– Can tolerate any four disk failures

13
ZigZag Codes
Designed by Itzhak Tamo, Zhiying Wang, and Jehoshua Bruck
The goal: construct codes that correct the maximum number of erasures and yet allow efficient reconstruction if only a single drive fails

14
ZigZag Codes Example
a    b    a+b    a+2d
c    d    c+d    c+b
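A small Python sketch of this example, assuming arithmetic modulo 3 (suggested by the coefficient 2 in a+2d, but not stated on the slide). The column names and function names are hypothetical. It shows that either information column can be rebuilt by reading only 3 of the 6 surviving symbols.

```python
Q = 3  # assumed field size: all sums are taken modulo 3

def encode(a, b, c, d):
    """Return the four columns of the 2 x 4 array in the example."""
    info1 = [a, c]
    info2 = [b, d]
    row_parity = [(a + b) % Q, (c + d) % Q]
    zigzag_parity = [(a + 2 * d) % Q, (c + b) % Q]
    return info1, info2, row_parity, zigzag_parity

def rebuild_info1(b, row0, zig1):
    """Rebuild (a, c) from the three symbols b, a+b, and c+b."""
    return [(row0 - b) % Q, (zig1 - b) % Q]

def rebuild_info2(a, row0, zig0):
    """Rebuild (b, d) from the three symbols a, a+b, and a+2d."""
    b = (row0 - a) % Q
    d = (zig0 - a) * 2 % Q   # 2 is its own inverse modulo 3
    return [b, d]

info1, info2, row_parity, zigzag_parity = encode(1, 2, 0, 2)
first = rebuild_info1(info2[0], row_parity[0], zigzag_parity[1])
second = rebuild_info2(info1[0], row_parity[0], zigzag_parity[0])
```

In both cases only the top-row symbols of two columns plus one zigzag symbol are read, which is half of what the three surviving columns store.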

15
ZigZag Codes
Lower bound: the minimum amount of data that must be read to recover a single drive failure
– (n,k) code: n drives, k information drives, and n-k redundancy drives
– M: size of a single drive in bits
For an (n,n-2) code, at least 1/2 of the data on the remaining drives must be read, that is, at least (1/2)(n-1)M bits
– The previous example is optimal
In general, for an (n,n-r) code, at least 1/r of the data on the remaining drives must be read, that is, at least (1/r)(n-1)M bits
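The bound is easy to evaluate numerically. The sketch below uses hypothetical function names and compares the bound with a rebuild that reads k = n-r whole drives, as a Reed-Solomon rebuild does; drive sizes are in arbitrary units.

```python
from fractions import Fraction

def min_repair_read(n, r, M):
    """Lower bound (1/r)(n-1)M on data read to rebuild one drive of an (n, n-r) code."""
    return Fraction(n - 1, r) * M

def full_decode_read(n, r, M):
    """Data read when k = n-r whole drives are read to rebuild one drive."""
    return (n - r) * M

# The (4,2) example with M = 2 symbols per drive: the bound is 3 symbols.
small_bound = min_repair_read(4, 2, 2)

# A 10 data + 4 parity layout with M = 1: bound 13/4 drives versus 10 drives.
large_bound = min_repair_read(14, 4, 1)
large_full = full_decode_read(14, 4, 1)
```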

16
ZigZag Codes Example
Columns: info 1, info 2, info 3, row parity, zigzag parity

17
ZigZag Codes Example
Columns: info 1, info 2, info 3, row parity, zigzag parity
0 2 1 0
1 3 0 1
2 0 3 2
3 1 2 3
