The MDS Scaling Problem for Cloud Storage
Yu-chong Hu, Institute of Network Coding, 22/07/2011

Presentation transcript:

Background
A webmaster wants to upload his website for Michael Jackson to the Internet.
• Dedicated Storage: purchase storage devices (servers) to meet his demand.
• Cloud Storage: purchase storage services to meet his demand.

Background
• The demand is dynamic
Normal case: dedicated storage $50,000; cloud storage Standard, $1 per hour.
During the 3 months after MJ died (BURST!): dedicated storage $100,000; cloud storage High Performance, $2 per hour.
Return to normal 3 months later: dedicated storage still $100,000 (waste); cloud storage Standard, $1 per hour.
Cloud storage always matches the demand: no waste.

Motivation
• Network applications
• Dynamic demand
• Cloud storage
• Scale the capacity up; scale the capacity down
• How to do the scaling as fast as possible?
• By minimizing the data migration (scaling traffic)

Motivation
• Replication-based data can be scaled very easily.
• Erasure coding-based data are more reliable. E.g., for a file (A, B) at the source node:
Replication: Node 1: A, Node 2: A, Node 3: B, Node 4: B. Tolerates one failure.
(4,2) MDS erasure code: Node 1: A, Node 2: B, Node 3: A+B, Node 4: A+2B. Tolerates two failures.
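The failure tolerance claimed for the two layouts can be checked by brute force. The sketch below is only an illustration: it assumes the symbols A and B live in a prime field (the size `P = 257` is a hypothetical choice, not something the slides state) and represents each node by the coefficients of A and B it stores.

```python
from itertools import combinations

P = 257  # hypothetical prime field size, chosen only for this sketch

# Coefficient rows (a, b): the node stores a*A + b*B (mod P).
REPLICATION = [(1, 0), (1, 0), (0, 1), (0, 1)]  # A, A, B, B
MDS_4_2 = [(1, 0), (0, 1), (1, 1), (1, 2)]      # A, B, A+B, A+2B


def decodable(pair):
    """True if two surviving nodes determine both A and B (non-zero determinant)."""
    (a, b), (c, d) = pair
    return (a * d - b * c) % P != 0


def survives_all(layout, failures):
    """True if the file is recoverable after every pattern of `failures` node losses."""
    n = len(layout)
    for lost in combinations(range(n), failures):
        alive = [layout[i] for i in range(n) if i not in lost]
        if not any(decodable(pair) for pair in combinations(alive, 2)):
            return False
    return True


if __name__ == "__main__":
    for name, layout in (("replication", REPLICATION), ("(4,2) MDS", MDS_4_2)):
        for f in (1, 2):
            print(name, "survives any", f, "failure(s):", survives_all(layout, f))
```

Both layouts store four blocks for a two-block file, so the extra tolerance of the MDS layout comes at no extra storage cost.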

MDS Scaling Problem
• Problem Statement: How to minimize the scaling traffic in a storage system based on MDS codes.
• Network Coding + MDS Scaling Problem: the source node feeds nodes 1 to n of an (n,k) MDS code; scaling traffic then flows to nodes 1 to n' of an (n',k') MDS code, and each data collector can rebuild the file from any k' of the n' nodes.
• MDS scaling problem = multicasting on the information flow graph.

MDS Scaling Problem
• Related work
1. Dress Codes [1]: The scaling problem is considered, but dress codes are designed primarily for optimal repair, and the value of k is fixed. The MDS scaling problem puts optimal scaling first, and k can be scaled to k'.
2. RAID Scaling [2, 3]: The authors use a "sliding window" technique to reduce the number of packets moved when scaling RAID codes, but they do not give a theoretical bound on the scaling traffic.
[1] Dress Codes for the Storage Cloud: Simple Randomized Constructions, Sameer Pawar et al.
[2] FastScale: Accelerate RAID Scaling by Minimizing Data Migration, W. Zheng et al., FAST 2011.
[3] ALV: A new data redistribution approach to RAID-5 scaling, G. Zhang et al., Transactions on Computers.

MDS Scaling Problem
• There are very few theoretical results about the scaling problem.
• No bounds obtained
• No optimal code and scaling schemes presented
• Outline of my talk:
• (n,k)MDS → (n+m,k)MDS
• Example 1: (3,2)MDS → (4,2)MDS
• Example 2: (3,2)MDS → (4,3)MDS
• (n,k)MDS → (n+m,k+m)MDS
• Example 3: (4,3)MDS → (5,4)MDS

Example 1: (3,2)MDS → (4,2)MDS
(3,2)MDS: data collector, any 2 of 3 nodes can rebuild file. Tolerates one failure.
(4,2)MDS: data collector, any 2 of 4 nodes can rebuild file. Tolerates two failures.
The scaling (n,k)MDS → (n+m,k)MDS increases data reliability because more failures can be tolerated.

Example 1: (3,2)MDS → (4,2)MDS
Source M: A B C D
(3,2)MDS (any 2 of 3 nodes can rebuild file): the three nodes store (A, B), (C, D) and (A⊕C, B⊕D).
Scaling: the blocks A, C⊕D and B⊕D are sent to the new node, which XORs them (⊕) into B⊕C and A⊕B⊕D.
(4,2)MDS (any 2 of 4 nodes can rebuild file): the four nodes store (A, B), (C, D), (A⊕C, B⊕D) and (B⊕C, A⊕B⊕D).
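The construction in Example 1 can be replayed mechanically. The following sketch is an illustration under two assumptions not spelled out on the slides: every block is modelled as a bitmask over the source blocks A, B, C, D, and each of the three transferred blocks is taken to come from the one node able to produce it. The helper names `rank_gf2` and `is_mds` are made up for this example.

```python
from itertools import combinations

# Each block is a bitmask over the source blocks: bit 0 = A, 1 = B, 2 = C, 3 = D.
A, B, C, D = 1, 2, 4, 8


def rank_gf2(vectors):
    """Rank of a collection of bitmask vectors over GF(2) (XOR basis)."""
    basis = {}  # highest set bit -> basis vector
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]
    return len(basis)


def is_mds(nodes, k, num_source_blocks):
    """True if every choice of k nodes spans all source blocks."""
    return all(
        rank_gf2([b for node in group for b in node]) == num_source_blocks
        for group in combinations(nodes, k)
    )


# (3,2) MDS layout from the slides.
nodes = [
    [A, B],          # first node
    [C, D],          # second node
    [A ^ C, B ^ D],  # parity node
]

# Blocks sent to the new node: three blocks, i.e. 3/4 of the file size M.
sent = [A, C ^ D, B ^ D]

# The new node XORs what it received into the two blocks it stores.
node4 = [sent[1] ^ sent[2], sent[0] ^ sent[2]]  # B⊕C and A⊕B⊕D

if __name__ == "__main__":
    print("old layout is (3,2) MDS:", is_mds(nodes, 2, 4))
    print("new layout is (4,2) MDS:", is_mds(nodes + [node4], 2, 4))
```

Only three blocks cross the network, whereas rebuilding the new node from a decoded copy of the file would move all four.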

Flow graph: 1-node Scaling = 1-loss Repairing
In the information flow graph, building the new node (B⊕C, A⊕B⊕D) during the scaling (3,2)MDS → (4,2)MDS is the same operation as a 1-loss repair of a (4,2) RC (regenerating code) that has lost that node.

Results
• Conclusion 1: The optimal scaling problem from (n,k)MDS to (n+1,k)MDS is equivalent to the 1-loss repair problem based on (n+1,k)RC.
• Conclusion 2: The optimal scaling problem from (n,k)MDS to (n+r,k)MDS is equivalent to the r-loss repair problem based on (n+r,k)RC for multiple failures.
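Conclusion 1 makes it possible to borrow a traffic figure from the regenerating-code literature. The sketch below assumes the minimum-storage regenerating point, where repairing one node from d helpers costs M·d / (k·(d − k + 1)), and further assumes that all n existing nodes act as helpers (d = n); neither assumption is stated on the slides, and the function names are hypothetical.

```python
from fractions import Fraction


def msr_repair_traffic(k, d, file_size=1):
    """Repair traffic at the minimum-storage point: M*d / (k*(d-k+1))."""
    if d < k:
        raise ValueError("need at least k helper nodes")
    return Fraction(file_size * d, k * (d - k + 1))


def one_node_scaling_traffic(n, k, file_size=1):
    """Traffic for (n,k) -> (n+1,k) via Conclusion 1, with all n old nodes helping."""
    return msr_repair_traffic(k, d=n, file_size=file_size)


if __name__ == "__main__":
    # Example 1 of the talk: (3,2)MDS -> (4,2)MDS
    print(one_node_scaling_traffic(3, 2))
```

For (3,2)MDS → (4,2)MDS this gives 3/4 of the file size, which agrees with the three blocks out of four transferred in Example 1.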

Open Problems
How to design the optimal scaling algorithms in which an (n,k)MDS code can grow to different MDS codes?
Motivation: In cloud storage, a consumer may at first select a default level of storage service with an availability of 99.99%. Later, the consumer wants to increase the availability, so the cloud storage service should offer several higher levels of storage redundancy for the user to select from.
Example: (3,2)MDS: 99.99%; (4,2)MDS: %; (5,2)MDS: %
Difficulty: This optimal scaling problem may not be equivalent to the optimal repair problem, because there are at least two optional scaling sizes.
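To see why growing n at fixed k raises availability, one can use a simple model. The sketch below assumes nodes are up independently with the same probability `p`, which is a modelling assumption of this example and not taken from the talk; it does not attempt to reproduce the availability percentages on the slide, and the value 0.99 is a placeholder.

```python
from math import comb


def mds_availability(n, k, p):
    """Probability that at least k of n nodes are up, each up independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    p = 0.99  # hypothetical per-node availability
    for n in (3, 4, 5):
        print(f"({n},2)MDS availability: {mds_availability(n, 2, p):.8f}")
```

Under this model each added node strictly increases availability, which is what makes several target redundancy levels worth offering.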

Example 2: (3,2)MDS → (4,3)MDS
(3,2)MDS: data collector, any 2 of 3 nodes can rebuild file.
(4,3)MDS: data collector, any 3 of 4 nodes can rebuild file.
(n,k)MDS → (n+m,k+m)MDS scaling gives higher I/O performance because the data collector obtains more I/O bandwidth. (Figure label: 10MB/s.)

Example 2: (3,2)MDS → (4,3)MDS
Source M: A B C D E F
(3,2)MDS (any 2 of 3 nodes can rebuild file): the three nodes store (A, B, C), (D, E, F) and (A⊕D, B⊕E, C⊕F).
Scaling: C and D are moved to the new node, and the parity blocks A⊕D, B⊕E, C⊕F are combined into A⊕D⊕B⊕E and B⊕E⊕C⊕F.
(4,3)MDS (any 3 of 4 nodes can rebuild file): the four nodes store (A, B), (E, F), (A⊕D⊕B⊕E, B⊕E⊕C⊕F) and (C, D).
Scaling traffic = M/3. The size of the new node is M/3, so the scaling traffic is minimal.
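Example 2 can be checked the same way as a bitmask computation over GF(2). This is a sketch under stated assumptions: blocks are modelled as bitmasks over A to F, the parity node is assumed to recompute its two new blocks locally from its three old ones (which is consistent with the stated traffic of M/3), and the helper names are invented for the illustration.

```python
from fractions import Fraction
from itertools import combinations

# Bitmask per source block: bit 0 = A, ..., bit 5 = F.
A, B, C, D, E, F = (1 << i for i in range(6))


def rank_gf2(vectors):
    """Rank of a collection of bitmask vectors over GF(2) (XOR basis)."""
    basis = {}  # highest set bit -> basis vector
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]
    return len(basis)


def is_mds(nodes, k, num_source_blocks=6):
    """True if every choice of k nodes spans all source blocks."""
    return all(
        rank_gf2([b for node in group for b in node]) == num_source_blocks
        for group in combinations(nodes, k)
    )


old_layout = [[A, B, C], [D, E, F], [A ^ D, B ^ E, C ^ F]]

moved = [C, D]  # the only blocks that cross the network
p1, p2, p3 = old_layout[2]
new_layout = [
    [A, B],
    [E, F],
    [p1 ^ p2, p2 ^ p3],  # A⊕D⊕B⊕E and B⊕E⊕C⊕F, computed locally
    moved,               # new node
]

scaling_traffic = Fraction(len(moved), 6)  # as a fraction of the file size M

if __name__ == "__main__":
    print("old layout is (3,2) MDS:", is_mds(old_layout, 2))
    print("new layout is (4,3) MDS:", is_mds(new_layout, 3))
    print("scaling traffic / M =", scaling_traffic)
```

Since the new node must end up holding M/3 of data, no scheme can move less than the two blocks moved here.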

Example 3: (4,3)MDS → (5,4)MDS
(4,3)MDS: data collector, any 3 of 4 nodes can rebuild file.
(5,4)MDS: data collector, any 4 of 5 nodes can rebuild file.
(n+m,k+m)MDS → (n+m+l,k+m+l)MDS scaling (iterative scaling) raises the I/O performance further with each step. (Figure label: 10MB/s.)

Example 3: (4,3)MDS → (5,4)MDS
In real storage systems, a data object is divided into a block stream for some practical reasons:
Data Object: A1 B1 C1 D1 E1 F1 A2 B2 C2 D2 E2 F2 ……
(4,3)MDS, Segment 1: C1 D1; A1 B1; E1 F1; parities A1⊕D1⊕B1⊕E1, B1⊕E1⊕C1⊕F1
(4,3)MDS, Segment 2: C2 D2; A2 B2; E2 F2; parities A2⊕D2⊕B2⊕E2, B2⊕E2⊕C2⊕F2
Consider two adjacent segments.

Example 3: (4,3)MDS → (5,4)MDS
Scaling: E1 and D1 are moved to the new node. The four parity blocks are combined into three:
(A1⊕D1⊕B1⊕E1) ⊕ (B1⊕E1⊕C1⊕F1) = A1⊕D1⊕C1⊕F1
(B1⊕E1⊕C1⊕F1) ⊕ (B2⊕E2⊕C2⊕F2) = B1⊕E1⊕C1⊕F1⊕B2⊕E2⊕C2⊕F2
A2⊕D2⊕B2⊕E2 (unchanged)
(5,4)MDS, data blocks as shown after scaling: C1 F1, E2 F2, C2 D2, A1 B1, A2 B2, E1 D1. Data collector: any 4 of 5 nodes can rebuild file.
Scaling traffic = 3 blocks. Optimal code.
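The parity merge in Example 3 and the resulting (5,4) property can be explored with a short search. Two things in the sketch below are assumptions rather than facts from the talk: the same node is taken to hold the same letters in both segments (A1 B1 A2 B2, and so on), and, because the transcript names only E1 and D1 as moved blocks while reporting a traffic of 3 blocks, the third moved block is treated as unknown and every candidate from the A/B node is tried. The helper names are hypothetical.

```python
from itertools import combinations

NAMES = "A1 B1 C1 D1 E1 F1 A2 B2 C2 D2 E2 F2".split()
V = {name: 1 << i for i, name in enumerate(NAMES)}  # bitmask per data block


def xor(*names):
    """Bitmask of the XOR of the named data blocks."""
    out = 0
    for name in names:
        out ^= V[name]
    return out


def rank_gf2(vectors):
    """Rank of a collection of bitmask vectors over GF(2) (XOR basis)."""
    basis = {}
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]
    return len(basis)


def is_mds(nodes, k):
    """True if every choice of k nodes spans all 12 data blocks."""
    return all(
        rank_gf2([b for node in group for b in node]) == len(NAMES)
        for group in combinations(nodes, k)
    )


# Old (4,3) parities of the two segments, then the merge shown on the slides.
p1, p2 = xor("A1", "D1", "B1", "E1"), xor("B1", "E1", "C1", "F1")
p3, p4 = xor("A2", "D2", "B2", "E2"), xor("B2", "E2", "C2", "F2")
parity = [p1 ^ p2, p2 ^ p4, p3]


def layout(third):
    """Assumed (5,4) layout when E1, D1 and `third` move to the new node."""
    ab_node = [V[n] for n in ("A1", "B1", "A2", "B2") if n != third]
    return [
        ab_node,
        [V["F1"], V["E2"], V["F2"]],
        [V["C1"], V["C2"], V["D2"]],
        parity,
        [V["E1"], V["D1"], V[third]],  # new node
    ]


# Which choices of the unknown third block give a (5,4) MDS layout?
valid_third = [t for t in ("A1", "B1", "A2", "B2") if is_mds(layout(t), 4)]

if __name__ == "__main__":
    print("third blocks that yield a (5,4) MDS layout:", valid_third)
```

Under these assumptions the search suggests the third moved block would have to come from segment 2 (A2 or B2); which one the original slides used cannot be told from the transcript.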

Open Problems
1. What is the theoretical bound on the scaling traffic from (n,k)MDS to (n+m,k+m)MDS?
2. How to design generalized optimal scaling algorithms that match the bound?
3. How to deal with more complicated cases, such as iterative scaling?
4. What about (n,k)MDS → (n-m,k-m)MDS?
Generalized form: How to minimize the scaling traffic in a storage system from (n,k) to (n',k') MDS codes.

Fin