Exact Regenerating Codes on Hierarchical Codes
Ernst Biersack, Eurecom, France
Joint work with Zhen Huang


Outline
:: Introduction and motivation
:: Hierarchical Codes
:: Regenerating Codes
:: Combining Hierarchical Codes and Regenerating Codes
:: Conclusion

Motivation: Elements of a P2P backup system
Performance metrics:
- Storage efficiency: how much redundant information do you store?
[Figure from Julian Monteiro]

Motivation: Network bandwidth is a scarce resource
- Our first objective is to find erasure codes that consume less communication bandwidth, i.e. that have a better efficiency factor ρ.
- Network communication bandwidth cannot be “put aside” for later use.
- A second objective should be to adopt repair policies that provide a smooth utilization of the communication bandwidth.

Hierarchical Codes

Linear Codes: Overview
- A particular way to build erasure codes is to use linear codes.
- Original fragments: $o_1, o_2, o_3, o_4$. Parity fragments: $p_1, \dots, p_6$.
- Each parity fragment is a linear combination of the original fragments, e.g. $p_1 = c_{1,1} o_1 + c_{1,2} o_2 + c_{1,3} o_3 + c_{1,4} o_4$.
- With $P = (p_1, \dots, p_4)^T$, $O = (o_1, \dots, o_4)^T$ and the $4 \times 4$ coefficient matrix $C = (c_{i,j})$: $P = CO$ and $O = C^{-1}P$.
- If $C$ is invertible, i.e. the coefficient vectors are linearly independent, we can reconstruct the original fragments.
- If the coefficients are chosen randomly in GF($2^{16}$), the matrix is invertible with very high probability.
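The encode/decode relation P = CO, O = C^{-1}P can be sketched in a few lines. This is only an illustration under stated assumptions: it works over the prime field GF(65537) as a simple stand-in for the GF(2^16) arithmetic mentioned on the slide, and the names `encode` and `solve` are hypothetical helpers, not part of any library.

```python
import random

P = 65537  # prime modulus; an assumed stand-in for GF(2^16) to keep the sketch short


def encode(original, n_parity, rng):
    """Return (coeffs, parity) with parity[i] = sum_j coeffs[i][j] * original[j] mod P."""
    k = len(original)
    coeffs = [[rng.randrange(P) for _ in range(k)] for _ in range(n_parity)]
    parity = [sum(c * o for c, o in zip(row, original)) % P for row in coeffs]
    return coeffs, parity


def solve(C, p):
    """Solve C * o = p over GF(P) by Gauss-Jordan elimination; None if C is singular."""
    k = len(C)
    A = [row[:] + [rhs] for row, rhs in zip(C, p)]  # augmented matrix [C | p]
    for col in range(k):
        pivot = next((r for r in range(col, k) if A[r][col] % P != 0), None)
        if pivot is None:
            return None  # coefficient vectors are linearly dependent
        A[col], A[pivot] = A[pivot], A[col]
        inv = pow(A[col][col], -1, P)  # modular inverse (Python 3.8+)
        A[col] = [(x * inv) % P for x in A[col]]
        for r in range(k):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(x - f * y) % P for x, y in zip(A[r], A[col])]
    return [A[r][k] for r in range(k)]


rng = random.Random(7)
original = [11, 22, 33, 44]              # o1..o4
coeffs, parity = encode(original, 6, rng)  # p1..p6
chosen = [0, 2, 3, 5]                    # any 4 parity fragments
decoded = solve([coeffs[i] for i in chosen], [parity[i] for i in chosen])
print(decoded)  # the original fragments, unless the random matrix happened to be singular
```

With random coefficients, a singular 4 x 4 matrix is very unlikely, which is the point made on the slide; `solve` returns `None` in that rare case instead of guessing.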

Hierarchical codes: Idea
Let us try to change the way the code is built.
[Figure: original fragments o1..o4 and parity fragments p1..p7, shown once for a traditional erasure code and once for a hierarchical code.]
In the hierarchical code, there are sets of 4 parity fragments that are not sufficient to reconstruct the original file.
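The claim that some sets of 4 parity fragments cannot reconstruct the file can be checked by computing the rank of their coefficient vectors. The sketch below assumes a hypothetical layout (not taken from the slide's figure): p1..p3 combine only o1 and o2, p4..p6 combine only o3 and o4, and p7 combines all four. The coefficients are hand-picked and the arithmetic is done modulo the prime 65537 as a stand-in for GF(2^16).

```python
from itertools import combinations

P = 65537  # prime field, assumed stand-in for GF(2^16)


def rank_mod_p(rows):
    """Rank of a list of coefficient vectors over GF(P) (Gauss-Jordan elimination)."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % P), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, P)
        rows[rank] = [(x * inv) % P for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % P:
                f = rows[i][col]
                rows[i] = [(x - f * y) % P for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


# Hypothetical hierarchical layout: coefficient vectors over (o1, o2, o3, o4).
HIERARCHICAL = {
    "p1": [1, 1, 0, 0], "p2": [1, 2, 0, 0], "p3": [1, 3, 0, 0],  # group {o1, o2}
    "p4": [0, 0, 1, 1], "p5": [0, 0, 1, 2], "p6": [0, 0, 1, 3],  # group {o3, o4}
    "p7": [1, 4, 1, 4],                                          # all originals
}


def insufficient_sets(code, k=4):
    """All k-subsets of fragments whose coefficient vectors do not span GF(P)^k."""
    names = sorted(code)
    return [s for s in combinations(names, k)
            if rank_mod_p([code[n] for n in s]) < k]


bad = insufficient_sets(HIERARCHICAL)
print(len(bad), "four-fragment sets cannot decode, e.g.", bad[0])
```

In this assumed layout the failing sets are exactly those that take three fragments from the same group, since three combinations of two originals carry only two independent equations.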

Hierarchical codes: Repair degree
[Figure: a failure and its repair, shown once with ρ = 4 and once with ρ = 2.]
The repair degree determines the efficiency factor ρ.
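A small sketch of why a lower repair degree is possible: if a lost fragment belongs to a group that combines only two original fragments, it can be rebuilt from two surviving fragments of that group instead of four arbitrary ones. The group layout, the coefficients, and the helper `combine` are assumptions made for illustration, with arithmetic modulo the prime 65537 standing in for GF(2^16).

```python
P = 65537  # prime field, assumed stand-in for GF(2^16)


def combine(coeffs, fragments):
    """Linear combination of fragments over GF(P)."""
    return sum(c * f for c, f in zip(coeffs, fragments)) % P


o1, o2 = 1000, 2000            # the two original fragments of one group

# Hypothetical group-local parity fragments (combinations of o1 and o2 only).
p1 = combine([1, 1], [o1, o2])
p2 = combine([1, 2], [o1, o2])
p3 = combine([1, 3], [o1, o2])

# p1 is lost. Its coefficient vector satisfies (1,1) = 2*(1,2) - (1,3),
# so p1 = 2*p2 - p3: two downloads are enough (repair degree 2).
repaired_p1 = combine([2, P - 1], [p2, p3])   # P - 1 plays the role of -1 mod P

print(repaired_p1 == p1)
```

A code without this group structure would, in general, need k = 4 fragments for the same repair, which corresponds to the ρ = 4 versus ρ = 2 contrast on the slide.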

Hierarchical codes: Recursive Construction
HC-(k,h)
- k original blocks
- h redundant blocks

Hierarchical codes: Theory

Hierarchical codes: Repair
- What if p_1 and p_3 are lost? Use p_2, 1 out of {p_7, p_8} and 1 out of {p_4, p_5, p_6} → need 3 blocks
- What if p_1, p_2, and p_3 are lost? Use ….. → need ???? blocks
In HC, the earlier we repair, the “cheaper” the repair often is.

Hierarchical codes: Reliability vs Cost
Two possible instances of a hierarchical code
- Lower repair cost comes at the price of reduced reliability.

Regenerating Codes

Regenerating Codes: Idea
What happens if…
- upon a repair we contact more than k peers (d > k)?
- every peer stores a parity block that is larger than (or equal to) the usual parity fragment, i.e. 1/k of the file size, so that |block| ≥ |file|/k?
Regenerating codes (by A. G. Dimakis) give the answer: the repair communication requirements are much smaller.
[Figure: repair of p'4 from peers p1, p2, p5, p7, p8 with d > k; original fragments o1..o4 and a stored block b1.]

Regenerating codes: Performance
- Regenerating codes are controlled by two additional parameters beyond k and h:
:: d, the repair degree, with k ≤ d ≤ k+h-1
:: i, the block expansion index, with 0 ≤ i ≤ k-1
- If we consider a regenerating code with k=32 and h=32:
[Figure with labelled points: classical erasure codes; MBR (Minimum-Bandwidth Regenerating); MSR (Minimum-Storage Regenerating).]

Regenerating codes: Performance
- k=32 and h=32 and a stored file of 1 MB:

code                             d          i          repair down    storage
Classical erasure code           32         0          1 MB           2 MB
“Extreme” regenerating code      [missing]  [missing]  [missing] KB   2.61 MB
“Reasonable” regenerating code   40         7          84.62 KB       2.11 MB

Communication is impressively reduced with a small amount of extra storage.
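The repair and storage figures for k=32, h=32 can be reproduced with a short calculation. The sketch below is not taken from the slides: it assumes the corner-point formulas of the storage/bandwidth trade-off from the regenerating-codes literature, parametrized by (d, i) so that i = 0 gives MSR and i = k-1 gives MBR. The function name `regenerating_code_costs` and its return keys are hypothetical.

```python
from fractions import Fraction


def regenerating_code_costs(k, h, d, i, file_size=1):
    """Per-block size, repair download and total storage, in units of file_size.

    Assumed formulas (trade-off corner points):
      beta  = 2 * file_size / (2k(d-k+1) + i(2k-i-1))   data sent by each helper
      alpha = (d - k + i + 1) * beta                    size of one stored block
    """
    if not k <= d <= k + h - 1:
        raise ValueError("repair degree must satisfy k <= d <= k+h-1")
    if not 0 <= i <= k - 1:
        raise ValueError("block expansion index must satisfy 0 <= i <= k-1")
    beta = Fraction(2, 2 * k * (d - k + 1) + i * (2 * k - i - 1)) * file_size
    alpha = (d - k + i + 1) * beta
    return {
        "block": alpha,                 # stored per peer
        "repair": d * beta,             # total download for one repair
        "storage": (k + h) * alpha,     # total over all k+h peers
    }


classical = regenerating_code_costs(32, 32, d=32, i=0)
reasonable = regenerating_code_costs(32, 32, d=40, i=7)

for name, c in [("classical", classical), ("reasonable", reasonable)]:
    print(name,
          "repair = %.2f KB" % float(c["repair"] * 1024),   # 1 MB taken as 1024 KB
          "storage = %.2f MB" % float(c["storage"]))
```

Under these assumptions the classical case (d=32, i=0) gives a 1 MB repair download and 2 MB of storage, and (d=40, i=7) gives roughly 84.6 KB and 2.1 MB, in line with the figures quoted for those two codes.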

Regenerating codes: A new dimension in the trade-off
[Figure: storage vs. communication, with points for erasure codes and replication.]
Regenerating codes can be seen as a generalization of replication and RSE that allows communication and storage requirements to be traded off more flexibly.
RC(k,h,d,i)
- k original pieces
- h additional pieces
- d repair degree
- i block expansion factor

Regenerating codes: Want to know more?
See the wiki on Coding for Distributed Storage maintained by Alexandros G. Dimakis.

ER-Hierarchical Codes

ER-Hierarchical Codes
Can we combine Hierarchical Codes and Regenerating Codes?
Yes: ER-Hierarchical Codes combine concepts of Hierarchical Codes and Regenerating Codes, namely that
- most parity blocks are linear combinations of only a small subset of all original blocks, and
- a storage block consists of α fragments, while a repair block has only β fragments, with β < α.

ER-Hierarchical Codes: Construction
How do we transform a Hierarchical Code into an ER-Hierarchical Code?

ER-Hierarchical Codes: Construction

ER-Hierarchical Codes: Repair
- In HC we would need to download 4 blocks of size 1 each → 4 units of traffic
- In ER-HC we now download 5 fragments of size ½ each → 2.5 units of traffic
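The saving in this single repair example follows directly from the two figures above. The symbols $T_{\text{HC}}$ and $T_{\text{ER-HC}}$ are introduced here only for the calculation and do not appear on the slides.

```latex
\begin{align*}
T_{\text{HC}} &= 4 \times 1 = 4 \text{ units},\\
T_{\text{ER-HC}} &= 5 \times \tfrac{1}{2} = 2.5 \text{ units},\\
\frac{T_{\text{HC}} - T_{\text{ER-HC}}}{T_{\text{HC}}} &= \frac{1.5}{4} = 37.5\%.
\end{align*}
```

This is the reduction for this one example only; it is not an average over failure patterns.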

ER-Hierarchical Codes: Traffic reduction (analysis)
ER-HC reduces the traffic by
- more than 85% as compared to RSE and Regenerating Codes
- 40% as compared to Hierarchical Codes
The Regenerating Code used for comparison is MSR with d=k+1.

ER-Hierarchical Codes: Repair Strategies

ER-Hierarchical Codes: Performance (simulation)
In HC and ER-HC, the earlier we repair, the “cheaper” the repair; this is not the case for RG and RSE.

Conclusion
- We have presented some new codes that greatly reduce the communications overhead.
- Regenerating codes apply principles of network coding to distributed storage and allow storage space to be traded off for communications bandwidth.
- As compared to RSE codes:
:: Regenerating codes increase the repair degree (number of nodes that must be contacted for repair) but significantly reduce the amount of data downloaded from each node.
:: Hierarchical codes significantly reduce the repair degree while keeping the amount of data transferred by each node the same (as RSE).
- Combining Regenerating Codes and Hierarchical Codes lets us win on both fronts: it reduces both the repair degree and the amount of data transmitted by each node.

Future work
- Further exploit the possibilities offered by ER-Hierarchical Codes.
- Study the relationship between coding and repair policies for systems with churn:
:: Reactive repair results in repair bursts.
:: Proactive repair has smoother repair traffic but does unnecessary repairs.
- If repairs are cheap, as they are for ER-HC, proactive repair becomes much more attractive, since the earlier we repair, the cheaper the repair.

Thanks. Questions?