A Comprehensive Study on RAID6 Codes: Horizontal vs. Vertical Chao Jin*, Dan Feng*, Hong Jiang†, Lei Tian*† *Huazhong University of Science and Technology.

Presentation transcript:

A Comprehensive Study on RAID6 Codes: Horizontal vs. Vertical Chao Jin*, Dan Feng*, Hong Jiang†, Lei Tian*† *Huazhong University of Science and Technology †University of Nebraska-Lincoln

Outline
  – Background
  – Code-Length Extending Algorithm for Vertical RAID6 Codes
  – Performance Analysis and Comparison
  – Conclusion

Background
RAID6 Codes
  – Can tolerate two concurrent disk failures (column erasures)
Horizontal RAID6 Codes
  – Parity blocks held in dedicated parity columns
  – Data blocks held in data columns
Vertical RAID6 Codes
  – No dedicated parity column
  – Data columns hold both data and parity blocks
Examples: RDP (horizontal code); P-Code (vertical code)

Background
RAID6 Code-Length Restrictions
  – The code length (number of columns in the code structure) cannot be an arbitrary number
  – It is usually tied to a prime number
  – RDP: prime+1; P-Code: prime-1, prime; …
RAID6 Code-Length Extensions
  – Horizontal codes: easy to extend, simply by removing data columns directly
  – Vertical codes: harder; they cannot be extended the way horizontal codes are

Code-Length Extensions for Horizontal Codes
  – Code-length extension: shortening a standard code
  – Some columns are removed directly from the standard code
  – The result is an extended code with a shorter code length
  – Removed columns are assumed to contain only zeros, so they do not affect the fault tolerance of the extended code

Standard RDP code with 6 columns (each block is labeled with the parity stripes it belongs to):

D1     D2     D3     D4     P      Q
(1,5)  (1,6)  (1,7)  (1,8)  (1)    (5)
(2,6)  (2,7)  (2,8)  (2)    (2,5)  (6)
(3,7)  (3,8)  (3)    (3,5)  (3,6)  (7)
(4,8)  (4)    (4,5)  (4,6)  (4,7)  (8)

Assume one column contains only zeros and remove it from the code structure; the result is an extended RDP code with 5 columns.
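The label layout in the RDP figure can be regenerated programmatically. The following is a minimal Python sketch, assuming the usual RDP rule that block (r, c) lies on diagonal (r + c) mod p and that diagonal p-1 has no parity block. The names `rdp_layout`, `shorten` and `fmt` are hypothetical, and the data column to drop is a parameter because the slide does not say which column is removed.

```python
def rdp_layout(p):
    """Return the RDP label layout for prime p as a list of p+1 columns.

    Labels 1..p-1 name row-parity stripes and labels p..2p-2 name
    diagonal-parity stripes, following the numbering of the slide's figure.
    """
    rows = p - 1
    cols = []
    for c in range(p):                  # p-1 data columns, then row-parity column P
        col = []
        for r in range(rows):
            labels = [r + 1]            # row-parity stripe of row r
            d = (r + c) % p             # diagonal index of block (r, c)
            if d != p - 1:              # diagonal p-1 has no parity block
                labels.append(p + d)
            col.append(tuple(labels))
        cols.append(col)
    cols.append([(p + d,) for d in range(rows)])   # diagonal-parity column Q
    return cols


def shorten(cols, data_col):
    """Horizontal shortening: drop one data column, treating its blocks as zeros."""
    if not 0 <= data_col < len(cols) - 2:
        raise ValueError("only data columns can be removed")
    return cols[:data_col] + cols[data_col + 1:]


def fmt(block):
    """Render a label tuple the way the slide writes it, e.g. (1,5) or (4)."""
    return "(" + ",".join(str(s) for s in block) + ")"


standard = rdp_layout(5)            # 6 columns: D1..D4, P, Q
extended = shorten(standard, 0)     # 5 columns; the removed column counts as zeros
for row in zip(*extended):
    print("  ".join(fmt(b) for b in row))
```

Because the removed blocks are treated as zeros, the remaining labels stay valid and no parity stripe has to be redefined, which is why shortening is straightforward for horizontal codes.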

Code-Length Extensions for Vertical Codes
  – Vertical codes cannot be shortened directly like horizontal codes
  – Each column contains not only data blocks but also parity blocks
  – Removing a parity block may leave its parity stripe in an inconsistent state
  – Two extending algorithms for vertical codes are proposed

P-Code with 6 columns:

D1     D2     D3     D4     D5     D6
(2,6)  (3,6)  (1,2)  (1,3)  (1,4)  (1,5)
(3,5)  (4,5)  (4,6)  (5,6)  (2,3)  (2,4)
(1)    (2)    (3)    (4)    (5)    (6)

If column D6 is removed, parity stripe P(6), whose data blocks are (2,6), (3,6), (4,6) and (5,6), loses its parity block (6) and with it its failure-recovery ability.
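The problem with removing a column from a vertical code can be checked on the P-Code labels themselves. This is a minimal Python sketch, assuming the placement rule that can be read off the figure: data block (a, b) sits in the column numbered (a + b) mod p, and column j also holds parity block (j). The helper names `pcode_layout` and `orphaned_stripes` are hypothetical.

```python
from itertools import combinations


def pcode_layout(p):
    """Map column number -> list of blocks for a P-Code with p-1 columns.

    Data block (a, b) belongs to parity stripes a and b; the one-label
    block (j,) is the parity block of stripe j (assumed placement rule).
    """
    cols = {j: [] for j in range(1, p)}
    for a, b in combinations(range(1, p), 2):
        j = (a + b) % p
        if j != 0:                  # pairs summing to 0 mod p are not used
            cols[j].append((a, b))
    for j, col in cols.items():
        col.append((j,))            # parity block of stripe j, kept last
    return cols


def orphaned_stripes(cols):
    """Stripes that still have data blocks but no parity block."""
    parity = {b[0] for col in cols.values() for b in col if len(b) == 1}
    used = {s for col in cols.values() for b in col if len(b) == 2 for s in b}
    return sorted(used - parity)


full = pcode_layout(7)
without_d6 = {j: col for j, col in full.items() if j != 6}
print(orphaned_stripes(full))        # stripes without parity in the full code
print(orphaned_stripes(without_d6))  # stripes without parity once D6 is gone
```

Dropping D6 leaves the data blocks of stripe 6 in place in columns D1 to D4 while their parity block disappears, which is the inconsistency the slide describes.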

Code-Length Extensions for Vertical Codes
First extending algorithm
  – Select a new parity block for the parity stripe
  – The new parity block is originally a data block of that parity stripe
  – The parity stripe remains consistent!

[Figure: P-Code with 6 columns (D1 to D6); one of the data blocks of parity stripe P(6), i.e. (2,6), (3,6), (4,6) or (5,6), takes over as the stripe's parity block.]

Code-Length Extensions for Vertical Codes
Second extending algorithm
  – Remove the entire parity stripe from the code structure
  – Additional data blocks may be removed to keep an equal number of blocks per column
  – Removed blocks may be assumed to be zero blocks

[Figure: P-Code with 6 columns (D1 to D6); the blocks of parity stripe P(6), namely (2,6), (3,6), (4,6) and (5,6), are removed from the code structure.]
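The second extending algorithm can be tried out on the 6-column P-Code. This is a hedged Python sketch, not the authors' implementation: it assumes the placement rule visible in the figure (data block (a, b) in column (a + b) mod p, parity block (j) in column j), and it trims surplus data blocks from the top of a column, a choice the slide leaves open. All function names are hypothetical. Recoverability is tested by checking, over GF(2), that the parity equations determine every block of the two failed columns.

```python
from fractions import Fraction
from itertools import combinations


def pcode_layout(p):
    """Column number -> blocks; (a, b) is data, (j,) is the parity of stripe j."""
    cols = {j: [] for j in range(1, p)}
    for a, b in combinations(range(1, p), 2):
        if (a + b) % p:
            cols[(a + b) % p].append((a, b))
    for j, col in cols.items():
        col.append((j,))
    return cols


def extend_second(cols, s):
    """Drop column s and every data block of stripe s, then equalize heights."""
    kept = {j: [b for b in col if s not in b]
            for j, col in cols.items() if j != s}
    height = min(len(col) for col in kept.values())
    # Assumption: surplus data blocks are dropped from the top; parity stays last.
    return {j: col[len(col) - height:] for j, col in kept.items()}


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def tolerates_two_failures(cols):
    """True if every pair of lost columns can be rebuilt from the parity stripes."""
    stripes = [b[0] for col in cols.values() for b in col if len(b) == 1]
    for f1, f2 in combinations(cols, 2):
        unknown = cols[f1] + cols[f2]
        equations = [sum(1 << i for i, b in enumerate(unknown) if s in b)
                     for s in stripes]
        if gf2_rank(equations) != len(unknown):
            return False
    return True


def space_efficiency(cols):
    blocks = [b for col in cols.values() for b in col]
    return Fraction(sum(len(b) == 2 for b in blocks), len(blocks))


full = pcode_layout(7)
naive = {j: col for j, col in full.items() if j != 6}   # column removed, stripe 6 kept
extended = extend_second(full, 6)
print(tolerates_two_failures(naive), tolerates_two_failures(extended))
print(space_efficiency(extended), "vs MDS optimum", Fraction(5 - 2, 5))
```

Under these assumptions the extended code has five columns with one data block and one parity block each, so its space efficiency falls below (n-2)/n, in line with the deck's description of P-Code – second as a non-MDS code.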

RAID6 Code Performance Metrics (n is the code length, i.e. the number of columns)
Space Efficiency
  – Ratio of data volume to total volume (data plus parity). [Optimal (highest, MDS codes): (n-2)/n]
  – Determines the redundancy rate of the RAID6 system
Update Complexity
  – Average number of parity blocks that must be updated for each data block update. [Optimal (lowest): 2]
  – Affects the write overhead of the RAID6 system
Computational Complexity
  – Average number of XORs per data block during encoding/decoding. [Optimal (lowest) for MDS codes: 2 - 2/(n-2) and n-3, respectively]
  – Affects the CPU overhead of the RAID6 system
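For concrete numbers, the optimal values quoted above can be evaluated for a given code length. This is a small Python sketch with hypothetical function names. It reads the two bracketed computational-complexity values as the encoding and decoding optima respectively, which is an interpretation of the slide rather than something it states outright.

```python
from fractions import Fraction


def optimal_space_efficiency(n):
    """(n-2)/n: an MDS RAID6 code spends exactly two columns' worth on parity."""
    return Fraction(n - 2, n)


def optimal_encoding_xors(n):
    """2 - 2/(n-2) XORs per data block (assumed to be the encoding optimum)."""
    return 2 - Fraction(2, n - 2)


def optimal_decoding_xors(n):
    """n-3 XORs (assumed to be the decoding optimum)."""
    return n - 3


for n in (5, 6, 8):
    print(n, optimal_space_efficiency(n),
          optimal_encoding_xors(n), optimal_decoding_xors(n))
```

Exact fractions are used so that values for different code lengths can be compared without rounding.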

Space Efficiency
Space efficiency comparison for RDP and P-Code
  – Non-standard code lengths are obtained with the code-length extending algorithms
  – "P-Code – first" and "P-Code – second" refer to P-Code extended by the first and second extending algorithms for vertical codes
  – RDP and P-Code – first are MDS codes with optimal space efficiency; P-Code – second is a non-MDS code

Computational Complexity
Computational complexity comparison for RDP and P-Code
  – RDP and P-Code – first have non-optimal (higher) computational complexity at extended code lengths
  – P-Code – second has even lower computational complexity than the optimum for MDS codes
  – This shows that non-MDS codes can have lower computational complexity than MDS codes, demonstrating a trade-off between space efficiency and computational complexity

Update Complexity
Update complexity comparison between RDP and P-Code
  – P-Code – second always has the optimal update complexity of 2
  – P-Code – first has non-optimal update complexity at extended lengths
  – RDP has non-optimal update complexity, with an asymptotic value of 3
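The standard-length update complexities can be reproduced by counting. This is a minimal Python sketch with hypothetical function names, under two assumptions. For RDP, block (r, c) lies on diagonal (r + c) mod p, diagonal p-1 has no parity block, and the row-parity column takes part in the diagonals, so updating a data block can also change the diagonal parity that covers its row-parity block. For P-Code, every data block belongs to exactly two parity stripes.

```python
from fractions import Fraction
from itertools import combinations


def rdp_update_complexity(p):
    """Average number of parity blocks rewritten per data-block update in RDP."""
    total = 0
    for r in range(p - 1):
        for c in range(p - 1):               # data columns only
            updates = 1                      # row-parity block of row r
            if (r + c) % p != p - 1:         # the block's own diagonal has parity
                updates += 1
            if (r + p - 1) % p != p - 1:     # diagonal of the row-parity block
                updates += 1
            total += updates
    return Fraction(total, (p - 1) ** 2)


def pcode_update_complexity(p):
    """Average number of parity stripes per data block in a standard P-Code."""
    data = [(a, b) for a, b in combinations(range(1, p), 2) if (a + b) % p]
    return Fraction(sum(len(block) for block in data), len(data))


for p in (5, 7, 13, 101):
    print(p, float(rdp_update_complexity(p)), pcode_update_complexity(p))
```

Under these assumptions the RDP average works out to 3 - (2p-3)/(p-1)^2, which climbs toward 3 as p grows, while the P-Code value stays at 2.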

Summary of RAID-6 Codes
Space Efficiency (standard lengths / extended lengths)
  – RDP: MDS (optimal) / MDS (optimal)
  – P-Code – first: MDS (optimal) / MDS (optimal)
  – P-Code – second: non-MDS (lower than optimal)
Computational Complexity (standard lengths / extended lengths)
  – RDP: optimal of MDS / higher than optimal of MDS
  – P-Code – first: optimal of MDS / higher than optimal of MDS
  – P-Code – second: lower than optimal of MDS
Update Complexity (standard lengths / extended lengths)
  – RDP: > 2 (higher than optimal) / > 2 (higher than optimal)
  – P-Code – first: 2 (optimal) / > 2 (higher than optimal)
  – P-Code – second: 2 (optimal)

Vertical shortening of vertical RAID6 codes
  – Data rows, instead of columns, are removed from the vertical code structure
  – Data rows contain no parity blocks, so vertical shortening does not damage parity stripe consistency
  – Vertically shortened codes are non-MDS codes, but have lower computational complexity
  – They provide trade-offs between space efficiency and computational complexity

[Figure: P-Code with 6 columns (D1 to D6); the blocks of a data row are assumed to be zero blocks and removed, giving a vertically shortened P-Code.]

Conclusions
  – Horizontal codes are easy to extend to eliminate the code-length restrictions, but vertical codes are not.
  – We proposed two extending algorithms for vertical codes. The first selects a new parity block for the parity stripe; the second removes the entire parity stripe from the code structure.
  – We compared the performance of horizontal and vertical codes at consecutive code lengths, and studied the impact of the code-length extending algorithms on the performance of RAID6 codes.
  – We proposed a vertical shortening algorithm for vertical codes, which provides a trade-off between space efficiency and computational complexity.

Thanks!