Module – 3 Data protection – raid

Slides:



Advertisements
Similar presentations
Storage Management Lecture 7.
Advertisements

By Rakshith Venkatesh Outline What is RAID? RAID configurations used. Performance of each configuration. Implementations. Way.
A Case for Redundant Arrays Of Inexpensive Disks Paper By David A Patterson Garth Gibson Randy H Katz University of California Berkeley.
Redundant Array of Independent Disks (RAID) Striping of data across multiple media for expansion, performance and reliability.
Data Storage Solutions Module 1.2. Data Storage Solutions Upon completion of this module, you will be able to: List the common storage media and solutions.
RAID Redundant Array of Inexpensive Disks Presented by Greg Briggs.
Copyright © 2009 EMC Corporation. Do not Copy - All Rights Reserved.
1 Lecture 18: RAID n I/O bottleneck n JBOD and SLED n striping and mirroring n classic RAID levels: 1 – 5 n additional RAID levels: 6, 0+1, 10 n RAID usage.
1 Jason Drown Mark Rodden (Redundant Array of Inexpensive Disks) RAID.
RAID A RRAYS Redundant Array of Inexpensive Discs.
RAID: Redundant Array of Inexpensive Disks Supplemental Material not in book.
XtremIO Data Protection (XDP) Explained
 RAID stands for Redundant Array of Independent Disks  A system of arranging multiple disks for redundancy (or performance)  Term first coined in 1987.
Database Administration and Security Transparencies 1.
Enhanced Availability With RAID CC5493/7493. RAID Redundant Array of Independent Disks RAID is implemented to improve: –IO throughput (speed) and –Availability.
© 2009 EMC Corporation. All rights reserved. EMC Proven Professional The #1 Certification Program in the information storage and management industry Data.
RAID Redundant Arrays of Inexpensive Disks –Using lots of disk drives improves: Performance Reliability –Alternative: Specialized, high-performance hardware.
R.A.I.D. Copyright © 2005 by James Hug Redundant Array of Independent (or Inexpensive) Disks.
Chapter 3 Presented by: Anupam Mittal.  Data protection: Concept of RAID and its Components Data Protection: RAID - 2.
WHAT IS RAID? Christopher J Dutra Seton Hall University.
Chapter 5: Server Hardware and Availability. Hardware Reliability and LAN The more reliable a component, the more expensive it is. Server hardware is.
This courseware is copyrighted © 2011 gtslearning. No part of this courseware or any training material supplied by gtslearning International Limited to.
RAID CS5493/7493. RAID : What is it? Redundant Array of Independent Disks configured into a single logical storage unit.
REDUNDANT ARRAY OF INEXPENSIVE DISCS RAID. What is RAID ? RAID is an acronym for Redundant Array of Independent Drives (or Disks), also known as Redundant.
Module – 11 Local Replication
Module – 12 Remote Replication
Module – 7 network-attached storage (NAS)
© 2009 IBM Corporation Statements of IBM future plans and directions are provided for information purposes only. Plans and direction are subject to change.
Session 3 Windows Platform Dina Alkhoudari. Learning Objectives Understanding Server Storage Technologies Direct Attached Storage DAS Network-Attached.
Data Storage Willis Kim 14 May Types of storages Direct Attached Storage – storage hardware that connects to a single server Direct Attached Storage.
By : Nabeel Ahmed Superior University Grw Campus.
Redundant Array of Inexpensive Disks (RAID). Redundant Arrays of Disks Files are "striped" across multiple spindles Redundancy yields high data availability.
Chapter 6 RAID. Chapter 6 — Storage and Other I/O Topics — 2 RAID Redundant Array of Inexpensive (Independent) Disks Use multiple smaller disks (c.f.
Lecture 13 Fault Tolerance Networked vs. Distributed Operating Systems.
LAN / WAN Business Proposal. What is a LAN or WAN? A LAN is a Local Area Network it usually connects all computers in one building or several building.
CSE 321b Computer Organization (2) تنظيم الحاسب (2) 3 rd year, Computer Engineering Winter 2015 Lecture #4 Dr. Hazem Ibrahim Shehata Dept. of Computer.
Redundant Array of Independent Disks
RAID: High-Performance, Reliable Secondary Storage Mei Qing & Chaoxia Liao Nov. 20, 2003.
Two or more disks Capacity is the same as the total capacity of the drives in the array No fault tolerance-risk of data loss is proportional to the number.
Architecture of intelligent Disk subsystem
Data Center Infrastructure
N-Tier Client/Server Architectures Chapter 4 Server - RAID Copyright 2002, Dr. Ken Hoganson All rights reserved. OS Kernel Concept RAID – Redundant Array.
Parity Logging O vercoming the Small Write Problem in Redundant Disk Arrays Daniel Stodolsky Garth Gibson Mark Holland.
Redundant Array of Inexpensive Disks aka Redundant Array of Independent Disks (RAID) Modified from CCT slides.
CSI-09 COMMUNICATION TECHNOLOGY FAULT TOLERANCE AUTHOR: V.V. SUBRAHMANYAM.
Copyright © 2014 EMC Corporation. All Rights Reserved. Block Storage Provisioning and Management Upon completion of this module, you should be able to:
Module – 4 Intelligent storage system
Redundant Array of Independent Disks.  Many systems today need to store many terabytes of data.  Don’t want to use single, large disk  too expensive.
"1"1 Introduction to Managing Data " Describe problems associated with managing large numbers of disks " List requirements for easily managing large amounts.
The concept of RAID in Databases By Junaid Ali Siddiqui.
Αρχιτεκτονική Υπολογιστών Ενότητα # 6: RAID Διδάσκων: Γεώργιος Κ. Πολύζος Τμήμα: Πληροφορικής.
RAID Systems Ver.2.0 Jan 09, 2005 Syam. RAID Primer Redundant Array of Inexpensive Disks random, real-time, redundant, array, assembly, interconnected,
Raid Techniques. Redundant Array of Independent Disks RAID is a great system for increasing speed and availability of data. More data protection than.
WINDOWS SERVER 2003 Genetic Computer School Lesson 12 Fault Tolerance.
© 2006 EMC Corporation. All rights reserved. Section 2 – Storage Systems Architecture Introduction.
Modul ke: Fakultas Program Studi Teknologi Pusat Data 13 FASILKOM Teknik Informatika Infrastruktur Pusat Data.
Hands-On Microsoft Windows Server 2008 Chapter 7 Configuring and Managing Data Storage.
Seminar on RAID TECHNOLOGY Redundant Array of Independent Disk By CHANDAN.R 8 TH ISE, 1ap05is013 Under the guidance of Mr.Mithun.B.N, Lecturer,Dept.ISE.
Enhanced Availability With RAID CC5493/7493. RAID Redundant Array of Independent Disks RAID is implemented to improve: –IO throughput (speed) and –Availability.
This courseware is copyrighted © 2016 gtslearning. No part of this courseware or any training material supplied by gtslearning International Limited to.
RAID TECHNOLOGY RASHMI ACHARYA CSE(A) RG NO
Network-Attached Storage. Network-attached storage devices Attached to a local area network, generally an Ethernet-based network environment.
© 2009 EMC Corporation. All rights reserved. Why do we need RAID o Performance limitation of disk drive o An individual drive has a certain life expectancy.
Data Center Infrastructure
Fujitsu Training Documentation RAID Groups and Volumes
Storage Virtualization
TECHNICAL SEMINAR PRESENTATION
UNIT IV RAID.
IST346: Storage and File Systems
Seminar on Enterprise Software
Presentation transcript:

Module – 3 Data protection – raid 1

Module 3: Data Protection – RAID Upon completion of this module, you should be able to: Describe RAID implementation methods Describe the three RAID techniques Describe commonly used RAID levels Describe the impact of RAID on performance Compare RAID levels based on their cost, performance, and protection Module 3: Data Protection - RAID

Module 3: Data Protection – RAID Lesson 1: RAID Overview During this lesson the following topics are covered: RAID Implementation methods RAID array components RAID techniques Module 3: Data Protection - RAID

Why RAID? RAID It is a technique that combines multiple disk drives into a logical unit (RAID set) and provides protection, performance, or both. Due to mechanical components in a disk drive it offers limited performance An individual drive has a certain life expectancy and is measured in MTBF: For example: If the MTBF of a drive is 750,000 hours, and there are 1000 drives in the array, then the MTBF of the array is 750 hours (750,000/1000) RAID was introduced to mitigate these problems Module 3: Data Protection - RAID

RAID Implementation Methods Software RAID implementation Uses host-based software to provide RAID functionality Limitations Use host CPU cycles to perform RAID calculations, hence impact overall system performance Support limited RAID levels RAID software and OS can be upgraded only if they are compatible Hardware RAID Implementation Uses a specialized hardware controller installed either on a host or on an array Module 3: Data Protection - RAID

RAID Array Components RAID Controller Logical Array (RAID Sets) Hard Disks Host RAID Array Module 3: Data Protection - RAID

RAID Techniques Three key techniques used for RAID are: Striping Mirroring Parity Module 3: Data Protection - RAID

RAID Technique – Striping Controller Stripe Host Module 3: Data Protection - RAID

RAID Technique – Mirroring Block 0 RAID Controller Block 0 Block 0 Host Module 3: Data Protection - RAID

RAID Technique – Parity 4 D1 6 D2 RAID Controller 1 D3 7 Host D4 18 P Actual parity calculation is a bitwise XOR operation Module 3: Data Protection - RAID

Data Recovery in Parity Technique 4 D1 6 D2 RAID Controller ? D3 7 Host D4 Regeneration of data when Drive D3 fails: 18 4 + 6 + ? + 7 = 18 ? = 18 – 4 – 6 – 7 ? = 1 P Module 3: Data Protection - RAID

Module 3: Data Protection – RAID Lesson 2: RAID Levels During this lesson the following topics are covered: Commonly used RAID levels RAID impacts on performance RAID comparison Hot spare Module 3: Data Protection - RAID

RAID Levels Commonly used RAID levels are: RAID 0 – Striped set with no fault tolerance RAID 1 – Disk mirroring RAID 1 + 0 – Nested RAID RAID 3 – Striped set with parallel access and dedicated parity disk RAID 5 – Striped set with independent disk access and a distributed parity RAID 6 – Striped set with independent disk access and dual distributed parity Module 3: Data Protection - RAID

RAID 0 C B A Data from host RAID Controller A1 A2 A3 A4 A5 B1 B2 B3 B4 Data Disks Module 3: Data Protection - RAID

RAID 1 F E D C B A Data from host RAID Controller A A D D B B E E C C Mirror Set Mirror Set Module 3: Data Protection - RAID

Nested RAID – 1+0 C B A Data from host RAID Controller A1 A1 A2 A2 A3 Striping RAID Controller Mirroring Mirroring Mirroring A1 A1 A2 A2 A3 A3 B1 B1 B2 B2 B3 B3 C1 C1 C2 C2 C3 C3 Mirror Set A Mirror Set B Mirror Set C Module 3: Data Protection - RAID

RAID 3 C B A Data from host RAID Controller A1 A2 A3 A4 AP B1 B2 B3 B4 BP C1 C2 C3 C4 CP Data Disks Dedicated Parity Disk Module 3: Data Protection - RAID

RAID 5 C B A Data from host RAID Controller A1 A2 A3 A4 AP B1 B2 B3 BP CP C3 C4 Distributed Parity Module 3: Data Protection - RAID

RAID 6 C B A Data from host RAID Controller A1 A2 A3 AP AQ B1 B2 BP BQ CP CQ C2 C3 Dual Distributed Parity Module 3: Data Protection - RAID

RAID Impacts on Performance RAID Controller Cp new Cp old C4 old C4 new = - + 2 3 4 1 A1 A2 A3 A4 AP B1 B2 B3 BP B4 C1 C2 CP C3 C4 In RAID 5, every write (update) to a disk manifests as four I/O operations (2 disk reads and 2 disk writes) In RAID 6, every write (update) to a disk manifests as six I/O operations (3 disk reads and 3 disk writes) In RAID 1, every write manifests as two I/O operations (2 disk writes) Module 3: Data Protection - RAID

RAID Penalty Calculation Example Total IOPS at peak workload is 1200 Read/Write ratio 2:1 Calculate disk load at peak activity for: RAID 1/0 RAID 5 Module 3: Data Protection - RAID

Solution: RAID Penalty For RAID 1/0, the disk load (read + write) = (1200 x 2/3) + (1200 x (1/3) x 2) = 800 + 800 = 1600 IOPS For RAID 5, the disk load (read + write) = (1200 x 2/3) + (1200 x (1/3) x 4) = 800 + 1600 = 2400 IOPS Module 3: Data Protection - RAID

where n = number of disks RAID Comparison RAID level Min disks Available storage capacity (%) Read performance Write performance Write penalty Protection 1 2 50 Better than single disk Slower than single disk, because every write must be committed to all disks Moderate Mirror 1+0 4 Good 3 [(n-1)/n]*100 Fair for random reads and good for sequential reads Poor to fair for small random writes fair for large, sequential writes High Parity (Supports single disk failure) 5 Good for random and sequential reads Fair for random and sequential writes 6 [(n-2)/n]*100 Good for random and sequential reads Poor to fair for random and sequential writes Very High (Supports two disk failures) where n = number of disks Module 3: Data Protection - RAID

Suitable RAID Levels for Different Applications Suitable for applications with small, random, and write intensive (writes typically greater than 30%) I/O profile Example: OLTP, RDBMS – Temp space RAID 3 Large, sequential read and write Example: data backup and multimedia streaming RAID 5 and 6 Small, random workload (writes typically less than 30%) Example: email, RDBMS – Data entry Module 3: Data Protection - RAID

Hot Spare RAID Controller Failed disk Replace failed disk Hot spare Module 3: Data Protection - RAID

Module 3: Summary Key points covered in this module: RAID implementation methods and techniques Common RAID levels RAID write penalty Compare RAID levels based on their cost and performance Module 3: Data Protection - RAID

Exercise 1: RAID A company is planning to reconfigure storage for their accounting application for high availability Current configuration and challenges Application performs 15% random writes and 85% random reads Currently deployed with five disk RAID 0 configuration Each disk has an advertised formatted capacity of 200 GB Total size of accounting application’s data is 730 GB which is unlikely to change over 6 months Approaching end of financial year, buying even one disk is not possible Task Recommend a RAID level that the company can use to restructure their environment fulfilling their needs Justify your choice based on cost, performance, and availability Module 3: Data Protection - RAID

Exercise 2: RAID A company (same as discussed in exercise 1) is now planning to reconfigure storage for their database application for HA Current configuration and challenges The application performs 40% writes and 60% reads Currently deployed on six disk RAID 0 configuration with advertised capacity of each disk being 200 GB Size of the database is 900 GB and amount of data is likely to change by 30% over the next 6 months It is a new financial year and the company has an increased budget Task Recommend a suitable RAID level to fulfill company’s needs Estimate the cost of the new solution (200GB disk costs $1000) Justify your choice based on cost, performance, and availability Module 3: Data Protection - RAID