Presentation is loading. Please wait.

Presentation is loading. Please wait.

Lecture 4 1 Reliability vs Availability Reliability: Is anything broken? Availability: Is the system still available to the user?

Similar presentations


Presentation on theme: "Lecture 4 1 Reliability vs Availability Reliability: Is anything broken? Availability: Is the system still available to the user?"— Presentation transcript:

1 Lecture 4 1 Reliability vs Availability Reliability: Is anything broken? Availability: Is the system still available to the user?

2 Lecture 4 2 RAID Redundant Array of Inexpensive Disks –Higher throughput –Ability to recover from failures Disk Array (Availability reduced to 1/N) Data Integrity Striping MTTR MTTF

3 Lecture 4 3 Manufacturing Advantages of Disk Arrays 14” 10”5.25”3.5” Disk Array: 1 disk design Conventional: 4 disk designs Low End High End Disk Product Families

4 Lecture 4 4 Replace Small # of Large Disks with Large # of Small Disks! (1988 Disks) Data Capacity Volume Power Data Rate I/O Rate MTTF Cost IBM 3390 (K) 20 GBytes 97 cu. ft. 3 KW 15 MB/s 600 I/Os/s 250 KHrs $250K IBM 3.5" 0061 320 MBytes 0.1 cu. ft. 11 W 1.5 MB/s 55 I/Os/s 50 KHrs $2K x70 23 GBytes 11 cu. ft. 1 KW 120 MB/s 3900 IOs/s ??? Hrs $150K Disk Arrays have potential for large data and I/O rates high MB per cu. ft., high MB per KW reliability?

5 Lecture 4 5 Array Reliability Reliability of N disks = Reliability of 1 Disk / N 50,000 Hours / 70 disks = 700 hours Disk system MTTF: Drops from 6 years to 1 month! Arrays (without redundancy) too unreliable to be useful! Hot spares support reconstruction in parallel with access: very high media availability can be achieved Hot spares support reconstruction in parallel with access: very high media availability can be achieved

6 Lecture 4 6 Redundant Arrays of Disks Files are "striped" across multiple spindles Redundancy yields high data availability Disks will fail Contents reconstructed from data redundantly stored in the array Capacity penalty to store it Bandwidth penalty to update Mirroring/Shadowing (high capacity cost) Horizontal Hamming Codes (overkill) Parity & Reed-Solomon Codes Failure Prediction (no capacity overhead!) VaxSimPlus — Technique is controversial Techniques:

7 Lecture 4 7 Redundant Arrays of Disks RAID 1: Disk Mirroring/Shadowing Each disk is fully duplicated onto its "shadow" Very high availability can be achieved Bandwidth sacrifice on write: Logical write = two physical writes Reads may be optimized Most expensive solution: 100% capacity overhead Targeted for high I/O rate, high availability environments recovery group

8 Lecture 4 8 Redundant Arrays of Disks RAID 3: Parity Disk P 10010011 11001101 10010011... logical record 1001001110010011 1100110111001101 1001001110010011 0011000000110000 Striped physical records Parity computed across recovery group to protect against hard disk failures 33% capacity cost for parity in this configuration wider arrays reduce capacity costs, decrease expected availability, increase reconstruction time Arms logically synchronized, spindles rotationally synchronized logically a single high capacity, high transfer rate disk Targeted for high bandwidth applications: Scientific, Image Processing

9 Lecture 4 9 Redundant Arrays of Disks RAID 5+: High I/O Rate Parity A logical write becomes four physical I/Os Independent writes possible because of interleaved parity Reed-Solomon Codes ("Q") for protection during reconstruction A logical write becomes four physical I/Os Independent writes possible because of interleaved parity Reed-Solomon Codes ("Q") for protection during reconstruction D0D1D2 D3 P D4D5D6 P D7 D8D9P D10 D11 D12PD13 D14 D15 PD16D17 D18 D19 D20D21D22 D23 P.............................. Disk Columns Increasing Logical Disk Addresses Stripe Unit Targeted for mixed applications

10 Lecture 4 10 Problems of Disk Arrays: Small Writes D0D1D2 D3 P D0' + + D1D2 D3 P' new data old data old parity XOR (1. Read) (2. Read) (3. Write) (4. Write) RAID-5: Small Write Algorithm 1 Logical Write = 2 Physical Reads + 2 Physical Writes

11 Lecture 4 11 Subsystem Organization host array controller single board disk controller single board disk controller single board disk controller single board disk controller host adapter manages interface to host, DMA control, buffering, parity logic physical device control often piggy-backed in small format devices striping software off-loaded from host to array controller no applications modifications no reduction of host performance

12 Lecture 4 12 Summary: A Little Queuing Theory Queuing models assume state of equilibrium: input rate = output rate Notation: raverage number of arriving customers/second T ser average time to service a customer (tradtionally ต = 1/ T ser ) userver utilization (0..1): u = r x T ser T q average time/customer in queue T sys average time/customer in system: T sys = T q + T ser L q average length of queue: L q = r x T q L sys average length of system : L sys = r x T sys Little’s Law: Length system = rate x Time system (Mean number customers = arrival rate x mean service time) ProcIOCDevice Queue server System

13 Lecture 4 13 Summary: Redundant Arrays of Disks (RAID) Techniques Disk Mirroring, Shadowing (RAID 1) Each disk is fully duplicated onto its "shadow" Logical write = two physical writes 100% capacity overhead Parity Data Bandwidth Array (RAID 3) Parity computed horizontally Logically a single high data bw disk High I/O Rate Parity Array (RAID 5) Interleaved parity blocks Independent reads and writes Logical write = 2 reads + 2 writes Parity + Reed-Solomon codes 1001001110010011 1100110111001101 1001001110010011 0011001000110010 1001001110010011 1001001110010011

14 Lecture 4 14 Homework #3 Problem 6.5 and 6.6 Due date: June 26, 2001

15 Lecture 4 15 Assignment #1 Explain RAID 0-6 in greater detail. Due date: July 3, 2001


Download ppt "Lecture 4 1 Reliability vs Availability Reliability: Is anything broken? Availability: Is the system still available to the user?"

Similar presentations


Ads by Google