1 First magnetic disk: the IBM 305 RAMAC (two units shown), introduced in 1956. One platter shown at top right. A RAMAC stored 5 million characters on fifty 24-inch-diameter platters. Two access arms moved up and down to select a disk, and in and out to select a track. Right: a variety of disk drives: 8”, 5.25”, 3.5”, 1.8” and 1”.

2 Storage. Anselmo Lastra, The University of North Carolina at Chapel Hill.

3 Outline: Magnetic Disks; RAID; Advanced Dependability/Reliability/Availability; I/O Benchmarks, Performance, and Dependability; Conclusion.

4 Disk Figure of Merit: Areal Density. Bits recorded along a track: metric is Bits Per Inch (BPI). Number of tracks per surface: metric is Tracks Per Inch (TPI). The disk metric is bit density per unit area, in bits per square inch: Areal Density = BPI x TPI.
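As a quick worked illustration of the formula (the BPI and TPI values below are hypothetical, chosen only to be plausible for a mid-2000s drive, not figures from the lecture):

```python
# Areal density is the product of linear density along a track and track density
# across the surface. The numbers here are illustrative assumptions.
bpi = 900_000                      # bits per inch along a track
tpi = 150_000                      # tracks per inch
areal_density = bpi * tpi          # bits per square inch
print(f"Areal density ~ {areal_density / 1e9:.0f} Gbit per square inch")
```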

5 Historical Perspective. 1956 IBM RAMAC to early-1970s Winchester ♦ for mainframe computers, proprietary interfaces ♦ steady shrink in form factor: 27 in. to 14 in. Form factor and capacity drive the market more than performance. 1970s developments ♦ 5.25-inch floppy disk form factor ♦ emergence of industry-standard disk interfaces. Early 1980s: PCs and first-generation workstations. Mid 1980s: client/server computing ♦ centralized storage on file servers ♦ disk downsizing: 8 inch to 5.25 inch ♦ mass-market disk drives become a reality ♦ industry standards: SCSI, IPI, IDE ♦ 5.25-inch to 3.5-inch drives for PCs; end of proprietary interfaces. 1990s: laptops => 2.5-inch drives. 2000s: 1.8” drives used in media players (the 1” microdrive didn’t do as well).

6 Current Disks. Caches to hold recently accessed blocks. Microprocessor and command buffer to enable reordering of accesses.

7 Future Disk Size and Performance. Continued advance in capacity (60%/yr) and bandwidth (40%/yr). Slow improvement in seek and rotation (8%/yr). Time to read the whole disk:
Year | Sequentially | Randomly (1 sector/seek)
1990 | 4 minutes | 6 hours
2000 | 12 minutes | 1 week (!)
2006 | 56 minutes | 3 weeks (SCSI)
2006 | 171 minutes | 7 weeks (SATA)
Cost has dropped by a factor of 100,000 since 1983.
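A minimal sketch of how figures like these can be estimated. The drive parameters below are hypothetical placeholders meant to show the method and the order-of-magnitude gap, not to reproduce the table's exact numbers:

```python
def full_disk_read_times(capacity_gb, transfer_mb_s, avg_access_ms, sector_bytes=512):
    """Estimate time to read an entire disk sequentially vs. one sector per random access."""
    capacity = capacity_gb * 1e9
    sequential_s = capacity / (transfer_mb_s * 1e6)
    sectors = capacity / sector_bytes
    # Every random read pays an average seek plus rotational latency, then the transfer.
    random_s = sectors * (avg_access_ms / 1e3 + sector_bytes / (transfer_mb_s * 1e6))
    return sequential_s / 60, random_s / (7 * 86400)      # minutes, weeks

# Hypothetical 2006-class drive: 300 GB, 90 MB/s media rate, ~8 ms average access time.
seq_min, rand_weeks = full_disk_read_times(300, 90, 8)
print(f"sequential: {seq_min:.0f} minutes, random: {rand_weeks:.0f} weeks")
```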

8 Arrays of Small Disks? Katz and Patterson asked in 1987: can smaller disks be used to close the gap in performance between disks and CPUs? [Figure: the conventional approach uses four separate disk designs (14”, 10”, 5.25”, 3.5”) spanning low end to high end; a disk array uses a single disk design.]

9 Advantages of Small Form Factor Disk Drives. Low cost/MB. High MB/volume. High MB/watt. Low cost/actuator. Cost and environmental efficiencies.

10 Replace a Small Number of Large Disks with a Large Number of Small Disks! (1988 disks)
Metric | IBM 3390K | IBM 3.5" 0061 | x70 array
Capacity | 20 GBytes | 320 MBytes | 23 GBytes
Volume | 97 cu. ft. | 0.1 cu. ft. | 11 cu. ft. (9X)
Power | 3 KW | 11 W | 1 KW (3X)
Data Rate | 15 MB/s | 1.5 MB/s | 120 MB/s (8X)
I/O Rate | 600 I/Os/s | 55 I/Os/s | 3900 I/Os/s (6X)
MTTF | 250 KHrs | 50 KHrs | ??? Hrs
Cost | $250K | $2K | $150K
Disk arrays have potential for large data and I/O rates, high MB per cu. ft., and high MB per KW, but what about reliability?
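A back-of-the-envelope check of how 70 of the small drives aggregate. This is simple multiplication that ignores packaging, controllers, and other overheads, which is why the slide's array column differs somewhat from the raw products:

```python
n = 70
small = {"capacity_MB": 320, "power_W": 11, "data_rate_MBps": 1.5, "io_per_s": 55}
ibm_3390k = {"capacity_MB": 20_000, "power_W": 3000, "data_rate_MBps": 15, "io_per_s": 600}

# Naive aggregate of 70 small drives, compared against the single large drive.
array = {k: v * n for k, v in small.items()}
for k in array:
    print(f"{k}: array {array[k]:,} vs 3390K {ibm_3390k[k]:,} "
          f"({array[k] / ibm_3390k[k]:.1f}x)")
```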

11 Array Reliability. Reliability (MTTF) of N disks = reliability of one disk ÷ N: 50,000 hours ÷ 70 disks ≈ 700 hours. Disk system MTTF drops from 6 years to 1 month! Arrays (without redundancy) are too unreliable to be useful. Hot spares support reconstruction in parallel with access: very high media availability can be achieved.
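The MTTF arithmetic on this slide, as a sketch (the division by N assumes independent failures):

```python
mttf_disk_hours = 50_000
n_disks = 70
mttf_array_hours = mttf_disk_hours / n_disks   # ~714 hours, the slide rounds to 700
print(f"single disk: {mttf_disk_hours / 8760:.1f} years")
print(f"70-disk array (no redundancy): {mttf_array_hours:.0f} hours "
      f"(~{mttf_array_hours / 730:.1f} months)")
```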

12 Redundant Arrays of (Inexpensive) Disks. Files are "striped" across multiple disks. Redundancy yields high data availability ♦ availability: service is still provided to the user even if some components have failed. Disks will still fail; contents are reconstructed from data redundantly stored in the array ♦ capacity penalty to store redundant info ♦ bandwidth penalty to update redundant info.

13 RAID 0. Performance only; no redundancy. Stripe data to get higher bandwidth. Latency is not improved.
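A minimal sketch of block striping. The round-robin mapping below is one common convention, used here only for illustration; real controllers usually stripe in multi-block chunks:

```python
def raid0_map(logical_block: int, n_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block index on that disk), round-robin."""
    return logical_block % n_disks, logical_block // n_disks

# With 4 disks, consecutive logical blocks land on different disks,
# so a large transfer can stream from all 4 spindles at once.
print([raid0_map(b, 4) for b in range(8)])
# [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]
```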

14 Redundant Arrays of Inexpensive Disks, RAID 1: Disk Mirroring/Shadowing. Each disk is fully duplicated onto its "mirror"; a mirrored pair forms a recovery group. Very high availability can be achieved. Bandwidth sacrifice on write: logical write = two physical writes. Reads may be optimized. Most expensive solution: 100% capacity overhead. (RAID 2 is not interesting, so skip it.)

15 Redundant Array of Inexpensive Disks, RAID 3: Parity Disk. A logical record is striped across the data disks as physical records. P contains the sum of the other disks per stripe, mod 2 ("parity"). If a disk fails, subtract P from the sum of the other disks to find the missing information. [Figure: a logical record striped across four data disks plus a parity disk P.]
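A sketch of the parity idea in Python (XOR is per-bit addition mod 2); the data bytes are arbitrary examples:

```python
from functools import reduce

data = [0b10100011, 0b11001101, 0b10100011, 0b11001101]   # blocks on the 4 data disks
parity = reduce(lambda a, b: a ^ b, data)                 # the P disk: XOR of all data

# If disk 2 fails, XORing the surviving data blocks with P reconstructs its contents.
survivors = data[:2] + data[3:]
recovered = reduce(lambda a, b: a ^ b, survivors + [parity])
assert recovered == data[2]
```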

16 RAID 3. The sum is computed across the recovery group to protect against hard disk failures and stored on the P disk. Logically, the array is a single high-capacity, high-transfer-rate disk: good for large transfers. Wider arrays reduce the capacity cost, but decrease availability. 33% capacity cost for parity with 3 data disks and 1 parity disk.

17 Inspiration for RAID 4. RAID 3 relies on the parity disk to discover errors on a read, but every sector has an error detection field. To catch errors on read, rely on the sector's error detection field rather than the parity disk. This allows independent reads to different disks simultaneously.

18 Redundant Arrays of Inexpensive Disks, RAID 4: High I/O Rate Parity. [Figure: the insides of 5 disks, with data blocks D0-D23 striped across the data disks and a parity block P per stripe; logical disk addresses increase across the disk columns, one stripe per row.] Example: small reads of D0 and D5, large write of D12-D15.

19 Inspiration for RAID 5. RAID 4 works well for small reads. Small writes (a write to one disk) ♦ Option 1: read the other data disks, create the new sum, and write it to the parity disk ♦ Option 2: since P has the old sum, compare the old data to the new data and add the difference to P (2 reads, 2 writes). Small writes are limited by the parity disk: writes to D0 and D5 must both also write the P disk.

20 Redundant Arrays of Inexpensive Disks, RAID 5: High I/O Rate Interleaved Parity. Independent writes are possible because of interleaved parity: the parity block rotates among the disk columns from stripe to stripe. [Figure: data blocks D0-D23 across 5 disk columns with the P block in a different column each stripe; logical disk addresses increase across columns.] Example: writes to D0 and D5 use disks 0, 1, 3, and 4, so they can proceed independently.
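A sketch of one rotating-parity layout. The rotation order varies by implementation; this is just the simplest convention, not necessarily the one in the slide's figure:

```python
def raid5_parity_disk(stripe: int, n_disks: int) -> int:
    """Disk holding the parity block for a given stripe (simple left rotation)."""
    return (n_disks - 1 - stripe) % n_disks

# With 5 disks, parity moves one disk to the left each stripe, so small writes
# to blocks in different stripes usually update different parity disks.
print([raid5_parity_disk(s, 5) for s in range(5)])   # [4, 3, 2, 1, 0]
```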

21 Downside of Disk Arrays: Cost of Small Writes. RAID-5 small write algorithm: 1 logical write = 2 physical reads + 2 physical writes. [Figure: to replace D0 with new data D0', (1) read the old data D0, (2) read the old parity P, XOR the new data with the old data and the old parity to form the new parity P', then (3) write D0' and (4) write P'.]
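The read-modify-write parity update from the figure, sketched with arbitrary byte values:

```python
old_data, new_data, old_parity = 0b10010011, 0b01100110, 0b11001101  # arbitrary examples

# 1. Read old data; 2. read old parity; XOR out the old data and XOR in the new data.
new_parity = old_parity ^ old_data ^ new_data
# 3. Write new_data; 4. write new_parity: two reads plus two writes per logical write.
print(f"new parity: {new_parity:08b}")
```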

22 RAID 6: Recovering from 2 Failures. Like the standard RAID schemes, it uses redundant space based on a parity calculation per stripe. The idea is that an operator may make a mistake and swap the wrong disk, or a 2nd disk may fail while the 1st is being replaced. Since it protects against a double failure, it adds two check blocks per stripe of data ♦ if p+1 disks total, p-1 disks have data; assume p=5.

23 Summary of RAID Techniques. Disk mirroring/shadowing (RAID 1): each disk is fully duplicated onto its "shadow"; logical write = two physical writes; 100% capacity overhead. Parity data bandwidth array (RAID 3): parity computed horizontally; logically a single high-data-bandwidth disk. High I/O rate parity array (RAID 5): interleaved parity blocks; independent reads and writes; logical write = 2 reads + 2 writes.

24 HW Failures in Real Systems: Tertiary Disk. A cluster of 20 PCs in seven 7-foot-high, 19-inch-wide racks with 368 8.4 GB, 7200 RPM, 3.5-inch IBM disks. The PCs are P6-200 MHz with 96 MB of DRAM each.

25 Does Hardware Fail Fast? [Table: the four disks that failed in Tertiary Disk.] The author says that almost all disk failures began as transient failures; the operator had to decide when to replace a disk.

26 Internet Archive. There is a good section in the text about the Internet Archive ♦ in 2006, over a petabyte of disk (10^15 bytes) ♦ growing at 20 terabytes (10^12 bytes) per month ♦ the Archive now says ~3 PB. Each PC was a 1 GHz VIA, 512 MB, dissipating 80 W. Each node had four 500 GB drives. 40 nodes/rack ♦ a petabyte takes 12 racks. PC cost $500, each disk $375, 40-port Ethernet switch $3000.

27 Capricorn PS now: AMD Athlon 64 (x2), 4 SATA disks (1-4 TB), 92 Watts/node. 40 nodes/rack, so 160 TB/rack. 24 kW per petabyte ♦ so 576 kWh/day, ~17,000 kWh/mo ♦ the average U.S. house used 920 kWh a month in 2006 ♦ best housed in KY?
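The rack and power arithmetic on this slide, spelled out. The 1 TB drive size (the low end of the 1-4 TB range) and the 30-day month are assumptions; the slide rounds the per-petabyte power up to 24 kW and 576 kWh/day:

```python
watts_per_node, nodes_per_rack = 92, 40
tb_per_node = 4 * 1                    # four SATA drives, assuming 1 TB each
tb_per_rack = tb_per_node * nodes_per_rack                  # 160 TB/rack
racks_per_pb = 1000 / tb_per_rack                           # 6.25 racks per petabyte
kw_per_pb = watts_per_node * nodes_per_rack * racks_per_pb / 1000   # ~23 kW
kwh_per_day = kw_per_pb * 24                                # ~552 kWh/day
kwh_per_month = kwh_per_day * 30                            # ~16,600 kWh/mo
print(f"{kw_per_pb:.0f} kW/PB, {kwh_per_day:.0f} kWh/day, {kwh_per_month:,.0f} kWh/mo")
```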

28 Drives Today. Just ordered a simple RAID enclosure (levels 0 and 1 only): $63. Two 1 TB SATA drives: $85 each.

29 Summary. Disks: areal density growth is now 30%/yr vs. 100%/yr earlier in the 2000s. Components often fail slowly. Real systems have problems in maintenance and operation as well as in hardware and software.

