1 CDA3101 Fall 2013 Computer Storage: Practical Aspects 6,13 November 2013 Copyright © 2011 Prabhat Mishra.

2 Storage Systems
– Introduction
– Disk Storage
– Dependability and Reliability
– I/O Performance
– Server Computers
– Conclusion

3 Case for Storage
– Shift in focus from computation to communication and storage of information
  – The Computing Revolution (1960s to 1980s): IBM, Control Data Corp., Cray Research
  – The Information Age (1990 to today): Google, Yahoo, Amazon, …
– Storage emphasizes reliability and scalability as well as cost-performance
  – A program crash is frustrating
  – Data loss is unacceptable, so dependability is the key concern
– Which software determines HW features?
  – Operating system for storage
  – Compiler for processor

4 Cost vs. Access Time in DRAM/Disk
– DRAM is 100,000 times faster than disk, and costs 30 to 150 times more per gigabyte

5 Flash Storage (Chapter 6, Storage and Other I/O Topics, §6.4)
– Nonvolatile semiconductor storage
– 100× to 1000× faster than disk
– Smaller, lower power, more robust
– But more $/GB (between disk and DRAM)

6 Hard Disk Drive

7 Seek Time is not Linear in Distance
– Rule of thumb: average seek time is the time to move across 1/3 of the cylinders
– Seek time is not linear in distance: accelerate the arm, pause, decelerate, then wait for the settle time
– The average does not work well in practice because accesses exhibit locality
– [Figure: 4 reads (26, 100, 724, 9987). One schedule requires 3 revolutions to perform them; another requires just 3/4 of a revolution]
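
The 1/3 figure in the rule of thumb can be checked with a short calculation. The sketch below assumes the start and target cylinders are independent and uniformly distributed, which is exactly the assumption that locality breaks; the function name `mean_seek_distance` is illustrative and not from the slides. It covers only seek distance, since the slide notes that time is not linear in distance.

```python
from fractions import Fraction


def mean_seek_distance(num_cylinders: int) -> Fraction:
    """Exact mean of |a - b| when start and target cylinders are uniform and independent."""
    n = num_cylinders
    # A distance of d cylinders occurs for 2 * (n - d) of the n * n ordered pairs.
    total = sum(d * 2 * (n - d) for d in range(1, n))
    return Fraction(total, n * n)


cylinders = 10_000  # assumed disk size, for illustration only
fraction_of_full_stroke = mean_seek_distance(cylinders) / cylinders
print(float(fraction_of_full_stroke))  # approaches 1/3 as the cylinder count grows
```

In closed form the mean distance is (n² − 1) / (3n) cylinders, so the fraction of the full stroke tends to 1/3 for large n.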

8 Dependability
– Fault: failure of a component
  – May or may not lead to system failure
– Service accomplishment: service delivered as specified
– Service interruption: deviation from specified service
– Transitions: Failure (accomplishment to interruption) and Restoration (interruption to accomplishment)

9 Dependability Measures
– Reliability: mean time to failure (MTTF)
– Service interruption: mean time to repair (MTTR)
– Mean time between failures (MTBF): MTBF = MTTF + MTTR
– Availability = MTTF / (MTTF + MTTR)
– Improving availability
  – Increase MTTF: fault avoidance, fault tolerance, fault forecasting
  – Reduce MTTR: improved tools and processes for diagnosis and repair
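
As a small worked illustration of these formulas, the sketch below evaluates MTBF and availability for made-up inputs. The 50,000-hour MTTF echoes the disk figure used later in the deck; the 24-hour MTTR is purely an assumption for the example.

```python
def mtbf(mttf_hours: float, mttr_hours: float) -> float:
    """Mean time between failures: MTBF = MTTF + MTTR."""
    return mttf_hours + mttr_hours


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the service is delivered: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Assumed example values, not measurements.
baseline = availability(50_000, 24)
doubled_mttf = availability(100_000, 24)   # "increase MTTF"
halved_mttr = availability(50_000, 12)     # "reduce MTTR"

print(mtbf(50_000, 24))
print(baseline, doubled_mttf, halved_mttr)
```

Because availability depends only on the ratio MTTF/MTTR, doubling MTTF and halving MTTR give the same improvement in this example.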

10 Disk Access Example
– Given: 512B sector, 15,000rpm, 4ms average seek time, 100MB/s transfer rate, 0.2ms controller overhead, idle disk
– Average read time
    4ms seek time
  + ½ / (15,000/60) = 2ms rotational latency
  + 512 / 100MB/s = 0.005ms transfer time
  + 0.2ms controller delay
  = 6.2ms
– If the actual average seek time is 1ms, average read time = 3.2ms
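
The arithmetic on this slide can be reproduced with a few lines. This is a sketch under the slide's own assumptions (idle disk, so no queueing delay); it additionally assumes 1 MB = 10^6 bytes, and the function name is illustrative.

```python
def average_read_time_ms(seek_ms: float, rpm: float, sector_bytes: int,
                         transfer_bytes_per_s: float, controller_ms: float) -> float:
    """Seek + half a rotation + transfer + controller overhead, in milliseconds."""
    rotational_ms = 0.5 / (rpm / 60) * 1000            # half a revolution on average
    transfer_ms = sector_bytes / transfer_bytes_per_s * 1000
    return seek_ms + rotational_ms + transfer_ms + controller_ms


MB = 10**6  # assumption: decimal megabytes

nominal = average_read_time_ms(4.0, 15_000, 512, 100 * MB, 0.2)
measured_seek = average_read_time_ms(1.0, 15_000, 512, 100 * MB, 0.2)
print(round(nominal, 1), round(measured_seek, 1))  # 6.2 3.2
```

Seek and rotational latency dominate; the 512-byte transfer itself contributes only about 0.005ms.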

11 Use Arrays of Small Disks?
– [Figure: conventional, 4 disk designs (14, 10, 5.25, 3.5) spanning low end to high end; disk array, 1 disk design]
– Can smaller disks be used to close the gap in performance between disks and CPUs?
– Improves throughput; latency may not improve

12 Array Reliability
– Reliability of N disks = Reliability of 1 disk ÷ N
  – 50,000 hours ÷ 70 disks ≈ 700 hours
  – Disk system MTTF drops from 6 years to 1 month!
– Arrays without redundancy are too unreliable to be used
– Hot spares support reconstruction in parallel with access: very high media availability can be achieved
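
The division above is easy to check numerically. The sketch assumes independent disk failures at a constant rate, which is what makes "divide by N" a reasonable estimate; the helper name is illustrative.

```python
HOURS_PER_YEAR = 24 * 365
HOURS_PER_MONTH = HOURS_PER_YEAR / 12


def array_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of an array with no redundancy: any single disk failure fails the array."""
    return disk_mttf_hours / num_disks


single_disk = 50_000
array = array_mttf_hours(single_disk, 70)

print(single_disk / HOURS_PER_YEAR)   # single disk MTTF in years
print(array, array / HOURS_PER_MONTH)  # array MTTF in hours and in months
```

The single disk comes out at a little under 6 years and the 70-disk array at roughly 714 hours, or about one month, matching the slide's rounded figures.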

13 Redundant Arrays of (Inexpensive) Disks
– Files are "striped" across multiple disks
– Redundancy yields high data availability
  – Availability: service is still provided to the user, even if some components have failed
– Disks will still fail
– Contents are reconstructed from data redundantly stored in the array
  – Capacity penalty to store redundant information
  – Bandwidth penalty to update redundant information

14 RAID 1: Disk Mirroring/Shadowing
– Each disk is fully duplicated onto its mirror (figure label: recovery group)
  – Very high availability can be achieved
– Bandwidth sacrifice on write: one logical write = two physical writes
– Reads may be optimized
– Most expensive solution: 100% capacity overhead

15 RAID 10 vs RAID 01
– RAID 1 + 0 (striped mirrors): for example, four pairs of disks for four disks of data
– RAID 0 + 1 (mirrored stripes): for example, a pair of four-disk sets for four disks of data

16 RAID 2
– Memory-style error-correcting codes in disks
– Not used anymore; other RAID organizations are more attractive

17 RAID 3: Parity Disk
– [Figure: a logical record (10010011 11001101 10010011 ...) is split into striped physical records across the data disks, plus a parity disk P]
– P contains the sum of the other disks per stripe, mod 2 (parity)
– If a disk fails, subtract P from the sum of the other disks to find the missing information
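
Since addition and subtraction mod 2 are both XOR, the parity rule above can be sketched in a few lines. The block contents and the choice of three data disks are assumptions for illustration; `xor_blocks` is a hypothetical helper, not part of any RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks position by position (bitwise sum mod 2)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks holding one byte each for this stripe (made-up contents).
data = [bytes([0b10010011]), bytes([0b11001101]), bytes([0b10100011])]
parity = xor_blocks(data)

# Disk 1 fails: combine the surviving data blocks with P to rebuild it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])  # True
```

The same function computes the parity and performs the reconstruction, which is why a single parity disk can cover the loss of any one disk in the stripe.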

18 Inspiration for RAID 4
– RAID 3 relies on the parity disk to discover errors on read
– But every sector has an error detection field
– To catch errors on read, rely on the error detection field instead of the parity disk
– This allows independent reads to different disks simultaneously

19 RAID 4: High I/O Rate Parity
– [Figure: inside of 5 disks. Data blocks D0 to D23 and parity blocks P are laid out in stripes across the disk columns, with logical disk address increasing from stripe to stripe]
– Example: small read of D0 and D5, large write of D12 to D15

20 Inspiration for RAID 5
– RAID 4 works well for small reads
– Small writes (write to one disk):
  – Option 1: read the other data disks, create the new sum, and write it to the parity disk
  – Option 2: since P has the old sum, compare old data to new data and add the difference to P
– Small writes are limited by the parity disk: writes to D0 and D5 both also write to the P disk
– [Figure: two stripes, D0 to D3 and D4 to D7, each with its parity block P]
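
The two small-write options produce the same parity, which the sketch below checks on a made-up stripe of four data blocks (integers stand in for block contents; all names are illustrative). "Sum" and "difference" are both XOR here.

```python
def xor_all(values):
    """XOR a sequence of integers together."""
    result = 0
    for value in values:
        result ^= value
    return result


def parity_option1(new_block, other_blocks):
    """Option 1: read the other data blocks and recompute the parity from scratch."""
    return xor_all([new_block] + list(other_blocks))


def parity_option2(old_parity, old_block, new_block):
    """Option 2: fold the difference between old and new data into the old parity."""
    return old_parity ^ old_block ^ new_block


stripe = [0b1001, 0b1100, 0b0110, 0b0011]   # D0..D3, assumed contents
old_parity = xor_all(stripe)
new_d0 = 0b0101

p1 = parity_option1(new_d0, stripe[1:])
p2 = parity_option2(old_parity, stripe[0], new_d0)
print(p1 == p2)  # True
```

Option 1 reads every other data disk in the stripe, while Option 2 reads only the old data block and the old parity block. Either way, the write to the parity disk is unavoidable, which is the bottleneck the slide describes.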

21 RAID 5: Distributed Parity
– N + 1 disks
– Like RAID 4, but parity blocks are distributed across the disks
– Avoids the parity disk being a bottleneck
– Widely used
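
One way to picture the distribution is a mapping from logical block number to disk. The rotation used below (parity moves one disk to the left on each successive stripe) is only one possible convention and is an assumption here; real controllers differ in the details, and the function names are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Assumed rotation: parity starts on the last disk and moves left each stripe."""
    return (num_disks - 1 - stripe) % num_disks


def locate(block: int, num_disks: int):
    """Map a logical data block to (stripe, data disk, parity disk for that stripe)."""
    data_per_stripe = num_disks - 1
    stripe, index = divmod(block, data_per_stripe)
    p = parity_disk(stripe, num_disks)
    disk = index if index < p else index + 1   # skip over the parity disk's slot
    return stripe, disk, p


for block in (0, 5):
    print(block, locate(block, 5))
```

With 5 disks, D0 and D5 land in different stripes whose parity blocks sit on different disks, so the two small writes no longer queue up behind a single parity disk as they would in RAID 4.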

22 RAID 6: Recovering from 2 failures
– Why recover from more than 1 failure?
  – An operator may accidentally replace the wrong disk during a failure
  – Since disk bandwidth is growing more slowly than disk capacity, the MTTR of a disk is increasing
  – A longer repair increases the chance of a 2nd failure during the repair: a 500 GB SATA disk could take 3 hours to read sequentially
  – Reading much more data during reconstruction increases the chance of an uncorrectable media failure, which would result in data loss
  – Increasing number of disks, and use of ATA disks (slower and larger than SCSI disks)

23 RAID 6: Recovering from 2 failures
– Network Appliance's row-diagonal parity, or RAID-DP
– Like the standard RAID schemes, it uses redundant space based on a parity calculation per stripe
– Since it protects against a double failure, it adds two check blocks per stripe of data
  – If there are p+1 disks in total, p-1 disks hold data
– The row parity disk is just like in RAID 4: even parity across the other data blocks in its stripe
– Each block of the diagonal parity disk contains the even parity of the blocks in the same diagonal

24 Example p = 5
– Row-diagonal parity starts by recovering one of the 4 blocks on a failed disk using diagonal parity
  – Since each diagonal misses one disk, and all diagonals miss a different disk, 2 diagonals are missing only 1 block
– Once the data for those blocks is recovered, the standard RAID recovery scheme can be used to recover two more blocks in the standard RAID 4 stripes
– The process continues until the two failed disks are restored
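
The recovery procedure can be sketched as repeatedly solving any parity equation that has exactly one missing block. The indexing below is an assumption, since the slides do not spell it out: block (row r, disk d) on the data and row-parity disks belongs to diagonal (r + d) mod p, and diagonal p-1 is not stored. All function names are illustrative, and integers stand in for block contents.

```python
P = 5                  # prime parameter from the slide
ROWS = P - 1           # rows per stripe
DATA_DISKS = P - 1     # disks 0..P-2 hold data
ROW_PARITY = P - 1     # disk index of the row parity disk
DIAG_PARITY = P        # disk index of the diagonal parity disk


def build_array(data):
    """Extend data[r][d] with a row parity column and a diagonal parity column."""
    grid = [row[:] + [0, 0] for row in data]
    for r in range(ROWS):
        for d in range(DATA_DISKS):
            grid[r][ROW_PARITY] ^= grid[r][d]
    for r in range(ROWS):
        for d in range(P):                 # data disks plus the row parity disk
            k = (r + d) % P
            if k < ROWS:                   # diagonal P-1 is left unstored
                grid[k][DIAG_PARITY] ^= grid[r][d]
    return grid


def equations():
    """Each equation is a list of (row, disk) cells whose XOR is zero."""
    eqs = [[(r, d) for d in range(P)] for r in range(ROWS)]          # row parity
    for k in range(ROWS):                                            # diagonal parity
        cells = [(r, d) for r in range(ROWS) for d in range(P) if (r + d) % P == k]
        eqs.append(cells + [(k, DIAG_PARITY)])
    return eqs


def recover(grid, failed):
    """Rebuild the blocks of the failed disks in place and return the grid."""
    missing = {(r, d) for r in range(ROWS) for d in failed}
    eqs = equations()
    while missing:
        progress = False
        for cells in eqs:
            unknown = [cell for cell in cells if cell in missing]
            if len(unknown) == 1:          # exactly one missing block: solvable
                r, d = unknown[0]
                value = 0
                for rr, dd in cells:
                    if (rr, dd) != (r, d):
                        value ^= grid[rr][dd]
                grid[r][d] = value
                missing.discard((r, d))
                progress = True
        if not progress:
            raise ValueError("no equation with a single missing block")
    return grid


# Made-up data: 4 rows by 4 data disks.
data = [[(7 * r + 3 * d + 1) % 256 for d in range(DATA_DISKS)] for r in range(ROWS)]
original = build_array(data)

damaged = [row[:] for row in original]
for r in range(ROWS):
    damaged[r][0] = damaged[r][2] = None   # disks 0 and 2 fail
print(recover(damaged, (0, 2)) == original)  # True
```

With disks 0 and 2 failed, the first solvable equation is a diagonal that skips disk 2; the block it restores on disk 0 makes a row equation solvable, and the two kinds of equation alternate until both disks are rebuilt, as the slide describes.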

25 I/O - Introduction
– I/O devices can be characterized by
  – Behavior: input, output, storage
  – Partner: human or machine
  – Data rate: bytes/sec, transfers/sec
– I/O bus connections

26 I/O System Characteristics
– Dependability is important, particularly for storage devices
– Performance measures
  – Latency (response time)
  – Throughput (bandwidth)
– Desktops & embedded systems: primary focus is response time & diversity of devices
– Servers: primary focus is throughput & expandability of devices

27 Typical x86 PC I/O System

28 I/O Register Mapping
– Memory-mapped I/O
  – Registers are addressed in the same space as memory
  – Address decoder distinguishes between them
  – OS uses the address translation mechanism to make them accessible only to the kernel
– I/O instructions
  – Separate instructions to access I/O registers
  – Can only be executed in kernel mode
  – Example: x86

29 Polling
– Periodically check the I/O status register
  – If device ready, do operation
  – If error, take action
– Common in small or low-performance real-time embedded systems
  – Predictable timing
  – Low hardware cost
– In other systems, wastes CPU time
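
The polling loop can be sketched against a simulated device. Everything here is hypothetical: the `SimulatedDevice` class, the `READY` and `ERROR` bit positions, and the poll budget are invented for illustration and do not correspond to any real hardware interface.

```python
READY = 0x1   # assumed status bit: data available
ERROR = 0x2   # assumed status bit: device fault


class SimulatedDevice:
    """Hypothetical device that becomes ready after a fixed number of status reads."""

    def __init__(self, polls_until_ready: int, payload: int):
        self._remaining = polls_until_ready
        self._payload = payload

    def read_status(self) -> int:
        if self._remaining > 0:
            self._remaining -= 1
            return 0
        return READY

    def read_data(self) -> int:
        return self._payload


def poll_for_data(device, max_polls: int):
    """Check the status register until ready; return (data, number of wasted polls)."""
    wasted = 0
    for _ in range(max_polls):
        status = device.read_status()
        if status & ERROR:
            raise IOError("device reported an error")
        if status & READY:
            return device.read_data(), wasted
        wasted += 1          # CPU time spent finding out the device is not ready
    raise TimeoutError("device not ready within the poll budget")


print(poll_for_data(SimulatedDevice(3, 0x41), max_polls=10))  # (65, 3)
```

The `wasted` counter is the cost the slide refers to: each unsuccessful status check is CPU work that produced nothing, which is acceptable in a small embedded loop but not in a busy general-purpose system.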

30 Interrupts
– When a device is ready or an error occurs, the controller interrupts the CPU
– An interrupt is like an exception
  – But not synchronized to instruction execution
  – Can invoke the handler between instructions
  – Cause information often identifies the interrupting device
– Priority interrupts
  – Devices needing more urgent attention get higher priority
  – Can interrupt the handler for a lower-priority interrupt

31 I/O Data Transfer
– Polling and interrupt-driven I/O
  – CPU transfers data between memory and I/O data registers
  – Time consuming for high-speed devices
– Direct memory access (DMA)
  – OS provides the starting address in memory
  – I/O controller transfers to/from memory autonomously
  – Controller interrupts on completion or error

32 Server Computers
– Applications are increasingly run on servers: web search, office apps, virtual worlds, …
– Requires large data center servers
  – Multiple processors, network connections, massive storage
  – Space and power constraints
– Server equipment is built for 19" racks, in multiples of 1.75" (1U) high

33 Rack-Mounted Servers
– Sun Fire x4150 1U server

34 [Figure labels: 4 cores each; 16 x 4GB = 64GB DRAM]

35 Concluding Remarks
– I/O performance measures
  – Throughput, response time
  – Dependability and cost also important
– Buses used to connect CPU, memory, I/O controllers
  – Polling, interrupts, DMA
– RAID
  – Improves performance and dependability
– Please read Sections 6.1 – 6.10, P&H 4th Ed.

36 THINK: Weekend!!
– "The best way to predict the future is to create it." (Peter Drucker)
