
1 Thomas Schwarz, S.J., Qin Xin, Ethan Miller, Darrell Long, Andy Hospodor, Spencer Ng. Summarized by Leonid Kibrik

2 Outline
- Archive storage system properties
- Disk structures
- Disk failure
- Data redundancy
- Disk scrubbing
- Disk power cycling
- Simulation results

3 Archive storage systems
- Large-scale storage systems based on disks are becoming increasingly attractive for archival storage.
- They use a very large number of disks and store petabytes of data.
- Most of the disks are powered off between accesses to conserve power and extend disk lifetime.
- Such a system is called a Massive Array of mainly Idle Disks (MAID).

4 Disk structure
- Data on disks is addressed in blocks.
- Each block contains one or more sectors.
- Sectors usually contain 512 bytes of data.
- Each sector has error-correcting code (ECC) bits.

5 Preamble – used to synchronize the R/W head; ECC – usually a Reed-Solomon code
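As a rough illustration of the per-sector check bits mentioned above, here is a minimal Python sketch in which a CRC32 stands in for the drive's ECC (a real Reed-Solomon code can also correct errors, not just detect them); the helper names and checksum choice are illustrative assumptions, not the drive's actual on-disk format.

```python
import zlib

SECTOR_SIZE = 512  # bytes of user data per sector, as on the slide

def write_sector(data: bytes) -> bytes:
    """Store a sector together with a check value (stand-in for the ECC bits)."""
    assert len(data) == SECTOR_SIZE
    check = zlib.crc32(data).to_bytes(4, "big")
    return data + check

def read_sector(stored: bytes) -> bytes:
    """Reading is the only point at which a latent defect is noticed."""
    data, check = stored[:SECTOR_SIZE], stored[SECTOR_SIZE:]
    if zlib.crc32(data).to_bytes(4, "big") != check:
        raise IOError("unreadable sector: check bits do not match")
    return data
```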

6 Disk failure
Many block defects are the result of:
- Imperfections in the machining of the disk substrate
- Non-uniformity in the magnetic coating
- Contaminants within the head-disk assembly
Disk manufacturers try to detect manufacturing defects during a self-scan and "map out" defective blocks (the P-list). During the lifetime of a disk, additional blocks can be mapped out (the G-list). The drive only detects an error after reading the affected block. On multiple errors in a block, the ECC can:
- Correct the errors
- Flag the read as unsuccessful
- Mis-correct the errors (extremely rare)

7 Disk Failure Rates
- Most block failures are unrelated to one another.
- Device failure rates are specified by disk drive manufacturers as MTBF (Mean Time Between Failures).
- The values observed in practice depend heavily on operating conditions, which are frequently worse than the manufacturers' implicit assumptions.
- Errors may occur even if we do not access the disk.
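To make the MTBF figure concrete, a common simplification (assumed here, not stated on the slide) is to model drive lifetimes as exponentially distributed, so the probability of a failure within time t is 1 - exp(-t / MTBF). The sketch below plugs in the 10^5-hour MTBF used later in the simulation slides.

```python
import math

MTBF_HOURS = 1e5          # value used in the simulation slides
HOURS_PER_YEAR = 8760

# Probability that a single drive fails within one year,
# assuming exponentially distributed lifetimes.
p_fail_year = 1 - math.exp(-HOURS_PER_YEAR / MTBF_HOURS)
print(f"P(failure within a year) ~ {p_fail_year:.3f}")   # about 0.084
```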

8 Data redundancy
- In large systems, disk failures become frequent, so we need some form of redundancy when storing the data.
- Collect data into large reliability blocks, and group m of these blocks into a redundancy group to which we add k parity blocks.
- Parity blocks are calculated with an erasure-correcting code.
- Data is recoverable if we can access any m out of the n = m + k blocks making up the redundancy group.
- If errors are not detected, the data is in jeopardy.
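A minimal sketch of the m-out-of-n recovery idea for the simplest case, k = 1, where the single parity block is the bytewise XOR of the m data blocks; any one lost block of the redundancy group can then be rebuilt from the surviving m. Stronger erasure codes (for example Reed-Solomon) are needed to tolerate k > 1 losses; the block contents and sizes below are toy values.

```python
def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# m data blocks plus k = 1 parity block form a redundancy group of n = m + 1 blocks.
data = [b"AAAA", b"BBBB", b"CCCC"]          # m = 3 reliability blocks (toy sizes)
parity = xor_blocks(data)                   # the single parity block

# Lose any one block: the XOR of the surviving m blocks recovers it.
lost_index = 1
survivors = [b for i, b in enumerate(data) if i != lost_index] + [parity]
recovered = xor_blocks(survivors)
assert recovered == data[lost_index]
```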

9 Disk scrubbing
- Disk scrubbing – reading all the data in a certain region, called a scrubbing block (s-block).
- If a sector fails, the internal ECC on the disk flags the sector as unreadable, but only when the sector is read.
- Periodically scrub an s-block by reading it into the drive buffer.

10 Different scrubbing strategies
- Random scrubbing – scrub an s-block at random times, with a fixed mean time between scrubs.
- Deterministic scrubbing – scrub an s-block at fixed time intervals.
- Opportunistic scrubbing – piggy-back as much as possible on other disk operations to avoid additional power-on cycles.
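A hedged sketch of how the three strategies could schedule scrubs for one s-block over a year, using roughly three scrubs per year as the target; the helper names and parameters are illustrative and not taken from the paper's simulator.

```python
import random

HOURS_PER_YEAR = 8760
MEAN_SCRUB_INTERVAL = HOURS_PER_YEAR / 3     # roughly 3 scrubs per year

def deterministic_scrubs():
    """Fixed interval between scrubs."""
    t, times = 0.0, []
    while t + MEAN_SCRUB_INTERVAL <= HOURS_PER_YEAR:
        t += MEAN_SCRUB_INTERVAL
        times.append(t)
    return times

def random_scrubs():
    """Random scrub times with the same mean interval (exponential gaps)."""
    t, times = 0.0, []
    while True:
        t += random.expovariate(1 / MEAN_SCRUB_INTERVAL)
        if t > HOURS_PER_YEAR:
            return times
        times.append(t)

def opportunistic_scrubs(access_times):
    """Scrub only when the disk is already powered on for a normal access
    and the previous scrub is older than the target interval."""
    last, times = 0.0, []
    for t in access_times:
        if t - last >= MEAN_SCRUB_INTERVAL:
            times.append(t)
            last = t
    return times
```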

11 Power Cycling and Reliability
- Turning a disk on and off has a significant impact on the reliability of the disk.
- This is especially true for commodity disks, which lack the techniques used by more expensive laptop disks to keep the R/W head from touching the surface during power-down.
- Disk manufacturers are reluctant to publish actual failure rates, because they depend strongly on how disks are operated.
- Estimates based on Seagate data suggest that power cycling a disk is equivalent to running the drive for eight hours, in terms of drive reliability.
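One way to read the eight-hours-per-power-cycle estimate is as an effective-wear calculation; the formula below is a paraphrase of that estimate for illustration, not a model given in the paper.

```python
HOURS_PER_POWER_CYCLE = 8   # reliability cost of one power cycle, per the Seagate-based estimate

def effective_operating_hours(spin_hours: float, power_cycles: int) -> float:
    """Hours of wear the drive effectively sees, counting each power cycle as 8 extra hours."""
    return spin_hours + HOURS_PER_POWER_CYCLE * power_cycles

# Example: a mostly idle archive disk spun up once a day for one hour.
print(effective_operating_hours(spin_hours=365, power_cycles=365))   # 3285 hours of wear per year
```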

12 Simulation results
- 1 PB archival data store
- Disks have an MTBF of 10^5 hours
- 10,000 disk drives
- 10 GB reliability blocks
- ~1 TB/day traffic
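A quick back-of-the-envelope check of why redundancy and scrubbing matter at this scale, assuming the simple MTBF-based failure rate sketched earlier: 10,000 drives at an MTBF of 10^5 hours see on the order of 876 whole-drive failures per year, roughly two or three per day.

```python
DRIVES = 10_000
MTBF_HOURS = 1e5
HOURS_PER_YEAR = 8760

expected_failures_per_year = DRIVES * HOURS_PER_YEAR / MTBF_HOURS
print(expected_failures_per_year)   # 876.0, i.e. roughly 2-3 whole-drive failures per day
```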

13 Simulation results
- Two redundancy schemes: two-way mirroring and RAID 5.
- The mean scrub rate for a single disk is set to 3 times per year for the random and deterministic schemes.
- For the opportunistic scheme, we scrub the disk no more than 3 times per year.

14 Two-way Mirroring

15 RAID 5 redundancy scheme

16 Result analysis
- When no scrubbing is done, there is a great deal of data loss.
- Random scrubbing performs the worst of the three methods.
- The opportunistic scheme provides high reliability when data access is relatively frequent, but the number of data losses increases when data access is infrequent.
- For systems where data is infrequently accessed, we must power disks on periodically to scrub them, in addition to scrubbing when the drive is accessed normally.

17 Conclusion
- When dealing with a system that contains a large number of disks, disk failures are likely to occur.
- For a redundancy method to be effective, error detection needs to happen as soon as possible.
- Disk scrubbing is an essential technique in a large storage system.
- Opportunistic scrubbing is an attractive scheme that allows errors to be detected without unnecessarily power cycling disk drives.

