Section 13.4 - Disk Failures Kevin Grant 007512375.

Disk Failures – Common Problems
Intermittent Failure: An attempt to read or write a sector fails, but a repeated attempt succeeds.
Media Decay: One or more bits of a sector become permanently corrupted, so the sector cannot be read correctly no matter how many times the read is retried.

Disk Failures – Common Problems
Write Failure: An attempt to write a sector fails, and the previously written contents of that sector can no longer be retrieved either. One possible cause is a power outage during the write.
Disk Crash: The entire disk becomes unreadable, suddenly and permanently.

Intermittent Failures
An intermittent failure occurs when we try to read a sector but the correct content of that sector is not delivered to the disk controller. The controller usually retries the read up to a fixed limit, such as 100 attempts.
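The retry behaviour can be sketched as a small loop. This is only an illustration: `read_sector` is a hypothetical callback, not a real controller API, and it is assumed to return `None` when a read is detected as bad. The limit of 100 is the figure from the slide.

```python
from typing import Callable, Optional

MAX_TRIES = 100  # retry limit mentioned on the slide; an assumption, not a standard


def read_with_retries(
    read_sector: Callable[[], Optional[bytes]],
    max_tries: int = MAX_TRIES,
) -> Optional[bytes]:
    """Call read_sector until it returns data or the retry limit is reached.

    read_sector is a hypothetical callback that returns the sector contents,
    or None when the read was detected as bad.
    """
    for _ in range(max_tries):
        data = read_sector()
        if data is not None:
            return data  # the intermittent failure cleared on this attempt
    return None  # every attempt failed: treat the sector as unreadable
```

If the loop runs out of attempts, the failure is no longer intermittent and has to be handled as one of the permanent failure types.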

How to cut down on these problems?

Checksums
Each sector has additional bits called the checksum. These bits are set based on the values of the data bits in the sector. If, on reading, the stored checksum differs from the checksum computed from the data bits, then an error occurred during reading. One simple form of checksum is based on the parity of the bits in the sector (example on the next slide).

Parity-based Checksum Examples
If the sector is composed of bits 01101000, there is an odd number of 1s, so the parity bit is 1; appending it to the original bits gives 011010001.
If the sector is composed of bits 11101110, there is an even number of 1s, so the parity bit is 0; appending it gives 111011100.
A single parity bit always detects a one-bit error, but when several bits are corrupted there is a 50% chance that the number of 1s stays even and the error goes undetected.
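The two examples can be reproduced with a short sketch. The function names are made up for illustration, and bit strings stand in for real sector contents.

```python
def parity_bit(bits: str) -> str:
    """Return '1' if bits contains an odd number of 1s, else '0' (even parity)."""
    return str(bits.count("1") % 2)


def add_parity(bits: str) -> str:
    """Append the parity bit so the stored string has an even number of 1s."""
    return bits + parity_bit(bits)


def check_parity(stored: str) -> bool:
    """A stored sector passes the check when its total number of 1s is even."""
    return stored.count("1") % 2 == 0


print(add_parity("01101000"))     # 011010001
print(add_parity("11101110"))     # 111011100
print(check_parity("111010001"))  # one bit flipped: False, error detected
print(check_parity("101010001"))  # two bits flipped: True, error missed
```

The last line shows the weakness of a single parity bit: flipping two bits leaves the count of 1s even, so the corrupted sector still passes.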

Parity-based Checksum
Keeping several parity bits improves our chances of detecting an error. With 8 parity bits, each bit on its own has a 50% chance of missing an error, so the probability that all 8 miss it is 0.5^8 = 1/256. In general, using N independent parity bits as the checksum leaves a 1/2^N probability that an error is not detected.
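The slide does not say how the 8 parity bits are assigned. One possible arrangement, assumed here purely for illustration, is one parity bit per bit position across all bytes of the sector, which amounts to XOR-ing the bytes together. The helper names are hypothetical.

```python
from functools import reduce


def byte_parity(sector: bytes) -> int:
    """XOR all bytes of the sector.

    Bit i of the result is 1 exactly when bit position i holds an odd number
    of 1s across the sector, so the result packs 8 even-parity bits.
    """
    return reduce(lambda a, b: a ^ b, sector, 0)


def undetected_probability(n_parity_bits: int) -> float:
    """1/2^N, assuming each parity bit misses an error independently with probability 1/2."""
    return 0.5 ** n_parity_bits


print(hex(byte_parity(bytes([0x68, 0xEE]))))  # 0x86
print(undetected_probability(8))              # 0.00390625, i.e. 1/256
```

The 1/2^N figure rests on the independence assumption in the docstring; it describes heavily corrupted sectors, not single-bit errors, which any parity bit covering the flipped bit always catches.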

Stable Storage
Stable storage guards against the case where an error occurs while overwriting a sector and both the old and the new data of that sector are lost. Each sector is kept as a pair: for a sector X we store two copies, XL and XR. The read policy alternates between XL and XR until one of them returns a good value (one whose checksum is correct); that value is taken to be the true X.
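A minimal sketch of the alternating read policy follows. `read_left` and `read_right` are hypothetical callbacks for XL and XR that return `None` when the checksum test fails, and the bound on rounds is an assumption added so the loop terminates.

```python
from typing import Callable, Optional

Reader = Callable[[], Optional[bytes]]  # returns None when the checksum is bad


def stable_read(read_left: Reader, read_right: Reader,
                max_rounds: int = 100) -> Optional[bytes]:
    """Alternate between XL and XR until one copy returns a good value."""
    for _ in range(max_rounds):
        for read in (read_left, read_right):  # try XL first, then XR
            value = read()
            if value is not None:
                return value  # a good read is taken to be the true X
    return None  # neither copy could be read within the bound
```

With this policy, X is lost only if both copies are unreadable at the same time.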

Stable Storage - Operation
1. Write the value of X into XL.
2. Check that the parity check bits are correct in the written copy. If not, attempt the write again.
3. If the write is still unsuccessful after a set number of retries, then XL has a media failure, and we must allocate another sector for XL and perform these steps again.
4. Perform steps 1-3 for XR.
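The write steps above can be sketched as follows. The model is deliberately simple and entirely hypothetical: a "sector" is a callable that attempts the write and returns `True` when the written copy passes its parity check, and `spares` is a list of replacement sectors. The retry limit of 3 is arbitrary.

```python
from typing import Callable, List, Tuple

# Hypothetical model: calling a sector writes the value and reports whether
# the written copy passed its parity check.
Sector = Callable[[bytes], bool]


def write_one_copy(sector: Sector, spares: List[Sector], value: bytes,
                   max_tries: int = 3) -> Sector:
    """Steps 1-3: write, verify, retry; move to a spare on media failure.

    Returns the sector that finally holds a verified copy of value.
    """
    while True:
        for _ in range(max_tries):
            if sector(value):
                return sector
        if not spares:
            raise OSError("no spare sectors left")
        sector = spares.pop(0)  # media failure: allocate another sector


def stable_write(left: Sector, right: Sector, spares: List[Sector],
                 value: bytes, max_tries: int = 3) -> Tuple[Sector, Sector]:
    """Step 4: finish XL completely before starting on XR."""
    new_left = write_one_copy(left, spares, value, max_tries)
    new_right = write_one_copy(right, spares, value, max_tries)
    return new_left, new_right
```

Writing the copies one after the other is the point of the design: if a failure interrupts the write of XL, XR still holds the old value, and if it interrupts the write of XR, XL already holds the new one.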
