Error Correcting Memory EECS 373 Jon Beaumont Ben Mason.

1 Error Correcting Memory EECS 373 Jon Beaumont Ben Mason

2 What is ECC? Error Correcting Code (ECC) is a mechanism that lets a system detect, and in many schemes correct, errors in stored or transmitted data, so that the data it works with is reliable

3 Why ECC? ECC protects against both soft errors and transmission errors. This is particularly necessary in systems that must run continuously with very low tolerance for error

4 What happens after a Soft Error? Incorrect values in the instruction or data streams. Best case: execution of illegal instructions or memory addresses, followed by an automatic reboot. Worst case: the error goes undetected and multiplies as data is used to calculate new data. http://www.eetimes.com/design/programmable-logic/4390101/Enabling-error-resilience-throughout-the-embedded-system

5 ECC vs No-ECC

6 ECC Considerations What range of errors? How much overhead? Detection versus Correction

7 Different Methods of Memory Correction Detection Parity bit Detection and correction Triple-redundancy Hamming Code

8 Parity Bit (even parity) For every chunk of data, add a single parity bit, set so that the total number of binary 1's is even. An odd number of binary 1's means an error has occurred

9 Parity Bit (even parity) Raw data: 1001011 (four 1’s), 0101111 (five 1’s). Prepend a parity bit: 1001011 -> 01001011, 0101111 -> 10101111
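The even-parity rule can be sketched in a few lines of Python. This is an illustrative sketch only: the function names add_even_parity and parity_ok are made up for this example, and data is modelled as strings of '0' and '1' characters rather than real memory words.

```python
def add_even_parity(bits: str) -> str:
    """Prepend an even-parity bit to a string of '0'/'1' characters."""
    parity = bits.count("1") % 2  # 1 if the data holds an odd number of 1s
    return str(parity) + bits


def parity_ok(word: str) -> bool:
    """True if the word (parity bit included) holds an even number of 1s."""
    return word.count("1") % 2 == 0


print(add_even_parity("1001011"))  # 01001011
print(add_even_parity("0101111"))  # 10101111
print(parity_ok("01001011"))       # True: unmodified word
print(parity_ok("01001111"))       # False: one bit flipped
print(parity_ok("01001101"))       # True: two bits flipped, error is missed
```

The last call shows the limitation of a single parity bit: an even number of flipped bits leaves the parity unchanged.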

10 Parity Bit Cons: Can detect only an odd number of errors. No way to detect which bit caused an error; can only discard the data. Pros: Simple to implement (XOR). Low overhead. Good for applications in which the original data can be easily resent/recalculated (e.g. SCSI, PCI, UART)

11 Triple Redundancy Data is calculated and stored 3 times. Majority wins. Pros: Simple to execute. Can correct errors (potentially multiple bits, as long as no bit position is wrong in more than one copy). Cons: Very inefficient (1:2 data-to-overhead ratio)
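A minimal sketch of the majority vote, assuming the three copies are held as Python integers. The names tmr_encode and tmr_vote are hypothetical and chosen only for this illustration.

```python
def tmr_encode(value: int) -> tuple:
    """Store the same value three times."""
    return (value, value, value)


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit is 1 if at least two copies agree on 1."""
    return (a & b) | (a & c) | (b & c)


copies = tmr_encode(0b1101)
# Flip a different bit in two of the three copies.
corrupted = (copies[0] ^ 0b0100, copies[1], copies[2] ^ 0b0001)
print(bin(tmr_vote(*corrupted)))  # 0b1101
```

Because the vote is taken per bit position, errors in different positions of different copies are all corrected; two copies wrong in the same position would outvote the good one.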

12 Hamming Code Objective: A concise method of identifying the precise location of an error, so that it can be corrected without drastic action. Intuition: include multiple parity bits, so that each data bit is uniquely identified by the set of parity bits that cover it

13 Hamming Code Algorithm: Assign each position in a chunk of data a binary number, starting from 1. Those positions that are a power of 2 (i.e. have exactly one 1 bit: 1, 2, 4, ...) are parity bits; the remaining positions hold data bits

14 Hamming Code Algorithm: Parity bits cover all data bits whose binary position shares a common 1 bit [D7, D5, D3, P1]

15 Hamming Code Algorithm: Parity bits cover all data bits whose binary position shares a common 1 bit [D7,D6, D3, P2]

16 Hamming Code Algorithm: Parity bits cover all data bits whose binary position shares a common 1 bit [D7, D6, D5, P4]
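The coverage rule on the last three slides amounts to a bitwise AND between a position number and the parity bit's own position. The sketch below lists the covered positions for a 7-bit codeword; covered_positions is a name invented for this example.

```python
def covered_positions(parity_pos: int, length: int = 7) -> list:
    """Positions (1-based) whose binary index shares a 1 bit with parity_pos."""
    return [pos for pos in range(1, length + 1) if pos & parity_pos]


for parity_pos in (1, 2, 4):
    print(f"P{parity_pos} covers positions {covered_positions(parity_pos)}")
# P1 covers positions [1, 3, 5, 7]
# P2 covers positions [2, 3, 6, 7]
# P4 covers positions [4, 5, 6, 7]
```

Each list includes the parity bit's own position, which is why checking a group on read simply means counting all the 1 bits in it.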

17 Hamming Code Example: Encoding the following nibble using even-parity: b1101 Allocate space for parity bits: b110_1__

18 Hamming Code Example: Encoding the following nibble using even-parity: b1101 P1 covers [D7,D5,D3] b110_1_?

19 Hamming Code Example: Encoding the following nibble using even-parity: b1101 P1 covers [D7,D5,D3] b110_1_0

20 Hamming Code Example: Encoding the following nibble using even-parity: b1101 P2 covers [D7,D6,D3] b110_1?0

21 Hamming Code Example: Encoding the following nibble using even-parity: b1101 P2 covers [D7,D6,D3] b110_110

22 Hamming Code Example: Encoding the following nibble using even-parity: b1101 P4 covers [D7,D6,D5] b110?110

23 Hamming Code Example: Encoding the following nibble using even-parity: b1101 P4 covers [D7,D6,D5] b1100110

24 Hamming Code Example: Encoding the following nibble using even-parity: b1101 Encoded data b1100110
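The encoding walk-through can be reproduced with a short sketch. It assumes the nibble is given in the order D7 D6 D5 D3 and that the codeword is written from position 7 down to position 1, as on the slides; hamming74_encode is a hypothetical helper name.

```python
def hamming74_encode(nibble: str) -> str:
    """Encode 4 data bits (order D7 D6 D5 D3) into a 7-bit even-parity codeword."""
    d7, d6, d5, d3 = (int(bit) for bit in nibble)
    p1 = d7 ^ d5 ^ d3  # makes positions 7, 5, 3, 1 hold an even number of 1s
    p2 = d7 ^ d6 ^ d3  # makes positions 7, 6, 3, 2 hold an even number of 1s
    p4 = d7 ^ d6 ^ d5  # makes positions 7, 6, 5, 4 hold an even number of 1s
    # Codeword layout, position 7 down to 1: D7 D6 D5 P4 D3 P2 P1
    return "".join(str(bit) for bit in (d7, d6, d5, p4, d3, p2, p1))


print(hamming74_encode("1101"))  # 1100110
```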

25 Hamming Code D6 gets flipped between write and read b1100110 -> b1000110

26 Hamming Code D6 gets flipped between write and read b1100110 -> b1000110 Parity bit 1 checks positions 7, 5, 3, 1 of b1000110 (bits 1, 0, 1, 0): even number of 1 bits -> No Error

27 Hamming Code D6 gets flipped between write and read b1100110 -> b1000110 Parity bit 2 checks positions 7, 6, 3, 2 of b1000110 (bits 1, 0, 1, 1): odd number of 1 bits -> ERROR Parity bits generating error: [P2]

28 Hamming Code D6 gets flipped between write and read b1100110 -> b1000110 Parity bit 4 checks positions 7, 6, 5, 4 of b1000110 (bits 1, 0, 0, 0): odd number of 1 bits -> ERROR Parity bits generating error: [P2, P4]

29 Hamming Code X = ERROR, O = NO ERROR
      D3  D5  D6  D7
P1    O   O       O
P2    X       X   X
P4        X   X   X
Only column with just X's is D6, the incorrect bit
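Equivalently, adding up the positions of the failing parity bits (2 + 4 = 6) gives the position of the flipped bit. The sketch below applies that check to a 7-bit word written from position 7 down to position 1; hamming74_correct is a name made up for this example, and it assumes at most one bit has flipped.

```python
def hamming74_correct(word: str) -> tuple:
    """Return (corrected_word, error_position); position 0 means no error found."""
    # Map position numbers 7..1 onto the characters of the word.
    bits = {pos: int(bit) for pos, bit in zip(range(7, 0, -1), word)}
    syndrome = 0
    for parity_pos in (1, 2, 4):
        group = [bits[pos] for pos in range(1, 8) if pos & parity_pos]
        if sum(group) % 2:          # odd number of 1s: this parity check fails
            syndrome += parity_pos
    if syndrome:
        bits[syndrome] ^= 1         # flip the bit the failing checks point at
    corrected = "".join(str(bits[pos]) for pos in range(7, 0, -1))
    return corrected, syndrome


print(hamming74_correct("1000110"))  # ('1100110', 6)
print(hamming74_correct("1100110"))  # ('1100110', 0)
```

With two or more flipped bits the syndrome points at the wrong position, so this plain 7-bit scheme is only safe when single-bit errors are the expected case.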

30 Hamming Code Pros: Overhead of only O(log(n)) parity bits for n data bits. 4 data bits -> 3 parity bits (57% of the codeword is data). 247 data bits -> 8 parity bits (97% of the codeword is data). Good for large chunks of memory (DRAM). Cons: More complicated to implement detection logic than a simple parity bit
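The logarithmic overhead follows from the requirement that r parity bits must be able to name every position in the codeword plus the "no error" case, i.e. 2^r >= m + r + 1 for m data bits. With r = 8 that allows at most 247 data bits. A small sketch of the calculation, using the hypothetical helper name parity_bits_needed:

```python
def parity_bits_needed(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error-correcting Hamming code)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


for m in (4, 11, 26, 57, 120, 247):
    r = parity_bits_needed(m)
    print(f"{m} data bits -> {r} parity bits ({round(100 * m / (m + r))}% data)")
# 4 data bits -> 3 parity bits (57% data)
# 11 data bits -> 4 parity bits (73% data)
# 26 data bits -> 5 parity bits (84% data)
# 57 data bits -> 6 parity bits (90% data)
# 120 data bits -> 7 parity bits (94% data)
# 247 data bits -> 8 parity bits (97% data)
```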

31 Drawbacks of ECC More expensive. When the error-correcting algorithm acts on a shorter correction code, performance drops abruptly. This loss of performance is known as the “error floor phenomenon”

32 Recent Developments in ECC Moving away from the Hamming code scheme towards BCH codes, which are more efficient. For more information visit http://www.princeton.edu/~achaney/tmve/wiki100k/docs/BCH_code.html

33 Questions?

