Error Detection in Hardware VO Hardware-Software-Codesign Philipp Jahn.


1 Error Detection in Hardware VO Hardware-Software-Codesign Philipp Jahn

2 Error detection (6.6.2007, Error Detection in Hardware)
• How to detect errors with hardware methods during system operation
• Conditions
  • Coverage (probability that an error is detected)
  • Latency (time between start of error and detection)
  • Performance
Slide from VO „Echtzeitsysteme“, H. Kopetz

3 Hardware-based error detection
• Hardware redundancy
  • Passive (TMR, majority voting)
  • Active (duplication and comparison, standby)
  • Hybrid
• Information redundancy
  • Parity
  • Checksums
  • Arithmetic codes
• Time redundancy
• Watchdog timers
• Checking
  • Capability checking
  • Consistency checking
  • Control-flow checking
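The two hardware-redundancy styles named above can be sketched in a few lines. This is only an illustration, assuming three (or two) replicas that each deliver an integer word; the function names are hypothetical and not taken from the slides.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority, as a TMR voter would compute it per bit."""
    return (a & b) | (a & c) | (b & c)


def replicas_disagree(a: int, b: int) -> bool:
    """Duplication and comparison: a mismatch signals an error (no masking)."""
    return a != b


good = 0b1011
faulty = good ^ 0b0100          # one replica with a single flipped bit

# Passive redundancy masks the fault: the voter still outputs the good word.
voted = majority_vote(good, faulty, good)

# Active redundancy only detects the fault; recovery is a separate step.
detected = replicas_disagree(good, faulty)
```

The voter hides a single faulty replica without signalling anything, whereas the comparator reports the error but cannot tell which replica is wrong.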

4 Information redundancy (1)
• Detection / Correction
• Hamming distance
  • X = (1001), Y = (0111)
  • d(X,Y) = 3
• SEC-DED
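The Hamming distance from the slide can be checked with a short sketch. Code words are assumed to be given as bit strings; the two helper functions state the standard relation between the minimum distance of a code and what it can detect or correct, and their names are illustrative only.

```python
def hamming_distance(x: str, y: str) -> int:
    """Number of positions in which two equal-length code words differ."""
    if len(x) != len(y):
        raise ValueError("code words must have equal length")
    return sum(a != b for a, b in zip(x, y))


def detectable_errors(d_min: int) -> int:
    """Bit errors guaranteed detectable when the code is used for detection only."""
    return d_min - 1


def correctable_errors(d_min: int) -> int:
    """Bit errors guaranteed correctable for a code with minimum distance d_min."""
    return (d_min - 1) // 2


print(hamming_distance("1001", "0111"))   # 3, as on the slide
print(correctable_errors(4))              # 1
```

A SEC-DED code uses minimum distance 4: it corrects one error and, at the same time, still detects a second one.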

5 Information redundancy (2)
• Parity
  • One extra bit (even / odd)
  • Decoding circuit (set of XOR gates)
  • Routine checking in busses, memory and registers
  • Detecting single-bit errors (no stuck-at faults)
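A minimal sketch of the parity scheme, assuming data words are plain integers and the parity bit is stored next to the word. The function names are hypothetical. It also shows the known limit of a single parity bit: an even number of flipped bits goes unnoticed.

```python
def parity_bit(word: int, even: bool = True) -> int:
    """Extra bit that makes the total number of 1s even (or odd)."""
    bit = bin(word).count("1") % 2   # same result as XOR-ing all data bits
    return bit if even else bit ^ 1


def parity_ok(word: int, stored_bit: int, even: bool = True) -> bool:
    """Recompute the parity on read and compare it with the stored bit."""
    return parity_bit(word, even) == stored_bit


word = 0b1011001                 # four 1s
stored = parity_bit(word)        # even parity bit is 0

single_flip = word ^ 0b0000001   # one bit corrupted
double_flip = word ^ 0b0000011   # two bits corrupted

print(parity_ok(single_flip, stored))   # False: error detected
print(parity_ok(double_flip, stored))   # True: error missed
```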

6 Information redundancy (3)
• Overlapping parity
• m-of-n codes
• Duplication codes
• Cyclic redundancy checks
  • Sender and receiver agree upon a generator polynomial G(x)
  • Append checksum (k bits) at the end of the data frame (n-k bits)
  • Remainder of received frame / G(x) = 0 → correct
  • Simple implementation (linear feedback shift register and XOR gates)
  • Detect single-bit errors, multiple adjacent bit errors affecting fewer than k bits, and burst transient errors
  • Highly successful in serial transmission (communication channels: Ethernet, Token Ring)
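The CRC procedure can be sketched with integer arithmetic, where each integer holds the coefficients of a polynomial over GF(2). The generator 0b1011 (x^3 + x + 1) and the data word are arbitrary choices for illustration, not values from the slides; real links use much longer generators.

```python
def mod2_remainder(value: int, generator: int) -> int:
    """Remainder of polynomial division over GF(2) (subtraction is XOR)."""
    k = generator.bit_length() - 1            # degree of G(x) = checksum length
    while value.bit_length() > k:
        shift = value.bit_length() - generator.bit_length()
        value ^= generator << shift           # cancel the current leading term
    return value


def crc_encode(data: int, generator: int) -> int:
    """Append the k-bit checksum so that the whole frame is divisible by G(x)."""
    k = generator.bit_length() - 1
    shifted = data << k
    return shifted | mod2_remainder(shifted, generator)


def crc_ok(frame: int, generator: int) -> bool:
    """Receiver side: a zero remainder means no error was detected."""
    return mod2_remainder(frame, generator) == 0


G = 0b1011                         # assumed generator: x^3 + x + 1
frame = crc_encode(0b11010011101100, G)

print(crc_ok(frame, G))            # True: frame arrives unchanged
print(crc_ok(frame ^ (1 << 5), G)) # False: single-bit error is caught
```

The loop is the software counterpart of the linear feedback shift register mentioned on the slide: each iteration XORs the generator into the running value, one leading bit at a time.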

7 Information redundancy (4)
• Checksums

8 Information redundancy (5)
• Arithmetic codes
  • Detect errors in arithmetic units (parity would not be preserved)
  • Separate or nonseparate
  • Examples
    • AN codes
    • Residue codes
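Both example codes can be illustrated for addition. The sketch assumes the constant A = 3 for the AN code and the modulus M = 3 for the residue code; both values and all function names are choices made here for illustration.

```python
A = 3   # assumed AN-code constant
M = 3   # assumed residue-code modulus


def an_encode(x: int) -> int:
    """Nonseparate code: data and check information are merged into A * x."""
    return A * x


def an_valid(code_word: int) -> bool:
    """Valid code words are multiples of A; sums of valid words stay valid."""
    return code_word % A == 0


def residue_add(x: int, rx: int, y: int, ry: int) -> tuple[int, int]:
    """Separate code: the residue is carried through its own small adder."""
    return x + y, (rx + ry) % M


def residue_valid(value: int, residue: int) -> bool:
    return value % M == residue


# AN code: 3*5 + 3*7 = 36 = 3*12, so the sum is again a code word.
total = an_encode(5) + an_encode(7)
print(an_valid(total), total // A)        # True 12
print(an_valid(total ^ 0b100))            # False: a flipped bit breaks divisibility

# Residue code: the check part is computed independently of the data part.
s, rs = residue_add(5, 5 % M, 7, 7 % M)
print(residue_valid(s, rs))               # True
print(residue_valid(s ^ 0b1, rs))         # False
```

With A = 3 any single-bit error changes the value by a power of two, which is never a multiple of 3, so it is always caught.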

9 Time redundancy (1)
• Repeat a computation two or more times and compare the results (detection, or correction by majority)
• Error detected → maybe retry
• Good for detecting transient faults
• Does not protect against errors resulting from permanent faults
• No extra hardware needed, but longer processing time
• Suited to non-time-critical applications
• Alternating logic also detects permanent faults (self-checking circuits, f(x) = f'(x'))
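A rough sketch of repetition and comparison, assuming the computation is a Python callable and that a transient fault corrupts only one of the runs. `TransientSquare` is a hypothetical stand-in for a faulty unit, invented here to make the behaviour visible.

```python
from collections import Counter


def run_with_time_redundancy(compute, arg, repeats: int = 3):
    """Run the same computation several times; return (majority value, error flag)."""
    results = [compute(arg) for _ in range(repeats)]
    value, votes = Counter(results).most_common(1)[0]
    if votes <= repeats // 2:
        raise RuntimeError("no majority: retry or escalate")
    return value, votes < repeats          # flag is True if any run disagreed


class TransientSquare:
    """Squares its input but returns a wrong value on selected calls."""

    def __init__(self, faulty_calls):
        self.faulty_calls = set(faulty_calls)
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        result = x * x
        return result + 1 if self.calls in self.faulty_calls else result


# Transient fault on the second run: outvoted and reported.
print(run_with_time_redundancy(TransientSquare({2}), 4))   # (16, True)

# Permanent fault: every run is wrong in the same way, so nothing is noticed.
print(run_with_time_redundancy(lambda x: x * x + 1, 4))    # (17, False)
```

The second call shows the limitation stated on the slide: identical repetitions cannot expose a permanent fault.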

10 Time redundancy (2)
• Handle permanent faults by encoding the second computation (the encoding must not alter the calculation), e.g. k-shift
• Errors in k-1 consecutive bits of an arithmetic or logical operation are detected
• Additional hardware (two shifters, storage register, comparator)
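The shift encoding can be sketched for an adder. The example assumes a permanent stuck-at-0 fault on one output bit and unbounded integers (so the shifted operands never overflow); `make_adder` and `add_with_shifted_recompute` are hypothetical names.

```python
def make_adder(stuck_at_zero_bit=None):
    """Adder model; optionally one output bit is permanently stuck at 0."""
    def adder(a: int, b: int) -> int:
        s = a + b
        if stuck_at_zero_bit is not None:
            s &= ~(1 << stuck_at_zero_bit)
        return s
    return adder


def add_with_shifted_recompute(adder, a: int, b: int, k: int = 1):
    """Compute once normally, once with operands shifted left by k, then compare."""
    first = adder(a, b)
    second = adder(a << k, b << k) >> k     # shift the result back before comparing
    return first, first != second


healthy = make_adder()
faulty = make_adder(stuck_at_zero_bit=2)

print(add_with_shifted_recompute(healthy, 3, 2))   # (5, False)
print(add_with_shifted_recompute(faulty, 3, 2))    # (1, True): mismatch detected
print(faulty(3, 2) == faulty(3, 2))                # True: plain repetition sees nothing
```

Because the shifted run moves the data to different bit positions, the stuck bit corrupts the two results differently. Detection still depends on the operands activating the fault in at least one of the two runs.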

11 Watchdog timers
• Implemented in hardware (external timer) or software (process)
• If the timer expires → system reset or recovery
• Detects only a very specific error type: control-flow errors
• If an error occurs but the timer is still reset → no detection
• Difficult to determine the runtime (and thus the timeout)
• High detection latency
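The watchdog principle can be sketched with a logical clock instead of real time, which keeps the example deterministic. The class and its method names are assumptions made for this illustration; a real watchdog would trigger a reset or recovery routine where this sketch only sets a flag.

```python
class Watchdog:
    """Down-counter that must be reset ("kicked") before it reaches zero."""

    def __init__(self, timeout_ticks: int):
        self.timeout = timeout_ticks
        self.counter = timeout_ticks
        self.expired = False

    def kick(self) -> None:
        """Called by the monitored program to signal that it is still alive."""
        self.counter = self.timeout

    def tick(self) -> bool:
        """Advance the clock by one tick; return True once the timer has expired."""
        if self.counter > 0:
            self.counter -= 1
        if self.counter == 0:
            self.expired = True
        return self.expired


wd = Watchdog(timeout_ticks=3)

# Healthy phase: the program kicks the watchdog in time.
wd.tick()
wd.tick()
wd.kick()

# The program hangs: no more kicks, so the third tick expires the timer.
print([wd.tick() for _ in range(3)])   # [False, False, True]
```

The error is only noticed when the full timeout has elapsed, which is the high detection latency listed above.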

12 Capability & Consistency Checking
• Capability checking limits access to objects (e.g. memory segments) to authorized users (processes)
  • Implemented in hardware (error traps) or software (firewall)
  • e.g. checking of address validity by MMU
• Consistency checking determines if states or results are reasonable
  • e.g. range checking, address checking, opcode checking
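Both kinds of check reduce to simple predicates. The segment table, process names and value limits below are invented for illustration; in hardware the first check would be performed by the MMU rather than by a lookup like this.

```python
# Hypothetical segment table: which address range each process may access.
SEGMENTS = {
    "task_a": range(0x1000, 0x2000),
    "task_b": range(0x2000, 0x3000),
}


def access_allowed(process: str, address: int) -> bool:
    """Capability check: the address must lie inside the process's own segment."""
    return address in SEGMENTS.get(process, range(0))


def in_plausible_range(value: float, low: float, high: float) -> bool:
    """Consistency check: reject results outside the physically sensible range."""
    return low <= value <= high


print(access_allowed("task_a", 0x1800))       # True
print(access_allowed("task_a", 0x2800))       # False: belongs to task_b
print(in_plausible_range(451.0, -40.0, 125.0))  # False: implausible sensor value
```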

13 Control-Flow Checking (1)
• Hardware scheme
  • Divide the application program into blocks
  • Each block has a single entry and exit point
  • A reference signature represents an encoding of the correct execution
  • A watchdog processor validates the application program by comparing the run-time signature with the reference signature
• 70% of transient faults lead to control-flow errors
• Limitations
  • Only suitable for processors running single programs (not multiple processes or threads)
  • Reduced coverage if transmission errors occur on the bus to the watchdog processor
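The signature comparison can be sketched in software. The example assumes that a block's signature is a CRC-32 over its instruction text (computed with the standard `zlib.crc32`); the block names, the instruction strings and the choice of CRC-32 are illustrative and not taken from the slides.

```python
import zlib


def signature(instructions) -> int:
    """Compress a block's instruction stream into one 32-bit signature."""
    return zlib.crc32("\n".join(instructions).encode())


# Hypothetical program, already divided into single-entry/single-exit blocks.
PROGRAM = {
    "B0": ["LOAD r1, [x]", "LOAD r2, [y]", "ADD r1, r2"],
    "B1": ["STORE [z], r1", "RET"],
}

# Reference signatures, computed ahead of time (e.g. by a modified assembler).
REFERENCE = {name: signature(block) for name, block in PROGRAM.items()}


def watchdog_check(block_name: str, observed_instructions) -> bool:
    """Watchdog side: recompute the signature at run time and compare."""
    return signature(observed_instructions) == REFERENCE[block_name]


# Correct execution of block B0 passes the check.
print(watchdog_check("B0", PROGRAM["B0"]))                 # True

# A corrupted instruction (r2 became r3) changes the run-time signature.
corrupted = ["LOAD r1, [x]", "LOAD r2, [y]", "ADD r1, r3"]
print(watchdog_check("B0", corrupted))                     # False
```

The check only fires at the end of a block, so the block size sets the detection latency.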

14 Control-Flow Checking (2)
• Signatured Instruction Stream (SIS)
  • Hardware: watchdog processor with cyclic code signature generator
  • Software: modified assembler and loader
• Control-flow checking using shadow processing

15 Summary
• Hardware mechanisms offer low error-detection latency
• Hardware is more expensive
  • e.g. massively parallel multiprocessors
• Combine several error detection mechanisms

16 References
• Ravishankar K. Iyer, Zbigniew Kalbarczyk - Hardware and Software Error Detection - Center for Reliable and High-Performance Computing, University of Illinois at Urbana-Champaign
• Hermann Kopetz - Real-Time Systems, Design Principles for Distributed Embedded Applications - 1997, 356 p., Hardcover, ISBN: 978-0-7923-9894-3
• Alireza Vahdatpour, Mahdi Fazeli, Seyed Ghassem Miremadi - Transient Error Detection in Embedded Systems Using Reconfigurable Components - IES, October 2006
• M. Dal Cin, W. Hohl, E. Michel, A. Pataricza - Error Detection Mechanisms for Massively Parallel Multiprocessors - IEEE Proceedings, 1993
• Evaluation of error detection coverage and fault-tolerance of digital plant protection system in nuclear power plants
• http://robotics.ee.uwa.edu.au/courses/faulttolerant/notes/FT2b.pdf
• A. Steininger, C. Scherrer - Identifying Efficient Combinations of Error Detection Mechanisms Based on Results of Fault Injection Experiments - IEEE Transactions on Computers, Vol. 51, No. 2, February 2002

