Error Detection in SW (Fehlererkennung in SW), David Rigler. Overview: types of error detection, fault/error classification, description of certain SW error detection techniques.


1 Error Detection in SW (Fehlererkennung in SW), David Rigler

2 Overview
- Types of error detection
- Fault/error classification
- Description of certain SW error detection techniques
- Evaluation (coverage / overhead)
- Conclusion

3 Failure Runtime Detection (in Software)
- Software diversity / N-version programming
- Defensive programming
  - Assertions
  - Bound/range checking
- Control flow checking
  - Block entry exit checking
  - Error capturing instructions
  - Advanced techniques …
- Redundant data/code
HW failures / SW failures

4 Transient Hardware Error Classification
- Data errors
- Code errors
  - Type S1: statements affecting data only
  - Type S2: statements affecting the execution flow
  - Type E1: errors changing the operation (not the control flow)
  - Type E2: errors changing the statement type (between S1 and S2)

5 Data Errors (Executable Assertions)
- Generic
  - Bound
  - Integrity
- For SW and HW errors
- Non-generic
  - Value range
  - Approximate (false alarm)
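The assertion kinds listed on this slide can be illustrated with a minimal sketch. All names here (`AssertionViolation`, `check_bound`, `check_range`, `check_rate`) and the temperature limits are hypothetical and chosen only for illustration; the slides do not prescribe an implementation.

```python
class AssertionViolation(Exception):
    """Raised when an executable assertion rejects a value."""


def check_bound(name, index, length):
    """Generic bound assertion: the index must address an existing element."""
    if not 0 <= index < length:
        raise AssertionViolation(f"{name}={index} is outside 0..{length - 1}")
    return index


def check_range(name, value, low, high):
    """Non-generic value-range assertion based on application knowledge."""
    if not low <= value <= high:
        raise AssertionViolation(f"{name}={value} is outside [{low}, {high}]")
    return value


def check_rate(name, previous, current, max_step):
    """Approximate assertion: a plausibility test that can raise false alarms."""
    if abs(current - previous) > max_step:
        raise AssertionViolation(f"{name} jumped from {previous} to {current}")
    return current


# Hypothetical sensor reading in degrees Celsius; the limits are illustrative.
temperature = check_range("temperature", 21, -40, 125)

# Simulate a single bit flip in the stored value (bit 9 set).
corrupted = temperature ^ (1 << 9)
```

Passing `corrupted` to `check_range` with the same limits raises `AssertionViolation`. A flip in a low-order bit would stay inside the range and go unnoticed, which is why range assertions alone give only partial coverage.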

6 Data Errors (Systematic Data Redundancy)
Rules
- Duplicate every variable: x -> (x1 and x2)
- Perform write operations on both x1 and x2
- On every read of x, check x1 and x2 for consistency
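The three rules can be sketched as a small wrapper class. This is an illustration under assumptions, not the pre-processor-based approach itself: the names `DuplicatedVar` and `ConsistencyError` are hypothetical, and a Python object only models the idea of two copies held in separate storage.

```python
class ConsistencyError(Exception):
    """Raised when the two copies of a variable disagree."""


class DuplicatedVar:
    """Holds a variable x as two copies, x1 and x2."""

    def __init__(self, value):
        self.x1 = value
        self.x2 = value

    def write(self, value):
        # Rule: every write operation updates both copies.
        self.x1 = value
        self.x2 = value

    def read(self):
        # Rule: every read checks the copies for consistency first.
        if self.x1 != self.x2:
            raise ConsistencyError(f"x1={self.x1} differs from x2={self.x2}")
        return self.x1


counter = DuplicatedVar(5)
counter.write(counter.read() + 2)   # consistent read, then duplicated write

# Simulate a single bit flip in one copy; the next read() would detect it.
counter.x1 ^= 1 << 3
```

After the injected flip, `counter.read()` raises `ConsistencyError` instead of returning a silently corrupted value.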

7 Data Errors (Systematic Data Redundancy)
- Generic approach
  - Use a pre-processor on the high-level language
- Compiler optimisations may be a problem
- All (visible) single bit-flip errors in data memory can be detected

8 Control Flow Errors
Block Entry Exit Checking
- Unique signatures for basic blocks
- Assign at entry
- Compare at exit
Problems
- Jumps within a block
- Granularity
- Jumps to unused areas
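A minimal sketch of block entry exit checking follows. The monitor class, the signature values, and the `skip_entry` switch used to simulate an erroneous jump into the middle of a block are all assumptions made for illustration.

```python
class ControlFlowError(Exception):
    """Raised when the exit signature does not match the entry signature."""


class SignatureMonitor:
    def __init__(self):
        self.current = None

    def enter(self, signature):
        # Assign the block's unique signature at block entry.
        self.current = signature

    def leave(self, signature):
        # Compare against the block's signature at block exit.
        if self.current != signature:
            raise ControlFlowError(
                f"expected signature {signature:#x}, found {self.current!r}"
            )
        self.current = None


SIG_A = 0x5A  # illustrative signature of basic block A
SIG_B = 0xA5  # illustrative signature of basic block B


def block_a(monitor, skip_entry=False):
    """Basic block A; skip_entry=True models a jump past the entry code."""
    if not skip_entry:
        monitor.enter(SIG_A)
    result = 1 + 2
    monitor.leave(SIG_A)
    return result


monitor = SignatureMonitor()
normal_result = block_a(monitor)   # regular entry and exit, no error
```

If execution is inside block B (`monitor.enter(SIG_B)`) and then lands in the middle of block A (`block_a(monitor, skip_entry=True)`), the exit comparison raises `ControlFlowError`. A jump that starts and ends inside the same block leaves the signature untouched and is not caught, which is the first problem the slide lists.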

9 Control Flow Errors
Duplicate Condition Checks
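The slide gives only the name of the technique. One common reading, assumed here, is that the condition of each branch is evaluated again at the start of both the true and the false branch, and a disagreement signals an error. The function `classify` and its `first_test` parameter (used to inject a wrong branch decision) are hypothetical.

```python
class ControlFlowError(Exception):
    """Raised when the repeated condition disagrees with the branch taken."""


def classify(x, limit, first_test=None):
    """Branch on x < limit and re-check the condition inside each branch.

    first_test overrides the first evaluation to simulate a fault that
    sends execution into the wrong branch.
    """
    taken = (x < limit) if first_test is None else first_test
    if taken:
        if not x < limit:   # duplicate check in the true branch
            raise ControlFlowError("true branch entered although x >= limit")
        return "below"
    else:
        if x < limit:       # duplicate check in the false branch
            raise ControlFlowError("false branch entered although x < limit")
        return "not below"
```

With a fault-free evaluation, `classify(3, 10)` returns `"below"`. With an injected wrong decision, `classify(30, 10, first_test=True)` raises `ControlFlowError`.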

10 Control Flow Errors
Error Capturing Instructions
- Special or unused instructions
  - Trap, SWI, …
- Spread over unused memory
  - Program memory
  - Data memory
- Call error handling function
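The idea can be modelled with a toy machine. The opcodes, the memory layout, and the interpreter below are invented for this sketch and do not correspond to any real instruction set; on real hardware the fill pattern would be an actual trap or software-interrupt instruction.

```python
class TrapError(Exception):
    """Stands in for the error handling function reached through a trap."""


NOP, HALT, TRAP = 0x00, 0x01, 0xFF   # toy opcodes, purely illustrative


def build_memory(program, size):
    """Place the program at address 0 and fill all unused cells with TRAP."""
    if len(program) > size:
        raise ValueError("program does not fit into memory")
    memory = [TRAP] * size
    memory[:len(program)] = program
    return memory


def run(memory, pc=0, max_steps=1000):
    """Execute from pc; return the address of HALT or raise TrapError."""
    for _ in range(max_steps):
        if not 0 <= pc < len(memory):
            raise TrapError(f"program counter {pc} left the memory")
        opcode = memory[pc]
        if opcode == TRAP:
            raise TrapError(f"error capturing instruction hit at address {pc}")
        if opcode == HALT:
            return pc
        pc += 1   # NOP: fall through to the next cell
    raise RuntimeError("step limit reached")


memory = build_memory([NOP, NOP, HALT], size=16)
halt_address = run(memory)   # normal execution stops at the HALT cell
```

Starting execution at an unused address, for example `run(memory, pc=8)`, models an erroneous jump into the unused area and raises `TrapError` immediately.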

11 Control Flow Errors
Watchdog Timer
- Periodically reset timer
- Take action at a specific timer value
- Needs hardware support
  - Common in embedded controllers
- Detects infinite-loop errors
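A real watchdog is a hardware counter. The sketch below only simulates its behaviour in software with explicit ticks so that it runs deterministically; the class, the tick model, and the `hangs_at` parameter are assumptions for illustration.

```python
class WatchdogExpired(Exception):
    """Stands in for the action taken at the timeout value, e.g. a reset."""


class Watchdog:
    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.counter = 0

    def kick(self):
        # The supervised program resets the timer periodically.
        self.counter = 0

    def tick(self):
        # One unit of time passes; act when the timeout value is reached.
        self.counter += 1
        if self.counter >= self.timeout_ticks:
            raise WatchdogExpired(f"no reset for {self.counter} ticks")


def run_task(watchdog, iterations, hangs_at=None):
    """Run a loop that costs one tick per iteration.

    From iteration hangs_at on, the task stops resetting the timer,
    which models a program stuck in an infinite loop.
    """
    for i in range(iterations):
        if hangs_at is None or i < hangs_at:
            watchdog.kick()
        watchdog.tick()
    return iterations


completed = run_task(Watchdog(timeout_ticks=3), iterations=10)
```

A healthy task completes all iterations. Calling `run_task(Watchdog(timeout_ticks=3), iterations=10, hangs_at=2)` raises `WatchdogExpired` shortly after the resets stop.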

12 Coverage Example 1
- BEEC (block entry exit checking), duplicate condition checks, systematic data redundancy
- Simulated bit-flip errors in memory
- ~5x performance slowdown
- ~2x size
- No silent violations (data)
- High coverage even for errors in the code area

13 Coverage Example 2
- Physical fault injection
  - Heavy-ion radiation
  - Power-supply disturbances
- Hardware WDT
- Effect of additional SW
  - 60% -> 85%

14 Improving Coverage
- Separate BB for redundant variables
- Separated in memory
  - No single bit-flip jumps
- Use cumulative signatures
  - Detect jumps within a block
- Avoid signature aliasing
  - Hamming distance
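The Hamming-distance point can be made concrete with a small sketch: if any two block signatures differ in several bits, a single bit flip cannot turn one valid signature into another. The greedy selection below and its parameters are assumptions for illustration, not a method taken from the slides.

```python
def hamming_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def pick_signatures(count, bits, min_distance):
    """Greedily pick signatures that are pairwise at least min_distance apart."""
    chosen = []
    for candidate in range(1 << bits):
        if all(hamming_distance(candidate, s) >= min_distance for s in chosen):
            chosen.append(candidate)
            if len(chosen) == count:
                return chosen
    raise ValueError("not enough signatures with the requested distance")


def aliases_after_single_flip(signatures, bits):
    """Return (signature, bit) pairs where one flip yields another valid signature."""
    valid = set(signatures)
    return [
        (s, bit)
        for s in signatures
        for bit in range(bits)
        if (s ^ (1 << bit)) in valid
    ]


signatures = pick_signatures(count=4, bits=8, min_distance=3)
aliasing = aliases_after_single_flip(signatures, bits=8)   # empty list
```

By contrast, consecutive integers such as `[0, 1, 2, 3]` alias under single bit flips, because 0 and 1 differ in only one bit.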

15 100% Coverage
- For a simple failure model
  - Single bit-flip
  - Data and code memory / registers
  - Hidden registers not included (branch buffer, cache tags, etc.)
- High overhead
  - ~4x memory usage
  - >3x time
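How coverage follows from the failure model can be shown with an exhaustive toy experiment. It models only one stored data word under single bit flips; code memory, registers, and hidden state are out of scope. The word width, the example value, and the range limits are assumptions, and the figures it produces are not the measurements reported on the slides.

```python
def duplication_coverage(value, width):
    """Fraction of single bit flips detected by comparing two copies."""
    trials = [(copy, bit) for copy in range(2) for bit in range(width)]
    detected = 0
    for copy, bit in trials:
        copies = [value, value]
        copies[copy] ^= 1 << bit          # inject one flip into one copy
        if copies[0] != copies[1]:        # consistency check on read
            detected += 1
    return detected / len(trials)


def range_check_coverage(value, low, high, width):
    """Fraction of single bit flips detected by a value-range assertion alone."""
    detected = 0
    for bit in range(width):
        corrupted = value ^ (1 << bit)
        if not low <= corrupted <= high:
            detected += 1
    return detected / width


full = duplication_coverage(value=20, width=8)
partial = range_check_coverage(value=20, low=0, high=100, width=8)
```

Under this model every single flip breaks the equality of the two copies, so duplication detects all of them, while the range assertion only catches flips that push the value outside its limits.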

16 Conclusion: Error Detection in SW
- Pure SW: high coverage only for simple failure models
- Addition to HW error detection
- Trade-off: overhead vs. coverage
  - Fine tuning possible
  - Use available resources (time, memory)


