Dependability ITV Real-Time Systems Anders P. Ravn Aalborg University February 2006
Characteristics of an RTS: Timing Constraints; Dependability Requirements; Concurrent control of separate components; Facilities to interact with special-purpose hardware
Dependability - attributes: Availability; Reliability; Safety; Confidentiality; Integrity; Maintainability. BW p. 139
Dependability - means: Fault prevention; Fault tolerance; Error Removal; Failure Forecasting. BW p. 106,...
Dependability - impediments: Faults; Errors; Failures. Fault → Error → Failure → ... → Fault. BW p. 103,...
System and Component
Fault classification. Origin: physical (internal/external), logical (design/interaction). Kind: omission, value, timing, byzantine. Property: duration (permanent, transient), consistency (determinate, nondeterminate), autonomy (spontaneous, event-dependent)
Error Classification (Fault → Error). Effect: latent, effective. Extent: local, distributed
Failure Classification (Fault → Failure). Consequence: benign, malign (a mishap). BW (Failure modes) p. 105
Fault Prevention. Careful Design: process (procedures), notations, tools. Conservative Design: robust functionality, testability, traceability
Error Removal: Verification (analysis of design); Test (analysis of implementation)
Calculation – analysis of design; Simulation – measurement on design; Test – measurement on implementation
Fault Tolerance: Means to isolate component faults and mask them; Prevents system failures; May increase system dependability
FT - levels: Full tolerance; Graceful Degradation; Fail safe. BW p. 107
FT basis: Redundancy. Time: Try, Retry, ... Space: Try, ... BW p. 109
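The two forms of redundancy can be illustrated with a minimal Python sketch. The function names `retry` and `try_replicas`, the default attempt count, and the use of exceptions to signal a detected error are assumptions made for illustration; they do not come from the slides or from BW.

```python
def retry(operation, attempts=3):
    """Time redundancy: repeat the same operation after a detected error."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as exc:  # assumed: errors surface as exceptions
            last_error = exc
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


def try_replicas(replicas, request):
    """Space redundancy: hand the request to spare components in turn."""
    for replica in replicas:
        try:
            return replica(request)
        except Exception:
            continue  # this replica failed, move on to the next one
    raise RuntimeError("all replicas failed")
```

Retrying only helps against transient faults; a permanent fault in the single component defeats it, which is where spare components come in.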
N-version programming: V1, V2, V3; Driver (comparator); Comparison vectors (votes); Comparison status indicators; Comparison points. BW p. 109
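A driver with majority voting might look roughly like the following Python sketch. It assumes that results are hashable and compared for exact equality, and that a crashed version simply casts no vote; the names `driver` and `_NO_VOTE` are hypothetical. Real N-version systems often need inexact comparison of votes, which this sketch does not attempt.

```python
from collections import Counter

_NO_VOTE = object()  # marker for a version that raised instead of voting


def driver(versions, inputs):
    """Run every version, compare the votes, return the majority result.

    Returns (value, status), where status[i] says whether version i
    agreed with the majority (the comparison status indicators).
    """
    votes = []
    for version in versions:
        try:
            votes.append(version(inputs))
        except Exception:
            votes.append(_NO_VOTE)
    usable = [v for v in votes if v is not _NO_VOTE]
    if not usable:
        raise RuntimeError("no version produced a vote")
    value, count = Counter(usable).most_common(1)[0]
    if count <= len(versions) // 2:
        raise RuntimeError("no majority among versions")
    status = [v is not _NO_VOTE and v == value for v in votes]
    return value, status


# Three independently written "versions" of squaring; the third is faulty.
v1 = lambda x: x * x
v2 = lambda x: x ** 2
v3 = lambda x: x * x + 1
result, status = driver([v1, v2, v3], 4)
```

Here `result` is 16 and `status` marks the third version as disagreeing, so a single faulty version is masked.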
Fault classification (scope of N-VP). Origin: physical (internal/external), logical (design/interaction). Kind: omission, value, timing, byzantine. Property: duration (permanent, transient), consistency (determinate, nondeterminate), autonomy (spontaneous, event-dependent). Coverage marks: + (+) ++ (+) + / (+) + / +
Dynamic Redundancy: 1. Error detection 2. Damage confinement and assessment 3. Error recovery 4. Fault treatment and continued service. BW p. 114
Error Detection. f: State × Input → State × Output. Detection: Environment (exception); Application. Assertion: precondition(input); postcondition(input, output); invariant(state, state’). Timing: WCET(f, input); Deadline(f, input) D. BW p. 115
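Application-level detection by assertions and a timing check could be sketched in Python as below. The wrapper `checked`, the exception `DetectedError`, and the example function `accumulate` are hypothetical names. The timing check here only measures elapsed time after the call has returned; it detects a missed deadline but is not a WCET analysis.

```python
import time


class DetectedError(Exception):
    """Raised when one of the run-time checks fails."""


def checked(f, pre, post, invariant, deadline_s):
    """Wrap f: (state, input) -> (state', output) with run-time checks."""
    def wrapper(state, inp):
        if not pre(inp):
            raise DetectedError("precondition violated")
        start = time.monotonic()
        new_state, out = f(state, inp)
        elapsed = time.monotonic() - start
        if elapsed > deadline_s:
            raise DetectedError("deadline missed")
        if not post(inp, out):
            raise DetectedError("postcondition violated")
        if not invariant(state, new_state):
            raise DetectedError("invariant violated")
        return new_state, out
    return wrapper


def accumulate(total, amount):
    """Example f: add a non-negative amount to a running total."""
    new_total = total + amount
    return new_total, new_total


safe_accumulate = checked(
    accumulate,
    pre=lambda inp: inp >= 0,
    post=lambda inp, out: out >= inp,         # holds while the total is >= 0
    invariant=lambda old, new: new >= old,    # the total never decreases
    deadline_s=0.5,
)
```

Calling `safe_accumulate(1, 2)` returns the new state and output, while a negative input is rejected before `accumulate` runs.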
Damage Confinement: Static structure; Dynamic structure. BW p. 117. Figure labels: object, I, I
Error Recovery. Forward: repair the state, if you can! Backward: define recovery points; checkpoint state at r. p.; roll back; retry. Domino effect. BW p. 118
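Backward recovery for a single component can be sketched in Python as follows. The class `Checkpointed` and the function `run_with_rollback` are hypothetical, and the sketch assumes the state can be copied with `copy.deepcopy`. It deliberately ignores communicating processes, so the domino effect does not show up here.

```python
import copy


class Checkpointed:
    """Holds a state together with one saved recovery point."""

    def __init__(self, state):
        self.state = state
        self._saved = None

    def checkpoint(self):
        # Recovery point: keep a private copy of the current state.
        self._saved = copy.deepcopy(self.state)

    def roll_back(self):
        # Restore the state saved at the recovery point.
        self.state = copy.deepcopy(self._saved)


def run_with_rollback(component, action, retries=2):
    """Checkpoint, run the action, and on error roll back and retry."""
    component.checkpoint()
    for _ in range(retries + 1):
        try:
            return action(component.state)
        except Exception:
            component.roll_back()  # discard any partial updates
    raise RuntimeError("recovery failed after retries")
```

The action may modify the state in place; if it fails halfway, the partial update is thrown away before the retry.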
Recovery blocks. BW p. 120
    ENSURE acceptance_test
    BY { module_1 }
    ELSE BY { module_2 }
    ...
    ELSE BY { module_m }
    ELSE ERROR
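The recovery block scheme can be approximated in Python as in the sketch below. The function `recovery_block` and the example modules are hypothetical; each module works on a fresh copy of the state, which stands in for restoring the recovery point before the next alternative is tried.

```python
import copy


def recovery_block(state, acceptance_test, modules):
    """Try each module in order until one passes the acceptance test."""
    for module in modules:
        candidate = copy.deepcopy(state)  # start again from the recovery point
        try:
            result = module(candidate)
        except Exception:
            continue                      # module failed, try the next one
        if acceptance_test(result):
            return result
    raise RuntimeError("recovery block failed: ERROR")


def is_sorted(xs):
    return all(a <= b for a, b in zip(xs, xs[1:]))


def faulty_sort(xs):
    return xs             # primary module with a design fault: sorts nothing


def simple_sort(xs):
    return sorted(xs)     # alternative module


ordered = recovery_block([3, 1, 2], is_sorted, [faulty_sort, simple_sort])
```

The primary module fails the acceptance test, so the alternative delivers the result. The acceptance test only checks ordering, which shows how the scheme is no stronger than its test.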
The ideal FT-component: Normal mode; Exception Handler; Request/response; Interface exception; Failure exception. BW p. 126
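One way to read the component structure is sketched below in Python. It assumes that an interface exception reports an invalid request and that a failure exception reports that the component could not deliver its service even after its own handler ran. The class `Component` and its primary/fallback arrangement are hypothetical.

```python
import math


class InterfaceException(Exception):
    """The request violated the component's interface."""


class FailureException(Exception):
    """The component could not deliver the service despite recovery."""


class Component:
    def __init__(self, primary, fallback):
        self._primary = primary
        self._fallback = fallback

    def request(self, x):
        # Check the request before doing any work.
        if not isinstance(x, (int, float)) or x < 0:
            raise InterfaceException(f"invalid request: {x!r}")
        try:
            return self._primary(x)       # normal mode
        except Exception:
            return self._handle(x)        # switch to the exception handler

    def _handle(self, x):
        # Exception handler: attempt recovery, else signal failure upwards.
        try:
            return self._fallback(x)
        except Exception as exc:
            raise FailureException("service unavailable") from exc


def broken_sqrt(x):
    raise FailureException("sub-component failed")


root = Component(primary=broken_sqrt, fallback=math.sqrt).request(9.0)
```

The caller sees a normal response as long as the handler can recover; only an unrecoverable problem leaves the component as a failure exception.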