Fault-Tolerant Computing Systems #1: Introduction
Pattara Leelaprute, Computer Engineering Department, Kasetsart University


Slide 1: Fault-Tolerant Computing Systems #1 Introduction
Pattara Leelaprute
Computer Engineering Department, Kasetsart University
pattara.l@ku.ac.th

Slide 2: Dependability (Introduction)
Dependability
- Trustworthiness (trust, confidence, reliance) of a computer system
- Reliance on the system can be justified by the service it delivers
Why is dependability necessary for computers?
- Life-critical tasks (a failure can cost human lives)
    - Patient monitoring
    - Missile guidance control
    - Air traffic systems (e.g., Die Hard 2)
- Tasks that critically depend on computers (a failure causes financial loss)
    - Banking systems
    - Stock markets
    - Online shopping
Exercise: form groups and give as many examples as you can (10 min).

Slide 3: Reliability and Availability
Attributes important for dependability
- Reliability, availability, safety, security
Attributes important for fault tolerance
- Reliability: deals with continuity of service
- Availability: deals with readiness for usage
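The slide defines the two attributes only qualitatively. As a minimal sketch, they can be quantified with two textbook formulas that are not on the slide and are assumptions here: reliability R(t) = exp(-λt), which assumes a constant failure rate λ, and steady-state availability A = MTTF / (MTTF + MTTR). The function and parameter names are illustrative.

```python
import math


def reliability(failure_rate: float, t: float) -> float:
    """Probability of uninterrupted service over [0, t].

    Assumes a constant failure rate (exponential lifetime model).
    """
    return math.exp(-failure_rate * t)


def availability(mttf: float, mttr: float) -> float:
    """Long-run fraction of time the system is ready for use.

    mttf: mean time to failure, mttr: mean time to repair (same unit).
    """
    return mttf / (mttf + mttr)


if __name__ == "__main__":
    # One failure per 1000 hours on average, mission time of 100 hours.
    print(reliability(failure_rate=0.001, t=100.0))
    # Fails every 999 hours on average and takes 1 hour to repair.
    print(availability(mttf=999.0, mttr=1.0))
```

Under these assumptions a system can have high availability and still low reliability: if it fails often but is repaired quickly, it is usually ready for use yet rarely delivers a long uninterrupted service.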

Slide 4: Fault Avoidance & Fault Tolerance
Fault avoidance
- Approach that prevents faults from occurring or from being introduced into the system (direct approach)
Fault tolerance
- Approach that provides service despite the presence of faults in the system
Fault = abnormality of a component of the system

Slide 5: Fault Avoidance
- Eliminate as many faults as possible before the system is put into use
- Has no redundancy
- Focuses on methodologies for design, testing, and validation
- All components must work correctly, without failing, at all times
- Manual maintenance is needed to repair the system when a failure takes place
IMPOSSIBLE

Slide 6: Failure, Fault, Error
Fault
- Abnormality of a component of the system
- Cause of errors and failures
Error
- Abnormal state of a component of the system
- Manifestation of a fault in the system
- Cause of failure
Failure
- The system cannot provide the desired service (the behavior of the system deviates from the required specification)
[Figure: diagram labeled "100%" and "Not 100%", with bit patterns marked "OK" and "NG" (no good)]
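The chain fault, error, failure can be illustrated with a small sketch. The example below is not from the slides: it assumes a hypothetical AND gate whose input line `a` is stuck at 0. The fault is always present, it becomes an error only when the stuck value differs from the intended one, and it becomes a failure only when the gate output deviates from the specification.

```python
def and_gate(a: int, b: int) -> int:
    """Specification: the fault-free behavior of the component."""
    return a & b


def faulty_and_gate(a: int, b: int) -> tuple[int, int]:
    """Hypothetical faulty gate: input line `a` is stuck at 0.

    Returns the value actually seen on line `a` and the gate output.
    """
    a_seen = 0  # the fault: a permanent stuck-at-0 on line `a`
    return a_seen, a_seen & b


def classify(a: int, b: int) -> tuple[bool, bool]:
    """Return (error, failure) for one input combination."""
    a_seen, actual = faulty_and_gate(a, b)
    error = a_seen != a                 # internal state is abnormal
    failure = actual != and_gate(a, b)  # output deviates from the spec
    return error, failure


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            error, failure = classify(a, b)
            print(f"a={a} b={b} error={error} failure={failure}")
```

For inputs (1, 0) the error exists but the output is still correct, so there is no failure. Only inputs (1, 1) turn the fault into a visible failure.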

Slide 7: Types of Faults (by Duration)
Duration
- Transient fault (temporary)
    - Fault of limited duration (exists only for a short time)
    - Caused by a temporary malfunction of the system
    - Hard to detect
    - Intermittent fault (occurring at intervals): a transient fault that recurs repeatedly at short intervals
- Permanent fault
    - Persists until the faulty component is repaired
    - Most fault-tolerance techniques assume that components fail permanently
    - Should be detected
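As a rough sketch of the three durations, the model below marks the time steps at which a fault is active. The class `FaultModel`, its fields, and the discrete time steps are illustrative assumptions, not part of the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultModel:
    kind: str        # "transient", "intermittent", or "permanent"
    start: int       # first time step at which the fault is active
    length: int = 1  # active steps per occurrence (unused if permanent)
    period: int = 0  # recurrence interval (intermittent only)

    def active(self, t: int) -> bool:
        """Return True if the fault is present at time step t."""
        if t < self.start:
            return False
        if self.kind == "permanent":
            return True  # stays until the component is repaired
        if self.kind == "transient":
            return t < self.start + self.length
        if self.kind == "intermittent":
            if self.period <= 0:
                raise ValueError("intermittent faults need period > 0")
            return (t - self.start) % self.period < self.length
        raise ValueError(f"unknown fault kind: {self.kind}")


def timeline(model: FaultModel, steps: int) -> str:
    """Render fault activity: 'X' = fault active, '.' = fault absent."""
    return "".join("X" if model.active(t) else "." for t in range(steps))


if __name__ == "__main__":
    print(timeline(FaultModel("transient", start=2, length=2), 10))
    # prints "..XX......"
    print(timeline(FaultModel("intermittent", start=1, period=3), 10))
    # prints ".X..X..X.."
    print(timeline(FaultModel("permanent", start=5), 10))
    # prints ".....XXXXX"
```

The timelines suggest why transient faults are hard to detect: a check that runs after step 3 in the first case finds nothing wrong, whereas a permanent fault is visible to every later check.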

Slide 8: Types of Faults (by Phase)
Phase in which faults are introduced
- Design fault
    - Introduced during system design
    - Introduced during modification of the system
- Operational fault
    - Appears during the system's lifetime and is caused by physical reasons

Slide 9: Fault Tolerance and Redundancy
Fault-tolerant system
- A system that can mask (hide) the effect of a fault by using redundancy
Redundancy (duplication, excess)
- Some kind of redundancy is needed for a fault-tolerant system
- Defined as those parts of the system that are not needed for the system to function correctly (they are not needed while the system is fault-free)
- Space redundancy: hardware, software
- Time redundancy: extra time for performing tasks for fault tolerance
Goal = avoid system failure even if faults are present
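Both kinds of redundancy can be sketched with majority voting. The example below is an illustration under assumptions, not the lecture's own design: space redundancy is shown as three replicas of the same unit (triple modular redundancy), and time redundancy as three executions of one unit. All names (`majority_vote`, `run_replicated`, `run_repeated`, `TransientlyFaulty`) are hypothetical.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(outputs: Sequence[int]) -> int:
    """Return the value produced by more than half of the outputs."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise RuntimeError("no majority: the fault cannot be masked")
    return value


def run_replicated(replicas: Sequence[Callable[[int], int]], x: int) -> int:
    """Space redundancy: several units compute the same result."""
    return majority_vote([replica(x) for replica in replicas])


def run_repeated(unit: Callable[[int], int], x: int, runs: int = 3) -> int:
    """Time redundancy: one unit computes the result several times."""
    return majority_vote([unit(x) for _ in range(runs)])


def square(x: int) -> int:
    """Fault-free unit."""
    return x * x


def broken_square(x: int) -> int:
    """Unit with a permanent fault: the result is always off by one."""
    return x * x + 1


class TransientlyFaulty:
    """Unit that returns a wrong result only on the listed call numbers."""

    def __init__(self, bad_calls: set[int]) -> None:
        self.bad_calls = bad_calls
        self.calls = 0

    def __call__(self, x: int) -> int:
        self.calls += 1
        result = x * x
        return result + 1 if self.calls in self.bad_calls else result


if __name__ == "__main__":
    # One of three replicas is permanently faulty; the voter masks it.
    print(run_replicated([square, broken_square, square], 4))  # prints 16
    # The second of three executions hits a transient fault.
    print(run_repeated(TransientlyFaulty({2}), 4))  # prints 16
```

Repeating the execution masks the transient fault but would not help against the permanently faulty unit, which returns the same wrong value every time. In both cases the extra replicas or runs are unnecessary while no fault is present, which matches the definition of redundancy above.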

Slide 10: Digital Circuit Review
[Figure: digital circuit diagram with signals labeled x0 to x8, z1, and z2]

