Fault Tolerant Distributed Computing system.


Fundamentals
What is a fault?
  A fault is a defect, weakness, or shortcoming in a particular hardware or software component.
  Faults, errors, and failures
Why fault tolerance?
  Availability, reliability, dependability, …
How to provide fault tolerance?
  Replication
  Checkpointing and message logging
  Hybrid

Message Logging
Tolerates crash failures.
Each process periodically records its local state (a checkpoint) and logs the messages it receives after that checkpoint.
Once a crashed process recovers, its state must be consistent with the states of the other processes.
Orphan processes: surviving processes whose states are inconsistent with the recovered state of a crashed process.
Message logging protocols guarantee that, upon recovery, no process is an orphan.
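The checkpoint-plus-log idea on this slide can be illustrated with a minimal sketch. It assumes a deterministic toy process whose state is a running sum of the integers it receives, and it assumes the checkpoint and the log survive a crash (they stand in for stable storage). The class and method names are hypothetical and are not taken from any of the systems discussed here.

```python
class LoggedProcess:
    """Toy deterministic process: its state is the sum of received integers."""

    def __init__(self):
        self.state = 0
        self.checkpoint = 0   # last recorded local state (assumed to survive a crash)
        self.log = []         # messages received after the last checkpoint

    def receive(self, msg):
        self.log.append(msg)  # log the message, then apply it to the state
        self.state += msg

    def take_checkpoint(self):
        self.checkpoint = self.state
        self.log.clear()      # messages covered by the checkpoint need no replay

    def crash(self):
        self.state = None     # the volatile state is lost

    def recover(self):
        # Restore the checkpoint, then replay logged messages in delivery order.
        self.state = self.checkpoint
        for msg in self.log:
            self.state += msg


p = LoggedProcess()
p.receive(1)
p.receive(2)
p.take_checkpoint()
p.receive(5)
p.crash()
p.recover()
print(p.state)
```

Because the process is deterministic, replaying the logged messages in their original order rebuilds the pre-crash state.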

Message Logging Protocols
Pessimistic message logging
  Avoids the creation of orphans during execution: no process p sends a message m until it knows that all messages delivered before sending m are logged.
  Allows processes to communicate only from recoverable states: any information that may be needed for recovery is synchronously logged to stable storage before the process is allowed to communicate.
  Quick recovery.
  Can block a process for each message it receives, which reduces throughput.
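The pessimistic rule (nothing is sent until every earlier delivery is logged) can be sketched as follows. This is an illustration under simplifying assumptions: a Python list stands in for stable storage, the flush is an in-memory copy rather than a real synchronous disk write, and all names are hypothetical.

```python
class PessimisticProcess:
    """Sends are allowed only once every delivered message has been logged."""

    def __init__(self):
        self.stable_log = []  # stands in for stable storage
        self.unlogged = []    # delivered messages not yet written to the log
        self.outbox = []      # messages handed to the network

    def deliver(self, msg):
        self.unlogged.append(msg)

    def flush(self):
        # In a real system this is a synchronous write that blocks the process.
        self.stable_log.extend(self.unlogged)
        self.unlogged.clear()

    def send(self, msg):
        if self.unlogged:
            self.flush()      # log all earlier deliveries before communicating
        self.outbox.append(msg)


proc = PessimisticProcess()
proc.deliver("a")
proc.deliver("b")
proc.send("x")
print(proc.stable_log, proc.outbox)
```

The flush inside `send` is where the throughput cost mentioned on the slide comes from: the process cannot communicate until the write completes.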

Message Logging
Optimistic message logging
  Takes appropriate actions during recovery to eliminate all orphans.
  Better performance during failure-free runs.
  Allows processes to communicate from non-recoverable states; a failure may make these states permanently unrecoverable, forcing the rollback of any process that depends on them.
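As a rough illustration of the recovery-time work that optimistic protocols defer, the sketch below identifies orphans after a crash. It assumes that each process's causal dependencies are already known as a set of message identifiers, and that the set of deliveries lost in the crash is given; real protocols track this with dependency vectors or similar structures. The function name and data layout are hypothetical.

```python
def find_orphans(depends_on, lost, crashed):
    """Return the surviving processes whose state depends on a lost delivery.

    depends_on: dict mapping process name -> set of message ids its state depends on
    lost:       set of message ids whose delivery was not logged before the crash
    crashed:    name of the crashed process (it is recovered, not an orphan)
    """
    return sorted(
        name for name, deps in depends_on.items()
        if name != crashed and deps & lost
    )


deps = {"p": {"m1", "m2"}, "q": {"m2"}, "r": {"m1"}}
print(find_orphans(deps, lost={"m2"}, crashed="p"))
```

Every process returned by this function would have to be rolled back to a state that no longer depends on the lost deliveries.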

Causal Message Logging
No orphans when failures happen, and no blocking of processes when failures do not occur.
Weakens the condition imposed by pessimistic protocols: the state from which a process communicates may be unrecoverable because of a failure, but only if this does not affect consistency.
Appends to every communication the information needed to recover the state from which the communication originates; this information is replicated in the memory of the processes that causally depend on the originating state.
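The piggybacking described above can be sketched in a few lines. This is a simplified illustration, not any specific published protocol: a determinant is assumed to be a tuple (receiver, delivery index, sender, payload), every message carries all determinants the sender knows, and nothing is ever pruned. All names are hypothetical.

```python
class CausalProcess:
    """Keeps in volatile memory the determinants its state causally depends on."""

    def __init__(self, name):
        self.name = name
        self.delivered = 0
        self.determinants = set()  # (receiver, delivery_index, sender, payload)

    def send(self, payload):
        # Piggyback a copy of every determinant this state depends on.
        return {"sender": self.name, "payload": payload,
                "piggyback": set(self.determinants)}

    def deliver(self, message):
        self.determinants |= message["piggyback"]
        self.delivered += 1
        self.determinants.add(
            (self.name, self.delivered, message["sender"], message["payload"]))


def determinants_for(crashed_name, survivors):
    """Collect the crashed process's determinants from the survivors' memory."""
    found = set()
    for proc in survivors:
        found |= {d for d in proc.determinants if d[0] == crashed_name}
    return sorted(found, key=lambda d: d[1])  # order by delivery index


p, q, r = CausalProcess("p"), CausalProcess("q"), CausalProcess("r")
q.deliver(p.send("a"))   # q's state now depends on this delivery
r.deliver(q.send("b"))   # r receives q's determinant along with the message
print(determinants_for("q", [p, r]))
```

If q crashes after sending to r, r still holds the determinant of q's delivery, so q can be replayed to the state that r depends on without q ever having blocked on stable storage.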

KAN – A Reliable Distributed Object System
Developed at UC Santa Barbara
Project goals:
  Language support for parallelism and distribution
  Transparent location/migration/replication
  Optimized method invocation
  Fault tolerance
  Composition and proof reuse

System Description
[Figure: Kan source -> Kan compiler -> Java bytecode + Kan run-time libraries -> multiple JVMs -> UNIX sockets]

Fault Tolerance in Kan
Log-based forward recovery scheme:
  The log of recovery information for a node is maintained externally, on other nodes.
  Failed nodes are recovered to their pre-failure states, and correct nodes keep the states they had at the time of the failures.
Only node crash failures are considered: a processor stops taking steps, and failures are eventually detected.

Basic Architecture of the Fault Tolerance Scheme
[Figure labels: Logical Node x, Logical Node y, Fault Detector, Failure Handler, Request Handler, Communication Layer, External Log, Physical Node i, IP Address, Network]

Logical Ring
A logical ring is used to minimize the need for global synchronization and recovery.
The ring is used only for logging (remote method invocations).
Two parts:
  Static part: contains the active correct nodes. It has a leader and a sense of direction (upstream and downstream).
  Dynamic part: contains the nodes that are trying to join the ring.
A logical node is logged at the next T physical nodes in the ring, where T is the maximum number of node failures to tolerate.
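The placement rule (log at the next T physical nodes) can be sketched as below. The sketch assumes the ring is given as a list in ring order and that "next" means the following entries of that list, wrapping around at the end; the slide does not say which direction counts as "next", so this is an assumption. The function name is hypothetical.

```python
def log_holders(ring, host, t):
    """Return the t physical nodes that follow `host` in ring order."""
    if t >= len(ring):
        raise ValueError("need more than t nodes in the ring to tolerate t failures")
    start = ring.index(host)
    # Walk t steps around the ring, wrapping at the end of the list.
    return [ring[(start + k) % len(ring)] for k in range(1, t + 1)]


ring = ["n0", "n1", "n2", "n3"]
print(log_holders(ring, "n2", 2))
```

With T copies of the log on T distinct other nodes, at least one copy survives any T simultaneous node failures that include the host.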

Logical Ring Maintenance
Each node i participating in the protocol maintains the following variables:
  Failed_i(j): true if i has detected the failure of j
  Map_i(x): the physical node on which logical node x resides
  Leader_i: i's view of the leader of the ring
  View_i: i's view of the logical ring (membership and order)
  Pending_i: the set of physical nodes that i suspects of failing
  Recovery_count_i: the number of logical nodes that need to be recovered
  Ready_i: records whether i is active. There is an initial set of ready nodes; new nodes become ready when they are linked into the ring.

Failure Handling
When node i is informed of the failure of node j:
  If every node upstream of i has failed, then i must become the new leader. It remaps all logical nodes from the upstream physical nodes and informs the other correct nodes by sending a remap message. It then recovers the logical nodes.
  If the leader has failed but some upstream node k will become the new leader, then i just updates its map and leader variables to reflect the new situation.
  If the failed node j is upstream of i, then i just updates its map. If i is the next downstream node from j, it also recovers the logical nodes from j.
  If j is downstream of i and there is some node k downstream of j, then i just updates its map.
  If j is downstream of i and there is no node downstream of j, then i waits for the leader to update the map.
  If i is the leader and must recover j, then i changes its map, sends a remap message to change the correct nodes' maps, and recovers all logical nodes that are mapped locally.

Physical Node and Leader Recovery
When a physical node comes back up:
  It sends a join message to the leader.
  The leader tries to link this node into the ring:
    Acquire <-> Grant
    Add, Ack_add
    Release
When the leader fails, the next downstream node in the ring becomes the new leader.
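The leader succession rule can be sketched as a small function. It assumes that a node's view lists the ring from the current leader downstream and that the set of failed nodes is already known; both are simplifications of the per-node variables described earlier, and the function name is hypothetical.

```python
def next_leader(view, failed):
    """Return the first node in ring order (leader first) that has not failed."""
    for node in view:
        if node not in failed:
            return node
    return None  # every node in the view has failed


view = ["n0", "n1", "n2"]
print(next_leader(view, {"n0"}))
print(next_leader(view, {"n0", "n1"}))
```

Under these assumptions the rule needs no election round: every correct node that sees the same view and the same failures picks the same new leader.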

AQuA: Adaptive Quality of Service Availability
Developed at UIUC and BBN.
Goal: allow distributed applications to request and obtain a desired level of availability.
Fault tolerance:
  Replication
  Reliable messaging

Features of AQuA
Uses the QuO runtime to process and make availability requests.
Uses the Proteus dependability manager to configure the system in response to faults and availability requests.
Uses Ensemble to provide group communication services.
Provides a CORBA interface to application objects through the AQuA gateway.

Proteus Functionality
How to provide fault tolerance for the application:
  Style of replication (active, passive)
  Voting algorithm to use
  Degree of replication
  Type of faults to tolerate (crash, value, or time)
  Location of replicas
How to implement the chosen fault tolerance scheme:
  Dynamic configuration modification
  Start/kill replicas; activate/deactivate monitors and voters

Group Structure
For reliable multicast and point-to-point communication:
  Replication groups
  Connection groups
  Proteus Communication Service group: for the replicated Proteus manager replicas and the objects that communicate with the manager
    e.g. notification of a view change, a new QuO request
    Ensures that all replica managers receive the same information
  Point-to-point groups: Proteus manager to object factory

AQuA Architecture

Fault Model, Detection, and Handling
Object fault model:
  Object crash failure: occurs when an object stops sending out messages; its internal state is lost.
    The crash failure of an object is due to the crash of at least one element composing the object.
  Value faults: a message arrives in time but with the wrong content (caused by the application or the QuO runtime).
    Detected by a voter.
  Time faults
    Detected by a monitor.
Leaders report faults to Proteus; Proteus kills faulty objects if necessary and generates new objects.
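How a voter might detect a value fault can be sketched with simple majority voting over replica replies. The slides do not specify which voting algorithms AQuA uses, so majority voting is an assumption chosen for illustration, and the function name and reply format are hypothetical.

```python
from collections import Counter


def vote(replies):
    """Majority vote over replica replies.

    replies: dict mapping replica name -> reply value (values must be hashable)
    Returns (majority_value, replicas_whose_reply_differs).
    """
    if not replies:
        raise ValueError("no replies to vote on")
    value, count = Counter(replies.values()).most_common(1)[0]
    if count <= len(replies) // 2:
        raise ValueError("no majority among replies")
    suspects = sorted(name for name, v in replies.items() if v != value)
    return value, suspects


print(vote({"r1": 42, "r2": 42, "r3": 7}))
```

The replicas returned as suspects are the ones a leader would report as having value faults, so that the manager can kill and replace them.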

AQuA Gateway Structure

Egida
Developed at UT Austin
An object-oriented, extensible toolkit for low-overhead fault tolerance.
Provides a library of objects that can be used to compose log-based rollback-recovery protocols.
Provides a specification language to express arbitrary rollback-recovery protocols.

Log-based Rollback Recovery
Checkpointing: independent, coordinated, or induced by specific patterns of communication
Message logging: pessimistic, optimistic, or causal

Core Building Blocks
Almost all log-based rollback-recovery protocols share an event-driven structure.
The common events are:
  Non-deterministic events (orphans, determinants)
  Dependency-generating events
  Output-commit events
  Checkpointing events
  Failure-detection events

A grammar for specifying rollback-recovery protocols
Protocol := <non-det-event-stmt>* <output-commit-event-stmt>* <dep-gen-event-stmt> <ckpt-stmt>opt <recovery-stmt>opt
<non-det-event-stmt> := <event> : determinant : <determinant-structure> <Log <event-info-list> <how-to-log> on <stable-storage>>opt
<output-commit-event-stmt> := <output-commit-proto> output commit on <event-list>
<event> := send | receive | read | write
<determinant-structure> := {source, sesn, dest, dest}
<output-commit-proto> := independent | co-ordinated
<how-to-log> := synchronously | asynchronously
<stable-storage> := local disk | volatile memory of self

Egida Modules
EventHandler
Determinant
HowToOutputCommit
LogEventDeterminant
LogEventInfo
HowToLog
WhereToLog
StableStorage
VolatileStorage
Checkpointing
…
