Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Chapter 8 Fault.


1 Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Chapter 8 Fault Tolerance (2) DISTRIBUTED SYSTEMS (dDist)

2 Distributed Commit (1/2)
Given a process group and an operation
– The operation might or might not be committable at all processes
Either everybody commits or everybody aborts
– Consistency, validity, termination

3 Distributed Commit (2/2)
Can we not just do this with Virtual Synchrony?
– Coordinator multicasts vote request
– All processes respond to request
– Coordinator multicasts vote result
COMMIT iff all vote COMMIT
This handles some error cases
But what if a participant B crashes after it has voted COMMIT but before the COMMIT result is multicast, and then comes back to life?

4 Two-Phase Commit
1) Commit →
2) Vote-request →
3) Vote-commit ←
4) Global-commit →
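The four messages above can be traced in a small in-memory sketch. The names `Participant`, `can_commit` and `two_phase_commit` are illustrative assumptions, not taken from the slides, and the sketch has no network, timeouts or crashes.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical participant; `can_commit` stands in for its local check."""
    name: str
    can_commit: bool
    state: str = "INIT"
    log: list = field(default_factory=list)

    def on_vote_request(self) -> str:
        # Phase 1: answer the coordinator's Vote-request.
        if self.can_commit:
            self.state, vote = "READY", "VOTE_COMMIT"
        else:
            self.state, vote = "ABORT", "VOTE_ABORT"
        self.log.append(vote)
        return vote

    def on_global_decision(self, decision: str) -> None:
        # Phase 2: adopt whatever the coordinator decided.
        self.state = "COMMIT" if decision == "GLOBAL_COMMIT" else "ABORT"
        self.log.append(decision)


def two_phase_commit(participants) -> str:
    """Coordinator side: collect all votes, then announce one decision."""
    votes = [p.on_vote_request() for p in participants]
    all_commit = all(v == "VOTE_COMMIT" for v in votes)
    decision = "GLOBAL_COMMIT" if all_commit else "GLOBAL_ABORT"
    for p in participants:
        p.on_global_decision(decision)
    return decision
```

A single VOTE_ABORT is enough to turn the global decision into GLOBAL_ABORT for everybody.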

5 Two-Phase Commit
Figure 8-18. (a) The finite state machine for the coordinator in 2PC. (b) The finite state machine for a participant.
[Figure legend: each transition is labeled with an input event and an output event]
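The figure itself is not part of this transcript. As a rough sketch, the two state machines can be written as transition tables; the transitions below follow the usual textbook presentation of 2PC and the message names on the previous slide, and the helper `step` is an assumption made for illustration.

```python
# (state, input event) -> list of possible (output event, next state)
COORDINATOR_FSM = {
    ("INIT", "Commit"): [("Vote-request", "WAIT")],
    # Taken only once every participant has voted commit.
    ("WAIT", "Vote-commit"): [("Global-commit", "COMMIT")],
    # A single abort vote is enough.
    ("WAIT", "Vote-abort"): [("Global-abort", "ABORT")],
}

PARTICIPANT_FSM = {
    # The participant chooses locally between the two answers.
    ("INIT", "Vote-request"): [("Vote-commit", "READY"), ("Vote-abort", "ABORT")],
    ("READY", "Global-commit"): [("ACK", "COMMIT")],
    ("READY", "Global-abort"): [("ACK", "ABORT")],
}


def step(fsm, state, event, choice=0):
    """Return (output event, next state); `choice` picks among alternatives."""
    return fsm[(state, event)][choice]
```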

6 Two-Phase Commit
2PC detects crashes via timeouts
2PC handles crashes by logging state to permanent storage, turning crash errors into reset errors
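One way to picture the logging step is an append-only file that is forced to disk before the process acts on a state change. This is a minimal sketch under that assumption; the function names and the JSON record format are hypothetical and not from the slides.

```python
import json
import os
import tempfile


def log_state(path, state):
    """Append a state record and force it to disk before returning."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"state": state}) + "\n")
        f.flush()
        os.fsync(f.fileno())


def recover_state(path):
    """After a restart, return the last fully logged state (INIT if none)."""
    last = "INIT"
    if not os.path.exists(path):
        return last
    with open(path, encoding="utf-8") as f:
        for line in f:
            try:
                last = json.loads(line)["state"]
            except (json.JSONDecodeError, KeyError):
                break  # torn final record: keep the last complete one
    return last


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "participant.log")
    log_state(log_path, "READY")
    print(recover_state(log_path))
```

A participant that restarts and finds READY in its log knows it has voted and must still learn the global decision.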

7 Coordinator Perspective
Blocks in WAIT
– Participant may have failed
– That participant might vote ABORT, in which case a GLOBAL COMMIT would be wrong and irreversible
– So, on TIMEOUT the coordinator must do a GLOBAL ABORT
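The timeout rule for the WAIT state might look as follows. This is a sketch that assumes votes arrive on a `queue.Queue` and that the timeout applies to each individual vote; `collect_votes` and the message strings are illustrative names.

```python
import queue


def collect_votes(inbox, n_participants, timeout_s):
    """Coordinator in WAIT: a missing or negative vote forces GLOBAL_ABORT."""
    for _ in range(n_participants):
        try:
            vote = inbox.get(timeout=timeout_s)
        except queue.Empty:
            # Timed out: the silent participant might have voted ABORT,
            # so committing would be unsafe.
            return "GLOBAL_ABORT"
        if vote != "VOTE_COMMIT":
            return "GLOBAL_ABORT"
    return "GLOBAL_COMMIT"
```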

8 Coordinator Perspective
Figure 8-20. Outline of the steps taken by the coordinator in a two-phase commit protocol....

10 Participant Perspective
Blocks in READY
– Coordinator may have failed
What to do?
– Some participants may already have committed…
– Perhaps another participant knows what to do…?

11 Participant Perspective
Figure 8-19. Actions taken by a participant P when residing in state READY and having contacted another participant Q.
– Q in COMMIT: We know that coordinator managed to start commit
– Q in ABORT: At least one participant aborted and coordinator noticed
– Q in INIT: Q did not even receive vote-request, so no one committed yet
What if all in READY? After timeout allowing all messages in transit to arrive:
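The cases above amount to a small decision function. The sketch below assumes P has already collected the states of the participants it could reach; `resolve_ready` and the "BLOCK" result are names chosen for illustration.

```python
def resolve_ready(q_states):
    """What P (in READY) does given the states reported by other participants.

    Returns "COMMIT", "ABORT", or "BLOCK" when everyone reached is in READY.
    """
    for q in q_states:
        if q == "COMMIT":
            return "COMMIT"  # the coordinator did start the commit
        if q in ("ABORT", "INIT"):
            return "ABORT"   # nobody can have committed yet
    return "BLOCK"           # all READY: nobody here knows the decision
```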

12 Two-Phase Commit
Figure 8-21. (a) The steps taken by a participant process in 2PC.

13 All READY (1/2)
Why do we block when all live participants are in the READY state?

14 All READY (2/2)
Same view, but different decisions, so Yellow needs to wait for Blue or Green to come up again and inspect their log files!
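The point can be made concrete with two hypothetical executions that look identical to the live process. The sketch assumes Blue is the crashed coordinator and Green the crashed participant; the slide does not say which color plays which role.

```python
# name -> (state at crash time or now, is the process alive?)
SCENARIO_COMMIT = {
    "blue": ("COMMIT", False),   # coordinator sent Global-commit, then crashed
    "green": ("COMMIT", False),  # received Global-commit, committed, crashed
    "yellow": ("READY", True),   # never received the decision
}
SCENARIO_ABORT = {
    "blue": ("ABORT", False),    # coordinator decided Global-abort, then crashed
    "green": ("ABORT", False),   # voted abort, then crashed
    "yellow": ("READY", True),
}


def live_view(scenario):
    """States the live processes can observe by asking each other."""
    return sorted(state for state, alive in scenario.values() if alive)


def hidden_decision(scenario):
    """Decision recorded only in the logs of the crashed processes."""
    return {state for state, alive in scenario.values() if not alive}
```

Both scenarios give Yellow the view `["READY"]`, yet the hidden decisions differ, so any unilateral choice by Yellow could contradict a decision already taken.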

15 Two-Phase Commit
Two-Phase Commit has the problem that if the coordinator and one participant crash at a bad time, the entire system freezes until one of them is up again
Getting a server up and running again typically involves human (a.k.a. very slow) intervention

16 Three-Phase Commit
Three-Phase Commit enhances Two-Phase Commit in that it is non-blocking in many more cases
As long as the live participants can make a majority decision they can continue on their own
If there are many participants, this makes it very unlikely that 3PC blocks
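A rough feel for "very unlikely" comes from a binomial estimate. The sketch assumes that processes fail independently with probability `p_fail`, which is an assumption added here and not a claim from the slides, and it only computes the chance that the live processes do not form a majority on their own. It ignores the case where a live majority is split between states, so treat it as a proxy and not as the exact blocking probability.

```python
from math import comb


def prob_no_live_majority(n, p_fail):
    """P(at most n/2 of n processes are live), with independent failures."""
    return sum(
        comb(n, live) * (1 - p_fail) ** live * p_fail ** (n - live)
        for live in range(n + 1)
        if 2 * live <= n
    )
```

With `p_fail = 0.1` the value is about 0.028 for 3 processes and about 0.0086 for 5, and it keeps shrinking as the group grows.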

17 Three-Phase Commit
Figure 8-22. (a) The finite state machine for the coordinator in 3PC. (b) The finite state machine for a participant.

19 On timeout:
IF anyone in ABORT → ABORT
ELIF anyone in COMMIT → COMMIT
ELIF anyone in INIT → ABORT
ELSE elect new coordinator among the live
New Coordinator: Go to WAIT and from there go to ABORT or PRECOMMIT
– ABORT: If a majority of participants are in READY
– PRECOMMIT: If a majority are in PRECOMMIT
– If no majority, then block

20 On timeout:
IF anyone in ABORT → ABORT
ELIF anyone in COMMIT → COMMIT
ELIF anyone in INIT → ABORT
ELSE elect new coordinator among the live
New Coordinator: Go to WAIT and from there go to ABORT or PRECOMMIT
– ABORT: If a majority of participants are in READY
– PRECOMMIT: If a majority are in PRECOMMIT
– If no majority, then block
If anyone is in PRECOMMIT, then the original coordinator's vote is set to be PRECOMMIT, as the original coordinator must be in PRECOMMIT
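The timeout rules can be collected into one function. This is a sketch under stated assumptions: `live_states` holds the states of the live participants only, `n_total` counts all processes (dead and live) including the crashed original coordinator, and a majority means strictly more than half of `n_total`. The function name and the "BLOCK" result are illustrative.

```python
def on_timeout(live_states, n_total):
    """3PC termination rule as listed on the slide."""
    if "ABORT" in live_states:
        return "ABORT"
    if "COMMIT" in live_states:
        return "COMMIT"
    if "INIT" in live_states:
        return "ABORT"
    # All live processes are in READY or PRECOMMIT: the newly elected
    # coordinator counts states against the total number of processes.
    ready = live_states.count("READY")
    precommit = live_states.count("PRECOMMIT")
    if precommit > 0:
        # Someone saw PRECOMMIT, so the crashed original coordinator
        # must have been in PRECOMMIT too; count its vote.
        precommit += 1
    if 2 * ready > n_total:
        return "ABORT"
    if 2 * precommit > n_total:
        return "PRECOMMIT"
    return "BLOCK"
```

For example, with five processes and three live ones in READY, the new coordinator aborts; with two in READY and one in PRECOMMIT, neither state reaches three votes and the group blocks.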

22 More Non-Blocking
It follows from the decision rules that the live agents can always make decisions on their own unless no true majority for READY or PRECOMMIT can be found
True majority: Majority among all processes, both dead and live

23 Correctness (1/4)
Let P and Q be any two processes which both acted as coordinator at some point
THEOREM It can never happen that P is in ABORT and Q is in COMMIT
Proof:
1. When P went to ABORT there was a true majority in READY
2. When Q went to COMMIT there was a true majority in PRECOMMIT
3. These two configurations are mutually exclusive
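Step 3 rests on a counting fact: each process is in exactly one state, so two different states cannot both be held by more than half of all processes. The brute-force check below is only a sanity check of that counting step for small groups at a single instant; it does not replace the proof, and the function name is an assumption.

```python
from itertools import product


def both_majorities_possible(n):
    """Is there any assignment of states to n processes with a true
    majority in READY and a true majority in PRECOMMIT at once?"""
    for states in product(("READY", "PRECOMMIT", "OTHER"), repeat=n):
        ready = states.count("READY")
        precommit = states.count("PRECOMMIT")
        if 2 * ready > n and 2 * precommit > n:
            return True
    return False
```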

24 Correctness (2/4)
By construction: If there is a process in ABORT, then there is a coordinator in ABORT

25 Correctness (3/4)
By construction: If there is a process in COMMIT, then there is a coordinator in COMMIT

26 Correctness (4/4)
Let P and Q be any two processes
COROLLARY It can never happen that P is in ABORT and Q is in COMMIT

27 Summary
Looked at Distributed Commit
Distributed commit
– 2PC – blocking, has a bad state
– 3PC – less blocking, but not widely used in practice

