CS377: Deadlock detection and election algorithms


1 CS377 Today: deadlock detection and election algorithms
- Previous class:
  - Event ordering in distributed systems
  - Various approaches for mutual exclusion in distributed systems
    - Centralized approach
    - Token-based approach
    - Fully distributed approach
  - Deadlock prevention
- Today:
  - Deadlock detection
    - Global wait-for graph
    - Deadlock detection algorithms
  - Election algorithms

2 Deadlock handling with deadlock detection
- Deadlock prevention may preempt resources even if no deadlock has occurred.
- Deadlock detection is based on so-called wait-for graphs. Each site keeps a local wait-for graph whose nodes correspond to the local or nonlocal processes that either hold or request resources local to that site.
- A wait-for graph shows the resource allocation state.
- A cycle in the wait-for graph represents a deadlock.
[Figure: local wait-for graphs at Site A and Site B over the processes P1, P2, P3, P4, P5.]
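The statement that a cycle in a wait-for graph represents a deadlock can be checked mechanically. The following is a minimal sketch, assuming a graph is stored as a dictionary that maps each process to the set of processes it waits for; the name `find_cycle` and the example edges are hypothetical and are not taken from the slides.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[str, Set[str]]  # process -> processes it is waiting for


def find_cycle(graph: Graph) -> Optional[List[str]]:
    """Return one cycle as a list of processes (first == last), or None."""
    path: List[str] = []    # processes on the current depth-first path
    done: Set[str] = set()  # processes already shown to lead to no cycle

    def dfs(node: str) -> Optional[List[str]]:
        if node in path:
            # We came back to a process on the current path: that is a cycle.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


# Hypothetical example: P1 waits for P2, P2 for P3, P3 for P1.
deadlocked = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
print(find_cycle(deadlocked))
```

A graph without a cycle, such as `{"P1": {"P2"}, "P2": set()}`, makes the function return `None`.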

3 Global wait-for graphs
- To show that there is no deadlock, it is not enough to show that no local graph contains a cycle.
- We need to construct the global wait-for graph.
- It is the union of all local graphs.
[Figure: global wait-for graph over P1, P2, P3, P4, P5.]
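A small sketch can show why local checks are not enough. It assumes each local graph is a dictionary from a process to the set of processes it waits for, and it tests for a cycle by repeatedly discarding processes that wait for nobody (whatever remains lies on or leads into a cycle). The function names and the edges are illustrative assumptions; the edges are chosen to match the cycle P2, P3, P4, P2 used later in the lecture.

```python
from typing import Dict, Iterable, Set

Graph = Dict[str, Set[str]]  # process -> processes it is waiting for


def union_graphs(local_graphs: Iterable[Graph]) -> Graph:
    """Global wait-for graph: the union of the edges of all local graphs."""
    merged: Graph = {}
    for graph in local_graphs:
        for src, targets in graph.items():
            merged.setdefault(src, set()).update(targets)
            for dst in targets:
                merged.setdefault(dst, set())
    return merged


def has_cycle(graph: Graph) -> bool:
    """Drop processes that wait for nobody until none are left to drop."""
    remaining = union_graphs([graph])  # private copy with every node present
    while True:
        free = [p for p, targets in remaining.items() if not targets]
        if not free:
            break
        for p in free:
            del remaining[p]
        for targets in remaining.values():
            targets.difference_update(free)
    return bool(remaining)


# Assumed edges: no site sees a cycle on its own.
site_a = {"P1": {"P2"}, "P2": {"P3"}}
site_b = {"P3": {"P4"}, "P4": {"P2"}}

print(has_cycle(site_a), has_cycle(site_b))        # neither local graph has a cycle
print(has_cycle(union_graphs([site_a, site_b])))   # the union does
```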

4 How to construct this global wait-for graph?
- Centralized approach: the graph is maintained in one process, the deadlock-detection coordinator.
- Because of communication delay in the system, there are two types of graphs:
  - Real wait-for graph: the real but unknown state of the system
  - Constructed wait-for graph: an approximation generated by the coordinator during the execution of its algorithm
- When is the wait-for graph constructed?
  1. Whenever a local edge is inserted or removed (a message is sent to the coordinator)
  2. Periodically
  3. Whenever the coordinator invokes the cycle-detection algorithm
- What happens if a cycle is detected? The coordinator selects a victim and notifies all processes.

5 Centralized approach: deadlock detection
- False cycles may exist in the constructed global wait-for graph. Messages reach the coordinator with varying delays, so if a remove-edge message arrives after an add-edge message that was sent later, the coordinator briefly holds both edges and may find a cycle that never existed in the real graph.
- There is a centralized deadlock detection algorithm, based on Option 3 (building the graph when the coordinator invokes cycle detection), that detects all deadlocks and reports no false deadlocks.

6 Centralized deadlock detection algorithm
- The wait-for graph is built whenever the coordinator needs to invoke the cycle-detection algorithm.
- In addition:
  1. When Pi at site A requests a resource from Pj at site B, a Request message is sent together with a timestamp (TS).
  2. The edge Pi->Pj is inserted in the local wait-for graph of A.
  3. This edge is inserted in the local wait-for graph of B only if B cannot grant the requested resource immediately. The TS is associated with the edge.
  4. Local edges (requests between processes at the same site) have no TS associated.

7 Cycle detection phase
1. The coordinator sends a message to each site.
2. Each site responds with its local wait-for graph.
3. The coordinator builds the global wait-for graph as follows:
   - The graph contains one vertex for every process.
   - The graph has an edge Pi->Pj if and only if:
     - Pi->Pj is a local edge in one of the local graphs, or
     - Pi->Pj has a TS associated and appears with that TS in more than one local graph (a remote request is recorded at no more than two sites, the requester's and the holder's).
4. A cycle in this graph means deadlock; no cycle means the system is not in a deadlock state.
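The edge rule of the cycle detection phase can be sketched as follows. This is an illustration under stated assumptions, not the course's implementation: each site is assumed to report a list of `(waiter, holder, ts)` tuples, with `ts` set to `None` for a local edge, and a timestamped edge is accepted only when at least two sites report it with the same timestamp. The function name, the report format, and the sample data are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Set, Tuple

# (waiter, holder, timestamp); timestamp is None for a local edge
Edge = Tuple[str, str, Optional[int]]


def build_global_graph(reports: Dict[str, List[Edge]]) -> Dict[str, Set[str]]:
    """Combine the local graphs reported by the sites into one global graph."""
    # Count, per timestamped edge, how many sites reported it.
    sites_reporting: Counter = Counter()
    for edges in reports.values():
        for edge in set(edges):  # a site counts once per edge
            if edge[2] is not None:
                sites_reporting[edge] += 1

    graph: Dict[str, Set[str]] = {}
    for edges in reports.values():
        for src, dst, ts in edges:
            graph.setdefault(src, set())  # one vertex per process
            graph.setdefault(dst, set())
            if ts is None or sites_reporting[(src, dst, ts)] > 1:
                graph[src].add(dst)
    return graph


# Hypothetical reports. P1's request to P5 (TS 9) is known only to site A,
# for example because site B granted it immediately, so it is left out.
reports = {
    "A": [("P1", "P2", None), ("P2", "P4", 7), ("P1", "P5", 9)],
    "B": [("P2", "P4", 7), ("P4", "P5", None)],
}
global_graph = build_global_graph(reports)
print(global_graph)
```

Requiring both endpoints to confirm a timestamped edge is what keeps a request that was already granted, or not yet received, from producing a false cycle.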

8 Fully distributed approach: deadlock detection
- Every site takes part in the deadlock detection process.
- Every site constructs an augmented local wait-for graph that characterizes the dynamic behavior of the system.
- Deadlock means that there is a cycle at at least one of the sites.

9 Building the augmented local wait-for graphs
- We add one additional node, Pex, to each local wait-for graph.
- We add edges:
  - Pi->Pex if Pi is waiting on a resource held by any process at a different site.
  - Pex->Pj if a process at a remote site is waiting on Pj.
- Detection idea: a cycle that does not involve Pex means a deadlock state; a cycle that involves Pex means a possible deadlock.

10 Augmented local wait-for graphs
[Figure: augmented local wait-for graphs at Site A and Site B, each with the extra node Pex.]
- P3 is waiting for a resource held by P4, therefore P3->Pex is added.
- Pex->P2 is added because P4 waits on P2 at site B.
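The two Pex rules can be sketched in a few lines. The sketch assumes a site knows its own processes plus every wait-for edge that touches one of them. The helper `augment`, the process placement (P1, P2, P3, P5 local to site A, P4 remote), and the edge P1->P2 are assumptions made for illustration; only P3->Pex and Pex->P2 come from the slide.

```python
from typing import Dict, Iterable, Set, Tuple

PEX = "Pex"  # stands for "some process at another site"


def augment(local: Set[str], edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build one site's augmented local wait-for graph."""
    graph: Dict[str, Set[str]] = {p: set() for p in local}
    graph[PEX] = set()
    for waiter, holder in edges:
        if waiter in local and holder in local:
            graph[waiter].add(holder)  # ordinary local edge
        elif waiter in local:
            graph[waiter].add(PEX)     # local process waits on a remote one
        elif holder in local:
            graph[PEX].add(holder)     # remote process waits on a local one
        # edges between two remote processes are invisible to this site
    return graph


site_a_processes = {"P1", "P2", "P3", "P5"}
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P2")]
site_a = augment(site_a_processes, edges)
print(site_a)
```

With these assumed edges, site A's graph contains the cycle Pex->P2->P3->Pex, which signals only a possible deadlock because it passes through Pex.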

11 Detection
- Suppose site Si finds a cycle Pex->…Pk->Pex, where Pk is waiting on a resource from site Sj.
- Si sends a message describing the cycle to Sj. Sj updates its local graph and searches it for a cycle that does not involve Pex.
  - If there is such a cycle, a deadlock is detected and the algorithm terminates.
  - If there is a cycle involving Pex, Sj sends a message in the same way.
- The algorithm ends when a deadlock is detected, or when all sites have been updated and no local cycle without Pex has been found.

12 Detection: example
[Figure: augmented wait-for graph at Site B with nodes P2, P3, P4, and Pex.]
- The edge P2->P3 is added because of the message from Site A.
- Deadlock detected: P2, P3, P4, P2.
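The example can be replayed with a short sketch. Site B's starting edges (Pex->P3, P3->P4, P4->P2, P2->Pex) are an assumption reconstructed from the slide text, and the message from Site A is modeled simply as the list of processes on A's cycle with Pex stripped. The function names are hypothetical, and the search is a plain depth-first walk meant for small graphs only.

```python
from typing import Dict, List, Optional, Set

PEX = "Pex"
Graph = Dict[str, Set[str]]


def cycle_without_pex(graph: Graph) -> Optional[List[str]]:
    """Look for a cycle that never passes through Pex."""

    def dfs(node: str, path: List[str]) -> Optional[List[str]]:
        if node in path:
            return path[path.index(node):] + [node]
        for nxt in sorted(graph.get(node, ())):
            if nxt != PEX:
                found = dfs(nxt, path + [node])
                if found:
                    return found
        return None

    for start in sorted(graph):
        if start != PEX:
            found = dfs(start, [])
            if found:
                return found
    return None


def receive_detection_message(graph: Graph, path: List[str]) -> Optional[List[str]]:
    """Add the edges along the reported path, then search the graph again."""
    for src, dst in zip(path, path[1:]):
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return cycle_without_pex(graph)


# Assumed augmented graph at Site B before the message arrives.
site_b: Graph = {PEX: {"P3"}, "P3": {"P4"}, "P4": {"P2"}, "P2": {PEX}}
before = cycle_without_pex(site_b)

# Site A found Pex->P2->P3->Pex and reports the path P2, P3.
after = receive_detection_message(site_b, ["P2", "P3"])
print(before, after)
```

Before the message, every cycle at Site B passes through Pex; after the edge P2->P3 is added, the search returns the cycle P2, P3, P4, P2 named on the slide.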

13 Election algorithms
- Many distributed algorithms use a coordinator process, for example for:
  - Enforcing mutual exclusion
  - Maintaining the global wait-for graph
  - Replacing a lost token
- If the coordinator process fails, a new copy of the coordinator should be started.
- Where it is started is decided by an election algorithm.

14 Election algorithms: assumptions
- Every process has a priority number.
- The coordinator is always the process with the largest priority number.
- If the coordinator fails, the algorithm must elect the active process with the largest priority number.

15 The Bully algorithm
- Suppose Pi sends a message to the coordinator and it is not answered within time T. Pi assumes that the coordinator has failed.
- Pi sends an election message to all processes with higher priorities.
- If no response is received within time T, Pi assumes that all these processes have failed. Pi starts a copy of the coordinator.
- If a response is received, Pi waits for time T' to see whether a process with a higher priority has been elected.

16 The Bully algorithm, cont.
- If no answer is received within T', the algorithm is restarted (the process that sent the first reply may itself have failed).
- A process that is elected sends a message to all other processes to announce that it is the new coordinator.
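The outcome of the Bully algorithm can be illustrated with a heavily simplified, single-threaded sketch. It assumes that a failed process never answers and a live one always answers in time, so the timers T and T' are reduced to a membership test on a set of live processes. It also assumes that a responder simply takes over the election, which collapses the real message exchange. The function name and the sample priorities are hypothetical.

```python
from typing import List, Set


def bully_election(initiator: int, priorities: List[int], alive: Set[int]) -> int:
    """Return the priority number of the process that becomes coordinator."""
    current = initiator
    while True:
        # "Send" an election message to every process with a higher priority;
        # only live processes answer.
        responders = [p for p in priorities if p > current and p in alive]
        if not responders:
            # No answer within T: current starts a copy of the coordinator
            # and would now notify all other processes.
            return current
        # A higher-priority process answered and runs the election itself.
        current = min(responders)


priorities = [1, 2, 3, 4, 5]
alive = {1, 2, 3, 4}  # process 5, the old coordinator, has failed

print(bully_election(2, priorities, alive))
```

Whichever live process starts the election, the loop ends at the live process with the largest priority number, which is the property the assumptions slide asks for.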

17 Summary
- Deadlocks are primarily handled with detection in distributed systems.
- The main problem is maintaining the wait-for graph.
- Some of the distributed algorithms require the use of a coordinator.
- If the coordinator fails, an election algorithm is run to start a new copy on some site.

