Distributed process management: Distributed deadlock

Slides:



Advertisements
Similar presentations
Chapter 6 Concurrency: Deadlock and Starvation Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee Community.
Advertisements

Chapter 6 Concurrency: Deadlock and Starvation Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee Community.
CS3771 Today: deadlock detection and election algorithms  Previous class Event ordering in distributed systems Various approaches for Mutual Exclusion.
1 Concurrency: Deadlock and Starvation Chapter 6.
1 Distributed Deadlock Fall DS Deadlock Topics Prevention –Too expensive in time and network traffic in a distributed system Avoidance.
Handling Deadlocks n definition, wait-for graphs n fundamental causes of deadlocks n resource allocation graphs and conditions for deadlock existence n.
Concurrency: Deadlock and Starvation Chapter 6. Deadlock Permanent blocking of a set of processes that either compete for system resources or communicate.
Concurrency: Deadlock and Starvation Chapter 6. Deadlock Permanent blocking of a set of processes that either compete for system resources or communicate.
Chapter 6 Concurrency: Deadlock and Starvation Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee Community.
Concurrency: Deadlock and Starvation Chapter 6. Deadlock Permanent blocking of a set of processes that either compete for system resources or communicate.
6. Deadlocks 6.1 Deadlocks with Reusable and Consumable Resources
Deadlocks in Distributed Systems Ryan Clemens, Thomas Levy, Daniel Salloum, Tagore Kolluru, Mike DeMauro.
Distributed Process Management
Chapter 18 Distributed Process Management Patricia Roy Manatee Community College, Venice, FL ©2008, Prentice Hall Operating Systems: Internals and Design.
Chapter 18 Distributed Process Management Dave Bremer Otago Polytechnic, N.Z. ©2008, Prentice Hall Operating Systems: Internals and Design Principles,
Chapter 6 Concurrency: Deadlock and Starvation
Chapter 6 Concurrency: Deadlock and Starvation
CompSci 143aSpring, Deadlocks 6.1 Deadlocks with Reusable and Consumable Resources 6.2 Approaches to the Deadlock Problem 6.3 A System Model.
Operating Systems Principles Process Management and Coordination Lecture 6: Deadlocks 主講人:虞台文.
CSC 322 Operating Systems Concepts Lecture - 29: by
Distributed DBMSPage © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture Distributed Database.
Chapter 7: Deadlocks. 7.2 Chapter Objectives To develop a description of deadlocks, which prevent sets of concurrent processes from completing their tasks.
Distributed Process Management
Deadlock CSCI 444/544 Operating Systems Fall 2008.
Concurrency: Deadlock & Starvation
OS Spring 2004 Concurrency: Principles of Deadlock Operating Systems Spring 2004.
Witawas Srisa-an Chapter 6
CPSC 4650 Operating Systems Chapter 6 Deadlock and Starvation
OS Fall’02 Concurrency: Principles of Deadlock Operating Systems Fall 2002.
1 Concurrency: Deadlock and Starvation Chapter 6.
Chapter 6 Concurrency: Deadlock and Starvation
Silberschatz, Galvin and Gagne  Operating System Concepts Deadlock and Starvation Deadlock – two or more processes are waiting indefinitely for.
Deadlocks in Distributed Systems Deadlocks in distributed systems are similar to deadlocks in single processor systems, only worse. –They are harder to.
Distributed Deadlocks and Transaction Recovery.
Concurrency: Deadlock and Starvation Chapter 6. Goal and approach Deadlock and starvation Underlying principles Solutions? –Prevention –Detection –Avoidance.
1 Concurrency: Deadlock and Starvation Chapter 6.
Concurrency: Deadlock and Starvation
Deadlocks – An Introduction
System Model Deadlock Characterization Methods for Handling Deadlocks Deadlock Prevention, Avoidance, and Detection Recovering from Deadlock Combined Approach.
1 Announcements The fixing the bug part of Lab 4’s assignment 2 is now considered extra credit. Comments for the code should be on the parts you wrote.
1 Distributed Process Management Chapter Distributed Global States Operating system cannot know the current state of all process in the distributed.
Chapter 7 – Deadlock (Pgs 283 – 306). Overview  When a set of processes is prevented from completing because each is preventing the other from accessing.
Chapter 7 –System Model – typical assumptions underlying the study of distributed deadlock detection Only reusable resources, only exclusive access, single.
Silberschatz, Galvin and Gagne ©2013 Operating System Concepts – 9 th Edition Chapter 7: Deadlocks.
Distributed and hierarchical deadlock detection, deadlock resolution
CS6502 Operating Systems - Dr. J. Garrido Deadlock – Part 2 (Lecture 7a) CS5002 Operating Systems Dr. Jose M. Garrido.
Deadlock Operating Systems: Internals and Design Principles.
DEADLOCK DETECTION ALGORITHMS IN DISTRIBUTED SYSTEMS
Chapter 6 Concurrency: Deadlock and Starvation
Lecture 12 Handling Deadlock – Prevention, avoidance and detection.
Chapter 6 Concurrency: Deadlock and Starvation Operating Systems: Internals and Design Principles, 6/E William Stallings.
Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Lecture 7 Operating Systems.
Operating Systems Unit VI Deadlocks and Protection Department of Computer Science Engineering and Information Technology.
Chapter 8 Deadlocks. Objective System Model Deadlock Characterization Methods for Handling Deadlocks Deadlock Prevention Deadlock Avoidance Deadlock Detection.
CS3771 Today: Distributed Coordination  Previous class: Distributed File Systems Issues: Naming Strategies: Absolute Names, Mount Points (logical connection.
The Principles of Operating Systems Chapter 9 Distributed Process Management.
Informationsteknologi Monday, October 1, 2007Computer Systems/Operating Systems - Class 111 Today’s class Deadlock.
ICS Deadlocks 6.1 Deadlocks with Reusable and Consumable Resources 6.2 Approaches to the Deadlock Problem 6.3 A System Model –Resource Graphs –State.
Lecture 6 Deadlock 1. Deadlock and Starvation Let S and Q be two semaphores initialized to 1 P 0 P 1 wait (S); wait (Q); wait (Q); wait (S);. signal (S);
Chapter 6 Concurrency: Deadlock and Starvation Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee Community.
Synchronization: Distributed Deadlock Detection
Mutual Exclusion A condition in which there is a set of processes, only one of which is able to access a given resource or perform a given function at.
Concurrency: Deadlock and Starvation
ITEC 202 Operating Systems
Chapter 12: Concurrency, Deadlock and Starvation
Chapter 7 Deadlock.
Deadlocks Session - 13.
DEADLOCK.
Distributed Deadlock Detection
Operating Systems Principles Process Management and Coordination Lecture 6: Deadlocks 主講人:虞台文.
Presentation transcript:

Distributed process management: Distributed deadlock CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Distributed deadlock Problem definition Types of distributed deadlock Permanent blocking of a set of processes that either compete for system resources or communicate with each other No node has complete and up-to-date knowledge of the entire distributed system Message transfers between processes take unpredictable delays Types of distributed deadlock Resource deadlock Set of deadlocked processes, where each process waits for a resource held by another process (e.g., data object in a database, I/O resource on a server) Communication deadlocks: Set of deadlocked processes, where each process waits to receive messages (communication) from other processes in the set.  CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Distributed deadlock (cont.) System state representation with wait-for graphs (WFG) Nodes are processes, P1, P2, etc. Directed edge from P1 to P2 if P1 blocked and waiting for P2 to release a resource System is deadlocked if there is a directed cycle CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in resource allocation Conditions for deadlock in resource allocation Mutual exclusion: The resource can be used by only one process at a time Hold and wait: A process holds a resource while waiting for other resources No preemption: A process cannot be preempted to free up the resource Circular wait: A closed cycle of processes is formed, where each process holds one or more resources needed by the next process in the cycle Strategies Prevent the formation of a circular wait Detect the potential or the actual occurrence of a circular wait Types of algorithms Deadlock prevention Deadlock avoidance Deadlock detection Special issues in distributed systems Resources are distributed across many sites The control processes that control access to resources do not have complete, up-to-date information on the global state of the system CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in resource allocation: Deadlock handling strategies Deadlock prevention 1.a Prevent the circular-wait condition by defining a linear ordering of resource types A process can be assigned resources only according to the linear ordering Disadvantages Resources cannot be requested in the order that are needed Resources will be longer than necessary 1.b Prevent the hold-and-wait condition by requiring the process to acquire all needed resources before starting execution Inefficient use of resources Reduced concurrency Process can become deadlocked during the initial resource acquisition Future needs of a process cannot be always predicted CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in resource allocation: Deadlock handling strategies Deadlock prevention (cont.) 1.c Use of time-stamps Example: Use time-stamps for transactions to a database – each transaction has the time-stamp of its creation The circular wait condition is avoided by comparing time-stamps: strict ordering of transactions is obtained, the transaction with an earlier time-stamp always wins “Wait-die” method if [ e (T2) < e (T1) ] halt_T2 (‘wait’); else kill_T2 (‘die’); “Wound-wait” method kill_T1 (‘wound’); CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in resource allocation (cont.) (2) Deadlock avoidance Decision made dynamically, before allocating a resource, the resulting global system state is checked - if safe, allow allocation Disadvantages Every site has to maintain global state of system (extensive overhead in storage and communication) Different sites may determine (concurrently) that state is safe, but global state may be unsafe: verification for safe global state by different sites must be mutually exclusive Large overhead to check for every allocation (distributed system may have large number of processes and resources) Conclusion: Deadlock avoidance impractical in distributed systems CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in resource allocation (cont.) (3) Deadlock Detection Principle of operation Detection of a cycle in WFG proceeds concurrently with normal operation Requirements for the deadlock detection and resolution algorithms Detection The algorithm must detect all existing deadlock in finite time The algorithm should not report non-existent (phantom) deadlock Resolution (recovery) All existing wait-for dependencies in WFG must be removed, i.e. roll-back one or more processes that are deadlocked and give their resources to other blocked processes Observation Deadlock detection is the most popular strategy for handling deadlocks in distributed systems CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

3) Deadlock Detection (cont.) Deadlock in resource allocation: Algorithms for distributed deadlock detection 3) Deadlock Detection (cont.) Control for distributed deadlock detection can be: Centralized Distributed Hierarchical  a.1 Centralized deadlock detection algorithms A central control site constructs the global WFG and searches for cycles Control site an maintain WFG continuously (with every assignment) or when running deadlock detection (and asking all sites for WFG updates) Disadvantages: single point of failure and congestion  a.2 The completely centralized algorithm All sites request resources and release resources by sending corresponding messages to control site Control site updates WFG for each request/release For every new request edge added to WFG, control site checks WFG for deadlock Alternative: each site maintain its WFG and update control site periodically or on request CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

3) Deadlock Detection (cont.) Deadlock in resource allocation: Algorithms for distributed deadlock detection   3) Deadlock Detection (cont.) b. Hierarchical deadlock detection algorithms Sites organized in a tree structure with one site at the root of the tree Each node (except for leaf nodes) has information about the dependent nodes Deadlock is detected by the node that is the common ancestor of all sites which have resource allocations in conflict Deadlock is detected at the lowest level CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

3) Deadlock Detection (cont.) Deadlock in resource allocation: Algorithms for distributed deadlock detection (cont.) 3) Deadlock Detection (cont.) c. Distributed deadlock detection algorithms Principles All sites responsible for detecting a global deadlock Global state graph distributed over many sites: several of them participate in detection Detection initiated when a process suspected to be deadlocked Advantages: No single point of failure, no congestion Disadvantages: Difficult to implement (no shared memory) Types of algorithms Path-pushing algorithms Each node builds a WFG based on local info & info from other sites Detect and resolves local deadlocks Transmits to other sites deadlock info in form of (waiting path) Edge-chasing algorithms Special messages (probes) sent along edges of WFG to detect a cycle When blocked process receives probe, resends it on its outgoing edges of WFG When a process receives a probe it initiated, declares deadlock CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in message communication Mutual Waiting Deadlock conditions Each of a group of processes is waiting for a message from another member of the group There are no messages in transit Concepts Dependence set (DS) of process Pi is the set of all processes from which Pi is expecting a message Pi can proceed when any of the expected messages arrive Deadlock in a set S of processes All processes in S are stopped, waiting for messages S contains the dependence set of all processes in S No messages are in transit between processes in S CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in message communication Mutual Waiting (cont.) Resource deadlock vs. message deadlock Resource deadlock A deadlock exists if there is a cycle in the WFG A process Pi is dependent on process Pj if Pj holds a resource that Pi needs Message deadlock All successors Pj of a process Pi in S are also in S Example Fig. 14.16a P1 is waiting for a message from either P2 or P5 P5 is not waiting for any message; sends a message to P1, which is released Links (P1, P5) and (P1, P2) are removed No deadlock Fig. 14.16b P5 is now waiting for a message from P2 P2 is waiting for a message from P3 P3 is waiting for a message from P1 P1 is waiting for a message from P2 Deadlock Solution: prevention or detection CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in message communication: Mutual waiting (cont.) CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in message communication Unavailability of Message Buffers Deadlock can occur in the allocation of buffers for the storage of messages in transit (Example: packet-switching data networks) Direct store-and-forward deadlock Example: Two packet switching nodes, each using a common buffer pool from which buffers are assigned to packets on demand Buffer space for A is filled with packets destined for B Buffer space for B is filled with packets destined for A Neither node can transmit or receive packets: deadlock Solution: Use separate buffers, one for each link CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in message communication (cont.) Unavailability of Message Buffers (cont.) Indirect store-and-forward deadlock For each node, the queue to the adjacent node in one direction is full with packets destined for the next node beyond CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in message communication (cont.) Unavailability of Message Buffers (cont.) Indirect store-and-forward deadlock: Solution Use a structured buffer pool, hierarchically organized, with N+1 levels (N is the maximum number of hops an any network path) Common pool – class 0 is unrestricted: any incoming packet can be stored there Buffers at level k (where 0 < k  N) are reserved for packets that have traveled at least k hops so far Under heavy load, buffers will fill from 0 to N When buffers fill to level k, new incoming packets that passed k hops or less are discarded No direct or indirect store-and-forward deadlock can occur CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in message communication: (cont.) Unavailability of Message Buffers (cont.) Deadlock in a distributed OS using message passing for inter-process communication A non-blocking Send operation requires a buffer to hold outgoing messages Example 1 Process X has a buffer of size n After sending n messages, buffer is full When sending message n+1, process X will block until sufficient buffer is freed Example 2 Process Y has a buffer of size m Both buffers, n and m, become full and the two processes are blocked: deadlock Possible solutions Prevention: estimate maximum number of messages in transit and allocate corresponding number of buffers Detection: detect deadlock and roll back one of the processes CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]

Deadlock in message communication: Unavailability of message buffers (cont.) CS-550 (M.Soneru): Distributed process management – Distributed deadlock [Sta’01], [SaS]