The Principles of Operating Systems, Chapter 9: Distributed Process Management

Presentation transcript:

Contents
- Process Migration
- Distributed Global States
- Mutual Exclusion
- Distributed Deadlock

Process Migration
- The transfer of a sufficient amount of the state of a process from one machine to another
- The process then executes on the target machine
- Interest in this concept grew out of research into methods of load balancing across multiple networked systems

Motivation
- Load sharing
  - Move processes from heavily loaded to lightly loaded systems
  - Load can be balanced to improve overall performance
- Communications performance
  - Processes that interact intensively can be moved to the same node to reduce communications cost
  - When the data is large, it may be better to move the process to where the data resides

Motivation (cont.)
- Availability
  - A long-running process may need to move because the machine it is running on will be down
- Utilizing special capabilities
  - A process can take advantage of unique hardware or software capabilities

Initiation of Migration
- Who initiates process migration?
  - The operating system, when the goal is load balancing
  - The process itself, when the goal is to reach a particular resource

What is Migrated?
- The process must be destroyed on the source system and created on the target system
- The process control block and any links must be moved
- The transfer of the process from one machine to another is invisible to both the migrated process and its communication partners

Example of Process Migration

What is Migrated? (cont.)
- The difficulty of process migration lies in the process address space and any open files assigned to the process. Several strategies have been considered.
- Eager (all): transfer the entire address space
  - No trace of the process is left behind
  - If the address space is large and the process does not need most of it, this approach may be unnecessarily expensive

What is Migrated? (cont.)
- Precopy: the process continues to execute on the source node while the address space is copied
  - Pages modified on the source during the precopy operation have to be copied a second time
  - Reduces the time that the process is frozen and cannot execute during migration
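
The precopy strategy can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `precopy_migrate`, the round limit, the stop threshold, and the per-round sets of dirtied pages are all hypothetical and do not come from the slides.

```python
def precopy_migrate(num_pages, dirtied_per_round, max_rounds=5, stop_threshold=2):
    """Simulate precopy and return (pages sent while running, pages sent while frozen).

    dirtied_per_round[i] is the set of pages the still-running process
    modifies during copy round i (a hypothetical workload).
    """
    to_send = set(range(num_pages))      # round 0 copies the whole address space
    sent_while_running = 0
    for round_no in range(max_rounds):
        if len(to_send) <= stop_threshold:
            break                        # few enough dirty pages: freeze now
        sent_while_running += len(to_send)
        # Pages modified during this round have to be copied a second time.
        if round_no < len(dirtied_per_round):
            to_send = set(dirtied_per_round[round_no])
        else:
            to_send = set()
    # The process is frozen only while the remaining dirty pages are sent.
    return sent_while_running, len(to_send)


running, frozen = precopy_migrate(100, [set(range(10)), {1, 2}])
print(running, frozen)   # 110 2
```

In this run 110 page transfers happen while the process keeps executing, and only 2 pages are sent while it is frozen, compared with 100 for eager (all).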

What is Migrated? (cont.)
- Eager (dirty): transfer only that portion of the address space that is in main memory and has been modified
  - Any additional blocks of the virtual address space are transferred on demand
  - The source machine is involved throughout the life of the process

What is Migrated? (cont.)
- Copy-on-reference: pages are brought over only when they are referenced
  - A variation of eager (dirty)
  - Has the lowest initial cost of process migration
- Flushing: pages of the migrated process are cleared from main memory by flushing dirty pages to disk
  - Relieves the source of holding any pages of the migrated process in main memory

Negotiation of Migration
- Some systems allow the designated target system to participate in the decision about migration
- An example of a negotiation mechanism:
  - Migration policy is the responsibility of the Starter utility
  - The Starter utility is also responsible for long-term scheduling and memory allocation
  - The decision to migrate must be reached jointly by two Starter processes (one on the source and one on the destination)

Negotiation of Process Migration

Eviction
- A system may evict a process that has been migrated to it
- If a workstation is idle, processes may have been migrated to it
- Once the workstation becomes active, it may be necessary to evict the migrated processes to provide adequate response time
- Foreign process vs. native process

Distributed Global States
- The operating system cannot know the current state of all processes in the distributed system
- A process can only know the current state of all processes on the local system
- The state of remote processes is known only through state information received in messages
- These messages represent the state at some time in the past

Example
- A bank account is distributed over two branches
- The total amount in the account is the sum of the amounts at each branch
- At 3:00 PM the account balance is determined
- Messages are sent to request the information

Example (cont.)
[Figure: Example of Determining Global States]

Example (cont.)
- If, at the time of balance determination, the balance from branch A is in transit to branch B, the result is a false reading

Example (cont.)
- All messages in transit must be examined at the time of observation
- The total consists of the balance at both branches plus the amount in the message

Example (cont.)
- If the clocks at the two branches are not perfectly synchronized:
  - The amount is transferred from branch A at 3:01 (by A's clock)
  - The amount arrives at branch B at 2:59 (by B's clock)
  - At 3:00 the amount is counted twice
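
The two failure cases in the bank example come down to simple arithmetic. The sketch below uses an assumed account total of $100 and hypothetical helper names (`naive_total`, `total_with_transit`); the slides give no concrete amounts.

```python
def naive_total(balance_a, balance_b):
    """Sum only the two branch balances, ignoring messages in transit."""
    return balance_a + balance_b


def total_with_transit(balance_a, balance_b, in_transit):
    """Sum the branch balances plus every amount still inside a message."""
    return balance_a + balance_b + sum(in_transit)


# Case 1: branch A has just sent its whole $100 to branch B.
print(naive_total(0, 0))                 # 0, a false reading
print(total_with_transit(0, 0, [100]))   # 100

# Case 2: skewed clocks. A records at its 3:00, before its 3:01 transfer;
# B records at its 3:00, after the amount arrived at its 2:59.
print(naive_total(100, 100))             # 200, the amount is counted twice
```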

Example (cont.)

Some Terms
- Channel
  - Exists between two processes if they exchange messages. Channels are viewed as unidirectional.
- State
  - The sequence of messages that have been sent and received along channels incident with the process

Some Terms (cont.)
- Snapshot
  - Records the state of a process
- Global state
  - The combined state of all processes
  - A true global state cannot be determined because of the time lapse associated with message transfer
- Distributed snapshot
  - A collection of snapshots, one for each process

Global State
- The current distributed snapshot indicates that Message 3 has been received but not yet sent, so the snapshot is inconsistent

Global State (cont.)
- A global state is consistent if, for every process state that records the receipt of a message, the sending of that message has been recorded in the process state of the process that sent the message
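
The consistency condition can be checked mechanically. In this minimal sketch each process snapshot is assumed to be a dictionary with a `sent` set and a `received` set of message identifiers; that representation and the function name `is_consistent` are illustrative assumptions.

```python
def is_consistent(snapshots):
    """Return True if every recorded receipt has a matching recorded send.

    snapshots maps a process name to {"sent": set, "received": set},
    where the sets hold message identifiers (assumed representation).
    """
    recorded_sends = set()
    for snap in snapshots.values():
        recorded_sends |= snap["sent"]
    return all(snap["received"] <= recorded_sends for snap in snapshots.values())


# M3 is recorded as received by B, but no snapshot records sending it.
inconsistent = {
    "A": {"sent": {"M1"}, "received": set()},
    "B": {"sent": set(), "received": {"M1", "M3"}},
}
# M3 is recorded as sent but not yet received: it is simply in transit.
consistent = {
    "A": {"sent": {"M1", "M3"}, "received": set()},
    "B": {"sent": set(), "received": {"M1"}},
}
print(is_consistent(inconsistent), is_consistent(consistent))   # False True
```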

We cannot determine the true global state because of the time lapse associated with message transfer. In practice, however, we can describe the distributed system by collecting snapshots from all processes and attempting to define a consistent global state.

Distributed Snapshot Algorithm
- Assumes reliable, in-order message delivery
- Some process initiates the algorithm by recording its state and sending a marker on all outgoing channels
- When P receives a marker from Q for the first time:
  - Record the local state S_p
  - Mark the state of the Q→P channel as empty
  - Propagate the marker on all outgoing channels
  - Record incoming messages on the other channels until a marker is received on each
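
The marker rules can be sketched as a small single-threaded simulation. This is an illustration under assumptions, not the slides' implementation: the `Process` and `Network` classes, the use of bank balances as process state, and the string marker value are all hypothetical.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical marker value, distinct from integer payloads


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance        # live application state
        self.recorded = None          # local snapshot S_p, once taken
        self.channel_log = {}         # sender -> messages caught in transit
        self.open_channels = set()    # incoming channels still being recorded


class Network:
    def __init__(self, balances):
        self.procs = {n: Process(n, b) for n, b in balances.items()}
        # One FIFO queue per ordered pair models a reliable, in-order channel.
        self.chan = {(a, b): deque()
                     for a in self.procs for b in self.procs if a != b}

    def transfer(self, src, dst, amount):
        self.procs[src].balance -= amount
        self.chan[(src, dst)].append(amount)

    def record(self, name):
        """Record local state and send a marker on every outgoing channel."""
        p = self.procs[name]
        others = [q for q in self.procs if q != name]
        p.recorded = p.balance
        p.channel_log = {q: [] for q in others}
        p.open_channels = set(others)
        for q in others:
            self.chan[(name, q)].append(MARKER)

    def deliver(self, src, dst):
        """Deliver the oldest message on channel src -> dst."""
        msg = self.chan[(src, dst)].popleft()
        p = self.procs[dst]
        if msg == MARKER:
            if p.recorded is None:
                self.record(dst)              # first marker seen by dst
            p.open_channels.discard(src)      # stop recording this channel
        else:
            p.balance += msg
            if src in p.open_channels:
                p.channel_log[src].append(msg)  # message was in transit

    def snapshot_total(self):
        return sum(p.recorded + sum(map(sum, p.channel_log.values()))
                   for p in self.procs.values())


net = Network({"A": 100, "B": 0})
net.transfer("A", "B", 100)   # the $100 is now in transit
net.record("B")               # B initiates: records 0, sends a marker to A
net.deliver("A", "B")         # transfer arrives after B recorded: logged
net.deliver("B", "A")         # A sees its first marker: records 0, replies
net.deliver("A", "B")         # marker closes channel A -> B
print(net.snapshot_total())   # 100
```

Both recorded balances are 0 here, yet the combined snapshot still totals 100 because the in-transit transfer is captured in the channel log for A→B.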

The algorithm terminates at a process once the marker has been received along every incoming channel. The recorded snapshots are then combined to form the global state.

Mutual Exclusion Requirements
- Mutual exclusion must be enforced: only one process at a time is allowed in its critical section
- A process that halts in its noncritical section must do so without interfering with other processes
- It must not be possible for a process requiring access to a critical section to be delayed indefinitely: no deadlock or starvation

Mutual Exclusion Requirements (cont.)
- When no process is in a critical section, any process that requests entry to its critical section must be permitted to enter without delay
- No assumptions are made about relative process speeds or the number of processors
- A process remains inside its critical section for a finite time only

Model for Mutual Exclusion Problem in Distributed Process Management

Centralized Algorithm for Mutual Exclusion
- One node is designated as the control node
- This node controls access to all shared objects
- If the control node fails, mutual exclusion breaks down
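
A minimal sketch of a control node follows, assuming a single shared object and a first-come, first-served waiting queue. The class name `ControlNode` and its methods are hypothetical, and the queueing policy is an assumption the slides do not specify.

```python
from collections import deque


class ControlNode:
    """Grants access to one shared object to a single process at a time."""

    def __init__(self):
        self.holder = None        # process currently in its critical section
        self.waiting = deque()    # requests queued in arrival order (assumed policy)

    def request(self, pid):
        """Return True if access is granted now, False if the request is queued."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Release the object and return the next holder, or None if nobody waits."""
        if pid != self.holder:
            raise ValueError(f"{pid} does not hold the resource")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


control = ControlNode()
print(control.request("P1"))   # True
print(control.request("P2"))   # False, queued behind P1
print(control.release("P1"))   # P2
```

All state lives in the one `ControlNode` object, which is why the failure of that node breaks mutual exclusion for every process.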

Distributed Algorithm
- All nodes have an equal amount of information, on average
- Each node has only a partial picture of the total system and must make decisions based on this information
- All nodes bear equal responsibility for the final decision

Distributed Algorithm (cont.)
- All nodes expend equal effort, on average, in effecting a final decision
- Failure of a node, in general, does not result in a total system collapse
- There exists no system-wide common clock with which to regulate the timing of events

Ordering of Events
- Events must be ordered to ensure mutual exclusion and avoid deadlock
- Clocks are not synchronized
- Communication delays
- State information for a process is not up to date

Ordering of Events (cont.)
- Need to be able to say consistently that one event occurs before another event
- Messages are sent when a process wants to enter its critical section and when it leaves its critical section
- Time-stamping
  - Orders events in a distributed system
  - The system clock is not used

Time-Stamping
- Each system on the network maintains a counter that functions as a clock
- Each site has a numerical identifier
- When a message is received, the receiving system sets its counter to one more than the maximum of its current value and the incoming time-stamp (counter)

Time-Stamping (cont.)
- If two messages have the same time-stamp, they are ordered by the numbers of their sites
- For this method to work, each message is sent from one process to all other processes
  - Ensures that all sites have the same ordering of messages
  - For mutual exclusion and deadlock, all processes must be aware of the situation
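
The time-stamping rules can be sketched in a few lines. The receive rule and the tie-break by site number follow the slides; incrementing the counter before each send is the usual companion rule and is an assumption here, as are the names `Site` and `ordered`.

```python
class Site:
    def __init__(self, site_id):
        self.site_id = site_id    # numerical identifier of the site
        self.counter = 0          # local counter that functions as a clock

    def send(self):
        """Stamp an outgoing message (assumed rule: increment before sending)."""
        self.counter += 1
        return (self.counter, self.site_id)

    def receive(self, stamp):
        """Set the counter to one more than max(current value, incoming time-stamp)."""
        incoming, _sender = stamp
        self.counter = 1 + max(self.counter, incoming)


def ordered(stamps):
    """Order by time-stamp first, then by site number to break ties."""
    return sorted(stamps)


s1, s2, s3 = Site(1), Site(2), Site(3)
m_a = s1.send()                 # (1, 1)
m_b = s2.send()                 # (1, 2), same time-stamp as m_a
s3.receive(m_b)                 # counter becomes 1 + max(0, 1) = 2
s3.receive(m_a)                 # counter becomes 1 + max(2, 1) = 3
print(ordered([m_b, m_a]))      # [(1, 1), (1, 2)]
print(s3.counter)               # 3
```

Because every site sorts the same (time-stamp, site number) pairs, all sites arrive at the same ordering of messages regardless of arrival order.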

Example of Operation of Timestamping Algorithm

Another Example of Operation of Timestamping Algorithm

State Diagram for Algorithm in [RICA81]

Token-Passing Approach
- Pass a token among the participating processes
- The token is an entity that at any time is held by one process
- The process holding the token may enter its critical section without asking permission
- When a process leaves its critical section, it passes the token to another process
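
A minimal sketch of token passing, assuming the processes are arranged in a logical ring so that "another process" means the next one in the ring. The ring arrangement and the class name `TokenRing` are assumptions; the slides do not fix how the next holder is chosen.

```python
class TokenRing:
    def __init__(self, process_ids):
        self.ring = list(process_ids)   # logical ring order (assumed topology)
        self.holder_index = 0           # the first process starts with the token

    @property
    def holder(self):
        return self.ring[self.holder_index]

    def may_enter(self, pid):
        """Only the token holder may enter its critical section."""
        return pid == self.holder

    def pass_token(self):
        """Called when the holder leaves its critical section."""
        self.holder_index = (self.holder_index + 1) % len(self.ring)
        return self.holder


ring = TokenRing(["P0", "P1", "P2"])
print(ring.may_enter("P0"), ring.may_enter("P1"))   # True False
print(ring.pass_token())                            # P1
```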

Deadlock in Resource Allocation
- Mutual exclusion
- Hold and wait
- No preemption
- Circular wait

Deadlock Prevention
- The circular-wait condition can be prevented by defining a linear ordering of resource types
- The hold-and-wait condition can be prevented by requiring that a process request all of its required resources at one time, and blocking the process until all requests can be granted simultaneously
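
The linear-ordering rule can be sketched with ordinary locks on a single machine: whatever order a process names its resources in, it acquires them in the one agreed global order. The class `OrderedLocks` and the resource names are hypothetical, and local `threading.Lock` objects stand in for distributed resources.

```python
import threading


class OrderedLocks:
    """Resources with a fixed linear order, always acquired in ascending rank."""

    def __init__(self, names):
        self.rank = {name: i for i, name in enumerate(names)}
        self.locks = {name: threading.Lock() for name in names}

    def acquire_all(self, needed):
        order = sorted(needed, key=self.rank.__getitem__)
        for name in order:
            self.locks[name].acquire()
        return order

    def release_all(self, held):
        for name in reversed(held):
            self.locks[name].release()


resources = OrderedLocks(["disk", "printer", "tape"])
finished = []


def worker(name, needed):
    held = resources.acquire_all(needed)   # always disk before tape
    finished.append(name)
    resources.release_all(held)


# The two workers name the same resources in opposite orders.
threads = [threading.Thread(target=worker, args=("P1", ["disk", "tape"])),
           threading.Thread(target=worker, args=("P2", ["tape", "disk"]))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(finished))   # ['P1', 'P2']
```

Without the sort in `acquire_all`, P1 could hold `disk` while P2 holds `tape`, which is exactly the circular wait the ordering rules out.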

Deadlock Avoidance
- Distributed deadlock avoidance is impractical
  - Every node must keep track of the global state of the system
  - The process of checking for a safe global state must be mutually exclusive
  - Checking for safe states involves considerable processing overhead for a distributed system with a large number of processes and resources

Distributed Deadlock Detection
- Each site knows only about its own resources
- Deadlock may involve distributed resources
- Centralized control: one site is responsible for deadlock detection
- Hierarchical control: detection is done by the lowest node above the nodes involved in the deadlock
- Distributed control: all processes cooperate in the deadlock detection function
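
Centralized control can be sketched as one site merging the waiting information reported by every other site and searching the result for a cycle. Representing that information as a wait-for graph (a dictionary from each process to the processes it waits on) is an assumption, as are the names `merge_graphs` and `find_cycle`.

```python
def merge_graphs(site_graphs):
    """Combine per-site wait-for graphs into one global graph."""
    merged = {}
    for graph in site_graphs:
        for proc, waits_on in graph.items():
            merged.setdefault(proc, set()).update(waits_on)
    return merged


def find_cycle(wait_for):
    """Return a list of processes forming a cycle, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    color = {}

    def visit(node, path):
        color[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:                 # back edge closes a cycle
                return path[path.index(nxt):]
            if state == WHITE:
                found = visit(nxt, path)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in wait_for:
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None


site_1 = {"P1": {"P2"}}      # at site 1, P1 waits for a resource held by P2
site_2 = {"P2": {"P1"}}      # at site 2, P2 waits for a resource held by P1
print(find_cycle(site_1))                           # None
print(find_cycle(merge_graphs([site_1, site_2])))   # ['P1', 'P2']
```

Neither site sees a cycle on its own, which illustrates why a site that knows only about its own resources cannot detect a distributed deadlock alone.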

Deadlock in Message Communication
- Mutual waiting
  - Deadlock occurs in message communication when each of a group of processes is waiting for a message from another member of the group and there are no messages in transit

Deadlock in Message Communication (cont.)
- Unavailability of message buffers
  - Well known in packet-switching data networks
  - Example: the buffer space at A is filled with packets destined for B, and the reverse is true at B

Direct Store-and-Forward Deadlock

Deadlock in Message Communication (cont.)
- Unavailability of message buffers
  - For each node, the queue to the adjacent node in one direction is full of packets destined for the next node beyond

Store-and-Forward Deadlock

Structured Buffer Pool