Operating Systems, 112 Practical Session 14, Distributed synchronization.

Motivation
Interest in distributed computation models is rapidly growing (grids, cloud computation, Internet relay, etc.).
A good model for computation is difficult to come up with:
1. Concurrent computation
2. No global time and no global state
3. Hard to capture effects of possible failures

The model
Each instance is executed on a different processor.
Assume no shared memory; communication is handled with messages of the following format: [Figure: message format]
Sending is non-blocking and reliable.
A processor waits for events (messages). A timeout mechanism is possible, but we will not discuss it here.

Global states and causality
It is impossible to determine the global state of a distributed system:
1. Communication is not instantaneous (delays, lost messages, etc.)
2. Processors cannot be synchronized with a timer mechanism (drift, initial synchronization)
3. Local interruptions occur (simultaneous reactions cannot be trusted)
Thus, we must find global system properties that we can depend on: the causal order of events.

Happened before
We would like to define some order over system events: a “happened before” relation (denoted <H):
1. Same-processor events: if e1 <p e2 then e1 <H e2
2. Send-receive events: if e1 <m e2 then e1 <H e2
3. Transitivity of <H: if e1 <H e2 and e2 <H e3 then e1 <H e3
<H defines a partial order, which can be represented as a DAG.
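The three rules can be sketched in Python as reachability in a DAG. This is a minimal illustration, not part of the original session: the event names and the two-processor run are hypothetical, and the helper names are assumptions.

```python
from collections import defaultdict

def build_happened_before(process_orders, messages):
    """Return the direct-successor edges of the happened-before DAG.

    process_orders: one list of events per processor, in local order (<p).
    messages: (send_event, receive_event) pairs (<m).
    """
    edges = defaultdict(set)
    for events in process_orders:
        for earlier, later in zip(events, events[1:]):
            edges[earlier].add(later)      # rule 1: same-processor order
    for send, recv in messages:
        edges[send].add(recv)              # rule 2: send before receive
    return edges

def happened_before(edges, a, b):
    """True if a <H b, i.e. b is reachable from a (rule 3: transitivity)."""
    stack, seen = list(edges[a]), set()
    while stack:
        e = stack.pop()
        if e == b:
            return True
        if e not in seen:
            seen.add(e)
            stack.extend(edges[e])
    return False

# Hypothetical run: processor p sends a message at a2 that q receives at b2.
edges = build_happened_before([["a1", "a2", "a3"], ["b1", "b2", "b3"]],
                              [("a2", "b2")])
print(happened_before(edges, "a1", "b3"))  # True: a1 <p a2 <m b2 <p b3
print(happened_before(edges, "b1", "a3"))  # False
print(happened_before(edges, "a3", "b1"))  # False: b1 and a3 are unordered
```

The last two calls show why the order is only partial: neither b1 <H a3 nor a3 <H b1 holds.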

Global time: Lamport’s timestamps
Defines a global (and total) order on events.
The order is consistent with <H.
Timestamps are created on the fly.
We assume that each event has a timestamp attached to it.
An ID is appended to the timestamp and allows for tie breaking.
Lamport’s algorithm guarantees: if e1 <H e2 then e1.TS < e2.TS

    Initially my_TS = 0
    Upon event e:
        if e is the receipt of message m
            my_TS = max(m.TS, my_TS)
        my_TS++
        e.TS = my_TS
        if e is the sending of message m
            m.TS = my_TS
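A minimal Python sketch of this clock follows. The class and method names are assumptions made for illustration; the (TS, pid) pair models the ID appended for tie breaking.

```python
class LamportClock:
    """Lamport timestamps for one processor; pid is used only for tie breaking."""

    def __init__(self, pid):
        self.pid = pid
        self.ts = 0                       # my_TS, initially 0

    def event(self, received_ts=None):
        """Advance the clock for any event and return the event's (TS, pid) stamp."""
        if received_ts is not None:       # the event is the receipt of a message
            self.ts = max(received_ts, self.ts)
        self.ts += 1
        return (self.ts, self.pid)

    def send(self):
        """A send is an event; the message carries the timestamp of that event."""
        return self.event()[0]

p1, p2 = LamportClock(1), LamportClock(2)
p2.event()                      # local event on p2, TS 1
m_ts = p1.send()                # p1 sends a message stamped 1
other = p1.event()              # local event on p1, stamp (2, 1)
stamp = p2.event(m_ts)          # p2 receives: max(1, 1) + 1 = 2
print(stamp)                    # (2, 2)
print(sorted([stamp, other]))   # [(2, 1), (2, 2)]: equal TS, ordered by ID
```

Comparing the pairs lexicographically gives the total order: the timestamp decides first, and the processor ID decides only when two timestamps are equal.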

Causality violation and vector timestamps
Lamport’s algorithm does not guarantee that if e.TS < e’.TS then e <H e’. This makes it difficult to detect a causality violation.
A causality violation occurs if a message m is sent to a remote processor p before another message m’ is sent, but p receives m’ before m.
Written as: m <c m’ and r(m’) <p r(m)
We will use a vector of timestamps to overcome this problem.

Global time: vector timestamps

    Initially my_VT = [0,…,0]
    Upon event e:
        if e is the receipt of message m
            for i = 1 to M
                my_VT[i] = max(m.VT[i], my_VT[i])
        my_VT[self]++
        e.VT = my_VT
        if e is the sending of message m
            m.VT = my_VT

Comparing vector timestamps:
e.VT ≤v e’.VT iff e.VT[i] ≤ e’.VT[i] for all 1 ≤ i ≤ M
e.VT <v e’.VT iff e.VT ≤v e’.VT and e.VT ≠ e’.VT
VT algorithm: e1 <H e2 iff e1.VT <v e2.VT
Can be used to detect causality violations.
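The vector clock and its comparison can be sketched in Python as follows. Names are assumptions for illustration, and processors are indexed from 0 here rather than from 1.

```python
class VectorClock:
    """Vector timestamps for processor `self_id` out of m processors (0-based)."""

    def __init__(self, self_id, m):
        self.self_id = self_id
        self.vt = [0] * m                     # my_VT, initially all zeros

    def event(self, received_vt=None):
        """Advance the clock for any event and return the event's VT as a tuple."""
        if received_vt is not None:           # receipt: component-wise maximum
            self.vt = [max(a, b) for a, b in zip(received_vt, self.vt)]
        self.vt[self.self_id] += 1
        return tuple(self.vt)

def vt_leq(a, b):
    """a <=v b: every component of a is at most the matching component of b."""
    return all(x <= y for x, y in zip(a, b))

def vt_less(a, b):
    """a <v b: a <=v b and a != b."""
    return vt_leq(a, b) and a != b

def concurrent(a, b):
    """Neither event happened before the other."""
    return a != b and not vt_less(a, b) and not vt_less(b, a)

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
a = p0.event()          # (1, 0)
m = p0.event()          # (2, 0): send event, m.VT travels with the message
b = p1.event()          # (0, 1)
c = p1.event(m)         # (2, 2): receipt of the message
print(vt_less(a, c), concurrent(a, b))   # True True
```

Unlike Lamport timestamps, the comparison works in both directions: a and b are incomparable under <v precisely because neither happened before the other.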

Question 1
Consider the following interaction between four processors:
[Figure: space-time diagram of processors P1 to P4 over time, with events e1 to e24 and the messages exchanged between them]

Question 1
1. What is the largest Lamport timestamp value? (Hint: you can answer without calculating all timestamps.)
2. List the Lamport timestamp of each event.
3. List the vector timestamp of each event.
4. Is there a potential causality violation? What can indicate this violation?

Question 1
The Lamport timestamp mechanism calculates, for each event, the length of the longest chain of events ending at it. Thus, the largest timestamp value is the number of vertices on the longest path of events in the underlying DAG. In this case the answer is 12.
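The longest-path claim can be checked mechanically with a short Python sketch. The figure is not reproduced in this transcript, so the assignment of events to processors and the (send, receive) pairs below are assumptions, inferred from the vector timestamps listed for this question.

```python
from functools import lru_cache

# ASSUMPTION: events per processor in local order, inferred from which
# vector-timestamp component each event increments.
PROCESSES = {
    "P1": list(range(1, 8)),     # e1..e7
    "P2": list(range(8, 14)),    # e8..e13
    "P3": list(range(14, 20)),   # e14..e19
    "P4": list(range(20, 25)),   # e20..e24
}
# ASSUMPTION: (send, receive) pairs inferred from the vector timestamps.
MESSAGES = [(8, 2), (1, 20), (14, 9), (10, 15), (21, 11), (12, 5),
            (4, 22), (23, 16), (3, 17), (24, 13), (18, 6), (7, 19)]

# Direct predecessors of each event in the happened-before DAG.
preds = {e: [] for events in PROCESSES.values() for e in events}
for events in PROCESSES.values():
    for earlier, later in zip(events, events[1:]):
        preds[later].append(earlier)
for send, recv in MESSAGES:
    preds[recv].append(send)

@lru_cache(maxsize=None)
def lamport_ts(e):
    """Number of vertices on the longest path ending at event e."""
    return 1 + max((lamport_ts(p) for p in preds[e]), default=0)

timestamps = {e: lamport_ts(e) for e in preds}
print(max(timestamps.values()))   # 12
print(timestamps[19])             # 12: the longest chain ends at e19
```

Under these assumptions the recursion yields the same values as running Lamport's algorithm event by event, since each event takes one more than the maximum over its local predecessor and, for a receipt, the matching send.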

Question 1
Lamport timestamp (TS) of each event, grouped by processor:

P1           P2           P3           P4
Event  TS    Event  TS    Event  TS    Event  TS
e1      1    e8      1    e14     1    e20     2
e2      2    e9      2    e15     4    e21     3
e3      3    e10     3    e16     7    e22     5
e4      4    e11     4    e17     8    e23     6
e5      6    e12     5    e18     9    e24     7
e6     10    e13     8    e19    12
e7     11

Question 1
Vector timestamp (VT) of each event, grouped by processor:

P1                  P2                  P3                  P4
Event  VT           Event  VT           Event  VT           Event  VT
e1     (1,0,0,0)    e8     (0,1,0,0)    e14    (0,0,1,0)    e20    (1,0,0,1)
e2     (2,1,0,0)    e9     (0,2,1,0)    e15    (0,3,2,0)    e21    (1,0,0,2)
e3     (3,1,0,0)    e10    (0,3,1,0)    e16    (4,3,3,4)    e22    (4,1,0,3)
e4     (4,1,0,0)    e11    (1,4,1,2)    e17    (4,3,4,4)    e23    (4,1,0,4)
e5     (5,5,1,2)    e12    (1,5,1,2)    e18    (4,3,5,4)    e24    (4,1,0,5)
e6     (6,5,5,4)    e13    (4,6,1,5)    e19    (7,5,6,4)
e7     (7,5,5,4)

Question 1
A possible causality violation exists. Note that the send event e3 (a message from P1 to P3) happens before e23 (a message from P4 to P3), but its message is received afterward. When using VT, e3.VT = (3,1,0,0), but right before the reception of this message (e17) the clock’s VT is e16.VT = (4,3,3,4). Since e3.VT <v e16.VT, P3’s clock already reflects events that causally follow e3. Thus, when P3 receives this message it knows that the message sent at e23 arrived “too soon”.
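One way a receiver might perform this check is sketched below in Python. The Receiver class and its labels are hypothetical; it compares each arriving message's send VT against the send VTs of messages that arrived earlier, which follows the definition m <c m’ and r(m’) <p r(m). The two vectors are the ones quoted above for e23 and e3.

```python
def vt_less(a, b):
    """a <v b: every component of a is <= b's, and the vectors differ."""
    return all(x <= y for x, y in zip(a, b)) and a != b

class Receiver:
    """Records the VT of each received message and reports causality violations."""

    def __init__(self):
        self.received = []      # (label, VT of the send event), in arrival order

    def receive(self, label, send_vt):
        """Return labels of earlier arrivals that were sent causally after this one."""
        overtaken_by = [other for other, vt in self.received
                        if vt_less(send_vt, vt)]
        self.received.append((label, send_vt))
        return overtaken_by

p3 = Receiver()
first = p3.receive("sent at e23", (4, 1, 0, 4))    # arrives at e16
second = p3.receive("sent at e3", (3, 1, 0, 0))    # arrives at e17
print(first)     # []
print(second)    # ['sent at e23']: e3's message was overtaken
```

A non-empty result means the new message was sent before an earlier arrival, which is exactly the situation described for e3 and e23.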

The Ricart-Agrawala algorithm
An algorithm for handling distributed mutual exclusion.
Uses Lamport’s timestamps.
Each process uses only the following set of variables / data structures:
    Timestamp current_time
    Timestamp my_timestamp
    integer reply_count
    boolean is_requesting
    boolean reply_deferred[M]

The Ricart-Agrawala algorithm
The following code is used to enter the critical section:

Request_CS:
    my_timestamp := current_time
    is_requesting := TRUE
    reply_count := M-1
    for every processor j ≠ i
        send(j, REMOTE_REQUEST; my_timestamp)
    wait until reply_count = 0

The Ricart-Agrawala algorithm
A listener thread is used so that the node responds to requests from others:

CS_monitoring:
    Wait until a REMOTE_REQUEST or REPLY message is received

    REMOTE_REQUEST(sender; request_time):
        Let j be the sender of the REMOTE_REQUEST message
        if (not is_requesting or my_timestamp > request_time)
            send(j, REPLY)
        else
            reply_deferred[j] = TRUE

    REPLY:
        reply_count := reply_count - 1

Ties are broken with processor IDs.

The Ricart-Agrawala algorithm
Releasing the CS:

Release_CS_monitoring:
    is_requesting := false
    for j = 1 through M (other than this processor's ID)
        if (reply_deferred[j] = TRUE)
            send(j, REPLY)
            reply_deferred[j] = FALSE
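The three procedures can be exercised together in a small single-threaded Python simulation. This is a sketch under stated assumptions, not the session's implementation: the network is modelled as one reliable FIFO queue, a node leaves the critical section immediately after entering it, processors are numbered from 0, and the `run` driver and the clock update on receipt are additions made for the simulation.

```python
from collections import deque

class Node:
    def __init__(self, pid, m, network):
        self.pid, self.m, self.network = pid, m, network
        self.current_time = 0                  # Lamport clock
        self.my_timestamp = None
        self.is_requesting = False
        self.reply_count = 0
        self.reply_deferred = [False] * m

    def request_cs(self):
        self.current_time += 1
        self.my_timestamp = (self.current_time, self.pid)   # pid breaks ties
        self.is_requesting = True
        self.reply_count = self.m - 1
        for j in range(self.m):
            if j != self.pid:
                self.network.append((self.pid, j, "REMOTE_REQUEST", self.my_timestamp))

    def can_enter(self):
        return self.is_requesting and self.reply_count == 0

    def on_message(self, sender, kind, stamp):
        if kind == "REMOTE_REQUEST":
            self.current_time = max(self.current_time, stamp[0]) + 1
            if not self.is_requesting or self.my_timestamp > stamp:
                self.network.append((self.pid, sender, "REPLY", None))
            else:
                self.reply_deferred[sender] = True
        else:                                  # REPLY
            self.reply_count -= 1

    def release_cs(self):
        self.is_requesting = False
        for j in range(self.m):
            if self.reply_deferred[j]:
                self.network.append((self.pid, j, "REPLY", None))
                self.reply_deferred[j] = False

def run(m, requesters):
    """Deliver messages in FIFO order; return (CS entry order, message count)."""
    network = deque()
    nodes = [Node(i, m, network) for i in range(m)]
    for i in requesters:
        nodes[i].request_cs()
    entered, messages = [], 0
    while network:
        sender, dest, kind, stamp = network.popleft()
        messages += 1
        nodes[dest].on_message(sender, kind, stamp)
        if nodes[dest].can_enter():
            entered.append(dest)               # the critical section would run here
            nodes[dest].release_cs()
    return entered, messages

print(run(3, [0]))          # ([0], 4)
print(run(3, [2, 0, 1]))    # ([0, 1, 2], 12)
```

In the second run all three requests carry timestamp 1, so the processor IDs decide the order, and each entry costs 2(M-1) = 4 messages.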

Question 2
Assume that N processors are handling mutual exclusion with the aid of the Ricart-Agrawala mutual exclusion algorithm.
1. How many messages will be passed in the system whenever a processor wishes to enter the critical section? Are there scenarios where this number is lower/greater?
2. Why is this algorithm deadlock free?
3. What can happen if a single message is lost?

Question 2
1. To enter the CS a processor must request permission from all other processors, i.e. N-1 REMOTE_REQUEST messages are sent. Only after the processor has received a REPLY from all other processors may it enter the CS (note that these replies may be deferred for a while). That is, entering the CS requires a total of 2(N-1) messages.
One means of reducing this network load is to keep several requests deferred for a while. This allows agents to enter the CS more than once without having to send messages to all N-1 nodes in the system [Roucairol & Carvalho].

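Question 2
2. The algorithm relies on the fact that each timestamp is unique (based on Lamport’s time and the processor ID). Thus, a total order over requests can be easily deduced, and CS access is granted in this order. The pending request with the smallest timestamp is never deferred by any processor, so some processor can always proceed.
3. The algorithm assumes that the system is failure free, and its correctness relies heavily on this condition. If a single message is lost, a deadlock can occur: a requester that never receives one of its N-1 replies waits forever, and so does every processor whose request it has deferred.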

Raymond’s algorithm
Solves the mutual exclusion problem via a token (only the token holder may enter the CS).
Communication is based on an underlying tree structure of all nodes.
    The tree is always oriented towards the token holder.
Uses a FIFO queue to prevent starvation.
Good performance (the number of messages per CS entry decreases as the load increases!)
    Uses O(log n) messages only.

Raymond’s algorithm
Each process uses only the following set of variables / data structures:
    Boolean token_holder
    Boolean inCS
    Processor current_dir
    Queue requests_queue

Raymond’s algorithm

Request_CS:
    if not token_holder
        if requests_queue.isEmpty( )
            send(current_dir, REQUEST)
        requests_queue.enqueue(self)
        wait until token_holder is true
    inCS := true

Release_CS:
    inCS := false
    if not requests_queue.isEmpty( )
        current_dir := requests_queue.dequeue( )
        send(current_dir, TOKEN)
        token_holder := false
        if not requests_queue.isEmpty( )
            send(current_dir, REQUEST)

Raymond’s algorithm

Monitor_CS:
    while (true)
        wait for a REQUEST or a TOKEN message

        REQUEST:
            if token_holder
                if inCS
                    requests_queue.enqueue(sender)
                else
                    current_dir := sender
                    send(current_dir, TOKEN)
                    token_holder := false
            else
                if requests_queue.isEmpty()
                    send(current_dir, REQUEST)
                requests_queue.enqueue(sender)

Raymond’s algorithm (cont.)

        TOKEN:
            current_dir := requests_queue.dequeue( )
            if current_dir = self
                token_holder := true
            else
                send(current_dir, TOKEN)
                if not requests_queue.isEmpty( )
                    send(current_dir, REQUEST)
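Request_CS, Release_CS and Monitor_CS can be tried out in a single-threaded Python simulation. This is a sketch under stated assumptions rather than a definitive implementation: the network is one reliable FIFO queue, the blocking wait in Request_CS is replaced by entering the CS when the TOKEN arrives, a node releases the CS immediately after entering it, and the four-node tree (A is the parent of B; B is the parent of C and D) is hypothetical.

```python
from collections import deque

class Node:
    def __init__(self, name, current_dir, network):
        self.name, self.network = name, network
        self.token_holder = current_dir is None   # the root starts with the token
        self.in_cs = False
        self.current_dir = current_dir            # neighbour on the path to the token
        self.requests_queue = deque()

    def send(self, dest, kind):
        self.network.append((self.name, dest, kind))

    def request_cs(self):
        # Assumes the requester does not already hold the token.
        if not self.requests_queue:
            self.send(self.current_dir, "REQUEST")
        self.requests_queue.append(self.name)

    def release_cs(self):
        self.in_cs = False
        if self.requests_queue:
            self.pass_token()

    def pass_token(self):
        """Hand the token to the head of the queue (possibly this node itself)."""
        self.current_dir = self.requests_queue.popleft()
        if self.current_dir == self.name:
            self.token_holder = True
            self.in_cs = True                      # end of the wait in Request_CS
            return
        self.token_holder = False
        self.send(self.current_dir, "TOKEN")
        if self.requests_queue:                    # more waiters: ask for it back
            self.send(self.current_dir, "REQUEST")

    def on_message(self, sender, kind):
        if kind == "TOKEN":
            self.pass_token()
        elif self.token_holder:
            if self.in_cs:
                self.requests_queue.append(sender)
            else:
                self.current_dir = sender
                self.token_holder = False
                self.send(sender, "TOKEN")
        else:
            if not self.requests_queue:
                self.send(self.current_dir, "REQUEST")
            self.requests_queue.append(sender)

def run(parents, requesters):
    network = deque()
    nodes = {n: Node(n, p, network) for n, p in parents.items()}
    for n in requesters:
        nodes[n].request_cs()
    order = []
    while network:
        sender, dest, kind = network.popleft()
        nodes[dest].on_message(sender, kind)
        if nodes[dest].in_cs:
            order.append(dest)                     # the critical section would run here
            nodes[dest].release_cs()
    return order, nodes

order, nodes = run({"A": None, "B": "A", "C": "B", "D": "B"}, ["C", "D"])
print(order)                                       # ['C', 'D']
print({n: v.current_dir for n, v in nodes.items()})
```

After the run, D holds the token and every current_dir pointer leads towards D, which illustrates how the tree stays oriented towards the token holder.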

Question 3, Moed B 2006
Raymond’s algorithm
The following 8-processor network uses Raymond’s algorithm to solve the mutual exclusion problem. In the initial state, the token is held by processor A at the root of the tree (A wants to enter the critical section), and no requests for the CS are recorded.

Question 3, Moed B 2006
[Figure: tree of the eight nodes, listed level by level from the root: A; B, F; C, D, E, G, H. Directed edges correspond to the current_dir variable.]

Question 3, Moed B 2006
To allow for a convenient representation, we define agent steps as the invocation of a procedure or an action.
Use the sketch above to describe the result of applying Raymond’s algorithm if nodes C, D, F and G request the token. Provide a detailed description of all concurrent steps (in which a single step is taken by all relevant nodes) by sketching the system’s state after each one, up until three of the four agents have received the token.
Note: assume that ties are broken based on ID.

Question 3, Moed B 2006
[Figure sequence: the tree after each concurrent step, showing each node's request queue and the REQUEST messages in transit]
