1 Process groups and message ordering
If processes belong to groups, certain algorithms can be used that depend on group properties.
Membership operations: create(name), kill(name), join(name, process), leave(name, process)
Internal structure?
- NO (peer structure) – failure tolerant, but complex protocols
- YES (a single coordinator, which is also a single point of failure) – simpler protocols, e.g. all join requests must go to the coordinator, so concurrent joins are avoided
Closed or open?
- OPEN – a non-member can send a message to the group
- CLOSED – only members can send to the group
Failures? A failed process leaves the group without executing leave.
Robustness: leave, join and failures happen during normal operation, so algorithms must be robust. A sketch of the membership interface follows.
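The membership operations above can be pictured as a small interface. Below is a minimal sketch in Python; the operation names follow the slide, but the class name GroupService and its in-memory dictionary implementation are illustrative assumptions, not part of the original material.

# Minimal sketch of the group-membership operations named on the slide.
# GroupService and the in-memory dictionary are illustrative assumptions;
# a real service would replicate this state across nodes.

class GroupService:
    def __init__(self):
        self.groups = {}  # group name -> set of member process ids

    def create(self, name):
        if name in self.groups:
            raise ValueError(f"group {name} already exists")
        self.groups[name] = set()

    def kill(self, name):
        # Destroy the group regardless of remaining members.
        self.groups.pop(name, None)

    def join(self, name, process):
        # With a single coordinator, all join requests are serialized
        # here, so concurrent joins cannot conflict.
        self.groups[name].add(process)

    def leave(self, name, process):
        # A crashed process "leaves" without ever calling this.
        self.groups[name].discard(process)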

2 Message delivery for a process group – assumptions
ASSUMPTIONS:
- messages are multicast to named process groups
- reliable channels: a given message is delivered reliably to all members of the group
- FIFO from a given source to a given destination
- processes don't crash (failure and restart not considered)
- processes behave as specified, e.g. send the same values to all processes – we are not considering so-called Byzantine behaviour (when malicious or erroneous processes do not behave according to their specifications; see Lamport's Byzantine Generals problem)

3 Ordering message delivery
[Diagram: an application process above a message service, above the OS comms. interface.]
- The message service may reorder delivery to the application (on request for some order) by buffering messages.
- The application may specify a delivery order to the message service, e.g. arrival (no) order, total order, causal order.
- Assume FIFO from each source at this level (done by lower levels).
- Total order = every process receives all messages in the same order (including its own).
We first consider causal order.

4 Message delivery – causal order
[Diagram: application processes P1, P2, P3 against time; P1 sends m, P2 later sends m', but P3 delivers m' before m.]
P1 sent message m provably before P2 sent message m'. The diagram shows a violation of causal delivery order: causal delivery order requires that, at P3, m is delivered before m'. The definition relates to POTENTIAL causality, not application semantics.
DEFINITION of causal delivery order (where < means happened-before):
  send_i(m) < send_j(m')  implies  deliver_k(m) < deliver_k(m')
First, we define causal order in terms of one-to-one messages; later, multicast to a process group.

5 Message delivery – causal order for a process group
[Diagram: P1, P2, P3 against time exchanging m and m'; the application process / message service / OS comms. interface layering is shown again.]
If we know that all processes in a group receive all messages, the message delivery service can implement causal delivery order (for total order, see later). The message service can postpone the delivery of messages to the application process.

6 Message delivery – causal order using Vector Clocks
[Diagram: P1, P2, P3 with vectors (1,0,0) and (2,1,0) at P1; (0,1,0) and (1,2,0) at P2; (1,0,1) and (1,1,2) at P3.]
A vector clock is maintained by the message service at each node for each process.
Vector notation:
- fixed number of processes N
- each process's message service keeps a vector of dimension N
- for each process, each entry records the most up-to-date value of the state counter delivered to the application process, for the process at that position

7 Vector Clocks – message service operation
[Diagram: the same timelines, annotated with successive vector values including (1,0,0), (2,1,0), (0,1,0), (1,2,0), (1,0,1), (1,1,2), (1,3,0), (3,3,0), (1,3,3), (4,3,4), (1,4,4), (1,3,4).]
Message service operation (a code sketch follows):
- before send: increment the local process's state value in the local vector
- on send: timestamp the message with the sending process's local vector
- on receive by the message service: see the next slide
- on deliver to the receiving application process: increment the receiving process's state value in its local vector, and update the other fields of the vector by comparing its values with the incoming vector (timestamp) and recording the higher value in each field, thus updating this process's knowledge of system state
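A minimal sketch of these rules in Python. Only the increment-before-send and take-the-higher-value-on-deliver rules come from the slide; the class and method names are illustrative assumptions.

# Sketch of the vector clock rules described above.
# VectorClock, on_send and on_deliver are assumed names.

class VectorClock:
    def __init__(self, n, my_index):
        self.v = [0] * n   # one entry per process in the group
        self.i = my_index  # this process's own position

    def on_send(self):
        # Before send: increment this process's own entry, then
        # timestamp the outgoing message with a copy of the vector.
        self.v[self.i] += 1
        return list(self.v)

    def on_deliver(self, timestamp):
        # On delivery to the application: increment our own entry, and
        # for every other field record the higher of our value and the
        # incoming timestamp's, updating our knowledge of system state.
        self.v[self.i] += 1
        for k in range(len(self.v)):
            if k != self.i:
                self.v[k] = max(self.v[k], timestamp[k])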

8 Implementing causal order using Vector Clocks
[Diagram: P1's vector advances (1,0,0) then (2,2,0); P2's advances (1,1,0) then (1,2,0); P3 is still at (0,0,0).]
P3's vector is at (0,0,0) and a message with timestamp (1,2,0) arrives from P2, i.e. P2 has received a message from P1 that P3 hasn't seen. More detail of P3's message service:

receiver vector  sender  sender vector  decision  new receiver vector
0,0,0            P2      1,2,0          buffer    0,0,0  (P3 is missing a message from P1 that sender P2 has already received)
0,0,0            P1      1,0,0          deliver   1,0,1
1,0,1            P2      1,2,0          deliver   1,2,2

In each case: do the sender and receiver agree on the state of all other processes? If the sender has a higher state value for any of these others, the receiver is missing a message, so buffer the message. A code sketch of this decision follows.
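A minimal sketch of the deliver-or-buffer decision, continuing the assumed VectorClock sketch above. The deliverable test is the rule stated on the slide; the pending-queue retry loop and all names are illustrative assumptions.

# Sketch of P3's deliver-or-buffer decision.
# FIFO from each source is assumed to be handled by lower levels,
# so only the entries for third-party processes are compared.

def deliverable(receiver_vector, sender_index, timestamp):
    # Buffer if the sender has a higher state value for any process
    # other than itself: the receiver is then missing a message.
    return all(ts <= rv
               for k, (ts, rv) in enumerate(zip(timestamp, receiver_vector))
               if k != sender_index)

def on_receive(clock, pending, sender_index, timestamp, message):
    # Hold the message until the causal condition is met, then deliver
    # it and retry any buffered messages that may now be deliverable.
    pending.append((sender_index, timestamp, message))
    progress = True
    while progress:
        progress = False
        for item in list(pending):
            s, ts, m = item
            if deliverable(clock.v, s, ts):
                pending.remove(item)
                clock.on_deliver(ts)  # vector clock deliver rule above
                print("deliver", m)   # hand over to the application
                progress = True

Replaying the slide's scenario: with clock.v at (0,0,0), P2's (1,2,0) is buffered; P1's (1,0,0) is delivered, advancing the vector to (1,0,1); the retry then delivers P2's message, leaving (1,2,2), matching the table above.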

9 Vector Clocks – example
[Diagram: four processes P1–P4; P3 multicasts with timestamp (1,0,2,0); the vectors (1,0,0,0), (2,0,2,0), (1,0,1,0), (1,0,2,2) and (1,0,0,1) appear on the timelines.]

sender  sender vector  receiver  receiver vector  decision  new receiver vector
P3      1,0,2,0        P1        1,0,0,0          deliver   P1 -> 2,0,2,0  (same states for P2 and P4)
P3      1,0,2,0        P4        1,0,0,1          deliver   P4 -> 1,0,2,2  (same states for P1 and P2)
P3      1,0,2,0        P2        0,0,0,0          buffer    P2 -> 0,0,0,0  (same state for P4, different for P1 – missing message)
P1      1,0,0,0        P2        0,0,0,0          deliver   P2 -> 1,1,0,0

Reconsider the buffered message:
P3      1,0,2,0        P2        1,1,0,0          deliver   P2 -> 1,2,2,0  (same states for P1 and P4)

10 Total order is not enforced by the vector clocks algorithm
[Diagram: P1–P4 exchanging multicasts m1, m2 and m3, annotated with vector values including (1,0,0,0), (1,1,0,0), (1,2,0,0), (2,2,0,0), (3,2,2,0), (1,0,1,0), (1,0,2,0), (1,3,2,0), (1,2,3,0), (1,0,2,2), (1,2,2,3), (1,0,0,1).]
m2 and m3 are not causally related.
- P1 receives m1, m2, m3
- P2 receives m1, m2, m3
- P3 receives m1, m3, m2
- P4 receives m1, m3, m2
If the application requires total order, this could be enforced by modifying the vector clock algorithm to include ACKs and delivery to self.

11 Totally ordered multicast
The vectors can be a large overhead on message transmission, and a simpler algorithm can be used if only total order is required.
Recall the ASSUMPTIONS:
- messages are multicast to named process groups
- reliable channels: a given message is delivered reliably to all members of the group
- FIFO from a given source to a given destination
- processes don't crash (failure and restart not considered)
- no Byzantine behaviour
Total order algorithm (sketched in code below):
- the sender multicasts to all, including itself
- all acknowledge receipt, as a multicast message
- a message is delivered in timestamp order after all ACKs for it have been received
If the delivery system must support both, so that applications can choose, vector clocks can achieve both causal and total ordering.
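A minimal sketch of this algorithm's delivery queue, using Lamport timestamps and the tie-breaker described on the next slide. The slide specifies only the three steps above; TotalOrderQueue, its methods and the ACK counting are illustrative assumptions.

# Sketch of totally ordered multicast delivery.
# Every process queues each multicast (including its own), ACKs it to
# the whole group, and delivers the head of the queue once it has been
# ACKed by all N members.

N = 3  # assumed group size

class TotalOrderQueue:
    def __init__(self):
        # (lamport_timestamp, sender_id) -> [message, acks_so_far]
        self.queue = {}

    def add(self, ts, sender, message):
        self.queue[(ts, sender)] = [message, 0]

    def ack(self, ts, sender):
        self.queue[(ts, sender)][1] += 1

    def pop_deliverable(self):
        # Deliver in (timestamp, sender id) order; the sender id acts
        # as the tie-breaker for equal timestamps (lowest id wins).
        delivered = []
        while self.queue:
            head = min(self.queue)        # smallest (ts, sender) pair
            message, acks = self.queue[head]
            if acks < N:
                break                     # head not yet fully ACKed
            del self.queue[head]
            delivered.append(message)
        return delivered

With this ordering, the contention on the next slide (P2 and P3 both sending at timestamp 3) resolves to keys (3, 2) and (3, 3), so P2's message is delivered first.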

12 Totally ordered multicast – outline of approach
[Diagram: P1, P2, P3 multicasting messages and the corresponding ACKs.]
- P1 increments its clock to 1 and multicasts a message with timestamp 1. All delivery systems collect the message, multicast an ACK and collect all ACKs. No contention, so they deliver the message to the application processes and increment their local clocks to 2.
- P2 and P3 both multicast messages with timestamp 3. All delivery systems collect the messages, multicast ACKs and collect ACKs. Contention, so a tie-breaker is used (e.g. lowest process ID wins) and P2's message is delivered before P3's.
This is just a sketch of an approach. In practice, timeouts would have to be used to take account of long delays due to congestion and/or failure of components and/or communication links.