EEC 688/788 Secure and Dependable Computing Lecture 10 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University



12/7/2015 EEC688/788: Secure & Dependable Computing Wenbing Zhao

Outline
Upcoming schedule
  - This Wed: Spread toolkit lab
  - Next Monday: discussion #2
  - Next Wed: guest seminar, at Foxes Den
  - Nov. 4, Monday: Midterm #2
Group communication systems
  - Reliable, ordered multicast
  - Types of total ordering
  - GCS services
  - How to implement GCS

Group Communication System
Services provided by the GCS
  - Membership service: who is up and who is down
      - Deals with failure detection and more
  - Reliable, ordered multicast service
      - FIFO, causal, total
  - Virtual synchrony service
      - Virtual synchrony synchronizes membership changes with multicasts
GCS makes the implementation of state machine replication much easier

Reliable Multicast
Reliable multicast: the message is targeted to multiple receivers, and all receivers receive the message reliably
  - Positive or negative acknowledgement
  - Need to avoid ack/nack implosion
Distinguish receiving from delivery!
  - Receiving: the message arrives at the middleware
  - Delivering: the middleware hands the message to the application

Ordered Reliable Multicast
Ordered reliable multicast: if many messages are multicast by many senders, in what order are the messages delivered at the receivers?
  - First in first out (FIFO)
  - Causal: the causal relationship among msgs is preserved
  - Total: all msgs are delivered at all receivers in the same order
Event ordering (Section 4.3.2)
  - Happens-before relationship: given two events E and E' at processes i and j, respectively, we say E happens before E', denoted as E -> E', provided that
      - i=j and E occurred ahead of E', or
      - i≠j and E is the sending of a message and E' is the receiving of that msg
  - The relationship is transitive: if E -> E'' and E'' -> E', then E -> E'
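The happens-before relationship can be computed mechanically for a small event trace. The following is a minimal sketch, not taken from the lecture; the function name `happens_before` and the event labels are assumptions made for illustration.

```python
from itertools import product


def happens_before(process_events, messages):
    """Return the set of (e1, e2) pairs such that e1 -> e2.

    process_events: dict mapping a process id to its events in local order.
    messages: list of (send_event, receive_event) pairs.
    Both argument shapes are illustrative assumptions.
    """
    hb = set()
    # Same process: an earlier event happens before a later one.
    for events in process_events.values():
        for a in range(len(events)):
            for b in range(a + 1, len(events)):
                hb.add((events[a], events[b]))
    # Different processes: sending a msg happens before receiving it.
    hb.update(messages)
    # Transitive closure (the first loop variable k is the intermediate event).
    all_events = [e for evs in process_events.values() for e in evs]
    for k, i, j in product(all_events, repeat=3):
        if (i, k) in hb and (k, j) in hb:
            hb.add((i, j))
    return hb


trace = {"p": ["p1", "p2"], "q": ["q1", "q2"], "r": ["r1"]}
msgs = [("p1", "q1"), ("q2", "r1")]
hb = happens_before(trace, msgs)
print(("p1", "r1") in hb)  # p1 -> q1 -> q2 -> r1
```

Two events are concurrent when neither ordered pair is in the returned set, as is the case for `p2` and `r1` in this trace.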


FIFO Ordered Multicast
FIFO or sender-ordered multicast: messages are delivered in the order they were sent (by any single sender)
[Figure: timelines of processes p, q, r, s exchanging multicasts a, b, c, d, e; delivery of c to p is delayed until after b is delivered]
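One common way to obtain FIFO delivery is a per-sender hold-back queue: a received message is held until every earlier message from the same sender has been delivered. The sketch below assumes each sender numbers its multicasts 0, 1, 2, ...; the class name `FifoReceiver` and its fields are hypothetical.

```python
from collections import defaultdict


class FifoReceiver:
    """Hold-back queue that delivers each sender's msgs in send order."""

    def __init__(self):
        self.expected = defaultdict(int)  # sender -> next seq# to deliver
        self.held = defaultdict(dict)     # sender -> {seq#: payload}
        self.delivered = []               # payloads handed to the application

    def receive(self, sender, seq, payload):
        if seq < self.expected[sender]:
            return  # duplicate of an already delivered msg
        self.held[sender][seq] = payload
        # Deliver as many consecutive msgs from this sender as possible.
        while self.expected[sender] in self.held[sender]:
            n = self.expected[sender]
            self.delivered.append(self.held[sender].pop(n))
            self.expected[sender] = n + 1


p = FifoReceiver()
p.receive("q", 1, "c")  # arrives early, so it is held back
p.receive("q", 0, "b")  # now b and then c can be delivered
print(p.delivered)
```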

Causally Ordered Multicast
Causal or happens-before ordering: if send(a) -> send(b), then deliver(a) occurs before deliver(b) at common destinations
[Figure: timelines of processes p, q, r, s exchanging multicasts a, b, c, d, e]
  - Delivery of c to p is delayed until after b is delivered
  - e is sent (causally) after b
  - Delivery of e to r is delayed until after b and c are delivered
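Causal delivery is often implemented with vector timestamps: a message from sender j is delivered only when it is the next message from j and the receiver has already delivered everything the sender had delivered at send time. This is a minimal sketch of that well-known rule, not the lecture's own code; `CausalReceiver` and its fields are hypothetical names, and the receiver here only receives.

```python
class CausalReceiver:
    """Delays delivery until a msg's causal predecessors are delivered."""

    def __init__(self, n_processes):
        self.clock = [0] * n_processes  # msgs delivered so far, per sender
        self.pending = []               # held-back (sender, timestamp, payload)
        self.delivered = []

    def _deliverable(self, sender, ts):
        # Next msg from the sender, and nothing it depends on is missing.
        return ts[sender] == self.clock[sender] + 1 and all(
            ts[k] <= self.clock[k] for k in range(len(ts)) if k != sender
        )

    def receive(self, sender, ts, payload):
        self.pending.append((sender, ts, payload))
        progress = True
        while progress:  # one delivery may unblock other held msgs
            progress = False
            for msg in list(self.pending):
                s, t, p = msg
                if self._deliverable(s, t):
                    self.pending.remove(msg)
                    self.clock[s] = t[s]
                    self.delivered.append(p)
                    progress = True


r = CausalReceiver(3)
r.receive(1, [1, 1, 0], "c")  # c was sent after its sender delivered b: held
r.receive(0, [1, 0, 0], "b")  # b arrives, so b and then c are delivered
print(r.delivered)
```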

Totally Ordered Multicast
Total ordering: messages are delivered in the same order to all recipients (including the sender)
[Figure: timelines of processes p, q, r, s; all deliver a, b, c, d, then e]

Two Types of Total Ordering
Uniform total ordering
  - Given any msg that is broadcast, if it is delivered by a node according to some total order, it is delivered at every other node in the same total order unless that node has failed
Nonuniform total ordering
  - Given a set of messages that have been broadcast and totally ordered, no node delivers any of them out of the total order
  - However, there is no guarantee that if a node delivers a message, then all other nodes deliver the same message

Example

Implementing Total Ordering
Use a sequencer to order all multicasts
  - The sequencer determines the order for each message
  - Each sender can take turns serving as the sequencer (rotating sequencer)
Use a token that moves around
  - The token has a sequence number
  - The sender determines the total order: when you hold the token, you can send the next burst of multicasts
Use vector clocks
  - Each process maintains a vector clock
  - Each msg is piggybacked with a vector timestamp

Sequencer Based GCS
First practical solution for GCS
A system is structured into a combination of two subsystems
  - Multiple senders with a single receiver
  - A single sender with multiple receivers
  - The single receiver and single sender are collocated at the same node => all msgs are funneled through this node, i.e., the sequencer

Sequencer Based GCS
The sequencer is responsible for assigning a global sequence number to each message funneled through it
Each node delivers a msg once it has received and delivered all msgs with smaller sequence numbers
Sequencer: a single point of failure
Rotating sequencer: overcoming the single point of failure
  - Assume up to f nodes could fail; total number of nodes N > 2f
  - Each node takes turns acting as the sequencer (e.g., one msg at a time)
  - A node does not deliver a msg until it receives f+1 sequencing msgs (the msg's own sequencing msg plus f subsequent ones)
  - Achieves fault tolerance as well as uniform total ordering

Rotating Sequencer: Data Structure
View number v, list of node ids in the current view
  - Each node has a rank: it knows when it should take over as the next sequencer
A local sequence number vector M[], each element representing the expected local seq# for the corresponding node: for reliable delivery
  - M[i] refers to the expected local seq# carried by the next msg sent by node i
  - Init each element to 0
Expected global seq# s carried in the next sequencing msg sent by the sequencer node: for total ordering

Rotating Sequencer: Normal Operation
Transmitting phase
  - A node i broadcasts a msg, B(v,i,n), to all nodes
      - n: local seq#, initially 0, incremented for each msg broadcast => reliable broadcast
      - Node i then waits for a sequencing msg for the broadcast msg
  - A node j accepts a msg B(v,i,n) if it is in the same view, and buffers it
Sequencing phase
Committing phase

Rotating Sequencer: Normal Operation
Sequencing phase
  - When the sequencer receives a broadcast msg B(v,i,n)
      - It verifies that it is the next expected msg from node i, i.e., M[i] = n
      - Assigns the current global seq# s to B(v,i,n)
      - Broadcasts a sequencing msg: SEQ(s,v,[i,n])
  - When a node j receives SEQ(s,v,[i,n]), it accepts it provided
      - s is the expected global seq#
      - It has B(v,i,n) in its buffer; otherwise, it requests retransmission
  - On acceptance, node j updates its data structures:
      - Increments the expected global seq#
      - Increments the expected local seq# M[i]
  - SEQ(s,v,[i,n]) also serves as a positive ack for broadcast msg B(v,i,n)
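The transmitting and sequencing steps can be sketched as a small per-node state object. This is an illustrative sketch only: the class `Node`, its method names, and the string return values are assumptions, while `view`, `expected_local` and `expected_global` stand for v, M[] and s from the slides. Networking, retransmission and sequencer rotation are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    view: int                                           # v
    n_nodes: int
    expected_global: int = 0                            # s
    expected_local: list = field(default_factory=list)  # M[]
    buffer: dict = field(default_factory=dict)          # (i, n) -> payload
    ordered: list = field(default_factory=list)         # accepted (s, i, n)

    def __post_init__(self):
        if not self.expected_local:
            self.expected_local = [0] * self.n_nodes    # init each M[i] to 0

    def on_broadcast(self, v, i, n, payload):
        """Accept B(v,i,n) if it is in the same view, and buffer it."""
        if v == self.view:
            self.buffer[(i, n)] = payload

    def make_seq(self, i, n):
        """Sequencer side: build SEQ(s,v,[i,n]) if B is the next msg from i."""
        if (i, n) not in self.buffer or self.expected_local[i] != n:
            return None
        return (self.expected_global, self.view, i, n)

    def on_seq(self, s, v, i, n):
        """Receiver side: accept SEQ(s,v,[i,n]) and update s and M[i]."""
        if v != self.view or s != self.expected_global:
            return "reject"
        if (i, n) not in self.buffer:
            return "retransmit"  # B(v,i,n) is missing, ask for it again
        self.ordered.append((s, i, n))
        self.expected_global += 1
        self.expected_local[i] += 1
        return "ok"


sequencer, other = Node(view=0, n_nodes=3), Node(view=0, n_nodes=3)
for node in (sequencer, other):
    node.on_broadcast(0, 1, 0, "hello")  # node 1 broadcasts B(0,1,0)
seq_msg = sequencer.make_seq(1, 0)       # SEQ(0,0,[1,0])
print(sequencer.on_seq(*seq_msg), other.on_seq(*seq_msg))
```

In this sketch the sequencer applies its own SEQ msg through `on_seq`, so every node updates its state along the same code path.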

Rotating Sequencer: Normal Operation
Committing phase
  - A node does not deliver a broadcast msg B(v,i,n) until it receives SEQ(s,v,[i,n]) and f subsequent SEQ msgs
  - Ensuring uniform total ordering
      - Even if f nodes fail, at least one node would have received both B(v,i,n) and SEQ(s,v,[i,n])
      - This node ensures that the message is delivered at other nodes in the same total order
How to transfer the sequencer role
  - The transfer of the sequencer role can be achieved implicitly by the sending of a new sequencing message
  - The next node i assumes the sequencer role when it receives a SEQ(s) msg and the following conditions are met
      - (s+1)%N = i
      - It has received all previous SEQ msgs and B msgs
  - What if no one broadcasts B msgs? The sequencer sends null SEQ msgs
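The commit rule and the role-transfer condition above reduce to two small checks. The sketch below is an assumption-laden illustration: the function names are hypothetical, global seq#s are assumed to start at 0, and a seq# counts as received only when both its SEQ msg and its B msg have been accepted.

```python
def committable(received_seqs, f):
    """Return the global seq#s that may be delivered, in total order.

    A msg with seq# s commits once SEQ msgs s+1 .. s+f have also been
    received and every msg before s has committed.
    """
    committed = []
    s = 0
    while all((s + k) in received_seqs for k in range(f + 1)):
        committed.append(s)
        s += 1
    return committed


def is_next_sequencer(rank, s, n_nodes, has_all_previous):
    """Node `rank` takes over after SEQ(s) if (s+1) % N equals its rank."""
    return (s + 1) % n_nodes == rank and has_all_previous


# With f=1, the newest sequenced msg is still waiting for one more SEQ msg.
print(committable({0, 1, 2, 3}, f=1))
print(is_next_sequencer(rank=2, s=6, n_nodes=5, has_all_previous=True))
```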

Normal Operation: Example
N=5, f=1
[Figure: example run with N=5, f=1]
  - Can a node deliver B as soon as it receives the corresponding SEQ msg?
  - When will B(v,4,20) be delivered?

Rotating Sequencer: Membership Change
A membership change is triggered by
  - The detection of a failure: a node fails to receive the corresponding SEQ msg for its B msg => the sequencer has failed
  - The recovery of a failed node: when a node recovers from a failure, it tries to rejoin the membership
Objective of the membership change protocol
  - Only one valid membership view can be formed by the system
  - If a B msg is committed at some nodes in a view, then all nodes in the new view must commit B in the same total order

Rotating Sequencer: Membership Change
Operates in three phases
Phase I:
  - The node that detected a failure (the originator) sets the new view# = v+1 and broadcasts an invitation msg
      - The invitation msg carries the new view#
  - A node accepts the invitation and acks it provided that
      - It has not accepted an invitation for a competing view
      - Note: a node joins at most one membership view at a time
  - The ack carries the node's current view# and the next expected global seq#

Rotating Sequencer: Membership Change, Phase II
The originator keeps collecting acks until
  - Either it has received an ack from every node in the new membership, or
  - It has collected at least N-f acks and a timeout has occurred (for liveness)
If all acks are positive, the originator proceeds to build a node list for the new view and broadcasts it
The originator also learns the msg ordering history of the previous view
  - Highest global seq#: s_max
  - Originator's expected global seq#: s_0
If s_max > s_0, the originator is missing msgs
  - s_max ≥ the global seq# of the last msg committed in the previous view
  - Request retransmission
  - Use s_max as the starting global seq# for the new view provided that the originator can receive all missing msgs; otherwise, use the largest s for which the corresponding B msg has been received
If negative responses are received, abort and retry
  - A node aborts when (1) it receives an abort msg from the originator, or (2) its membership change times out
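The originator's stopping rule for collecting acks, and the comparison of s_max with s_0, can be sketched as follows. The function names and argument shapes are hypothetical. The sketch also assumes that each ack reports the sender's next expected global seq#, so the missing msgs are taken to be s_0 through s_max - 1; the fallback choice of starting seq# when retransmission fails is not modeled.

```python
def enough_acks(acked, invited, n, f, timed_out):
    """Stop collecting when every invited node has acked, or when at least
    N-f acks have arrived and the timeout has fired (for liveness)."""
    if invited <= acked:  # set inclusion: all invited nodes answered
        return True
    return timed_out and len(acked) >= n - f


def missing_seqs(reported_next_seqs, s0):
    """Return (s_max, list of global seq#s the originator must request)."""
    s_max = max(reported_next_seqs, default=s0)
    return s_max, list(range(s0, s_max))


invited = {1, 2, 3, 4, 5}
print(enough_acks({1, 2, 3, 4}, invited, n=5, f=1, timed_out=True))
print(missing_seqs([7, 9, 8], s0=7))
```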

Rotating Sequencer: Membership Change, Phase III
The originator collects responses to its new membership view msg
If it receives positive responses from every node in the new view, it commits to the new view
Otherwise, it aborts, waits for a random amount of time, and retries with a larger view number

Rotating Sequencer: Membership Change Examples
  - Competing originators [figure]
  - Premature timeout [figure]
  - Network partitioning [figure]

Token Based GCS: Totem
Totem consists of:
  - Total ordering protocol
  - Membership protocol
  - Recovery protocol
  - Flow control mechanisms
Total ordering msg delivery types
  - Safe delivery: a message is delivered only when all correct processes have received it => uniform total ordering
  - Agreed delivery: a message is delivered as long as it is the next message in total order => nonuniform total ordering
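The difference between the two delivery types can be expressed as two predicates. This is a rough sketch rather than Totem's actual code: the function names are hypothetical, and how a process learns what the other processes have received is simply passed in as a dictionary.

```python
def agreed_deliverable(seq, delivered_up_to):
    """Agreed delivery: the msg is the next one in the total order."""
    return seq == delivered_up_to + 1


def safe_deliverable(seq, delivered_up_to, received_up_to):
    """Safe delivery: next in total order AND received by every process.

    received_up_to: process id -> highest seq# such that the process has
    received all msgs up to and including it (an assumed input).
    """
    return agreed_deliverable(seq, delivered_up_to) and all(
        r >= seq for r in received_up_to.values()
    )


# Msg 5 is next in order, but process q has only received up to msg 4.
print(agreed_deliverable(5, 4))
print(safe_deliverable(5, 4, {"p": 5, "q": 4}))
```

Safe delivery waits longer than agreed delivery, which is the price paid for uniform total ordering.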