
LEADER ELECTION CS 271

Election Algorithms
- Many distributed algorithms need one process to act as coordinator
  – It doesn't matter which process does the job; we just need to pick one
- Election algorithms: techniques to pick a unique coordinator (aka leader election)
- Types of election algorithms: Bully and Ring algorithms

Bully Algorithm
- Each process has a unique numerical ID
- Processes know the IDs and addresses of all other processes
- Communication is assumed reliable
- Key idea: select the process with the highest ID
- A process initiates an election if it has just recovered from failure or if the coordinator has failed
- 3 message types: Election, OK, I won
- Processes can initiate elections simultaneously
  – Need a consistent result

Bully Algorithm Details
- Any process P can initiate an election
- P sends Election messages to all processes with higher IDs and awaits OK messages
- If no OK messages arrive, P becomes coordinator and sends I won to all processes with lower IDs
- If P receives an OK, it drops out and waits for I won
- If a process receives an Election message, it returns OK and starts an election
- If a process receives I won, then the sender is the coordinator
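
The rules above can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions: message passing is replaced by appending to a log, a crashed process simply never answers, and each live process holds at most one election. The names `bully_election`, `all_ids`, and `alive` are hypothetical and do not come from the slides.

```python
def bully_election(initiator, all_ids, alive):
    """Simulate one bully election.

    all_ids: every process ID in the system (live or crashed).
    alive:   the subset of IDs that currently respond.
    Returns (coordinator, message_log).
    """
    alive = set(alive)
    log = []                # (kind, sender, receiver) tuples
    pending = [initiator]   # processes that still have to hold an election
    started = set()         # assumption: each process holds one election
    coordinator = None

    while pending:
        p = pending.pop(0)
        if p in started:
            continue
        started.add(p)

        got_ok = False
        for q in sorted(all_ids):
            if q <= p:
                continue
            log.append(("election", p, q))
            if q in alive:              # crashed processes stay silent
                log.append(("ok", q, p))
                got_ok = True
                pending.append(q)       # q now starts its own election

        if not got_ok:                  # nobody higher answered: p wins
            coordinator = p
            for q in sorted(all_ids):
                if q < p:
                    log.append(("i_won", p, q))

    return coordinator, log


# Assumed scenario: IDs 0..7, process 7 (the old coordinator) has crashed,
# and process 4 notices and starts the election.
coordinator, log = bully_election(4, range(8), {0, 1, 2, 3, 4, 5, 6})
print(coordinator)  # 6
```

In this run process 4 is told to stop by 5 and 6, process 5 is told to stop by 6, and 6 hears nothing from 7 and announces itself.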

Bully Algorithm Example
a) Process 4 holds an election
b) Processes 5 and 6 respond, telling 4 to stop
c) Now 5 and 6 each hold an election

Bully Algorithm Example
d) Process 6 tells 5 to stop
e) Process 6 wins and tells everyone

Simple Ring-based Election
- Processes have unique IDs and are arranged in a logical ring
- Each process knows its neighbors
- Select the process with the highest ID as leader
- Begin an election if just recovered or if the coordinator has failed
- Send Election to the closest downstream node that is alive
  – Sequentially poll each successor until a live node is found
- Each process tags its ID on the message
- Initiator picks the node with the highest ID and sends a coordinator message
- Multiple elections can be in progress; no harm
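
A minimal sketch of one ring election, assuming the ring is given as a Python list in downstream order and that crashed nodes are simply skipped. The function name `ring_election` and the way messages are counted (one per hop between live nodes, with the coordinator message travelling once around the live ring) are illustrative assumptions, not part of the slides.

```python
def ring_election(ring, alive, initiator):
    """Run one election on a logical ring.

    ring:      list of process IDs in downstream order.
    alive:     set of IDs that currently respond.
    initiator: live process that starts the election.
    Returns (leader, total_messages).
    """
    if initiator not in alive:
        raise ValueError("initiator must be alive")

    start = ring.index(initiator)
    downstream = ring[start + 1:] + ring[:start]   # one full lap

    ids = [initiator]        # IDs tagged onto the Election message
    election_msgs = 0
    for node in downstream:
        if node in alive:    # dead successors are polled and skipped
            election_msgs += 1
            ids.append(node)
    election_msgs += 1       # last live node hands the message back

    leader = max(ids)                 # initiator picks the highest ID
    coordinator_msgs = len(ids)       # announcement goes around once
    return leader, election_msgs + coordinator_msgs


# Assumed scenario: 8 processes, old coordinator 7 has crashed, 2 initiates.
leader, total = ring_election(list(range(8)), {0, 1, 2, 3, 4, 5, 6}, 2)
print(leader, total)  # 6 14
```

With n = 8 processes the total is 14 messages, which agrees with the 2(n-1) figure when the crashed coordinator is counted among the n.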

Ring Algorithm Example

Comparison
- Assume n processes and one election in progress
- Bully algorithm
  – Worst case: initiator is the node with the lowest ID; this triggers n-2 elections at higher-ranked nodes: O(n^2) messages
  – Best case: immediate election: n-2 messages
- Ring
  – 2(n-1) messages always
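
To make the quadratic growth concrete, the counts can be tallied directly. The sketch below is one way of counting, under assumptions that are not spelled out on the slide: the crashed coordinator has the highest ID, every live process holds exactly one election, and Election, OK, and I won messages are all counted. The function names are hypothetical.

```python
def bully_worst_case_messages(n):
    """Messages when the lowest of n processes starts the election.

    Live processes are ranked 1..n-1; rank n is the crashed coordinator.
    """
    # The process with rank i sends Election to the n - i higher ranks.
    election = sum(n - i for i in range(1, n))
    # Each live higher rank answers OK (the crashed one does not).
    ok = sum(n - 1 - i for i in range(1, n))
    # The winner (rank n-1) sends I won to the n - 2 live lower ranks.
    i_won = n - 2
    return election + ok + i_won


def ring_messages(n):
    """Message count for the ring algorithm as stated on the slide."""
    return 2 * (n - 1)


for n in (4, 8, 16, 32):
    print(n, bully_worst_case_messages(n), ring_messages(n))
```

Under these assumptions the bully total is n(n-1)/2 + (n-1)(n-2)/2 + (n-2) = n^2 - n - 1, so doubling n roughly quadruples the bully count while the ring count only doubles.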

Highlights of Leader Election
- Basic idea: each process has a unique process ID.
- Once the leader is found to have died, elect the process with the highest (or lowest) process ID.

BROADCAST PROTOCOLS

Broadcast Protocols
- Why broadcast protocols?
  – Data replication
  – Highly available servers
  – Cluster management
  – Distributed logging
  – …
- Sometimes a message is received but delivered later, to satisfy some ordering requirement.

Ordering properties: FIFO (Cornell)
- FIFO or sender-ordered multicast: fbcast
- Messages are delivered in the order they were sent (by any single sender)
[Figure: timelines of processes p, q, r, s with messages a and e]

Ordering properties: FIFO
[Figure: timelines of processes p, q, r, s with messages a, b, c, d, e]
- Delivery of c to p is delayed until after b is delivered
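
One common way to get FIFO delivery is a per-sender sequence number plus a hold-back queue at the receiver. The sketch below assumes that scheme; the slides do not specify a mechanism, and the class name `FifoReceiver` and its fields are hypothetical.

```python
class FifoReceiver:
    """Delivers each sender's messages in the order they were sent."""

    def __init__(self):
        self.next_seq = {}    # sender -> next sequence number to deliver
        self.held = {}        # sender -> {seq: payload} waiting for a gap
        self.delivered = []   # payloads in delivery order

    def receive(self, sender, seq, payload):
        expected = self.next_seq.get(sender, 0)
        if seq < expected:
            return            # duplicate of something already delivered
        queue = self.held.setdefault(sender, {})
        queue[seq] = payload
        # Deliver as long as the next expected message is available.
        while expected in queue:
            self.delivered.append(queue.pop(expected))
            expected += 1
        self.next_seq[sender] = expected


# c (seq 1) arrives at p before b (seq 0), both from the same sender.
p = FifoReceiver()
p.receive("q", 1, "c")
print(p.delivered)  # []
p.receive("q", 0, "b")
print(p.delivered)  # ['b', 'c']
```

Messages from different senders are tracked independently, so FIFO order alone says nothing about how they interleave.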

Limitations of FIFO Broadcast
Scenario:
- User A broadcasts a message to a mailing list
- B delivers that message
- B broadcasts a reply
- C delivers B's response without A's original message and misinterprets it

Ordering properties: Causal
- Causal or happens-before ordering: cbcast
- If send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations
[Figure: timelines of processes p, q, r, s with messages a and b]

Ordering properties: Causal
[Figure: timelines of processes p, q, r, s with messages a, b, c]
- Delivery of c to p is delayed until after b is delivered

Ordering properties: Causal
[Figure: timelines of processes p, q, r, s with messages a, b, c, e]
- Delivery of c to p is delayed until after b is delivered
- e is sent (causally) after b

Ordering properties: Causal
[Figure: timelines of processes p, q, r, s with messages a, b, c, d, e]
- Delivery of c to p is delayed until after b is delivered
- Delivery of e to r is delayed until after b and c are delivered

Limitation of Causal Broadcast
- Causal broadcast does not impose any order on unrelated messages.
- Two replicas can deliver operations/requests in different orders.

Ordering properties: Total
- Total or locally total multicast: atomic bcast
- Messages are delivered in the same order to all recipients (including the sender)
[Figure: timelines of processes p, q, r, s with messages a, b, c, d, e]
- All deliver a, b, c, d, then e

Simple Causal broadcast protocol
- Each broadcast message carries all causally preceding messages
- Before delivery, ensure causality by delivering any missed causally preceding messages.

Isis Causal Broadcast
- Each process maintains a time vector VT of size n. Initially VT[i] = 0.
- When p sends a new message m: VT[p]++
- Each message is piggybacked with VT_m, which is the current VT of the sender.
- When p delivers a message, p updates its vector: for k in 1..n:
  – VT_p[k] = max{ VT_p[k], VT_m[k] }

Isis Causal Order
- Requirement for delivery at node j:
  – VT_sender[sender] = VT_receiver[sender] + 1
    This is the next message from sender
  – VT_sender[k] ≤ VT_receiver[k] for all k not sender
    Receiver has received all causally preceding messages
[Figure: sender and receiver with their vectors VT_sender and VT_receiver]
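
The vector-timestamp rules and the delivery condition can be combined into a short sketch. It assumes processes are numbered 0..n-1 (the slides use 1..n), that a process does not receive its own broadcasts, and that undeliverable messages wait in a pending list. The class name `CausalProcess` and its methods are hypothetical.

```python
class CausalProcess:
    """One process using vector timestamps for causal delivery."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vt = [0] * n     # initially VT[i] = 0 for all i
        self.pending = []     # received but not yet deliverable
        self.delivered = []

    def send(self, payload):
        self.vt[self.pid] += 1                    # VT[p]++
        return (self.pid, list(self.vt), payload)  # piggyback VT_m

    def _deliverable(self, sender, vt_m):
        # Next message from sender, and nothing it depends on is missing.
        if vt_m[sender] != self.vt[sender] + 1:
            return False
        return all(vt_m[k] <= self.vt[k]
                   for k in range(len(self.vt)) if k != sender)

    def receive(self, msg):
        self.pending.append(msg)
        progress = True
        while progress:       # a delivery may unblock held messages
            progress = False
            for m in list(self.pending):
                sender, vt_m, payload = m
                if self._deliverable(sender, vt_m):
                    self.pending.remove(m)
                    self.delivered.append(payload)
                    self.vt = [max(a, b) for a, b in zip(self.vt, vt_m)]
                    progress = True


p0, p1, p2 = (CausalProcess(i, 3) for i in range(3))
a = p0.send("a")      # timestamp [1, 0, 0]
p1.receive(a)
b = p1.send("b")      # timestamp [1, 1, 0], causally after a
p2.receive(b)         # held: p2 has not yet delivered a
p2.receive(a)         # a is delivered, which then releases b
print(p2.delivered)   # ['a', 'b']
```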

Total order
- Different classes of total order broadcast:
  – Fixed sequencer
  – Moving sequencer using token
  – Distributed agreement using timestamps

Using Sequencer (Amoeba)
- Delivery algorithm is similar to FIFO, except that a special "sequencer" orders the messages
- Sender attaches a unique id i to each message m and sends it to the sequencer as well as to all destinations
- Sequencer maintains a sequence number S (consecutive and increasing) and broadcasts it to all destinations
- Message(k) is delivered
  – if all messages(j) (0 ≤ j < k) are received
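
A minimal sketch of the fixed-sequencer idea, assuming sequence numbers start at 0 and that each destination receives two kinds of input: the message itself (from the sender) and an order announcement (from the sequencer). The class and method names are hypothetical and are not taken from Amoeba.

```python
class Sequencer:
    """Assigns consecutive, increasing sequence numbers to message ids."""

    def __init__(self):
        self.next_seq = 0

    def assign(self, msg_id):
        seq = self.next_seq
        self.next_seq += 1
        return (seq, msg_id)   # this pair is broadcast to all destinations


class Receiver:
    """Delivers message k only after all messages j < k are delivered."""

    def __init__(self):
        self.payloads = {}     # msg_id -> payload, as sent by the sender
        self.orders = {}       # seq -> msg_id, as sent by the sequencer
        self.next_deliver = 0
        self.delivered = []

    def on_message(self, msg_id, payload):
        self.payloads[msg_id] = payload
        self._try_deliver()

    def on_order(self, seq, msg_id):
        self.orders[seq] = msg_id
        self._try_deliver()

    def _try_deliver(self):
        # Need both the order announcement and the payload for the next slot.
        while (self.next_deliver in self.orders
               and self.orders[self.next_deliver] in self.payloads):
            msg_id = self.orders.pop(self.next_deliver)
            self.delivered.append(self.payloads.pop(msg_id))
            self.next_deliver += 1


seq = Sequencer()
o1, o2 = seq.assign("m1"), seq.assign("m2")
r = Receiver()
r.on_order(*o2)              # arrives early
r.on_message("m2", "second")
r.on_message("m1", "first")
r.on_order(*o1)
print(r.delivered)           # ['first', 'second']
```

Because every receiver follows the sequencer's numbering, arrival order at a given receiver does not affect its delivery order.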

Distributed Total Order Protocol (ISIS)
- Processes collectively agree on sequence numbers (priorities) in three rounds
- Sender sends the message to all receivers
- Receivers suggest a priority (sequence number) and reply to the sender with the proposed priority
- Sender collects all proposed priorities, decides on the final priority (breaking ties with process ids), and resends the agreed final priority for message m
- Receivers deliver message m according to the decided final priority
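
The three rounds can be sketched as follows. Two details are assumptions beyond what the slide states: the final priority is taken to be the maximum of the proposals (the usual choice in descriptions of this protocol), and priorities are modelled as (sequence number, process id) tuples so that ties are broken by process id. Class and method names are hypothetical.

```python
class IsisProcess:
    """Receiver side of the three-round agreement."""

    def __init__(self, pid):
        self.pid = pid
        self.max_seen = 0     # largest sequence number proposed or agreed
        self.queue = {}       # msg_id -> [priority, is_final, payload]
        self.delivered = []

    def propose(self, msg_id, payload):
        """Round 2: suggest a priority larger than anything seen so far."""
        self.max_seen += 1
        priority = (self.max_seen, self.pid)
        self.queue[msg_id] = [priority, False, payload]
        return priority

    def finalize(self, msg_id, final):
        """Round 3: record the agreed priority and deliver what is ready."""
        self.max_seen = max(self.max_seen, final[0])
        entry = self.queue[msg_id]
        entry[0], entry[1] = final, True
        # Deliver from the front while the lowest-priority entry is final.
        while self.queue:
            head = min(self.queue, key=lambda m: self.queue[m][0])
            priority, is_final, payload = self.queue[head]
            if not is_final:
                break         # an undecided message might still go first
            self.delivered.append(payload)
            del self.queue[head]


def decide(proposals):
    """Sender side: pick the final priority (assumed: the maximum)."""
    return max(proposals)


# Two concurrent messages reach the processes in different orders.
p1, p2, p3 = IsisProcess(1), IsisProcess(2), IsisProcess(3)
props_m1 = [p1.propose("m1", "x")]
props_m2 = [p1.propose("m2", "y")]
props_m2.append(p2.propose("m2", "y"))
props_m1.append(p2.propose("m1", "x"))
props_m1.append(p3.propose("m1", "x"))
props_m2.append(p3.propose("m2", "y"))
final_m1, final_m2 = decide(props_m1), decide(props_m2)
for p in (p1, p2, p3):
    p.finalize("m2", final_m2)
    p.finalize("m1", final_m1)
print(p1.delivered, p2.delivered, p3.delivered)
```

All three processes end up with the same delivery order even though they first saw the two messages in different orders.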

ISIS algorithm for total ordering
[Figure: message exchange within group g (P1, P2, P3, P4), with arrows labelled Message, Proposed Seq, and Agreed Seq]