CSE 486/586 Distributed Systems Leader Election

Slides:



Advertisements
Similar presentations
CS 542: Topics in Distributed Systems Diganta Goswami.
Advertisements

CS425 /CSE424/ECE428 – Distributed Systems – Fall 2011 Material derived from slides by I. Gupta, M. Harandi, J. Hou, S. Mitra, K. Nahrstedt, N. Vaidya.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Leader Election Steve Ko Computer Sciences and Engineering University at Buffalo.
Token-Dased DMX Algorithms n LeLann’s token ring n Suzuki-Kasami’s broadcast n Raymond’s tree.
Leader Election Let G = (V,E) define the network topology. Each process i has a variable L(i) that defines the leader.  i,j  V  i,j are non-faulty.
1 Indranil Gupta (Indy) Lecture 8 Paxos February 12, 2015 CS 525 Advanced Distributed Systems Spring 2015 All Slides © IG 1.
Synchronization Chapter clock synchronization * 5.2 logical clocks * 5.3 global state * 5.4 election algorithm * 5.5 mutual exclusion * 5.6 distributed.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 11: Leader Election All slides © IG.
CS 582 / CMPE 481 Distributed Systems
CPSC 668Set 3: Leader Election in Rings1 CPSC 668 Distributed Algorithms and Systems Spring 2008 Prof. Jennifer Welch.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 19: Paxos All slides © IG.
Election Algorithms and Distributed Processing Section 6.5.
Election Algorithms. Topics r Issues r Detecting Failures r Bully algorithm r Ring algorithm.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Mutual Exclusion Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Mutual Exclusion Steve Ko Computer Sciences and Engineering University at Buffalo.
CS425 /CSE424/ECE428 – Distributed Systems – Fall 2011 Material derived from slides by I. Gupta, M. Harandi, J. Hou, S. Mitra, K. Nahrstedt, N. Vaidya.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Wrap-up Steve Ko Computer Sciences and Engineering University at Buffalo.
Coordination and Agreement. Topics Distributed Mutual Exclusion Leader Election.
Page 1 Distributed Systems Election Algorithms* *referred to slides by Prof. Hugh C. Lauer at Worcester Polytechnic Institute.
CS 425/ECE 428/CSE424 Distributed Systems (Fall 2009) Lecture 8 Leader Election Section 12.3 Klara Nahrstedt.
Synchronization CSCI 4780/6780. Mutual Exclusion Concurrency and collaboration are fundamental to distributed systems Simultaneous access to resources.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 February 10, 2005 Session 9.
ADITH KRISHNA SRINIVASAN
Lecture 11-1 Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2010 Indranil Gupta (Indy) September 28, 2010 Lecture 11 Leader Election.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Mutual Exclusion & Leader Election Steve Ko Computer Sciences and Engineering University.
Lecture 10: Coordination and Agreement (Chap 12) Haibin Zhu, PhD. Assistant Professor Department of Computer Science Nipissing University © 2002.
Election Distributed Systems. Algorithms to Find Global States Why? To check a particular property exist or not in distributed system –(Distributed) garbage.
Page 1 Mutual Exclusion & Election Algorithms Paul Krzyzanowski Distributed Systems Except as otherwise noted, the content.
Lecture 12-1 Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2012 Indranil Gupta (Indy) October 4, 2012 Lecture 12 Mutual Exclusion.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Paxos Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586 CSE 486/586 Distributed Systems Leader Election Steve Ko Computer Sciences and Engineering University at Buffalo.
Distributed Systems 14. Coordination in Distributed Systems Simon Razniewski Faculty of Computer Science Free University of Bozen-Bolzano A.Y. 2014/2015.
Distributed Systems Lecture 9 Leader election 1. Previous lecture Middleware RPC and RMI – Marshalling 2.
Lecture 11: Coordination and Agreement Central server for mutual exclusion Election – getting a number of processes to agree which is “in charge” CDK4:
Distributed Systems 31. Theoretical Foundations of Distributed Systems - Coordination Simon Razniewski Faculty of Computer Science Free University of Bozen-Bolzano.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Paxos Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586 Distributed Systems Reliable Multicast --- 1
Coordination and Agreement
CSE 486/586 Distributed Systems Wrap-up
CSE 486/586 Distributed Systems Failure Detectors
CSE 486/586 Distributed Systems Leader Election
CS 525 Advanced Distributed Systems Spring 2013
Simon Razniewski Faculty of Computer Science
Chapter 11 Coordination Election Algorithms
CSE 486/586 Distributed Systems Wrap-up
Election algorithm Who wins? God knows.
Lecture 17: Leader Election
CSE 486/586 Distributed Systems Global States
CSE 486/586 Distributed Systems Time and Synchronization
CSE 486/586 Distributed Systems Failure Detectors
CSE 486/586 Distributed Systems Failure Detectors
Leader Election (if we ignore the failure detection part)
CS 525 Advanced Distributed Systems Spring 2018
Alternating Bit Protocol
Parallel and Distributed Algorithms
CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy)
CSE 486/586 Distributed Systems Leader Election
CSE 486/586 Distributed Systems Concurrency Control --- 3
CSE 486/586 Distributed Systems Mutual Exclusion
Lecture 10: Coordination and Agreement
Lecture 11: Coordination and Agreement
CSE 486/586 Distributed Systems Failure Detectors
CSE 486/586 Distributed Systems Reliable Multicast --- 2
CSE 486/586 Distributed Systems Mutual Exclusion
CSE 486/586 Distributed Systems Byzantine Fault Tolerance
CSE 486/586 Distributed Systems Time and Synchronization
CSE 486/586 Distributed Systems Concurrency Control --- 3
CSE 486/586 Distributed Systems Reliable Multicast --- 1
CSE 486/586 Distributed Systems Mid-Semester Overview
Presentation transcript:

CSE 486/586 Distributed Systems Leader Election Steve Ko Computer Sciences and Engineering University at Buffalo

Recap: Mutual Exclusion Centralized Ring-based Ricart and Agrawala’s Maekawa’s

Why Election? Example 1: sequencer for TO multicast Example 2: leader for mutual exclusion Example 3: group of NTP servers: who is the root server?

What is Election? In a group of processes, elect a leader to undertake special tasks. What happens when a leader fails (crashes) Some process detects this (how?) Then what? Focus of this lecture: election algorithms 1. Elect one leader only among the non-faulty processes 2. All non-faulty processes agree on who is the leader We’ll look at 3 algorithms

Assumptions Any process can call for an election. A process can call for at most one election at a time. Multiple processes can call an election simultaneously. All of them together must yield a single leader only The result of an election should not depend on which process calls for it. Messages are eventually delivered.

Problem Specification At the end of the election protocol, the non-faulty process with the best (highest) election attribute value is elected. Attribute examples: CPU speed, load, disk space, ID Must be unique Each process has a variable elected. A run (execution) of the election algorithm should ideally guarantee at the end: Safety:  non-faulty p: (p's elected = (q: a particular non-faulty process with the best attribute value) or ) Liveness:  election: (election terminates) &  p: non-faulty process, p’s elected is eventually not 

Ring-Based Election: Example An election message is sent & contains the highest ID encountered so far. Scenario The election was started by process 17. The highest process identifier encountered so far is 24 (final leader will be 33) Q: when do we stop? 33 17 4 24 9 1 15 24 28

Algorithm 1: Ring Election [Chang & Roberts’79] N Processes are organized in a logical ring pi has a communication channel to pi+1 mod N. All messages are sent clockwise around the ring. To start election Send election message with my ID When receiving message (election, id) If id > my ID: forward message Set state to participating If id < my ID: send (election, my ID) Skip if already participating If id = my ID: I am elected (why?) send elected message elected message forwarded until it reaches leader

Ring-Based Election: Example The election was started by process 17. The highest process identifier encountered so far is 24 (final leader will be 33) 33 17 4 24 9 1 15 24 28

Ring-Based Election: Example The worst-case scenario occurs when? the counter-clockwise neighbor (@ the initiator) has the highest attr. In the example: The election was started by process 17. The highest process identifier encountered so far is 24 (final leader will be 33) 33 17 4 24 9 1 15 24 28

Ring-Based Election: Analysis In a ring of N processes, in the worst case: N-1 election messages to reach the new coordinator Another N election messages before coordinator decides it’s elected Another N elected messages to announce winner Total Message Complexity = 3N-1 Turnaround time = 3N-1 24 15 9 4 33 28 17 1

Correctness? Safety: highest process elected Liveness: complete after 3N-1 messages Clicker question?

Example: Ring Election P2 initiates election after old leader P5 failed P1 P2 P3 P4 P0 P5 2. P2 receives "election", P4 dies Election: 4 P1 P2 P3 P4 P0 P5 3. Election: 4 is forwarded forever? Election: 2 Election: 4 Election: 4 Election: 3 May not terminate when process failure occurs during the election! Consider above example where attr==highest id

CSE 486/586 Administrivia Midterm grading going on PA2B grading also going on

Algorithm 2: Modified Ring Election election message tracks all IDs of nodes that forwarded it, not just the highest Each node appends its ID to the list Once message goes all the way around a circle, new coordinator message is sent out Coordinator chosen by highest ID in election message Each node appends its own ID to coordinator message When coordinator message returns to initiator Election a success if coordinator among ID list Otherwise, start election anew

Example: Ring Election 1. P2 initiates election P1 P2 P3 P4 P0 P5 2. P2 receives "election", P4 dies Election: 2, 3,4,0,1 P1 P2 P3 P4 P0 P5 3. P2 selects 4 and announces the result Election: 2 Coord(4): 2 Election: 2,3,4 Election: 2,3 Coord(4): 2,3 Coord(4) 2, 3,0,1 P1 P2 P3 P4 P0 P5 4. P2 receives "Coord", but P4 is not included Election: 2, 3,0,1 P1 P2 P3 P4 P0 P5 6. P3 is finally elected Coord(3): 2, 3,0,1 P1 P2 P3 P4 P0 P5 5. P2 re-initiates election Coord(3): 2,3,0 Election: 2,3,0 Coord(3): 2 Election: 2 Election: 2,3 Coord(3): 2,3

Modified Ring Election How many messages? 2N Is this better than original ring protocol? Messages are larger What if initiator fails? Successor notices a message that went all the way around (how?) Starts new election What if two people initiate at once Discard initiators with lower IDs #2 is clicker question

What about that Impossibility? Can we have a totally correct election algorithm in a fully asynchronous system (no bounds) No! Election can solve consensus Where might you run into problems with the modified ring algorithm? Detect leader failures Ring reorganization (member failures)

Algorithm 3: Bully Algorithm Assumptions: Synchronous system attr=id Each process knows all the other processes in the system (and thus their id's)

Algorithm 3: Bully Algorithm 3 message types election – starts an election answer – acknowledges a message coordinator – declares a winner Start an election Send election messages only to processes with higher IDs than self If no one replies after timeout: declare self winner If someone replies, wait for coordinator message Restart election after timeout When receiving election message Send answer Start an election yourself If not already running

Example: Bully Election answer=OK P1 P2 P3 P4 P0 P5 1. P2 initiates election 2. P2 receives replies P1 P2 P3 P4 P0 P5 3. P3 & P4 initiate election P1 P2 P3 P4 P0 P5 Election OK Election P1 P2 P3 P4 P0 P5 4. P3 receives reply P1 P2 P3 P4 P0 P5 5. P4 receives no reply P1 P2 P3 P4 P0 P5 5. P4 announces itself coordinator OK

The Bully Algorithm The coordinator p4 fails and p1 detects this 2 3 4 C coordinator Stage 4 election Stage 2 answer Stage 1 timeout Stage 3 Eventually..... The coordinator p4 fails and p1 detects this p3 fails

Analysis of The Bully Algorithm Best case scenario? The process with the second highest id notices the failure of the coordinator and elects itself. N-2 coordinator messages are sent. Turnaround time is one message transmission time.

Analysis of The Bully Algorithm Worst case scenario? When the process with the lowest id in the system detects the failure. N-1 processes altogether begin elections, each sending messages to processes with higher ids. The message overhead is O(N2).

Turnaround time T: Message bound---all messages arrive within T units of time (synchronous) Tprocess: Processing bound---bound on the processing time at each process Turnaround time: election message from lowest process (T) Timeout at 2nd highest process (X) coordinator message from 2nd highest process (T) How long should the timeout be? X = 2T + Tprocess Total turnaround time: 4T + 3Tprocess

Summary Coordination in distributed systems sometimes requires a leader process Leader process might fail Need to (re-) elect leader process Three Algorithms Ring algorithm Modified Ring algorithm Bully Algorithm

Acknowledgements These slides contain material developed and copyrighted by Indranil Gupta (UIUC).