CS603 Clock Synchronization February 4, 2002



What is the best we can do?
Lundelius and Lynch ‘84
Assumptions:
–No failures
–No drift
–Fully connected network of n nodes
–Uncertainty of ε in message delivery time
Best guarantee:
–Clocks synchronized to within ε(1 – 1/n)
–This is a tight lower bound
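The slide states the bound but not an algorithm that meets it. The sketch below is one way to reach ε(1 – 1/n) under the slide's assumptions (no failures, no drift, fully connected, delays in [μ, μ + ε]): each node estimates every other node's clock difference by assuming the midpoint delay, then adjusts by the average of its estimates. It follows the averaging idea usually attributed to Lundelius and Lynch, but the function names, the parameter values, and the uniform delay model are assumptions made for illustration.

```python
import random

def synchronize(offsets, mu, eps, rng):
    """One round of averaging; returns the adjusted clock offsets.

    offsets[p] is node p's clock minus real time (unknown to the nodes).
    Every message delay is drawn from [mu, mu + eps] (assumed model).
    """
    n = len(offsets)
    adjusted = []
    for p in range(n):
        estimates = [0.0]  # a node's estimate of its own difference is 0
        for q in range(n):
            if q == p:
                continue
            delay = rng.uniform(mu, mu + eps)
            sent_stamp = offsets[q]          # q's clock when it sends at real time 0
            recv_local = delay + offsets[p]  # p's clock when the message arrives
            # Assume the midpoint delay mu + eps/2; the error is at most eps/2.
            estimates.append(sent_stamp + mu + eps / 2 - recv_local)
        adjusted.append(offsets[p] + sum(estimates) / n)
    return adjusted

def skew(offsets):
    """Largest difference between any two clocks."""
    return max(offsets) - min(offsets)

rng = random.Random(0)
n, mu, eps = 5, 10.0, 2.0
start = [rng.uniform(-100, 100) for _ in range(n)]
after = synchronize(start, mu, eps, rng)
bound = eps * (1 - 1 / n)
print(f"skew after one round: {skew(after):.3f} (bound {bound:.3f})")
```

Each node's estimate of another node is off by at most ε/2, and each node averages n – 1 such estimates over n terms, which is where the factor (1 – 1/n) comes from.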

Lower bound proof
Idea: Based on the view of each node
–Views are indistinguishable even if real time is not the same
–Shift the execution of a node relative to real time
–A shifted global view and the original are equivalent for every local view if message delays are changed to match
–Can always shift by at least ε(1 – 1/n) without changing local views

Proof: Induction
Clocks synchronized to within γ
Execution e_1: messages in one direction take time μ, messages in the return direction take time μ + ε
Induction (execution e_i): assume node i-1 sends with delay μ and receives with delay μ + ε
–Shift processes < i by ε
Let V_1, …, V_n be the local times at termination of e_1.
–In e_1: V_n ≤ V_1 + γ
–In e_i: V_{i-1} ≤ V_i + γ – ε
Summing the n inequalities:
–∑ V_i ≤ ∑ V_i + nγ – (n-1)ε
–(n-1)ε ≤ nγ
–γ ≥ ε(1 – 1/n)

Synchronization with Faulty Clocks (Dolev, Halpern, Strong ‘84)
Problem: What if some sites are really bad?
–Bad clocks
–Don’t follow protocol
Notation
–C: Logical clock
–D: Physical clock
–TAR: Time Adjustment Register; C = D + TAR
–Δ: Uncertainty in message delay
–C(t), D(t): value of the clock at REAL time t
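The relation C = D + TAR can be made concrete with a small sketch. The class name, the linear physical clock, and the `set_to` helper are hypothetical and only illustrate that a logical clock is adjusted by changing TAR, never by touching the physical clock.

```python
class LogicalClock:
    """Logical clock C = D + TAR over a physical clock D."""

    def __init__(self, rate=1.0, tar=0.0):
        self.rate = rate  # assumed: physical ticks per unit of real time
        self.tar = tar    # Time Adjustment Register

    def D(self, t):
        """Physical clock value at real time t (linear model, an assumption)."""
        return self.rate * t

    def C(self, t):
        """Logical clock value at real time t."""
        return self.D(t) + self.tar

    def set_to(self, t, target):
        """Change TAR so that the logical clock reads `target` at real time t."""
        self.tar = target - self.D(t)

clock = LogicalClock(rate=1.5)
clock.set_to(t=10.0, target=12.0)
print(clock.C(10.0), clock.tar)
```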

Assumptions
Network is connected, but not necessarily complete
Recipient knows the source of a message
Given nodes p, q: H(p,q) and L(p,q) are upper/lower bounds on transmission time
–ρ is min(H/L)
A real time frame (not directly observable)
Correct physical clock has a bounded drift rate: ∃ R such that ∀ times u > v, (1/R)(u-v) ≤ D(u)-D(v) ≤ R(u-v)
Correct processor has a correct clock and implements the algorithm
No assumptions on the behavior of a faulty processor
–Don’t care if a faulty processor knows the correct time
All processors start within time B (can easily show B ≤ R(n-1)H)
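The drift-rate condition on a correct physical clock can be checked on sampled real times. This is a minimal sketch; the function name, the sample times, and the two example clocks are assumptions chosen for illustration.

```python
def within_drift_bound(D, R, times):
    """Check (1/R)(u - v) <= D(u) - D(v) <= R(u - v) for all sampled u > v."""
    pts = sorted(times)
    for i, v in enumerate(pts):
        for u in pts[i + 1:]:
            elapsed = D(u) - D(v)
            if not ((u - v) / R <= elapsed <= R * (u - v)):
                return False
    return True

times = [0.0, 1.0, 2.5, 4.0]
good = lambda t: 1.25 * t  # runs 25% fast: inside the envelope for R = 1.5
bad = lambda t: 2.0 * t    # runs at double speed: outside it

print(within_drift_bound(good, 1.5, times))
print(within_drift_bound(bad, 1.5, times))
```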

Weak Synchronization
Weak Clock Synchronization Condition (WCSC): There are constants PER, DMAX, ADJ such that:
–TAR changes only at times that are multiples of PER, by an amount less than ADJ
–Difference between clocks is bounded by DMAX
Theorem: There is an algorithm that achieves WCSC, independent of faults, for which C(t) is unbounded
Proof: Set TAR(t’) = log_PER(D(t)) – D(t)
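One reading of the proof, offered here as an interpretation of the slide rather than the authors' exact construction: with this TAR the logical clock tracks the base-PER logarithm of the physical clock, so no messages are exchanged and faults cannot matter. The sketch below checks numerically that two clocks whose physical readings lie between t/R and R·t stay within 2·log_PER(R) of each other while still growing without bound. The values of R, PER, and the sample times are assumptions.

```python
import math

def log_clock(d_value, per):
    """Logical clock under the assumed reading C = log_PER(D)."""
    return math.log(d_value, per)

R, PER = 1.01, 10.0
bound = 2 * math.log(R, PER)  # largest gap between a slow and a fast clock

gaps = []
readings = []
for t in [10.0, 1e3, 1e6, 1e9]:
    slow, fast = t / R, t * R  # extreme physical readings allowed by drift R
    gaps.append(log_clock(fast, PER) - log_clock(slow, PER))
    readings.append(log_clock(fast, PER))

print(max(gaps), bound)
```

The gap does not depend on t, while the readings keep increasing, which matches the claim that C(t) is unbounded yet the clocks stay within a fixed distance.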

Real clock synchronization
Clock Synchronization Condition: Add
–PER > ADJ
–Changes occur only the first time C reads iPER: if a change occurs when C(t) = iPER, then C(t’) ≠ iPER ∀ t’ < t
Gives Linear Envelope Synchronization:
–at+b 0
Theorem: Linear Envelope Synchronization is impossible if ≥ 1/3 of the processors are faulty

Proof Sketch
Construct an algorithm that forces a correct processor to run at a rate greater than aρ^n
Idea: faulty processor p uses one algorithm for processor q, another for the others
–Two-faced behavior
–Can’t tell which is two-faced
–Correct processor caught in the middle: follow the fast clock or the slow clock?

Three-processor case (p, q, r)
Assume algorithm A synchronizes in time N and tolerates one fault
F_0 = A
F_{m+1}: p pretends its clock runs at ρ times q’s rate
p pretends r sends messages so that C_p(t) > aρ^m D_p(t) + b – m·DMAX
–F_m gives these messages
q cannot distinguish this from the case where p’s clock is fast and r is sending p messages according to F_m
C_q(t) > C_p(t) – DMAX > aρ^m D_p(t) + b – (m+1)·DMAX = aρ^{m+1} D_q(t) + b – (m+1)·DMAX (since D_p(t) = ρD_q(t))

Possibility (Fischer, Lynch, Merritt)
If there is no uncertainty in message delay and f processors are faulty, can do with 2f+1 processors
–Send messages to all neighbors
–Send all messages back
–Round trip gives time
–Faulty processor will be detected if it tries to be worse than round-trip time
–Messages out of order
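The round-trip idea can be sketched for a single probe. This is an illustration only, not the algorithm from the paper: it assumes a known, fixed one-way delay, an immediate echo, and negligible drift during the probe, and the function name and tolerance are hypothetical.

```python
def probe(send_local, echo_stamp, recv_local, delay, tol=1e-9):
    """Return q's clock offset relative to p, or None if q looks faulty.

    send_local: p's clock when the probe is sent
    echo_stamp: q's clock reading carried in the echoed message
    recv_local: p's clock when the echo arrives
    delay:      the known one-way message delay (no uncertainty, assumed)
    """
    if abs((recv_local - send_local) - 2 * delay) > tol:
        return None  # round trip is inconsistent with the known delay
    # q read its clock when p's clock showed send_local + delay.
    return echo_stamp - (send_local + delay)

print(probe(100.0, 107.0, 104.0, 2.0))  # consistent round trip
print(probe(100.0, 107.0, 105.0, 2.0))  # echo arrived late
```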

Possibility (Dolev, Halpern, Simons, Strong)
We CAN do better
–Requires authentication
Assumptions:
–Messages will be received with bounded delay
–Bounded drift
–Digital signatures
–If p has a set of messages M at time t with more than f distinct signers, at least one signer was correct at the time it signed
–2ρ(f+1) < 1
Key: Synchronization time is known in advance
–At that time, send a signed “time is now” message
–If f+1 messages saying “time is now” are received before the local clock reaches that time, update the local time
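The update rule in the last two bullets can be sketched as a message handler. This is a simplified model, not the authors' full protocol: signatures are represented only by the signer's name (no real cryptography), relaying is omitted, and the class and method names are hypothetical.

```python
class Node:
    """Tracks signed "time is T" messages and applies the f+1 rule."""

    def __init__(self, name, f, clock=0.0):
        self.name = name
        self.f = f          # maximum number of faulty processors tolerated
        self.clock = clock  # current local (logical) time
        self.signers = {}   # announced time T -> set of distinct signer names

    def on_message(self, T, signer):
        """Handle a signed "time is T" message; return True if the clock was updated."""
        if T <= self.clock:
            return False  # the local clock already reached T
        self.signers.setdefault(T, set()).add(signer)
        if len(self.signers[T]) >= self.f + 1:
            self.clock = T  # at least one of the f+1 signers must be correct
            return True
        return False

node = Node("p", f=1, clock=95.0)
results = [
    node.on_message(100.0, "q"),
    node.on_message(100.0, "q"),  # same signer again: not counted twice
    node.on_message(100.0, "r"),  # second distinct signer reaches f+1
]
print(results, node.clock)
```

Counting distinct signers is what gives the guarantee in the assumptions: with more than f distinct signers, at least one of them was correct when it signed.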
