Presentation is loading. Please wait.

Presentation is loading. Please wait.

© Chinese University, CSE Dept. Distributed Systems / 9 - 1 Distributed Systems Topic 9: Time, Coordination and Replication Dr. Michael R. Lyu Computer.

Similar presentations


Presentation on theme: "© Chinese University, CSE Dept. Distributed Systems / 9 - 1 Distributed Systems Topic 9: Time, Coordination and Replication Dr. Michael R. Lyu Computer."— Presentation transcript:

1 © Chinese University, CSE Dept. Distributed Systems / 9 - 1 Distributed Systems Topic 9: Time, Coordination and Replication Dr. Michael R. Lyu Computer Science & Engineering Department The Chinese University

2 © Chinese University, CSE Dept. Distributed Systems / 9 - 2 Outline 1 Time 2 Coordination 3 Replication  The gossip architecture  The process groups approach 4 Summary

3 © Chinese University, CSE Dept. Distributed Systems / 9 - 3 1 Time  The notation of time –story: 12/9/1949  External synchronization  Internal synchronization  Physical clocks and their synchronization  Logical time and logical clocks

4 © Chinese University, CSE Dept. Distributed Systems / 9 - 4 1.1 Synchronizing Physical Clocks  Computer each contain their physical clock.  Physical clock is limited by its resolution - the period between updates of the clock register.  Clock drift often happens to physical clocks.  To compensate for clock drifts, computers are synchronized to a time service, e.g., UTC - Coordinated universal time.  Several other algorithms for synchronization.

5 © Chinese University, CSE Dept. Distributed Systems / 9 - 5 1.1 Compensating for Clock Drift  S(t) = H(t) +  (t) ; S = application time, H = hardware clock time,  = compensating factor.  Assuming linear relation  (t) = aH(t) + b.  Let the value of the software clock be T skew when H = h, and let the actual time be T real.  If S is to give the actual time after N further ticks, we have T skew = (1 + a)h + b, and T real + N = (1 + a)(h + N) + b.  a = (T real - T skew ) / N and b = T skew - (1 + a)h

6 © Chinese University, CSE Dept. Distributed Systems / 9 - 6 1.1 Cristian’s Clock Synchronization  Let the time returned in S ’s message m t be t. P should set its clock to t + T round / 2.  The time by S ’s clock when the reply message arrives is [ t + min, t + T round - min ], with width T round - 2 min and accuracy ±( T round / 2 - min ).

7 © Chinese University, CSE Dept. Distributed Systems / 9 - 7 1.1 The Berkeley Algorithm  A coordinator computer is chosen to act as the master. Master periodically polls to slaves whose clocks are to be synchronized.  The master estimates their local clock times by observing the round-trip times, and it averages the values obtained.  The master takes a fault-tolerant average.  Should the master fail, then another can be elected to take over.

8 © Chinese University, CSE Dept. Distributed Systems / 9 - 8 1.1 The Network Time Protocol  NTP distributes time information to provide: –a service to synchronize clients in Internet –a reliable service that survives loss of connection –a frequent resynchronization for client’s clock drift –protection against interference with time server  NTP service is provided by various servers: –Primary servers, secondary servers, and servers of other levels (called strata).  Synchronization subnet: the servers which are connected in a logical hierarchy.

9 © Chinese University, CSE Dept. Distributed Systems / 9 - 9 1.1 NTP Synchronization Modes  Estimating delay and offset in NTP protocol: –a = T i-2 - T i-3 –b = T i –T i-1 –d i = a + b –o i = (a-b)/2  NTP servers synchronize in three modes: –Multicast mode –Procedure-call mode –Symmetric mode

10 © Chinese University, CSE Dept. Distributed Systems / 9 - 10 1.2 Logical Time and Logical Clocks  The order of the events –two events occurred in the order they appear in a process. –event of sending occurred before event of receiving.  happened-before relation, denoted by  HB1: If  process p: x  p y, then x  y. HB2: For any message m, send(m)  rcv(m), HB3: If x, y and z are events such that x  y and y  z, then x  z.

11 © Chinese University, CSE Dept. Distributed Systems / 9 - 11 1.2 Logical Timestamps Example  Events occurring at three processes

12 © Chinese University, CSE Dept. Distributed Systems / 9 - 12 1.2 Logical Timestamps  Logical clock - a monotonically increasing software counter.  C p : logical clock for process p; C p (a): timestamp of event a at p; C(b): timestamp of event b  LC1: event issued at process p: C p := C p + 1 LC2: a) p sends message m to q with value t = C p b) C q := max(C q,t) and applies LC1 to rcv(m).  If a  b then C(a) < C(b), but not visa versa!  Total order logical clock and vector clock.

13 © Chinese University, CSE Dept. Distributed Systems / 9 - 13 1.2 Logical Timestamps Example  Events occurring at three processes 1 2 3 4 51 3 6 => 7

14 © Chinese University, CSE Dept. Distributed Systems / 9 - 14 2 Coordination  Distributed processes need to coordinate their activities.  Distributed mutual exclusion is required for safety, liveness, and ordering properties.  Election algorithms: methods for choosing a unique process for a particular role.

15 © Chinese University, CSE Dept. Distributed Systems / 9 - 15 2.1 Distributed Mutual Exclusion  The basic requirements for mutual exclusion: –ME1 (safety): At most one process may execute in the critical section (CS) at a time. –ME2 (liveness): A process requesting entry to the CS is eventually granted. –ME3 (ordering): Entry to the CS should be granted in happened-before order.  The central server algorithm.  A ring-based algorithm.  A distributed algorithm using logical clocks.

16 © Chinese University, CSE Dept. Distributed Systems / 9 - 16 2.2 Elections  An election is a procedure carried out to choose a process from a group.  A ring-based election algorithm.  The bully algorithm.

17 © Chinese University, CSE Dept. Distributed Systems / 9 - 17 3 Replication  Replication is the maintenance of on-line copies of data and resources  For performance, availability, fault tolerance.  Basic Architectural Model.  Consistency and request ordering.  The gossip architecture.  The process group approach.

18 © Chinese University, CSE Dept. Distributed Systems / 9 - 18 3 Bulletin Board Example

19 © Chinese University, CSE Dept. Distributed Systems / 9 - 19 3 Replication Issues  Replica management models consider trade- off between accuracy and response time. –Simple asynchronous model –Totally synchronous model –Quorum-based schemes –Causality-ordered  Multicast updates to a process group.  Read/write ratio.

20 © Chinese University, CSE Dept. Distributed Systems / 9 - 20 3.1 Basic Architectural Model

21 © Chinese University, CSE Dept. Distributed Systems / 9 - 21 3.1 The Gossip Architecture

22 © Chinese University, CSE Dept. Distributed Systems / 9 - 22 3.1 The Primary Copy Model

23 © Chinese University, CSE Dept. Distributed Systems / 9 - 23 3.2 Consistency and Request Ordering  Criteria: correctness vs. expenses.  Total, causal, and sync ordering requirements.  Implementing request ordering.  Implementing total ordering.  Implementing causal ordering with vector timestamps.

24 © Chinese University, CSE Dept. Distributed Systems / 9 - 24 3.2.1 Total, Causal, and Sync Ordering  Let r 1 and r 2 be requests.  Total ordering: Either r 1 is processed before r 2 or r 2 is processed before r 1, at all RMs.  Causal ordering: If r 1 happened-before r 2 then r 1 is processed before r 2 at all RMs.  FIFO ordering: If r 1 is issued before r 2 then r 1 is processed before r 2 at all RMs.  Sync-ordering: If r 1 is sync-ordered, then either r 1 is processed before r 2 at all RMs or r 2 is processed before r 1 at all RMs.

25 © Chinese University, CSE Dept. Distributed Systems / 9 - 25 3.2.1 Example 1

26 © Chinese University, CSE Dept. Distributed Systems / 9 - 26 3.2.1 Example 2

27 © Chinese University, CSE Dept. Distributed Systems / 9 - 27 3.2.2 Implementing Request Ordering  Hold-back: A received request is not processed by RM until ordering constraints can be met.  Stable message: all prior requests processed.  Hold-back queue vs. delivery queue.  Safety property: no message will be delivered out of order by being prematurely transferred.  Liveness property: no message should wait on the hold-back queue forever.

28 © Chinese University, CSE Dept. Distributed Systems / 9 - 28 3.2.3 Implementing Total Ordering  Basic approach: assign totally ordered identifiers to requests.  Sequencer  Distributed agreement in assigning request ids.

29 © Chinese University, CSE Dept. Distributed Systems / 9 - 29 3.2.4 Implementing Causal Ordering  Vector timestamp: a list of counts of update events, one for each of the replica managers.  Merging vector timestamps: choose the largest values from the two vectors, component-wise. e.g., FE time vector: (2,3,4)

30 © Chinese University, CSE Dept. Distributed Systems / 9 - 30 3.3 The Gossip Architecture Query & Update Ops.Clients Communication via FE

31 © Chinese University, CSE Dept. Distributed Systems / 9 - 31 3.4 Process Group Approach  Process group and group communication.  Group structure –peer group –server group –client-server group –subscription group –hierarchical groups

32 © Chinese University, CSE Dept. Distributed Systems / 9 - 32 3.4 Process Group Services  Group membership management –Create –Join –Leave  Group address expansion  Multicast communication –unreliable multicast –reliable multicast –atomic multicast

33 © Chinese University, CSE Dept. Distributed Systems / 9 - 33 3.4 Multicast Communication  Sample multicasting operation void Multicast (in orderType order, in groupId group, in msg m, in int nReplies, out msgSeq replies) raises (…);  Order types: –unordered –total ordering –causal ordering –sync-ordering

34 © Chinese University, CSE Dept. Distributed Systems / 9 - 34 4 Summary  Timing issues –Synchronizing physical clocks. –Logical time and logical clocks.  Distributed coordination and mutual exclusions.  Replication to providing good performance, high availability and fault tolerance.  The gossip approach and the process group approach.  CORBA replication service is a research topic.


Download ppt "© Chinese University, CSE Dept. Distributed Systems / 9 - 1 Distributed Systems Topic 9: Time, Coordination and Replication Dr. Michael R. Lyu Computer."

Similar presentations


Ads by Google