# DCS 6. Basic Distributed Algorithms Fundamentals

Wei Yuan, November 21, 2013


Outline

- Physical Clocks
- Logical Clocks
  - Lamport’s Logical Clock
  - Vector Clock
- Global Snapshots

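Physical Clocks

- Most computers today keep track of the passage of time with a battery-backed CMOS clock circuit driven by a quartz oscillator. The battery backup lets the clock keep measuring time while the power is off.
- Two registers are associated with the quartz crystal: a counter and a holding register. Each oscillation decrements the counter; when it reaches zero, an interrupt is generated and the counter is reloaded from the holding register.
- In this way a programmable interval timer generates an interrupt (a clock tick) periodically.
- The interrupt service procedure simply adds one to a counter in memory.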

Problem: getting two systems to agree on time

- Two clocks hardly ever agree, because quartz oscillators oscillate at slightly different frequencies.
- Clock drift: clocks tick at different rates, which creates an ever-widening gap in perceived time.
- Clock skew: the difference between two clocks at one point in time.
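As a rough illustration of how drift turns into skew, the sketch below assumes each clock has a constant drift rate expressed in parts per million (ppm). The function name and the drift figures are made up for the example; real oscillators do not drift at a constant rate.

```python
def skew_after(seconds: float, drift_a_ppm: float, drift_b_ppm: float) -> float:
    """Skew in seconds between two clocks that agreed at time zero.

    Assumes constant drift rates, given in parts per million (ppm).
    """
    return abs(drift_a_ppm - drift_b_ppm) * 1e-6 * seconds


# Two hypothetical clocks drifting at +20 ppm and -15 ppm for one day.
one_day = 24 * 60 * 60
print(f"{skew_after(one_day, 20.0, -15.0):.3f} s")  # prints 3.024 s
```

Even these small rates open a gap of about three seconds per day, which is why clocks have to be resynchronized periodically.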

Solution

- International Atomic Time (TAI)
- Coordinated Universal Time (UTC)
- ……
- Time synchronization algorithms


Lamport’s Logical Clock

A distributed system consists of a collection of distinct processes that are spatially separated and that communicate with one another by exchanging messages. Examples:

- A network of interconnected computers, such as the ARPA net.
- A single computer, in which the central control unit, the memory units, and the input-output channels are separate processes.

Lamport L. Time, clocks, and the ordering of events in a distributed system[J]. Communications of the ACM, 1978, 21(7): 558-565.

Lamport’s happened before (→) relation

Define the "happened before" relation without using physical clocks; it is a partial ordering.

Assumptions:

- The system is composed of a collection of processes.
- Each process consists of a sequence of events. Depending on the application, an event might be the execution of a subprogram on a computer or the execution of a single machine instruction.
- The events of a process form a sequence, where a occurs before b in this sequence if a happens before b.
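A minimal sketch of the relation, following the three conditions in Lamport's 1978 paper: order within a process, send before receive, and transitivity. The function names, event labels, and the message in the example are hypothetical.

```python
def happened_before(processes, messages):
    """Return the set of pairs (a, b) such that a -> b.

    processes: list of event-name lists, each in local execution order.
    messages:  list of (send_event, receive_event) pairs.
    """
    # Direct edges: consecutive events on one process, and send -> receive.
    succ = {e: set() for proc in processes for e in proc}
    for proc in processes:
        for a, b in zip(proc, proc[1:]):
            succ[a].add(b)
    for send, recv in messages:
        succ[send].add(recv)

    # Transitive closure: depth-first search from every event.
    relation = set()
    for start in succ:
        stack, seen = list(succ[start]), set()
        while stack:
            e = stack.pop()
            if e not in seen:
                seen.add(e)
                stack.extend(succ[e])
        relation |= {(start, e) for e in seen}
    return relation


def concurrent(a, b, relation):
    """Two distinct events are concurrent if neither happened before the other."""
    return a != b and (a, b) not in relation and (b, a) not in relation


procs = [["p1", "p2", "p3"], ["q1", "q2"]]
msgs = [("p1", "q2")]          # p1 sends a message that is received at q2
hb = happened_before(procs, msgs)
print(("p1", "q2") in hb)      # True
print(concurrent("p2", "q1", hb))  # True
```

Because some pairs of events are unrelated in either direction, the relation is only a partial order.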


Space-time diagram

- Horizontal direction: space
- Vertical direction: time
- Dots: events
- Vertical lines: processes
- Wavy lines: messages


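A process’s clock “ticks”

Here (1) and (2) are the two clock conditions from Lamport's paper: (1) if a and b are events in process P_i and a comes before b, then C_i(a) < C_i(b); (2) if a is the sending of a message by P_i and b is the receipt of that message by P_j, then C_i(a) < C_j(b).

- Condition (1) means that there must be a tick line between any two events on a process line.
- Condition (2) means that every message line must cross a tick line.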

Event counting example (figure not preserved)

Lamport’s logical timestamps

Event counting example: applying Lamport’s algorithm (figure not preserved)
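The figures for this example are not preserved, so here is a minimal sketch of Lamport's clock rules instead: increment the local counter on every event, and on receipt move the counter past the timestamp carried by the message. The class name, method names, and the two-process scenario are illustrative assumptions, not the example from the slides.

```python
class LamportClock:
    """Logical clock for one process (a sketch, not a library API)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or send: advance the clock and return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, msg_time):
        """Receive event: move past the sender's timestamp, then return it."""
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
a = p.tick()       # local event on p
b = p.tick()       # p sends a message stamped with b
c = q.receive(b)   # q receives that message
d = q.tick()       # later local event on q
print(a, b, c, d)  # prints 1 2 3 4
```

The receive rule is what enforces the second clock condition: the receipt always gets a larger timestamp than the send.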

Problem: Identical timestamps

- Concurrent events (e.g., b & g; i & k) may have the same timestamp ... or not.
- Total ordering: every event is assigned a unique timestamp (number).

Unique timestamps (total ordering)

Unique (totally ordered) timestamps
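One common way to make timestamps unique, following Lamport's paper, is to break ties with an arbitrary fixed ordering of the processes, for example by process id. The sketch below assumes events are stored as `(timestamp, process_id, label)` tuples; the values are invented for illustration.

```python
# (lamport_timestamp, process_id, label); all values are illustrative.
events = [
    (1, 2, "x"),
    (1, 1, "y"),
    (2, 1, "z"),
    (2, 3, "w"),
]

# Order by timestamp first, then by process id to break ties.
total_order = sorted(events, key=lambda e: (e[0], e[1]))
labels = [label for _, _, label in total_order]
print(labels)  # prints ['y', 'x', 'z', 'w']
```

The pair (timestamp, process id) is unique because a single process never assigns the same timestamp to two of its events.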

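Problem: Detecting causal relations

Lamport timestamps guarantee that if a → b then L(a) < L(b), but the converse does not hold: L(a) < L(b) does not tell us whether a happened before b or the two events are concurrent.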


Vector clocks

Comparing vector timestamps
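The comparison rules themselves are not preserved in the transcript. As a sketch of the standard definition: V ≤ W when every component of V is less than or equal to the matching component of W, and two events are concurrent when neither vector is ≤ the other. The function names below are assumptions.

```python
def vc_leq(v, w):
    """V <= W: every component of v is <= the matching component of w."""
    return all(a <= b for a, b in zip(v, w))


def vc_before(v, w):
    """v happened before w: V <= W and the vectors differ."""
    return vc_leq(v, w) and v != w


def vc_concurrent(v, w):
    """Neither V <= W nor W <= V."""
    return not vc_leq(v, w) and not vc_leq(w, v)


print(vc_before((1, 0, 0), (2, 1, 0)))      # True
print(vc_concurrent((2, 0, 0), (0, 0, 1)))  # True
```

Unlike Lamport timestamps, this comparison answers the causality question directly from the two timestamps.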

Vector timestamps (animated figure not preserved; the timestamps shown across the build are (0,0,0), (1,0,0), (2,0,0), (2,1,0) and (2,2,0))

Two events are concurrent if neither V(e) ≤ V(e’) nor V(e’) ≤ V(e).
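A minimal sketch of the usual vector clock update rules: a process increments its own entry on each event, and on receipt first takes the component-wise maximum with the vector carried by the message. The class and the message pattern are assumptions, chosen so that the resulting timestamps match those visible in the slides; the actual message pattern of the original figure is not preserved.

```python
class VectorClock:
    """Vector clock for process `pid` in a system of `n` processes (a sketch)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n

    def tick(self):
        """Local event or send: increment this process's own entry."""
        self.v[self.pid] += 1
        return tuple(self.v)

    def receive(self, msg_v):
        """Receive: component-wise maximum, then increment own entry."""
        self.v = [max(a, b) for a, b in zip(self.v, msg_v)]
        self.v[self.pid] += 1
        return tuple(self.v)


p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
a = p0.tick()        # (1, 0, 0)
b = p0.tick()        # (2, 0, 0), assumed to be sent to p1
c = p1.receive(b)    # (2, 1, 0)
d = p1.tick()        # (2, 2, 0)
print(a, b, c, d)
```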


“Distributed snapshots: determining global states of distributed systems”, K. Mani Chandy and Leslie Lamport, ACM TOCS 1985

Model of a Distributed System

- A finite set of processes, as nodes.
- A finite set of channels, as edges.
- Channels have infinite buffers, are error-free, and are FIFO.
- The delay experienced by a message is arbitrary but finite.

(Figure: processes p, q, r connected by channels c1, c2, c3, c4.)

A banking example to illustrate recording of consistent states (figure not preserved)

Global State of a Distributed System

- Global state: the union of the local states of the individual processes and the states of the channels.
- The state of a channel is determined by the messages in transit, that is, messages that have been sent along the channel but not yet received.
- Initial global state of the system:
  - each process is in its initial state;
  - the state of each channel is the empty sequence.

Each component of a distributed system has a local state. Process state: described by the local memory and the history of activity. Channel state: described by the sequence of messages sent along the channel minus the sequence of messages received along the channel.

Global State Detection

Many problems in distributed systems can be solved by detecting a global state of the system.

- Stable property detection
  - A stable property is one that, once it becomes true, remains true forever.
  - E.g. termination, deadlock, token loss.
- Checkpointing in distributed systems
  - E.g. debugging, failure recovery.

A distributed system has no shared memory and no global clock. Distributed characteristics such as local clocks and local memory make it difficult to record the global state of the system efficiently. Detecting stable properties such as deadlock and termination requires examining the global state. For failure recovery, the global state of the system must be saved periodically (a checkpoint), and the system is restored to the most recently saved global state so that recovery work can begin from the point of process failure.

Distributed Computation

- A distributed computation is a sequence of events.
- There are three kinds of events: local, send, and receive.
- An event is an atomic action that may change the state of the process p and the state of at most one channel that is incident on p.

Definition of an event e: an event is a five-tuple e = ⟨p, s, s', M, c⟩, where p is the process in which the event occurs, s is the state of p immediately before the event, s' is the state of p immediately after the event, and M is the message sent or received along the channel c.

Consistent Global State

- Consistency: every message that is recorded as received has also been recorded as sent.
- The consistent global states determined by a snapshot are states that may have occurred during the computation.

A global state is consistent if it satisfies both of the following conditions:

- C1. Message conservation: every message m_ij recorded as sent in the local state of process p_i must appear either in the state of channel C_ij or in the local state of the receiving process p_j.
- C2. In the recorded global state, for every effect, the cause of that effect must also be present.
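The consistency condition can be sketched as a check over what a snapshot recorded. The data layout (a dictionary from channel name to the list of message ids recorded as sent or received) and the function names are assumptions made for the example.

```python
def is_consistent(recorded_sent, recorded_received):
    """True if every message recorded as received was also recorded as sent.

    Both arguments map a channel name to a list of message ids, as recorded
    by the sender (sent) and by the receiver (received).
    """
    for channel, received in recorded_received.items():
        sent = set(recorded_sent.get(channel, []))
        if any(m not in sent for m in received):
            return False
    return True


def in_transit(recorded_sent, recorded_received, channel):
    """Channel state: messages sent along the channel but not yet received."""
    received = set(recorded_received.get(channel, []))
    return [m for m in recorded_sent.get(channel, []) if m not in received]


good_sent, good_recv = {"c1": ["m1", "m2"]}, {"c1": ["m1"]}
bad_sent, bad_recv = {"c1": ["m1"]}, {"c1": ["m1", "m2"]}

print(is_consistent(good_sent, good_recv))     # True
print(in_transit(good_sent, good_recv, "c1"))  # ['m2']
print(is_consistent(bad_sent, bad_recv))       # False: m2 received but never sent
```

In the second case the effect (receipt of m2) is recorded without its cause (the send), which is exactly what condition C2 rules out.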

Chandy–Lamport Algorithm

- Each process in the system records its local state and the state of its incoming channels.
- The recorded states form a consistent global state.
- The snapshot algorithm runs concurrently with the computation but does not alter the underlying computation.
- The snapshot algorithm uses a marker as a recording signal.
- Any process can initiate the snapshot by recording its own state and sending a marker on all outgoing channels.
- On receiving a marker for the first time, a process records its own local state; the states of its incoming channels are recorded as the markers on those channels arrive.

Chandy–Lamport Algorithm contd.

Marker-Sending Rule for Process p_i

1. Process p_i records its state.
2. For each outgoing channel C on which a marker has not been sent, p_i sends a marker along C before p_i sends further messages along C.

Chandy–Lamport Algorithm contd.

Marker-Receiving Rule for Process p_j

On receiving a marker along channel C:

- If p_j has not recorded its state: record the state of C as the empty sequence, then execute the marker-sending rule.
- Otherwise: record the state of C as the sequence of messages received along C after p_j’s state was recorded and before p_j received the marker along C.
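The two rules can be sketched as a small single-threaded simulation. Everything here is an assumption made for illustration: the `Process` class, FIFO channels modelled as `deque` objects, a two-process "bank" whose local state is a balance, and the hand-picked delivery order. It is not the banking example from the slides.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance    # local state
        self.out = {}             # destination name -> outgoing channel
        self.inc = {}             # source name -> incoming channel
        self.recorded = None      # recorded local state (None = not yet)
        self.chan_state = {}      # source name -> recorded channel state
        self.recording = set()    # incoming channels still being recorded

    def send(self, dst, amount):
        self.balance -= amount
        self.out[dst].append(amount)

    def record_and_send_markers(self):
        """Marker-sending rule: record own state, then send markers."""
        self.recorded = self.balance
        for chan in self.out.values():
            chan.append(MARKER)
        self.recording = set(self.inc)
        self.chan_state = {src: [] for src in self.inc}

    def deliver(self, src):
        """Receive the next message on the channel from src."""
        msg = self.inc[src].popleft()
        if msg == MARKER:
            if self.recorded is None:
                # First marker: channel from src is recorded as empty.
                self.record_and_send_markers()
            self.recording.discard(src)   # stop recording this channel
        else:
            self.balance += msg
            if src in self.recording:
                self.chan_state[src].append(msg)


def connect(a, b):
    """Create one FIFO channel in each direction between a and b."""
    a.out[b.name] = b.inc[a.name] = deque()
    b.out[a.name] = a.inc[b.name] = deque()


p, q = Process("p", 100), Process("q", 50)
connect(p, q)
p.send("q", 10)                # 10 in transit from p to q
q.send("p", 20)                # 20 in transit from q to p
p.record_and_send_markers()    # p initiates the snapshot
q.deliver("p")                 # q receives 10
q.deliver("p")                 # q receives the marker and records its state
p.deliver("q")                 # p receives 20, recorded as channel state
p.deliver("q")                 # p receives q's marker; recording ends
print(p.recorded, q.recorded, p.chan_state, q.chan_state)
```

In this run the snapshot records balances 90 and 40 plus 20 in transit from q to p, which adds up to the initial total of 150 even though the two local states were recorded at different moments.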

Thanks! Q&A