
1 1 Synchronization Synchronization in centralized systems is easy. Synchronization in distributed systems is much more difficult to achieve. Why do we need synchronization in distributed systems? –Distributed mutual exclusion –Distributed Concurrency and Deadlock –Leader/Coordinator election Basic Issues examined here: –Clock synchronization –Logical clocks –Global State Algorithms –Distributed transactions

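2 2 Clock Synchronization When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time. Example: using make to build a program when the source file is edited on one machine and compiled on another. If the editing machine's clock lags, make may conclude that the object file is newer than a source file that was in fact modified later.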

3 3 Physical Clocks 1. Basic mechanism: a timer. 2. A computer timer is usually a quartz crystal oscillating at a well-defined frequency. 3. Two registers are associated with the crystal: a counter and a holding register. 4. Each oscillation of the crystal decrements the counter by one. 5. When the counter reaches ZERO, an interrupt is sent to the CPU and the counter is reloaded with the value of the holding register. 6. In this way, the timer can be programmed to generate an interrupt 60 times a second. 7. Each such interrupt is a clock tick (and constitutes the basic timing mechanism in a centralized system).

4 4 Multiple Physical Clocks If many CPUs are introduced time skew may develop! Two fundamental problems need to be addressed: –How do we synchronize clocks with real-time clocks –How do we synchronize clocks with each other.

5 5 Physical Clock Computation of the mean solar day. Transit of the sun – solar day definition – solar second (1/86400 of a solar day). The earth’s rotation is not constant! Some days are longer/shorter than others – this led to the introduction of the mean solar second. TAI seconds are produced by cesium-133 atomic clocks.

6 6 TAI Clocks & Leap Seconds TAI seconds are of constant length, unlike solar seconds. Leap seconds are introduced when necessary to keep in phase with the sun. The resulting time scale (based on TAI seconds, but kept in step with the sun’s apparent motion) is called Universal Coordinated Time (UTC).

7 7 UTC Services Short wave radio stations broadcast a short pulse at the start of each UTC second: MSF Station (UK), NIST (US). The Geostationary Operational Environmental Satellite (GOES) also offers a UTC service (accurate to 0.5 msec).

8 8 Clock Synchronization Algorithms The relation between clock time and UTC when clocks tick at different rates. Maximum drift rate ρ: 1 − ρ <= dC/dt <= 1 + ρ. If two clocks drift from UTC in opposite directions, then a time Δt after they were synchronized they can be as much as 2ρΔt apart. In an ideal world, C_p(t) = t, where C_p(t) is the value of the clock on machine p and t is the UTC time.

9 9 Cristian's Clock Synch Algorithm Getting the current time from a time server. Requirement: if no two clocks may ever differ by more than δ, the clocks must be resynchronized (in software) at least every δ/2ρ seconds. Cristian’s Algorithm: each machine periodically asks the time server for the current time, and the server replies with its clock value.
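A minimal sketch of the resynchronization arithmetic, assuming a maximum drift rate rho and a tolerated skew delta between any two clocks. The function names and the numeric values are illustrative assumptions, not part of the slides.

```python
def max_skew(rho: float, dt: float) -> float:
    """Worst-case gap after dt seconds between two clocks drifting in opposite directions."""
    return 2 * rho * dt


def resync_interval(delta: float, rho: float) -> float:
    """Longest interval between resynchronizations that keeps the skew within delta."""
    return delta / (2 * rho)


RHO = 1e-5     # assumed maximum drift rate (illustrative value)
DELTA = 1e-3   # assumed tolerated skew of 1 ms (illustrative value)

interval = resync_interval(DELTA, RHO)
print(f"resynchronize at least every {interval:.1f} s")
```

With these assumed values the two clocks would have to be resynchronized roughly every 50 seconds.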

10 10 Cristian’s Algorithm Problems Two problems: 1. If the requester’s clock runs fast, the time provided by the server will be “earlier” than its local time, and setting the clock backwards could lead to inconsistencies (recompilation of source files etc). –Such a change should be introduced gradually –Slow down the clock (add less time per timer interrupt) until the correction has been made. 2. How to estimate the delay for shipping messages. –(T1-T0)/2, where T0 is the time the request was sent and T1 the time the reply arrived –If you can estimate the time I it takes the time server to handle the interrupt and process the incoming message –(T1-T0-I)/2
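The delay estimate can be sketched as follows. This is an illustration under stated assumptions: t0 is the local time when the request was sent, t1 the local time when the reply arrived, and handling_time the server's processing time I, if known. The function name and the sample numbers are hypothetical.

```python
def cristian_estimate(t0: float, t1: float, server_time: float,
                      handling_time: float = 0.0) -> float:
    """Estimate the current time from a time server's reply.

    The one-way delay is taken to be half of the round trip,
    after subtracting the server's handling time when it is known.
    """
    one_way_delay = (t1 - t0 - handling_time) / 2
    return server_time + one_way_delay


# Request sent at local time 10.000 s, reply received at 10.020 s,
# server reports 100.000 s and needed 4 ms to handle the request.
estimate = cristian_estimate(10.000, 10.020, 100.000, handling_time=0.004)
print(f"{estimate:.3f}")
```

The estimate assumes the request and the reply take about equally long, which is not guaranteed on a real network.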

11 11 The Berkeley Algorithm a)The time daemon asks all the other machines for their clock values b)The machines answer c)The time daemon tells everyone how to adjust their clock
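The three steps above can be sketched as an averaging function. This is a simplified illustration that ignores message delays. The function name and the clock values (expressed in minutes) are assumptions chosen for the example.

```python
def berkeley_adjustments(daemon_time: float, reported: dict) -> dict:
    """Return the adjustment each machine (including the daemon) should apply.

    reported maps a machine name to the clock value it answered with.
    """
    clocks = {"daemon": daemon_time, **reported}
    average = sum(clocks.values()) / len(clocks)
    # A positive adjustment means "advance your clock", a negative one "slow down".
    return {name: average - value for name, value in clocks.items()}


# Illustrative values in minutes: the daemon reads 3:00, the others 3:25 and 2:50.
adjustments = berkeley_adjustments(180, {"A": 205, "B": 170})
print(adjustments)
```

Note that no machine is treated as holding the "true" time. The daemon only makes the clocks agree with each other.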

12 12 Averaging Distributed Algorithms One class of algorithms works by dividing time into fixed-length resynchronization intervals. The i-th interval starts at T0 + iR and runs until T0 + (i+1)R –T0 is an agreed upon moment in the past and –R is a system parameter. At the beginning of each interval every machine broadcasts its current time. After this broadcast, each machine starts a local timer and collects all other broadcasts that arrive during an interval S. When S has elapsed, each machine computes the average of the collected values. A slight variation: the m lowest and n highest values (from the set collected in the S period) are discarded. Why? Example of such a protocol: NTP (Network Time Protocol).
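A short sketch of the averaging step with the "discard the extremes" variation. The function name and the sample readings are illustrative assumptions. Discarding extreme values is one way to keep a few wildly wrong clocks from skewing the average.

```python
def trimmed_average(values, m: int, n: int) -> float:
    """Average the collected clock values after dropping the m lowest and n highest."""
    ordered = sorted(values)
    kept = ordered[m:len(ordered) - n]
    if not kept:
        raise ValueError("too many values discarded")
    return sum(kept) / len(kept)


# Five broadcasts collected during the interval S; two of them are clearly off.
collected = [100, 101, 99, 250, 10]
print(trimmed_average(collected, m=1, n=1))
```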

13 13 Logical Clocks In a network, it is important that all machines agree upon a time. This time does not always need to be in sync with the time broadcast by the radio. In the make example, if all machines agree that it is 17:00, it does not really matter that UTC is actually 17:00:02. This leads to the notion of a logical clock (17:00).

14 14 Logical clocks Lamport defined the relation “happened-before”: a->b (event a happened before event b). The happened-before relation can be observed in two settings: –If a and b are events in the same process, and a occurs before b, then a->b holds. –If a is the event of a message being sent by one process, and b is the event of that message being received by another process, then a->b holds (ie, a message cannot be received unless it has been sent).

15 15 Logical Clocks Happened before is transitive If x and y happen in different processes that do not exchange messages then neither x->y nor y->x is true The time for an event a is C(a) If a->b then C(a) < C(b) Logical times go always forward (corrections can be made by additions – never subtractions!)

16 16 Lamport’s Algorithm Three processes, each with its own clock. The clocks run at different rates. [Figure: the three clocks advance by 6, 8, and 10 units per tick (0 to 60, 0 to 80, and 0 to 100); messages A, B, C, and D are exchanged between the processes.]

17 17 Lamport’s Algorithm-Solution Lamport’s algorithm corrects the clocks and provides a way to totally order events. If a happens before b in the same process, then C(a) < C(b). If a and b represent the sending and receiving of a message, respectively, then C(a) < C(b). For all distinct events a and b, C(a) != C(b). [Figure: the same three processes after correction; clocks that receive a message stamped later than their own time are advanced, so the first clock ends at 48, 70, 76 and the second at 48, 61, 69, 77, 85; messages A, B, C, and D.]
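A minimal sketch of a Lamport logical clock, assuming an increment of 1 per event. The class and method names are illustrative assumptions. The receive rule only ever moves the clock forward, so corrections are made by addition and never by subtraction.

```python
class LamportClock:
    """Logical clock that preserves the happened-before relation."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Advance the clock and return the timestamp to attach to a message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Move past both the local time and the message's timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time


fast, slow = LamportClock(), LamportClock()
for _ in range(5):
    fast.tick()                 # the fast process has seen five local events
stamp = fast.send()             # its message carries timestamp 6
arrival = slow.receive(stamp)   # the slow clock jumps from 0 to 7
print(stamp, arrival)
```

To make all timestamps distinct, a common convention is to break ties with the process id, for example by comparing (time, pid) pairs.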

18 18 Lamport Timestamps Queries run faster when they work off replicas of the data. Two users (a customer in San Francisco and an admin in NYC): 1. the customer in San Francisco adds $100.00 to her account (which holds $1000 now) 2. the admin (in NYC) gives an increase of 1% to all accounts. There is obviously a problem here if the two replicas apply these updates in different orders.

19 19 Problem with Replicated Data Problem: Updating a replicated database may leave it in an inconsistent state. The two copies should be exactly the same!! (It does not matter which order of the operations is chosen, as long as every copy follows the same one.) This situation calls for a totally-ordered multicast (of operations). How can this be done? Can we use Lamport’s algorithm?

20 20 Sketch of the Solution A group of processes multicast messages to each other. Each message is always time-stamped with the (logical) time of its sender. Assume that messages from the same sender are received in the order they were sent and that no messages are lost. When a process receives a msg, it puts it into its local queue, ordered by timestamp, and multicasts an ACK to the other processes. A message is delivered to the application only when it is at the head of the queue and has been acknowledged by every other process. All processes will then have the same (ordered) copy of the queue! Lamport’s clocks ensure that NO two messages have the same timestamp!
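The sketch below illustrates only the ordering step, not the acknowledgement protocol. It assumes each queued message is a (timestamp, sender id, operation) tuple and that ties between equal timestamps are broken by sender id. Balances are kept in integer cents, and all names are hypothetical.

```python
def delivery_order(queue):
    """Sort queued messages by (Lamport timestamp, sender id)."""
    return sorted(queue, key=lambda msg: (msg[0], msg[1]))


def apply_all(balance_cents, queue):
    """Apply the queued operations in delivery order."""
    for _, _, operation in delivery_order(queue):
        balance_cents = operation(balance_cents)
    return balance_cents


def deposit(cents):
    return cents + 10_000        # add $100.00


def interest(cents):
    return cents * 101 // 100    # add 1%


# Both updates happen to carry the same clock value; the sender id breaks the tie.
from_customer = (1, 1, deposit)
from_admin = (1, 2, interest)

# The two replicas receive the messages in different orders.
san_francisco = apply_all(100_000, [from_customer, from_admin])
new_york = apply_all(100_000, [from_admin, from_customer])
print(san_francisco, new_york)
```

Because both replicas sort on the same key, they apply the deposit before the interest and end with the same balance, whatever the arrival order was.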

21 21 Global State Global State = local states of the processes + messages currently in transit. Why is knowing the Global State useful? –If local processes have stopped and no more msgs are in transit, then we have reached a stalled situation where no one can progress (ie, something needs to be done). Take a “distributed snapshot” –Reflects a consistent global state. –If a message has been received then it must have been sent from somewhere before! (otherwise something is wrong). –A global state can be represented by what is known as a cut. –Cuts can be consistent or inconsistent.

22 22 Cuts-Snapshots of Global State a)A consistent cut: one that does not include any message that was received but not sent! b)An inconsistent cut. What we want to define here is an algorithm that provides a consistent cut (snapshot) of the distributed system.
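The consistency condition can be expressed as a set check. In this sketch a cut is described by the ids of the messages whose send event and whose receive event fall inside it. The function name and the message ids are illustrative assumptions.

```python
def is_consistent_cut(sent_in_cut, received_in_cut) -> bool:
    """A cut is consistent if every message received inside it was also sent inside it."""
    return set(received_in_cut) <= set(sent_in_cut)


# m2 is still in transit: sent inside the cut, not yet received. That is fine.
print(is_consistent_cut(sent_in_cut={"m1", "m2"}, received_in_cut={"m1"}))

# m2 was received inside the cut, but its sending is not recorded: inconsistent.
print(is_consistent_cut(sent_in_cut={"m1"}, received_in_cut={"m1", "m2"}))
```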

23 23 An Algorithm for Deriving a Distributed Snapshot Assumptions: each process (in the DS) is connected to the others via unidirectional point-to-point communication channels (e.g., TCP connections). Any process may initiate the algorithm. The initiating process starts by recording its local state and then sends a MARKER along each outgoing channel (indicating that the receiver should participate in recording the global state).

24 24 Global State Algorithm When a process Q receives a marker through its incoming channel C –If it has not yet recorded its own local state, it does so and sends markers along its outgoing channels. –Otherwise, the marker on the incoming channel signals that the state of that channel must be recorded (this is done by forming the sequence of messages received by Q since the last time Q recorded its state and before it received the marker). A process has finished when it has received a marker along each of its incoming channels and processed all of them. At that point, its local state and the recorded messages in transit can be sent to a coordinator that assembles the global state.

25 25 Global State a)Organization of a process and channels for a distributed snapshot

26 26 Global State b)Process Q receives a marker for the first time and records its local state c)Q records all incoming messages d)Q receives a marker for its incoming channel and finishes recording the state of the incoming channel

27 27 Distributed Computation Termination Algorithm When a process Q finishes its part of the snapshot, it returns either a DONE or a CONTINUE message to its predecessor. A DONE message is returned when both conditions are true: –All of Q’s successors have returned DONE messages. –Q has not received any message(s) between the point it recorded its state and the point it had received the marker along each of its incoming channels. In all other cases, a CONTINUE message is sent to Q’s predecessor. If the original initiator P receives only DONE from its successors –It means there are NO messages in transit –Therefore, the computation is complete.

28 28 Election Algorithms Many distributed applications require that one site undertakes the role of the coordinator or master Problem: how to come up with such a master? Each process has a unique id –Network address + id in the local space.

29 29 The Bully Algorithm 7 was the coordinator and has just crashed. The bully election algorithm: a)Process 4 holds an election b)Processes 5 and 6 respond, telling 4 to stop c)Now 5 and 6 each hold an election. The process with the highest ID (or attribute) takes over.

30 30 Bully Algorithm d)Process 6 tells 5 to stop e)Process 6 wins and tells everyone If 7 wakes-up it can hold an election and “bully” all others (takes over).
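A compact sketch of the election outcome, assuming processes are identified by integers and that the set of live processes is known to the simulation. It models the cascade of elections as a recursion rather than as real message passing. The function name is hypothetical.

```python
def bully_election(initiator: int, alive: set) -> int:
    """Return the id of the process that ends up as coordinator."""
    higher = sorted(pid for pid in alive if pid > initiator)
    if not higher:
        return initiator          # nobody answered: the initiator wins
    # A higher process answered and takes over by holding its own election.
    return bully_election(higher[0], alive)


# Process 7 has crashed; process 4 notices and holds an election.
print(bully_election(4, alive={0, 1, 2, 3, 4, 5, 6}))

# Process 7 recovers and holds an election of its own.
print(bully_election(7, alive={0, 1, 2, 3, 4, 5, 6, 7}))
```

In the first call the election passes from 4 to 5 to 6, mirroring steps a) to e). In the second call the recovered process 7 takes over.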

31 31 A Ring Algorithm Election algorithm using a ring. Assumption: processes are physically or logically ordered. Two phases: 1. Start an ELECTION (this can be done by more than one site). Once the message has gone around the circle, determine the COORDINATOR (e.g., the largest id). 2. Circulate the name of the coordinator (ie, inform everyone).
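A small sketch of the ELECTION phase, assuming the ring is a list of integer ids in ring order and that the coordinator is the largest live id. Crashed processes are simply skipped. The function name is an assumption.

```python
def ring_election(ring: list, initiator: int, alive: set) -> int:
    """Pass an ELECTION message once around the ring and pick the coordinator."""
    start = ring.index(initiator)
    candidates = []
    for step in range(len(ring)):
        pid = ring[(start + step) % len(ring)]
        if pid in alive:
            candidates.append(pid)   # each live process appends its own id
    return max(candidates)           # the message is back at the initiator


ring = [0, 1, 2, 3, 4, 5, 6, 7]
coordinator = ring_election(ring, initiator=2, alive={0, 1, 2, 3, 4, 5, 6})
print(coordinator)
```

A second pass around the ring would then circulate the chosen coordinator's id to every process.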

32 32 Mutual Exclusion: A Centralized Algorithm a)Process 1 asks the coordinator for permission to enter a critical region. Permission is granted b)Process 2 then asks permission to enter the same critical region. The coordinator does not reply. c)When process 1 exits the critical region, it tells the coordinator, which then replies to 2
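The coordinator's behaviour in steps a) to c) can be sketched with a queue of deferred requests. The class and method names are illustrative assumptions, and "no reply" is modelled by returning False.

```python
from collections import deque


class Coordinator:
    """Grants a single critical region to one process at a time."""

    def __init__(self) -> None:
        self.holder = None
        self.waiting = deque()

    def request(self, pid: int) -> bool:
        """Return True if permission is granted now, False if the request is queued."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)     # no reply is sent; the requester blocks
        return False

    def release(self, pid: int):
        """Release the region and return the process that is granted it next, if any."""
        if pid != self.holder:
            raise ValueError("only the holder can release the region")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


coordinator = Coordinator()
first = coordinator.request(1)       # a) granted
second = coordinator.request(2)      # b) queued, no reply
granted_next = coordinator.release(1)  # c) coordinator now replies to 2
print(first, second, granted_next)
```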

33 33 A Distributed Algorithm[RicartAgra81] a)Two processes want to enter the same critical region at the same moment. b)Process 0 has the lowest timestamp, so it wins. c)When process 0 is done, it sends an OK also, so 2 can now enter the critical region.
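The rule a process applies when a request arrives can be sketched as below. Requests are modelled as (timestamp, pid) pairs so that ties are broken by process id. The state names and the timestamps 8 and 12 are illustrative assumptions.

```python
def on_request(state: str, own_request, incoming) -> str:
    """Decide how a process answers an incoming request.

    state:       "RELEASED", "WANTED", or "HELD" for the critical region
    own_request: (timestamp, pid) of this process's pending request, or None
    incoming:    (timestamp, pid) of the request just received
    """
    if state == "HELD":
        return "QUEUE"                       # reply only after leaving the region
    if state == "WANTED" and own_request < incoming:
        return "QUEUE"                       # own request is older, so it wins
    return "OK"


# Processes 0 and 2 both want the region at the same moment.
print(on_request("WANTED", (8, 0), (12, 2)))   # process 0 answers process 2
print(on_request("WANTED", (12, 2), (8, 0)))   # process 2 answers process 0
print(on_request("RELEASED", None, (8, 0)))    # uninvolved process 1 answers
```

Process 0 queues the request from 2 and sends its OK only when it leaves the region, which matches step c).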

34 34 A Token Ring Algorithm a)An unordered group of processes on a network. b)A logical ring constructed in software. Circulate a token – whoever has the token can get into its critical section

35 35 Comparison A comparison of three mutual exclusion algorithms. The infinity indicates that the token may circulate aimlessly in the network (if no one wants to make use of it).
Algorithm | Messages per entry/exit | Delay before entry (in message times) | Problems
Centralized | 3 | 2 | Coordinator crash
Distributed | 2(n – 1) | 2(n – 1) | Crash of any process
Token ring | 1 to ∞ | 0 to n – 1 | Lost token, process crash

36 36 The Transaction Model A transaction groups a number of statements into one entity that is executed ONLY in its logical entirety. A transaction may execute concurrently with others in the same (or a distributed) system. Examples of transactions (xactions) –Get Euro 100.00 from your own account –Deposit Euro 25.00 in account with number 356533 –Increase all accounts by 2.7% of their balances. The concept of a transaction is supported by a few fundamental constructs.

37 37 The Transaction Model Programming primitives for transactions.
Primitive | Description
BEGIN_TRANSACTION | Mark the start of a transaction
END_TRANSACTION | Terminate the transaction and try to commit
ABORT_TRANSACTION | Kill the transaction and restore the old values
READ | Read data from a file, a table, or otherwise
WRITE | Write data to a file, a table, or otherwise

38 38 The Transaction Model a)Transaction to reserve three flights commits b)Transaction aborts when third flight is unavailable
(a)
    BEGIN_TRANSACTION
        reserve WP -> JFK;
        reserve JFK -> Nairobi;
        reserve Nairobi -> Malindi;
    END_TRANSACTION
(b)
    BEGIN_TRANSACTION
        reserve WP -> JFK;
        reserve JFK -> Nairobi;
        reserve Nairobi -> Malindi full => ABORT_TRANSACTION

39 39 Xactions Properties ACID properties (also known as ACIDity). A: atomicity C: consistency I: isolation D: durability

40 40 Distributed Transactions a)A nested transaction: for each one a fork is used by the parent transaction What happens in case of failure? b)A distributed transaction Separate distributed algorithms are needed to handle management (locking) of data and commitment of the whole transaction.

41 41 Implementation of Transactions using Shadows (shadow blocks) a)The file index and disk blocks for a three-block file b)The situation after a transaction has modified block 0 and appended block 3 c)After committing

42 42 Write Ahead Log (WAL) a) A transaction b) – d) The log before each statement is executed
(a)
    x = 0;
    y = 0;
    BEGIN_TRANSACTION;
        x = x + 1;
        y = y + 2;
        x = y * y;
    END_TRANSACTION;
(b) Log: [x = 0/1]
(c) Log: [x = 0/1] [y = 0/2]
(d) Log: [x = 0/1] [y = 0/2] [x = 1/4]
If a xaction succeeds, it commits (point of no return). Otherwise, the WAL is used to roll back to a consistent database state.
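A minimal in-memory sketch of the idea, assuming each log record holds (variable, old value, new value) and that rollback undoes the records in reverse order. A real write-ahead log would be forced to stable storage first. All names here are hypothetical.

```python
def run_with_log(state: dict, updates) -> list:
    """Apply updates to state, writing a log record before each change."""
    log = []
    for variable, compute in updates:
        old_value = state[variable]
        new_value = compute(state)
        log.append((variable, old_value, new_value))  # log first ...
        state[variable] = new_value                    # ... then update
    return log


def rollback(state: dict, log: list) -> None:
    """Undo the logged changes, newest first, restoring the old values."""
    for variable, old_value, _ in reversed(log):
        state[variable] = old_value


state = {"x": 0, "y": 0}
transaction = [
    ("x", lambda s: s["x"] + 1),
    ("y", lambda s: s["y"] + 2),
    ("x", lambda s: s["y"] * s["y"]),
]
log = run_with_log(state, transaction)
print(log)        # the three records [x = 0/1], [y = 0/2], [x = 1/4]
rollback(state, log)
print(state)      # back to the values before the transaction
```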

43 43 Concurrency Control General organization of managers for handling transactions.

44 44 Concurrency Control General organization of managers for handling distributed transactions.

45 45 Principle of Serializability a) – c) Three transactions T1, T2, and T3 d) Possible schedules
(a) T1: BEGIN_TRANSACTION x = 0; x = x + 1; END_TRANSACTION
(b) T2: BEGIN_TRANSACTION x = 0; x = x + 2; END_TRANSACTION
(c) T3: BEGIN_TRANSACTION x = 0; x = x + 3; END_TRANSACTION
(d) Schedules (time runs from left to right):
Schedule 1 | x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3; | Legal
Schedule 2 | x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3; | Legal
Schedule 3 | x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3; | Illegal
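The three schedules can be checked mechanically. The sketch below uses a simplified test: a schedule counts as legal if its final value of x equals the result of some serial order of T1, T2, T3. Comparing final states only is an assumption made for this example, and the helper names are hypothetical.

```python
from itertools import permutations


def set_zero(x):
    return 0


def add(k):
    return lambda x: x + k


T1 = [set_zero, add(1)]
T2 = [set_zero, add(2)]
T3 = [set_zero, add(3)]


def run(operations):
    """Execute the operations in order and return the final value of x."""
    x = None
    for operation in operations:
        x = operation(x)
    return x


# Final values reachable by running the three transactions one after another.
serial_results = {
    run([op for transaction in order for op in transaction])
    for order in permutations([T1, T2, T3])
}


def is_legal(schedule) -> bool:
    return run(schedule) in serial_results


schedule1 = T1 + T2 + T3
schedule2 = [set_zero, set_zero, add(1), add(2), set_zero, add(3)]
schedule3 = [set_zero, set_zero, add(1), set_zero, add(2), add(3)]
print(is_legal(schedule1), is_legal(schedule2), is_legal(schedule3))
```

Schedule 3 ends with x = 5, a value that no serial execution can produce, which is why it is marked illegal.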

46 46 Two-Phase Locking Two-phase locking.

47 47 Strict Two-Phase Locking A transaction always reads committed values Avoids cascading aborts Distributed 2PL: –Schedulers on each machine take care of the locks (grant/release); –Operations are forwarded to local managers.

48 48 Time Stamp Ordering Each database item x has a read timestamp TS_R(x) and a write timestamp TS_W(x). TS_R(x) is set by the xaction that most recently read the item x. TS_W(x) is set by the xaction that most recently changed the value of x.
Timestamp Algorithm
Suppose that xaction Ti with TS(Ti) issues read(x)
–If TS(Ti) < TS_W(x), then the read would need a value of x that has already been overwritten by a later xaction; the read is rejected and Ti is rolled back.
–If TS(Ti) >= TS_W(x), then the read is executed and TS_R(x) = max{TS_R(x), TS(Ti)}
Suppose that xaction Ti with TS(Ti) issues write(x)
–If TS(Ti) < TS_R(x), then the write is rejected and Ti is rolled back.
–If TS(Ti) < TS_W(x), then the write is rejected and Ti is rolled back.
–Otherwise, the write operation is executed, and TS_W(x) = TS(Ti).
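The read and write rules translate almost directly into code. In this sketch a rejected operation returns False, and rolling the transaction back is left to the caller. The class and function names and the timestamps 150 and 160 are illustrative assumptions.

```python
class Item:
    """A database item with its read and write timestamps."""

    def __init__(self) -> None:
        self.read_ts = 0
        self.write_ts = 0


def read(item: Item, ts: int) -> bool:
    """Apply the read rule; return False if the transaction must be rolled back."""
    if ts < item.write_ts:
        return False                       # value was overwritten by a later xaction
    item.read_ts = max(item.read_ts, ts)
    return True


def write(item: Item, ts: int) -> bool:
    """Apply the write rule; return False if the transaction must be rolled back."""
    if ts < item.read_ts or ts < item.write_ts:
        return False                       # a later xaction already read or wrote x
    item.write_ts = ts
    return True


a = Item()
results = [
    read(a, 150),    # older transaction reads A
    read(a, 160),    # younger transaction reads A
    write(a, 160),   # younger transaction writes A
    write(a, 150),   # older transaction writes A too late
]
print(results, a.read_ts, a.write_ts)
```

The last write is rejected because the item was already read by a transaction with a larger timestamp.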

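49 49 Timestamp Ordering Example T1 has timestamp 150 and T2 has timestamp 160; initially RT = 0 and WT = 0 for A.
Operation | Effect on A
T1: read(A) | RT = 150
T2: read(A) | RT = 160
A := A + 1 |
T2: write(A) | WT = 160
T1: write(A) | T1 aborts!!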

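50 50 Timestamp Ordering T1, T2, and T3 have timestamps 200, 150, and 175; initially RT = 0 and WT = 0 for A, B, and C.
Operation | Effect
T1: read(B) | RT(B) = 200
T2: read(A) | RT(A) = 150
T3: read(C) | RT(C) = 175
T1: write(B) | WT(B) = 200
T1: write(A) | WT(A) = 200
T2: write(C) | ABORT T2
write(A) |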

51 51 Optimistic Concurrency Control Idea: let everything go ahead and then, before the transaction commits, check whether it conflicts with any other transaction. Structure of a transaction: read/write phase, validation phase, commit phase. When a transaction fails the validation test, it has to be rolled back.
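One possible form of the validation test is sketched below. It assumes backward validation: a transaction passes if nothing it read was written by a transaction that committed while it was running. The slides do not specify the exact test, so both the rule and the names are assumptions.

```python
def validate(read_set: set, committed_write_sets: list) -> bool:
    """Return True if the transaction may commit, False if it must be rolled back."""
    return all(read_set.isdisjoint(writes) for writes in committed_write_sets)


# The transaction read x and y; meanwhile another transaction committed a write to z.
print(validate({"x", "y"}, [{"z"}]))

# Here a concurrent transaction committed a write to x, so validation fails.
print(validate({"x", "y"}, [{"x"}]))
```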

