CS4513 Distributed Computer Systems Synchronization (Ch 5)


Introduction
Communication is not enough. Need cooperation → synchronization
Distributed synchronization needed for
  – transactions (bank account via ATM)
  – access to a shared resource (network printer)
  – ordering of events (network games where players have different ping times)

Outline
Intro (done)
Clock Synchronization (next)
Global Time and State
Election Algorithms
Mutual Exclusion
Distributed Transactions

Clock Synchronization
When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time
Consider make
  – make compares the time stamps of source and object files, which may have been set by different machines' clocks
Same holds when using an NFS mount
Can we set all clocks in a distributed system to have the same time?

Physical Clocks
“Exact” time was computed by astronomers
  – Take the interval between two consecutive “noons”, divide by 24*60*60 → mean solar second
But …
  – Earth is slowing! (a year had about 400 days 300 million years ago, versus 365 now)
  – Short-term fluctuations (magma core, and such)
  – Could average over many days, but still erroneous
Physicists take over (Jan 1, 1958)
  – Count transitions of the cesium-133 atom: 9,192,631,770 transitions == 1 mean solar second
  – 50 cesium-133 clocks averaged → International Atomic Time (TAI)
  – To stop the day from “shifting” (remember, Earth is slowing), translate TAI into Coordinated Universal Time (UTC) by inserting leap seconds
UTC is broadcast (shortwave radio pulses)

Clock Synchronization Algorithms
Not every machine has a UTC receiver
  – If one does, use it to keep the others synchronized
Computer timers go off H times/sec and increment a counter C
Ideally, if H = 60, 216,000 ticks per hour (dC/dt = 1)
But typical relative error is about 10^-5, so 215,998 to 216,002 ticks per hour
Specs can give you the maximum drift rate (ρ)
After Δt seconds, two clocks will be at most 2ρΔt apart
If want skew of at most δ, resynchronize at least every δ/(2ρ) seconds
Various algorithms (next)
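
The drift arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the drift rate is treated as a constant relative error.

```python
def ticks_per_hour(h, drift=0.0):
    """Ticks counted in one real hour by a timer with nominal rate h
    ticks/sec and a relative drift (e.g. 1e-5 fast, -1e-5 slow)."""
    return h * 3600 * (1 + drift)


def max_skew(rho, dt):
    """Worst-case difference between two clocks after dt seconds when each
    may drift by at most rho, in opposite directions."""
    return 2 * rho * dt


def resync_interval(delta, rho):
    """Longest time between resynchronizations that keeps the skew
    between any two clocks within delta seconds."""
    return delta / (2 * rho)


if __name__ == "__main__":
    print(ticks_per_hour(60))            # ideal timer
    print(ticks_per_hour(60, 1e-5))      # fast timer
    print(resync_interval(0.001, 1e-5))  # seconds between resyncs for 1 ms skew
```

With ρ = 10^-5 and a target skew of 1 ms, the clocks need to be resynchronized roughly every 50 seconds.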

Cristian's Algorithm
Every δ/(2ρ) seconds, ask the time server for the time
What are the problems?
Major
  – Client clock is fast (time must not run backward)
  – What to do?
Minor
  – Non-zero amount of time for the reply to get back to the sender
  – What to do?

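Cristian's Algorithm
Want one-way delay ≈ (T1 – T0)/2, where T0 is when the request is sent and T1 is when the reply arrives
Problems?
  – Delays differ in the two directions? Ignore.
  – Variance? Take the average of several measurements. Or the smallest.
  – Server interrupt-handling time I? Can subtract it, giving (T1 – T0 – I)/2, but need to determine I.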
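
A minimal sketch of the client-side estimate, assuming all times are in the same unit and that the measurements are supplied by the caller. The function names and the sample format are hypothetical, not part of any real time-service API.

```python
def cristian_time(t0, t1, server_time, interrupt_time=0.0):
    """Estimate the current time from one request/reply exchange.

    t0: client clock when the request was sent
    t1: client clock when the reply arrived
    server_time: time reported by the server
    interrupt_time: server handling time I, if known
    """
    one_way_delay = (t1 - t0 - interrupt_time) / 2
    return server_time + one_way_delay


def best_estimate(samples):
    """Given several (t0, t1, server_time) samples, use the one with the
    smallest round trip, since it suffered the least network delay."""
    t0, t1, server_time = min(samples, key=lambda s: s[1] - s[0])
    return cristian_time(t0, t1, server_time)
```

Picking the fastest exchange is one of the two options the slide mentions; averaging several round trips is the other.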

The Berkeley Algorithm
a) The time daemon asks all the other machines for their clock values
b) The machines answer
c) The time daemon tells everyone how to adjust their clock
Cristian’s and Berkeley’s are centralized. Problems?
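
Step c) amounts to averaging the reported clocks and sending each machine its own correction. A small sketch, assuming the daemon's clock is included in the list and ignoring network delay; the function name and the sample values are illustrative only.

```python
def berkeley_adjustments(clocks):
    """Return the correction each machine should apply so that all clocks
    move to the average. clocks[0] is taken to be the time daemon's own clock."""
    average = sum(clocks) / len(clocks)
    return [average - clock for clock in clocks]


if __name__ == "__main__":
    # Clock values in minutes past midnight: 3:00, 2:50 and 3:25.
    print(berkeley_adjustments([180, 170, 205]))
```

The daemon sends adjustments rather than an absolute time, so a fast machine can slow its clock gradually instead of jumping backward.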

Decentralized Algorithms
Periodically (every R seconds), each machine broadcasts its current time
Collect time samples for some interval (S)
Take the average and set the time
Can discard the m highest and m lowest samples so up to m faulty clocks don’t hurt
Can improve by correcting each sample with the propagation delay, estimated as (T1 – T0)/2
  – Need probes to obtain
Used by Network Time Protocol (NTP)
  – Worldwide accuracy of 1-50 msec
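
The averaging step with faulty clocks discarded could look like the sketch below. It assumes that "discard m" means dropping the m highest and m lowest samples; the function name is hypothetical.

```python
def fault_tolerant_average(samples, m):
    """Average the collected time samples after discarding the m highest
    and m lowest values, so up to m faulty clocks cannot skew the result."""
    if len(samples) <= 2 * m:
        raise ValueError("need more than 2*m samples")
    ordered = sorted(samples)
    kept = ordered[m:len(ordered) - m]
    return sum(kept) / len(kept)
```

For example, with samples 100, 101, 99, 500, 102 and m = 1, the outlier 500 is dropped along with 99 and the result is 101.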

Outline
Intro (done)
Clock Synchronization (done)
Global Time and State (next)
Election Algorithms
Mutual Exclusion
Distributed Transactions

Lamport Timestamps
a) Each process has its own clock, and the clocks run at different rates, so a message can appear to arrive before it was sent (impossible)
b) Lamport's algorithm corrects the clocks.
c) Can add machine ID to break ties
Often don’t need time, but ordering: a → b (a happens before b)
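
A compact sketch of a Lamport logical clock. The class and method names are chosen here for illustration; message transport is left to the caller.

```python
class LamportClock:
    """Logical clock for one process."""

    def __init__(self, process_id):
        self.process_id = process_id
        self.time = 0

    def tick(self):
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Advance the clock and return the timestamp to attach to a message."""
        return self.tick()

    def receive(self, message_time):
        """Correct the clock so the receive is later than the send."""
        self.time = max(self.time, message_time) + 1
        return self.time

    def stamp(self):
        """(time, process id) pair; the id breaks ties between equal times."""
        return (self.time, self.process_id)
```

If a sends a message stamped 3 to b whose clock still reads 0, b's clock jumps to 4, so the send always happens before the receive in timestamp order.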

Use Example: Totally-Ordered Multicasting
San Francisco customer adds $100, New York bank adds 1% interest (figure: San Francisco applies +$100 first, New York applies +1% first)
  – Starting from $1,000, San Francisco will have $1,111 and New York will have $1,110
Updating a replicated database this way leaves it in an inconsistent state.
Can use Lamport’s timestamps to totally order the updates
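
The inconsistency and its fix can be reproduced in a few lines. This sketch assumes a starting balance of $1,000 (implied by the slide's totals), whole-dollar arithmetic, and made-up timestamps and site ids; it only models the ordering step, not the multicast itself.

```python
def deposit_100(balance):
    return balance + 100


def add_interest(balance):
    return balance * 101 // 100  # 1% interest in whole dollars


def apply_in_arrival_order(balance, updates):
    """Apply updates in the order this replica happened to receive them."""
    for _timestamp, _site, operation in updates:
        balance = operation(balance)
    return balance


def apply_in_total_order(balance, updates):
    """Apply updates sorted by (Lamport timestamp, site id)."""
    ordered = sorted(updates, key=lambda u: (u[0], u[1]))
    return apply_in_arrival_order(balance, ordered)


# (Lamport timestamp, site id, operation); the values are illustrative.
deposit = (1, 1, deposit_100)
interest = (1, 2, add_interest)
san_francisco = [deposit, interest]  # arrival order at each replica
new_york = [interest, deposit]
```

Applied in arrival order the replicas end at 1111 and 1110; applied in total order both end at 1111.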

Consistent Global State
a) A consistent cut
b) An inconsistent cut
How do we ensure a cut is always consistent?
Need the state of the distributed system for, say, termination detection

Consistent Global State (2)
Processes are all connected. Any process can initiate by sending a state (marker) message (M)
a) Organization of a process and channels for a distributed snapshot

Consistent Global State (3)
b) Process Q receives M for the first time and records its local state. Sends M on all outgoing links
c) Q records all incoming messages
d) Q receives M on an incoming channel and finishes recording the state of that incoming channel
Can then send state to initiating process
System can still proceed normally
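
What process Q does in steps b) to d) can be sketched as a small state machine. This is a simplified, single-process view: the class name, the integer "state", and the channel names are all assumptions, and actually sending M on outgoing links is left to the caller.

```python
class SnapshotProcess:
    """One process taking part in a marker-based distributed snapshot."""

    def __init__(self, state, incoming):
        self.state = state              # application state (here: a running sum)
        self.incoming = list(incoming)  # names of the incoming channels
        self.recorded_state = None      # local state saved at the first marker
        self.channel_state = {}         # channel -> messages recorded in transit
        self.closed = set()             # channels whose marker has arrived

    def on_message(self, channel, value):
        # A message that arrives after the local state was recorded, but
        # before that channel's marker, is part of the channel's state.
        if self.recorded_state is not None and channel not in self.closed:
            self.channel_state[channel].append(value)
        self.state += value             # normal processing continues

    def on_marker(self, channel):
        """Handle M; return True if M must now be sent on all outgoing links."""
        first = self.recorded_state is None
        if first:
            self.recorded_state = self.state
            self.channel_state = {c: [] for c in self.incoming}
        self.closed.add(channel)        # stop recording this channel
        return first

    def done(self):
        """True once M has arrived on every incoming channel."""
        return self.recorded_state is not None and self.closed == set(self.incoming)
```

Once `done()` is true, `recorded_state` and `channel_state` together are what Q would report to the initiating process.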

Outline
Intro (done)
Clock Synchronization (done)
Global Time and State (done)
Election Algorithms (next)
Mutual Exclusion
Distributed Transactions

Election Algorithms
Often need one process as a coordinator
All processes in a distributed system may be equal
  – Assume each has some “ID” that is a number
Need a way to “elect” the process with the highest number as leader

The Bully Algorithm (1)
a) Process 4 notices 7 is down. Process 4 holds an election
b) Processes 5 and 6 respond, telling 4 to stop
c) Now 5 and 6 each hold an election

The Bully Algorithm (2)
d) Process 6 tells process 5 to stop
e) Process 6 wins and tells everyone
Eventually “biggest” (bully) wins
If process 7 comes back up, it starts an election again
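
The election rounds can be traced with a small simulation. This is a sketch only: it assumes reliable messages, that the initiator is alive, and that the set of live processes does not change during the election. The function name and log format are made up.

```python
def bully_election(initiator, alive):
    """Simulate a bully election started by `initiator`.

    alive: set of ids of processes that are currently up.
    Returns (coordinator id, log of what happened).
    """
    log = []
    queue = [initiator]
    seen = set()
    coordinator = None
    while queue:
        p = queue.pop(0)
        if p in seen:
            continue  # this process is already holding an election
        seen.add(p)
        # p sends ELECTION to every higher id; only live processes answer.
        responders = sorted(q for q in alive if q > p)
        if responders:
            log.append((p, "stopped by", responders))
            queue.extend(responders)  # each responder holds its own election
        else:
            coordinator = p
            log.append((p, "announces COORDINATOR to", sorted(alive - {p})))
    return coordinator, log


if __name__ == "__main__":
    winner, events = bully_election(4, {0, 1, 2, 3, 4, 5, 6})
    print(winner)
    for event in events:
        print(event)
```

With process 7 down and process 4 starting the election, the trace follows the slides: 5 and 6 stop 4, 6 stops 5, and 6 announces itself.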

A Ring Algorithm
Coordinator down, start ELECTION
  – Send message around the ring; each process adds its ID
  – Once around, change to COORDINATOR (biggest ID)
Even if two ELECTIONs are started at once, everyone will pick the same leader
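
A sketch of one trip of the ELECTION message around the ring, assuming that dead processes are simply skipped and that the initiator is alive. The function name and ring representation are hypothetical.

```python
def ring_election(ring, alive, initiator):
    """Circulate an ELECTION message once around the ring.

    ring: process ids in ring order
    alive: set of ids that are up
    Returns (new coordinator, ids collected in the message).
    """
    n = len(ring)
    collected = [initiator]
    i = (ring.index(initiator) + 1) % n
    while ring[i] != initiator:
        if ring[i] in alive:        # dead successors are skipped
            collected.append(ring[i])
        i = (i + 1) % n
    return max(collected), collected
```

Two simultaneous elections collect the same set of live ids, just starting at different points, so both pick the same maximum.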

Outline
Intro (done)
Clock Synchronization (done)
Global Time and State (done)
Election Algorithms (done)
Mutual Exclusion (next)
Distributed Transactions

Mutual Exclusion: A Centralized Algorithm
a) Process 1 asks the coordinator for permission to enter a critical region. Permission is granted
b) Process 2 then asks permission to enter the same critical region. The coordinator does not reply. (Or, can say “denied”)
c) When process 1 exits the critical region, it tells the coordinator, which then replies to 2.
But centralized: single point of failure
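
The coordinator's bookkeeping is just a holder and a FIFO queue. A minimal sketch with hypothetical names; network messages are modeled as method calls and return values.

```python
from collections import deque


class Coordinator:
    """Grants a single critical region to one process at a time."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()

    def request(self, pid):
        """Return True if permission is granted now; otherwise queue the
        request (no reply is sent, so the requester stays blocked)."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Called when pid exits the region. Returns the process that is
        granted next, or None if nobody is waiting."""
        if pid != self.holder:
            raise ValueError("only the holder can release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder
```

Each entry costs three messages (request, grant, release), which matches the count usually quoted for the centralized scheme.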

A Distributed Algorithm
a) Processes 0 and 2 want to enter the same critical region at the same moment.
b) Process 1 doesn’t want to enter, so it says “OK”. Process 0 has the lowest timestamp, so it wins and queues up the “OK” for 2.
c) When process 0 is done, it sends an OK to 2, so 2 can now enter the critical region.
(Again, can modify to say “denied”)
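
The rule each process applies when a request arrives (this scheme is commonly attributed to Ricart and Agrawala) fits in one function. The state names and the (timestamp, process id) tuples are assumptions for this sketch, and the timestamps 8 and 12 in the test are illustrative values only.

```python
def on_request(my_state, my_stamp, their_stamp):
    """Decide how to answer an incoming request for the critical region.

    my_state: "RELEASED" (not interested), "WANTED", or "HELD"
    my_stamp / their_stamp: (Lamport timestamp, process id) of each request
    Returns "OK" to reply at once, or "QUEUE" to defer the reply.
    """
    if my_state == "HELD":
        return "QUEUE"
    if my_state == "WANTED" and my_stamp < their_stamp:
        return "QUEUE"  # my request is older, so I go first
    return "OK"
```

A process enters the region once it has collected an OK from every other process, and sends all queued OKs when it leaves.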

A Token Ring Algorithm
a) An unordered group of processes on a network.
b) A logical ring constructed in software.
Process must have the token to enter.
If it doesn’t want to enter, pass the token along.
If a host is down, recover the ring.
If the token is lost, regenerate the token.
What if a process stays in the critical section for a long time?

Mutual Exclusion Algorithm Comparison
Algorithm   | Messages per entry/exit | Delay before entry (in message times) | Problems
Centralized | 3                       | 2                                     | Coordinator crash
Distributed | 2(n – 1)                | 2(n – 1)                              | Process crash
Token ring  | 1 to ∞                  | 0 to n – 1                            | Lost token, process crash
Centralized most efficient
Token ring efficient when many want to use critical region

Outline
Intro (done)
Clock Synchronization (done)
Global Time and State (done)
Election Algorithms (done)
Mutual Exclusion (done)
Distributed Transactions (next)

The Transaction Model
Gives you mutual exclusion plus…
Consider using a PC (Quicken) to:
  1) Withdraw $a from account 1
  2) Deposit $a to account 2
If interrupted between 1) and 2), $a is gone!
Multiple items in a single, atomic action
  – It all happens, or none of it does
  – If process backs out, it is as if it never started

Transaction Primitives
These primitives may be system calls, libraries or statements in a language (Structured Query Language, or SQL)
Primitive         | Description
BEGIN_TRANSACTION | Mark the start of a transaction
END_TRANSACTION   | Terminate the transaction and try to commit
ABORT_TRANSACTION | Kill the transaction and restore the old values
READ              | Read data from a file, a table, or otherwise
WRITE             | Write data to a file, a table, or otherwise

Example: Reserving Flight from White Plains to Nairobi
a) Transaction to reserve three flights commits
b) Transaction aborts when third flight is unavailable
(a)
BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi;
END_TRANSACTION
(b)
BEGIN_TRANSACTION
    reserve WP -> JFK;
    reserve JFK -> Nairobi;
    reserve Nairobi -> Malindi full => ABORT_TRANSACTION
The “all-or-nothing” is one property. Others…

Transaction Properties
1) Atomic – All or nothing; others don’t see intermediate results, either
2) Consistent – System invariants are not violated (ex: no money lost after operations)
3) Isolated – Operations can happen in parallel, but as if they were done serially
4) Durable – Once a transaction commits, its results are permanent and the system moves forward (Ch 7, won’t cover more)
ACID

Classification of Transactions
Flat Transactions
  – Limited
  – Example: what if you want to keep the first part of a flight reservation? If you abort and then restart, those seats might be gone.
  – Example: what if you want to move a Web page? All links pointing to it would need to be updated, which could lock resources for a long time
Also Distributed and Nested Transactions

Nested vs. Distributed Transactions
Nested transaction gives you a hierarchy
  – Can distribute (example: WP → JFK, JFK → Nairobi)
  – But may require multiple databases
Distributed transaction is “flat” but across distributed data (example: JFK and Nairobi databases)

Outline
Intro (done)
Clock Synchronization (done)
Global Time and State (done)
Election Algorithms (done)
Mutual Exclusion (done)
Distributed Transactions
  – Overview (done)
  – Implementation (next)

Private Workspace (1)
File system with transaction across multiple files
  – Normally, updates are seen immediately + no way to undo
Private workspace → copy files
Only update the public workspace once done
If the transaction aborts, remove the private copy.
But copying can be expensive!
  – How to fix?

Private Workspace (2)
a) Original file index (descriptor) and disk blocks
b) Copy descriptor only. Copy blocks only when written. Modified block 0 and appended block 3
c) Replace original file (new blocks plus descriptor) after commit
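
The copy-on-write idea can be illustrated with a list standing in for the file index. This is a toy model under stated assumptions: block contents are immutable strings, the "descriptor" is a Python list of references to them, and the class name is invented.

```python
class PrivateWorkspace:
    """Private view of a file: the index is copied, blocks are shared
    until written."""

    def __init__(self, public_file):
        self.public_file = public_file
        # Copy the descriptor only: a new list that still refers to the
        # same block objects as the public file.
        self.index = list(public_file)

    def read(self, block_no):
        return self.index[block_no]

    def write(self, block_no, data):
        """Point the private index at a new block; the public block is
        left untouched. Writing one past the end appends a block."""
        if block_no == len(self.index):
            self.index.append(data)
        else:
            self.index[block_no] = data

    def commit(self):
        """Replace the public file's index with the private one."""
        self.public_file[:] = self.index
```

Aborting is simply dropping the workspace object: the public file was never modified.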

Writeahead Log
Don’t make copies. Instead, record action plus old and new values.
a) A transaction
b) – d) The log before each statement is executed
If transaction commits, nothing to do
If transaction is aborted, use log to roll back
(a)
x = 0;
y = 0;
BEGIN_TRANSACTION;
    x = x + 1;
    y = y + 2;
    x = y * y;
END_TRANSACTION;
(b) Log: [x = 0 / 1]
(c) Log: [x = 0 / 1] [y = 0 / 2]
(d) Log: [x = 0 / 1] [y = 0 / 2] [x = 1 / 4]
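
The x/y example can be replayed with an in-memory store that logs before it writes. A sketch only: the class name is made up, and a real log would be forced to stable storage before each update.

```python
class LoggedStore:
    """Variables updated in place, with an undo log of (name, old, new)."""

    def __init__(self, **values):
        self.values = dict(values)
        self.log = []

    def write(self, name, new_value):
        # Record the old and new values BEFORE changing anything.
        self.log.append((name, self.values[name], new_value))
        self.values[name] = new_value

    def commit(self):
        """Nothing to undo; the updates are already in place."""
        self.log.clear()

    def abort(self):
        """Roll back by undoing the log from newest to oldest entry."""
        while self.log:
            name, old_value, _new_value = self.log.pop()
            self.values[name] = old_value


if __name__ == "__main__":
    store = LoggedStore(x=0, y=0)
    store.write("x", store.values["x"] + 1)
    store.write("y", store.values["y"] + 2)
    store.write("x", store.values["y"] * store.values["y"])
    print(store.log)     # the three log records of the slide
    store.abort()
    print(store.values)  # back to x = 0, y = 0
```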

Concurrency Control (1)
General organization of managers for handling transactions.
Allow parallel execution
Figure annotations: (ensure atomic), (ensure serial)

Concurrency Control (2) General organization of managers for handling distributed transactions.

Serializability
a) – c) Three transactions T1, T2, and T3. Answer could be 1, 2 or 3. All valid.
(a) BEGIN_TRANSACTION x = 0; x = x + 1; END_TRANSACTION
(b) BEGIN_TRANSACTION x = 0; x = x + 2; END_TRANSACTION
(c) BEGIN_TRANSACTION x = 0; x = x + 3; END_TRANSACTION
Schedule 1 | x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3; | Legal
Schedule 2 | x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3; | Legal
Schedule 3 | x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3; | Illegal
Allow parallel execution, but the end result must be as if serial
If in parallel, only some schedules are possible
  – Schedule 2 is interleaved but still serializable
Concurrency controller needs to manage this
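
The three schedules can be checked mechanically. The sketch below uses a deliberately simple test, an assumption of this example: a schedule counts as legal if its final value of x equals the result of some serial order of the transactions. The encoding of operations as ("set", v) and ("add", v) pairs is invented here.

```python
from itertools import permutations

# Each transaction: x = 0; x = x + k
T1 = [("set", 0), ("add", 1)]
T2 = [("set", 0), ("add", 2)]
T3 = [("set", 0), ("add", 3)]


def run(operations):
    """Execute the operations in order and return the final x."""
    x = None
    for kind, value in operations:
        x = value if kind == "set" else x + value
    return x


def serial_results(transactions):
    """Final values of x over every serial order of the transactions."""
    return {
        run([op for txn in order for op in txn])
        for order in permutations(transactions)
    }


def is_legal(schedule, transactions):
    return run(schedule) in serial_results(transactions)


schedule_1 = T1 + T2 + T3
schedule_2 = [T1[0], T2[0], T1[1], T2[1], T3[0], T3[1]]
schedule_3 = [T1[0], T2[0], T1[1], T3[0], T2[1], T3[1]]
```

Schedules 1 and 2 both end with x = 3, while schedule 3 ends with x = 5, a value no serial order can produce.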

Two-Phase Locking
Acquire all needed locks first (ex: in previous example), perform the update, then release; no lock is acquired after the first release
Can lead to deadlocks (use OS techniques to resolve)
Can prove: if used by all transactions, then all schedules will be serializable
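
The two-phase rule itself (a growing phase followed by a shrinking phase) is easy to enforce per transaction. This sketch tracks only the phases; it assumes a separate lock manager, not shown, handles blocking and conflicts between transactions. The class name is hypothetical.

```python
class TwoPhaseTransaction:
    """Tracks the locks of one transaction and enforces the two phases."""

    def __init__(self):
        self.held = set()
        self.shrinking = False  # becomes True at the first release

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError("two-phase rule: no acquire after a release")
        self.held.add(item)

    def release(self, item):
        self.shrinking = True
        self.held.remove(item)
```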

Timestamp Ordering
Pessimistic
  – Every read and write gets a timestamp (unique, using Lamport’s algorithm)
  – If conflict, abort sub-operation and retry
Optimistic
  – Allow all operations, since the conflict rate is low
  – At end, if conflict, roll back
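
A sketch of the pessimistic check for a single data item, following the basic timestamp-ordering rule: an operation is rejected if a transaction with a later timestamp has already read or written the item. The class and exception names are assumptions, and retrying after an abort is left to the caller.

```python
class Abort(Exception):
    """Raised when an operation arrives too late and must be retried."""


class TimestampedItem:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # largest timestamp that has read the item
        self.write_ts = 0  # largest timestamp that has written the item

    def read(self, ts):
        if ts < self.write_ts:
            raise Abort("a later transaction already wrote this item")
        self.read_ts = max(self.read_ts, ts)
        return self.value

    def write(self, ts, value):
        if ts < self.read_ts or ts < self.write_ts:
            raise Abort("a later transaction already used this item")
        self.write_ts = ts
        self.value = value
```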