

Reliable Distributed Systems Real Time Systems

Topics for this lecture
- Adding clocks to distributed systems (also to non-distributed ones)
- We’ll just touch on real-time operating systems and scheduling; an active area until recently, but recent machines are so fast that the importance of the topic is now reduced…
- Using time in broadcast protocols
- Comparison of CASD with gossip

Many systems need real-time Air traffic control talks about “when” planes will be at various locations. Doctors talk about how patients responded after a drug was given, or change therapy after some amount of time. Process control software runs factory floors by coordinating what machine tools do, and when.

Many systems need real-time Video and multi-media systems need isochronous communication protocols that coordinate video, voice, and other data sources Telecommunications systems must guarantee real-time response despite failures, for example when switching telephone calls

Real time in real systems These real systems combine the need for logical consistency and coordination (for control) with the need for physical coordination and timely behavior Issue becomes one of integrating real-time tools and support with logical tools and support such as we have already considered Not real-time or logical time, but want both for different aspects of a single application!

Clock Synchronization and Synchronous Systems Up to now, we restricted attention to logical notions of time: “a happens before b” But recall that we also touched on real clocks Two views of real-time: Supporting clocks that programs can consult as needed Making direct use of real-time inside protocols

Clock Synchronization Topic was extremely active during early 1980’s Best known algorithms include the one in OSF/1 UNIX (based on one by Marzullo at Xerox), the “optimally accurate” algorithm of Srikanth and Toueg, and the “probabilistic” algorithm of Cristian Introduction of Global Positioning System is eliminating the need for this form of synchronization

Clock synchronization protocols Would like to think of network as having a single “clock” that all processes can consult But need to implement this using many “local” clocks that can: Initially be set incorrectly Drift over the course of time Clock synchronization tries to keep clocks close to true real-time and minimize their tendency to drift
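The problem above can be made concrete with a tiny clock model. This is an illustrative sketch of my own, not something from the lecture; `offset` and `drift_rate` are assumed parameter names:

```python
def local_clock(true_time, offset, drift_rate):
    """Simple hardware-clock model: the reading starts `offset` seconds off
    true time and gains `drift_rate` extra seconds per true second."""
    return true_time * (1 + drift_rate) + offset

# A clock that started 0.5 s fast and gains 100 microseconds per second
# has wandered noticeably after an hour:
print(local_clock(3600, 0.5, 1e-4) - 3600)  # about 0.86 s of error
```

Even a tiny drift rate accumulates without bound, which is why periodic resynchronization is needed.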

Precision and Accuracy Accuracy measures local clocks relative to an external source of accurate real-time; accurate clocks are close to real-time. Precision measures local clocks relative to each other; precise clocks are close to each other. Skew is the numerical limit on the maximum distance that correct clocks can drift apart. E.g., one could say “the maximum skew is 1 sec” for some system.
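These definitions are simple enough to state as code; the following is a minimal illustrative sketch (readings in milliseconds; all names are mine, not from the lecture):

```python
def accuracy(readings, true_time):
    """Worst-case distance of any local clock from the external reference."""
    return max(abs(r - true_time) for r in readings)

def precision(readings):
    """Worst-case distance between any pair of local clocks."""
    return max(readings) - min(readings)

# Three clocks sampled (in ms) when the reference time is 1000:
clocks = [998, 1001, 1003]
print(accuracy(clocks, 1000))  # 3
print(precision(clocks))       # 5
```

Note that a system can be precise without being accurate: the clocks in the example would still have precision 5 even if the reference time were wildly different.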

How clock synchronization used to work Periodically, all processors would run a clock sync protocol, for example by broadcasting the reading from their clocks Each receives a set of values from the others (sets may differ due to faults!) Algorithm would pick a synchronized value from the set; analysis used to prove properties of clocks
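A typical way to “pick a synchronized value from the set” is fault-tolerant averaging: discard the f most extreme readings on each side and average the rest. This is a sketch of that general idea, not the specific OSF/1, Srikanth-Toueg, or Cristian algorithm:

```python
def ft_average(readings, f):
    """Tolerate up to f faulty clocks: drop the f smallest and f largest
    readings (which may come from faulty clocks), then average the
    survivors. Requires len(readings) > 2*f."""
    if len(readings) <= 2 * f:
        raise ValueError("need more than 2f readings to tolerate f faults")
    s = sorted(readings)
    survivors = s[f:len(s) - f]
    return sum(survivors) / len(survivors)

# One wildly faulty clock (500) is discarded, along with the lowest reading:
print(ft_average([100, 101, 99, 500], f=1))  # 100.5
```

The requirement of more than 2f readings is what makes the analysis in such protocols work: a faulty clock can shift the result only within the range spanned by correct clocks.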

Global Positioning System Satellite system launched by military in early 1990’s, became public and inexpensive Can think of satellites as broadcasting the time Small radio receiver picks up signals from three satellites and triangulates to determine position Same computation also yields extremely accurate clock (accurate to a few milliseconds)

Clock synchronization with GPS Put two GPS receivers (or more) on a network Periodically, receivers broadcast the “true time” Other machines only need to adjust their clocks to overcome propagation delays for clock sync messages through the network! Well matched to the a-posteriori clock synchronization approach

Basic idea The GPS receiver broadcasts “the time is now 10:00” on a broadcast network (Ethernet). Receivers note the local time at which they receive the message, e.g. [10:01, 9:58, …], and reply with those values. The GPS receiver subtracts the median value. The differences, e.g. [1, -2, …], now give the drift of each destination’s clock relative to the median clock.

A-posteriori method, adjustment stage Now we distribute these drift values back to the processors, which compensate for the rate of drift over the time period to the next synchronization Can show that this yields clocks that are optimal both in accuracy and precision A processor with a broken clock has a good chance of discovering it during synchronization
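The measurement stage of this a-posteriori scheme can be sketched as follows; subtracting the median makes it robust to a few faulty clocks (times are plain numbers, and the function name is mine):

```python
import statistics

def drift_estimates(local_receive_times):
    """Each machine noted its local clock when the same GPS broadcast
    arrived. Offsets from the median receiver estimate each clock's drift
    relative to the median clock; the median also cancels the common part
    of the propagation delay."""
    med = statistics.median(local_receive_times)
    return [t - med for t in local_receive_times]

# Matching the slide's example, with replies of 10:01, 9:58, 10:00
# expressed in minutes:
print(drift_estimates([601, 598, 600]))  # [1, -2, 0]
```

In the adjustment stage, each processor would then use its own entry of this list to compensate its rate until the next synchronization round.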

Using real-time One option is to use a real-time operating system, clock synchronization algorithm, and to design protocols that exploit time Example: MARS system uses pairs of redundant processors to perform actions fault-tolerantly and meet deadlines. Has been applied in process control systems. (Another example: Delta-4)

Features of real-time operating systems
- The O/S itself tends to be rather simple: big black boxes behave unpredictably
- They are structured in terms of “tasks”. A task is more or less a thread, but typically comes with an expected runtime, deadline, priority, “interruptability”, etc.
- The user decomposes the application into task-like component parts and then expresses goals in a form that the RTOS can handle
- Widely used on things like medical devices
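A task in this sense can be modeled as a record of those attributes, with the scheduler reduced to a selection rule. The sketch below uses earliest-deadline-first, one common RTOS policy; the slides do not name a specific scheduler, and all names here are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    runtime: float   # expected execution time
    deadline: float  # absolute deadline
    priority: int

def next_task(ready):
    """Earliest-deadline-first: run the ready task whose deadline is soonest."""
    return min(ready, key=lambda t: t.deadline)

ready = [Task("logger", 1.0, 50.0, 1), Task("control", 2.0, 10.0, 5)]
print(next_task(ready).name)  # control
```

The point of declaring runtimes and deadlines up front is that the RTOS can make such choices predictably, rather than leaving latency to the whims of a general-purpose scheduler.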

RTOS can be beneficial Lockheed Martin ATL timed CORBA method invocations. Variation in response time was huge with a normal Linux OS; when using a Timesys RTOS, the variability is eliminated!

Real-time broadcast protocols Can also implement broadcast protocols that make direct use of temporal information Examples: Broadcast that is delivered at same time by all correct processes (plus or minus the clock skew) Distributed shared memory that is updated within a known maximum delay Group of processes that can perform periodic actions

A real-time broadcast
[Figure: timeline for processes p0…p5, marked at times t, t+a, and t+b.] The message is sent at time t by p0. Later, both p0 and p1 fail, but the message is still delivered atomically, after a bounded delay, and within a bounded interval of time, at the non-faulty processes.

A real-time distributed shared memory
[Figure: timeline for processes p0…p5, marked at times t, t+a, and t+b; p0 executes “set x=3” and every other process later observes x=3.] At time t, p0 updates a variable in a distributed shared memory. All correct processes observe the new value after a bounded delay, and within a bounded interval of time.

Periodic process group: Marzullo
[Figure: timeline for processes p0…p5 taking actions in periodic rounds.] Periodically, all members of a group take some action. The idea is to accomplish this with minimal communication.

The CASD protocols Also known as the “Δ-T” protocols. Developed by Cristian and others at IBM, it was intended for use in the (ultimately failed) FAA project. The goal is to implement a timed atomic broadcast tolerant of Byzantine failures.

Basic idea of the CASD protocols Assumes use of clock synchronization. The sender timestamps the message. Recipients forward the message using a flooding technique (each echoes the message to the others). Wait until all correct processors have a copy, then deliver in unison (up to the limits of the clock skew).

CASD picture
[Figure: timeline for processes p0…p5, marked at times t, t+a, and t+b.] p0 and p1 fail; messages are lost when echoed by p2 and p3.

Idea of CASD Assume known limits on the number of processes that fail during the protocol and on the number of messages lost. Using these and the temporal assumptions, deduce the worst-case scenario. We now know that if we wait long enough, either all correct processes will have the message or none will. Then schedule delivery using the original time plus a delay computed from the worst-case assumptions.
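That delivery rule can be sketched as follows, assuming clocks synchronized to within skew `eps` and a worst-case end-to-end bound `Delta` derived from the failure and loss limits; the parameter names are mine, not from the CASD papers:

```python
# Hedged sketch of a CASD-style delivery rule. A message carries the
# sender's timestamp `ts`; every correct process delivers it at the same
# synchronized-clock instant ts + Delta, and discards copies that arrive
# too late to have come from a timely, correct run.

def accept(ts, local_now, Delta, eps):
    """A copy arriving after ts + Delta + eps cannot be delivered in unison."""
    return local_now <= ts + Delta + eps

def delivery_time(ts, Delta):
    """All correct processes deliver at the same synchronized-clock time."""
    return ts + Delta

print(accept(ts=100, local_now=105, Delta=10, eps=1))  # True
print(accept(ts=100, local_now=112, Delta=10, eps=1))  # False
print(delivery_time(ts=100, Delta=10))                 # 110
```

Notice that the sender pays the full worst-case `Delta` even when the message actually arrived almost instantly; that conservatism is exactly the problem the following slides take up.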

The problems with CASD In the usual case, nothing goes wrong, hence the delay can be very conservative. Even if things do go wrong, is it right to assume that if a message needs between 0 and δ ms to make one hop, it needs [0, n*δ] to make n hops? How realistic is it to bound the number of failures expected during a run?

CASD in a more typical run
[Figure: timeline for processes p0…p5, marked at times t, t+a, and t+b.]

... leading developers to employ more aggressive parameter settings
[Figure: timeline for processes p0…p5, marked at times t, t+a, and t+b.]

CASD with over-aggressive parameter settings starts to “malfunction”
[Figure: timeline for processes p0…p5; all processes look “incorrect” (red) from time to time.]

CASD “mile high” When run “slowly”, the protocol is like a real-time version of abcast. When run “quickly”, the protocol starts to give probabilistic behavior: if I am correct (and there is no way to know!) then I am guaranteed the properties of the protocol, but if not, I may deliver the wrong messages.

How to repair CASD in this case? Gopal and Toueg developed an extension, but it slows the basic CASD protocol down, so it wouldn’t be useful in the case where we want speed and also real-time guarantees Can argue that the best we can hope to do is to superimpose a process group mechanism over CASD (Verissimo and Almeida are looking at this).

Why worry? CASD can be used to implement a distributed shared memory (“delta-common storage”) But when this is done, the memory consistency properties will be those of the CASD protocol itself If CASD protocol delivers different sets of messages to different processes, memory will become inconsistent

Why worry? In fact, we have seen that CASD can do just this, if the parameters are set aggressively Moreover, the problem is not detectable either by “technically faulty” processes or “correct” ones Thus, DSM can become inconsistent and we lack any obvious way to get it back into a consistent state

Using CASD in real environments We would probably need to set the parameters close to the range where CASD can malfunction, but only rarely does. Hence we would need to add a self-stabilization algorithm to restore a consistent state of memory after it becomes inconsistent. This problem has not been treated in the papers on CASD; the pbcast protocol does address it.

Using CASD in real environments Once we build the CASD mechanism how would we use it? Could implement a shared memory Or could use it to implement a real-time state machine replication scheme for processes US air traffic project adopted latter approach But stumbled on many complexities…

Using CASD in real environments
[Figure: a pipelined computation and the transformed computation.]

Issues? It could be quite slow if we use conservative parameter settings. But with aggressive settings, either process could be deemed “faulty” by the protocol. If so, it might become inconsistent: the protocol guarantees don’t apply, and there is no obvious mechanism to reconcile the states within the pair. The method was used by IBM in a failed effort to build a new US Air Traffic Control system.

Similar to MARS A research system done in Austria by Hermann Kopetz. The basic idea is that everything happens twice; the receiver can suppress duplicates but is guaranteed at least one copy of each message. Used to overcome faults without loss of real-time guarantees. MARS is used in the BMW, but comes close to a hardware fault-tolerance scheme.

Many more issues…. What if a process starts to lag? What if applications aren’t strictly deterministic? How should such a system be managed? How can a process be restarted? If not, the system eventually shuts down! How to measure the timing behavior of components, including the network

FAA experience? It became too hard to work all of this out. Then they tried a transactional approach, which also had limited success. Finally, they gave up! $6B was lost… a major fiasco, and ATC is still a mess.

Totem approach Start with the extended virtual synchrony model. Analysis is used to prove real-time delivery properties. This enables them to guarantee delivery within a bound of some milliseconds on a standard broadcast LAN. Contrast with our 85us latency for Horus!

Tradeoffs between consistency, time Notice that as we push CASD to run faster we lose consistency Contrast with our virtual synchrony protocols: they run as fast as they can (often, much faster than CASD when it is not malfunctioning) but don’t guarantee real-time delivery

A puzzle Suppose that experiments show that 99.99% of Horus or Ensemble messages are delivered in 85us +/- 10us for some known maximum load. We also have a theory that shows that 100% of Totem messages are delivered in about 150ms under reasonable assumptions. And we have the CASD protocols, which work well with Δ around 250ms for similar LANs.

A puzzle Question: is there really a difference between these forms of guarantees? We saw that CASD is ultimately probabilistic. Since Totem makes assumptions, it is also, ultimately, probabilistic But the experimentally observed behavior of Horus is also probabilistic... so why isn’t Horus a “real-time” system?

What does real-time mean? To the real-time community? A system that provably achieves its deadlines under stated assumptions Often achieved using delays! To the pragmatic community? The system is fast enough to accomplish our goals Experimentally, it never seems to lag behind or screw up

Some real-time issues Scheduling: given goals, how should tasks be scheduled? Periodic, aperiodic, and completely ad-hoc tasks. What should we do if a system misses its goals? How can we make components highly predictable in terms of their real-time performance profile?
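For strictly periodic tasks, one classical scheduling answer is the Liu-Layland utilization bound for rate-monotonic priorities; the sketch below implements that sufficient (not exact) test:

```python
def rm_schedulable(tasks):
    """Liu & Layland sufficient test: periodic tasks, given as
    (compute_time, period) pairs, are schedulable under rate-monotonic
    priorities if total utilization is at most n*(2^(1/n) - 1)."""
    n = len(tasks)
    utilization = sum(c / t for c, t in tasks)
    return utilization <= n * (2 ** (1 / n) - 1)

# 1 ms of work every 4 ms plus 1 ms every 5 ms: U = 0.45, under the
# two-task bound of about 0.828, so it fits:
print(rm_schedulable([(1, 4), (1, 5)]))  # True
# 3 ms every 4 ms plus 2 ms every 5 ms: U = 1.15, impossible:
print(rm_schedulable([(3, 4), (2, 5)]))  # False
```

Aperiodic and ad-hoc tasks are exactly what this test cannot cover, which is why they are listed as open issues on the slide.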

Real-time today A slow transition. Older, special-purpose operating systems and components were carefully hand-crafted for predictability. Newer systems are simply so fast (and can be dedicated to the task) that what used to be hard is now easy. In effect, we no longer need to worry about real-time, in many cases, because our goals are so easily satisfied!