The 14 th IEEE Real-Time and Embedded Technology and Applications Symposium, April 2008 Real-Time Distributed Discrete-Event Execution with Fault Tolerance.

Slides:



Advertisements
Similar presentations
Network II.5 simulator ..
Advertisements

Distributed Systems Major Design Issues Presented by: Christopher Hector CS8320 – Advanced Operating Systems Spring 2007 – Section 2.6 Presentation Dr.
Knowledge Based Synthesis of Control for Distributed Systems Doron Peled.
Uncoordinated Checkpointing The Global State Recording Algorithm.
Parallel and Distributed Simulation Time Warp: Basic Algorithm.
Lookahead. Outline Null message algorithm: The Time Creep Problem Lookahead –What is it and why is it important? –Writing simulations to maximize lookahead.
Leveraging Synchronized Clocks in Distributed Applications Edward A. Lee Robert S. Pepper Distinguished Professor UC Berkeley Swarm Lab Retreat January.
February 21, 2008 Center for Hybrid and Embedded Software Systems Cyber-Physical Systems (CPS): Orchestrating networked.
PTIDES Project Overview
Overview of PTIDES Project
Synthesis of Embedded Software Using Free-Choice Petri Nets.
2/11/2010 BEARS 2010 On PTIDES Programming Model John Eidson Jeff C. Jensen Edward A. Lee Slobodan Matic Jia Zou PtidyOS.
Page 1 Building Reliable Component-based Systems Chapter 13 -Components in Real-Time Systems Chapter 13 Components in Real-Time Systems.
PTIDES: Programming Temporally Integrated Distributed Embedded Systems Yang Zhao, EECS, UC Berkeley Edward A. Lee, EECS, UC Berkeley Jie Liu, Microsoft.
Architecture Modeling and Analysis for Embedded Systems Oleg Sokolsky CIS700 Fall 2005.
Ordering and Consistent Cuts Presented By Biswanath Panda.
Integrated Design and Analysis Tools for Software-Based Control Systems Shankar Sastry (PI) Tom Henzinger Edward Lee University of California, Berkeley.
NSF Foundations of Hybrid and Embedded Software Systems UC Berkeley: Chess Vanderbilt University: ISIS University of Memphis: MSI A New System Science.
April 16, 2009 Center for Hybrid and Embedded Software Systems PtidyOS: An Operating System based on the PTIDES Programming.
Causality Interface  Declares the dependency that output events have on input events.  D is an ordered set associated with the min ( ) and plus ( ) operators.
February 12, 2009 Center for Hybrid and Embedded Software Systems Encapsulated Model Transformation Rule A transformation.
The Case for Precision Timed (PRET) Machines Edward A. Lee Professor, Chair of EECS UC Berkeley With thanks to Stephen Edwards, Columbia University. National.
1 Quasi-Static Scheduling of Embedded Software Using Free-Choice Petri Nets Marco Sgroi, Alberto Sangiovanni-Vincentelli Luciano Lavagno University of.
Dataflow Process Networks Lee & Parks Synchronous Dataflow Lee & Messerschmitt Abhijit Davare Nathan Kitchen.
Design of Fault Tolerant Data Flow in Ptolemy II Mark McKelvin EE290 N, Fall 2004 Final Project.
Chess Review November 21, 2005 Berkeley, CA Edited and presented by Causality Interfaces and Compositional Causality Analysis Rachel Zhou UC Berkeley.
7th Biennial Ptolemy Miniconference Berkeley, CA February 13, 2007 Cyber-Physical Systems: A Vision of the Future Edward A. Lee Robert S. Pepper Distinguished.
NSF Foundations of Hybrid and Embedded Software Systems UC Berkeley: Chess Vanderbilt University: ISIS University of Memphis: MSI A New System Science.
SEC PI Meeting Annapolis, May 8-9, 2001 Component-Based Design of Embedded Control Systems Edward A. Lee & Jie Liu UC Berkeley with thanks to the entire.
February 12, 2009 Center for Hybrid and Embedded Software Systems Model Transformation Using ERG Controller Thomas H. Feng.
7th Biennial Ptolemy Miniconference Berkeley, CA February 13, 2007 PTIDES: A Programming Model for Time- Synchronized Distributed Real-time Systems Yang.
NSF Foundations of Hybrid and Embedded Software Systems UC Berkeley: Chess Vanderbilt University: ISIS University of Memphis: MSI Program Review May 10,
Designing Predictable and Robust Systems Tom Henzinger UC Berkeley and EPFL.
User-Level Interprocess Communication for Shared Memory Multiprocessors Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy Presented.
Software Issues Derived from Dr. Fawcett’s Slides Phil Pratt-Szeliga Fall 2009.
Lecture 12 Synchronization. EECE 411: Design of Distributed Software Applications Summary so far … A distributed system is: a collection of independent.
SUPERB-IT Center For Hybrid & Embedded Software Systems COLLEGE OF ENGINEERING, UC BERKELEY July 29, 2005 SUPERB-IT.
DATE Optimizations of an Application- Level Protocol for Enhanced Dependability in FlexRay Wenchao Li 1, Marco Di Natale 2, Wei Zheng 1, Paolo Giusto.
EMBEDDED SOFTWARE Team victorious Team Victorious.
Advanced Operating Systems CIS 720 Lecture 1. Instructor Dr. Gurdip Singh – 234 Nichols Hall –
Low Contention Mapping of RT Tasks onto a TilePro 64 Core Processor 1 Background Introduction = why 2 Goal 3 What 4 How 5 Experimental Result 6 Advantage.
An efficient active replication scheme that tolerate failures in distributed embedded real-time systems Alain Girault, Hamoudi Kalla and Yves Sorel Pop.
FPGA FPGA2  A heterogeneous network of workstations (NOW)  FPGAs are expensive, available on some hosts but not others  NOW provide coarse- grained.
A Simple Distributed Method for Control over Wireless Networks Authors: Miroslav Pajic, Shereyas Sundaram, George J. Pappas and Rahul Mangharam Presented.
Invitation to Computer Science 5 th Edition Chapter 6 An Introduction to System Software and Virtual Machine s.
Reliable Communication in the Presence of Failures Based on the paper by: Kenneth Birman and Thomas A. Joseph Cesar Talledo COEN 317 Fall 05.
Advanced Principles of Operating Systems (CE-403).
CPSC 372 John D. McGregor Module 3 Session 1 Architecture.
Conformance Test Experiments for Distributed Real-Time Systems Rachel Cardell-Oliver Complex Systems Group Department of Computer Science & Software Engineering.
Design Languages in 2010 Chess: Center for Hybrid and Embedded Software Systems Edward A. Lee Professor UC Berkeley Panel Position Statement Forum on Design.
A flexible simulator for control- dominated distributed real-time systems Johannes Petersson IDA/SaS/ESLAB Johannes Petersson IDA/SaS/ESLAB Master’s Thesis.
Constraints Assisted Modeling and Validation Presented in CS294-5 (Spring 2007) Thomas Huining Feng Based on: [1]Constraints Assisted Modeling and Validation.
Software Engineering1  Verification: The software should conform to its specification  Validation: The software should do what the user really requires.
Incremental Checkpointing with Application to Distributed Discrete Event Simulation Thomas Huining Feng and Edward A. Lee {tfeng,
A N I N - MEMORY F RAMEWORK FOR E XTENDED M AP R EDUCE 2011 Third IEEE International Conference on Coud Computing Technology and Science.
UML Activity Diagrams.
Wireless sensor and actor networks: research challenges
Design-Directed Programming Martin Rinard Daniel Jackson MIT Laboratory for Computer Science.
Scheduling Messages with Deadlines in Multi-hop Real- time Sensor Networks Wei Song.
“Distributed Algorithms” by Nancy A. Lynch SHARED MEMORY vs NETWORKS Presented By: Sumit Sukhramani Kent State University.
ECE 526 – Network Processing Systems Design Programming Model Chapter 21: D. E. Comer.
Invitation to Computer Science 6th Edition
CS/CE/TE 6378 Advanced Operating Systems
Advanced Operating Systems CIS 720
Dynamo: A Runtime Codesign Environment
EECS 498 Introduction to Distributed Systems Fall 2017
Parallel and Distributed Simulation
Model Transformation with the Ptera Controller
An overview of the CHESS Center
Parallel Exact Stochastic Simulation in Biochemical Systems
Presentation transcript:

The 14 th IEEE Real-Time and Embedded Technology and Applications Symposium, April 2008 Real-Time Distributed Discrete-Event Execution with Fault Tolerance Thomas Huining Feng and Edward A. Lee Center for Hybrid and Embedded Software Systems EECS, UC Berkeley {tfeng,

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley …, e 2 = (null, t 2 ) Distributed Discrete-Event Execution Strategy Execution strategy decides whether/when it is safe to process an input event. Conventional: Compute can process top event e 1 if e 2 has a greater time stamp. Null message (null, t 2 ) Cons: overhead, sensitive to faults, lack of real-time property 2 A A B B Sensor Source Merge … … …, e 1 = (v 1, t 1 ) C C Compute Actuator … …, e 2 = (v 2, t 2 ) i1i1 i2i2

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Overview of Our Approach Leverage time-synchronized platforms Eliminate null messages Potentially improves concurrency Decompose assertions of real-time properties Recover software components from faults 3 A A B B Sensor Source Merge C C Compute Actuator

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley n cameras located around a football field, all connected to a central computer. Events at blue ports satisfy t ≤ τ (t – time stamp of any event; τ – real time) Events at red ports satisfy t ≥ τ Reference Application: Distributed Cameras 4

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Reference Application: Distributed Cameras Problems to solve: – Make event-processing decisions locally – Guarantee timely command delivery to the Devices – Guarantee real-time update at the Display – Tolerate images loss or corruption at Image Processor 5

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Minimum Model-Time Delay δ δ : P × P → R + U {∞} returns the minimum model-time delay between any two ports. (P – set of ports; R + – set of non-negative reals.) Example: δ(i 5, o 1 ) = min{δ 5 +δ 1, δ 5 +δ 4 +δ 2 }, where δ 1, …, δ 6  R + are pre-defined. 6 i1i1 i2i2 i3i3 o1o1 o2o2 i5i5 o5o5 δ5δ5 δ6δ6 i6i6 i4i4 δ4δ4 δ4'δ4' o3o3 o4o4 δ1δ1 δ2δ2 δ3δ3

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Intuition of Execution Strategy When is it safe to process e = (v, t) at i 1 ? 1.future events at i 1, i 2 and i 3 have time stamps ≥ t (conventional), or 2.future events at i 1 and i 2 have time stamps ≥ t, or 3.future events at i 1 have time stamps ≥ t, and future events at i 2 depend on events at i 4 with time stamps ≥ t – δ 4, or 4.future events at i 1 and i 2 depend on events at i 5 and i 6 with time stamps ≥ t – min{δ 5, δ 6, δ 5 + δ 4, δ 6 + δ 4 }. e = (v, t) 7 i1i1 i2i2 i3i3 o1o1 o2o2 i5i5 o5o5 δ5δ5 δ6δ6 i6i6 i4i4 δ4δ4 δ4'δ4' o3o3 o4o4 δ1δ1 δ2δ2 δ3δ3

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Relevant Dependency i ~ i' iff they are input of the same actor and affect a common output. An equivalence class is a transitive closure of ~. Construct a collapsed graph, and compute relevant dependency between equivalence classes. d(ε', ε) = min i'  ε', i  ε {δ(i', i)} 8 i1i1 i2i2 i3i3 o1o1 o2o2 i5i5 o5o5 δ5δ5 δ6δ6 i6i6 i4i4 δ4δ4 δ4'δ4' o3o3 o4o4 δ1δ1 δ2δ2 δ3δ3 min{δ 5, δ 6 } ε3ε3 ε1ε1 ε2ε2 min{δ 5, δ 6, δ 5 + δ 4, δ 6 + δ 4 } ε4ε4 δ4δ4 δ4'δ4' min{δ 5 + δ 4 ', δ 6 + δ 4 '}

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Dependency Cut A dependency cut for ε is a minimal but complete set of equivalence classes that needs to be considered to process an event at ε. Example: C 1 and C 2 are both dependency cuts for ε 1. 9 min{δ 5, δ 6 } ε3ε3 ε1ε1 ε2ε2 min{δ 5, δ 6, δ 5 + δ 4, δ 6 + δ 4 } ε4ε4 δ4δ4 δ4'δ4' min{δ 5 + δ 4 ', δ 6 + δ 4 '} C1C1 C2C2

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Execution Strategy Determine top event e = (v, t) at ε 1 safe to process If we choose C 1 : future events at ε 1 have time stamps ≥ t. If we choose C 2 : for any ε  C 2, future events at in ε 1 depend on events at ε with time stamps ≥ t – d(ε, ε 1 ). In general, we can freely choose any dependency cut. 10 min{δ 5, δ 6 } ε3ε3 ε1ε1 ε2ε2 min{δ 5, δ 6, δ 5 + δ 4, δ 6 + δ 4 } ε4ε4 δ4δ4 δ4'δ4' min{δ 5 + δ 4 ', δ 6 + δ 4 '} C1C1 C2C2

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Implementation of the Execution Strategy n + 1 platforms with synchronized clocks (IEEE 1588). Choose dependency cuts at platform boundary. A queue stores events local to the platform. At real time τ, future events have time stamps ≥ τ – d n. 11 network latency d n … Queue

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Tolerating Loss of Images Start the composition as soon as the starting packets are received. Create a checkpoint at the beginning (small constant overhead) Backtrack when fault is detected (linear in memory locations) In most cases, discard the checkpoint (garbage collection) connection lost Camera 1 Camera 2 12 start

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley A Program Transformation Approach Before TransformationAfter Transformation int s; void f(int i) { s = i; } int s; void f(int i) { $ASSIGN$s(i); } An assignment is transformed into a function call to record the old value: private final int $ASSIGN$s(int newValue) { if ($CHECKPOINT != null && $CHECKPOINT.getTimestamp() > 0) { $RECORD$s.add(null, s, $CHECKPOINT.getTimestamp()); } return s = newValue; } This incurs a constant overhead. 13

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Observation: The overhead for each basic operation is constant. A Program Transformation Approach Before TransformationAfter Transformation int s; void f(int i) { s = i; } int s; void f(int i) { $ASSIGN$s(i); } Image img; int partNum; void consume(Packet p1, Packet p2) { if (img == null) { img = new Image(); partNum = 0; } img.parts[partNum] = compose(p1, p2); partNum++; } Image img; int partNum; void consume(Packet p1, Packet p2) { if (img == null) { $ASSIGN$img(new Image()); $ASSIGN$partNum(0); } img.$ASSIGN$parts(partNum, compose(p1, p2)); $ASSIGN$SPECIAL$partNum(11, -1); } 0:+= 1:-=... 11:++ 12:-- 14 Value, not used for ++.

RTAS 2008Feng & Lee, CHESS, EECS, UC Berkeley Conclusion and Future Work Advantages – Eliminate null messages – Decompose real-time schedulability analysis – Advance the system even when some platforms fail – Tolerate faults without sacrificing real-time properties Future Work – Examine different choices of dependency cuts – Develop static WCET (worst-case execution time) analysis to guarantee real-time properties on each platform – Build an implementation to support a variety of real applications – Exploit parallelism with multi-core platforms 15