Deterministic Replay for Real-time Software Systems

Presentation transcript:

Deterministic Replay for Real-time Software Systems
Alice Lee, Safety, Reliability & Quality Assurance Office, JSC, NASA
Yann-Hang Lee, Computer Science & Eng, Arizona State University, Tempe, AZ

Slide 1: Background
- Major difficulties of building real-time embedded applications
  - handling concurrent events (real-world events occur in parallel)
  - timing control and temporal dependence in program behavior
  - asynchronous operations
- Non-deterministic operation, time-dependent behavior, and race conditions
  - difficult to model, analyze, test, and reproduce
- Example: NASA Pathfinder spacecraft
  - Total system resets in Mars Pathfinder
  - An overrun of the data collection task -> a priority inversion on a mutex semaphore -> failure of the communication task -> a system reset
  - Took 18 hours to reproduce the failure in a lab replica -> the problem became obvious and a fix was installed

Slide 2: Background (Cont'd)
- Other examples
  - select(2)/accept(2) race condition in TCP servers of NetBSD
    - the bug depends on a specific event and is sometimes difficult to reproduce, particularly if the server is very fast and the network is relatively slow
  - Delphi Bug Report 459
    - difficult to reproduce, since the timing of the two threads (one being destroyed, one being created) has to be "right" for the bug to occur
- It is easy to identify the faults and fix them once the failing sequences are reproduced (or observed).
- The failures are rooted in the interaction of multiple concurrent operations/threads and are based on timing dependencies.
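The timing dependence described above can be illustrated with a minimal sketch. It is not taken from the NetBSD or Delphi bugs; it assumes a hypothetical shared counter that two threads, A and B, each update with a separate read step and write step, and it enumerates every possible interleaving instead of relying on real thread timing.

```python
from itertools import permutations


def run(schedule):
    """Execute two threads' read/write steps in the given order.

    Each thread performs: tmp = counter (read), then counter = tmp + 1 (write).
    `schedule` is a string of thread ids such as "ABAB"; the first occurrence
    of an id is that thread's read step, the second is its write step.
    """
    counter = 0
    tmp = {}
    for tid in schedule:
        if tid not in tmp:
            tmp[tid] = counter        # read step
        else:
            counter = tmp[tid] + 1    # write step
    return counter


# All distinct interleavings of two steps from A and two steps from B.
schedules = sorted(set("".join(p) for p in permutations("AABB")))
results = {s: run(s) for s in schedules}

for s in schedules:
    print(s, results[s])
```

Only the two schedules in which one thread finishes before the other starts produce the intended value 2; the other four lose an update. Once a failing schedule such as "ABAB" is known, the fault is easy to see, which is the point the slide makes about reproduced sequences.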

Slide 3: Deterministic Replay
- Can we reproduce the exact execution behavior with additional delays in a controlled environment?
  - the delays may be caused by instrumentation and breakpoints
- For multiple purposes:
  - Test analysis
  - Debugging
  - Recovery
[Figure labels, in order: Execution/Instrumentation; D. replay/Instrumentation; Execution; Execution/Observation/Assertion; D. replay/Observation/Assertion; Execution; Execution/Checkpointing/Msg logging; Rollback/D. replay]

Slide 4: Deterministic Replay (Cont'd)
- Programs read in the same input values (timer, DAQ, status, etc.)
- Interrupts occur at the same program execution instances
- Need to log external events during real-time execution and resubmit the events during replay
  - recording and replaying stages
[Figure: two timelines, "real-time execution" and "deterministic replay", each showing interrupt_1 at PC=1000 and interrupt_2 at PC=2000; label "time intrusions"]
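The recording and replaying stages can be sketched as follows. This is a toy model under stated assumptions, not the authors' tool: the "program" is a fixed number of steps, the functions `compute` and `handle` are hypothetical stand-ins for application code and an interrupt handler, and an execution instance is identified by a plain step index.

```python
MOD = 1_000_003


def compute(state, step):
    """Hypothetical application work done at each step."""
    return (state * 31 + step) % MOD


def handle(state, step, value):
    """Hypothetical interrupt handler; its effect depends on when it runs."""
    return (state + value * (step + 1)) % MOD


def record(n_steps, step_cost, interrupts):
    """Live run: deliver each (arrival_time, value) interrupt at the first
    step that starts at or after its arrival, and log (step, value)."""
    pending = sorted(interrupts)
    log, state, clock = [], 0, 0
    for step in range(n_steps):
        while pending and pending[0][0] <= clock:
            _, value = pending.pop(0)
            log.append((step, value))
            state = handle(state, step, value)
        state = compute(state, step)
        clock += step_cost
    return state, log


def replay(n_steps, log):
    """Replay: events are resubmitted by logged step, so wall-clock time
    (and any delay added by instrumentation) no longer matters."""
    by_step = {}
    for step, value in log:
        by_step.setdefault(step, []).append(value)
    state = 0
    for step in range(n_steps):
        for value in by_step.get(step, []):
            state = handle(state, step, value)
        state = compute(state, step)
    return state


interrupts = [(25, 7), (60, 11)]
live_state, log = record(10, 10, interrupts)      # original timing
slow_state, slow_log = record(10, 15, interrupts) # same run, slower steps
replayed = replay(10, log)
print(log, slow_log, replayed == live_state)
```

Slowing the live run moves the interrupts to different steps, which is the timing intrusion problem. The replay, keyed on the logged execution instance instead of the clock, reproduces the original final state regardless of how long each step takes.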

Slide 5: Testing Analysis and Timing Intrusion
- Software quality analysis and test coverage
  - Instrumentation at source programs
  - program behavior may be changed due to timing intrusion
    - test a robotic controller in the target system: hardware and human-in-the-loop operations
  - some solutions:
    - hardware-based trace collection (Applied Microsystems)
    - special data logging, monitoring, and test facility (SVF for NASA ISS)
- Apply instrumentation during deterministic replay
  - if the overhead of logging external events can be minimized

Slide 6: Our Approach -- A Two-stage Instrumentation
- Instrumentation based on RTOS -- for context switches, interrupts, events, and task communication
- Annotation for device drivers
- Synchronize program execution with external events
  - cannot rely on program counter alone
    - an interrupt during a loop (need loop count and program counter)
  - simulated time
    - must be adjusted to match the real execution time
    - determine when an event occurs: if there is no data dependence, it can occur at any instant during a block execution; otherwise, need to know the corresponding statement

Slide 7: Software Instruction Counter
- Exact instance in program execution
  - specified by program counter (PC)
  - Software instruction counter (SIC) --
    - incremented on a backward jump or procedure call
    - software or hardware implemented
  - Has been applied to recovery and debugging
[Figure labels: read I/O; check value; I/O status changed; read I/O; check value]
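A minimal sketch of why the PC needs a companion counter, assuming a toy instruction set invented for illustration: `("work",)` does nothing and `("loop", target)` jumps back to `target` while a loop counter is nonzero. Only backward jumps are modeled here; procedure calls, which the slide also lists as SIC increments, are omitted.

```python
def run(program, loops):
    """Interpret a toy program and return a trace of (sic, pc) pairs.

    The software instruction counter (SIC) is incremented each time a
    backward jump is taken, so the same PC visited in different loop
    iterations gets a different SIC.
    """
    pc, sic, remaining = 0, 0, loops
    trace = []
    while pc < len(program):
        trace.append((sic, pc))
        op = program[pc]
        if op[0] == "loop":
            remaining -= 1
            if remaining > 0:
                sic += 1      # backward jump taken
                pc = op[1]
                continue
        pc += 1
    return trace


# A polling-style loop body of three instructions, executed three times.
program = [("work",), ("work",), ("loop", 0), ("work",)]
trace = run(program, loops=3)
pcs = [pc for _, pc in trace]

print(trace)
print("visits to PC=1:", pcs.count(1))
print("unique (SIC, PC) pairs:", len(set(trace)), "of", len(trace))
```

PC=1 is visited three times, so an interrupt logged as "at PC=1" is ambiguous. Each (SIC, PC) pair occurs exactly once in the trace, so it identifies a single execution instance at which a logged interrupt can be resubmitted.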

Slide 8: Current Status
[Figure: tool flow. Labels, in order: source program; code analyzer; ESIC, system, and event instrumentation; instrumented program_1; target - record environment; code instrumentation; ESIC and replay instrumentation; instrumented program_2; event trace_1; event trace_2; PC stamp converter; target - replay environment; execution trace]

Slide 9: Current Status (Cont'd)
- Works for a single execution thread in the whole system (VxWorks + MPC860)
- There are kernel and non-instrumented threads
  - test analysis of one program in a multitasking environment
  - debug a program which calls library routines
  - system calls to RTOS
- Can we still reach deterministic replay if the execution of the instrumented thread is interleaved with other threads?
- If interrupts (input) -> thread_1 -> thread_2, then both threads must be instrumented
[Figure labels: instrumented program; RTOS; the other thread; semTake(); interrupt; semGive(); ISR]

Slide 10: Current Status (Cont'd)
- If interrupts (input) -> thread_2 and thread_1 -> thread_2,
  - thread_1 doesn't need to be instrumented
  - however, interrupts can occur while thread_1 is running (i.e., execution is not in the instrumentation region due to a blocked system call or library call)
- Solution:
  - check the thread id when an interrupt occurs
  - if the interrupted instruction is in the instrumentation region, use PC+SIC for replay
  - else, replay the interrupt just before the call (RTOS or library)
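The decision rule in the solution above can be sketched as a small function. All names are hypothetical: the instrumentation region is assumed to be a single address range, and `last_call` is assumed to hold the (SIC, PC) of the RTOS or library call through which the instrumented thread most recently left that region.

```python
def replay_point(interrupted_tid, pc, sic, instrumented_tid, region, last_call):
    """Choose where a logged interrupt should be resubmitted during replay.

    region    -- (low, high) address range of the instrumented code (assumed)
    last_call -- (sic, pc) of the instrumented thread's most recent RTOS or
                 library call (assumed to be tracked by the instrumentation)
    """
    in_region = (
        interrupted_tid == instrumented_tid
        and region[0] <= pc < region[1]
    )
    if in_region:
        # Exact execution instance is known: replay at PC+SIC.
        return ("at", sic, pc)
    # Another thread, the kernel, or a library routine was running:
    # replay the interrupt just before the call that left the region.
    return ("before_call", last_call[0], last_call[1])


REGION = (0x1000, 0x2000)
LAST_CALL = (4, 0x1100)

inside = replay_point(1, 0x1200, 5, 1, REGION, LAST_CALL)
other_thread = replay_point(2, 0x8000, 5, 1, REGION, LAST_CALL)
in_library = replay_point(1, 0x9000, 5, 1, REGION, LAST_CALL)
print(inside, other_thread, in_library)
```

The third case covers the instrumented thread itself executing non-instrumented code, such as a library routine, where its PC falls outside the region even though the thread id matches.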

Slide 11: Current Tasks
- Tool integration and GUI
- Experiments
  - joystick program with input and timer
  - DC motor controller with a LabVIEW-based simulator
- Applications in JSC
  - X38
  - AERCam
- Porting
  - VxWorks and Suds on MBX860 embedded controller
  - porting to RT-Linux and other platforms
- Documentation and dissemination