October 24, 2003. SEESCOA, STWW - Programma. Debugging Components. Koen De Bosschere, RUG-ELIS.



Problem description
• Components are loosely coupled and do not have a common notion of time
• Components have contracts (e.g. timing contracts)
• Components are activated asynchronously by the scheduler
• Components can be replaced at run-time
Traditional debugging techniques are not adequate

Traditional debugging inadequate?
• Execution is non-deterministic: no two runs can be guaranteed to be identical (scheduling, timing differences, replacing components, …): cyclic debugging not applicable
• Timing is part of correctness: the intrusion caused by the debugger might violate the contracts
• Input might not be repeatable if generated by an external device (e.g. camera or microphone)
Debugging is often a matter of trial and error, and a good portion of luck and experience is needed; the use of multithreading only adds to that.

Two approaches
• On-chip debugging techniques
• Software debugging techniques

On-chip debugging techniques
• Logic Analyser
• ROM monitor
• ROM emulator
• In-Circuit Emulator
• Background Debug Mode
• JTAG
These add-ons take up valuable chip area (up to 10%)
Hardware manufacturers believe in design for debuggability

Software debugging techniques
• Execution must be repeatable to allow for cyclic debugging
  ◦ Program flow must be identical
  ◦ Input must be identical
• Execution must be observable to allow for debugging
  ◦ We must be able to use breakpoints, watch points, etc. without altering the program flow
Re-execution must be deterministic

Example code
    class G {
        public static int global = 5;
    }
    class Thread1 extends Thread {
        public void run() { G.global += 2; }   // unsynchronized load, add, store
    }
    class Thread2 extends Thread {
        public void run() { G.global *= 3; }   // unsynchronized load, multiply, store
    }
    class Main {
        // join() can throw InterruptedException, so main declares it
        public static void main(String[] args) throws InterruptedException {
            Thread1 t1 = new Thread1();
            Thread2 t2 = new Thread2();
            G.global = 5;
            t1.start(); t2.start();
            t1.join(); t2.join();
            System.out.println("global=" + G.global);
        }
    }

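Possible executions
[Figure: four interleavings of the loads (L) and stores (S) of Thread1 (+2) and Thread2 (*3), starting from G.global=5]
• G.global=21: Thread1 L(5), S(7); then Thread2 L(7), S(21)
• G.global=17: Thread2 L(5), S(15); then Thread1 L(15), S(17)
• G.global=15: both threads L(5); Thread1 S(7), then Thread2 S(15)
• G.global=7: both threads L(5); Thread2 S(15), then Thread1 S(7)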
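The four outcomes listed on this slide can be checked mechanically. The sketch below is not from the presentation: `InterleavingDemo`, `explore` and `outcomes` are hypothetical names, and each update is modeled as one load followed by one store, which is the granularity the slide uses.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper (not from the slides): enumerates every interleaving of
// the four memory operations and collects the final value of the shared variable.
class InterleavingDemo {

    // Thread1 is "load; store(loaded + 2)", Thread2 is "load; store(loaded * 3)".
    // i1 and i2 count how many operations (0..2) each thread has executed so far;
    // r1 and r2 hold the value each thread loaded.
    static void explore(int global, int i1, int r1, int i2, int r2, Set<Integer> results) {
        if (i1 == 2 && i2 == 2) {
            results.add(global);                                    // both threads are done
            return;
        }
        if (i1 == 0) explore(global, 1, global, i2, r2, results);   // Thread1 load
        if (i1 == 1) explore(r1 + 2, 2, r1, i2, r2, results);       // Thread1 store
        if (i2 == 0) explore(global, i1, r1, 1, global, results);   // Thread2 load
        if (i2 == 1) explore(r2 * 3, i1, r1, 2, r2, results);       // Thread2 store
    }

    static Set<Integer> outcomes() {
        Set<Integer> results = new TreeSet<>();
        explore(5, 0, 0, 0, 0, results);                            // G.global starts at 5
        return results;
    }

    public static void main(String[] args) {
        System.out.println(outcomes());                             // prints [7, 15, 17, 21]
    }
}
```

A real JVM may interleave at a finer granularity, but for this program the load/store model already yields every result shown in the figure.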

Causes of non-determinism
• Sequential programs:
  ◦ Input
  ◦ Certain system calls (time)
  ◦ …
• Parallel programs:
  ◦ Race conditions on shared variables
  ◦ Load balancing
  ◦ …

SEESCOASEESCOA Execution Replay  Goal: make repeated equivalent re- executions possible  Method: two phases wRecord phase: record all non- deterministic events during an execution in a trace file wReplay phase: use trace file to produce the same execution  Question: what & where to trace? wSynchronization Replay wInput Replay wData race detection

Requirements for execution replay
• Record must have low intrusion
• Replay must be accurate
• Record phase must be space efficient
• Replay phase must be time efficient

Synchronization Replay
[Figure: Execution 1 is recorded into a trace file (happens-before relation); Execution 2 is replayed from that trace file]

Input replay
[Figure labels: application, kernel, IO-instructions, system calls]

Example code
    class G {
        public static int global = 5;
        public static Object s = new Object();   // lock shared by both threads
    }
    class Thread1 extends Thread {
        public void run() { synchronized (G.s) { G.global += 2; } }
    }
    class Thread2 extends Thread {
        public void run() { synchronized (G.s) { G.global *= 3; } }
    }
    class Main {
        // join() can throw InterruptedException, so main declares it
        public static void main(String[] args) throws InterruptedException {
            Thread1 t1 = new Thread1();
            Thread2 t2 = new Thread2();
            G.global = 5;
            t1.start(); t2.start();
            t1.join(); t2.join();
        }
    }

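Possible executions
[Figure: the four interleavings of the unsynchronized example (G.global=21, 7, 15 and 17), starting from G.global=5]
With both updates inside synchronized(G.s), each load/store pair runs as a unit, so only the two serialized orders remain possible:
• G.global=21: Thread1 L(5), S(7); then Thread2 L(7), S(21)
• G.global=17: Thread2 L(5), S(15); then Thread1 L(15), S(17)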

Record phase
[Figure: the two synchronized executions (G.global=21 and G.global=17), annotated with the logical timestamps recorded per thread. Timestamp sequences as extracted: ,2,3,7,9,10 / 3,4,5,6 / 4,6,7,8 for G.global=21 and ,2,3,10,11,12 / 3,7,8,9 / 4,5,6,7 for G.global=17]

Replay phase
[Figure: the same two executions (G.global=21 and G.global=17) during replay]

Execution Replay in Java
• Requires recording the choices made by synchronization constructs like synchronized, wait, signal, etc.
• During replay, the synchronization operations are replaced by operations waitforlogicaltime(t).
[Figure labels: component, system, T]
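How a waitforlogicaltime(t) operation can force a recorded order is sketched below. Only the operation name comes from the slide; `LogicalClock`, `tick` and `ReplayDemo` are hypothetical, and the timestamps passed to `replay` are assumed to come from an earlier record phase with one timestamp per synchronized block.

```java
// Hypothetical sketch of replay by logical time (not the presented implementation).
class LogicalClock {
    private long now = 0;

    // Block the caller until the global logical time has reached t.
    synchronized void waitForLogicalTime(long t) {
        try {
            while (now < t) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("replay interrupted", e);
        }
    }

    // Advance logical time after a replayed operation and wake all waiters.
    synchronized void tick() {
        now++;
        notifyAll();
    }
}

class ReplayDemo {
    static int global;

    // tAdd and tMul are the recorded timestamps of the "+2" and "*3" blocks.
    static int replay(long tAdd, long tMul) {
        LogicalClock clock = new LogicalClock();
        global = 5;
        Thread t1 = new Thread(() -> { clock.waitForLogicalTime(tAdd); global += 2; clock.tick(); });
        Thread t2 = new Thread(() -> { clock.waitForLogicalTime(tMul); global *= 3; clock.tick(); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("replay interrupted", e);
        }
        return global;
    }

    public static void main(String[] args) {
        System.out.println(replay(0, 1));   // Thread1 first: prints 21
        System.out.println(replay(1, 0));   // Thread2 first: prints 17
    }
}
```

The scheduler no longer decides the outcome: whichever thread holds the smaller timestamp always runs its block first, so each trace reproduces exactly one of the two executions.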

Input Replay
• Execution will only yield the same results if the input is repeatable too
• Solution: recording input by capturing all I/O events and regenerating them during replay
• Input replay generates a huge amount of data…

Data race detection
• A data race occurs when two threads access the same location in parallel and at least one of the accesses is a store (store/store, load/store or store/load).
• Automatic data race detection: check the data race condition on all load/store pairs that are not ordered.
[Figure: the two racy executions of the unsynchronized example, starting from G.global=5]
• G.global=15: both threads L(5); Thread1 S(7), then Thread2 S(15)
• G.global=7: both threads L(5); Thread2 S(15), then Thread1 S(7)
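The race condition stated above can be written as a predicate over pairs of accesses. The slides only say that unordered pairs are checked; using vector clocks to decide whether two accesses are ordered is an assumption made for this sketch, and `Access`, `RaceDetector` and the clock values in `main` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// One memory access with the vector clock of its thread at that moment (assumed representation).
class Access {
    final int thread;
    final String location;
    final boolean isStore;
    final int[] clock;

    Access(int thread, String location, boolean isStore, int... clock) {
        this.thread = thread;
        this.location = location;
        this.isStore = isStore;
        this.clock = clock;
    }
}

class RaceDetector {
    // a happens before b if a <= b in every component and a != b.
    static boolean happensBefore(int[] a, int[] b) {
        boolean strictlySmaller = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) return false;
            if (a[i] < b[i]) strictlySmaller = true;
        }
        return strictlySmaller;
    }

    // Race: different threads, same location, at least one store, and no ordering.
    static boolean isRace(Access x, Access y) {
        return x.thread != y.thread
                && x.location.equals(y.location)
                && (x.isStore || y.isStore)
                && !happensBefore(x.clock, y.clock)
                && !happensBefore(y.clock, x.clock);
    }

    static int countRaces(List<Access> trace) {
        int races = 0;
        for (int i = 0; i < trace.size(); i++) {
            for (int j = i + 1; j < trace.size(); j++) {
                if (isRace(trace.get(i), trace.get(j))) races++;
            }
        }
        return races;
    }

    public static void main(String[] args) {
        // Unsynchronized example: the two threads never synchronize, so their clocks are unordered.
        List<Access> unsynchronized = Arrays.asList(
                new Access(0, "G.global", false, 1, 0), new Access(0, "G.global", true, 2, 0),
                new Access(1, "G.global", false, 0, 1), new Access(1, "G.global", true, 0, 2));
        // Synchronized example: Thread2 acquires the lock after Thread1 released it at clock (3,0).
        List<Access> synchronizedRun = Arrays.asList(
                new Access(0, "G.global", false, 1, 0), new Access(0, "G.global", true, 2, 0),
                new Access(1, "G.global", false, 3, 2), new Access(1, "G.global", true, 3, 3));
        System.out.println(countRaces(unsynchronized));    // prints 3
        System.out.println(countRaces(synchronizedRun));   // prints 0
    }
}
```

The three races in the unsynchronized trace are the load/store, store/load and store/store pairs; the load/load pair is not counted because neither access is a store.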

Implementation
• RecPlay for Solaris (SPARC) and Linux (x86)
  ◦ Uses JiTI for dynamic instrumentation
  ◦ Record overhead: 1.6%
• JaReC for Java (on top of the JVM)
  ◦ Uses JVMPI for dynamic instrumentation
  ◦ Record overhead: 25% on average
• Input-Replay for Linux (Tornado)
  ◦ Uses ptrace

Performance modeling JVM
• Java workload separable in different components
  ◦ Virtual Machine (SUN, IBM, JikesRVM, JRockit, …)
  ◦ Java application (SPECjvm98, SPECjbb2000, …)
  ◦ Input to the application
• Measure execution characteristics (AMD Duron)
  ◦ IPC, branch & cache behavior, …
• Statistical analysis
  ◦ Principal Components Analysis
  ◦ Cluster Analysis
• Quantify the difference between SPECcpu2000 and Java workloads

JVM Results
• Java workloads mostly clustered by
  ◦ benchmark for large workloads
  ◦ VM for small workloads
• SPECjvm98: small input set not significant for large input set execution behavior
• Comparing Java vs. C:
  ◦ No significant difference in IPC, number of branches, data TLB
  ◦ Significant difference in data cache behaviour, instruction TLB, return stack usage

[Figure: PCA for SPECjvm98 – s1 input set]

[Figure: PCA for SPECjvm98 – s100 input set]

[Figure: PCA for SPECcpu vs. Java]

Conclusions
• Debugging multithreaded/distributed systems is not an easy task
• Faithful record/replay requires extra resources (time + space)
• Record/replay enables the developer to effectively debug a complex multithreaded program
• The choice of Java VM has an impact on the low-level behavior of the processor. Java benchmarks should be large enough to be realistic.

Output
• 14 refereed conference papers (OOPSLA, ParCo, WBT, …)
• 12 workshop papers
• 5 journal publications (FGCS, CACM, Parallel Computing, …)
• 1 PhD
• 12 master theses
• Java and Embedded Systems Symposium, Nov 2002 [150 people]
• AADEBUG 2003 workshop, Sept 2003 [60 people]