1. Software & Services Group
PinPlay: A Framework for Deterministic Replay and Reproducible Analysis of Parallel Programs
Harish Patil, Cristiano Pereira, Mack Stallcup, Gregory Lueck, James Cownie
Intel Corporation
CGO 2010, Toronto, Canada

2. Non-Determinism
Program execution is not repeatable across runs
  – Interactions with environment (single-threaded)
  – Shared-memory interleaving (multi-threaded)
Source of many problems
  – Hard to predict and test behaviors, which leads to bugs
  – Very hard and unpleasant to debug
  – Breaks program analyses that rely on repeatability
Obstacle for adoption of parallel programming

3. Dealing with Non-Determinism
Eliminate it
  – Deterministic program execution enforced by runtime (e.g. constrained execution [ISCA’09])
Deterministic Replay
  – Let it be, but capture and reproduce execution if needed
  – Every instruction gets same input as in original run
This paper: User-level Deterministic Replay
  – Implementation, challenges and usage examples
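The replay idea on this slide (capture the original run, then give every operation the same input again) can be illustrated with a toy sketch. This is not PinPlay code: the `Recorder`, `Replayer`, and `program` names are hypothetical, and a clock read and a random draw stand in for the environment interactions a real logger would capture.

```python
import random
import time


class Recorder:
    """Runs each nondeterministic call for real and appends its result to a log."""

    def __init__(self):
        self.log = []

    def call(self, fn):
        value = fn()
        self.log.append(value)
        return value


class Replayer:
    """Never runs the real call; hands back logged results in the original order."""

    def __init__(self, log):
        self._values = iter(log)

    def call(self, fn):
        return next(self._values)


def program(env):
    # Two sources of non-determinism: a clock read and a random draw.
    start = env.call(time.perf_counter_ns)
    draw = env.call(lambda: random.randint(0, 9))
    return (start % 1000, draw)


recorder = Recorder()
original = program(recorder)                 # capture run
replayed = program(Replayer(recorder.log))   # replay run sees identical inputs
```

Because the replay run never touches the clock or the random generator, `replayed` equals `original` no matter when or where it executes.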

4. Requirements
No OS or hardware changes
No changes in user environment
Manageable log sizes for long runs
Reasonable run-time overhead
Multi-threaded and multi-processed applications
Integration with other existing analysis tools (e.g. dynamic analyzers, debuggers, profilers)
No assumptions about synchronization APIs

5. Rest of the Talk
Motivation & Requirements
PinPlay Overview
Usage Examples
Results
Summary

6. PinPlay: User-level deterministic replay and analysis
[Figure, capture: Binary + Input run under PinPlay on the OS (Linux® or Windows®), producing normal program output and logs (pinballs)]
[Figure, replay: logs (pinballs) run under PinPlay with analysis tools and debuggers on the OS (Linux® or Windows®)]
Run in application’s native environment
Replays user code
OS independent: cross-OS replay!
Easily integrates w/ other tools and debuggers

7. Replay Models
Parallel-capture and parallel-replay
  [Figure labels: PinPlay, T0, T1, T2, Logs (pinballs)]
Parallel-capture and isolated-replay
  [Figure labels: PinPlay, T0, T1, T2, Logs (pinballs)]

8. Information Captured For Replay
1. Subset of memory values
  – Shadow-memory to capture first reads without prior writes and OS side-effects automatically [Sigmetrics’06]
  – Values changed by remote threads
2. Initial registers and OS register side-effects: signals/exceptions/APCs/system calls
3. Code executed (user and libraries)
4. Position of code and stack
5. Output of some instructions (e.g. RDTSC)
6. Subset of shared-memory access interleaving (transitive opt. - FDR [ISCA’03])
[Figure: all memory values, divided into reads without prior writes, OS side-effects used by app, values from remote threads, and all other values (not captured)]
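Item 1 above can be sketched as follows. This is a simplified illustration, not the PinPlay implementation: the `ShadowLogger` class and the addresses are hypothetical, and a plain dictionary stands in for shadow memory. A read is logged only when the value in real memory differs from what the program itself last read or wrote, which covers both first reads without prior writes and memory changed behind the program's back (for example by a system call).

```python
_MISSING = object()  # sentinel for "address never seen by the program"


class ShadowLogger:
    def __init__(self):
        self.shadow = {}  # addr -> value the program last read or wrote
        self.log = []     # (addr, value) pairs that replay would have to inject

    def on_write(self, addr, value):
        # The program produced this value itself, so replay regenerates it.
        self.shadow[addr] = value

    def on_read(self, addr, actual):
        # Log only when real memory disagrees with the shadow copy.
        if self.shadow.get(addr, _MISSING) != actual:
            self.log.append((addr, actual))
            self.shadow[addr] = actual
        return actual


memory = {0x10: 7}                    # input present before the program runs
logger = ShadowLogger()

logger.on_read(0x10, memory[0x10])    # first read, no prior write: logged
memory[0x20] = 5
logger.on_write(0x20, 5)              # program's own write: not logged
logger.on_read(0x20, memory[0x20])    # value the program produced: not logged
memory[0x20] = 9                      # simulated system call overwrites a buffer
logger.on_read(0x20, memory[0x20])    # shadow mismatch: logged
logger.on_read(0x10, memory[0x10])    # already known: not logged
```

Only two of the four reads end up in `logger.log`. The sketch omits the point in execution at which each value must be injected, which a usable log would also need to record.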

9. PinPlay Architecture
Capable of logging, replaying and relogging execution (recapture from a replaying run)
[Figure: in user land, on top of the OS (Linux® or Windows®), Intel’s Pin (JIT compiler and instrumentor) runs the application code and data together with your Pin-based tool and the PinPlay library. Logger: instrumentation and analysis to capture logs (pinball). Replayer: instrumentation and analysis to inject side-effects (pinball).]

10. Cross-OS Replay and Challenges
Log on one OS and replay on another
System call translations
  – Most OS activity does not happen on replay (only side-effects restored)
  – Semantics is translated across OSes (e.g. create thread)
Memory mapping
  – Problem: address space different across OSes
  – Solution: use Pin’s Fetch API to redirect code and memory operand rewriting to redirect data
[Figure: code and data in the address space on Windows® are remapped ("Remap code", "Remap data") into the address space on Linux®]
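The memory-mapping problem above amounts to translating every logged address into the address where the same bytes live during replay. The sketch below shows only that translation step, under assumed data: the `AddressRemapper` class, the region table, and all addresses are hypothetical, and it does not use Pin’s Fetch API or operand rewriting.

```python
from bisect import bisect_right


class AddressRemapper:
    def __init__(self, regions):
        # regions: iterable of (logged_base, size, replay_base) tuples
        self.regions = sorted(regions)
        self.bases = [region[0] for region in self.regions]

    def translate(self, logged_addr):
        # Find the last region whose base is <= logged_addr.
        i = bisect_right(self.bases, logged_addr) - 1
        if i >= 0:
            base, size, replay_base = self.regions[i]
            if logged_addr < base + size:
                return replay_base + (logged_addr - base)
        raise KeyError(hex(logged_addr))


remap = AddressRemapper([
    (0x400000, 0x1000, 0x7F0000),  # hypothetical code region
    (0x600000, 0x2000, 0x900000),  # hypothetical data region
])

code_addr = remap.translate(0x400010)
data_addr = remap.translate(0x601FFF)
```

An address outside every logged region raises `KeyError`, which makes accesses the log never covered easy to spot.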

11. Usage Example: Program Analysis
Sampling and checkpointing for simulation
  – One run for profiling and finding representative regions, another for checkpointing
  – Requirement: both runs must be identical
Pinballs are used to share workloads for Pin-based analyses among architects
[Figure: multi-process MPI program, captured by PinPlay into logs (per-process pinballs); PinPlay + Profiler yields representative regions; PinPlay + Checkpointer yields checkpoints for simulation]

12. Usage Example: Replay for Debugging
Capture a buggy run and replay under debugger
  – Guaranteed to reproduce the bug and helps root causing
  – Works w/ off-the-shelf unmodified debuggers (e.g. GDB)
  – PinPlay based tool extends GDB commands w/ your own
  – Limitation: debugger can’t change control-flow
Used to debug various multi-threaded applications
Also using it for in-house debugging of concurrency issues with a major database vendor
[Figure: logs (pinballs) and the binary feed the PinPlay-enabled debugger tool on Intel’s Pin, which connects to unmodified GDB over the remote protocol]

13. Results
Benchmark/Application, Average Icount (Billions), Size (MB)
SPEC2006 (single-threaded): 92439
SPECOMP2001 (4-threaded openmp): 30791
McBench (4-threaded RMS): 156396
MILC-8p (numerical simulator/MPI): 1092140
POP-8p (ocean circulator model/MPI): 9521116
WRF-8p (Weather Prediction/MPI): 7555222
EnergyApp-8p (Energy Exploration/MPI): 6931996
Isolated replay

14. Sources of Slowdown
Instrumentation of every memory operation to identify system call side-effects and log data
  – Could be done by OS at the cost of OS modification or OS-specific analysis (doesn’t work on Windows®)
Locks for shadow-memory accesses
  – Could be eliminated by using a shadow-copy per thread at the cost of significant increase in log sizes
Other optimizations possible (please look at the paper)
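The trade-off between a locked shared shadow memory and a shadow copy per thread can be made concrete with a toy count. This is one plausible reading of why logs grow, not a measurement or a description of PinPlay internals: it assumes a value is logged the first time a shadow copy sees an address, so a private copy per thread logs the same shared input once per thread. The function name and workload are hypothetical.

```python
def count_logged_reads(reads, per_thread_shadow):
    """Count first-read log entries.

    reads: list of (thread_id, addr) accesses to read-only input data.
    per_thread_shadow: True gives each thread a private shadow copy (no lock
    needed); False models a single shared shadow copy (lock needed).
    """
    seen = set()
    logged = 0
    for thread_id, addr in reads:
        key = (thread_id, addr) if per_thread_shadow else addr
        if key not in seen:
            seen.add(key)
            logged += 1
    return logged


# Four threads each read the same 100 input addresses.
reads = [(t, a) for t in range(4) for a in range(100)]

shared = count_logged_reads(reads, per_thread_shadow=False)
private = count_logged_reads(reads, per_thread_shadow=True)
```

In this model the shared copy logs each address once, while the per-thread copies log it once for every thread that reads it.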

15. Summary
User-level deterministic capture and replay
  – No OS changes, special hardware, or virtualization
  – Integrates w/ other Pin-tools for repeatable analysis and debugging
Replay occurs on any machine and works across OSes (Windows to Linux)
Pinballs are OS-independent and self-contained
  – Ideal for sharing workloads among researchers, for Pin-based analyses
We will release PinPlay libraries in the future

16. Q&A
