PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Efficient Checkpointing of Java Software using Context-Sensitive Capture.


1 PRESTO: Program Analyses and Software Tools Research Group, Ohio State University
Efficient Checkpointing of Java Software using Context-Sensitive Capture and Replay
Guoqing Xu, Atanas Rountev, Yan Tang, Feng Qin
Ohio State University
ESEC/FSE 07

2 Outline
• Motivation
  - Challenges for checkpointing/replaying Java software
  - Summary of our approach
• Contributions
  - Static analyses
  - Multiple execution regions
  - Experimental evaluation
• Conclusions

3 Motivation
• Checkpointing/replaying has been used for a variety of purposes at system level
  - Originally designed to support fault tolerance
  - Debugging of OS and of parallel and distributed software
• Checkpointing can benefit a number of software engineering tasks
  - Reduce the cost of manual debugging and testing
  - Support for automated techniques for debugging and testing: e.g., dynamic slicing and delta-debugging
  - Inspired by both system-level checkpointing [Pan-PDD88, Dunlap-OSDI02, King-USENIX05] and “saving-and-restoring” software engineering techniques [Saff-ASE05, Orso-WODA05, Orso-WODA06, Elbaum-FSE06]

4 Challenges
• Ease of use and deployment
  - Application-level checkpointing: no JVM/runtime support, just code analysis and instrumentation
  - Challenge: no direct access to the call stack; no control over thread scheduling or external resources (files, etc.)
• Reduce the size of the recorded state
  - Dumping the entire heap may be prohibitively expensive, especially for large programs
  - Challenge: static analyses to prune redundant state
• Static and dynamic overhead
  - Static analysis cost is amortized over multiple runs
  - Approach is intended for long-running applications

5 Summary of Our Approach
• Tool input: program + checkpoint definition
• Performs static analyses and code instrumentation
• Tool output: two program versions
• First, an augmented checkpointing version is executed once to record (parts of) the run-time program states
  - At the checkpoint: heap objects, static fields, locals
  - At certain points along the call chain leading to the checkpoint
• Next, a pruned replaying version is executed multiple times
  - Restore variables saved at the checkpoint
  - Restore variables saved at points along the call chain
• How do we resume execution from the checkpoint?
  - Step 1: control flow quickly reaches the checkpoint
  - Step 2: recover state at checkpoint
  - Step 3: incrementally recover state after call sites along the call chain leading to the checkpoint

6 Definitions
• Crosscut call chain (CC-chain)
  - A programmer-specified call chain that leads to the method that contains the checkpoint
  - E.g. main(44) -> run(28)
• Decision points
  - A call site on the CC-chain (e.g. m.run), which is a decision point because of polymorphism
  - A predicate on which a decision point or the checkpoint is control-dependent
• At a decision point, the checkpointing version records the control-flow outcome
• The replaying version uses this information to force the control flow to reach the checkpoint

7 Replaying, Step 1: Recover the Call Stack
• Predicate decision point: recover boolean value
• Call site decision point o.m(a1, …, an)
  - Recover the run-time type of the receiver object; instantiated during replaying using sun.misc.Unsafe
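A minimal sketch of how a receiver object could be instantiated from a recorded type name without running any constructor. The slide only names sun.misc.Unsafe; the classes Receiver and ReceiverFactory, and the reflective lookup of the theUnsafe field, are assumptions made for illustration and are not taken from the tool.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical receiver class; its constructor stands in for expensive setup work.
class Receiver {
    int constructed;

    Receiver() {
        constructed = 1; // skipped when the object is created by allocateInstance
    }

    String run() {
        return "run reached";
    }
}

// Hypothetical helper, not part of the tool described in the slides.
final class ReceiverFactory {
    private ReceiverFactory() {}

    // Unsafe has no public accessor, so the singleton is read reflectively.
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // Creates an object of the recorded run-time type without calling a constructor.
    static Object allocate(String recordedTypeName) {
        try {
            Class<?> type = Class.forName(recordedTypeName);
            return unsafe().allocateInstance(type);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot allocate " + recordedTypeName, e);
        }
    }
}
```

The object returned by allocate has default field values, which is enough to dispatch the call on the CC-chain to the right method; any fields the post-checkpoint code needs would have to be restored separately.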

8 Checkpointing Version
(lines marked => are inserted by the instrumentation; DP = decision point)

    void run(String[] args) {
        processCmdLine(args);
        loadNecessaryClasses();
        Set wp_packs = getWpacks();
        Set body_packs = getBpacks();
        boolean b = Options.v().whole_jimple();
     => save(b);
        if (b) { // DP
            getPack("cg").apply();
            // --- checkpoint ---
     =>     save(…);
            getPack("wjtp").apply();
            getPack("wjop").apply();
            getPack("wjap").apply();
        }
        retrieveAllBodies();
        …
    }
    ...
    static void main(String[] args) {
        Main m = new Main();
        boolean b = args.length != 0;
     => save(b);
        if (b) { // DP
     =>     save(type_of(m));
            m.run(args); // DP
        }
    }

9 Replaying Version
(lines marked => are inserted by the instrumentation; DP = decision point)

    void run(String[] args) {
        processCmdLine(args);
        loadNecessaryClasses();
        Set wp_packs = getWpacks();
        Set body_packs = getBpacks();
        boolean b = Options.v().whole_jimple();
     => read(b);
        if (b) { // DP
            getPack("cg").apply();
            // --- checkpoint ---
     =>     read(…);
            getPack("wjtp").apply();
            getPack("wjop").apply();
            getPack("wjap").apply();
        }
        retrieveAllBodies();
        …
    }
    static void main(String[] args) {
        Main m = new Main();
        boolean b = args.length != 0;
     => read(b);
        if (b) { // DP
     =>     read(type_of(m));
     =>     unsafe.allocate(m);
     =>     args = null;
            m.run(args); // DP
        }
    }
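The save/read pairs in the two versions can be pictured as a queue of decision outcomes that is filled once and then consumed in the same order. The sketch below assumes an in-memory queue; DecisionLog and DecisionReplayDemo are hypothetical names, and the slides do not describe how the tool actually stores these values.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical log of decision-point outcomes, kept in recording order.
final class DecisionLog {
    private final Deque<Object> entries = new ArrayDeque<>();

    // Checkpointing version: record the outcome of a predicate decision point.
    boolean save(boolean outcome) {
        entries.addLast(outcome);
        return outcome;
    }

    // Checkpointing version: record the run-time type at a call-site decision point.
    void saveType(Object receiver) {
        entries.addLast(receiver.getClass().getName());
    }

    // Replaying version: outcomes are read back in the order they were saved.
    boolean readBoolean() {
        return (Boolean) entries.removeFirst();
    }

    String readTypeName() {
        return (String) entries.removeFirst();
    }
}

final class DecisionReplayDemo {
    private DecisionReplayDemo() {}

    // Checkpointing version of a caller on the CC-chain.
    static void recordRun(String[] args, Object receiver, DecisionLog log) {
        boolean b = log.save(args.length != 0); // predicate decision point
        if (b) {
            log.saveType(receiver);             // call-site decision point
            // the call on the CC-chain would be made here
        }
    }

    // Replaying version: the real inputs are not needed, only the logged outcomes.
    static String replayRun(DecisionLog log) {
        boolean b = log.readBoolean();
        if (!b) {
            return null;                        // the checkpoint was never reached
        }
        return log.readTypeName();              // type to instantiate for the call
    }
}
```

Because the replaying version takes its branch outcomes from the log, the predicate inputs (such as args) no longer matter, which is consistent with the slide setting args to null.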

10 Step 2: Recover at the Checkpoint
• Our static analysis selects locals for recording (in the checkpointing version) and recovering (in the replaying version) when
  - They are written before the checkpoint
  - They are read after the checkpoint
• Record primitive-typed values or entire object graphs on the heap (all reachable objects)
• Static fields are selected based on the same idea

    void run(String[] args) {
        processCmdLine(args);
        loadNecessaryClasses();
        Set wp_packs = getWpacks();
        Set body_packs = getBpacks();
        if (Options.v().whole_jimple()) {
            getPack("cg").apply();
            // --- checkpoint ---
            getPack("wjtp").apply();
            getPack("wjop").apply();
            getPack("wjap").apply();
        }
        retrieveAllBodies();
        for (Iterator i = body_packs.iterator(); i.hasNext();) { … }
        …
    }

Selected local in this example: body_packs
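The selection rule for locals amounts to a set intersection. A minimal sketch, assuming the two input sets have already been computed by some dataflow analysis; LocalSelection and its method name are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper illustrating the selection rule, not the tool's analysis.
final class LocalSelection {
    private LocalSelection() {}

    // A local is recorded only if it is written before the checkpoint
    // and read after it.
    static Set<String> select(Set<String> writtenBefore, Set<String> readAfter) {
        Set<String> selected = new TreeSet<>(writtenBefore);
        selected.retainAll(readAfter);
        return selected;
    }
}
```

For the run method above, wp_packs and body_packs are both written before the checkpoint, but only body_packs is read afterwards, so only body_packs is selected.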

11 Selection of Static Fields
• A whole-program Mod/Use analysis
  - A static field is “written” if its value is changed, or any heap object reachable from it is mutated
  - A static field is “read” if its value is directly read
• Analysis algorithm
  - Context-sensitive and flow-insensitive; uses the points-to solution and the call graph from Spark [Lhotak-CC03]
  - Bottom-up traversal of the SCC-DAG of the call graph
  - For each method m, a set C_m is maintained to contain all objects from which a mutated object can be reached
  - Propagate backwards the objects in C_m that escape a callee method to its callers
  - Select a static field fld if PointsToSet(fld) ∩ C_m ≠ ∅
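The final selection step can be sketched as a disjointness test per field. This sketch assumes abstract heap objects are identified by strings and that the points-to sets and C_m are given; it does not model the bottom-up propagation over the call graph, and StaticFieldSelection is a hypothetical name.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper illustrating only the last bullet of the algorithm.
final class StaticFieldSelection {
    private StaticFieldSelection() {}

    // pointsTo: static field name -> abstract objects the field may point to.
    // cm: abstract objects from which a mutated object can be reached.
    static Set<String> select(Map<String, Set<String>> pointsTo, Set<String> cm) {
        Set<String> selected = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> entry : pointsTo.entrySet()) {
            // Select fld when PointsToSet(fld) and C_m share at least one object.
            if (!Collections.disjoint(entry.getValue(), cm)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }
}
```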

12 Step 3: Recover after the Checkpoint
• Replaying only at decision points and the checkpoint is not enough to guarantee correct execution after the checkpoint
• Additionally record/recover local variables that will be read after each call site in the CC-chain

    void main() {
        Set hs = new HashSet();
        B b = new B(hs);
        //-- reco/rest (type_of(b))
        b.m();
        //-- extra reco/rest (hs)
        if (hs == b.s) { … }
    }

    class B {
        Set s;
        void m() {
            B r0 = this;
            r0.s = new HashSet();
            //-- checkpoint
            //-- reco/rest (r0)
            r0.s.add("");
        }
    }

Without the extra record/restore after b.m(), hs is uninitialized during replay.
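The comparison hs == b.s in the example depends on object identity, so restored values must keep shared references shared. The sketch below uses standard Java serialization to show one way this can work: objects written to the same stream keep their aliasing when read back. The slides do not say which mechanism the tool uses, so the Snapshot class is a hypothetical stand-in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical snapshot helper; assumes every reachable object is Serializable.
final class Snapshot {
    private Snapshot() {}

    // Writes all roots to one stream so references shared between them stay shared.
    static byte[] save(Object... roots) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(roots);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the roots back in the order they were saved.
    static Object[] restore(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Object[]) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Two roots that refer to the same object before saving refer to the same (new) object after restoring. Values written to separate streams would instead come back as unrelated copies, which matters when state is saved at several points along the call chain.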

13 Additional Issues
• A checkpoint can have multiple run-time instances
• If a method in the CC-chain has callers that are not in the chain, it has to be replicated
• Multi-threaded programs are currently not supported
• Our technique does not guarantee the correctness of the execution when the post-checkpoint part of the program
  - Depends on external resources, such as files or databases
  - Depends on unique-per-execution values, such as the clock
  - Is modified with new cross-checkpoint dependencies
• Multiple execution regions
  - Designated by a starting point and an ending point
  - Specified by two CC-chains

14 Study 1: Static Analysis

    Program        #R    #IP
    jb-6.1          3      5
    jlex-1.2.6      3      8
    db              2      5
    jtar-1.21       2      4
    jflex           2      8
    violet          4      9
    jess            3      8
    sablecc         4     11
    javacup         4      9
    soot-2.2.3     10     35
    raytrace        3     10
    socksecho       3     14
    socksproxy      3     11
    compress        1      6
    muffin          3     20

15 Static Analysis: Locals Reduction

16 Static Analysis: Static Fields Reduction

17 Static Analysis: Removed/Inserted Statements

18 Static Analysis Cost
• Phase 1: Soot infrastructure cost
  - Between 1.64 ms and 30.6 ms per thousand Jimple statements
  - On average, 11.1 ms per thousand statements
• Phase 2: Our analysis cost
  - Between 1.67 ms and 26.6 ms per thousand Jimple statements
  - On average, 9.4 ms per thousand statements
• This should be amortized across multiple runs of the replaying version

19 Study 2: Run-Time Performance (compress)
• Original program: compressing and decompressing 5 big tar files several times
• Evaluated for five checkpoint definitions
  - One checkpoint, close to the beginning of the program
  - Two regions of compression and decompression
  - A region containing the process of compression
  - A region containing the process of decompression
  - One checkpoint, close to the end of the program

20 compress Performance
• Normalized running time
• Normalized size of captured program state

21 Study 2: Run-Time Performance (soot)
• Input: soot-2.2.3 itself containing 2227333 methods
• Phases
  - Enabling cg.spark, wjtp, wjop.ji, wjap.uft, jtp, jop.cp
• Evaluated for six checkpoint definitions
  - Before whole-program packs
  - After cg
  - After wjtp
  - After wjop
  - After wjap
  - After body packs

22 soot Performance
• Normalized running time
• Normalized size of captured program state

23 Study 2: Run-Time Performance (jflex-1.4.1)
• Input: a .flex grammar file corresponding to a DFA containing 21769 states
• Evaluated for four checkpoint definitions
  - After NFA is generated
  - After NFA is converted to DFA
  - After minimization
  - After emission

24 jflex Performance
• Normalized running time
• Normalized size of captured program state

25 Summary of Evaluation
• Static analysis successfully reduces the size of program state recorded and recovered
• It is more meaningful to checkpoint/replay long-running programs
• Checkpoints are better taken after a phase of long computation with (relatively) small output state
  - √ compress: small program state, short running time
  - √ soot: large program state, but very long computation time
  - X jflex: large program state, short running time

26 Conclusions
• A static-analysis-based checkpointing/replaying technique
• An implementation and an evaluation showing that our technique can be an interesting candidate for testing, debugging, and dynamic slicing of long-running programs
• Future work
  - Language-level checkpointing/replaying of multi-threaded programs
  - More precise static analyses could be employed to reduce the size of program state to be captured
  - The run-time support for object reading and writing could be improved

27 Questions?

