Interprocedural Dataflow Analysis in the Presence of Large Libraries

Presentation transcript:

Interprocedural Dataflow Analysis in the Presence of Large Libraries
Atanas (Nasko) Rountev and Scott Kagan, PRESTO Research Group, Ohio State University
Thomas Marlowe, Seton Hall University

3/30/06, CC 2006, Scott Kagan, PRESTO Research Group

Slide 2: Uses of Interprocedural Dataflow Analysis
• Performance optimizations in compilers
• Software understanding and transformation
  - e.g. dependence analysis for program slicing, change impact analysis, refactoring, etc.
• Software testing
  - e.g. dataflow-based testing; testing of object interactions in OO software
• Software checking
  - e.g. object protocols: open (read | write)* close

Slide 3: Model for Interprocedural Whole-Program Analysis
• Components C_1, C_2, …, C_n form a complete program
• Assumption: it is possible and desirable to analyze the source code of the entire program
[Figure: code for C_1, code for C_2, …, code for C_n → engine for whole-program dataflow analysis → dataflow solution for C_1 + C_2 + … + C_n]

Slide 4: A Specific Case: Main + Lib
• Main + Lib form a complete program
• What if we are using large libraries that need to be re-analyzed from scratch?
  - e.g. the standard Java libraries contain about 10,000 classes and 80,000 methods
  - need to be re-analyzed with every new Main component
[Figure: code for Main, code for Lib → engine for whole-program dataflow analysis → dataflow solution for Main + Lib]

Slide 5: Example: Methods in Java Programs

Slide 6: A Specific Case: Main + Lib
• Goal: the solution for Main should be as good as the solution that would have been computed by a whole-program analysis (no loss of precision)
[Figure: code for Lib → summary generation analysis → summary for Lib; code for Main and summary for Lib → engine for whole-program dataflow analysis → dataflow solution for Main]

Slide 7: Functional Approach to Whole-Program Analysis
• Sharir-Pnueli 1981
• Dataflow lattice $L$
• Edge function $f: L \to L$ for the effects of a statement
• Path function: $f = f_n \circ f_{n-1} \circ \dots \circ f_2 \circ f_1$
• Phase 1: summary functions $\varphi_n: L \to L$
  - solution at node $n$ as a function of the solution at the entry of $n$'s procedure
• Phase 2: solutions at start nodes of procedures
• Phase 3: solutions at the remaining nodes
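The path function on this slide can be illustrated with a small sketch. It assumes a gen/kill dataflow problem over sets of facts, which is only one possible choice of lattice and is not taken from the slides; the names `GenKill`, `after`, and `path_function` are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class GenKill:
    """Assumed edge function f(x) = (x - kill) | gen over sets of facts."""
    gen: frozenset
    kill: frozenset

    def __call__(self, facts):
        return (frozenset(facts) - self.kill) | self.gen

    def after(self, first):
        """Return the composition self ∘ first (apply `first`, then `self`)."""
        return GenKill(gen=(first.gen - self.kill) | self.gen,
                       kill=first.kill | self.kill)


IDENTITY = GenKill(frozenset(), frozenset())


def path_function(edge_functions):
    """Compose f_n ∘ ... ∘ f_1 for edge functions given in execution order."""
    return reduce(lambda acc, f: f.after(acc), edge_functions, IDENTITY)


# Hypothetical three-edge path.
f1 = GenKill(frozenset({"a"}), frozenset())
f2 = GenKill(frozenset(), frozenset({"a", "b"}))
f3 = GenKill(frozenset({"c"}), frozenset())
path = path_function([f1, f2, f3])
```

Because the composition of two gen/kill functions is again a gen/kill function, the whole path collapses into a single pair of sets. Applying `path` to an input set gives the same result as applying `f1`, `f2`, and `f3` one after another.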

Slide 8: Example: Functional Approach
• $\varphi_{28} = f_8 \circ f_7$
• $\varphi_{21} = (f_4 \wedge f_5) \circ (\varphi_{28} \circ f_6)$
• $\varphi_{13} = (\varphi_{21} \circ f_2) \wedge (\varphi_{21} \circ f_3)$
• $\varphi_{6} = \varphi_{13} \circ f_1 \circ f_0$

Slide 9: Callbacks
• Callbacks
  - e.g. function pointers in C
  - e.g. virtual dispatch in C++ and Java
• Can no longer determine $\varphi_{21}$ and $\varphi_{13}$ without the code for ext

Slide 10: Library Summary
• Idea: run "pieces" of phase 1
• Compute functions for sets of library-local paths
[Figure: library paths labeled with the functions $\varphi = \mathrm{id}$, $\varphi = f_8 \circ f_7 \circ f_6$, $\varphi = f_4 \wedge f_5$, $\varphi = f_2 \wedge f_3$, $\varphi = \mathrm{id}$]

Slide 11: Library Summary Generation
• "Fixed" call in the library
  - always invokes the same library procedure, independent of the code for the main component
• "Fixed" procedure in the library
  - makes no calls, or
  - makes only fixed calls, to fixed procedures
  - standard functional approach can be applied
• For any other procedure, compute $\varphi_{k,n}$, where
  - $k$ is the start node, or
  - $k$ is a return from a non-fixed call, or
  - $k$ is a return from a fixed call to a non-fixed procedure
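The definition of fixed procedures is recursive, so one way to evaluate it is as a fixed-point computation over the library call graph. The sketch below is an illustration under stated assumptions, not the authors' implementation: the call-graph encoding and the name `fixed_procedures` are hypothetical, and it assumes the greatest fixed point is intended, so a recursive procedure that makes only fixed calls still counts as fixed.

```python
def fixed_procedures(calls):
    """Return the set of fixed procedures.

    `calls` maps each library procedure to a list of (callee, is_fixed_call)
    pairs; `callee` is None when the target of a non-fixed call is unknown.
    """
    fixed = set(calls)  # optimistic start: assume every procedure is fixed
    changed = True
    while changed:
        changed = False
        for proc in list(fixed):
            for callee, is_fixed_call in calls[proc]:
                # A non-fixed call, or a call to a non-fixed procedure,
                # disqualifies the caller.
                if not is_fixed_call or callee not in fixed:
                    fixed.discard(proc)
                    changed = True
                    break
    return fixed


# Hypothetical encoding: p1 makes a fixed call to p2; p2 makes a fixed call
# to p3 and one non-fixed call (a callback); p3 makes no calls.
library_calls = {
    "p1": [("p2", True)],
    "p2": [("p3", True), (None, False)],
    "p3": [],
}
fixed = fixed_procedures(library_calls)
```

With this encoding only `p3` remains fixed, while `p1` and `p2` are non-fixed: `p2` because of its callback, and `p1` because it calls `p2`.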

Slide 12: Example: Library Summary Generation
• Fixed calls: two (marked in the slide's figure)
• Non-fixed calls: one (marked in the slide's figure)
• Fixed procedures: p3
• Non-fixed procedures: p1 and p2
• Contexts $k$ for $\varphi_{k,n}$
  - 7 and 14: start nodes
  - 17: return from a non-fixed call
  - 12: return from a fixed call to a non-fixed procedure

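Slide 13: The Condensed Graph
[Figure: condensed graph for the library procedures, with edges labeled $\varphi = \mathrm{id}$, $\varphi = f_8 \circ f_7 \circ f_6$, $\varphi = f_4 \wedge f_5$, $\varphi = f_2 \wedge f_3$, $\varphi = \mathrm{id}$]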

Slide 14: Analysis of a Main Component
• Create a "fake" graph for the whole program
• Run a whole-program analysis engine
• Safe solutions for non-library nodes
  - precise for distributive problems
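The distinction between "safe" and "precise" turns on distributivity: a flow function f is distributive when f(x ∧ y) = f(x) ∧ f(y). The sketch below contrasts two textbook cases that are not taken from the slides. A gen/kill function over sets is distributive, while constant propagation is not. All names here (`NAC`, `meet`, `assign_sum`, `gen_kill`) are hypothetical.

```python
NAC = "not-a-constant"  # hypothetical marker: the variable has no single value


def meet(env1, env2):
    """Merge two constant environments: keep a value only where both agree."""
    return {v: env1[v] if env1[v] == env2[v] else NAC for v in env1}


def assign_sum(env, target, left, right):
    """Constant-propagation flow function for `target = left + right`."""
    out = dict(env)
    a, b = env[left], env[right]
    out[target] = NAC if NAC in (a, b) else a + b
    return out


def gen_kill(facts, gen=frozenset({"d1"}), kill=frozenset({"d2"})):
    """Gen/kill flow function over sets, where merging is set union."""
    return (facts - kill) | gen


# Two paths reach the statement c = a + b with different constants.
x = {"a": 1, "b": 2, "c": 0}
y = {"a": 2, "b": 1, "c": 0}
merged_first = assign_sum(meet(x, y), "c", "a", "b")
merged_last = meet(assign_sum(x, "c", "a", "b"), assign_sum(y, "c", "a", "b"))

s, t = frozenset({"d2", "d3"}), frozenset({"d4"})
distributive = gen_kill(s | t) == gen_kill(s) | gen_kill(t)
```

Merging before the assignment loses the fact that `c` is 3 on both paths, so `merged_first["c"]` is `NAC` while `merged_last["c"]` is 3. The result is still safe, only less precise. For the gen/kill function the two orders agree, which is the situation in which composing and merging summary functions loses nothing.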

Slide 15: Original vs. Condensed Library CFGs: Number of Nodes

Slide 16: Original vs. Condensed Library CFGs: Number of Edges

Slide 17: Discussion
• Flow and context insensitivity
• Cost reduction: time and memory
• Compact representation of functions
  - IFDS, IDE
• Use assumptions about the callback methods?
  - e.g. assume callback methods are "good"