PRESTO: Program Analyses and Software Tools Research Group, Ohio State University. Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to Analysis

Presentation transcript:

PRESTO: Program Analyses and Software Tools Research Group, Ohio State University

Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to Analysis
Guoqing Xu and Atanas Rountev
Ohio State University
Supported by NSF under CAREER grant CCF

Precise and Scalable Points-to Analysis
• Analysis precision
  - Calling context sensitivity, e.g., a chain of call sites
  - Heap cloning [Nystrom-PASTE’04, Lhotak-CC’06]
  - The most precise analysis is the refinement-based analysis [Sridharan-PLDI’06], but it does not scale well
• Analysis scalability
  - Millions of distinct call chains in a moderate-size Java program
  - Option 1: sacrifice precision with k-length chains
  - Option 2: merge equivalent relationships using BDDs
  - BDDs incur running-time overhead and may not scale for heap-cloning-based analysis

Motivation
• Many clients require the underlying pointer analysis to quickly provide a precise points-to solution for all variables (e.g., slicing, verification)
• Example:
  - The refinement-based analysis took 1000 sec to provide a precise solution for all variables in a 30-line Java program (including variables in all relevant library code)
  - The solution produced under a 500-second budget made our slicer generate a slice containing the whole program
Goal: a relatively precise result in a more scalable manner

Merge Equivalent Contexts
• Equivalence classes exist in the representation of calling contexts [Lhotak-CC’06]
• Can we find and merge equivalent calling contexts?
  - We would be able to scale the points-to analysis without relying on the merging inside the BDD “black box”
• A unique replacement context (URC) can be used to represent an equivalence class in the points-to relationships

Outline
• A model of equivalent contexts
  - Abstraction functions for pointer variables and targets
  - Proposed for pointer analysis, but can be applied to other context-sensitive analysis algorithms
• A whole-program points-to analysis for Java
  - Implements the model
  - Context-sensitive for both pointer variables and targets
  - Does not limit the length of context strings (not k-CFA)
  - Bottom-up, summary-based
• Experimental evaluation
  - Much more precise and efficient than the state-of-the-art 1-object-sensitive analysis with BDDs [Lhotak-CC’06]
  - More efficient than the refinement-based analysis

Motivating Example

    void main(String[] args) {
        A a1 = new A();
        A a2 = new A();
        foo(a1);        // call site 1
        foo(a2);        // call site 2
    }

    void foo(A a) {
        B t = new B();
        a.fld = t;
        bar(a.fld);     // call site 3
    }

    B bar(B b) {
        B p = b;
        return p;
    }

Context-sensitive points-to relationships (superscripts are calling contexts):
    p^(1,3) ---> new B^(1,3)
    p^(2,3) ---> new B^(2,3)
    t^1 ---> new B^1
    t^2 ---> new B^2

Observations:
1. t points to new B under all calling contexts
2. p points to new B under calling contexts (*,3)

A Better Representation?
• Can we represent the points-to relationships like this?
  - t ---> new B
  - p^3 ---> new B
  copy of t and p, 2 copies of new B
• Key insights
  - Context sensitivity corresponds to inlining; full context sensitivity is achieved if all reachable methods are inlined in main
  - If a points-to relationship can be determined at method m during inlining, it will not be affected by m’s callers

Two Hypothetical Inlining-based Analyses
• (v, c_v, o, c_o) represents a fully context-sensitive points-to relationship; v is a local variable and o is an allocation site
• Analysis 1: all-the-way inlining
  - A statement p := q is rewritten as p^(e) := q^(e) at call graph edge e
  - An intraprocedural points-to analysis is performed on the body of main
• Analysis 2: a variation of analysis 1
  - An intraprocedural analysis is performed immediately after a method m inlines all its call sites
  - Suppose (v, c_v, o, c_o) is produced for m
  - (v, (e) ∘ c_v, o, (e) ∘ c_o) must be produced later for m’s caller n; e is the call graph edge between n and m

Calling Context Reduction
• Inlining preserves context-sensitive points-to relationships
  - All statements in m are also in n
• The lifetime of a tuple (v, c_v, o, c_o) consists of
  - A single creation event in some method m: the flowing point
  - A sequence of inlining steps that increase both c_v and c_o in sync
• URCs are computed by abstraction functions for calling contexts; m is the flowing point
  - F_v(c_v) = (e_0, e_1, …, e_i) where e_0.src = m
  - F_o(c_o) = (f_0, f_1, …, f_j) where f_0.src = m
  - Keep only the relevant suffix of the call chain
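The context reduction above can be illustrated with a minimal sketch. Everything here is hypothetical (the `Edge` class, the method names, and the representation of a context as a list of call graph edges are assumptions for illustration, not the authors' implementation), and the chain is assumed to be acyclic so that at most one edge leaves the flowing point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the abstraction functions F_v / F_o; not the authors' code.
final class UrcSketch {
    /** A call graph edge: call site id, caller (src), callee (tgt). */
    static final class Edge {
        final int site;
        final String src;
        final String tgt;

        Edge(int site, String src, String tgt) {
            this.site = site;
            this.src = src;
            this.tgt = tgt;
        }

        @Override
        public String toString() {
            return site + ":" + src + "->" + tgt;
        }
    }

    /** One inlining step: (e) composed with c, i.e. the caller's edge is prepended. */
    static List<Edge> inline(Edge e, List<Edge> context) {
        List<Edge> extended = new ArrayList<>();
        extended.add(e);
        extended.addAll(context);
        return extended;
    }

    /**
     * Keeps the suffix of the chain starting at the edge whose source is the
     * flowing point m. If no edge leaves m, the variable or object lives in m
     * itself and the reduced context is empty.
     */
    static List<Edge> urc(List<Edge> chain, String flowingPoint) {
        for (int i = 0; i < chain.size(); i++) {
            if (chain.get(i).src.equals(flowingPoint)) {
                return new ArrayList<>(chain.subList(i, chain.size()));
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        Edge site1 = new Edge(1, "main", "foo");
        Edge site3 = new Edge(3, "foo", "bar");
        // p lives in bar; the tuple (p, new B) is created once bar is inlined into foo.
        List<Edge> cv = inline(site1, Arrays.asList(site3)); // context (1,3)
        List<Edge> co = inline(site1, new ArrayList<Edge>()); // context (1)
        System.out.println(urc(cv, "foo")); // reduced context for p
        System.out.println(urc(co, "foo")); // reduced context for new B
    }
}
```

On the motivating example, with foo as the flowing point, the context (1,3) of p reduces to (3) and the context (1) of new B reduces to the empty context, which matches the compact form p^3 ---> new B.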

Approximation in the Presence of Recursion
• Two-phase approximation: (1) map an infinite call chain to a finite one, collapsing cycles while going backwards; (2) then use the abstraction function shown earlier
• Precision loss may result from phase 1: e.g., p points to o under context eabcdabf, but does not under context eabf
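The slides do not spell out how phase 1 collapses cycles, so the following is only one plausible reading, offered as a sketch: contexts are strings of single-letter edge labels (as in the example eabcdabf), the chain is scanned from its last edge to its first, and a repeated edge label is assumed to mark a cycle whose body is dropped. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of phase 1; the cycle criterion (a repeated edge label) is an assumption.
final class CycleCollapseSketch {
    /** Collapses cycles in a call chain while walking it backwards. */
    static String collapse(String chain) {
        // Head of the deque is the earliest edge kept so far.
        Deque<Character> kept = new ArrayDeque<>();
        for (int i = chain.length() - 1; i >= 0; i--) {
            char edge = chain.charAt(i);
            if (kept.contains(edge)) {
                // Drop everything kept since the other occurrence of this edge.
                while (kept.peek() != edge) {
                    kept.pop();
                }
            } else {
                kept.push(edge);
            }
        }
        StringBuilder out = new StringBuilder();
        for (char edge : kept) { // iterates from the first edge to the last
            out.append(edge);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapse("eabcdabf")); // prints "eabf"
    }
}
```

Under this reading, eabcdabf and eabf map to the same finite chain, which is why a fact that holds only under eabcdabf can no longer be told apart from one under eabf.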

Using the Reduced Contexts
• A URC is used to represent a set of calling contexts in the points-to relationships
• A query (v, c) can be answered as follows:
  - Find all (v, urc_v, o, urc_o) such that substring(urc_v, c) holds
  - Return all (o, c_o) such that substring(urc_o, c_o) holds
• A BDD may not be as effective for an analysis with heap cloning
  - Without heap cloning, an equivalence class is defined by a single string urc_v
  - For an analysis with heap cloning, an equivalence class is defined by a pair (urc_v, urc_o)
  - More classes = fewer opportunities for merging
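A minimal sketch of the query step described above, under stated assumptions: contexts are lists of call-site ids, substring(urc, c) is read as "urc occurs contiguously inside c", and a matching tuple is returned with its urc_o standing for the whole class of object contexts c_o. The `Tuple` class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of answering a query (v, c) against URC-based tuples.
final class UrcQuerySketch {
    /** A stored relationship (v, urc_v, o, urc_o). */
    static final class Tuple {
        final String var;
        final List<Integer> urcV;
        final String obj;
        final List<Integer> urcO;

        Tuple(String var, List<Integer> urcV, String obj, List<Integer> urcO) {
            this.var = var;
            this.urcV = urcV;
            this.obj = obj;
            this.urcO = urcO;
        }
    }

    /** substring(urc, c): urc appears as a contiguous run inside c (empty urc always matches). */
    static boolean substring(List<Integer> urc, List<Integer> context) {
        return Collections.indexOfSubList(context, urc) >= 0;
    }

    /** Returns every stored tuple for v whose urc_v is compatible with the queried context c. */
    static List<Tuple> query(List<Tuple> solution, String v, List<Integer> c) {
        List<Tuple> hits = new ArrayList<>();
        for (Tuple t : solution) {
            if (t.var.equals(v) && substring(t.urcV, c)) {
                hits.add(t);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> empty = Collections.emptyList();
        List<Tuple> solution = Arrays.asList(
                new Tuple("t", empty, "new B", empty),
                new Tuple("p", Arrays.asList(3), "new B", empty));
        for (Tuple t : query(solution, "p", Arrays.asList(1, 3))) {
            System.out.println(t.var + " -> " + t.obj + " under URC " + t.urcO);
        }
    }
}
```

With the two tuples from the motivating example, the queries (p, (1,3)) and (p, (2,3)) both hit the single stored tuple for p, so only one tuple is kept per equivalence class.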

Points-to Analysis
• A specific algorithm that implements this model
  - Uses URCs to represent calling contexts
  - The use of URCs could be applicable to other categories of points-to analysis
• Resembles bottom-up inlining
• Heap-cloning-based
  - Treats both pointer variables and targets context-sensitively
• Partial unification (bi-directional flow of values)

Intraprocedural Analysis
• Symbolic points-to graph (SPG)
  - A symbolic object node is introduced for each (1) formal parameter, (2) base variable v of a load a = v.f, and (3) left-hand side v of a call site v = a.b(…)
  - A standard points-to analysis algorithm [Lhotak-CC’03] is used for SPG construction
  - The SPG contains far fewer nodes and edges than the original program
• Example

    void add(Integer t) {
        this.names = t;
    }

Escape Analysis
• An allocation node new C or symbolic node SO directly escapes a method if
  - It is pointed to by a formal parameter
  - It is pointed to by a returned variable
  - It is pointed to by a static field
• A node indirectly escapes a method if
  - It is reachable from nodes that directly escape
• Compute the set of allocation/symbolic nodes that escape the method where they are defined
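The escape rules above amount to graph reachability from the directly escaping nodes. The sketch below assumes the directly escaping nodes have already been collected into a root set and that heap edges are given as an adjacency map; the class name, method name, and node labels are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: escaping nodes = nodes reachable from the directly escaping ones.
final class EscapeSketch {
    /**
     * @param roots      nodes pointed to by a formal parameter, a returned variable,
     *                   or a static field (directly escaping)
     * @param heapEdges  points-to edges between heap nodes (node to the nodes its fields reference)
     * @return all directly and indirectly escaping nodes
     */
    static Set<String> escaping(Set<String> roots, Map<String, List<String>> heapEdges) {
        Set<String> escaped = new LinkedHashSet<>(roots);
        Deque<String> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            String node = worklist.poll();
            for (String succ : heapEdges.getOrDefault(node, Collections.<String>emptyList())) {
                if (escaped.add(succ)) { // first visit: the node indirectly escapes
                    worklist.add(succ);
                }
            }
        }
        return escaped;
    }

    public static void main(String[] args) {
        Map<String, List<String>> heapEdges = new HashMap<>();
        heapEdges.put("SO_this", Arrays.asList("new A"));
        heapEdges.put("new A", Arrays.asList("new B"));
        heapEdges.put("new C", Arrays.asList("new D")); // purely local structure
        Set<String> roots = new HashSet<>(Arrays.asList("SO_this"));
        System.out.println(escaping(roots, heapEdges));
    }
}
```

Nodes such as new C and new D in the usage example never become reachable from a root, so they stay out of the method's summary.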

Interprocedural Analysis
• Summary-based
  - Traverse the call graph SCC-DAG bottom-up
• Summary function definition:
  - {[O_f, G_f]}
  - G_f: the subgraph of all escaping objects (reachable from O_f) and their points-to edges
• Clone a summary function for each incoming call graph edge e
  - If o_1^c1 ---> o_2^c2 where o_1 and o_2 escape: in the caller, create o_1^((e) ∘ c1) ---> o_2^((e) ∘ c2)
  - If v^c1 ---> s^c2 where s is an escaping symbolic node: create v^((e) ∘ c1) ---> s^((e) ∘ c2)

Interprocedural Analysis
• Composition of summary functions
  - For each [O_actual, G_actual]
  - Find [O’_formal, G’_formal]
  - Merge G_actual and G’_formal
• Subgraph merging
  - Simultaneously traverse G_actual and G’_formal from O_actual and O’_formal
  - Merge b and c if (edge condition shown graphically on the slide), where d and e have been merged
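The exact merge condition is not recoverable from the transcript. The sketch below assumes the usual unification rule: once d and e are merged, their targets b and c are merged whenever d and e reach them through the same field. It also simplifies each node to at most one target per field. All names are hypothetical, and this is an illustration of partial unification with a union-find structure rather than the authors' algorithm.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of simultaneous subgraph traversal with node merging.
final class SubgraphMergeSketch {
    private final Map<String, String> parent = new HashMap<>();
    // node -> (field name -> target node); assumption: one target per field
    private final Map<String, Map<String, String>> fields = new HashMap<>();

    void addEdge(String from, String field, String to) {
        fields.computeIfAbsent(from, k -> new HashMap<>()).put(field, to);
    }

    /** Representative of the merged class containing n (with path compression). */
    String find(String n) {
        String p = parent.getOrDefault(n, n);
        if (p.equals(n)) {
            return n;
        }
        String root = find(p);
        parent.put(n, root);
        return root;
    }

    /** Merges the subgraph rooted at formal into the one rooted at actual. */
    void merge(String actual, String formal) {
        Deque<String[]> worklist = new ArrayDeque<>();
        worklist.add(new String[] {actual, formal});
        while (!worklist.isEmpty()) {
            String[] pair = worklist.poll();
            String a = find(pair[0]);
            String b = find(pair[1]);
            if (a.equals(b)) {
                continue; // already in the same class (also stops on cycles)
            }
            parent.put(b, a);
            Map<String, String> fa = fields.computeIfAbsent(a, k -> new HashMap<>());
            Map<String, String> fb = fields.getOrDefault(b, Collections.<String, String>emptyMap());
            for (Map.Entry<String, String> e : fb.entrySet()) {
                String existing = fa.putIfAbsent(e.getKey(), e.getValue());
                if (existing != null) {
                    // Same field on both merged nodes: their targets must be merged too.
                    worklist.add(new String[] {existing, e.getValue()});
                }
            }
        }
    }

    public static void main(String[] args) {
        SubgraphMergeSketch g = new SubgraphMergeSketch();
        g.addEdge("O_actual", "f", "X");
        g.addEdge("O_formal", "f", "S1");
        g.merge("O_actual", "O_formal");
        System.out.println(g.find("S1")); // representative of S1 after merging
    }
}
```

Because merging two nodes can force their same-field successors to merge as well, the worklist propagates merges until the two subgraphs agree, which is the bi-directional flow that partial unification refers to.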

Merging Two Nodes [figure]

Interprocedural Analysis
• The points-to solution is built on the fly
  - Once v^c1 ---> o^c2 is formed, (v, c_1, o, c_2) is added to the points-to solution
  - The edge is removed from the SPG: we have found the flowing point and no further propagation is necessary
  - (c_1, c_2) defines an equivalence class for contexts
• Node merging is essentially a dynamic transitive closure computation for identifying memory aliases

K-Last-Substring-Merging
• A scalability-precision tradeoff
  - When composing summary functions, do not clone node o^c1 in the callee if there already exists a node o^c2 in the caller such that suffix(c_1, k) == suffix(c_2, k)
• Does not limit the context length
• Can be used to tune the amount of context merging throughout the program
• Example
  - k = 3
  - O^(defg) and O^(hefg) are distinct nodes if d.src != h.src
  - O^(defg) and O^(hefg) are merged if d.src == h.src

Experiments
• The benchmark set contains 19 Java programs from SPECJVM, Ashes, and DaCapo
• Experimentally compared our analysis with
  - Refinement: the refinement-based analysis from [Sridharan-PLDI’06] with its default budget for refinement
      · the most precise publicly available analysis for computing an on-demand solution
      · queried the points-to sets for all possible (v, c)
  - 1H: 1-object-sensitive analysis with heap cloning, using BDDs [Lhotak-CC’06]; the most precise publicly available analysis for computing a whole-program solution
• Our analysis: computes a whole-program solution
  - 1Equiv: 1-last-substring-merging
  - 2Equiv: 2-last-substring-merging

Precision: #downcasts proven to be safe [chart]
2Equiv proves 59% more downcasts safe than 1H, but fewer than Refinement

Cost: time to get a whole-program solution [chart]
2Equiv is 13 times faster than 1H and 5.3 times faster than Refinement

Conclusions
• Context equivalence class identification
  - An equivalence class is represented by a URC
  - Only URCs need to be explicitly represented in the data structures of the analysis
• A points-to analysis for Java
  - Uses URCs to represent contexts
  - Summary-based, bottom-up, with partial unification
  - Heap-cloning
• Experimental evaluation
  - Precision approaches that of the refinement-based analysis, and is much higher than that of 1H
  - Faster than both the refinement-based analysis and 1H