PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.


1 Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to Analysis
Guoqing Xu and Atanas Rountev
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University
Supported by NSF under CAREER grant CCF-0546040

2 Precise and Scalable Points-to Analysis
• Analysis precision
  - Calling context sensitivity, e.g. chain of call sites
  - Heap cloning [Nystrom-PASTE’04, Lhotak-CC’06]
  - The most precise analysis: refinement-based analysis [Sridharan-PLDI’06], but it does not scale well
• Analysis scalability
  - Millions of distinct call chains in a moderate-size Java program
  - Option 1: sacrifice precision with k-length chains
  - Option 2: merge equivalent relationships using BDDs
  - BDDs incur running-time overhead and may not scale for heap-cloning-based analysis

3 Motivation
• Many clients require the underlying pointer analysis to quickly provide a precise points-to solution for all variables (e.g., slicing; verification)
• Example:
  - The refinement-based analysis took 1000 sec to provide a precise solution for all variables in a 30-line Java program (including variables in all relevant library code)
  - The solution produced under a 500-second budget made our slicer generate a slice containing the whole program
Goal: a relatively precise result in a more scalable manner

4 Merge Equivalent Contexts
• Equivalence classes exist in the representation of calling contexts [Lhotak-CC’06]
• Can we find and merge equivalent calling contexts?
  - We would be able to scale the points-to analysis without relying on the merging inside the BDD “black box”
• A unique replacement context (URC) can be used to represent an equivalence class in the points-to relationships

5 Outline
• A model of equivalent contexts
  - Abstraction functions for pointer variables and targets
  - Proposed for pointer analysis, but can be applied to other context-sensitive analysis algorithms
• A whole-program points-to analysis for Java
  - Implements the model
  - Context-sensitive for both pointer variables and targets
  - Does not limit the length of context strings (not k-CFA)
  - Bottom-up, summary-based
• Experimental evaluation
  - Much more precise and efficient than state-of-the-art 1-object-sensitive analysis with BDDs [Lhotak-CC’06]
  - More efficient than the refinement-based analysis

6 Motivating Example
    void main(String[] args) {
        A a1 = new A();
        A a2 = new A();
        foo(a1); // call site 1
        foo(a2); // call site 2
    }
    void foo(A a) {
        B t = new B();
        a.fld = t;
        bar(a.fld); // call site 3
    }
    B bar(B b) {
        B p = b;
        return p;
    }
Context-sensitive points-to relationships:
    p^(1,3) ---> new B^(1,3)
    p^(2,3) ---> new B^(2,3)
    t^(1) ---> new B^(1)
    t^(2) ---> new B^(2)
Observations:
  1. t points to new B under all calling contexts
  2. p points to new B under calling contexts (*,3)

7 A Better Representation?
• Can we represent the points-to relationships like this?
  - t ---> new B
  - p^(3) ---> new B^(3)
  - 1 copy of t and p, 2 copies of new B
• Key insights
  - Context-sensitivity corresponds to inlining; full context-sensitivity is achieved if all reachable methods are inlined in main
  - If a points-to relationship can be determined at method m during inlining, it will not be affected by m’s callers

8 Two Hypothetical Inlining-based Analyses
• (v, c_v, o, c_o) represents a fully context-sensitive points-to relationship; v is a local variable; o is an allocation site
• Analysis 1: all-the-way inlining
  - A statement p := q is rewritten as p^(e) := q^(e) at call graph edge e
  - An intraprocedural points-to analysis is performed on the body of main
• Analysis 2: a variation of analysis 1
  - An intraprocedural analysis is performed immediately after a method m inlines all its call sites
  - Suppose (v, c_v, o, c_o) is produced for m
  - (v, (e) ∘ c_v, o, (e) ∘ c_o) must be produced later for m’s caller n; e is the call graph edge between n and m

9 Calling Context Reduction
• Inlining preserves context-sensitive points-to relationships
  - All statements in m are also in n
• The lifetime of a tuple (v, c_v, o, c_o) consists of
  - A single creation event in some method m: the flowing point
  - A sequence of inlining steps that extend both c_v and c_o in sync
• URCs are computed by abstraction functions for calling contexts; m is the flowing point
  - F_v(c_v) = (e_0, e_1, …, e_i) where e_0.src = m
  - F_o(c_o) = (f_0, f_1, …, f_j) where f_0.src = m
  - Keep only the relevant suffix of the call chain
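The abstraction function on this slide can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the `CallEdge` record, the `UrcSketch` class, and the convention that a chain is ordered from the outermost caller to the innermost callee are hypothetical and not taken from the authors' implementation. The sketch also assumes an acyclic chain, so at most one edge leaves the flowing point.

```java
import java.util.List;

/** Hypothetical call graph edge: call site id, caller method (src), callee method (tgt). */
record CallEdge(String id, String src, String tgt) {}

final class UrcSketch {
    /**
     * Keeps only the suffix of the chain that starts at the edge whose source
     * is the flowing point. Assumes an acyclic chain, outermost caller first.
     */
    static List<CallEdge> reduce(List<CallEdge> chain, String flowingPoint) {
        for (int i = 0; i < chain.size(); i++) {
            if (chain.get(i).src().equals(flowingPoint)) {
                return List.copyOf(chain.subList(i, chain.size()));
            }
        }
        // No edge leaves the flowing point: the reduced context is empty.
        return List.of();
    }

    public static void main(String[] args) {
        CallEdge e1 = new CallEdge("1", "main", "foo");
        CallEdge e3 = new CallEdge("3", "foo", "bar");
        // p lives in bar; its relationship to new B is determined in foo.
        System.out.println(reduce(List.of(e1, e3), "foo"));
        // t lives in foo itself, so nothing of its context (1) is kept.
        System.out.println(reduce(List.of(e1), "foo"));
    }
}
```

On the motivating example, the context (1,3) of p reduces to (3) and the context (1) of t reduces to the empty chain, which matches the representation proposed on slide 7.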

10 Approximation in the Presence of Recursion
• Two-phase approximation: (1) map an infinite call chain to a finite one, collapsing cycles while going backwards; (2) then, use the abstraction function shown earlier
• Precision loss may result from phase 1: e.g., p points to o under context eabcdabf, but does not under context eabf
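The slide does not spell out how cycles are collapsed, so the following Java sketch is only one plausible reading of phase 1. It assumes a chain is a list of edge labels, walks it from the innermost edge backwards, and drops a cycle whenever an already-kept edge reappears. The class name `CycleCollapseSketch` and the stack-style procedure are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CycleCollapseSketch {
    /** Collapses repeated edges while scanning the chain from its last edge to its first. */
    static List<String> collapse(List<String> chain) {
        List<String> kept = new ArrayList<>(); // innermost edge first
        for (int i = chain.size() - 1; i >= 0; i--) {
            String edge = chain.get(i);
            int seen = kept.indexOf(edge);
            if (seen >= 0) {
                // The edge closes a cycle: discard everything kept after its earlier occurrence.
                kept.subList(seen + 1, kept.size()).clear();
            } else {
                kept.add(edge);
            }
        }
        Collections.reverse(kept); // restore outermost-first order
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(collapse(List.of("e", "a", "b", "c", "d", "a", "b", "f")));
    }
}
```

Under this reading, the chain eabcdabf from the slide collapses to eabf, which is exactly the merge that can lose precision.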

11 Using the Reduced Contexts
• A URC is used to represent a set of calling contexts in the points-to relationships
• A query (v, c) can be answered as follows:
  - Find all (v, urc_v, o, urc_o) such that substring(urc_v, c) holds
  - Return all (o, c_o) such that substring(urc_o, c_o) holds
• A BDD may not be as effective for an analysis with heap cloning
  - Without heap cloning, an equivalence class is defined by a single string urc_v
  - For an analysis with heap cloning, an equivalence class is defined by a pair (urc_v, urc_o)
  - More classes = fewer opportunities for merging
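The query procedure can be sketched as a filter over the stored tuples. The record names `PointsTo` and `Target` and the class `UrcQuerySketch` are hypothetical. The sketch treats substring(urc, c) as "urc occurs as a contiguous sublist of c" and returns each target together with its URC, which stands for every object context c_o that contains it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stored tuple (v, urc_v, o, urc_o). */
record PointsTo(String var, List<String> urcV, String obj, List<String> urcO) {}

/** Hypothetical answer: object o plus the URC describing its contexts. */
record Target(String obj, List<String> urcO) {}

final class UrcQuerySketch {
    static List<Target> query(List<PointsTo> solution, String v, List<String> c) {
        List<Target> result = new ArrayList<>();
        for (PointsTo t : solution) {
            // substring(urc_v, c): the reduced context must occur inside the queried context.
            if (t.var().equals(v) && Collections.indexOfSubList(c, t.urcV()) >= 0) {
                result.add(new Target(t.obj(), t.urcO()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<PointsTo> solution = List.of(
            new PointsTo("t", List.of(), "new B", List.of()),
            new PointsTo("p", List.of("3"), "new B", List.of("3")));
        System.out.println(query(solution, "p", List.of("1", "3")));
    }
}
```

An empty URC matches every queried context, which corresponds to a relationship that holds under all calling contexts.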

12 Points-to Analysis
• A specific algorithm that implements this model
  - Using URCs to represent calling contexts
  - The use of URCs could be applicable to other categories of points-to analysis
• Resembles bottom-up inlining
• Heap-cloning-based
  - Context-sensitively treats both pointer variables and targets
• Partial unification (bi-directional flow of values)

13 Intraprocedural Analysis
• Symbolic points-to graph (SPG)
  - A symbolic object node is introduced for each (1) formal parameter, (2) base variable v of a load a = v.f, and (3) lhs v of a call site v = a.b(…)
  - A standard points-to analysis algorithm [Lhotak-CC’03] is used for SPG construction
  - The SPG contains far fewer nodes and edges than the original program
• Example
    void add(Integer t) { this.names = t; }

14 Escape Analysis
• An allocation node new C or symbolic node SO directly escapes a method if
  - It is pointed to by a formal parameter
  - It is pointed to by a returned variable
  - It is pointed to by a static field
• A node indirectly escapes a method if
  - It is reachable from nodes that directly escape
• Compute the set of allocation/symbolic nodes that escape the method where they are defined
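The two escape rules amount to a reachability computation over the method's points-to graph. The Java sketch below assumes the directly escaping nodes have already been collected and that the graph is given as a map from a node to the nodes its fields point to. The class name `EscapeSketch`, the node names, and the map-based graph encoding are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class EscapeSketch {
    /** Returns the directly escaping nodes plus everything reachable from them. */
    static Set<String> escaping(Set<String> directlyEscaping, Map<String, List<String>> fieldEdges) {
        Set<String> escaped = new HashSet<>(directlyEscaping);
        Deque<String> worklist = new ArrayDeque<>(directlyEscaping);
        while (!worklist.isEmpty()) {
            String node = worklist.pop();
            for (String succ : fieldEdges.getOrDefault(node, List.of())) {
                if (escaped.add(succ)) { // first visit: the node escapes indirectly
                    worklist.push(succ);
                }
            }
        }
        return escaped;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = Map.of(
            "SO_a", List.of("new B"),      // a.fld = t in foo
            "new Local", List.of("new X")); // never reachable from a formal
        System.out.println(escaping(Set.of("SO_a"), edges));
    }
}
```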

15 Interprocedural Analysis
• Summary-based
  - Traverse the call graph SCC-DAG bottom-up
• Summary function definition:
  - {[O_f, G_f]}
  - G_f: the subgraph of all escaping objects (reachable from O_f) and their points-to edges
• Clone a summary function for each incoming call graph edge e
  - If o_1^(c1) --f--> o_2^(c2) where o_1 and o_2 escape: in the caller, create o_1^((e) ∘ c1) --f--> o_2^((e) ∘ c2)
  - If v^(c1) ---> s^(c2) where s is an escaping symbolic node: create v^((e) ∘ c1) ---> s^((e) ∘ c2)
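Cloning a summary for an incoming call graph edge e boils down to extending every context in the summary with e at the front. The Java sketch below illustrates only that step. The records `CtxNode` and `CtxEdge`, the class `SummaryCloneSketch`, and the use of strings for nodes and call sites are hypothetical simplifications, and field labels on edges are omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical graph node annotated with its calling context (outermost edge first). */
record CtxNode(String name, List<String> ctx) {
    /** Returns the copy of this node seen from the caller through call graph edge e. */
    CtxNode inlinedVia(String e) {
        List<String> extended = new ArrayList<>();
        extended.add(e);
        extended.addAll(ctx);
        return new CtxNode(name, List.copyOf(extended));
    }
}

/** Hypothetical points-to edge between two context-annotated nodes. */
record CtxEdge(CtxNode from, CtxNode to) {}

final class SummaryCloneSketch {
    static List<CtxEdge> cloneFor(String e, List<CtxEdge> summaryEdges) {
        List<CtxEdge> cloned = new ArrayList<>();
        for (CtxEdge edge : summaryEdges) {
            cloned.add(new CtxEdge(edge.from().inlinedVia(e), edge.to().inlinedVia(e)));
        }
        return cloned;
    }

    public static void main(String[] args) {
        CtxEdge inFoo = new CtxEdge(new CtxNode("SO_a", List.of()), new CtxNode("new B", List.of()));
        System.out.println(cloneFor("1", List.of(inFoo)));
        System.out.println(cloneFor("2", List.of(inFoo)));
    }
}
```

Both endpoints of an edge are extended with the same e, so their contexts grow in sync as the summary moves up the call graph.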

16 Interprocedural Analysis
• Composition of summary functions
  - For each [O_actual, G_actual]
  - Find [O’_formal, G’_formal]
  - Merge G_actual and G’_formal
• Subgraph merging
  - Simultaneously traverse G_actual and G’_formal from O_actual and O’_formal
  - Merge b and c if d --f--> b and e --f--> c, where d and e have been merged

17 Merging Two Nodes

18 Interprocedural Analysis
• The points-to solution is built on the fly
  - Once v^(c1) ---> o^(c2) is formed, (v, c_1, o, c_2) is added to the points-to solution
  - The edge is removed from the SPG: we have found the flowing point and no further propagation is necessary
  - (c_1, c_2) defines an equivalence class for contexts
• Node merging is essentially a dynamic transitive closure computation for identifying memory aliases

19 K-Last-Substring-Merging
• A scalability-precision tradeoff
  - When composing summary functions, do not clone node o^(c1) in the callee if there already exists a node o^(c2) in the caller such that suffix(c_1, k) == suffix(c_2, k)
• Does not limit the context length
• Can be used to tune the amount of context merging throughout the program
• Example
  - k = 3
  - O^(defg) and O^(hefg) are distinct nodes if d.src != h.src
  - O^(defg) and O^(hefg) are merged if d.src == h.src
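The merge test can be sketched in Java as a comparison of keys derived from two contexts. The slide does not define suffix(c, k) precisely, so this sketch follows one reading of its example: two contexts agree when their last k edges are equal and the edges just before those k edges start in the same method. The records `Call` and `MergeKey` and the class `KLastSketch` are hypothetical.

```java
import java.util.List;

/** Hypothetical call graph edge: call site id and the method containing the call. */
record Call(String id, String src) {}

/** Hypothetical comparison key: last k edges plus the source of the edge preceding them. */
record MergeKey(String precedingSrc, List<Call> lastK) {}

final class KLastSketch {
    static MergeKey key(List<Call> ctx, int k) {
        int start = Math.max(0, ctx.size() - k);
        // Source method of the edge right before the kept suffix, if there is one.
        String precedingSrc = start > 0 ? ctx.get(start - 1).src() : null;
        return new MergeKey(precedingSrc, List.copyOf(ctx.subList(start, ctx.size())));
    }

    static boolean mergeable(List<Call> c1, List<Call> c2, int k) {
        return key(c1, k).equals(key(c2, k));
    }

    public static void main(String[] args) {
        Call d = new Call("d", "m1"), h = new Call("h", "m2");
        Call e = new Call("e", "m3"), f = new Call("f", "m4"), g = new Call("g", "m5");
        System.out.println(mergeable(List.of(d, e, f, g), List.of(h, e, f, g), 3));
    }
}
```

Because the key depends only on the tail of a context, full context strings are still kept; k only controls how aggressively nodes are shared.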

20 Experiments
• Benchmark set contains 19 Java programs, from SPECJVM, Ashes, and DaCapo
• Experimentally compared our analysis with
  - Refinement: the refinement-based analysis from [Sridharan-PLDI’06] with its default budget for refinement
      the most precise publicly-available analysis for computing an on-demand solution
      queried the points-to sets for all possible (v, c)
  - 1H: 1-object-sensitive analysis with heap cloning, using BDDs [Lhotak-CC’06]; the most precise publicly-available analysis for computing a whole-program solution
• Our analysis: computes a whole-program solution
  - 1Equiv: 1-last-substring-merging
  - 2Equiv: 2-last-substring-merging

21 Precision: #downcasts proven to be safe
2Equiv proves 59% more safe downcasts than 1H, but fewer than Refinement

22 Cost: time to get a whole-program solution
2Equiv is 13 times faster than 1H and 5.3 times faster than Refinement

23 Conclusions
• Context equivalence class identification
  - An equivalence class is represented by a URC
  - Only URCs need to be explicitly represented in the data structures of the analysis
• A points-to analysis for Java
  - Uses URCs to represent contexts
  - Summary-based, bottom-up, with partial unification
  - Heap-cloning
• Experimental evaluation
  - Precision approaches that of the refinement-based analysis, and is much higher than that of 1H
  - Faster than both the refinement-based analysis and 1H

