Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India.

1 Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India

2 Andersen’s Analysis A flow-insensitive analysis –computes a single points-to solution valid at all program points –ignores control-flow –treats program as a set of statements –equivalent to merging all vertices into one (and applying Algorithm A) –equivalent to adding an edge between every pair of vertices (and applying Algorithm A) –a solution R such that R ⊇ IdealMayPT(u) for every vertex u

3 Example (Flow-Sensitive Analysis) x = &a; y = x; x = &b; z = x; (Figure: control-flow graph of these four statements.)
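A minimal sketch of what a flow-sensitive analysis computes on this example, assuming straight-line code only (no branches, so no joins are needed). The tuple encoding of statements and the function name `flow_sensitive` are illustrative assumptions, not part of the lecture.

```python
def flow_sensitive(stmts):
    """Return the points-to map that holds after each statement.

    Assumes straight-line code; statements are (kind, lhs, rhs) tuples,
    a hypothetical encoding chosen for this sketch.
    """
    state = {}
    after = []
    for kind, lhs, rhs in stmts:
        state = {v: set(t) for v, t in state.items()}  # one solution per program point
        if kind == "addr":        # lhs = &rhs
            state[lhs] = {rhs}    # direct assignment overwrites the old value
        elif kind == "copy":      # lhs = rhs
            state[lhs] = set(state.get(rhs, set()))
        else:
            raise ValueError(kind)
        after.append(state)
    return after


program = [("addr", "x", "a"), ("copy", "y", "x"),
           ("addr", "x", "b"), ("copy", "z", "x")]
result = flow_sensitive(program)
final = result[-1]
print(final)
```

At the last program point the sketch keeps y pointing only to a and z pointing only to b, because each point has its own solution.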

4 Example: Andersen’s Analysis x = &a; y = x; x = &b; z = x; (Figure: control-flow graph of these four statements.)

5 Andersen’s Analysis Strong updates? Initial state?

6 Why Flow-Insensitive Analysis? Reduced space requirements –a single points-to solution Reduced time complexity –no copying –individual updates more efficient –no need for joins –number of iterations? –a cubic-time algorithm Scales to millions of lines of code –most popular points-to analysis

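7 Andersen’s Analysis: A Set-Constraints Formulation Compute PT_x for every variable x
Statement | Constraint
x = null | PT_x ⊇ {null}
x = &y | PT_x ⊇ {y}
x = y | PT_x ⊇ PT_y
x = *y | PT_x ⊇ PT_z for every z ∈ PT_y
*x = y | PT_z ⊇ PT_y for every z ∈ PT_x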
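A minimal sketch of solving the set-constraints formulation by naive fixed-point iteration. It assumes the standard inclusion constraints for the four pointer statements (stated in the comments) and a hypothetical tuple encoding of statements. It is not one of the optimized solvers from the literature.

```python
from collections import defaultdict


def andersen(stmts):
    """Flow-insensitive, inclusion-based points-to analysis (naive iteration).

    Statements are (kind, lhs, rhs) tuples; this encoding is an assumption
    made for the sketch.
    """
    pt = defaultdict(set)
    changed = True
    while changed:                      # iterate until no constraint adds anything
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":          # lhs = &rhs : PT[lhs] includes {rhs}
                updates = [(lhs, {rhs})]
            elif kind == "copy":        # lhs = rhs  : PT[lhs] includes PT[rhs]
                updates = [(lhs, pt[rhs])]
            elif kind == "load":        # lhs = *rhs : PT[lhs] includes PT[z], z in PT[rhs]
                updates = [(lhs, pt[z]) for z in list(pt[rhs])]
            elif kind == "store":       # *lhs = rhs : PT[z] includes PT[rhs], z in PT[lhs]
                updates = [(z, pt[rhs]) for z in list(pt[lhs])]
            else:
                raise ValueError(kind)
            for target, source in updates:
                if not source <= pt[target]:
                    pt[target] |= source
                    changed = True
    return {v: s for v, s in pt.items() if s}


first = andersen([("addr", "x", "a"), ("copy", "y", "x"),
                  ("addr", "x", "b"), ("copy", "z", "x")])
second = andersen([("addr", "x", "a"), ("copy", "y", "x"),
                   ("addr", "y", "b"), ("addr", "b", "c")])
print(first)
print(second)
```

Because statement order is ignored, the first program yields {a, b} for x, y and z alike, which is less precise than a flow-sensitive result for the same statements.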

8 Steensgaard’s Analysis Unification-based analysis Inspired by type inference –an assignment “lhs := rhs” is interpreted as a constraint that lhs and rhs have the same type –the type of a pointer variable is the set of variables it can point-to “Assignment-direction-insensitive” –treats “lhs := rhs” as if it were both “lhs := rhs” and “rhs := lhs” An almost-linear time algorithm –single-pass algorithm; no iteration required

9 Example: Andersen’s Analysis x = &a; y = x; y = &b; b = &c; (Figure: control-flow graph of these four statements.)

10 Example: Steensgaard’s Analysis x = &a; y = x; y = &b; b = &c; (Figure: control-flow graph of these four statements.)

11 Steensgaard’s Analysis Can be implemented using Union-Find data-structure Leads to an almost-linear time algorithm
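A minimal sketch of a unification-based analysis on top of union-find, assuming each equivalence class has at most one pointed-to class and that merging two classes also merges what they point to. The class name, method names and the fresh ("tmp", n) placeholder nodes are assumptions for illustration.

```python
import itertools


class Steensgaard:
    def __init__(self):
        self.parent = {}      # union-find forest
        self.pointee = {}     # class representative -> a node of the pointed-to class
        self._fresh = itertools.count()

    def find(self, v):
        self.parent.setdefault(v, v)
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[v] != root:          # path compression
            self.parent[v], v = root, self.parent[v]
        return root

    def target(self, v):
        """Representative of the class v points to (created on demand)."""
        r = self.find(v)
        if r not in self.pointee:
            self.pointee[r] = ("tmp", next(self._fresh))
        return self.find(self.pointee[r])

    def join(self, a, b):
        """Unify two classes, then recursively unify their pointees."""
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        pb = self.pointee.pop(b, None)
        if pb is not None:
            if a in self.pointee:
                self.join(self.pointee[a], pb)
            else:
                self.pointee[a] = pb

    def addr(self, x, y):     # x = &y
        self.join(self.target(x), y)

    def copy(self, x, y):     # x = y
        self.join(self.target(x), self.target(y))

    def load(self, x, y):     # x = *y
        self.join(self.target(x), self.target(self.target(y)))

    def store(self, x, y):    # *x = y
        self.join(self.target(self.target(x)), self.target(y))

    def points_to(self, v):
        r = self.find(v)
        if r not in self.pointee:
            return set()
        t = self.find(self.pointee[r])
        return {u for u in list(self.parent)
                if isinstance(u, str) and self.find(u) == t}


st = Steensgaard()
st.addr("x", "a"); st.copy("y", "x"); st.addr("y", "b"); st.addr("b", "c")
print(st.points_to("x"), st.points_to("y"), st.points_to("a"), st.points_to("b"))
```

Each statement is processed once. On the program x = &a; y = x; y = &b; b = &c the sketch merges a and b into one class, so x also appears to point to b and a appears to point to c.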

12 Exercise x = &a; y = x; y = &b; b = &c; *x = &d;

13 May-Point-To Analyses (Figure: hierarchy of analyses: Ideal-May-Point-To, Algorithm A, Andersen’s, Steensgaard’s; arrows labelled "more efficient / less precise", with one arrow labelled "???".)

14 Ideal Points-To Analysis: Definition Recap A sequence of states s_1 s_2 … s_n is said to be an execution (of the program) iff –s_1 is the Initial-State –s_i → s_{i+1} for 1 <= i < n A state s is said to be a reachable state iff there exists some execution s_1 s_2 … s_n such that s_n = s.
RS(u) = { s | (u,s) is reachable }
IdealMayPT(u) = { (p,x) | ∃ s ∈ RS(u). s(p) == x }
IdealMustPT(u) = { (p,x) | ∀ s ∈ RS(u). s(p) == x }

15 Does Algorithm A Compute The Most Precise Solution?

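16 Ideal Algorithm A Abstract away correlations between variables –relational analysis vs. –independent attribute
(Figure: the concrete states [x: &b, y: &x] and [x: &y, y: &z] are abstracted by α to [x: {&y,&b}, y: {&x,&z}]; γ maps this back to four concrete states: [x: &y, y: &x], [x: &b, y: &z], [x: &y, y: &z], [x: &b, y: &x].)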
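A minimal sketch of how an independent-attribute abstraction loses correlations between variables, using the two concrete states from this slide. The function names `alpha` and `gamma` and the dictionary encoding (x: "b" standing for x: &b) are assumptions for illustration.

```python
from itertools import product


def alpha(states):
    """Abstraction: collect, per variable, every target seen in some state."""
    abs_state = {}
    for s in states:
        for var, target in s.items():
            abs_state.setdefault(var, set()).add(target)
    return abs_state


def gamma(abs_state):
    """Concretization: every combination of per-variable targets."""
    names = sorted(abs_state)
    choices = product(*(sorted(abs_state[v]) for v in names))
    return [dict(zip(names, c)) for c in choices]


concrete = [{"x": "b", "y": "x"}, {"x": "y", "y": "z"}]
abstract = alpha(concrete)
recovered = gamma(abstract)
print(abstract)
print(recovered)
```

Two concrete states go in and four come out: the states [x: &y, y: &x] and [x: &b, y: &z] were never part of the input, which is the imprecision a relational analysis would avoid.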

17 Does Algorithm A Compute The Most Precise Solution?

18 Is The Precise Solution Computable? Claim: The set RS(u) of reachable concrete states (for our language) is computable. Note: This is true for any collecting semantics with a finite state space.
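A minimal sketch of why RS(u) is computable when the state space is finite: explore (vertex, state) pairs with a worklist until nothing new appears, then read IdealMayPT and IdealMustPT off the result. The edge encoding, the small branching program and the choice to start every variable at null (None) are assumptions made for this sketch.

```python
from collections import deque


def step(state, stmt):
    """Concrete semantics of one statement on an immutable state."""
    kind, lhs, rhs = stmt
    s = dict(state)
    if kind == "addr":        # lhs = &rhs
        s[lhs] = rhs
    elif kind == "copy":      # lhs = rhs
        s[lhs] = s[rhs]
    else:
        raise ValueError(kind)
    return tuple(sorted(s.items()))


def reachable_states(edges, entry, variables):
    """Map each vertex u to RS(u) by exhaustive exploration."""
    init = tuple((v, None) for v in sorted(variables))
    rs = {entry: {init}}
    work = deque([(entry, init)])
    while work:
        u, state = work.popleft()
        for src, stmt, dst in edges:
            if src != u:
                continue
            nxt = step(state, stmt)
            if nxt not in rs.setdefault(dst, set()):
                rs[dst].add(nxt)
                work.append((dst, nxt))
    return rs


def ideal_may(rs_u):
    return {(p, t) for s in rs_u for p, t in s if t is not None}


def ideal_must(rs_u):
    per_state = [{(p, t) for p, t in s if t is not None} for s in rs_u]
    return set.intersection(*per_state) if per_state else set()


# Hypothetical program: a branch at vertex 1, both arms rejoin at vertex 4.
edges = [(1, ("addr", "x", "a"), 2), (1, ("addr", "x", "b"), 3),
         (2, ("copy", "y", "x"), 4), (3, ("copy", "y", "x"), 4)]
rs = reachable_states(edges, 1, {"x", "y"})
print(ideal_may(rs[4]), ideal_must(rs[4]), ideal_must(rs[2]))
```

The exploration terminates because there are finitely many vertices and finitely many states; the cost is that the number of states can grow exponentially with the number of variables.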

19 Precise Points-To Analysis: Decidability Corollary: Precise may-point-to analysis is computable. Corollary: Precise (demand) may-alias analysis is computable. –Given ptr-exp1, ptr-exp2, and a program point u, identify if there exists some reachable state at u where ptr-exp1 and ptr-exp2 are aliases. Ditto for must-point-to and must-alias … for our restricted language!

20 Precise Points-To Analysis: Computational Complexity What’s the complexity of the least-fixed point computation using the collecting semantics? The worst-case complexity of computing reachable states is exponential in the number of variables. –Can we do better? Theorem: Computing precise may-point-to is PSPACE-hard even if we have only two-level pointers.

21 May-Point-To Analyses (Figure: hierarchy of analyses: Ideal-May-Point-To, Algorithm A, Andersen’s, Steensgaard’s; arrows labelled "more efficient / less precise".)

22 Precise Points-To Analysis: Caveats Theorem: Precise may-alias analysis is undecidable in the presence of dynamic memory allocation. –Add “x = new/malloc ()” to language –State-space becomes infinite Digression: Integer variables + conditional branching also make any precise analysis undecidable.

23 May-Point-To Analyses (Figure: hierarchy of analyses: Ideal (no Int, no Malloc), Algorithm A, Andersen’s, Steensgaard’s, Ideal (with Int, with Malloc), Ideal (with Int), Ideal (with Malloc).)

24 Dynamic Memory Allocation s: x = new () / malloc () Assume, for now, that allocated object stores one pointer –s: x = malloc ( sizeof(void*) ) Introduce a pseudo-variable V_s to represent objects allocated at statement s, and use previous algorithm –treat s as if it were “x = &V_s” –also track possible values of V_s –allocation-site based approach Key aspect: V_s represents a set of objects (locations), not a single object –referred to as a summary object (node)

25 Dynamic Memory Allocation: Example x = new; y = x; *y = &b; *y = &a; (Figure: control-flow graph of these four statements.)
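A minimal sketch of the allocation-site approach on this example, assuming straight-line code and assuming that a store through a pointer to a summary object must be a weak update (the summary object may stand for several concrete objects). The statement encoding, the site label "s1" and the function name `run` are hypothetical.

```python
def run(stmts):
    """Flow-sensitive analysis of straight-line code with allocation sites."""
    state = {}          # variable or summary object -> set of targets
    summaries = set()   # pseudo-variables that stand for many objects
    for kind, lhs, rhs in stmts:
        if kind == "new":               # lhs = new, allocated at site rhs
            obj = f"V_{rhs}"            # treated as lhs = &V_site
            summaries.add(obj)
            state[lhs] = {obj}
        elif kind == "copy":            # lhs = rhs
            state[lhs] = set(state.get(rhs, set()))
        elif kind == "store_addr":      # *lhs = &rhs
            targets = state.get(lhs, set())
            # Strong update only for a single, non-summary target.
            strong = len(targets) == 1 and not (targets & summaries)
            for t in targets:
                if strong:
                    state[t] = {rhs}
                else:
                    state.setdefault(t, set()).add(rhs)
        else:
            raise ValueError(kind)
    return state


heap = run([("new", "x", "s1"), ("copy", "y", "x"),
            ("store_addr", "y", "b"), ("store_addr", "y", "a")])
print(heap)
```

After both stores, V_s1 may point to either a or b: the second store cannot overwrite the first because the analysis cannot tell whether the same concrete object is written each time.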

26 Dynamic Memory Allocation: Object Fields Field-sensitive analysis class Foo { A* f; B* g; } s: x = new Foo() x->f = &b; x->g = &a;

27 Dynamic Memory Allocation: Object Fields Field-insensitive analysis class Foo { A* f; B* g; } s: x = new Foo() x->f = &b; x->g = &a;
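A minimal sketch contrasting the two treatments of object fields for the stores x->f = &b and x->g = &a, assuming x points to the summary object V_s. The (object, field) keys and the "*" placeholder for "any field" are assumptions for illustration.

```python
def analyze(stores, field_sensitive):
    """Record what each abstract location may point to.

    stores: (object, field, target) triples, a hypothetical encoding.
    """
    heap = {}
    for obj, field, target in stores:
        # Field-insensitive analysis collapses all fields of an object.
        key = (obj, field) if field_sensitive else (obj, "*")
        heap.setdefault(key, set()).add(target)
    return heap


stores = [("V_s", "f", "b"), ("V_s", "g", "a")]
sensitive = analyze(stores, field_sensitive=True)
insensitive = analyze(stores, field_sensitive=False)
print(sensitive)
print(insensitive)
```

The field-sensitive result keeps f and g apart; the field-insensitive one reports that any field of V_s may point to a or b.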

28 Other Aspects Context-sensitivity Indirect (virtual) function calls and call-graph construction Pointer arithmetic Object-sensitivity

29 Andersen’s Analysis: Further Optimizations and Extensions Fahndrich et al., Partial online cycle elimination in inclusion constraint graphs, PLDI Rountev and Chandra, Offline variable substitution for scaling points-to analysis, Heintze and Tardieu, Ultra-fast aliasing analysis using CLA: a million lines of C code in a second, PLDI M. Hind, Pointer analysis: Haven’t we solved this problem yet?, PASTE Hardekopf and Lin, The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code, PLDI Hardekopf and Lin, Exploiting pointer and location equivalence to optimize pointer analysis, SAS Hardekopf and Lin, Semi-sparse flow-sensitive pointer analysis, POPL 2009.

30 Context-Sensitivity Etc. Liang & Harrold, Efficient computation of parameterized pointer information for interprocedural analyses. SAS Lattner et al., Making context-sensitive points-to analysis with heap cloning practical for the real world, PLDI Zhu & Calman, Symbolic pointer analysis revisited. PLDI Whaley & Lam, Cloning-based context-sensitive pointer alias analysis using BDD, PLDI Rountev et al. Points-to analysis for Java using annotated constraints. OOPSLA Milanova et al. Parameterized object sensitivity for points-to and side-effect analyses for Java. ISSTA 2002.

31 Applications Compiler optimizations Verification & Bug Finding –use in preliminary phases –use in verification itself

32 Dynamic Memory Allocation: Summary Object Update (Figure: control-flow graph fragment containing the statement *y = &a.)

33 Abstract Transformers: Weak/Strong Update AS[stmt] : AbsDataState -> AbsDataState AS[ *x = y ] s =

34 Correctness & Precision How can we formally reason about the correctness & precision of abstract transformers? Can we systematically derive a correct abstract transformer?

35 Enter: The French Recipe (Abstract Interpretation) Concrete domain: 2^Data-State. Abstract domain: 2^(Var x Var’). The two are related by abstraction α and concretization γ. Concrete states: C. Semantics: for every statement st, CS[st] : C -> C

36 Points-To Analysis (Abstract Interpretation)
α(Y) = { (p,x) | exists s in Y. s(p) == x }
RS(u) ∈ 2^Data-State; IdealMayPT(u), MayPT(u) ∈ 2^(Var x Var’)
IdealMayPT(u) = α( RS(u) )

37 Approximating Transformers: Correctness Criterion c is said to be correctly approximated by a iff α(c) ≤ a (Figure: concrete domain C, where f maps c1 to c2, and abstract domain A, where f# maps a1 to a2; c1 is correctly approximated by a1 and c2 by a2.)

38 Approximating Transformers: Correctness Criterion concretization γ, abstraction α requirement: f#(a1) ≥ α(f(γ(a1))) (Figure: concrete domain C, where f maps c1 to c2, and abstract domain A, where f# maps a1 to a2.)

39 Concrete Transformers CS[stmt] : Data-State -> Data-State
CS[ x = y ] s = s[x ↦ s(y)]
CS[ x = *y ] s = s[x ↦ s(s(y))]
CS[ *x = y ] s = s[s(x) ↦ s(y)]
CS[ x = null ] s = s[x ↦ null]
CS*[stmt] : 2^Data-State -> 2^Data-State
CS*[st] X = { CS[st]s | s ∈ X }

40 Abstract Transformers AS[stmt] : AbsDataState -> AbsDataState
AS[ x = y ] s = s[x ↦ s(y)]
AS[ x = null ] s = s[x ↦ {null}]
AS[ x = *y ] s = s[x ↦ s*(s(y))] where s*({v_1,…,v_n}) = s(v_1) ∪ … ∪ s(v_n)
AS[ *x = y ] s = ???

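41 Algorithm A: Transformers Weak/Strong Update Statement f: *y = &b;
(Figure: the abstract state [x: {&y}, y: {&x,&z}, z: {&a}] concretizes (γ) to [x: &y, y: &x, z: &a] and [x: &y, y: &z, z: &a]; applying f gives [x: &b, y: &x, z: &a] and [x: &y, y: &z, z: &b]; abstracting (α) gives [x: {&y,&b}, y: {&x,&z}, z: {&a,&b}], which is also the result of f#.)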

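42 Algorithm A: Transformers Weak/Strong Update Statement f: *x = &b;
(Figure: the abstract state [x: {&y}, y: {&x,&z}, z: {&a}] concretizes (γ) to [x: &y, y: &x, z: &a] and [x: &y, y: &z, z: &a]; applying f gives [x: &y, y: &b, z: &a] in both cases; abstracting (α) gives [x: {&y}, y: {&b}, z: {&a}], which is also the result of f#.)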
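A minimal sketch that checks the two weak/strong update examples mechanically: it compares a hand-written abstract transformer for *x = &b against α ∘ f ∘ γ, the result of concretizing, applying the concrete transformer and abstracting again. The rule "strong update when the pointer has exactly one target, weak update otherwise" is an assumption of this sketch, as are all function names; null targets are ignored.

```python
from itertools import product


def gamma(a):
    names = sorted(a)
    return [dict(zip(names, c))
            for c in product(*(sorted(a[v]) for v in names))]


def alpha(states):
    out = {}
    for s in states:
        for v, t in s.items():
            out.setdefault(v, set()).add(t)
    return out


def concrete_store_addr(s, x, b):
    """CS[*x = &b]: overwrite the variable that x points to."""
    s = dict(s)
    s[s[x]] = b
    return s


def abstract_store_addr(a, x, b):
    """AS[*x = &b] with a strong update for singleton targets (assumed rule)."""
    a = {v: set(t) for v, t in a.items()}
    targets = a[x]
    if len(targets) == 1:
        (t,) = targets
        a[t] = {b}                        # strong update: old value is killed
    else:
        for t in targets:
            a[t] = a.get(t, set()) | {b}  # weak update: old value is kept
    return a


def best(a, x, b):
    """alpha(f(gamma(a))): the most precise abstract result."""
    return alpha(concrete_store_addr(s, x, b) for s in gamma(a))


before = {"x": {"y"}, "y": {"x", "z"}, "z": {"a"}}
weak = abstract_store_addr(before, "y", "b")     # *y = &b
strong = abstract_store_addr(before, "x", "b")   # *x = &b
print(weak)
print(strong)
```

On both examples the hand-written transformer coincides with α ∘ f ∘ γ, so it satisfies the requirement f#(a1) ≥ α(f(γ(a1))) with equality here; that need not hold for every input.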

43 Dynamic Memory Allocation: Summary Object Update (Figure: control-flow graph fragment containing the statement *y = &a.)

