Pointer Analysis. Rupesh Nasre. Advisor: Prof R Govindarajan. Apr 05, 2008.


1 Pointer Analysis. Rupesh Nasre. Advisor: Prof R Govindarajan. Apr 05, 2008.

2 Outline. Motivation and Introduction. Related Work. Preliminary Results. Research Directions.

3 What is Pointer Analysis? Pointer analysis is the mechanism of statically finding out possible run-time values of a pointer.

4 What is Pointer Analysis? Pointer analysis is the mechanism of statically finding out possible run-time values of a pointer and relation of various pointers with each other.

5 Relation between pointers. p = arr + ii; q = arr + jj; if (p == q) { fun(); } q = p;... if (p == q) { fun(); }

6 Variants of Pointer Analysis. Alias analysis: do p and q point to the same memory location? Points-to analysis: does p point to memory location x?
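The two queries can be related through points-to sets: two pointers may alias when their sets share a location. A minimal sketch, assuming a hypothetical bitmask encoding of points-to sets (the type `pts_set` and both function names are illustrative, not from the slides):

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: bit i set means "may point to abstract location i" (i < 32). */
typedef uint32_t pts_set;

/* Points-to query: may a pointer with set p point to location loc? */
bool may_point_to(pts_set p, unsigned loc) {
    return ((p >> loc) & 1u) != 0;
}

/* Alias query: may two pointers refer to a common location? */
bool may_alias(pts_set p, pts_set q) {
    return (p & q) != 0;
}
```

Answering alias queries this way is conservative: a shared location means the pointers may alias, not that they must.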

7 Why Pointer Analysis? for parallelization: fun(p); fun(q); for common subexpression elimination: x = p + 2; y = q + 2; for dead code elimination. if (p == q) { fun(); } for other optimizations.

8 Introduction. Flow sensitivity. Context sensitivity. Field sensitivity. Unification based. Inclusion based.

9 Flow sensitivity. p = &x; p = &y; label:... flow-sensitive: {(p, &y)}. flow-insensitive: {(p, &x), (p, &y)}.
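The difference can be sketched for straight-line code made only of statements of the form `ptr = &loc`. This is an illustration under that assumption (no branches, no loops); `addr_stmt`, `pts_set` and the function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: bit i set means "may point to location i" (i < 32). */
typedef uint32_t pts_set;

/* One statement of the form "ptr = &loc". */
struct addr_stmt {
    unsigned ptr;
    unsigned loc;
};

/* Flow-insensitive: statement order is ignored, so all targets accumulate. */
pts_set flow_insensitive(const struct addr_stmt *s, size_t n, unsigned ptr) {
    pts_set result = 0;
    for (size_t i = 0; i < n; i++)
        if (s[i].ptr == ptr)
            result |= (pts_set)1 << s[i].loc;
    return result;
}

/* Flow-sensitive at the end of the sequence: each assignment
   overwrites the previous one, so only the last target survives. */
pts_set flow_sensitive(const struct addr_stmt *s, size_t n, unsigned ptr) {
    pts_set result = 0;
    for (size_t i = 0; i < n; i++)
        if (s[i].ptr == ptr)
            result = (pts_set)1 << s[i].loc;
    return result;
}
```

With p numbered 0, x numbered 0 and y numbered 1, the sequence `p = &x; p = &y;` gives {x, y} for the flow-insensitive version and {y} for the flow-sensitive one at the label.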

10 Context sensitivity. caller1() { fun(p); } caller2() { fun(q); } fun(int *ptr) { r = ptr; } context-insensitive: {(r, p), (r, q)}. context-sensitive: {(r, p)} along call-path caller1, {(r, q)} along call-path caller2.
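A rough sketch of the same distinction, assuming the analysis has already recorded which pointer each caller passes to fun. The `call_site` record, the `ptr_set` bitmask and both function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: bit i set means "r may equal pointer number i" (i < 32). */
typedef uint32_t ptr_set;

/* One recorded call "fun(arg)" made from function number `caller`. */
struct call_site {
    int caller;
    unsigned arg;
};

/* Context-insensitive: arguments from every call site are merged. */
ptr_set r_context_insensitive(const struct call_site *calls, size_t n) {
    ptr_set result = 0;
    for (size_t i = 0; i < n; i++)
        result |= (ptr_set)1 << calls[i].arg;
    return result;
}

/* Context-sensitive: only the calls made along the given caller count. */
ptr_set r_context_sensitive(const struct call_site *calls, size_t n, int caller) {
    ptr_set result = 0;
    for (size_t i = 0; i < n; i++)
        if (calls[i].caller == caller)
            result |= (ptr_set)1 << calls[i].arg;
    return result;
}
```

Real context-sensitive analyses distinguish whole call paths rather than a single caller; one level of calling context is used here only to keep the sketch short.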

11 Field sensitivity. x.f = p; or p = x.f; field-sensitive: {(x.f, p)}. field-insensitive: {(x, p)}.

12 Unification based. one(&s1); one(&s2); two(&s3); one(struct s *p) { p->a = 3; two(p); } two(struct s *q) { q->b = 4; } unification-based: {(p, &s1), (p, &s2), (p, &s3), (q, &s1), (q, &s2), (q, &s3)}.
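One way to picture unification is a union-find structure in which every variable has a single pointee class, and an assignment merges the pointee classes of both sides. The following is a simplified sketch in that spirit, not a full unification-based analysis: it handles only `ptr = &loc` and `dst = src`, and the node numbering and function names are assumptions:

```c
#include <stdbool.h>

enum { P, Q, S1, S2, S3, NVARS };
#define NNODES (2 * NVARS)   /* variables plus one placeholder pointee each */

static int parent[NNODES];   /* union-find forest */
static int pointee[NNODES];  /* pointee class of a representative, or -1 */

static int find(int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];  /* path halving */
        x = parent[x];
    }
    return x;
}

/* Merge two classes, then recursively merge what they point to. */
static void unify(int a, int b) {
    a = find(a);
    b = find(b);
    if (a == b) return;
    int pa = pointee[a], pb = pointee[b];
    parent[b] = a;
    if (pa < 0) pointee[a] = pb;
    else if (pb >= 0) unify(pa, pb);
}

void init(void) {
    for (int i = 0; i < NNODES; i++) {
        parent[i] = i;
        /* variable i initially points to its own placeholder NVARS + i */
        pointee[i] = (i < NVARS) ? NVARS + i : -1;
    }
}

/* ptr = &loc */
void address_of(int ptr, int loc) { unify(pointee[find(ptr)], loc); }

/* dst = src */
void copy(int dst, int src) { unify(pointee[find(dst)], pointee[find(src)]); }

bool points_to(int ptr, int loc) {
    return find(pointee[find(ptr)]) == find(loc);
}
```

Modelling the calls as `p = &s1; p = &s2; q = &s3; q = p;` merges s1, s2 and s3 into one class, so both p and q are reported as pointing to all three, matching the set on the slide.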

13 Inclusion based. one(&s1); one(&s2); two(&s3); one(struct s *p) { p->a = 3; two(p); } two(struct s *q) { q->b = 4; } inclusion-based: {(p, &s1), (p, &s2), (q, &s1), (q, &s2), (q, &s3)}.
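Inclusion-based analysis treats `q = p` as a one-way subset constraint (the set of q includes the set of p) instead of merging the two. A minimal fixed-point sketch, assuming only address-of and copy constraints (no loads or stores through pointers); the bitmask encoding and function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

enum { P, Q, NPTRS };   /* pointer variables */
enum { S1, S2, S3 };    /* abstract locations, one bit each */

static uint32_t pts[NPTRS];           /* points-to set of each pointer */
static bool copy_edge[NPTRS][NPTRS];  /* copy_edge[d][s]: pts[d] includes pts[s] */

/* ptr = &loc */
void add_address_of(int ptr, int loc) { pts[ptr] |= (uint32_t)1 << loc; }

/* dst = src */
void add_copy(int dst, int src) { copy_edge[dst][src] = true; }

/* Propagate along copy edges until no set grows any further. */
void solve(void) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int d = 0; d < NPTRS; d++) {
            for (int s = 0; s < NPTRS; s++) {
                if (!copy_edge[d][s]) continue;
                uint32_t merged = pts[d] | pts[s];
                if (merged != pts[d]) {
                    pts[d] = merged;
                    changed = true;
                }
            }
        }
    }
}

uint32_t points_to_set(int ptr) { return pts[ptr]; }
```

For the constraints `p = &s1; p = &s2; q = &s3; q = p;` the solver leaves p with {s1, s2} and gives q {s1, s2, s3}: information flows from p to q but not back, which is where the precision gain over unification comes from.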

14 Like all other important problems in Computer Science... Alias analysis without memory allocation, intra-procedural, flow-sensitive, supporting arbitrary levels of indirection, is NP-hard. For two levels of indirection, it is still NP-hard. Even flow-insensitive analysis is NP-hard (for arbitrary levels of indirection). With dynamic memory allocation, allowing structs, it becomes undecidable. Even for scalars (no structs), it remains undecidable. G Ramalingam, The undecidability of aliasing, TOPLAS 1994. Venkatesan Chakaravarthy, New results on the computability and complexity of points-to analysis, POPL 2003.

15 But the good news is... For single pointer dereference, even a flow-sensitive analysis with only scalars and well-defined types is in P, if dynamic memory allocation is not allowed. For arbitrary number of dereferences, if the analysis is flow-insensitive, it is in P. G Ramalingam, The undecidability of aliasing, TOPLAS 1994. Venkatesan Chakaravarthy, New results on the computability and complexity of points-to analysis, POPL 2003.

16 Open Problems. When dynamic memory allocation is not allowed, but arbitrary number of levels of dereferencing is allowed, the problem is NP-hard. Is it in NP? Is the above problem for bounded number of dereferences in P? When dynamic memory is allowed, is the problem decidable?

17 Related Work. Choi et al, POPL 1993.  flow sensitive.  solution set for each program point.  alias sets for each CFG node.  uses worklists for efficiency.  precise but inefficient. J D Choi, M Burke, P Carini, Efficient flow-sensitive interprocedural computation of pointer induced aliases and side effects, POPL 1993.

18 Related Work. Andersen, PhD Thesis, 1994.  flow insensitive.  context insensitive.  inclusion based.  each variable represented using separate node.  precision used as upper bound. Lars Ole Andersen, Program Analysis and Specialization for the C Programming Language, PhD thesis, 1994.

19 Related Work. Burke et al, LCPC 1995.  flow insensitive.  alias solution for each procedure.  worklist used for efficiency.  can filter alias information based on scoping.  nearly as precise as Andersen's. M Burke, P Carini, J D Choi, M Hind, Flow-insensitive interprocedural alias analysis in the presence of function pointers, LCPC 1995.

20 Related Work. Reps et al, POPL 1995.  problem formulated using graph reachability.  poly-time algorithm for interprocedural finite distributive subset-based problems.  graph reachability used for aliasing. Thomas Reps, Susan Horwitz, Mooly Sagiv, Precise Interprocedural Dataflow Analysis via Graph Reachability, POPL 1995.

21 Related Work. Steensgaard, POPL 1996.  flow insensitive.  context insensitive.  field insensitive.  unification based.  linear space and almost linear time algorithm.  imprecise but sets lower bound on time complexity. Bjarne Steensgaard, Points-to Analysis in Almost Linear Time, POPL 1996.

22 Related Work. Ghiya et al, PLDI 1996.  flow sensitive.  context sensitive.  field insensitive.  makes use of direction, interference and shape.  classifies as tree, dag or cyclic graph. Rakesh Ghiya, Laurie Hendren, Is it a Tree, a DAG, or a Cyclic Graph? A Shape Analysis For Heap Directed Pointers in C, PLDI 1996.

23 Related Work. Cheng et al, PLDI 2000.  uses access paths.  flow insensitive.  field sensitive.  cost effective context sensitivity.  works well for large number of indirect function calls. Ben-Chung Cheng, Wen-Mei Hwu, Modular Interprocedural Pointer Analysis using Access Paths: Design, Implementation, and Evaluation, PLDI 2000.

24 Related Work. Whaley et al, PLDI 2004.  context sensitive.  field sensitive.  partially flow sensitive.  inclusion based.  scalable (10 min, 400 MB, 8000 methods).  ordered BDDs. John Whaley, Monica Lam, Cloning-based Context-sensitive Pointer Alias Analysis Using Binary Decision Diagrams, PLDI 2004.

25 Related Work. Lattner et al, PLDI 2007.  context sensitive.  flow insensitive.  field sensitive.  unification based.  scalable.  efficient (3 sec for 200K lines).  low storage requirement (30MB). Chris Lattner, Andrew Lenharth, Vikram Adve, Making Context Sensitive Points-to Analysis with Heap Cloning Practical For The Real World, PLDI 2007.

26 Our Experiments. framework = LLVM. algorithm = Andersen. benchmark = SPEC 2000.

27 Our Experiments.

29 Research Directions. Pointer arithmetic. void f(struct list *p, struct list *q) { struct list *tmp; tmp = p->next; p->next = q->next; q->next = q->next->next; p->next->next = tmp; }

30 Research Directions. Profiling.  at specific program points like function entry, exit.  for hot functions.  for fat pointers.

31 Research Directions. Complex data structures.  a recursive data structure is merged into a single node.  some programs have a single global data structure to operate on, like symbol table, dictionary.  how to characterize complexity of a data structure?

32 Pointer Analysis. Rupesh Nasre. Advisor: Prof R Govindarajan. Apr 05, 2008.

33 188.ammp Description. Benchmark Program General Category: Computational Chemistry. Modeling large systems of molecules usually associated with Biology. Benchmark Description: The benchmark runs molecular dynamics (i.e. solves the ODE defined by Newton's equations for the motions of the atoms in the system) on a protein-inhibitor complex which is embedded in water (see Harrison 1993 for descriptions of the algorithm and stability analysis on it). The energy is approximated by a classical potential or "force field". The protein is HIV protease complexed with the inhibitor indinavir. There are 9582 atoms in the water and protein making this representative of a typical large simulation. This benchmark is derived from published work on understanding drug resistance in HIV (Weber and Harrison 1999). Input Description: The problem tracks how the atoms move from initial coordinates and initial velocities.

34 Conferences. POPL: Principles of Programming Languages. PLDI: Programming Language Design and Implementation. MSP: Memory Systems Performance. LCPC: Languages and Compilers for Parallel Computing.

35 Related Work. Raman et al, MSP 2005.  uses executable instructions.  run time (dynamic).  collects RDS profile.  no type information.  interesting properties of data structures are found out.

