Scalable Points-to Analysis. Rupesh Nasre. Advisor: Prof. R. Govindarajan. Comprehensive Examination. Jun 22, 2009.


Outline. Introduction (points-to analysis). Issues involved in context-sensitive analyses. Bloom filter. Points-to analysis with bloom filter. Experimental evaluation. Future work.

What is Pointer Analysis? Pointer analysis statically determines the possible run-time values of a pointer and how different pointers relate to each other.

Why Pointer Analysis? For parallelization: fun(p); fun(q); For common subexpression elimination: x = p + 2; y = q + 2; For dead code elimination: if (p == q) { fun(); } For other optimizations.

Introduction. Flow sensitivity. Context sensitivity. Field sensitivity. Unification based. Inclusion based.

Flow sensitivity. p = &x; p = &y; label:... flow-sensitive, at label: {(p, y)}. flow-insensitive: {(p, x), (p, y)}.

Context sensitivity. caller1() { fun(&x); } caller2() { fun(&y); } fun(int *ptr) { a = ptr; } context-insensitive: {(a, x), (a, y)}. context-sensitive: {(a, x)} along call-path caller1, {(a, y)} along call-path caller2.

Field sensitivity. a.f = &x; field-sensitive: {(a.f, x)}. field-insensitive: {(a, x)}.

Unification based. one(&s1); one(&s2); two(&s3); one(struct s *p) { p->a = 3; two(p); } two(struct s *q) { q->b = 4; } unification-based: {(p, &s1), (p, &s2), (p, &s3), (q, &s1), (q, &s2), (q, &s3)}.

Inclusion based. one(&s1); one(&s2); two(&s3); one(struct s *p) { p->a = 3; two(p); } two(struct s *q) { q->b = 4; } inclusion-based: {(p, &s1), (p, &s2), (q, &s1), (q, &s2), (q, &s3)}.

Related work. Scalable points-to analyses. B. Steensgaard, Points-to Analysis in Almost Linear Time, POPL 1996. J. Whaley and M. S. Lam, Cloning-Based Context-Sensitive Pointer Alias Analysis Using Binary Decision Diagrams, PLDI 2004. B. Hardekopf and C. Lin, The Ant and the Grasshopper: Fast and Accurate Pointer Analysis for Millions of Lines of Code, PLDI 2007. V. Kahlon, Bootstrapping: A Technique for Scalable Flow and Context-Sensitive Pointer Alias Analysis, PLDI 2008.

Issues with context-sensitivity. main() { S1: f(&x); S2: f(&y); } f(a) { S3: g(a); S4: g(z); } g(b) { ... } Invocation graph: main calls f at S1 and S2, and each instance of f calls g at S3 and S4, giving four instances of g. Exponential number of contexts.

Issues with context-sensitivity. Storage requirement increases exponentially. Along S1-S3-S5-S7, p points to {x1, x3, x5, x7}. Along S1-S3-S5-S8, p points to {x1, x3, x5, x8}. Along S1-S3-S6-S7, p points to {x1, x3, x6, x7}. Along S1-S3-S6-S8, p points to {x1, x3, x6, x8}. Along S1-S4-S5-S7, p points to {x1, x4, x5, x7}. Along S1-S4-S5-S8, p points to {x1, x4, x5, x8}. Along S1-S4-S6-S7, p points to {x1, x4, x6, x7}. Along S1-S4-S6-S8, p points to {x1, x4, x6, x8}. Along S2...

Tackling scalability issues. How about not storing complete contexts? How about storing approximate points-to information? Can we have a probabilistic data structure that approximates the storage? Can we control the false-positive rate?

Bloom filter. A Bloom filter is a probabilistic data structure for membership queries, and is typically implemented as a fixed-size array of bits. To store elements e1, e2, e3, bits at positions hash(e1), hash(e2) and hash(e3) are set. (Figure: a bit array with two bits set; e1 and e3 hash to the same position, e2 to another.)

Points-to analysis with Bloom filter. A constraint is an abstract representation of the pointer instruction. p = &x: p.pointsTo(x). p = q: p.copyFrom(q). *p = q: p.storeThrough(q). p = *q: p.loadFrom(q). Function arguments and return values resemble p = q type of statement. Note, each constraint also stores the context.
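One possible way to represent such constraints is sketched below. The integer ids for variables and contexts, the type names and the helper bind_argument are hypothetical; the slides only state that there are four constraint kinds and that each constraint carries its context.

```c
#include <assert.h>

typedef enum {
    POINTS_TO,      /* p = &x  */
    COPY_FROM,      /* p = q   */
    STORE_THROUGH,  /* *p = q  */
    LOAD_FROM       /* p = *q  */
} ConstraintKind;

typedef struct {
    ConstraintKind kind;
    int lhs;      /* id of p (assumed integer naming of variables) */
    int rhs;      /* id of x or q */
    int context;  /* id of the calling context the constraint belongs to */
} Constraint;

static Constraint make_constraint(ConstraintKind kind, int lhs, int rhs,
                                  int context) {
    assert(lhs >= 0 && rhs >= 0 && context >= 0);
    Constraint c = { kind, lhs, rhs, context };
    return c;
}

/* Passing an actual argument to a formal parameter is treated like
 * formal = actual, recorded in the callee's context. */
static Constraint bind_argument(int formal, int actual, int callee_context) {
    return make_constraint(COPY_FROM, formal, actual, callee_context);
}
```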

Points-to Analysis with Bloom filter. If points-to pairs (p, x) are kept in bloom filter, existential queries like “does p point to x?” can be answered. What about queries like “do p and q alias?”? What about context-sensitive queries like “do p and q alias in context c?”? How to process assignment statements p = q? How about load/store statements *p = q and q = *p?

Multi-Bloom filter. Points-to pairs are kept in a bloom filter per pointer. A bit set to 1 represents a points-to pair. Example (diagram on the next slide): Points-to pairs {(p, x), (p, y), (q, x)}. hash(x) = 5, hash(y) = 6. Set bit numbers: p.bucket[5], p.bucket[6], q.bucket[5]. Can hash(x) and hash(y) be the same? Yes.

Multi-Bloom filter (figure): p.bucket has the bits for (p, x) and (p, y) set; q.bucket has the bit for (q, x) set.

Multi-Bloom filter. Each pointer has a fixed number of bits for storing its points-to information, called a bucket. Thus, if bucket size == 10, all pointees are hashed to a value from 0 to 9. This notion is extended to have multiple buckets for each pointer, one per hash function.
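A per-pointer structure with one bucket per hash function might look like the sketch below. The bucket size of 10 follows the slide; the choice of two hash functions, the linear hash family in hash_pointee and the integer pointee ids are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_HASHES  2u   /* assumed number of hash functions */
#define BUCKET_BITS 10u  /* bucket size used on the slide */

typedef struct {
    uint16_t bucket[NUM_HASHES];  /* one bucket (bit vector) per hash function */
} PointsToSet;

static void pts_init(PointsToSet *s) { memset(s, 0, sizeof *s); }

/* Hypothetical hash family mapping a pointee id to a bit in 0..9. */
static unsigned hash_pointee(unsigned which, unsigned pointee_id) {
    static const unsigned mult[NUM_HASHES] = { 7u, 13u };
    static const unsigned add[NUM_HASHES]  = { 3u, 5u };
    assert(which < NUM_HASHES);
    return (mult[which] * pointee_id + add[which]) % BUCKET_BITS;
}

/* Record the points-to pair (pointer, pointee) in every bucket. */
static void pts_add(PointsToSet *s, unsigned pointee_id) {
    for (unsigned i = 0; i < NUM_HASHES; ++i)
        s->bucket[i] |= (uint16_t)(1u << hash_pointee(i, pointee_id));
}

/* The pointee may be present only if its bit is set in every bucket. */
static bool pts_may_point_to(const PointsToSet *s, unsigned pointee_id) {
    for (unsigned i = 0; i < NUM_HASHES; ++i)
        if ((s->bucket[i] & (1u << hash_pointee(i, pointee_id))) == 0)
            return false;
    return true;
}
```

Under the usual assumption of uniform, independent hash functions, a pointer with n pointees, k buckets and m bits per bucket answers a membership query with a false positive with probability (1 - (1 - 1/m)^n)^k, so both k and m act as tuning knobs.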

Handling p = q. Points-to set of q should be added to the points-to set of p. Bitwise-OR each bucket of q with the corresponding bucket of p. Example on the next slide.

Example. h1(x) = 0, h2(x) = 5, h1(y) = 3, h2(y) = 3.
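The diagram for this example did not survive, so the following sketch makes its own assumption about the starting state: q points to x and p points to y before p = q is processed. Only the hash values are taken from the slide; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_HASHES  2u
#define BUCKET_BITS 10u

typedef struct {
    uint16_t bucket[NUM_HASHES];  /* bucket[0] uses h1, bucket[1] uses h2 */
} PointsToSet;

static void pts_init(PointsToSet *s) { memset(s, 0, sizeof *s); }

/* Record one pointee, given its bit positions under h1 and h2. */
static void pts_add(PointsToSet *s, unsigned h1, unsigned h2) {
    assert(h1 < BUCKET_BITS && h2 < BUCKET_BITS);
    s->bucket[0] |= (uint16_t)(1u << h1);
    s->bucket[1] |= (uint16_t)(1u << h2);
}

/* p = q: OR each bucket of q into the corresponding bucket of p. */
static void pts_copy_from(PointsToSet *p, const PointsToSet *q) {
    for (unsigned i = 0; i < NUM_HASHES; ++i)
        p->bucket[i] |= q->bucket[i];
}
```

With h1(x) = 0, h2(x) = 5 and h1(y) = 3, h2(y) = 3, the assignment leaves p with bits 0 and 3 set in its first bucket and bits 3 and 5 set in its second, while q is unchanged.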

Handling p = *q and *p = q. Extend multi-bloom to have another dimension for pointers pointed to by pointers. The idea can be extended to higher-level pointers (***p, ****p, and so on). We implemented it only for two-level pointers. Example on the next slide.

Another example. h(x) = 1, h(y) = 4, hs(p1) = 1, hs(p2) = 2.

Alias query: context-sensitive. If the query is DoAlias(p, q, c):
  for each hash function ii {
    hasPointee = false;
    for each bucket-bit jj
      if (p.bucket[c][ii][jj] and q.bucket[c][ii][jj]) hasPointee = true;
    if (hasPointee == false) return NoAlias;
  }
  return MayAlias;

Alias query: context-insensitive. If the query is DoAlias(p, q):
  for each context c {
    if (DoAlias(p, q, c) == MayAlias) return MayAlias;
  }
  return NoAlias;
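The two alias queries can be written in C roughly as below. The fixed numbers of contexts and hash functions, the 16-bit buckets and the names PointerInfo, do_alias_in_context and do_alias are assumptions; the inner loop over bucket bits is replaced by a single bitwise AND, which tests all bits of a bucket at once.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_CONTEXTS 4u  /* assumed */
#define NUM_HASHES   2u  /* assumed */

typedef enum { NO_ALIAS, MAY_ALIAS } AliasResult;

typedef struct {
    uint16_t bucket[NUM_CONTEXTS][NUM_HASHES];  /* bit vector per context, hash */
} PointerInfo;

static void pointer_info_init(PointerInfo *p) { memset(p, 0, sizeof *p); }

/* Context-sensitive query: p and q may alias in context c only if they
 * share at least one set bit in every bucket. */
static AliasResult do_alias_in_context(const PointerInfo *p,
                                       const PointerInfo *q, unsigned c) {
    assert(c < NUM_CONTEXTS);
    for (unsigned ii = 0; ii < NUM_HASHES; ++ii) {
        if ((p->bucket[c][ii] & q->bucket[c][ii]) == 0)
            return NO_ALIAS;
    }
    return MAY_ALIAS;
}

/* Context-insensitive query: may alias if they may alias in any context. */
static AliasResult do_alias(const PointerInfo *p, const PointerInfo *q) {
    for (unsigned c = 0; c < NUM_CONTEXTS; ++c) {
        if (do_alias_in_context(p, q, c) == MAY_ALIAS)
            return MAY_ALIAS;
    }
    return NO_ALIAS;
}
```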

Experimental evaluation: Time.

Experimental evaluation: Memory.

Experimental evaluation: Precision.

Summary. Bloom filters offer an effective way to represent points-to information. Precision can be close to that of an exact analysis while still saving storage. Parameters can be configured to suit the client application.

Future work. Flow-sensitive analysis using a counting Bloom filter: needs to support the kill operation; requires resetting bits; may introduce false negatives; the storage requirement of flow-sensitive analysis is an issue.
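To make the concern about false negatives concrete, here is a small sketch of a counting Bloom filter with a kill operation. It is an illustration of the general technique, not the design proposed in the talk: the single modulo hash, the 8 counters and the cf_* names are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_COUNTERS 8u  /* assumed size */

typedef struct {
    uint8_t count[NUM_COUNTERS];  /* a small counter replaces each bit */
} CountingFilter;

static void cf_init(CountingFilter *f) { memset(f, 0, sizeof *f); }

/* Hypothetical hash: ids that differ by a multiple of 8 collide. */
static unsigned cf_hash(unsigned id) { return id % NUM_COUNTERS; }

static void cf_add(CountingFilter *f, unsigned id) {
    uint8_t *c = &f->count[cf_hash(id)];
    assert(*c < UINT8_MAX);  /* counter overflow is not handled here */
    ++*c;
}

static bool cf_may_contain(const CountingFilter *f, unsigned id) {
    return f->count[cf_hash(id)] > 0;
}

/* Kill: decrement the counter, resetting the position when it reaches 0. */
static void cf_kill(CountingFilter *f, unsigned id) {
    uint8_t *c = &f->count[cf_hash(id)];
    if (*c > 0)
        --*c;
}
```

Killing an element that was never added but collides with a stored one (ids 9 and 1 here) clears the stored element's counter, so a later query for the stored element wrongly reports it absent.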

Future work. Adaptive Bloom filter parameters: not all pointers require the same number of bits; bits saved from one pointer can be used by another; storage is required for counters representing the number of bits for each pointer; bitwise operations are not straightforward.

Future work. Efficient flow-insensitive analysis: an approach similar to wave/deep propagation [1]; similarity with flow-sensitive analysis; preliminary results show that the number of iterations to reach a fix-point can be reduced, e.g., on an example set of programs, the total number of iterations is reduced from 180 (deep) to 148. [1] F. M. Q. Pereira and D. Berlin, Wave Propagation and Deep Propagation for Pointer Analysis, CGO 2009.

Approach. Start from main(). Add constraints for each pointer statement. Flow-insensitive. Jump to the called function, process it and return. Continue with the caller. A function called multiple times is processed multiple times. Keep context along with each constraint. Recursion is handled by iterating over the cycle until a fix-point. This is context-insensitive. At the end, iterate over constraints to extract points-to information in context-sensitive manner. Iteration makes it flow-insensitive.

Example. main() { S1: r1 = f(p); S2: r2 = f(q); S3: r3 = g(p); S4: r4 = h(); } f(a) { return a; } g(b) { c = b; } h() { p = &x; q = &y; } Only realizable paths. Along main-S1, r1 points to x. Along main-S2, r2 points to y. Even though main-S3 and main-S4 are different contexts, we merge context-information. Thus, c points to x. Since main-S1 and main-S2 call the same function, we do not merge the information. Thus, r1 does not alias with r2.

Experimental evaluation: Mod/Ref.