# Scalable Points-to Analysis. Rupesh Nasre. Advisor: Prof. R. Govindarajan. Comprehensive Examination. Jun 22, 2009.

Outline. Introduction (points-to analysis). Issues involved in context-sensitive analyses. Bloom filter. Points-to analysis with bloom filter. Experimental evaluation. Future work.

What is Pointer Analysis? Pointer analysis statically determines the possible run-time values of a pointer and the relationships among pointers.

Why Pointer Analysis? For parallelization: fun(p); fun(q); For common subexpression elimination: x = p + 2; y = q + 2; For dead code elimination: if (p == q) { fun(); } For other optimizations.

Introduction. Flow sensitivity. Context sensitivity. Field sensitivity. Unification based. Inclusion based.

Flow sensitivity. p = &x; p = &y; label:... flow-sensitive, at label: {(p, y)}. flow-insensitive: {(p, x), (p, y)}.
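The difference between the two results above can be reproduced with a minimal sketch. It assumes straight-line code made only of address-of assignments, so every assignment is a strong update; the function names are illustrative and not taken from the talk.

```python
def flow_insensitive(stmts):
    """Ignore statement order: keep every (pointer, pointee) pair."""
    return {(p, x) for p, x in stmts}


def flow_sensitive(stmts):
    """Respect statement order: a later p = &y kills the earlier p = &x."""
    current = {}
    for p, x in stmts:
        current[p] = {x}  # strong update, valid for straight-line code only
    return {(p, x) for p, targets in current.items() for x in targets}


# Each tuple (p, x) stands for the statement p = &x.
program = [("p", "x"), ("p", "y")]
insensitive = flow_insensitive(program)  # facts valid anywhere in the program
sensitive = flow_sensitive(program)      # facts valid at the label after both
```

The flow-sensitive variant tracks one state per program point, which is why it is more precise and also more expensive to store.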

Context sensitivity. caller1() { fun(&x); } caller2() { fun(&y); } fun(int *ptr) { a = ptr; } context-insensitive: {(a, x), (a, y)}. context-sensitive: {(a, x)} along call-path caller1, {(a, y)} along call-path caller2.

Field sensitivity. a.f = &x; field-sensitive: {(a.f, x)}. field-insensitive: {(a, x)}.

Unification based. one(&s1); one(&s2); two(&s3); one(struct s *p) { p->a = 3; two(p); } two(struct s *q) { q->b = 4; } unification-based: {(p, &s1), (p, &s2), (p, &s3), (q, &s1), (q, &s2), (q, &s3)}.

Inclusion based. one(&s1); one(&s2); two(&s3); one(struct s *p) { p->a = 3; two(p); } two(struct s *q) { q->b = 4; } inclusion-based: {(p, &s1), (p, &s2), (q, &s1), (q, &s2), (q, &s3)}.
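The two result sets can be reproduced with a small sketch. It is a deliberate simplification: parameter passing is modelled as address-of facts plus one copy q = p (from the call two(p)), and the unification variant merges the two sides of a copy directly rather than building Steensgaard's type graph. All function and variable names are illustrative.

```python
def inclusion_based(address_of, copies):
    """Andersen-style: a copy dst = src only adds pts(src) to pts(dst)."""
    pts = {}
    for ptr, obj in address_of:
        pts.setdefault(ptr, set()).add(obj)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for dst, src in copies:
            before = len(pts.setdefault(dst, set()))
            pts[dst] |= pts.get(src, set())
            changed |= len(pts[dst]) != before
    return pts


def unification_based(address_of, copies):
    """Steensgaard-style (simplified): a copy merges both sides into one class."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for dst, src in copies:
        parent[find(dst)] = find(src)
    merged = {}
    for ptr, obj in address_of:
        merged.setdefault(find(ptr), set()).add(obj)
    pointers = {p for p, _ in address_of} | {v for c in copies for v in c}
    return {p: merged.get(find(p), set()) for p in pointers}


address_of = [("p", "s1"), ("p", "s2"), ("q", "s3")]  # one(&s1); one(&s2); two(&s3);
copies = [("q", "p")]                                  # two(p) inside one()
inclusion = inclusion_based(address_of, copies)
unification = unification_based(address_of, copies)
```

Inclusion keeps the information one-directional, so p never picks up s3; unification gives both pointers the same set, trading precision for near-linear running time.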

Related work. Scalable points-to analyses. B. Steensgaard, Points-to Analysis in Almost Linear Time, POPL 1996. J. Whaley and M. S. Lam, Cloning-Based Context-Sensitive Pointer Alias Analysis Using Binary Decision Diagrams, PLDI 2004. B. Hardekopf and C. Lin, The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code, PLDI 2007. V. Kahlon, Bootstrapping: a technique for scalable flow and context-sensitive pointer alias analysis, PLDI 2008.

Issues with context-sensitivity. main() { S1: f(&x); S2: f(&y); } f(a) { S3: g(a); S4: g(z); } g(b) { ... } Invocation graph: main invokes f at S1 and at S2, and each invocation of f invokes g at S3 and at S4, giving two f nodes and four g nodes. Exponential number of contexts.
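The growth can be made concrete by counting invocation-graph nodes. The sketch below assumes a recursion-free program described as a map from each function to its call sites; `count_contexts` and `chain` are hypothetical helpers, not part of the presented tool.

```python
def count_contexts(call_sites, root="main"):
    """Count the nodes of the invocation graph rooted at `root` (no recursion)."""
    total = 1  # the invocation of `root` itself
    for _site, callee in call_sites.get(root, []):
        total += count_contexts(call_sites, callee)
    return total


def chain(depth):
    """Hypothetical program: each of `depth` functions calls the next one twice."""
    return {f"f{i}": [("a", f"f{i + 1}"), ("b", f"f{i + 1}")] for i in range(depth)}


# The program from the slide: main calls f twice, f calls g twice.
call_sites = {
    "main": [("S1", "f"), ("S2", "f")],
    "f": [("S3", "g"), ("S4", "g")],
}
contexts = count_contexts(call_sites)          # 1 main + 2 f + 4 g
deep_contexts = count_contexts(chain(10), "f0")
```

With two call sites per level, ten levels of calls already yield 2047 invocation-graph nodes, each of which would need its own points-to information in a fully context-sensitive analysis.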

Issues with context-sensitivity. Storage requirement increases exponentially. Along S1-S3-S5-S7, p points to {x1, x3, x5, x7}. Along S1-S3-S5-S8, p points to {x1, x3, x5, x8}. Along S1-S3-S6-S7, p points to {x1, x3, x6, x7}. Along S1-S3-S6-S8, p points to {x1, x3, x6, x8}. Along S1-S4-S5-S7, p points to {x1, x4, x5, x7}. Along S1-S4-S5-S8, p points to {x1, x4, x5, x8}. Along S1-S4-S6-S7, p points to {x1, x4, x6, x7}. Along S1-S4-S6-S8, p points to {x1, x4, x6, x8}. Along S2...

Tackling scalability issues. How about not storing complete contexts? How about storing approximate points-to information? Can we have a probabilistic data structure that approximates the storage? Can we control the false-positive rate?

Bloom filter. A bloom filter is a probabilistic data structure for membership queries, and is typically implemented as a fixed-size array of bits. To store elements e1, e2, e3, bits at positions hash(e1), hash(e2) and hash(e3) are set. (Figure: e1 and e3 hash to the same bit and e2 to a different one, so only two bits are set.)
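A minimal sketch of such a filter follows. It is not the implementation from the talk: the bit-array size, the number of hash functions, and the use of seeded SHA-256 digests as hash functions are all assumptions chosen to keep the example deterministic.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a fixed-size bit array

    def _positions(self, element):
        # Assumed hash family: SHA-256 of "seed:element", reduced modulo the size.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{element}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, element):
        for pos in self._positions(element):
            self.bits |= 1 << pos

    def might_contain(self, element):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(element))


bf = BloomFilter()
for e in ("e1", "e2", "e3"):
    bf.add(e)
```

Elements are never stored, only bits, so the memory footprint is fixed up front. The price is that `might_contain` can answer True for an element that was never added when all of its positions happen to be set by other elements.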

Points-to analysis with Bloom filter. A constraint is an abstract representation of the pointer instruction. p = &x becomes p.pointsTo(x). p = q becomes p.copyFrom(q). *p = q becomes p.storeThrough(q). p = *q becomes p.loadFrom(q). Function arguments and return values resemble p = q type of statement. Note, each constraint also stores the context.

Points-to Analysis with Bloom filter. If points-to pairs (p, x) are kept in bloom filter, existential queries like “does p point to x?” can be answered. What about queries like “do p and q alias?”? What about context-sensitive queries like “do p and q alias in context c?”? How to process assignment statements p = q? How about load/store statements *p = q and q = *p?

Multi-Bloom filter. Points-to pairs are kept in a bloom filter per pointer. A bit set to 1 represents a points-to pair. Example (diagram on the next slide): Points-to pairs {(p, x), (p, y), (q, x)}. hash(x) = 5, hash(y) = 6. Set bit numbers: p.bucket[5], p.bucket[6], q.bucket[5]. Can hash(x) and hash(y) be the same? Yes.

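Multi-Bloom filter. (Figure: two 10-bit buckets with bit 0 on the left. p.bucket = 0000011000, with bit 5 set for (p, x) and bit 6 set for (p, y). q.bucket = 0000010000, with bit 5 set for (q, x).)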
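The example can be replayed with one integer per pointer acting as its bucket. This is only a sketch: the hash values for x and y come from the slides, while the bucket width of 10 bits, the helper names, and the extra pointee z (which collides with x) are assumptions added for illustration.

```python
BUCKET_BITS = 10
HASH = {"x": 5, "y": 6, "z": 5}  # z is a hypothetical pointee that collides with x

buckets = {}  # pointer name -> bucket stored as an int


def add_pair(pointer, pointee):
    """Record the points-to pair (pointer, pointee) by setting one bit."""
    buckets[pointer] = buckets.get(pointer, 0) | (1 << HASH[pointee])


def may_point_to(pointer, pointee):
    return bool((buckets.get(pointer, 0) >> HASH[pointee]) & 1)


def show(pointer):
    """Render a bucket with bit 0 on the left."""
    return "".join(str((buckets.get(pointer, 0) >> i) & 1) for i in range(BUCKET_BITS))


for pair in [("p", "x"), ("p", "y"), ("q", "x")]:
    add_pair(*pair)
```

Here `may_point_to("q", "y")` is False, which is always a correct answer, whereas `may_point_to("q", "z")` is True only because z shares bit 5 with x. That is the kind of false positive a second hash function is meant to reduce.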

Multi-Bloom filter. Each pointer has a fixed number of bits for storing its points-to information, called a bucket. Thus, if bucket size == 10, all pointees are hashed to a value from 0 to 9. This notion is extended to have multiple buckets for each pointer, one per hash function.

Handling p = q. The points-to set of q should be added to the points-to set of p. Bitwise-OR each bucket of q into the corresponding bucket of p. Example on the next slide.

Example. h1(x) = 0, h2(x) = 5, h1(y) = 3, h2(y) = 3.
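Since the accompanying diagram is not part of the transcript, the following sketch replays the copy with the hash values above under an assumed starting state: p points to x and q points to y before p = q is processed. The helper names are illustrative.

```python
# H[i][v] is the value of hash function h(i+1) for pointee v, as given on the slide.
H = [{"x": 0, "y": 3}, {"x": 5, "y": 3}]


def new_pointer():
    return [0 for _ in H]  # one bucket (an int) per hash function


def points_to(ptr, pointee):
    """Handle ptr = &pointee: set one bit in every bucket."""
    for i, h in enumerate(H):
        ptr[i] |= 1 << h[pointee]


def copy_from(dst, src):
    """Handle dst = src: OR each bucket of src into the matching bucket of dst."""
    for i in range(len(H)):
        dst[i] |= src[i]


def may_point_to(ptr, pointee):
    return all((ptr[i] >> h[pointee]) & 1 for i, h in enumerate(H))


p, q = new_pointer(), new_pointer()
points_to(p, "x")  # assumed initial fact: p -> x
points_to(q, "y")  # assumed initial fact: q -> y
copy_from(p, q)    # p = q
```

After the copy, p's first bucket has bits 0 and 3 set and its second bucket has bits 3 and 5 set, so p may point to both x and y, while q is unchanged. The copy costs one OR per hash function regardless of how many pointees q has.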

Handling p = *q and *p = q. Extend multi-bloom to have another dimension for pointers pointed to by pointers. The idea can be extended to higher-level pointers (***p, ****p, and so on). We implemented it only for two-level pointers. Example on the next slide.

Another example. h(x) = 1, h(y) = 4, hs(p1) = 1, hs(p2) = 2.

Alias query: context-sensitive. If the query is DoAlias(p, q, c), for each hash function ii { hasPointee = false; for each bucket-bit jj if (p.bucket[c][ii][jj] and q.bucket[c][ii][jj]) hasPointee = true; if (hasPointee == false) return NoAlias; } return MayAlias;

Alias query: context-insensitive. If the query is DoAlias(p, q), for each context c { if (DoAlias(p, q, c) == MayAlias) return MayAlias; } return NoAlias;
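Both queries can be sketched directly from the pseudocode. The data layout is an assumption: each pointer is a dictionary from context to a list of buckets, one integer per hash function, and a pointer with no entry for a context is treated as pointing to nothing there.

```python
MAY_ALIAS, NO_ALIAS = "MayAlias", "NoAlias"


def do_alias_in_context(p, q, c):
    """Context-sensitive query: every hash function must show a common bit."""
    if c not in p or c not in q:
        return NO_ALIAS  # assumption: no information means an empty points-to set
    for p_bucket, q_bucket in zip(p[c], q[c]):
        if (p_bucket & q_bucket) == 0:  # no shared pointee bit under this hash
            return NO_ALIAS
    return MAY_ALIAS


def do_alias(p, q):
    """Context-insensitive query: may-alias in any context is enough."""
    for c in p.keys() & q.keys():
        if do_alias_in_context(p, q, c) == MAY_ALIAS:
            return MAY_ALIAS
    return NO_ALIAS


# Hypothetical data with two hash functions: bit 0 / bit 5 encode x, bit 3 / bit 3 encode y.
p = {"c1": [1 << 0, 1 << 5], "c2": [1 << 3, 1 << 3]}
q = {"c1": [1 << 3, 1 << 3], "c2": [1 << 3, 1 << 3]}
in_c1 = do_alias_in_context(p, q, "c1")
in_c2 = do_alias_in_context(p, q, "c2")
anywhere = do_alias(p, q)
```

The bitwise AND replaces the inner loop over bucket bits in the pseudocode. A NoAlias answer is always sound because a real common pointee would set the same bit in both buckets under every hash function; MayAlias can be a false positive.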

Experimental evaluation: Time.

Experimental evaluation: Memory.

Experimental evaluation: Precision.

Summary. Bloom filters offer an effective way to represent points-to information. Precision can come close to that of an exact analysis while still saving storage. Parameters can be tuned to suit the application.

Future work. Flow-sensitive analysis using counting bloom filter: need to support kill operation; requires resetting bits; may introduce false negatives; storage requirement of flow-sensitive analysis is an issue.
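A counting bucket is one way the kill operation might be supported. The sketch below is only an illustration of the idea and of the false-negative risk, not the planned design; the class, the toy hash table, and the colliding pointee z are all assumptions.

```python
class CountingBucket:
    """One bucket whose slots are small counters instead of single bits."""

    def __init__(self, hash_of, size=10):
        self.hash_of = hash_of  # toy hash: pointee name -> slot index
        self.counts = [0] * size

    def add(self, pointee):
        self.counts[self.hash_of[pointee]] += 1

    def kill(self, pointee):
        slot = self.hash_of[pointee]
        if self.counts[slot] > 0:
            self.counts[slot] -= 1

    def may_contain(self, pointee):
        return self.counts[self.hash_of[pointee]] > 0


HASH = {"x": 5, "y": 6, "z": 5}  # z is a hypothetical pointee colliding with x

ok = CountingBucket(HASH)
ok.add("x")
ok.add("y")
ok.kill("x")  # x was added once, so its counter drops back to zero

bad = CountingBucket(HASH)
bad.add("x")
bad.kill("z")  # z was never added, but it shares slot 5 with x
```

In `bad`, the query for x now answers False even though x was added: killing an element that only appears to be present removes someone else's count, which is how false negatives can arise.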

Future work. Adaptive bloom filter parameters: not all pointers require the same number of bits; bits saved from one pointer can be used by another; storage is required for counters representing the number of bits for each pointer; bitwise operations are not straightforward.

Future work. Efficient flow-insensitive analysis: approach similar to wave/deep propagation¹; similarity with flow-sensitive analysis; preliminary results show that the number of iterations to reach a fix-point can be reduced, e.g., on an example set of programs, the total number of iterations is reduced from 180 (deep) to 148. ¹ F. M. Q. Pereira and D. Berlin, Wave Propagation and Deep Propagation for Pointer Analysis, CGO 2009.

Approach. Start from main(). Add constraints for each pointer statement. Flow-insensitive. Jump to the called function, process it and return. Continue with the caller. A function called multiple times is processed multiple times. Keep the context along with each constraint. Recursion is handled by iterating over the cycle until a fix-point; this part is context-insensitive. At the end, iterate over the constraints to extract points-to information in a context-sensitive manner. Iteration makes it flow-insensitive.

Example. main() { S1: r1 = f(p); S2: r2 = f(q); S3: r3 = g(p); S4: r4 = h(); } f(a) { return a; } g(b) { c = b; } h() { p = &x; q = &y; } Only realizable paths. Along main-S1, r1 points to x. Along main-S2, r2 points to y. Even though main-S3 and main-S4 are different contexts, we merge context-information. Thus, c points to x. Since main-S1 and main-S2 call the same function, we do not merge the information. Thus, r1 does not alias with r2.

Experimental evaluation: Mod/Ref.