Context-Sensitive Flow Analysis Using Instantiation Constraints CS 343, Spring 01/02 John Whaley Based on a presentation by Chris Unkel.


1 Context-Sensitive Flow Analysis Using Instantiation Constraints CS 343, Spring 01/02 John Whaley Based on a presentation by Chris Unkel

2 Instantiation Constraints
- Flow-insensitive
- Context-sensitive
- Handles higher-order functions (function pointers) smoothly
- “Flow” analysis (“provenance” analysis)
- Inspired by Henglein’s Type Inference with Polymorphic Recursion (1993)
- Constraint-based analysis

3 Constraint-Based Analysis
- Pattern of program analysis
- Program is read to produce an abstract representation: a set of constraints
  - Graph
  - System of equations
- Abstract representation is processed
- Result is examined to tell us about the program

4 Types
[Figure: three kinds of type node. Alpha type (a2). Pointer type (ptr3), linking a pointer to its pointee (a4). Function type (func4), linking a function to its input (a5) and return (a6).]

5 Equality Constraint
- Result of an assignment
- Values flow both ways
  - From *x to *y
  - From *y to *x
- Handle with unification
    *x = *y;
[Figure: x: ptr1 with pointee *x: a2; y: ptr3 with pointee *y: a4.]
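The unification step for an equality constraint can be sketched with a small union-find structure. This is only an illustrative sketch, not the presentation's implementation: the names `nodes_init`, `node_find`, `node_unify` and the fixed-size `parent` array are assumptions, and type nodes are identified by the subscripts used on the slide (ptr1, a2, ptr3, a4).

```c
#include <assert.h>

#define MAX_NODES 16   /* hypothetical fixed capacity for this sketch */

static int parent[MAX_NODES];   /* union-find forest over type nodes */

/* Put every type node in its own equivalence class. */
void nodes_init(void) {
    for (int i = 0; i < MAX_NODES; i++) parent[i] = i;
}

/* Return the representative of n's class (with path halving). */
int node_find(int n) {
    assert(n >= 0 && n < MAX_NODES);
    while (parent[n] != n) {
        parent[n] = parent[parent[n]];
        n = parent[n];
    }
    return n;
}

/* Equality constraint: merge the classes of a and b, so values
 * flow in both directions between them. */
void node_unify(int a, int b) {
    int ra = node_find(a);
    int rb = node_find(b);
    if (ra != rb) parent[ra] = rb;
}
```

For the slide's statement `*x = *y;`, calling `node_unify(2, 4)` after `nodes_init()` merges a2 and a4, while the pointer nodes ptr1 and ptr3 stay in separate classes.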

6 Instantiation Constraint
- Used to make connections across procedures
- Values flow one direction only
- Generated by naming functions
- Identified with labels
    int foo(int x) { … }
    …
    foo^1(b);
[Figure: foo: func4 with input x: a5 and return a6; the instance foo^1: func1 with input b: a2 and return a3; an instantiation edge labeled )1 connects foo and foo^1.]

7 Call Id Twice Example
    int *id(int *x) { return x; }

    void main(void) {
        int *a, *b, *c, *d;
        int e, f;
        b = &e;
        d = &f;
        a = id(b);   /* call site 1, written id^1 on the slide */
        c = id(d);   /* call site 2, written id^2 on the slide */
    }

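8 Generated Graph
[Figure: constraint graph for the example. Nodes id: func, x: a5, a: ptr, b: ptr, c: ptr, d: ptr, e, f, *a, *c, and the instances id^1: func and id^2: func, connected to id by instantiation edges labeled )1 and )2.]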

9 Processing Rules (1)
[Figure: before/after graphs on foo: func4 (input x: a5, return a6) and foo^1: func1 (input b: a2, return a3). The instantiation edge )1 between the function nodes adds edges labeled )1 and (1 between their corresponding components.]

10 Processing Rules (2)
[Figure: before/after graphs on the pointer nodes ptr1 (pointee a2) and ptr3 (pointee a4), with edges labeled )1 and (1.]

11 Processing Rules (3)
[Figure: “Closure rule”. Nodes a1 and a3 are each connected to a2 by edges labeled )1 and (1; after processing, a1 and a3 are merged into one node a1, a3.]

12 Generated Graph
[Figure: the generated graph from slide 8, repeated.]

13 Processing Graph (1)
[Figure: edges labeled )2 and (2 are added to the graph alongside )1.]

14 Processing Graph (2)
[Figure: same graph as the previous slide.]

15 Processing Graph (3)
[Figure: c and d are merged into one node c, d: ptr; *c is no longer a separate node.]

16 Processing Graph (4)
[Figure: edges labeled (1 and )1 are added.]

17 Processing Graph (5)
[Figure: same graph as the previous slide.]

18 Result
[Figure: final graph. a and b are merged into a, b: ptr and c and d into c, d: ptr; the remaining nodes are id: func, x: a5, e, f, id^1: func, id^2: func.]

19 Result
[Figure: same graph as the previous slide.]

20 Nuts and Bolts
- Build the constraints for a procedure
- Simplify those constraints
- Instantiate (copy) those constraints to the callers
- Polarity: distinguish between function inputs and outputs

21 Using the Results
- A value in one variable can end up in a second if:
  - There is a path in the graph consisting of 0 or more red/close-paren edges followed by 0 or more green/open-paren edges
- From a to x: yes
- From a to c: no
[Figure legend: red edges are close parens ")", green edges are open parens "(".]
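The path condition above can be sketched as a two-phase graph search. This is an illustrative sketch under stated assumptions, not the presentation's algorithm: it assumes the processed graph is given as directed edges pointing along the direction of flow, it ignores the numeric labels (paren matching is assumed to have been done already by the closure rule), and the names `flow_clear`, `flow_add_edge` and `flow_may_reach` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 16   /* hypothetical node capacity */

enum edge_kind { CLOSE_PAREN, OPEN_PAREN };

/* edge[kind][from][to] != 0 means an edge of that kind exists. */
static unsigned char edge[2][MAX_N][MAX_N];

void flow_clear(void) { memset(edge, 0, sizeof edge); }

void flow_add_edge(enum edge_kind kind, int from, int to) {
    assert(from >= 0 && from < MAX_N && to >= 0 && to < MAX_N);
    edge[kind][from][to] = 1;
}

/* Depth-first search over (node, phase) pairs.
 * Phase 0: close-paren edges may still be taken.
 * Phase 1: an open-paren edge has been taken, so only open-paren
 *          edges are allowed from here on. */
static int visit(int node, int phase, int target,
                 unsigned char seen[2][MAX_N]) {
    if (node == target) return 1;
    if (seen[phase][node]) return 0;
    seen[phase][node] = 1;
    for (int next = 0; next < MAX_N; next++) {
        if (phase == 0 && edge[CLOSE_PAREN][node][next] &&
            visit(next, 0, target, seen))
            return 1;
        if (edge[OPEN_PAREN][node][next] && visit(next, 1, target, seen))
            return 1;
    }
    return 0;
}

/* Nonzero if a value in `from` can end up in `to`: a path of zero or
 * more close-paren edges followed by zero or more open-paren edges. */
int flow_may_reach(int from, int to) {
    unsigned char seen[2][MAX_N];
    memset(seen, 0, sizeof seen);
    return visit(from, 0, to, seen);
}
```

With edges 0 to 1 (close), 1 to 2 (open) and 2 to 3 (close), node 0 reaches node 2 but not node 3, because a close-paren edge may not follow an open-paren edge.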

22 An Application
- Format String Vulnerability
  - Some format strings can cause data to be overwritten, e.g. “%n%n%n%n%n%n%n%n”
  - A malicious format string can gain control of the program
- Problem formulation
  - Values should not flow from an unsafe source to a format string
  - Source: network data read by recv
  - Format string: first argument to printf
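The flow that the analysis is meant to flag can be illustrated in C. This is a sketch for illustration only: the helper name `format_untrusted` is hypothetical, and the vulnerable pattern appears only in a comment so that nothing unsafe is executed.

```c
#include <assert.h>
#include <stdio.h>

/* Vulnerable pattern (the flow to be reported):
 *
 *     recv(sock, buf, sizeof buf - 1, 0);   // unsafe source
 *     printf(buf);                          // buf used as format string
 *
 * If buf contains "%n", printf writes to memory chosen by the sender.
 */

/* Safe pattern: untrusted data is passed only as an argument to a
 * constant "%s" format, so it never reaches the format-string position.
 * Returns the number of characters that the full output requires. */
int format_untrusted(char *out, size_t out_size, const char *untrusted) {
    assert(out != NULL && untrusted != NULL && out_size > 0);
    return snprintf(out, out_size, "%s", untrusted);
}
```

Here the conversion specifiers inside the untrusted string are copied verbatim rather than interpreted, which is the property the problem formulation asks for.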

23 Inst. Constraints Summary
- Result shows provenance: the source of values
- Can be used as a pointer analysis
- The underlying problem (type inference with polymorphic recursion) is undecidable in general
  - But the algorithm seems to run quickly in practice
- Context-sensitive
  - Paren-matching through the closure rule
- Interesting base analysis to build tools on

24 The End

25 Some Other Pointer Work
- Andersen
  - Flow-insensitive, with directional assignments
  - Assignment copies points-to set of rhs to lhs
  - Cycles make it slow; other work to find and collapse cycles
- Das (“one level flow”)
  - One pointer level of directionality
  - Handles the common cases in C well
  - Multiple return values, altering parameters
- Wilson and Lam
  - Flow-sensitive, context-sensitive

26 Further Reading
- Manuvir Das. Unification-Based Pointer Analysis with Directional Assignments. Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation, Vancouver, BC, Canada, June 2000.
- Manuel Fahndrich, Jakob Rehof, Manuvir Das. Scalable context-sensitive flow analysis using instantiation constraints. Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation, Vancouver, BC, Canada, June 2000.
- R. P. Wilson and M. S. Lam. Efficient context-sensitive pointer analysis for C programs. Proceedings of the ACM SIGPLAN 1995 Conference on Programming Language Design and Implementation, June 1995.
- L. O. Andersen. Program analysis and specialization for the C programming language. Technical Report 94-19, University of Copenhagen, 1994.

27 Overview
- Steensgaard ’95
  - Flow-insensitive, context-insensitive alias analysis
- Liang and Harrold ’99
  - Straightforward context-sensitive extension of Steensgaard
- Fahndrich, Rehof, Das ’00
  - Context-sensitive extension of Steensgaard that handles function pointers and higher-order functions smoothly

28 Steensgaard Preliminaries
- “Type inference”
  - Types here do not refer to integer, char, etc.!
  - “Non-standard” or “extended” types
  - Two objects sharing the same type have some property in common
    - Point to the same things
  - Each variable in the program has a type
  - Type rules describe a consistent typing
- Points-to sets vs. alias pairs
- Begin by ignoring the conditional join stuff

29 Steensgaard
- Basic operation
  - After an assignment x=y, x and y point to the same set of things (*x and *y are the same)
  - Non-directional: x=y has the same effect as y=x
  - Also implies that *x and *y point to the same set of things (and **x and **y, and so on)
  - Alias relation (not points-to relation) is symmetric, reflexive, transitive
- Types can be grouped into equivalence classes of objects that point to the same things
- Assignment joins equivalence classes of pointees together

30 Steensgaard Example (1)
    a = &x;
    b = &y;
    if (p) y = &z;
    else y = &x;
    c = &y;
[Figure: initial graph with variable nodes a, b, c, x, y, z, address nodes &x, &y, &z, and pointee nodes *a, *b, *c, *y, each in its own class. The same code is shown on every slide of the example.]

31 Steensgaard Example (2)
[Figure: same graph as the previous slide.]

32 Steensgaard Example (3)
[Figure: *a and x are joined into one class: *a, x.]

33 Steensgaard Example (4)
[Figure: same graph as the previous slide.]

34 Steensgaard Example (5)
[Figure: *b and y are joined into one class: *b, y.]

35 Steensgaard Example (6)
[Figure: same graph as the previous slide.]

36 Steensgaard Example (7)
[Figure: *y and z are joined into one class: *y, z.]

37 Steensgaard Example (8)
[Figure: same graph as the previous slide.]

38 Steensgaard Example (9)
[Figure: the classes *a, x and *y, z are joined: *a, x, *y, z.]

39 Steensgaard Example (10)
[Figure: same graph as the previous slide.]

40 Steensgaard Example (Result)
[Figure: *c joins the class *b, y. Final pointee classes: {*a, x, *y, z} and {*b, y, *c}; a, b and c remain separate nodes.]
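The example can be replayed with a small union-find sketch. This is a simplified illustration, not Steensgaard's full algorithm: it handles only statements of the form `lhs = &var`, the conditional is treated flow-insensitively (both branches apply), and all names (`st_init`, `st_find`, `st_join`, `st_addr_of`, `st_points_to`, `st_run_example`, the `V_*` constants) are hypothetical.

```c
#include <assert.h>

/* Variables of the slide's example. */
enum { V_A, V_B, V_C, V_X, V_Y, V_Z, N_VARS };

static int parent[N_VARS];    /* union-find forest over variables */
static int pointee[N_VARS];   /* class a representative points to, or -1 */

void st_init(void) {
    for (int i = 0; i < N_VARS; i++) { parent[i] = i; pointee[i] = -1; }
}

int st_find(int n) {
    assert(n >= 0 && n < N_VARS);
    while (parent[n] != n) { parent[n] = parent[parent[n]]; n = parent[n]; }
    return n;
}

/* Merge two classes and, recursively, the classes they point to. */
void st_join(int a, int b) {
    a = st_find(a);
    b = st_find(b);
    if (a == b) return;
    int pa = pointee[a], pb = pointee[b];
    parent[a] = b;
    if (pb == -1) pointee[b] = pa;
    else if (pa != -1) st_join(pa, pb);
}

/* Statement lhs = &var: var joins the class that lhs points to. */
void st_addr_of(int lhs, int var) {
    int r = st_find(lhs);
    if (pointee[r] == -1) pointee[r] = st_find(var);
    else st_join(pointee[r], var);
}

/* Nonzero if, according to the analysis, p may point to var. */
int st_points_to(int p, int var) {
    int r = st_find(p);
    return pointee[r] != -1 && st_find(pointee[r]) == st_find(var);
}

/* a = &x; b = &y; if (p) y = &z; else y = &x; c = &y; */
void st_run_example(void) {
    st_init();
    st_addr_of(V_A, V_X);
    st_addr_of(V_B, V_Y);
    st_addr_of(V_Y, V_Z);   /* then-branch */
    st_addr_of(V_Y, V_X);   /* else-branch: joins z with x */
    st_addr_of(V_C, V_Y);
}
```

After `st_run_example()`, x and z share a class, so `st_points_to(V_A, V_Z)` holds even though the program never assigns `&z` to `a`. That imprecision is the cost of non-directional joins.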

41
- Observation: forcing pointees to join is sometimes too strong an action
  - Assignment is really directional
  - After x=y, x points to everything y points to
  - Using a directional assignment prevents using equivalence classes/union find
  - But, if y’s points-to set is null, OK to do nothing
- When we see x=y for y not a pointer (bottom type in this notation), don’t join immediately, but record the fact that if y is later found to point to something, we must join it with x.
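The deferred join described above can be sketched as follows. This is a deliberately simplified, single-level model rather than Steensgaard's formulation: variables are never merged themselves, only the locations they point to, and the pending joins are kept per variable instead of per type. All names (`cj_init`, `cj_find`, `cj_set_pointee`, `cj_flush`, `cj_addr_of`, `cj_copy`, `cj_same_pointee`) are hypothetical.

```c
#include <assert.h>

#define N_VARS 8   /* hypothetical capacities */
#define N_LOCS 8

static int loc_parent[N_LOCS];       /* union-find over pointee locations */
static int pts[N_VARS];              /* pointee of each variable, -1 = bottom */
static int pending[N_VARS][N_VARS];  /* pending[y][x]: join x with y later */

void cj_init(void) {
    for (int i = 0; i < N_LOCS; i++) loc_parent[i] = i;
    for (int v = 0; v < N_VARS; v++) {
        pts[v] = -1;
        for (int w = 0; w < N_VARS; w++) pending[v][w] = 0;
    }
}

int cj_find(int l) {
    assert(l >= 0 && l < N_LOCS);
    while (loc_parent[l] != l) l = loc_parent[l];
    return l;
}

void cj_set_pointee(int v, int loc);

/* y has just become a pointer: perform the joins recorded for it. */
void cj_flush(int y) {
    for (int x = 0; x < N_VARS; x++) {
        if (pending[y][x]) {
            pending[y][x] = 0;
            cj_set_pointee(x, pts[y]);
        }
    }
}

/* Make v point to loc's class, joining with any existing pointee. */
void cj_set_pointee(int v, int loc) {
    if (pts[v] == -1) {
        pts[v] = cj_find(loc);
        cj_flush(v);
    } else {
        loc_parent[cj_find(pts[v])] = cj_find(loc);
    }
}

/* Statement x = &loc. */
void cj_addr_of(int x, int loc) { cj_set_pointee(x, loc); }

/* Statement x = y: join now if y points somewhere, otherwise defer. */
void cj_copy(int x, int y) {
    if (pts[y] == -1) pending[y][x] = 1;
    else cj_set_pointee(x, pts[y]);
}

/* Nonzero if v's pointee is in the same class as loc. */
int cj_same_pointee(int v, int loc) {
    return pts[v] != -1 && cj_find(pts[v]) == cj_find(loc);
}
```

In this model, `x = &m; x = y;` with `y` still bottom leaves `y` pointing to nothing; only a later `y = &l` triggers the recorded join of `l` with `m`.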

42 A Typing Rule
GIVEN
    x : ref(a)          x has type pointer to type a
    y : ref(b)          y has type pointer to type b
    b ⊴ a               b is not a pointer type, or a and b are the same type
we conclude that
    welltyped(x = y)    our types are well-typed for statement x = y (we have a consistent points-to graph)
Reading downward verifies consistency; upward gives constraints.

43 Steensgaard Summary
- Fast union-find allows solution in “near-linear” time (linear times the inverse Ackermann function)
- Does not handle structs (but see later paper by same author)
- Flow-insensitive
- Context-insensitive
- Non-directional assignments

44 Liang & Harrold Operation
- Do Steensgaard within each procedure to build a summary
- Then do bottom-up, top-down propagation of results
  - Bottom-up: aliases from callees to callers
  - Top-down: from callers to callees
- “FICS” = flow-insensitive context-sensitive

45 L & H Example (1)
    int *id(int *x) { return x; }

    int e, f;

    void main(void) {
        int *a, *b, *c, *d;
        b = &e;
        d = &f;
        a = id(b);
        c = id(d);
    }
[Figure: call graph with nodes main and id.]

46 L & H Example (2) (Phase 1)
This is the result of applying Steensgaard to each procedure individually; the result is a summary of the pointer behavior of each function.
[Figure: graphs for main and id with nodes a, b, c, d, e, f, x, idret, and the merged pointee node *x, *idret.]

47 L & H Example (3) (Phase 2)
Propagate pointer information from callees to callers (apply summaries of called functions). Bind formals to actuals, and returns to where they are assigned. (Bottom-up phase.)
[Figure: first call site. The Phase 1 graphs with binding edges between main and id and the induced edge.]

48 L & H Example (4) (Phase 2)
[Figure: second call site, same graphs.]

49 L & H Example (5) (Phase 3)
Propagate pointer information from callers to callees. (Top-down phase.)
[Figure: first call site. The pointee node in id becomes *x, *idret, e.]

50 L & H Example (6) (Phase 3)
[Figure: second call site. The pointee node in id becomes *x, *idret, e, f.]

51 L & H Summary
- Cycles in call graph
  - In BU/TD phases, iterate among procedures in each SCC to find a fixpoint
- Presumes call graph pre-exists
  - With function pointers, need pointer analysis to provide call graph!
  - Algorithm as expressed doesn’t handle function pointers

