Presentation is loading. Please wait.

Presentation is loading. Please wait.

Static Analysis: Data-Flow Analysis I

Similar presentations


Presentation on theme: "Static Analysis: Data-Flow Analysis I"— Presentation transcript:

1 Static Analysis: Data-Flow Analysis I
( AMP’11: Advanced Models & Programs, 2011 ) Claus Brabrand IT University of Copenhagen ( )

2

3 Agenda (4/4, ’11) Introduction: Data-flow Analysis:
Undecidability, Reduction, and Approximation Data-flow Analysis: Quick tour & running example Control-Flow Graphs: Control-flow, data-flow, and confluence ”Science-Fiction Math”: Lattice theory, monotonicity, and fixed-points Putting it all together…: Example revisited

4 Breaks? #Breaks, |Breaks|, ...?
I’ll aim for: more (but shorter) breaks

5 Notes on Static Analysis
”Lecture Notes on Static Analysis” by Michael I. Schwartzbach (BRICS, Uni. Aarhus) Chapter 1, 2, 4, 5, 6 (until p. 19) (Excl. ”pointers”) Claims to be "not overly formal", but the math involved can be quite challenging (at times)…

6 Static Analysis Purpose (of static analysis):
Gather information (on running behavior of program) ”program points” Usage (of static analysis): Basis for subsequent…: Error detection information program points Static Analysis Optimization

7 Analyses for Optimization
Example Analyses: ”Constant Propagation Analysis”: Precompute constants (e.g., replace ’5*x+z’ by ’42’) ”Live Variables Analysis”: Dead-code elimination (e.g., get rid of unused variable ’z’) ”Available Expressions Analysis”: Avoid recomputing already computed exprs (cache results)

8 Analyses for Finding Errors
Example Analyses: ”Symbol Checking”: Catch (dynamic) symbol errors ”Type Checking”: Catch (dynamic) type errors ”Initialized Variable Analysis”: Catch unintialized variables

9 Quizzzzz: Optimization?
If you want a fast C-program, should you use: LOOP 1: LOOP 2 (optimized by programmer): i.e., ”array-version” or ”optimized pointer-version” ? for (i = 0; i < N; i++) { a[i] = a[i] * 2000; a[i] = a[i] / 10000; } ? …or… b = a; for (i = 0; i < N; i++) { *b = *b * 2000; *b = *b / 10000; b++; } ?

10 Answer: Results (of running the programs):
You’ll learn how to make similar analyses [next two weeks] Results (of running the programs): Compilers use highly sophisticated static analyses for optimization! Recommendation: focus on writing clear code (for people and for compilers to understand)

11

12 Conceptual Motivation
Undecidability Reduction principle Approximation

13 Rice’s Theorem (1953) Examples: does program ’P’ always halt?
is the value of integer variable ’x’ always positive? does variable ’x’ always have the same value? which variables can pointer ’p’ point to? does expression ’E’ always evaluate to true? what are the possible outputs of program ’P’? “Any interesting problem about the runtime behavior of a program* is undecidable” -- Rice’s Theorem [paraphrased] (1953) *) written in a turing-complete language

14 Undecidability (self-referentiality)
Consider "The Book-of-all-Books": This book contains the titles of all books that do not have a self-reference (i.e. don't contain their title inside) Finitely many books; i.e.: We can sit down & figure out whether to include or not... Q: What about "The Book-of-all-Books"; Should it be included or not? "Self-referential paradox" (many guises): e.g. "This sentence is false" "The Bible" "War and Peace" "Programming Languages, An Interp.-Based Approach" ... The Book-of-all-Books

15 Termination Undecidable!
Assume termination is decidable (in Java); i.e.  some program, halts: Program  bool Q: Does P0 loop or terminate...? :) Hence: halts cannot exist! i.e., "Termination is undecidable" bool halts(Program p) { ... } -- P0.java -- bool halts(Program p) { ... } Program p0 = read_program("P0.java"); if (halts(p0)) loop(); else halt(); *) for turing-complete languages

16 Rice’s Theorem (1953) Examples: does program ’P’ always halt?
is the value of integer variable ’x’ always positive? does variable ’x’ always have the same value? which variables can pointer ’p’ point to? does expression ’E’ always evaluate to true? what are the possible outputs of program ’P’? “Any interesting problem about the runtime behavior program* is undecidable” -- Rice’s Theorem [paraphrased] (1953) *) written in a turing-complete language reduction

17 solve always-pos  solve halts
Assume ’x-is-always-pos(P)’ is decidable Given P (here’s how we could solve ’halts(P)’): Construct (clever) reduction program R: Run ”supposedly decidable” analysis: res = Deduce from result: if (res) then P loops!; else P halts :-) THUS: ’x-is-always-pos(P)’ must be undecidable! -- R.java -- int x = 1; P /* insert program P here :-) */ x = -1; x-is-always-positive(R)

18 Reduction Principle Reduction principle (in short): Example: Exercise:
Carry out reduction + whole explanation for: ”which are the possible outputs of program P?” (P) undecidable  [solve (P)  solve (P)] (P) undecidable reduction ’halts(P)’ undecidable  [solve ’x-is-always-pos(P)’  solve ’halts(P)’] ’x-is-always-pos(P)’ undecidable Assume ’which-var-p-points-to(P)’ is decidable: Given P (here’s how to (cleverly) decide halts(P)): Construct (clever) reduction program R: ptr p = 0xffff; P /* insert program P */ p = null; Run ’which-var-p-points-to(R)’ = res If (null \in res) P halts! else; P loops! :-) THUS: ’which-var-p-points-to(P)’ must be undecidable!

19 Answer Assume ’which-var-q-points-to(P)’ is decidable:
Given P (here’s how to (cleverly) decide halts(P)): Construct (clever) reduction program R: Run ’which-var-p-points-to(R)’ = res If (null  res) P halts! else; P loops! :-) THUS: ’which-var-q-points-to(P)’ must be undecidable! -- R.java -- ptr q = 0xffff; P /* insert program P */ q = null;

20 Undecidability Undecidability means that…: However(!)… ?
…no-one can decide this line (for all programs)! However(!)… ? loops halts

21 “Side-Stepping Undecidability”
Compilers use safe approximations (computed via ”static analyses”) such that: However, just because it’s undecidable, doesn’t mean there aren’t (good) approximations! Indeed, the whole area of static analysis works on “side-stepping undecidability”: error error ok ok Okay! Dunno? Dunno? Error!

22 “Side-Stepping Undecidability”
Unsafe approximation: For testing it may be okay to ”abandon” safety and use unsafe approximations: However, just because it’s undecidable, doesn’t mean there aren’t (good) approximations! Indeed, the whole area of static analysis works on “side-stepping undecidability”: loops halts unsafe approximation error ok Here are some programs for you to (manually) consider !

23 “Slack” Undecidability means: “there’ll always be a slack”: However, still useful: (possible interpretations of “Dunno?”): Treat as error (i.e., reject program): “Sorry, program not accepted!” Treat as warning (i.e., warn programmer): “Here are some potential problems: …” loops . halts . . Okay! Dunno?

24 Soundness & Completeness
Analysis reports no errors  Really are no errors Completeness: Analysis reports an error  Really is an error error error ok ok Sound analysis Complete analysis …or alternative (equivalent) formulation, via ”contra-position”: P  Q Q  P Really are error(s)  Analysis reports error(s) Really no error(s)  Analysis reports no error(s)

25 Example: Type Checking
Will this program have type error (when run)? Undecidable (because of reduction): Type error  <EXP> evaluates to true void f() { var b; if (<EXP>) { b = 42; } else { b = true; } /* some code */ if (b) ...; // error is b is '42' Undecidable (in all cases) i.e., what <EXP> evaluates to (when run)

26 Example: Type Checking
Hence, languages use static requirements: All variables must be declared And have only one type (throughout the program) This is (very) easy to check (i.e., "type-checking") void f() { bool b; // instead of ”var b;” /* some code */ if (<EXP>) { b = 42; } else { b = true; } /* some more code */ } Static compiler error: Regardless of what <EXP> evaluates to when run

27 Agenda (4/4, ’11) Introduction: Data-flow Analysis:
Undecidability, Reduction, and Approximation Data-flow Analysis: Quick tour & running example Control-Flow Graphs: Control-flow, data-flow, and confluence ”Science-Fiction Math”: Lattice theory, monotonicity, and fixed-points Putting it all together…: Example revisited

28 5’ Crash Course on Data-Flow Analysis
( AMP’11: Advanced Models & Programs, 2011 ) Claus Brabrand IT University of Copenhagen ( )

29 “Simulate runtime execution at compile-time using abstract values”
Data-Flow Analysis IDEA: We (only) need 3 things: A control-flow graph A lattice Transfer functions Example: “(integer) constant propagation” “Simulate runtime execution at compile-time using abstract values”

30  Control-flow graph We (only) need 3 things: A control-flow graph
A lattice Transfer functions Control-flow graph Given program: int x = 1; int x = 1; int y = 3; if (...) { x = x+2; } else { x <-> y; } print(x,y); int y = 3; ... true false x = x+2; x <-> y; print(x,y);

31 We (only) need 3 things: A control-flow graph A lattice Transfer functions A Lattice Lattice L of abstract values of interest and their relationships (i.e. ordering “”): Induces least-upper-bound operator: for combining information “top” ~ “could be anything” ·· ·· “bottom” ~ “we haven’t analyzed yet”

32 Data-Flow Analysis We (only) need 3 things: A control-flow graph
A lattice Transfer functions Data-Flow Analysis x y [ , ]  ENVL E . E[x  1] int x = 1; [ , ] 1 [ , ] 1 E . E[y  3] int y = 3; [ , ] 1 3 [ , ] 1 3 ... [ , ] 1 3 [ , ] [ , ] 1 3 1 3 E . E[x  E(y), y  E(x)] E . E[x  E(x)  2] x = x+2; x <-> y; [ , ] 3 3 [ , ] [ , ] 3 1 3 print(x,y);

33 Agenda (4/4, ’11) Introduction: Data-flow Analysis:
Undecidability, Reduction, and Approximation Data-flow Analysis: Quick tour & running example Control-Flow Graphs: Control-flow, data-flow, and confluence ”Science-Fiction Math”: Lattice theory, monotonicity, and fixed-points Putting it all together…: Example revisited

34 Control Structures Control Structures: if-else: if:
Statements (or Expr’s) that affect ”flow of control”: if-else: if: Stm1 Exp true false Stm2 confluence [syntax] if ( Exp ) Stm1 else Stm2 [semantics] The expression must be of type boolean; if it evaluates to true, Statement-1 is executed, otherwise Statement-2 is executed. Stm Exp true false confluence [syntax] if ( Exp ) Stm [semantics] The expression must be of type boolean; if it evaluates to true, the given statement is executed, otherwise not.

35 Control Structures (cont’d)
Stm Exp true false confluence while: for: [syntax] while ( Exp ) Stm [semantics] The expression must be of type boolean; if it evaluates to false, the given statement is skipped, otherwise it is executed and afterwards the expression is evaluated again. If it is still true, the statement is executed again. This is continued until the expression evaluates to false. Exp1; Exp2 true false Stm confluence Exp3; [syntax] for (Exp1 ; Exp2 ; Exp3) Stm [semantics] Equivalent to: { Exp1; while ( Exp2 ) { Stm Exp3; } }

36  Control-flow graph Given program: int x = 1; int x = 1; int y = 3;
if (a>b) { x = x+2; } else { x <-> y; } print(x,y); int y = 3; a>b true false x = x+2; x <-> y; print(x,y);

37 Exercise: Draw a Control-Flow Graph for:
public static void main ( String[] args ) { int mi, ma; if (args.length == 0) System.out.println("No numbers"); else { mi = ma = Integer.parseInt(args[0]); for (int i=1; i < args.length; i++) { int obs = Integer.parseInt(args[i]); if (obs > ma) ma = obs; else if (mi < obs) mi = obs; } System.out.println(”min=" + mi + "," "max=" + ma); }} if else for if else if

38 System.out.println ("No numbers");
Control-Flow Graph int mi, ma; CFG: args.length == 0 true false System.out.println ("No numbers"); mi = ma = Integer.parseInt(args[0]); int i=1; i < args.length true false int obs = Integer.parseInt(args[i]); obs > ma true false ma = obs; mi < obs true false mi = obs; i++; System.out.println(”min=" + mi + "," + "max=" + ma);

39 Control Structures (cont’d2)
do-while: ”?:”; ”conditional expression”: ”||”; ”lazy disjunction” (aka., ”short-cut ”): ”&&”; ”lazy conjunction” (aka., ”short-cut ”): switch: Exercise: do Stm while ( Exp ); Exp1 ? Exp2 : Exp3 Exp1 || Exp2 Exp1 && Exp2 Swb: case Exp : Stm* break; switch ( Exp ) { Swb* } default : Stm* break;

40 Control Structures (cont’d3)
try-catch-finally (exceptions): return / break / continue: ”method invocation”: e.g.; ”recursive method invocation”: ”virtual dispatching”: try Stm1 catch ( Exp ) Stm2 finally Stm3 return ; return Exp ; break ; continue ; f(x) f(x) f(x)

41 Control Structures (cont’d4)
”function pointers”: e.g.; ”higher-order functions”: ”dynamic evaluation”: Some constructions (and thus languages) require a separate control-flow analysis for determining control-flow in order to do data-flow analysis (*f)(x) f.x.(f x) eval(some-string-which-has-been-dynamically-computed)

42 Agenda (4/4, ’11) Introduction: Data-flow Analysis:
Undecidability, Reduction, and Approximation Data-flow Analysis: Quick tour & running example Control-Flow Graphs: Control-flow, data-flow, and confluence ”Science-Fiction Math”: Lattice theory, monotonicity, and fixed-points Putting it all together…: Example revisited


Download ppt "Static Analysis: Data-Flow Analysis I"

Similar presentations


Ads by Google