Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 8: Static Analysis II Roman Manevich Ben-Gurion University.

1 Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 8: Static Analysis II Roman Manevich Ben-Gurion University

2 Tentative syllabus
– Semantics: Natural Semantics, Structural Semantics, Axiomatic Verification
– Static Analysis: Automating Hoare Logic, Control Flow Graphs, Equation Systems, Collecting Semantics
– Abstract Interpretation fundamentals: Lattices, Fixed-Points, Chaotic Iteration, Galois Connections, Domain constructors, Widening/Narrowing
– Analysis Techniques: Numerical Domains, Alias analysis, Interprocedural Analysis, Shape Analysis, CEGAR
– Crafting your own: Soot, From proofs to abstractions, Systematically developing transformers

3 Previously Static Analysis by example – Simple Available Expressions analysis – Abstract transformer for assignments – Three-address code – Processing serial composition – Processing conditions – Processing loops 3

4 Agenda Another static analysis example: Constant Propagation Basic concepts in static analysis – Control flow graphs – Equation systems – Collecting semantics – (Trace semantics) 4

5 Constant propagation 5

6 Second static analysis example
Optimization: constant folding
– Example: x:=7; y:=x*9 is transformed to x:=7; y:=7*9 and then to x:=7; y:=63
Analysis: constant propagation (CP)
– Infers facts of the form x = c
– Given { x = c } y := aexpr, rewrite the assignment to y := eval(aexpr[c/x])
– Constant folding simplifies constant expressions
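The two steps on this slide (substituting known constants, then folding) can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the tuple encoding of expressions and the names `OPS`, `substitute` and `fold` are assumptions made for the example.

```python
import operator

# Hypothetical expression encoding (an assumption for this sketch):
# an int constant, a variable name (str), or a tuple (op, left, right).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def substitute(expr, facts):
    """Compute aexpr[c/x]: replace each variable with a known constant."""
    if isinstance(expr, str):
        return facts.get(expr, expr)
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, substitute(left, facts), substitute(right, facts))
    return expr

def fold(expr):
    """Evaluate every sub-expression whose operands are both constants."""
    if isinstance(expr, tuple):
        op, left, right = expr
        left, right = fold(left), fold(right)
        if isinstance(left, int) and isinstance(right, int):
            return OPS[op](left, right)
        return (op, left, right)
    return expr

facts = {"x": 7}                                 # the fact x = 7 holds here
propagated = substitute(("*", "x", 9), facts)    # y := 7*9
folded = fold(propagated)                        # y := 63
```

Expressions that mention a variable without a known constant are left unchanged, so the rewrite never guesses a value.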

7 Plan Define domain – set of allowed assertions Handle assignments Handle composition Handle conditions Handle loops 7

8 Constant propagation domain 8

9 CP semantic domain 9 ?

10 Define CP-factoids: Φ = { x = c | x ∈ Var, c ∈ Z }
– How many factoids are there?
Define predicates as 𝒟 = 2^Φ
– How many predicates are there?
– Do all predicates make sense? (x=5) ∧ (x=7)
Treat conjunctive formulas as sets of factoids: {x=5, y=7} ~ (x=5) ∧ (y=7)

11 Handling assignments 11

12 CP abstract transformer
Goal: define a function F_CP[x:=aexpr] : 𝒟 → 𝒟 over the CP domain 𝒟 such that if F_CP[x:=aexpr] P = P' then sp(x:=aexpr, P) ⇒ P'

13 CP abstract transformer
Goal: define a function F_CP[x:=aexpr] : 𝒟 → 𝒟 over the CP domain 𝒟 such that if F_CP[x:=aexpr] P = P' then sp(x:=aexpr, P) ⇒ P'
– [kill] { x=c } x:=aexpr { }
– [gen-1] { } x:=c { x=c }
– [gen-2] { y=c1, z=c2 } x:=y op z { x=c } where c = c1 op c2
– [preserve] { y=c } x:=aexpr { y=c } for y different from x

14 Gen-kill formulation of transformers
Suited for analyses that propagate sets of factoids
– Available expressions
– Constant propagation, etc.
For each statement, define a set of killed factoids and a set of generated factoids
– F[S] P = (P \ kill(S)) ∪ gen(S)
– F_CP[x:=aexpr] P = (P \ {x=c}) when aexpr is not a constant
– F_CP[x:=k] P = (P \ {x=c}) ∪ {x=k}
Used in dataflow analysis, a special case of abstract interpretation
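A gen-kill transformer for constant propagation can be sketched as follows. This is an illustration under stated assumptions, not the course's code: a predicate is a Python set of (variable, constant) pairs, the right-hand side is either an int or a tuple (op, y, z) over variable names, and `OPS` and `transfer_assign` are hypothetical names.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def transfer_assign(P, x, rhs):
    """Apply (P minus kill) union gen for the assignment x := rhs.

    P is a set of (variable, constant) factoids. rhs is an int k,
    or a tuple (op, y, z) where y and z are variable names.
    """
    # kill: every factoid about x is invalidated by the assignment
    out = {(v, c) for (v, c) in P if v != x}
    # Lookup table for the pre-state; assumes at most one factoid
    # per variable (a consistent predicate).
    known = dict(P)
    if isinstance(rhs, int):
        out.add((x, rhs))                      # gen-1: x := k
    elif isinstance(rhs, tuple):
        op, y, z = rhs
        if y in known and z in known:          # gen-2: x := y op z
            out.add((x, OPS[op](known[y], known[z])))
    return out

after_const = transfer_assign({("x", 1), ("y", 2)}, "x", 5)
after_binop = transfer_assign({("y", 2), ("z", 3)}, "x", ("*", "y", "z"))
after_unknown = transfer_assign({("x", 1), ("y", 2)}, "x", ("+", "y", "w"))
```

The operands are looked up in the pre-state `P` rather than in `out`, so an assignment such as x := x + y still sees the old constant for x.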

15 Handling composition 15

16 Does this still work?
Annotate(P, S1; S2) =
  let Annotate(P, S1) be {P} A1 {Q1}
  let Annotate(Q1, S2) be {Q1} A2 {Q2}
  return {P} A1; {Q1} A2 {Q2}

17 Handling conditions 17

18 Handling conditional expressions
– We want to soundly approximate D ∧ bexpr and D ∧ ¬bexpr in the CP domain 𝒟
– Define θ(bexpr) = if bexpr is a CP-factoid then {bexpr} else {}
– Define F[assume bexpr](D) = D ∪ θ(bexpr)
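The assume transformer on this slide can be sketched like this. The encoding is an assumption for the example: a condition is a tuple such as ("==", "x", 5), a predicate is a set of (variable, constant) pairs, and `factoid_of` and `transfer_assume` are hypothetical names for the slide's filter function and transformer.

```python
def factoid_of(bexpr):
    """Return {bexpr} as a factoid set if bexpr has the form x == c, else {}."""
    is_factoid = (
        isinstance(bexpr, tuple)
        and len(bexpr) == 3
        and bexpr[0] == "=="
        and isinstance(bexpr[1], str)
        and isinstance(bexpr[2], int)
    )
    return {(bexpr[1], bexpr[2])} if is_factoid else set()

def transfer_assume(D, bexpr):
    """Add the condition to D when it is a CP-factoid; otherwise keep D."""
    return set(D) | factoid_of(bexpr)

kept = transfer_assume({("y", 7)}, ("<", "x", 5))       # not a factoid
added = transfer_assume({("y", 7)}, ("==", "x", 5))     # x = 5 is recorded
clash = transfer_assume({("x", 5)}, ("==", "x", 7))     # contradictory set
```

Dropping a condition that is not a factoid loses precision but stays sound, since the result is still implied by D together with the condition. The last call shows why a set of pairs is used instead of a dictionary: the contradictory predicate {x=5, x=7} remains representable.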

19 Does this still work?
  let Pt = F[assume bexpr] P
  let Pf = F[assume ¬bexpr] P
  let Annotate(Pt, S1) be {Pt} A1 {Q1}
  let Annotate(Pf, S2) be {Pf} A2 {Q2}
  return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 ⊔ Q2}
How do we define join for CP?

20 Join example
{x=5, y=7} ⊔ {x=3, y=7, z=9} = ?
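Because a predicate is read as the conjunction of its factoids, a sound join may keep only the factoids that hold on both branches, which is set intersection. A minimal sketch, assuming predicates are Python sets of (variable, constant) pairs and using the hypothetical name `join`:

```python
def join(P, Q):
    """Keep only the factoids that appear in both predicates."""
    return set(P) & set(Q)

left = {("x", 5), ("y", 7)}
right = {("x", 3), ("y", 7), ("z", 9)}
result = join(left, right)   # {("y", 7)}
```

For the example on the slide, only y = 7 survives: x has different constants on the two sides and z is known on one side only.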

21 Handling loops 21

22 Does this still work? What about correctness? What about termination?
Annotate(P, while bexpr do S) =
  N := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N}
    Nc := Nc ⊔ N
  until N = Nc
  return {P} INV = {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

23 Does this still work?
What about correctness?
– If the loop terminates, is N a loop invariant?
What about termination?
Annotate(P, while bexpr do S) =
  N := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N}
    Nc := Nc ⊔ N
  until N = Nc
  return {P} INV = {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

24 A termination principle
g : X → X is a function
How can we determine whether the sequence x0, x1 = g(x0), …, xk+1 = g(xk), … stabilizes?
Technique:
1. Find a ranking function rank : X → N (that is, show that rank(x) ≥ 0 for all x)
2. Show that if x ≠ g(x) then rank(g(x)) < rank(x)
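The principle can be checked at run time on a small instance. The sketch below is an illustration under assumptions, not part of the lecture: `iterate_until_stable` is a hypothetical helper, predicates are frozensets of (variable, constant) pairs, the rank is the number of factoids, and g joins its argument with a fixed predicate by intersection.

```python
def iterate_until_stable(g, x0, rank):
    """Iterate x, g(x), g(g(x)), ... until g(x) == x.

    Asserts that the rank strictly decreases on every change, which
    bounds the number of iterations by rank(x0).
    """
    x, steps = x0, 0
    while True:
        nxt = g(x)
        if nxt == x:
            return x, steps
        assert 0 <= rank(nxt) < rank(x), "rank must strictly decrease"
        x, steps = nxt, steps + 1

N = frozenset({("y", 7), ("z", 9)})
start = frozenset({("x", 5), ("y", 7)})
stable, steps = iterate_until_stable(lambda P: P & N, start, len)
```

Here the sequence stabilizes after one step: the first application drops x = 5 and the second changes nothing.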

25 Rank function for available expressions rank(P) = ? 25

26 Rank function for available expressions
rank(P) = |P| (number of factoids)
Prove that either Nc = Nc ⊔ N or rank(Nc ⊔ N) <? rank(Nc)
Annotate(P, while bexpr do S) =
  N := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N}
    Nc := Nc ⊔ N
  until N = Nc
  return {P} INV = {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

27 Rank function for constant propagation
rank(P) = ?
Prove that either Nc = Nc ⊔ N or rank(Nc) >? rank(Nc ⊔ N)
Annotate(P, while bexpr do S) =
  N := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N}
    Nc := Nc ⊔ N
  until N = Nc
  return {P} INV = {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

28 Rank function for constant propagation
rank(P) = |P| (number of factoids)
Prove that either Nc = Nc ⊔ N' or rank(Nc) >? rank(Nc ⊔ N')
Annotate(P, while bexpr do S) =
  N' := Nc := P // Initialize
  repeat
    let Pt = F[assume bexpr] Nc
    let Annotate(Pt, S) be {Nc} Abody {N'}
    Nc := Nc ⊔ N'
  until N' = Nc
  return {P} INV = {N'} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}

29 Generalizing from Available Expressions and Constant Propagation to Abstract Interpretation
(Image: by NMZ (Photoshop) [CC0], via Wikimedia Commons)

30 Towards a recipe for static analysis Two static analyses – Available Expressions (extended with equalities) – Constant Propagation Semantic domain – a family of formulas – Join operator approximates pairs of formulas Abstract transformers for basic statements – Assignments – assume statements Initial precondition 30

31 Control flow graphs 31

32 A technical issue
– Unrolling loops is quite inconvenient and inefficient (but we can avoid it, as we just saw)
– How do we handle more complex control-flow constructs, e.g., goto, break, exceptions…?
  – The problem: non-inductive control-flow constructs
  – Solution: model control flow by labels and goto statements
– We would like a dedicated data structure that explicitly encodes control flow in support of the analysis
  – Solution: control-flow graphs (CFGs)

33 Modeling control flow with labels 33 while (x  z) do x := x + 1 y := x + a d := x + a a := b label0: if x  z goto label1 x := x + 1 y := x + a d := x + a goto label0 label1: a := b

34 Control-flow graph example 34 1 label0: if x  z goto label1 x := x + 1 y := x + a d := x + a goto label0 label1: a := b 2 3 4 5 7 8 6 label0: if x  z x := x + 1 y := x + a d := x + a goto label0 label1: a := b 1 2 3 4 5 6 7 8 line number

35 Control-flow graph example 35 1 label0: if x  z goto label1 x := x + 1 y := x + a d := x + a goto label0 label1: a := b 2 3 4 5 7 8 6 label0: if x  z x := x + 1 y := x + a d := x + a goto label0 label1: a := b 1 2 3 4 5 6 8 entry exit 7

36 Control-flow graph
– Nodes are statements or labels
– Special nodes for entry/exit
– An edge from node v to node w means that after executing the statement of v, control passes to w
  – Conditions are represented by split and join nodes
  – Loops create cycles
– Can be generated from the abstract syntax tree in linear time
  – Automatically taken care of by the front-end
– Usage: store analysis results (assertions) in CFG nodes

37 Control-flow graph example 37 1 label0: if x  z goto label1 x := x + 1 y := x + a d := x + a goto label0 label1: a := b 2 3 4 5 7 8 6 label0: if x  z x := x + 1 y := x + a d := x + a goto label0 label1: a := b 1 2 3 4 5 6 7 8 entry exit

38 Eliminating labels We can use edges to point to the nodes following labels and remove all label nodes (other than entry/exit) 38
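Label elimination can be sketched as building the edge map directly from labelled code, so that label and goto entries never become nodes. This is a minimal illustration under assumptions: the instruction encoding, the name `build_cfg`, and the use of list indices as node ids are all choices made for the example. The program is the small countdown loop that appears later in the deck.

```python
def build_cfg(instrs):
    """Build successor lists, skipping label and goto entries.

    instrs is a list of tuples: ("label", name), ("goto", target),
    ("if", condition_text, target) or ("stmt", text).
    Node ids are list indices, plus the special nodes "entry" and "exit".
    """
    n = len(instrs)
    label_at = {ins[1]: i for i, ins in enumerate(instrs) if ins[0] == "label"}

    def resolve(i):
        """Follow labels and gotos from index i to the next real node."""
        seen = set()
        while i < n and instrs[i][0] in ("label", "goto"):
            if i in seen:
                raise ValueError("cycle made only of labels and gotos")
            seen.add(i)
            i = i + 1 if instrs[i][0] == "label" else label_at[instrs[i][1]]
        return i if i < n else "exit"

    edges = {"entry": [resolve(0)], "exit": []}
    for i, ins in enumerate(instrs):
        if ins[0] == "stmt":
            edges[i] = [resolve(i + 1)]
        elif ins[0] == "if":
            # first successor: branch taken; second: fall-through
            edges[i] = [resolve(label_at[ins[2]]), resolve(i + 1)]
    return edges

program = [
    ("label", "label0"),
    ("if", "x <= 0", "label1"),
    ("stmt", "x := x - 1"),
    ("goto", "label0"),
    ("label", "label1"),
]
cfg = build_cfg(program)
```

The resulting graph has only the condition and the decrement as interior nodes, with the back edge from the decrement to the condition forming the loop.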

39 Control-flow graph example 39 1 label0: if x  z goto label1 x := x + 1 y := x + a d := x + a goto label0 label1: a := b 2 3 4 5 7 8 6 label0: if x  z x := x + 1 y := x + a d := x + a goto label0 label1: a := b 1 2 3 4 5 6 7 8 entry exit

40 Control-flow graph example 40 1 label0: if x  z goto label1 x := x + 1 y := x + a d := x + a goto label0 label1: a := b 2 3 4 5 7 8 6 if x  z x := x + 1 y := x + a d := x + a a := b 2 3 4 5 8 entry exit

41 Basic blocks A basic block is a chain of nodes with a single entry point and a single exit point Entry/exit nodes are separate blocks 41 if x  z x := x + 1 y := x + a d := x + a a := b 2 3 4 5 8 entry exit

42 Blocked CFG
– Stores each basic block in a single node
– Extended blocks: maximal connected loop-free subgraphs
(Figure: the example CFG with its statements grouped into basic-block nodes, plus entry and exit)

43 43 Collecting semantics

44 Why need another semantics? Operational semantics explains how to compute output from a given input – Useful for implementing an interpreter/compiler – Less useful for reasoning about safety properties – Not suitable for analysis purposes – does not explicitly show how assertions in different program points influence each other Need a more explicit semantics – Over a control flow graph 44

45 Control-flow graph example 1 2 3 4 5 if x > 0 x := x - 1 goto label0: label1: 2 3 45 entry exit label0: 1 45 label0: if x <= 0 goto label1 x := x – 1 goto label0 label1:

46 Trimmed CFG 1 2 3 4 5 if x > 0 x := x - 1 2 3 entry exit 46 label0: if x <= 0 goto label1 x := x – 1 goto label0 label1:

47 Collecting semantics example: input 1
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
CFG nodes: entry, if x > 0, x := x - 1, exit
Inputs considered: [x↦1] [x↦2] [x↦3] …
States shown on the CFG for input [x↦1]: [x↦1] [x↦1] [x↦0] [x↦0]

48 Collecting semantics example: input 2
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
CFG nodes: entry, if x > 0, x := x - 1, exit
Inputs considered: [x↦1] [x↦2] [x↦3] …
States shown on the CFG after adding input [x↦2]: [x↦1] [x↦1] [x↦0] [x↦2] [x↦2] [x↦0]

49 Collecting semantics example: input 3
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
CFG nodes: entry, if x > 0, x := x - 1, exit
Inputs considered: [x↦1] [x↦2] [x↦3] …
States shown on the CFG after adding input [x↦3]: [x↦1] [x↦1] [x↦0] [x↦2] [x↦2] [x↦3] [x↦3] [x↦0]

50 ad infinitum – fixed point
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
CFG nodes: entry, if x > 0, x := x - 1, exit
States shown on the CFG: [x↦1] [x↦2] [x↦3] … at several nodes, together with [x↦0] and the non-positive states [x↦-1] [x↦-2] …

51 Predicates at fixed point 1 2 3 4 5 if x > 0 x := x - 1 2 3 entry exit 51 label0: if x <= 0 goto label1 x := x – 1 goto label0 label1: { true } {?}{?} {?}{?}{?}{?}

52 Predicates at fixed point
Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1:
CFG nodes: entry, if x > 0, x := x - 1, exit
– At entry: { true }
– After the test succeeds (x > 0): { x>0 }
– After x := x - 1: { x≥0 }
– At exit: { x≤0 }

53 Collecting semantics Accumulates for each control-flow node the (possibly infinite) sets of states that can reach there by executing the program from some given set of input states Not computable in general A reference point for static analysis (An abstraction of the trace semantics) We will define it formally 53

54 Collecting semantics in equational form 54

55 Math reference: function lifting
– Let f : X → Y be a function
– The lifted function f' : 2^X → 2^Y is defined as f'(XS) = { f(x) | x ∈ XS }
– We will sometimes use the same symbol for both functions when it is clear from the context which one is used
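Lifting is a one-line set comprehension. A small sketch, where `lift` and `decrement_all` are names chosen for the example:

```python
def lift(f):
    """Return the lifted function that applies f to every element of a set."""
    def lifted(xs):
        return {f(x) for x in xs}
    return lifted

# The semantics of x := x - 1 on a single value, lifted to sets of values.
decrement_all = lift(lambda x: x - 1)
image = decrement_all({1, 2, 3})   # {0, 1, 2}
```

The lifted function maps the empty set to the empty set, and the image can be smaller than the input when f is not injective.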

56 Equational definition example
A vector of variables R[0, 1, 2, 3, 4]
– R[0] = { x ∈ Z } // established input
– R[1] = R[0] ∪ R[4]
– R[2] = ⟦assume x>0⟧ R[1]
– R[3] = ⟦assume ¬(x>0)⟧ R[1]
– R[4] = ⟦x:=x-1⟧ R[2]
A (recursive) system of equations
⟦x:=x-1⟧ is the semantic function for x:=x-1 lifted to sets of states
(Figure: CFG with nodes entry, if x > 0, x := x-1, exit, annotated with R[0], R[1], R[2], R[4], R[3])
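For a finite set of inputs, this system can be solved by starting from empty sets and re-evaluating all equations until nothing changes. The sketch below makes that assumption explicitly: the slide takes R[0] to be all of Z, which is not computable this way, so `solve` (a hypothetical name) takes a finite set of initial values of x, and a state is represented by the value of x alone.

```python
def solve(inputs):
    """Iterate the five equations from empty sets up to a fixed point."""
    R = [set() for _ in range(5)]
    while True:
        new = [
            set(inputs),                         # R[0]: established input
            R[0] | R[4],                         # R[1] = R[0] U R[4]
            {x for x in R[1] if x > 0},          # R[2]: assume x > 0
            {x for x in R[1] if not x > 0},      # R[3]: assume not (x > 0)
            {x - 1 for x in R[2]},               # R[4]: x := x - 1, lifted
        ]
        if new == R:
            return R
        R = new

solution = solve({1, 2, 3})
```

Each round only adds states, and for a finite input set only finitely many values are reachable in this program, so the loop stops. For inputs 1, 2 and 3 the exit set R[3] contains only 0, which matches the collecting-semantics examples earlier in the deck.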

57 General definition
A vector of variables R[0, …, k], one per input/output of a node
– R[0] is for entry
For a node n with multiple predecessors, add the equation R[n] = ∪ { R[k] | k is a predecessor of n }
For an atomic operation node S with input R[m] and output R[n], add the equation R[n] = ⟦S⟧ R[m]
Transform if b then S1 else S2 to (assume b; S1) or (assume ¬b; S2)
(Figure: CFG with nodes entry, if x > 0, x := x-1, exit, annotated with R[0], R[1], R[2], R[4], R[3])

58 Next lecture: abstract interpretation fundamentals

