Automatically Checking the Correctness of Program Analyses and Transformations.


1 Automatically Checking the Correctness of Program Analyses and Transformations

2 Compilers have many bugs Searched for “incorrect” and “wrong” in the gcc-bugs mailing list. Some of the results: [Bug middle-end/19650] New: miscompilation of correct code [Bug c++/19731] arguments incorrectly named in static member specialization [Bug rtl-optimization/13300] Variable incorrectly identified as a biv [Bug rtl-optimization/16052] strength reduction produces wrong code [Bug tree-optimization/19633] local address incorrectly thought to escape [Bug target/19683] New: MIPS wrong-code for 64-bit multiply [Bug c++/19605] Wrong member offset in inherited classes [Bug java/19295] [4.0 regression] Incorrect bytecode produced for bitwise AND … Total of 545 matches… And this is only for January 2005! On a mature compiler!

3 Compiler bugs cause problems [Diagram: source program if (…) { x := …; } else { y := …; } …; → Compiler → Exec] –They lead to buggy executables –They rule out having strong guarantees about executables

4 The focus: compiler optimizations A key part of any optimizing compiler [Diagram: Original program → Optimization → Optimized program]

5 The focus: compiler optimizations A key part of any optimizing compiler Hard to get optimizations right –Lots of infrastructure-dependent details –There are many corner cases in each optimization –There are many optimizations and they interact in unexpected ways –It is hard to test all these corner cases and all these interactions

6 Goals Make it easier to write compiler optimizations –student in an undergrad compiler course should be able to write optimizations Provide strong guarantees about the correctness of optimizations –automatically (no user intervention at all) –statically (before the opts are even run once) Expressive enough for realistic optimizations

7 The Rhodium work A domain-specific language for writing optimizations: Rhodium A correctness checker for Rhodium optimizations An execution engine for Rhodium optimizations Implemented and checked the correctness of a variety of realistic optimizations

8 Broader implications Many other kinds of program manipulators: code refactoring tools, static checkers –My work is about program analyses and transformations, the core of any program manipulator Enables safe extensible program manipulators –Allow end programmers to easily and safely extend program manipulators –Improve programmer productivity

9 Outline Introduction Overview of the Rhodium system Writing Rhodium optimizations Checking Rhodium optimizations Evaluation

10 Rhodium system overview [Diagram: Rhodium optimizations (Rdm Opt), written by the programmer, are fed to the Checker and to the Rhodium execution engine, both written by me]

12 Rhodium system overview Rdm Opt Rdm Opt Rdm Opt Checker

13 Rhodium system overview Exec Compiler Rhodium Execution engine Rdm Opt Rdm Opt Rdm Opt if (…) { x := …; } else { y := …; } …; Checker

14 The technical problem Tension between: –Expressiveness –Automated correctness checking Challenge: develop techniques –that will go a long way in terms of expressiveness –that allow correctness to be checked

15 Contribution: three techniques [Diagram: Rdm Opt → Checker → Verification Task → Automatic Theorem Prover] Verification Task: show that for any original program: behavior of original program = behavior of optimized program

16 Contribution: three techniques Automatic Theorem Prover Rdm Opt Verification Task

18 Contribution: three techniques 1.Rhodium is declarative –declare intent using rules –execution engine takes care of the rest Automatic Theorem Prover Rdm Opt

19 Contribution: three techniques Automatic Theorem Prover Rdm Opt 1.Rhodium is declarative –declare intent using rules –execution engine takes care of the rest

20 Contribution: three techniques 1.Rhodium is declarative 2.Factor out heuristics –legal transformations –vs. profitable transformations Automatic Theorem Prover Rdm Opt Heuristics not affecting correctness Part that must be reasoned about

21 Contribution: three techniques Automatic Theorem Prover 1.Rhodium is declarative 2.Factor out heuristics –legal transformations –vs. profitable transformations Heuristics not affecting correctness Part that must be reasoned about

22 Contribution: three techniques 1.Rhodium is declarative 2.Factor out heuristics 3.Split verification task –opt-dependent –vs. opt-independent Automatic Theorem Prover opt- dependent opt- independent

23 Contribution: three techniques 1.Rhodium is declarative 2.Factor out heuristics 3.Split verification task –opt-dependent –vs. opt-independent Automatic Theorem Prover

24 Contribution: three techniques Automatic Theorem Prover 1.Rhodium is declarative 2.Factor out heuristics 3.Split verification task –opt-dependent –vs. opt-independent

25 Contribution: three techniques Automatic Theorem Prover 1.Rhodium is declarative 2.Factor out heuristics 3.Split verification task Result: Expressive language Automated correctness checking

26 Outline Introduction Overview of the Rhodium system Writing Rhodium optimizations Checking Rhodium optimizations Evaluation

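27 MustPointTo analysis Program: a = &b; c = a; d = *c [Diagram: after a = &b, a points to b; after c = a, both a and c point to b] Result: d = *c can be rewritten to d = b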

28 MustPointTo info in Rhodium Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( c, b ) after c = a

29 MustPointTo info in Rhodium Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( a, b ) and mustPointTo ( c, b ) after c = a

30 MustPointTo info in Rhodium define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] Fact correct on edge if: whenever program execution reaches edge, meaning of fact evaluates to true in the program state Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( a, b ) and mustPointTo ( c, b ) after c = a

31 Propagating facts define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( a, b ) and mustPointTo ( c, b ) after c = a

32 Propagating facts define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] Rule applied at a = &b: if currStmt == [X = &Y] then mustPointTo(X,Y)@out Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( a, b ) and mustPointTo ( c, b ) after c = a

33 Propagating facts define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] if currStmt == [X = &Y] then mustPointTo(X,Y)@out Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( a, b ) and mustPointTo ( c, b ) after c = a

34 Propagating facts define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] if currStmt == [X = &Y] then mustPointTo(X,Y)@out Rule applied at c = a: if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( a, b ) and mustPointTo ( c, b ) after c = a

35 Propagating facts define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] if currStmt == [X = &Y] then mustPointTo(X,Y)@out if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( a, b ) and mustPointTo ( c, b ) after c = a
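The two propagation rules can be mimicked in a few lines of Python. This is only a sketch under stated assumptions, not the Rhodium execution engine: statements are modeled as tuples, `flow` is a hypothetical name, and the rule that keeps a fact alive across a statement that does not assign its variable is assumed here because the slides do not show it.

```python
# Statements are modeled as tuples (an assumption made for this sketch):
#   ("addr", x, y)  for  x = &y
#   ("copy", z, x)  for  z = x
#   ("load", z, x)  for  z = *x

def flow(stmt, facts_in):
    """Return the mustPointTo facts, as (var, target) pairs, on the outgoing edge."""
    kind, lhs, rhs = stmt
    # Assumed preservation rule (not shown on the slides): a fact about X
    # survives any statement that does not assign to X.
    facts_out = {(x, y) for (x, y) in facts_in if x != lhs}
    if kind == "addr":
        # if currStmt == [X = &Y] then mustPointTo(X,Y)@out
        facts_out.add((lhs, rhs))
    elif kind == "copy":
        # if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out
        facts_out |= {(lhs, y) for (x, y) in facts_in if x == rhs}
    return facts_out

program = [("addr", "a", "b"), ("copy", "c", "a"), ("load", "d", "c")]
facts = set()
for stmt in program:
    facts = flow(stmt, facts)
```

After the loop, `facts` holds the two facts shown on the edge into d = *c: mustPointTo(a, b) and mustPointTo(c, b).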

36 Transformations define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] if currStmt == [X = &Y] then mustPointTo(X,Y)@out if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out Rule applied at d = *c: if mustPointTo(X,Y)@in and currStmt == [Z = *X] then transform to [Z = Y] Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( a, b ) and mustPointTo ( c, b ) after c = a Result: d = *c is transformed to d = b

37 Transformations define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] if currStmt == [X = &Y] then mustPointTo(X,Y)@out if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out if mustPointTo(X,Y)@in and currStmt == [Z = *X] then transform to [Z = Y] Program: a = &b; c = a; d = *c Facts on edges: mustPointTo ( a, b ) after a = &b; mustPointTo ( a, b ) and mustPointTo ( c, b ) after c = a Result: d = *c is transformed to d = b
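The transformation rule can be sketched the same way. The tuple encoding of statements and the name `transform` are assumptions made for illustration; the function only reports what the rule says is legal and leaves the decision to apply it to a separate step.

```python
# Statements are modeled as tuples (an assumption made for this sketch):
#   ("copy", z, x)  for  z = x
#   ("load", z, x)  for  z = *x

def transform(stmt, facts_in):
    """Apply: if mustPointTo(X,Y)@in and currStmt == [Z = *X] then transform to [Z = Y]."""
    kind, lhs, rhs = stmt
    if kind == "load":
        # Look for an incoming fact mustPointTo(rhs, Y); sorting keeps the
        # choice deterministic if several facts were supplied.
        targets = sorted(y for (x, y) in facts_in if x == rhs)
        if targets:
            return ("copy", lhs, targets[0])
    return stmt  # no rule matched: the statement is left unchanged

facts_before_load = {("a", "b"), ("c", "b")}
rewritten = transform(("load", "d", "c"), facts_before_load)
```

With the facts from the slide, `rewritten` is `("copy", "d", "b")`, i.e. d = b.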

38 Profitability heuristics Legal transformations Subset of legal transformations (identified by the Rhodium rules) (actually performed) Profitability Heuristics

39 Profitability heuristic example 1 Inlining Many heuristics to determine when to inline a function –compute function sizes, estimate code-size increase, estimate performance benefit –maybe even use AI techniques to make the decision However, these heuristics do not affect the correctness of inlining They are just used to choose which of the correct set of transformations to perform
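The split between legality and profitability can be pictured as a filter over an already-legal set. The sketch below is illustrative only: the function name, the candidate list, and the size threshold are hypothetical and do not come from the Rhodium implementation.

```python
def choose_transformations(legal, is_profitable):
    """Keep only the legal transformations that the heuristic considers worthwhile."""
    return [t for t in legal if is_profitable(t)]

# Hypothetical inlining candidates: (call site, callee size in instructions).
legal_inlinings = [("call_f", 4), ("call_g", 250), ("call_h", 12)]

# Hypothetical heuristic: inline only small callees. Changing the threshold
# changes performance, not correctness, because every candidate is already legal.
chosen = choose_transformations(legal_inlinings, lambda t: t[1] <= 16)
```

Whatever the heuristic returns is a subset of the legal set, which is why only the part that identifies legal transformations has to be reasoned about.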

40 Profitability heuristic example 2 a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... } x := a + b; Partial redundancy elimination (PRE)

41 Profitability heuristic example 2 Code duplication a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... } x := a + b; PRE as code duplication followed by CSE

42 Profitability heuristic example 2 PRE as code duplication followed by CSE Steps so far: code duplication (x := a + b is copied into the else branch), CSE (the final x := a + b becomes x := x) a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... x := a + b; } x := x;

43 Profitability heuristic example 2 PRE as code duplication followed by CSE Steps: code duplication, CSE, self-assignment removal (x := x is deleted) a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... x := a + b; }

44 a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... } x := a + b; Profitability heuristic example 2 Legal placements of x := a + b Profitable placement

45 Semantics of a Rhodium opt –Run propagation rules in a loop until there are no more changes (optimistic iterative analysis) –Then run transformation rules –Then run profitability heuristics For better precision, combine propagation rules and transformation rules using our previous composition framework [POPL 02]
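The first step, running propagation rules until nothing changes, is a standard fixpoint iteration. The sketch below assumes a tiny control-flow graph with a back edge, the tuple encoding of statements used for illustration, intersection as the merge at join points, and an assumed preservation rule for unassigned variables; the names `flow` and `analyze` are hypothetical.

```python
from itertools import product

def flow(stmt, facts_in):
    """Propagation rules for mustPointTo on one statement (kind, lhs, rhs)."""
    kind, lhs, rhs = stmt
    # Assumed preservation rule: facts about variables other than lhs survive.
    facts_out = {(x, y) for (x, y) in facts_in if x != lhs}
    if kind == "addr":      # x = &y
        facts_out.add((lhs, rhs))
    elif kind == "copy":    # z = x
        facts_out |= {(lhs, y) for (x, y) in facts_in if x == rhs}
    return facts_out

def analyze(stmts, preds, entry, universe):
    """Iterate to a fixpoint, starting from the optimistic assumption that
    every fact holds on every outgoing edge."""
    out = {n: set(universe) for n in stmts}
    changed = True
    while changed:
        changed = False
        for n in stmts:
            if n == entry:
                facts_in = set()            # nothing is known at program entry
            else:
                facts_in = set(universe)
                for p in preds[n]:
                    facts_in &= out[p]      # a must-fact has to hold on all incoming edges
            new_out = flow(stmts[n], facts_in)
            if new_out != out[n]:
                out[n] = new_out
                changed = True
    return out

# n0: a = &b   n1: c = a   n2: d = *c, with a back edge n2 -> n1
stmts = {"n0": ("addr", "a", "b"), "n1": ("copy", "c", "a"), "n2": ("load", "d", "c")}
preds = {"n0": [], "n1": ["n0", "n2"], "n2": ["n1"]}
universe = set(product("abcd", repeat=2))
result = analyze(stmts, preds, "n0", universe)
```

Transformation rules and profitability heuristics would then run over `result`; they are left out to keep the sketch short.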

46 More facts define fact mustNotPointTo(X:Var,Y:Var) with meaning [ X != &Y ] define fact hasConstantValue(X:Var,C:Const) with meaning [ X == C ] define fact doesNotPointIntoHeap(X:Var) with meaning [ exists Y:Var. X == &Y ]

47 More rules if currStmt == [X = *A] and mustNotPointToHeap(A)@in and forall B:Var. mayPointTo(A,B)@in implies mustNotPointTo(B,Y) then mustNotPointTo(X,Y)@out if currStmt == [Y = I + BE] and varEqualArray(X,A,J)@in and equalsPlus(J,I,BE)@in and not mayDef(X) and not mayDefArray(A) and unchanged(BE) then varEqualArray(X,A,Y)@out

48 More in Rhodium More powerful pointer analyses –Heap summaries Analyses across procedures –Interprocedural analyses Analyses that don’t care about the order of statements –Flow-insensitive analyses

49 Outline Introduction Overview of the Rhodium system Writing Rhodium optimizations Checking Rhodium optimizations Evaluation

50 Exec Compiler Rhodium Execution engine Rdm Opt Rdm Opt if (…) { x := …; } else { y := …; } …; Checker Rhodium correctness checker Rdm Opt Checker

51 Rhodium correctness checker Checker Rdm Opt

52 Checker Rhodium correctness checker Automatic theorem prover Rdm Opt Checker

53 Rhodium correctness checker Automatic theorem prover define fact … if … then transform … if … then … Checker Profitability heuristics Rhodium optimization

54 Rhodium correctness checker Automatic theorem prover Rhodium optimization define fact … if … then transform … if … then … Checker

55 Rhodium correctness checker Rhodium optimization: define fact … if … then … if … then transform … Checker: VCGen generates Local VCs (opt-dependent), which are sent to the automatic theorem prover Lemma (opt-independent): For any Rhodium opt: If Local VCs are true Then opt is correct Proof: [formulas on the slide not recoverable from the transcript]

56 Local verification conditions define fact mustPointTo(X,Y) with meaning [ X == &Y ] Propagation rule: if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out –Assume: All incoming facts are correct –Show: Propagated fact is correct Transformation rule: if mustPointTo(X,Y)@in and currStmt == [Z = *X] then transform to [Z = Y] –Assume: All incoming facts are correct –Show: Original stmt and transformed stmt have same behavior Local VCs (generated and proven automatically)

57 Local correctness of prop. rules define fact mustPointTo(X,Y) with meaning [ X == &Y ] if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out [Diagram: mustPointTo ( X, Y ) on the edge into Z := X; mustPointTo ( Z, Y ) on the edge out] Local VC (generated and proven automatically) –Assume (all incoming facts are correct): [ X == &Y ] ( η_in ) and η_out = step ( η_in, [Z = X] ) –Show (propagated fact is correct): [ Z == &Y ] ( η_out )

58 Local correctness of prop. rules define fact mustPointTo(X,Y) with meaning [ X == &Y ] if mustPointTo(X,Y)@in and currStmt == [Z = X] then mustPointTo(Z,Y)@out [Diagram: mustPointTo ( X, Y ) on the edge into Z := X; mustPointTo ( Z, Y ) on the edge out; in η_in, X points to Y; in η_out, does Z point to Y?] Local VC (generated and proven automatically) –Assume: [ X == &Y ] ( η_in ) and η_out = step ( η_in, [Z = X] ) –Show: [ Z == &Y ] ( η_out )
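To get a feel for what the local VC says, it can be tested by brute force on a tiny state space. This swaps the automatic theorem prover for plain enumeration, so it is an illustration rather than a proof; the three-variable universe, the value encoding, and all function names are assumptions of the sketch.

```python
from itertools import product

VARS = ["a", "b", "c"]
# Each variable holds either the address of some variable or the integer 0.
VALUES = [("addr", v) for v in VARS] + [("int", 0)]

def step_copy(state, z, x):
    """Semantics of the statement Z = X."""
    new_state = dict(state)
    new_state[z] = state[x]
    return new_state

def must_point_to(state, x, y):
    """Meaning of mustPointTo(X,Y): X == &Y in the given state."""
    return state[x] == ("addr", y)

def local_vc_holds(conclusion):
    """Check: for every state_in and every X, Y, Z, if mustPointTo(X,Y) holds
    in state_in and state_out = step(state_in, [Z = X]), then the conclusion
    holds in state_out."""
    for values in product(VALUES, repeat=len(VARS)):
        state_in = dict(zip(VARS, values))
        for x, y, z in product(VARS, repeat=3):
            if must_point_to(state_in, x, y):
                state_out = step_copy(state_in, z, x)
                if not conclusion(state_out, x, y, z):
                    return False
    return True

# The rule from the slide concludes mustPointTo(Z,Y)@out.
sound = local_vc_holds(lambda s, x, y, z: must_point_to(s, z, y))
# A deliberately wrong variant concludes mustPointTo(X,Z)@out.
unsound = local_vc_holds(lambda s, x, y, z: must_point_to(s, x, z))
```

The rule from the slide passes on every enumerated state, while the deliberately wrong variant is refuted by a state such as a == &b followed by c = a.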

59 Outline Introduction Overview of the Rhodium system Writing Rhodium optimizations Checking Rhodium optimizations Evaluation

60 Dimensions of evaluation Ease of use Correctness guarantees Usefulness of the checker Expressiveness

61 Ease of use Joao Dias –third year graduate student in compilers at Harvard –less than 45 mins to write CSE and copy prop Erika Rice, summer 2004 –only knowledge of compilers: one undergrad class –started writing Rhodium optimizations in a few days Simple interface to the compiler’s structures –pattern matching –“flow functions” familiar to compiler 101 students Ease of use Guarantees Usefulness Expressiveness

62 Correctness guarantees Once checked, optimizations are guaranteed to be correct Caveat: trusted computing base –execution engine –checker implementation –proofs done by hand once by me Adding a new optimization does not increase the size of the trusted computing base

63 Usefulness of the checker Found subtle bugs in my initial implementation of various optimizations define fact equals(X:Var, E:Expr) with meaning [ X == E ] Initial rule: if currStmt == [X = E] then equals(X,E)@out Counterexample: x := x + 1 produces equals ( x, x + 1 ), which is false after the assignment

64 Usefulness of the checker Found subtle bugs in my initial implementation of various optimizations define fact equals(X:Var, E:Expr) with meaning [ X == E ] Initial rule: if currStmt == [X = E] then equals(X,E)@out Counterexample: x := x + 1 produces equals ( x, x + 1 ) First fix: if currStmt == [X = E] and “X does not appear in E” then equals(X,E)@out

65 Usefulness of the checker Found subtle bugs in my initial implementation of various optimizations define fact equals(X:Var, E:Expr) with meaning [ X == E ] First fix: if currStmt == [X = E] and “X does not appear in E” then equals(X,E)@out –rules out x = x + 1 producing equals ( x, x + 1 ) –still wrong for x = *y + 1 producing equals ( x, *y + 1 ), since y may point to x Corrected rule: if currStmt == [X = E] and “E does not use X” then equals(X,E)@out
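Both counterexamples can be replayed concretely. In the sketch below, program states are dictionaries, expressions are Python functions of the state, and a pointer is modeled by storing the name of the variable it points to; all of these encodings and names are assumptions made for illustration.

```python
def assign(state, target, expr):
    """Execute `target = expr` and return the new state."""
    new_state = dict(state)
    new_state[target] = expr(state)   # the right-hand side is evaluated in the old state
    return new_state

def equals_holds(state, target, expr):
    """Meaning of equals(target, expr): target == expr in the given state."""
    return state[target] == expr(state)

def x_plus_1(s):
    return s["x"] + 1

def deref_y_plus_1(s):
    return s[s["y"]] + 1              # *y + 1, where y stores the name it points to

def z_plus_1(s):
    return s["z"] + 1

start = {"x": 0, "y": "x", "z": 5}    # y points to x

# x = x + 1: X appears in E, so equals(x, x + 1) is false afterwards.
case_self = equals_holds(assign(start, "x", x_plus_1), "x", x_plus_1)
# x = *y + 1: X does not appear in E, but E uses X through y.
case_alias = equals_holds(assign(start, "x", deref_y_plus_1), "x", deref_y_plus_1)
# x = z + 1: E does not use X, so the fact is true afterwards.
case_ok = equals_holds(assign(start, "x", z_plus_1), "x", z_plus_1)
```

The second case shows why the purely syntactic side condition is not enough: the expression *y + 1 never mentions x, yet its value changes when x is assigned.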

66 Rhodium expressiveness Traditional optimizations: –const prop and folding, branch folding, dead assignment elim, common sub-expression elim, partial redundancy elim, partial dead assignment elim, arithmetic invariant detection, and integer range analysis. Pointer analyses –must-point-to analysis, Andersen's may-point-to analysis with heap summaries Loop opts –loop-induction-variable strength reduction, code hoisting, code sinking Array opts –constant propagation through array elements, redundant array load elimination

68 Expressiveness limitations May not be able to express your optimization in Rhodium –opts that build complicated data structures –opts that perform complicated many-to-many transformations (e.g., loop fusion, loop unrolling) A correct Rhodium optimization may be rejected by the correctness checker –limitations of the theorem prover –limitations of first-order logic

69 Summary Rhodium system –makes it easier to write optimizations –provides correctness guarantees –is expressive enough for realistic optimizations Rhodium system provides a foundation for safe extensible program manipulators

