Automatically Checking the Correctness of Program Analyses and Transformations.

Slides:



Advertisements
Similar presentations
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Advertisements

CSC 4181 Compiler Construction Code Generation & Optimization.
Synopsys University Courseware Copyright © 2012 Synopsys, Inc. All rights reserved. Compiler Optimization and Code Generation Lecture - 3 Developed By:
P3 / 2004 Register Allocation. Kostis Sagonas 2 Spring 2004 Outline What is register allocation Webs Interference Graphs Graph coloring Spilling Live-Range.
7. Optimization Prof. O. Nierstrasz Lecture notes by Marcus Denker.
Course Outline Traditional Static Program Analysis Software Testing
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
ECE 454 Computer Systems Programming Compiler and Optimization (I) Ding Yuan ECE Dept., University of Toronto
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
Program Representations. Representing programs Goals.
Optimization Compiler Baojian Hua
ISBN Chapter 3 Describing Syntax and Semantics.
Technology from seed Automatic Synthesis of Weakest Preconditions for Compiler Optimizations Nuno Lopes Advisor: José Monteiro.
Automated Soundness Proofs for Dataflow Analyses and Transformations via Local Rules Sorin Lerner* Todd Millstein** Erika Rice* Craig Chambers* * University.
Using Programmer-Written Compiler Extensions to Catch Security Holes Authors: Ken Ashcraft and Dawson Engler Presented by : Hong Chen CS590F 2/7/2007.
An Integration of Program Analysis and Automated Theorem Proving Bill J. Ellis & Andrew Ireland School of Mathematical & Computer Sciences Heriot-Watt.
Recap from last time We were trying to do Common Subexpression Elimination Compute expressions that are available at each program point.
Recap from last time Saw several examples of optimizations –Constant folding –Constant Prop –Copy Prop –Common Sub-expression Elim –Partial Redundancy.
Automatically Proving the Correctness of Compiler Optimizations Sorin Lerner Todd Millstein Craig Chambers University of Washington.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
Administrative info Subscribe to the class mailing list –instructions are on the class web page, click on “Course Syllabus” –if you don’t subscribe by.
Administrative info Subscribe to the class mailing list –instructions are on the class web page, which is accessible from my home page, which is accessible.
9. Optimization Marcus Denker. 2 © Marcus Denker Optimization Roadmap  Introduction  Optimizations in the Back-end  The Optimizer  SSA Optimizations.
Advanced Compilers CSE 231 Instructor: Sorin Lerner.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
Previous finals up on the web page use them as practice problems look at them early.
X := 11; if (x == 11) { DoSomething(); } else { DoSomethingElse(); x := x + 1; } y := x; // value of y? Phase ordering problem Optimizations can interact.
Another example p := &x; *p := 5 y := x + 1;. Another example p := &x; *p := 5 y := x + 1; x := 5; *p := 3 y := x + 1; ???
Software Reliability Methods Sorin Lerner. Software reliability methods: issues What are the issues?
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
Some administrative stuff Class mailing list: –send to with the command “subscribe”
From last time S1: l := new Cons p := l S2: t := new Cons *p := t p := t l p S1 l p tS2 l p S1 t S2 l t S1 p S2 l t S1 p S2 l t S1 p L2 l t S1 p S2 l t.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Recap from last time We saw various different issues related to program analysis and program transformations You were not expected to know all of these.
Provably Correct Compilers (Part 2) Nazrul Alam and Krishnaprasad Vikram April 21, 2005.
Describing Syntax and Semantics
Schedule Midterm out tomorrow, due by next Monday Final during finals week Project updates next week.
Composing Dataflow Analyses and Transformations Sorin Lerner (University of Washington) David Grove (IBM T.J. Watson) Craig Chambers (University of Washington)
Recap from last time We saw various different issues related to program analysis and program transformations You were not expected to know all of these.
Pointer analysis. Flow insensitive loss of precision S1: l := new Cons p := l S2: t := new Cons *p := t p := t l t S1 p S2 l t S1 p S2 l t S1 p S2 l t.
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Have Your Verified Compiler And Extend It Too Zachary Tatlock Sorin Lerner UC San Diego.
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
Proof Carrying Code Zhiwei Lin. Outline Proof-Carrying Code The Design and Implementation of a Certifying Compiler A Proof – Carrying Code Architecture.
1 Code optimization “Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code”
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
© Andrew IrelandDependable Systems Group Proof Automation for the SPARK Approach to High Integrity Ada Andrew Ireland Computing & Electrical Engineering.
An Undergraduate Course on Software Bug Detection Tools and Techniques Eric Larson Seattle University March 3, 2006.
Compiler Optimizations ECE 454 Computer Systems Programming Topics: The Role of the Compiler Common Compiler (Automatic) Code Optimizations Cristiana Amza.
Random Interpretation Sumit Gulwani UC-Berkeley. 1 Program Analysis Applications in all aspects of software development, e.g. Program correctness Compiler.
PLC '06 Experience in Testing Compiler Optimizers Using Comparison Checking Masataka Sassa and Daijiro Sudo Dept. of Mathematical and Computing Sciences.
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
3/2/2016© Hal Perkins & UW CSES-1 CSE P 501 – Compilers Optimizing Transformations Hal Perkins Autumn 2009.
CS412/413 Introduction to Compilers and Translators April 2, 1999 Lecture 24: Introduction to Optimization.
Proving Optimizations Correct using Parameterized Program Equivalence University of California, San Diego Sudipta Kundu Zachary Tatlock Sorin Lerner.
Interface specifications At the core of each Larch interface language is a model of the state manipulated by the associated programming language. Each.
Credible Compilation With Pointers Martin Rinard and Darko Marinov Laboratory for Computer Science Massachusetts Institute of Technology.
Code Optimization Overview and Examples
Code Optimization.
Machine-Independent Optimization
Optimizing Transformations Hal Perkins Autumn 2011
Compilers have many bugs
Code Optimization Overview and Examples Control Flow Graph
Data Flow Analysis Compiler Design
Optimization 薛智文 (textbook ch# 9) 薛智文 96 Spring.
Advanced Compiler Design
Intermediate Code Generation
Building “Correct” Compilers
Presentation transcript:

Automatically Checking the Correctness of Program Analyses and Transformations

Compilers have many bugs [Bug middle-end/19650] New: miscompilation of correct code [Bug c++/19731] arguments incorrectly named in static member specialization [Bug rtl-optimization/13300] Variable incorrectly identified as a biv [Bug rtl-optimization/16052] strength reduction produces wrong code [Bug tree-optimization/19633] local address incorrectly thought to escape [Bug target/19683] New: MIPS wrong-code for 64-bit multiply [Bug c++/19605] Wrong member offset in inherited classes Bug java/19295] [4.0 regression] Incorrect bytecode produced for bitwise AND … Searched for “incorrect” and “wrong” in the gcc- bugs mailing list. Some of the results: Total of 545 matches… And this is only for Janary 2005! On a mature compiler!

Compiler bugs cause problems if (…) { x := …; } else { y := …; } …; Exec Compiler They lead to buggy executables They rule out having strong guarantees about executables

The focus: compiler optimizations A key part of any optimizing compiler Original program Optimization Optimized program

The focus: compiler optimizations A key part of any optimizing compiler Hard to get optimizations right –Lots of infrastructure-dependent details –There are many corner cases in each optimization –There are many optimizations and they interact in unexpected ways –It is hard to test all these corner cases and all these interactions

Goals Make it easier to write compiler optimizations –student in an undergrad compiler course should be able to write optimizations Provide strong guarantees about the correctness of optimizations –automatically (no user intervention at all) –statically (before the opts are even run once) Expressive enough for realistic optimizations

The Rhodium work A domain-specific language for writing optimizations: Rhodium A correctness checker for Rhodium optimizations An execution engine for Rhodium optimizations Implemented and checked the correctness of a variety of realistic optimizations

Broader implications Many other kinds of program manipulators: code refactoring tools, static checkers –My work is about program analyses and transformations, the core of any program manipulator Enables safe extensible program manipulators –Allow end programmers to easily and safely extend program manipulators –Improve programmer productivity

Outline Introduction Overview of the Rhodium system Writing Rhodium optimizations Checking Rhodium optimizations Evaluation

Rhodium system overview Checker Written by programmer Written by me Rhodium Execution engine Rdm Opt Rdm Opt Rdm Opt

Rhodium system overview Checker Written by programmer Written by me Rhodium Execution engine Rdm Opt Rdm Opt Rdm Opt

Rhodium system overview Rdm Opt Rdm Opt Rdm Opt Checker

Rhodium system overview Exec Compiler Rhodium Execution engine Rdm Opt Rdm Opt Rdm Opt if (…) { x := …; } else { y := …; } …; Checker

The technical problem Tension between: –Expressiveness –Automated correctness checking Challenge: develop techniques –that will go a long way in terms of expressiveness –that allow correctness to be checked

Contribution: three techniques Automatic Theorem Prover Rdm Opt Verification Task Checker Show that for any original program: behavior of original program = behavior of optimized program Verification Task

Contribution: three techniques Automatic Theorem Prover Rdm Opt Verification Task

Contribution: three techniques Automatic Theorem Prover Rdm Opt Verification Task

Contribution: three techniques 1.Rhodium is declarative –declare intent using rules –execution engine takes care of the rest Automatic Theorem Prover Rdm Opt

Contribution: three techniques Automatic Theorem Prover Rdm Opt 1.Rhodium is declarative –declare intent using rules –execution engine takes care of the rest

Contribution: three techniques 1.Rhodium is declarative 2.Factor out heuristics –legal transformations –vs. profitable transformations Automatic Theorem Prover Rdm Opt Heuristics not affecting correctness Part that must be reasoned about

Contribution: three techniques Automatic Theorem Prover 1.Rhodium is declarative 2.Factor out heuristics –legal transformations –vs. profitable transformations Heuristics not affecting correctness Part that must be reasoned about

Contribution: three techniques 1.Rhodium is declarative 2.Factor out heuristics 3.Split verification task –opt-dependent –vs. opt-independent Automatic Theorem Prover opt- dependent opt- independent

Contribution: three techniques 1.Rhodium is declarative 2.Factor out heuristics 3.Split verification task –opt-dependent –vs. opt-independent Automatic Theorem Prover

Contribution: three techniques Automatic Theorem Prover 1.Rhodium is declarative 2.Factor out heuristics 3.Split verification task –opt-dependent –vs. opt-independent

Contribution: three techniques Automatic Theorem Prover 1.Rhodium is declarative 2.Factor out heuristics 3.Split verification task Result: Expressive language Automated correctness checking

Outline Introduction Overview of the Rhodium system Writing Rhodium optimizations Checking Rhodium optimizations Evaluation

MustPointTo analysis c = a a = &b d = *c ab c ab d = b

MustPointTo info in Rhodium c = a a = &b mustPointTo ( a, b ) c ab mustPointTo ( c, b ) ab d = *c

MustPointTo info in Rhodium c = a a = &b d = *c c ab mustPointTo ( a, b ) mustPointTo ( c, b ) c = a a = &b d = *c c ab mustPointTo ( a, b ) mustPointTo ( c, b ) mustPointTo ( a, b ) abab

MustPointTo info in Rhodium define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] c = a a = &b d = *c c ab mustPointTo ( a, b ) mustPointTo ( c, b ) mustPointTo ( a, b ) ab Fact correct on edge if: whenever program execution reaches edge, meaning of fact evaluates to true in the program state

Propagating facts c = a a = &b d = *c c ab mustPointTo ( a, b ) mustPointTo ( c, b ) define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] mustPointTo ( a, b ) ab

a = &b if currStmt == [X = &Y] then Propagating facts c = a a = &b d = *c c ab mustPointTo ( a, b ) mustPointTo ( c, b ) if currStmt == [X = &Y] then define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] mustPointTo ( a, b ) ab

if currStmt == [X = &Y] then Propagating facts c = a a = &b d = *c c ab mustPointTo ( a, b ) mustPointTo ( c, b ) define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] mustPointTo ( a, b ) ab

c = a Propagating facts a = &b d = *c c ab mustPointTo ( a, b ) mustPointTo ( c, b ) if and currStmt == [Z = X] then mustPointTo ( c, b ) define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] if currStmt == [X = &Y] then mustPointTo ( a, b ) ab

Propagating facts c = a a = &b d = *c c ab mustPointTo ( a, b ) mustPointTo ( c, b ) define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] if currStmt == [X = &Y] then if and currStmt == [Z = X] then mustPointTo ( a, b ) ab

d = *c Transformations c = a a = &b c ab mustPointTo ( a, b ) mustPointTo ( c, b ) if and currStmt == [Z = *X] then transform to [Z = Y] mustPointTo ( c, b ) d = b define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] if currStmt == [X = &Y] then if and currStmt == [Z = X] then mustPointTo ( a, b ) ab

d = *c Transformations c = a a = &b c ab mustPointTo ( a, b ) mustPointTo ( c, b ) if and currStmt == [Z = *X] then transform to [Z = Y] d = b define fact mustPointTo(X:Var,Y:Var) with meaning [ X == &Y ] if currStmt == [X = &Y] then if and currStmt == [Z = X] then mustPointTo ( a, b ) ab

Profitability heuristics Legal transformations Subset of legal transformations (identified by the Rhodium rules) (actually performed) Profitability Heuristics

Profitability heuristic example 1 Inlining Many heuristics to determine when to inline a function –compute function sizes, estimate code-size increase, estimate performance benefit –maybe even use AI techniques to make the decision However, these heuristics do not affect the correctness of inlining They are just used to choose which of the correct set of transformations to perform

Profitability heuristic example 2 a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... } x := a + b; Partial redundancy elimination (PRE)

Profitability heuristic example 2 Code duplication a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... } x := a + b; PRE as code duplication followed by CSE

Profitability heuristic example 2 Code duplication CSE a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... } x := x := a + b; a + b; x; PRE as code duplication followed by CSE

Profitability heuristic example 2 Code duplication CSE self-assignment removal a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... } x := x := a + b; x; PRE as code duplication followed by CSE

a :=...; b :=...; if (...) { a :=...; x := a + b; } else {... } x := a + b; Profitability heuristic example 2 Legal placements of x := a + b Profitable placement

Semantics of a Rhodium opt Run propagation rules in a loop until there are no more changes (optimistic iterative analysis) Then run transformation rules Then run profitability heuristics For better precision, combine propagation rules and transformations rules using our previous composition framework [POPL 02]

More facts define fact mustNotPointTo(X:Var,Y:Var) with meaning [ X  &Y ] define fact hasConstantValue(X:Var,C:Const) with meaning [ X == C ] define fact doesNotPointIntoHeap(X:Var) with meaning [ exists Y:Var. X == &Y ]

More rules if currStmt == [X = *A] and and forall B:Var. implies mustNotPointTo(B,Y) then if currStmt == [Y = I + BE ] and and and : mayDef(X) Æ : mayDefArray(A) Æ unchanged(BE) then

More in Rhodium More powerful pointer analyses –Heap summaries Analyses across procedures –Interprocedural analyses Analyses that don’t care about the order of statements –Flow-insensitive analyses

Outline Introduction Overview of the Rhodium system Writing Rhodium optimizations Checking Rhodium optimizations Evaluation

Exec Compiler Rhodium Execution engine Rdm Opt Rdm Opt if (…) { x := …; } else { y := …; } …; Checker Rhodium correctness checker Rdm Opt Checker

Rhodium correctness checker Checker Rdm Opt

Checker Rhodium correctness checker Automatic theorem prover Rdm Opt Checker

Rhodium correctness checker Automatic theorem prover define fact … if … then transform … if … then … Checker Profitability heuristics Rhodium optimization

Rhodium correctness checker Automatic theorem prover Rhodium optimization define fact … if … then transform … if … then … Checker

Rhodium correctness checker Automatic theorem prover Rhodium optimization define fact … VCGen Local VC Lemma For any Rhodium opt: If Local VCs are true Then opt is correct Proof   «  ¬   $  \ r t  l Checker Opt- dependent Opt- independent VCGen if … then … if … then transform …

Local verification conditions define fact mustPointTo(X,Y) with meaning [ X == &Y ] if and currStmt == [Z = X] then if and currStmt == [Z = *X] then transform to [Z = Y] Assume: Propagated fact is correct Show: All incoming facts are correct Assume: Original stmt and transformed stmt have same behavior Show: All incoming facts are correct Local VCs (generated and proven automatically)

Local correctness of prop. rules currStmt == [Z = X] then Local VC (generated and proven automatically) if and define fact mustPointTo(X,Y) with meaning [ X == &Y ] Assume: Propagated fact is correct Show: All incoming facts are correct Show: [ Z == &Y ] (  out ) [ X == &Y ] (  in ) and  out  = step (  in, [Z = X] ) Assume: mustPointTo ( X, Y ) mustPointTo ( Z, Y ) Z := X

Local correctness of prop. rules Show: [ Z == &Y ] (  out ) [ X == &Y ] (  in ) and  out  = step (  in, [Z = X] ) Assume: Local VC (generated and proven automatically) define fact mustPointTo(X,Y) with meaning [ X == &Y ] currStmt == [Z = X] then if and mustPointTo ( X, Y ) mustPointTo ( Z, Y ) Z := X XY  in  out ZY ?

Outline Introduction Overview of the Rhodium system Writing Rhodium optimizations Checking Rhodium optimizations Evaluation

Dimensions of evaluation Ease of use Correctness guarantees Usefulness of the checker Expressiveness

Ease of use Joao Dias –third year graduate student in compilers at Harvard –less than 45 mins to write CSE and copy prop Erika Rice, summer 2004 –only knowledge of compilers: one undergrad class –started writing Rhodium optimizations in a few days Simple interface to the compiler’s structures –pattern matching –“flow functions” familiar to compiler 101 students Ease of use Guarantees Usefulness Expressiveness

Correctness guarantees Once checked, optimizations are guaranteed to be correct Caveat: trusted computing base –execution engine –checker implementation –proofs done by hand once by me Adding a new optimization does not increase the size of the trusted computing base Ease of use Guarantees Usefulness Expressiveness

Usefulness of the checker Found subtle bugs in my initial implementation of various optimizations define fact equals(X:Var, E:Expr) with meaning [ X == E ] if currStmt == [X = E] then x := x + 1x = x + 1 equals ( x, x + 1 ) Ease of use Guarantees Usefulness Expressiveness

if currStmt == [X = E] then if currStmt == [X = E] and “X does not appear in E” then Usefulness of the checker Found subtle bugs in my initial implementation of various optimizations define fact equals(X:Var, E:Expr) with meaning [ X == E ] x := x + 1x = x + 1 equals ( x, x + 1 ) Ease of use Guarantees Usefulness Expressiveness

x = x + 1 x = *y + 1 Usefulness of the checker Found subtle bugs in my initial implementation of various optimizations define fact equals(X:Var, E:Expr) with meaning [ X == E ] if currStmt == [X = E] Æ “X does not appear in E” then equals ( x, x + 1 ) equals ( x, *y + 1 ) if currStmt == [X = E] and “E does not use X” then Ease of use Guarantees Usefulness Expressiveness

Rhodium expressiveness Traditional optimizations: –const prop and folding, branch folding, dead assignment elim, common sub-expression elim, partial redundancy elim, partial dead assignment elim, arithmetic invariant detection, and integer range analysis. Pointer analyses –must-point-to analysis, Andersen's may-point-to analysis with heap summaries Loop opts –loop-induction-variable strength reduction, code hoisting, code sinking Array opts –constant propagation through array elements, redundant array load elimination Ease of use Guarantees Usefulness Expressiveness

Rhodium expressiveness Traditional optimizations: –const prop and folding, branch folding, dead assignment elim, common sub-expression elim, partial redundancy elim, partial dead assignment elim, arithmetic invariant detection, and integer range analysis. Pointer analyses –must-point-to analysis, Andersen's may-point-to analysis with heap summaries Loop opts –loop-induction-variable strength reduction, code hoisting, code sinking Array opts –constant propagation through array elements, redundant array load elimination Ease of use Guarantees Usefulness Expressiveness

Expressiveness limitations May not be able to express your optimization in Rhodium –opts that build complicated data structures –opts that perform complicated many-to-many transformations (e.g.: loop fusion, loop unrolling) A correct Rhodium optimization may be rejected by the correctness checker –limitations of the theorem prover –limitations of first-order logic Ease of use Guarantees Usefulness Expressiveness

Summary Rhodium system –makes it easier to write optimizations –provides correctness guarantees –is expressive enough for realistic optimizations Rhodium system provides a foundation for safe extensible program manipulators