Correctness. Until now We’ve seen how to define dataflow analyses How do we know our analyses are correct? We could reason about each individual analysis.

Slides:



Advertisements
Similar presentations
Dataflow Analysis for Datarace-Free Programs (ESOP 11) Arnab De Joint work with Deepak DSouza and Rupesh Nasre Indian Institute of Science, Bangalore.
Advertisements

Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Dataflow Analysis Introduction Guo, Yao Part of the slides are adapted from.
Program Representations. Representing programs Goals.
1 Basic abstract interpretation theory. 2 The general idea §a semantics l any definition style, from a denotational definition to a detailed interpreter.
CSE 231 : Advanced Compilers Building Program Analyzers.
CSE 231 : Advanced Compilers Building Program Analyzers.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Worklist algorithm Initialize all d i to the empty set Store all nodes onto a worklist while worklist is not empty: –remove node n from worklist –apply.
1 Introduction to Computability Theory Lecture2: Non Deterministic Finite Automata (cont.) Prof. Amos Israeli.
Lecture 8 Recursively enumerable (r.e.) languages
Recap from last time We were trying to do Common Subexpression Elimination Compute expressions that are available at each program point.
Course project presentations No midterm project presentation Instead of classes, next week I’ll meet with each group individually, 30 mins each Two time.
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
CS 536 Spring Global Optimizations Lecture 23.
Data Abstraction COS 441 Princeton University Fall 2004.
From last time: live variables Set D = 2 Vars Lattice: (D, v, ?, >, t, u ) = (2 Vars, µ, ;,Vars, [, Å ) x := y op z in out F x := y op z (out) = out –
Administrative info Subscribe to the class mailing list –instructions are on the class web page, which is accessible from my home page, which is accessible.
From last time: Lattices A lattice is a tuple (S, v, ?, >, t, u ) such that: –(S, v ) is a poset – 8 a 2 S. ? v a – 8 a 2 S. a v > –Every two elements.
Data Flow Analysis Compiler Design Nov. 3, 2005.
From last time: reaching definitions For each use of a variable, determine what assignments could have set the value being read from the variable Information.
Constraints for reaching definitions Using may-point-to information: out = in [ { x ! s | x 2 may-point-to(p) } Using must-point-to aswell: out = in –
X := 11; if (x == 11) { DoSomething(); } else { DoSomethingElse(); x := x + 1; } y := x; // value of y? Phase ordering problem Optimizations can interact.
Another example p := &x; *p := 5 y := x + 1;. Another example p := &x; *p := 5 y := x + 1; x := 5; *p := 3 y := x + 1; ???
Back to lattice (D, v, ?, >, t, u ) = (2 A, ¶, A, ;, Å, [ ) where A = { x ! N | x 2 Vars Æ N 2 Z } What’s the problem with this lattice? Lattice is infinitely.
Loop invariant detection using SSA An expression is invariant in a loop L iff: (base cases) –it’s a constant –it’s a variable use, all of whose single.
Class canceled next Tuesday. Recap: Components of IR Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements.
Administrative stuff Office hours: After class on Tuesday.
Recap Let’s do a recap of what we’ve seen so far Started with worklist algorithm for reaching definitions.
Constraints for reaching definitions Using may-point-to information: out = in [ { x ! s | x 2 may-point-to(p) } Using must-point-to aswell: out = in –
San Diego October 4-7, 2006 Over 1,000 women in computing Events for undergraduates considering careers and graduate school Events for graduate students.
Recap: Reaching defns algorithm From last time: reaching defns worklist algo We want to avoid using structure of the domain outside of the flow functions.
From last lecture x := y op z in out F x := y op z (in) = in [ x ! in(y) op in(z) ] where a op b =
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Projects. Dataflow analysis Dataflow analysis: what is it? A common framework for expressing algorithms that compute information about a program Why.
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
From last time: reaching definitions For each use of a variable, determine what assignments could have set the value being read from the variable Information.
From last lecture We want to find a fixed point of F, that is to say a map m such that m = F(m) Define ?, which is ? lifted to be a map: ? = e. ? Compute.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Composing Dataflow Analyses and Transformations Sorin Lerner (University of Washington) David Grove (IBM T.J. Watson) Craig Chambers (University of Washington)
Even more formal To reason more formally about termination and precision, we re-express our worklist algorithm mathematically We will use fixed points.
Termination Still, it’s annoying to have to perform a join in the worklist algorithm It would be nice to get rid of it, if there is a property of the flow.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Precision Going back to constant prop, in what cases would we lose precision?
Example x := read() v := a + b x := x + 1 w := x + 1 a := w v := a + b z := x + 1 t := a + b.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
Theory of Computation 1 Theory of Computation Peer Instruction Lecture Slides by Dr. Cynthia Lee, UCSD are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike.
MA/CSSE 474 Theory of Computation DFSM Canonical Form Proof of NDFSM  DFSM ALGORITHM (as much as we have time for) This version includes the "answers"
Slide 1 Propositional Definite Clause Logic: Syntax, Semantics and Bottom-up Proofs Jim Little UBC CS 322 – CSP October 20, 2014.
Λλ Fernando Magno Quintão Pereira P ROGRAMMING L ANGUAGES L ABORATORY Universidade Federal de Minas Gerais - Department of Computer Science P ROGRAM A.
Jeffrey D. Ullman Stanford University. 2 boolean x = true; while (x) {... // no change to x }  Doesn’t terminate.  Proof: only assignment to x is at.
Formalization of DFA using lattices. Recall worklist algorithm let m: map from edge to computed value at edge let worklist: work list of nodes for each.
Program Representations. Representing programs Goals.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
Data Flow Analysis II AModel Checking and Abstract Interpretation Feb. 2, 2011.
Abstraction and Abstract Interpretation. Abstraction (a simplified view) Abstraction is an effective tool in verification Given a transition system, we.
Lecture #4 Thinking of designing an abstract machine acts as finite automata. Advanced Computation Theory.
Lub and glb Given a poset (S, · ), and two elements a 2 S and b 2 S, then the: –least upper bound (lub) is an element c such that a · c, b · c, and 8 d.
DFA foundations Simone Campanoni
Dataflow analysis.
G. Ramalingam Microsoft Research, India & K. V. Raghavan
Another example: constant prop
Dataflow analysis.
Abstract Interpretation
Abstract Interpretation
Formalization of DFA using lattices
Formalization of DFA using lattices
Formalization of DFA using lattices
Formalization of DFA using lattices
Presentation transcript:

Correctness

Until now We’ve seen how to define dataflow analyses How do we know our analyses are correct? We could reason about each individual analysis one a time However, a unified framework would make proofs easier to develop and understand

Abstract interpretation Abstract interpretation is such a framework Life in analysis-land is all about two things: –fixed points –and approximations In most general terms, abstract interpretation is a theory of fixed point approximation Abstract interpretation is very flexible, and it has been applied to many domains

Just a reminder An analysis (ignore subscripts for now...): F a is global flow function, which takes a map from edges to dataflow information Solution is

A simple example with const prop x := 0; y := 0; while (...) { x := x + 1; print(x); } print(y);

Same example with a twist At merge point, take union of maps, rather than merging maps x := 0; y := 0; while (...) { x := x + 1; print(x); } print(y);

What exactly is going on? (discussion)

What exactly is going on? Well, to begin with, our analysis doesn’t terminate anymore... We are keeping much more information around In fact, we are running the program

Excursion into semantics land Semantics of a programming language –captures how instructions of a programming language operate vs. semantics of a program –captures how a given program runs

Semantics of a program Can use fixed points to capture the semantics of a program Solution:

Semantics of a program Back to our const prop example What were we computing?

Semantics of a program Back to our const prop example What were we computing? Set of all program states at a given CFG edge This fixed point is not computable, but we never compute it. We only reason about it.

Abstract Interpretation An abstract interpretation I is a tuple: Important to not get confused: F is the global flow function here, and D is the global domain (ie: most of the time D will contain maps form edges to dataflow information)

Concrete and abstract “Concrete” interpretation “Running” program in concrete domain Generally not computable “Abstract” interpretation “Running” program in abstract domain Computable

Concrete and abstract Recall I is an abstract interpretation So we should really be saying “concrete” abstract interpretation, and “abstract” abstract interpretation. So even the concrete interpretation is called abstract... Anyone confused yet?

Concrete and abstract Ok, so why is the “concrete” interpretation a “concrete” abstract interpretation Because all interpretations are in some way or another abstractions of the program In fact, even our so-called “concrete” interpretation can be seen as “abstract” when compared to other interpretation

Back to semantics of a program Collecting semantics –compute set of all program states (this is the const prop example) More “concrete” than collecting semantics: trace prefix semantics –compute at edge e the set of all program traces that reach e Even more concrete: full trace semantics –collect set of all traces Even more concrete?

Back to semantics of a program Less concrete than trace semantics: input-output semantics –compute set of input-output pairs So, to summarize: many options, of varying levels of abstractions. In some sense, they are all abstract (unless maybe you are capturing the set of electron transitions in the wires of your computer... )

Back to semantics of a program Choosing the right concrete semantics is actually important: it affects what you are proving about the program But the key is that all can be expressed as a fixed point computation

Correctness Ok, so now we have two fixed point computations I c and I a. I c is the precise semantics of our program, but it’s not computable. So we compute I a instead. I a is computable, but... is it meaningfull? In other words does I a in fact tell us something about I c ? We now want to show that the abstract fixed point is in fact meaningful, in that it approximates the concrete one.

Formally Formalize relation between the two fixed points using two functions: –abstraction function  –concretization function  ???

Let’s start with the concretization function  : D a ! D c  (d a ) returns the most lenient concrete information that d a approximates For const prop:  (d a ) =

Let’s start with the concretization function  : D a ! D c  (d a ) returns the most lenient concrete information that d a approximates For const prop:  (d a ) =

Approximation d a approximates d c iff: d c v c  (d a ) Assume that at a given edge e, the dataflow info says that a is 4, ie: d a (e) = { a ! 4 }, and assume that d a approximates d c i.e.: d c v c  (d a ) From d c v c  (d a ), using the definition of v c, we get: d c (e) µ  (d a )(e) From d a (e) = { a ! 4 } and defn of , we get  (d a )(e) is the set of all program states where a is 4. So what does d c (e) µ  (d a )(e) say?

Approximation d a approximates d c iff: d c v c  (d a ) Assume that at a given edge e, the dataflow info says that a is 4, ie: d a (e) = { a ! 4 }, and assume that d a approximates d c i.e.: d c v c  (d a ) From d c v c  (d a ), using the definition of v c, we get: d c (e) µ  (d a )(e) From d a (e) = { a ! 4 } and defn of , we get  (d a )(e) is the set of all program states where a is 4. So what does d c (e) µ  (d a )(e) say? It says that a evaluates to 4 in all program states at e

Fixed point approximation We want to show that the abstract fixed point approximates the concrete fixed point

Fixed point approximation We want to show that the abstract fixed point approximates the concrete fixed point This is our goal. We’ll get to establishing it later. First, let’s see 

Abstraction function  : D c ! D a  (d c ) returns the most precise abstract information that characterizes d c For const prop:  (d c ) =

Abstraction function  : D c ! D a  (d c ) returns the most precise abstract information that characterizes d c For const prop:  (d c ) =

Approximation d a approximates d c iff:  (d c ) v a d a Assume that at a given edge e, the dataflow info says that a is 4, ie: d a (e) = { a ! 4 }, and assume that d a approximates d c i.e.:  (d c ) v a d a From  (d c ) v a d a,using the definition of v a, we get:  (d c )(e) ¶ d a (e), and since d a (e) = { a ! 4 }, we get (a ! 4) 2  (d c )(e). From defn of ,  (d c )(e) is the set of all constant prop information that we could possibly get out of d c. So what does (a ! 4) 2  (d c )(e) say?

Fixed point approximation We want to show that the abstract fixed point approximates the concrete fixed point

Fixed point approximation We want to show that the abstract fixed point approximates the concrete fixed point

Summary Want to show:

Summary

Problem The above conditions are global: they talk about the fixed point computation We want some local conditions on F a and F c

Cousot and Cousot 77 Cousot and Cousot show that the following conditions are sufficient for proving (1) and (2): (3) (4) (5)

Let’s look at the  condition

Let’s look at the  condition

Link between local and global Indeed, using (4) we can show by induction that is local version of (4) (1)

Link between local and global