CSE 231 : Advanced Compilers Building Program Analyzers.


Foundations

Foundations : Relations
  A relation R over a set S is just a set of pairs from S: R ⊆ S × S
  a R b means (a, b) ∈ R
  Example: < = { (a, b) | b – a is positive }
  a < b means (a, b) ∈ <

Foundations : Types of Relations
  reflexive:      ∀ a ∈ S. a R a
  transitive:     ∀ a, b, c ∈ S. a R b /\ b R c ⇒ a R c
  symmetric:      ∀ a, b ∈ S. a R b ⇒ b R a
  anti-symmetric: ∀ a, b ∈ S. a R b /\ b R a ⇒ a = b

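Foundations : Anti-Symmetry
  anti-symmetric: ∀ a, b ∈ S. a R b /\ b R a ⇒ a = b
  Anti-symmetry is slightly weird. Essentially: “If you start at X and only follow R to new elements, you will never get back to X.”
  (For paths longer than two steps this also relies on R being transitive, as it is in a partial order.)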

Foundations : Types of Relations
  equivalence: reflexive, transitive, symmetric
    defines equivalence classes, which partition the domain
  partial order: reflexive, transitive, anti-symmetric
    defines a partial order on the domain
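The four properties can be checked mechanically on a finite relation. The following is a minimal Python sketch, not part of the original slides: the helper names (`is_reflexive`, `is_partial_order`, and so on) and the sample relations on {1, 2, 3, 4} are illustrative assumptions.

```python
from itertools import product

# A relation R over S is represented as a set of pairs (a, b) with a, b in S.

def is_reflexive(S, R):
    return all((a, a) in R for a in S)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(R):
    # Whenever both (a, b) and (b, a) are present, a and b must be equal.
    return all(a == b for (a, b) in R if (b, a) in R)

def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def is_equivalence(S, R):
    return is_reflexive(S, R) and is_transitive(R) and is_symmetric(R)

def is_partial_order(S, R):
    return is_reflexive(S, R) and is_transitive(R) and is_antisymmetric(R)

# Hypothetical sample relations on a small set.
S = {1, 2, 3, 4}
leq = {(a, b) for a, b in product(S, S) if a <= b}
lt = {(a, b) for a, b in product(S, S) if a < b}
same_parity = {(a, b) for a, b in product(S, S) if a % 2 == b % 2}
```

Under these definitions `leq` is a partial order, `same_parity` is an equivalence, and `lt` is neither, because it is not reflexive.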

Foundations : Posets
  A partially ordered set (poset) is defined by a set S and a partial order ⊑ on S.
  Examples:
    (2^{x, y, z}, ⊆)
    (2^S, ⊆) for any set S
    (Z, ≤)   (strict < is not reflexive, so it does not fit the definition above)
    (positive integers, divides)

Foundations : Least Upper Bounds
  Assume poset (S, ⊑)
  The least upper bound (lub) of a and b is c where:
    a ⊑ c
    b ⊑ c
    ∀ d. (a ⊑ d /\ b ⊑ d) ⇒ c ⊑ d

Foundations : Greatest Lower Bounds
  Assume poset (S, ⊑)
  The greatest lower bound (glb) of a and b is c where:
    c ⊑ a
    c ⊑ b
    ∀ d. (d ⊑ a /\ d ⊑ b) ⇒ d ⊑ c

Foundations : Bounds
  Assume poset (S, ⊑)
  Essentially:
    lub(a, b) = smallest thing bigger than a and b
    glb(a, b) = biggest thing smaller than a and b
  Question: Do lub and glb always exist?
  Answer: No!
  [Figure: two elements of a poset for which glb = ???]
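A small finite poset makes the "No!" concrete. The sketch below is an illustration rather than the example pictured on the slide: it assumes the set {2, 3, 12, 18} ordered by divisibility, and the function names `lub` and `glb` are chosen here for convenience. Each function returns `None` when the bound does not exist.

```python
def lub(S, leq, a, b):
    """Least upper bound of a and b in the finite poset (S, leq), or None."""
    uppers = [c for c in S if leq(a, c) and leq(b, c)]
    # The lub is an upper bound that sits below every other upper bound.
    least = [c for c in uppers if all(leq(c, d) for d in uppers)]
    return least[0] if least else None

def glb(S, leq, a, b):
    """Greatest lower bound of a and b in the finite poset (S, leq), or None."""
    lowers = [c for c in S if leq(c, a) and leq(c, b)]
    greatest = [c for c in lowers if all(leq(d, c) for d in lowers)]
    return greatest[0] if greatest else None

def divides(a, b):
    return b % a == 0

# Hypothetical poset: 12 and 18 are both upper bounds of 2 and 3,
# but neither divides the other, so there is no least one.
S = {2, 3, 12, 18}
```

Here `lub(S, divides, 2, 3)` and `glb(S, divides, 12, 18)` are both `None`, while `lub(S, divides, 2, 12)` is 12.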

Foundations : Lattices
  A lattice is (S, ⊑, ⊥, ⊤, ⊔, ⊓) where:
    (S, ⊑) is a poset
    ⊥ is the smallest thing in S
    ⊤ is the biggest thing in S
    lub(a, b) and glb(a, b) always exist
    a ⊔ b = lub(a, b)
    a ⊓ b = glb(a, b)

Foundations : Lattices (Formally)
  A lattice is (S, ⊑, ⊥, ⊤, ⊔, ⊓) where:
    (S, ⊑) is a poset
    ∀ a ∈ S. ⊥ ⊑ a
    ∀ a ∈ S. a ⊑ ⊤
    ∀ a, b ∈ S. ∃ c. c = lub(a, b) /\ a ⊔ b = c
    ∀ a, b ∈ S. ∃ c. c = glb(a, b) /\ a ⊓ b = c

Foundations : Fancy Lattice Names
  ⊥ is “bottom”
  ⊤ is “top”
  ⊔ is “join”
  ⊓ is “meet”

Examples of lattices Powerset lattice
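As a rough illustration of the powerset lattice, the sketch below builds (2^{x, y, z}, ⊆) in Python and checks the lattice conditions by brute force. The names `powerset`, `is_lub`, and `is_glb` are assumptions made for this example; join is taken to be set union and meet set intersection.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

universe = frozenset({"x", "y", "z"})
S = powerset(universe)

bottom = frozenset()   # smallest element: the empty set
top = universe         # biggest element: the whole universe

def leq(a, b):
    return a <= b      # subset ordering

def join(a, b):
    return a | b       # union

def meet(a, b):
    return a & b       # intersection

def is_lub(c, a, b):
    # c is an upper bound of a and b, and lies below every other upper bound.
    return (leq(a, c) and leq(b, c)
            and all(leq(c, d) for d in S if leq(a, d) and leq(b, d)))

def is_glb(c, a, b):
    return (leq(c, a) and leq(c, b)
            and all(leq(d, c) for d in S if leq(d, a) and leq(d, b)))
```

Checking `is_lub(join(a, b), a, b)` and `is_glb(meet(a, b), a, b)` for every pair in `S` confirms that union and intersection play the roles of ⊔ and ⊓ here.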

Examples of lattices Boolean expressions

End Foundations, Now Build

Analysis on Sets
  let m: map from edge to computed value at edge
  let worklist: work list of nodes
  for each edge e in CFG do
    m(e) := ∅
  for each node n do
    worklist.add(n)
  while (worklist.empty.not) do
    let n := worklist.remove_any;
    let info_in := m(n.incoming_edges);
    let info_out := F(n, info_in);
    for i := 0 .. info_out.length do
      let new_info := m(n.outgoing_edges[i]) ∪ info_out[i];
      if (m(n.outgoing_edges[i]) ≠ new_info)
        m(n.outgoing_edges[i]) := new_info;
        worklist.add(n.outgoing_edges[i].dst);

Port Analysis To Run On Lattices
  Formalize the domain with a powerset lattice.
  What should top and bottom be?
  Does it matter? Should we even care?
  Yes. A notion of approximation shows up in the lattice.

Lattice Direction
  Unfortunate name clashes:
    dataflow analysis picked one direction
    abstract interpretation picked the other
  We work in the abstract interpretation direction:
    ⊥ (bottom) is most precise (optimistic)
    ⊤ (top) is most imprecise (conservative)

Lattice Direction
  Always safe to go up in the lattice: we can always set the result to ⊤ (top).
  Hard to go down in the lattice.
  So: ⊥ (bottom) will be the empty set in reaching defns

Building an Analysis: worklist + lattice
  let m: map from edge to computed value at edge
  let worklist: work list of nodes
  for each edge e in CFG do
    m(e) := ⊥
  for each node n do
    worklist.add(n)
  while (worklist.empty.not) do
    let n := worklist.remove_any;
    let info_in := m(n.incoming_edges);
    let info_out := F(n, info_in);
    for i := 0 .. info_out.length do
      let new_info := m(n.outgoing_edges[i]) ⊔ info_out[i];
      if (m(n.outgoing_edges[i]) ≠ new_info)
        m(n.outgoing_edges[i]) := new_info;
        worklist.add(n.outgoing_edges[i].dst);
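The lattice-based worklist loop can be sketched in runnable Python. This is one possible rendering under stated assumptions, not the course's implementation: edges are `(src, dst)` pairs, the lattice is passed in as a `bottom` value and a `join` function, and the four-node CFG, the `defs` table, and the `reaching_defs_flow` function are hypothetical, built only to exercise the loop on reaching definitions.

```python
def worklist_analysis(nodes, edges, flow, bottom, join):
    """Map each CFG edge to dataflow information using the worklist loop."""
    m = {e: bottom for e in edges}
    incoming = {n: [e for e in edges if e[1] == n] for n in nodes}
    outgoing = {n: [e for e in edges if e[0] == n] for n in nodes}
    worklist = list(nodes)
    while worklist:
        n = worklist.pop()                      # remove any node
        info_in = [m[e] for e in incoming[n]]
        info_out = flow(n, info_in, len(outgoing[n]))
        for e, out in zip(outgoing[n], info_out):
            new_info = join(m[e], out)
            if new_info != m[e]:
                m[e] = new_info
                worklist.append(e[1])           # revisit the edge's destination
    return m

# Hypothetical CFG: n1 defines x, n2 is a loop head, n3 redefines x, n4 uses x.
nodes = ["n1", "n2", "n3", "n4"]
edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n2"), ("n2", "n4")]
defs = {"n1": "x", "n2": None, "n3": "x", "n4": None}

def reaching_defs_flow(n, info_in, n_out):
    """Merge incoming definitions, then kill and generate for node n."""
    merged = frozenset().union(*info_in)
    var = defs[n]
    if var is not None:
        merged = frozenset(d for d in merged if d[0] != var) | {(var, n)}
    return [merged] * n_out

result = worklist_analysis(nodes, edges, reaching_defs_flow,
                           bottom=frozenset(),
                           join=lambda a, b: a | b)
```

In this example both definitions of `x` reach the edge from `n2` to `n4`, while only the definition at `n3` flows along the back edge from `n3` to `n2`.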

Termination?
  For reaching definitions, it terminates. Why?
  Because the lattice is finite.
  Can we loosen this requirement?
  Yes, only require the lattice to have a finite height.
  Height of a lattice: length of the longest ascending or descending chain

Termination?
  Height of lattice (2^S, ⊆) = ???
  Height of lattice (2^S, ⊆) = |S|
  (each step up a chain must add at least one element of S, so a chain from ∅ to S takes at most |S| steps)
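For small sets the height claim can be checked by brute force. The sketch below assumes that "length" counts the steps in a chain rather than its elements, which is the reading under which the height equals |S|; the function name `height_of_powerset` is an illustrative choice.

```python
from functools import lru_cache
from itertools import combinations

def height_of_powerset(universe):
    """Number of steps in the longest strictly ascending chain of (2^S, subset)."""
    items = sorted(universe)
    elems = [frozenset(c)
             for r in range(len(items) + 1)
             for c in combinations(items, r)]

    @lru_cache(maxsize=None)
    def longest_from(a):
        # Longest chain starting at a: take one strict step up, then recurse.
        return max((1 + longest_from(b) for b in elems if a < b), default=0)

    return max(longest_from(a) for a in elems)
```

This enumerates all 2^|S| subsets, so it is only meant for very small sets.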

Termination. But can we do better?
  Still annoying to perform the join in the worklist algo: it would be nice to get rid of it.
  Is there a property of the flow functions that can help?
  while (worklist.empty.not) do
    let n := worklist.remove_any;
    let info_in := m(n.incoming_edges);
    let info_out := F(n, info_in);
    for i := 0 .. info_out.length do
      let new_info := m(n.outgoing_edges[i]) ⊔ info_out[i];
      if (m(n.outgoing_edges[i]) ≠ new_info)
        m(n.outgoing_edges[i]) := new_info;
        worklist.add(n.outgoing_edges[i].dst);

Crank Up the Formality… To reason even more precisely about termination, we port our worklist algorithm to math. Fixed points underlie our new algorithm.

Back to Foundations

Fixpoints
  What is a fixpoint?
  An input to a function that equals its output.
  So a fixpoint for f is any input x such that: f(x) = x
  Easy!

Fixpoints Goal: compute map m from CFG edges to dataflow information Strategy: define a global flow function F as follows: F takes a map m as a parameter and returns a new map m’, in which individual local flow functions have been applied

Fixpoints Recall, we are computing m, a map from edges to dataflow information Define a global flow function F as follows: F takes a map m as a parameter and returns a new map m’, in which individual local flow functions have been applied

Fixpoints
  We want to find a fixed point of F, that is, a map m such that m = F(m)
  Approach to doing this?
  Define ⊥̃, which is ⊥ lifted to be a map: ⊥̃ = λe. ⊥ (every edge maps to ⊥)
  Compute F(⊥̃), then F(F(⊥̃)), then F(F(F(⊥̃))), ... until the result doesn’t change
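The iteration from the lifted bottom can be sketched as follows. This is a minimal illustration under assumptions, not the course's code: the helper `fixpoint`, the four-node CFG, and the `defs` table are hypothetical, and the global flow function `F` here applies a reaching-definitions local flow function at the source node of every edge.

```python
def fixpoint(F, start):
    """Apply F repeatedly from start until the result stops changing."""
    m = start
    while True:
        nxt = F(m)
        if nxt == m:
            return m
        m = nxt

# Hypothetical CFG: n1 defines x, n2 is a loop head, n3 redefines x, n4 uses x.
edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n2"), ("n2", "n4")]
defs = {"n1": "x", "n2": None, "n3": "x", "n4": None}

def F(m):
    """Global flow function: build a new map m' from the old map m."""
    new_m = {}
    for (src, dst) in edges:
        # Merge the old information on every edge entering src.
        info = frozenset().union(*(m[e] for e in edges if e[1] == src))
        var = defs[src]
        if var is not None:
            # Kill earlier definitions of var, then generate this one.
            info = frozenset(d for d in info if d[0] != var) | {(var, src)}
        new_m[(src, dst)] = info
    return new_m

# Bottom lifted to a map: every edge starts at the empty set.
bottom_map = {e: frozenset() for e in edges}
solution = fixpoint(F, bottom_map)
```

The loop stops because each application of `F` only adds definitions in this example and the powerset lattice has finite height; `solution` then satisfies `F(solution) == solution`.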