San Diego October 4-7, 2006 Over 1,000 women in computing Events for undergraduates considering careers and graduate school Events for graduate students.

Presentation transcript:

San Diego, October 4-7, 2006
Over 1,000 women in computing
Events for undergraduates considering careers and graduate school
Events for graduate students
Parties, company representatives, and more!
Presented by the Anita Borg Institute for Women and Technology and the Association for Computing Machinery
Volunteers needed! Free registration!

Keynote speakers
– Shirley Tilghman, President, Princeton University
– Sally Ride, former NASA astronaut and professor, UCSD
– Helen Greiner, President, iRobot

Administrative info
Subscribe to the class mailing list!
– instructions are on the class web page, which is accessible from my home page, which is accessible by searching for Sorin Lerner on Google

From last lecture
Flow functions: given information in before statement s, F_s(in) returns the information after statement s
Flow functions are a central component of a dataflow analysis
They state constraints on the information flowing into and out of a statement
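As a minimal sketch of one such flow function, assume reaching-definitions facts are represented as sets of (variable, statement label) pairs. The name `flow_assign` and this representation are illustrative assumptions, not taken from the slides, and the sketch covers only a direct assignment `x := ...` (not stores through pointers).

```python
def flow_assign(var, label, info_in):
    """Flow function for `var := ...` at statement `label` (reaching definitions)."""
    # Kill every earlier definition of var, then generate the new one.
    survivors = {(v, s) for (v, s) in info_in if v != var}
    return frozenset(survivors | {(var, label)})

# Before statement 3 (`y := ...`), x was defined at 1 and y at 2.
d_in = frozenset({("x", 1), ("y", 2)})
d_out = flow_assign("y", 3, d_in)
print(sorted(d_out))  # [('x', 1), ('y', 3)]
```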

Back to example
[Figure: control-flow graph of the example program. Statements: 1: x := ..., 2: y := ..., 3: y := ..., 4: p := ..., a branch if (...), 5: x := y, 6: x := ..., 7: *p := ..., a merge, and 8: y := ...; edges are labeled d_0 through d_16.]
d_1 = F_a(d_0)
d_2 = F_b(d_1)
d_3 = F_c(d_2)
d_4 = F_d(d_3)
d_5 = F_e(d_4)
d_6 = F_g(d_5)
d_7 = F_h(d_6)
d_8 = F_i(d_7)
d_9 = F_f(d_5)
d_10 = F_j(d_9)
d_11 = F_k(d_10)
d_12 = F_l(d_11)
d_13 = F_m(d_12, d_8)
d_14 = F_n(d_13)
d_15 = F_o(d_14)
d_16 = F_p(d_15)
How to find solutions for d_i?

This is a forward problem
– given the information flowing into a node, the flow function determines the information flowing out of the node
To solve, simply propagate information forward through the control flow graph, using the flow functions
What are the problems with this approach?

First problem
What about the incoming information?
– d_0 is not constrained
– so where do we start?
Need to constrain d_0
Two options:
– explicitly state entry information
– have an entry node whose flow function sets the information on entry (it doesn’t matter if the entry node has an incoming edge; its flow function ignores any input)

Entry node
[Figure: node s: entry, with incoming edge in and outgoing edge out.]
out = { x → s | x ∈ Formals }

Back to example
The constraints on d_1 through d_16 are unchanged, and d_0 is now constrained as well:
d_0 = F_entry()
Which order to process nodes in?

How to find solutions for d_i?
Sort nodes in topological order
– each node appears in the order after all of its predecessors
Just run the flow functions for each of the nodes in the topological order
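The single topological pass can be sketched as follows for a loop-free graph. The graph, the node names, and the simplified "add my own facts" flow functions are hypothetical; the only library call is `graphlib.TopologicalSorter` from the Python standard library (3.9+), which takes a mapping from each node to its predecessors.

```python
from graphlib import TopologicalSorter

# Hypothetical acyclic CFG: node -> set of predecessors.
preds = {"entry": set(), "a": {"entry"}, "b": {"a"}, "c": {"a"}, "merge": {"b", "c"}}

# Hypothetical facts generated by each node; each flow function just adds them.
gen = {"entry": set(), "a": {"x@a"}, "b": {"y@b"}, "c": {"y@c"}, "merge": set()}

out = {}
for n in TopologicalSorter(preds).static_order():
    # All predecessors were processed earlier, so their outputs are final.
    info_in = set().union(*(out[p] for p in preds[n]))
    out[n] = info_in | gen[n]

print(sorted(out["merge"]))  # ['x@a', 'y@b', 'y@c']
```

Each node is visited exactly once, which is why this approach only works when a topological order exists.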

Second problem
When there are loops, there is no topological order!
What to do?
Let’s try and see what we should do

[Figure: control-flow graph of the example program, with statements 1: x := ..., 2: y := ..., 3: y := ..., 4: p := ..., 5: x := y, 6: x := ..., 7: *p := ..., and 8: y := ....]

Solution: iterate!
Initialize all d_i to the empty set
Store all nodes onto a worklist
while worklist is not empty:
– remove node n from worklist
– apply flow function for node n
– update the appropriate d_i, and add nodes whose inputs have changed back onto worklist

Worklist algorithm
let m: map from edge to computed value at edge
let worklist: work list of nodes
for each edge e in CFG do
  m(e) := ∅
for each node n do
  worklist.add(n)
while (worklist.empty.not) do
  let n := worklist.remove_any;
  let info_in := m(n.incoming_edges);
  let info_out := F(n, info_in);
  for i := 0 .. info_out.length do
    if (m(n.outgoing_edges[i]) ≠ info_out[i])
      m(n.outgoing_edges[i]) := info_out[i];
      worklist.add(n.outgoing_edges[i].dst);
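A runnable sketch of this worklist loop, under simplifying assumptions: the CFG, the `gen` and `kill_var` tables, and the `flow` helper are hypothetical, and each node sends the same output to all of its outgoing edges (the pseudocode allows a separate value per edge).

```python
# Hypothetical CFG with a loop: entry -> a -> b -> a (back edge), b -> exit.
edges = [("entry", "a"), ("a", "b"), ("b", "a"), ("b", "exit")]
nodes = ["entry", "a", "b", "exit"]
gen = {"entry": set(), "a": {("x", "a")}, "b": {("y", "b")}, "exit": set()}
kill_var = {"entry": None, "a": "x", "b": "y", "exit": None}

def flow(n, infos_in):
    """Merge incoming facts, kill old definitions of n's variable, add n's own."""
    merged = set().union(*infos_in)
    return {(v, s) for (v, s) in merged if v != kill_var[n]} | gen[n]

m = {e: set() for e in edges}          # m(e) := empty set
worklist = list(nodes)                 # every node starts on the worklist
while worklist:
    n = worklist.pop()                 # plays the role of remove_any
    info_in = [m[e] for e in edges if e[1] == n]
    info_out = flow(n, info_in)
    for e in [e for e in edges if e[0] == n]:
        if m[e] != info_out:
            m[e] = info_out
            if e[1] not in worklist:   # avoid duplicate entries (an added detail)
                worklist.append(e[1])

print(sorted(m[("b", "exit")]))  # [('x', 'a'), ('y', 'b')]
```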

Issues with worklist algorithm

Two issues with worklist algorithm
Ordering
– In what order should the original nodes be added to the worklist?
– In what order should nodes be removed from the worklist?
Does this algorithm terminate?

Order of nodes
Topological order, assuming back-edges have been removed
Reverse depth-first order
Use an ordered worklist
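One common reading of "reverse depth-first order" is reverse postorder of a depth-first search from the entry node; that interpretation, the function name, and the graph below are assumptions for illustration. Ignoring back edges, reverse postorder places each node after its predecessors.

```python
def reverse_postorder(succs, entry):
    """Return the nodes reachable from entry in reverse depth-first postorder."""
    seen, post = set(), []

    def dfs(n):
        seen.add(n)
        for s in succs.get(n, []):
            if s not in seen:
                dfs(s)
        post.append(n)   # n finishes after everything reachable from it

    dfs(entry)
    return post[::-1]

# Hypothetical CFG with a back edge b -> a.
succs = {"entry": ["a"], "a": ["b"], "b": ["a", "exit"], "exit": []}
order = reverse_postorder(succs, "entry")
print(order)  # ['entry', 'a', 'b', 'exit']
```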

Termination
Why is termination important?
Can we stop the algorithm in the middle and just say we’re done?
No: we need to run it to completion, otherwise the results are not safe.

Termination
Assuming we’re doing reaching defs, let’s try to guarantee that the worklist loop terminates, regardless of what the flow function F does
while (worklist.empty.not) do
  let n := worklist.remove_any;
  let info_in := m(n.incoming_edges);
  let info_out := F(n, info_in);
  for i := 0 .. info_out.length do
    if (m(n.outgoing_edges[i]) ≠ info_out[i])
      m(n.outgoing_edges[i]) := info_out[i];
      worklist.add(n.outgoing_edges[i].dst);

Termination
Assuming we’re doing reaching defs, let’s try to guarantee that the worklist loop terminates, regardless of what the flow function F does
while (worklist.empty.not) do
  let n := worklist.remove_any;
  let info_in := m(n.incoming_edges);
  let info_out := F(n, info_in);
  for i := 0 .. info_out.length do
    let new_info := m(n.outgoing_edges[i]) ∪ info_out[i];
    if (m(n.outgoing_edges[i]) ≠ new_info)
      m(n.outgoing_edges[i]) := new_info;
      worklist.add(n.outgoing_edges[i].dst);
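To see why the added union matters, here is a small sketch with a deliberately badly behaved flow function on a single node with a self-loop. The function `weird_flow`, the fact names, and the step budget are all hypothetical; the point is only that the stored value can never shrink once the union is in place.

```python
ALL = frozenset({"d1", "d2"})

def weird_flow(info_in):
    """A deliberately badly behaved flow function: it complements its input."""
    return ALL - info_in

def run(use_join, max_steps=20):
    """Iterate a one-node self-loop; return the step at which the value
    stabilizes, or None if it is still changing when the budget runs out."""
    m = frozenset()                      # value on the loop edge
    for step in range(1, max_steps + 1):
        out = weird_flow(m)
        new = (m | out) if use_join else out
        if new == m:
            return step
        m = new
    return None

print(run(use_join=False), run(use_join=True))  # None 2
```

Without the union the value oscillates between the empty set and the full set; with it, the value only grows and must stop once nothing new can be added.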

Structure of the domain
We’re using the structure of the domain outside of the flow functions
In general, it’s useful to have a framework that formalizes this structure
We will use lattices

Background material

Relations
A relation over a set S is a set R ⊆ S × S
– We write a R b for (a, b) ∈ R
A relation R is:
– reflexive iff ∀ a ∈ S. a R a
– transitive iff ∀ a, b, c ∈ S. a R b ∧ b R c ⇒ a R c
– symmetric iff ∀ a, b ∈ S. a R b ⇒ b R a
– anti-symmetric iff ∀ a, b ∈ S. a R b ∧ b R a ⇒ a = b
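For a finite set, these four properties can be checked by brute force. The sketch below represents a relation as a Python set of pairs; the function names and the example set are illustrative choices.

```python
from itertools import product

def is_reflexive(S, R):
    return all((a, a) in R for a in S)

def is_transitive(S, R):
    return all((a, c) in R
               for a, b, c in product(S, repeat=3)
               if (a, b) in R and (b, c) in R)

def is_symmetric(S, R):
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(S, R):
    return all(a == b for (a, b) in R if (b, a) in R)

# "a divides b" on a small set of positive integers.
S = {1, 2, 3, 4, 6, 12}
divides = {(a, b) for a, b in product(S, repeat=2) if b % a == 0}
print(is_reflexive(S, divides), is_transitive(S, divides),
      is_symmetric(S, divides), is_antisymmetric(S, divides))
# True True False True
```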

Partial orders
An equivalence relation is a relation that is:
– reflexive, transitive, symmetric
A partial order is a relation that is:
– reflexive, transitive, anti-symmetric
A partially ordered set (a poset) is a pair (S, ≤) of a set S and a partial order ≤ over the set
Examples of posets: (2^S, ⊆), (Z, ≤), (N, divides)

Lub and glb
Given a poset (S, ≤) and two elements a ∈ S and b ∈ S:
– the least upper bound (lub) is an element c such that a ≤ c, b ≤ c, and ∀ d ∈ S. (a ≤ d ∧ b ≤ d) ⇒ c ≤ d
– the greatest lower bound (glb) is an element c such that c ≤ a, c ≤ b, and ∀ d ∈ S. (d ≤ a ∧ d ≤ b) ⇒ d ≤ c
lub and glb don’t always exist.
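A direct transcription of these definitions for a finite poset, as a sketch. The helper names and the example sets are assumptions; the last call shows a poset in which a pair of elements has no lub.

```python
def lub(S, leq, a, b):
    """Least upper bound of a and b in the finite poset (S, leq), or None."""
    uppers = [c for c in S if leq(a, c) and leq(b, c)]
    least = [c for c in uppers if all(leq(c, d) for d in uppers)]
    return least[0] if least else None

def glb(S, leq, a, b):
    """Greatest lower bound of a and b in the finite poset (S, leq), or None."""
    return lub(S, lambda x, y: leq(y, x), a, b)   # glb is lub in the reversed order

def divides(a, b):
    return b % a == 0

print(lub({1, 2, 3, 4, 6, 12}, divides, 4, 6))   # 12
print(glb({1, 2, 3, 4, 6, 12}, divides, 4, 6))   # 2
print(lub({2, 3}, divides, 2, 3))                # None
```

In the two-element poset {2, 3} ordered by divisibility, neither element is above the other, so 2 and 3 have no upper bound at all.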

Lattices
A lattice is a tuple (S, ⊑, ⊥, ⊤, ⊔, ⊓) such that:
– (S, ⊑) is a poset
– ∀ a ∈ S. ⊥ ⊑ a
– ∀ a ∈ S. a ⊑ ⊤
– Every two elements from S have a lub and a glb
– ⊔ is the least upper bound operator, called a join
– ⊓ is the greatest lower bound operator, called a meet

Examples of lattices
Powerset lattice
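A small sketch of a powerset lattice ordered by ⊆, where ⊥ is the empty set, ⊤ is the whole set, join is union, and meet is intersection. The base set and the helper `powerset` are illustrative; the opposite orientation (ordering by ⊇) is equally valid as a lattice.

```python
from itertools import combinations

def powerset(A):
    """All subsets of A, as frozensets."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

A = {"a", "b", "c"}
S = powerset(A)
bottom, top = frozenset(), frozenset(A)
join = frozenset.union          # least upper bound under subset ordering
meet = frozenset.intersection   # greatest lower bound under subset ordering

x, y = frozenset({"a"}), frozenset({"b", "c"})
print(len(S), sorted(join(x, y)), sorted(meet(x, y)))
# 8 ['a', 'b', 'c'] []
```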

Examples of lattices
Boolean expressions

End of background material

Back to our example
We formalize our domain with a powerset lattice
What should be top and what should be bottom?
Does it matter?
– It matters because, as we’ve seen, there is a notion of approximation, and we will want this notion to show up in the lattice

Direction of lattice
Unfortunately:
– the dataflow analysis community has picked one direction
– the abstract interpretation community has picked the other
We will work with the abstract interpretation direction
Bottom is the most precise (optimistic) answer, Top the most imprecise (conservative)

Direction of lattice
Always safe to go up in the lattice
Can always set the result to ⊤
Hard to go down in the lattice
So... Bottom will be the empty set in reaching defs

Worklist algorithm using lattices
let m: map from edge to computed value at edge
let worklist: work list of nodes
for each edge e in CFG do
  m(e) := ⊥
for each node n do
  worklist.add(n)
while (worklist.empty.not) do
  let n := worklist.remove_any;
  let info_in := m(n.incoming_edges);
  let info_out := F(n, info_in);
  for i := 0 .. info_out.length do
    let new_info := m(n.outgoing_edges[i]) ⊔ info_out[i];
    if (m(n.outgoing_edges[i]) ≠ new_info)
      m(n.outgoing_edges[i]) := new_info;
      worklist.add(n.outgoing_edges[i].dst);

Termination of this algorithm?
For reaching definitions, it terminates... Why?
– lattice is finite
Can we loosen this requirement?
– Yes, we only require the lattice to have a finite height
Height of a lattice: length of the longest ascending or descending chain
Height of lattice (2^S, ⊆) = |S|
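The claim that the height of (2^S, ⊆) is |S| can be checked by brute force on a small set. This sketch counts height as the number of strict steps in the longest ascending chain (for example ∅ ⊂ {a} ⊂ {a, b} ⊂ {a, b, c} has three steps); the function name and that counting convention are assumptions.

```python
from functools import lru_cache
from itertools import combinations

def height_of_powerset(A):
    """Number of strict steps in the longest ascending chain of (2^A, subset)."""
    items = sorted(A)
    elems = [frozenset(c) for r in range(len(items) + 1)
             for c in combinations(items, r)]

    @lru_cache(maxsize=None)
    def longest_from(x):
        # Longest strictly ascending chain starting at x, counted in steps.
        return max((1 + longest_from(y) for y in elems if x < y), default=0)

    return longest_from(frozenset())

print(height_of_powerset({"a", "b", "c"}))  # 3
```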

Termination
Still, it’s annoying to have to perform a join in the worklist algorithm
It would be nice to get rid of it, if there is a property of the flow functions that would allow us to do so
while (worklist.empty.not) do
  let n := worklist.remove_any;
  let info_in := m(n.incoming_edges);
  let info_out := F(n, info_in);
  for i := 0 .. info_out.length do
    let new_info := m(n.outgoing_edges[i]) ⊔ info_out[i];
    if (m(n.outgoing_edges[i]) ≠ new_info)
      m(n.outgoing_edges[i]) := new_info;
      worklist.add(n.outgoing_edges[i].dst);

Even more formal
To reason more formally about termination and precision, we re-express our worklist algorithm mathematically
We will use fixed points to formalize our algorithm

Fixed points
Recall, we are computing m, a map from edges to dataflow information
Define a global flow function F as follows: F takes a map m as a parameter and returns a new map m’, in which the individual local flow functions have been applied

Fixed points
We want to find a fixed point of F, that is to say a map m such that m = F(m)
Approach to doing this?
Define ⊥̃, which is ⊥ lifted to be a map: ⊥̃ = λ e. ⊥
Compute F(⊥̃), then F(F(⊥̃)), then F(F(F(⊥̃))), ... until the result doesn’t change anymore
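The iteration described here can be sketched in a few lines. The two-edge global flow function `F`, the edge names, and the fact strings are hypothetical; `bottom_map` plays the role of ⊥ lifted to a map.

```python
def fixed_point(F, bottom):
    """Iterate F from bottom until the result stops changing."""
    m = bottom
    while True:
        nxt = F(m)
        if nxt == m:
            return m
        m = nxt

# Hypothetical global flow function over a map with two edges.
def F(m):
    return {
        "e1": m["e2"] | {"x@1"},   # edge e1 sees e2's facts plus a new definition
        "e2": m["e1"] | {"y@2"},   # edge e2 sees e1's facts plus a new definition
    }

bottom_map = {"e1": frozenset(), "e2": frozenset()}   # bottom lifted to a map
result = fixed_point(F, bottom_map)
print(sorted(result["e1"]), sorted(result["e2"]))
# ['x@1', 'y@2'] ['x@1', 'y@2']
```

This loop is only guaranteed to stop under conditions on F and the lattice; for an arbitrary F it may run forever.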

Fixed points
Formally: we would like the sequence F^i(⊥̃) for i = 0, 1, 2, ... to be increasing, so we can get rid of the outer join
Require that F be monotonic:
– ∀ a, b. a ⊑ b ⇒ F(a) ⊑ F(b)
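On a small finite lattice, monotonicity can be checked exhaustively. The sketch below uses a powerset lattice ordered by ⊆; the fact names and the two example functions (a gen/kill function and set complement) are illustrative assumptions.

```python
from itertools import combinations, product

def powerset(A):
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_monotonic(f, elems):
    """Check that a <= b implies f(a) <= f(b) over a finite powerset lattice."""
    return all(f(a) <= f(b) for a, b in product(elems, repeat=2) if a <= b)

FACTS = frozenset({"d1", "d2", "d3"})
elems = powerset(FACTS)

def gen_kill(s):
    return (s - {"d1"}) | {"d2"}     # kill d1, generate d2

def flip(s):
    return FACTS - s                 # set complement

print(is_monotonic(gen_kill, elems), is_monotonic(flip, elems))  # True False
```

Functions of the gen/kill form are always monotonic, while complement reverses the order and so fails the check.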

Fixed points

Back to termination
So if F is monotonic, we have what we want: finite height ⇒ termination, without the outer join
Also, if the local flow functions are monotonic, then the global flow function F is monotonic

Another benefit of monotonicity
Suppose Martians came to Earth and miraculously gave you a fixed point of F, call it fp. Then:

Another benefit of monotonicity We are computing the least fixed point...