CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019

Slides:



Advertisements
Similar presentations
SSA and CPS CS153: Compilers Greg Morrisett. Monadic Form vs CFGs Consider CFG available exp. analysis: statement gen's kill's x:=v 1 p v 2 x:=v 1 p v.
Advertisements

8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
P3 / 2004 Register Allocation. Kostis Sagonas 2 Spring 2004 Outline What is register allocation Webs Interference Graphs Graph coloring Spilling Live-Range.
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.
SSA.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
Some Properties of SSA Mooly Sagiv. Outline Why is it called Static Single Assignment form What does it buy us? How much does it cost us? Open questions.
Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.
Program Representations. Representing programs Goals.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
CS 536 Spring Global Optimizations Lecture 23.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
CS745: SSA© Seth Copen Goldstein & Todd C. Mowry Static Single Assignment.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Advanced Compilers CMPSCI 710 Spring 2003 Computing SSA Emery Berger University.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
CS 201 Compiler Construction
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
Class canceled next Tuesday. Recap: Components of IR Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements.
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Precision Going back to constant prop, in what cases would we lose precision?
CSE P501 – Compiler Construction
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.
Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Dead Code Elimination This lecture presents the algorithm Dead from EaC2e, Chapter 10. That algorithm derives, in turn, from Rob Shillner’s unpublished.
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.
Introduction to SSA Data-flow Analysis Revisited – Static Single Assignment (SSA) Form Liberally Borrowed from U. Delaware and Cooper and Torczon Text.
Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.
2/22/2016© Hal Perkins & UW CSEP-1 CSE P 501 – Compilers Register Allocation Hal Perkins Winter 2008.
Single Static Assignment Intermediate Representation (or SSA IR) Many examples and pictures taken from Wikipedia.
Introduction to Optimization
Data Flow Analysis Suman Jana
Register Allocation Hal Perkins Autumn 2009
Static Single Assignment
Efficiently Computing SSA
Topic 10: Dataflow Analysis
Introduction to Optimization
Instruction Scheduling Hal Perkins Summer 2004
Factored Use-Def Chains and Static Single Assignment Forms
University Of Virginia
Instruction Scheduling Hal Perkins Winter 2008
CS 201 Compiler Construction
CSC D70: Compiler Optimization Static Single Assignment (SSA)
Static Single Assignment Form (SSA)
Register Allocation Hal Perkins Summer 2004
Register Allocation Hal Perkins Autumn 2005
Optimizations using SSA
Data Flow Analysis Compiler Design
EECS 583 – Class 7 Static Single Assignment Form
Introduction to Optimization
Static Single Assignment
Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II
Instruction Scheduling Hal Perkins Autumn 2005
Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/4/2019 CPEG421-05S/Topic5.
Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/17/2019 CPEG421-05S/Topic5.
EECS 583 – Class 7 Static Single Assignment Form
SSA-based Optimizations
COMPILERS Liveness Analysis
Instruction Scheduling Hal Perkins Autumn 2011
Presentation transcript:

CSE P 501 – Compilers SSA Hal Perkins Autumn 2009 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Agenda Overview of SSA IR Constructing SSA graphs SSA-based optimizations Converting back from SSA form Source: Appel ch. 19, also an extended discussion in Cooper-Torczon sec. 9.3 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Def-Use (DU) Chains Common dataflow analysis problem: Find all sites where a variable is used, or find the definition site of a variable used in an expression Traditional solution: def-use chains – additional data structure on the dataflow graph Link each statement defining a variable to all statements that use it Link each use of a variable to its definition 5/31/2019 © 2002-09 Hal Perkins & UW CSE

DU-Chain Drawbacks Expensive: if a typical variable has N uses and M definitions, the total cost is O(N * M) Would be nice if cost were proportional to the size of the program Unrelated uses of the same variable are mixed together Complicates analysis 5/31/2019 © 2002-09 Hal Perkins & UW CSE

SSA: Static Single Assignment IR where each variable has only one definition in the program text This is a single static definition, but it may be in a loop that is executed dynamically many times 5/31/2019 © 2002-09 Hal Perkins & UW CSE

SSA in Basic Blocks Original SSA We’ve seen this before when looking at value numbering Original a := x + y b := a – 1 a := y + b b := x * 4 a := a + b SSA a1 := x + y b1 := a1 – 1 a2 := y + b1 b2 := x * 4 a3 := a2 + b2 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Merge Points The issue is how to handle merge points Solution: introduce a Φ-function a3 := Φ(a1, a2) Meaning: a3 is assigned either a1or a2 depending on which control path is used to reach the Φ-function 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Example Original SSA b := M[x] a := 0 b1 := M[x0] a1 := 0 if b < 4 c := a + b Original b1 := M[x0] a1 := 0 if b1 < 4 a2 := b1 a3 := Φ(a1, a2) c1 := a3 + b1 SSA 5/31/2019 © 2002-09 Hal Perkins & UW CSE

How Does Φ “Know” What to Pick? It doesn’t When we translate the program to executable form, we can add code to copy either value to a common location on each incoming edge For analysis, all we may need to know is the connection of uses to definitions – no need to “execute” anything 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Example With Loop SSA Original Notes: a3 := Φ(a1, a2) b1 := Φ(b0, b2) c2 := Φ(c0, c1) b2 := a3 + 1 c1 := c2 + b2 a2 := b2 * 2 if a2 < N return c1 SSA a := 0 b := a + 1 c := c + b a := b * 2 if a < N return c Original Notes: a0, b0, c0 are initial values of a, b, c on block entry b1 is dead – can delete later c is live on entry – either input parameter or uninitialized 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Converting To SSA Form Basic idea First, add Φ-functions Then, rename all definitions and uses of variables by adding subscripts 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Inserting Φ-Functions Could simply add Φ-functions for every variable at every join point(!) But Wastes way too much space and time Not needed 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Path-convergence criterion Insert a Φ-function for variable a at point z when There are blocks x and y, both containing definitions of a, and x  y There are nonempty paths from x to z and from y to z These paths have no common nodes other than z z is not in both paths prior to the end (it may appear in one of them) 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Details The start node of the flow graph is considered to define every variable (even if to “undefined”) Each Φ-function itself defines a variable, so we need to keep adding Φ-functions until things converge 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Dominators and SSA One property of SSA is that definitions dominate uses; more specifically: If x := Φ(…,xi,…) in block n, then the definition of x dominates the ith predecessor of n If x is used in a non-Φ statement in block n, then the definition of x dominates block n 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Dominance Frontier (1) To get a practical algorithm for placing Φ-functions, we need to avoid looking at all combinations of nodes leading from x to y Instead, use the dominator tree in the flow graph 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Dominance Frontier (2) Definitions x strictly dominates y if x dominates y and x  y The dominance frontier of a node x is the set of all nodes w such that x dominates a predecessor of w, but x does not strictly dominate w Essentially, the dominance frontier is the border between dominated and undominated nodes 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Example 1 2 5 9 3 10 11 7 6 Appel fig. 19.5. 12 4 8 13 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Dominance Frontier Cirterion If a node x contains the definition of variable a, then every node in the dominance frontier of x needs a Φ-function for a Since the Φ-function itself is a definition, this needs to be iterated until it reaches a fixed-point Theorem: this algorithm places exactly the same set of Φ-functions as the path criterion given previously 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Placing Φ-Functions: Details The basic steps are: Compute the dominance frontiers for each node in the flowgraph Insert just enough Φ-functions to satisfy the criterion. Use a worklist algorithm to avoid reexamining nodes unnecessarily Walk the dominator tree and rename the different definitions of variable a to be a1, a2, a3, … 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Efficient Dominator Tree Computation Goal: SSA makes optimizing compilers faster since we can find definitions/uses without expensive bit-vector algorithms So, need to be able to compute SSA form quickly Computation of SSA from dominator trees are efficient, but… 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Lengauer-Tarjan Algorithm Iterative set-based algorithm for finding dominator trees is slow in worst case Lengauer-Tarjan is near linear time Uses depth-first spanning tree from start node of control flow graph See books for details 5/31/2019 © 2002-09 Hal Perkins & UW CSE

SSA Optimizations Given the SSA form, what can we do with it? First, what do we know? (i.e., what information is kept in the SSA graph?) 5/31/2019 © 2002-09 Hal Perkins & UW CSE

SSA Data Structures Statement: links to containing block, next and previous statements, variables defined, variables used. Statement kinds are: ordinary, Φ-function, fetch, store, branch Variable: link to definition (statement) and use sites Block: List of contained statements, ordered list of predecessors, successor(s) 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Dead-Code Elimination A variable is live iff its list of uses is not empty(!) Algorithm to delete dead code: while there is some variable v with no uses if the statement that defines v has no other side effects, then delete it Need to remove this statement from the list of uses for its operand variables – which may cause those variables to become dead 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Simple Constant Propagation If c is a constant in v := c, any use of v can be replaced by c Then update every use of v to use constant c If the ci’s in v := Φ(c1, c2, …, cn) are all the same constant c, we can replace this with v := c Can also incorporate copy propagation, constant folding, and others in the same worklist algorithm 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Simple Constant Propagation W := list of all statements in SSA program while W is not empty remove some statement S from W if S is v:=Φ(c, c, …, c), replace S with v:=c if S is v:=c delete S from the program for each statement T that uses v substitute c for v in T add T to W 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Converting Back from SSA Unfortunately, real machines do not include a Φ instruction So after analysis, optimization, and transformation, need to convert back to a “Φ-less” form for execution 5/31/2019 © 2002-09 Hal Perkins & UW CSE

Translating Φ-functions The meaning of x := Φ(x1, x2, …, xn) is “set x := x1 if arriving on edge 1, set x:= x2 if arriving on edge 2, etc.” So, for each i, insert x := xi at the end of predecessor block i Rely on copy propagation and coalescing in register allocation to eliminate redundant moves 5/31/2019 © 2002-09 Hal Perkins & UW CSE

SSA Wrapup More details in recent compiler books (but not the new dragon book!) Allows efficient implementation of many optimizations Used in many new compiler (e.g. llvm) & retrofitted into many older ones (gcc) Not a silver bullet – some optimizations still need non-SSA forms 5/31/2019 © 2002-09 Hal Perkins & UW CSE