CSE P501 – Compiler Construction

Slides:

Advertisements

Similar presentations

SSA and CPS CS153: Compilers Greg Morrisett. Monadic Form vs CFGs Consider CFG available exp. analysis: statement gen's kill's x:=v 1 p v 2 x:=v 1 p v.

Advertisements

1 SSA review Each definition has a unique name Each use refers to a single definition The compiler inserts  -functions at points where different control.

Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.

8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.

School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.

Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.

1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.

Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.

Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.

CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.

Some Properties of SSA Mooly Sagiv. Outline Why is it called Static Single Assignment form What does it buy us? How much does it cost us? Open questions.

Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.

Stanford University CS243 Winter 2006 Wei Li 1 SSA.

U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.

Program Representations. Representing programs Goals.

1 Constant Propagation for Loops with Factored Use-Def Chains Reporter ： Lai, Yen-Chang.

1 Data flow analysis Goal : collect information about how a procedure manipulates its data This information is used in various optimizations For example,

6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.

Common Sub-expression Elim Want to compute when an expression is available in a var Domain:

Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.

1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.

Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.

CS745: SSA© Seth Copen Goldstein & Todd C. Mowry Static Single Assignment.

U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Advanced Compilers CMPSCI 710 Spring 2003 Computing SSA Emery Berger University.

4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)

1 Intermediate representation Goals: encode knowledge about the program facilitate analysis facilitate retargeting facilitate optimization scanning parsing.

CS 201 Compiler Construction

1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.

Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.

Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.

Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...

1 CS 201 Compiler Construction Lecture 9 Static Single Assignment Form.

Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.

Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.

Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.

Precision Going back to constant prop, in what cases would we lose precision?

Dataflow Analysis Topic today Data flow analysis: Section 3 of Representation and Analysis Paper (Section 3) NOTE we finished through slide 30 on Friday.

U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.

U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Advanced Compilers CMPSCI 710 Spring 2003 Dominators, etc. Emery Berger University.

Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,

Program Representations. Representing programs Goals.

Dead Code Elimination This lecture presents the algorithm Dead from EaC2e, Chapter 10. That algorithm derives, in turn, from Rob Shillner’s unpublished.

CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.

1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.

Control Flow Analysis Compiler Baojian Hua

Introduction to SSA Data-flow Analysis Revisited – Static Single Assignment (SSA) Form Liberally Borrowed from U. Delaware and Cooper and Torczon Text.

Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.

1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.

Optimization Simone Campanoni

Loops Simone Campanoni

Single Static Assignment Intermediate Representation (or SSA IR) Many examples and pictures taken from Wikipedia.

Simone Campanoni SSA Simone Campanoni

Static Single Assignment

© Seth Copen Goldstein & Todd C. Mowry

Program Representations

Efficiently Computing SSA

Topic 10: Dataflow Analysis

Factored Use-Def Chains and Static Single Assignment Forms

CSC D70: Compiler Optimization Static Single Assignment (SSA)

Static Single Assignment Form (SSA)

Optimizations using SSA

Data Flow Analysis Compiler Design

Static Single Assignment

Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/4/2019 CPEG421-05S/Topic5.

Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/17/2019 CPEG421-05S/Topic5.

EECS 583 – Class 7 Static Single Assignment Form

CSC D70: Compiler Optimization Static Single Assignment (SSA)

CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019

Presentation transcript:

CSE P501 – Compiler Construction Overview of SSA IR Constructing SSA graphs SSA-based optimizations Converting back from SSA form Appel ch. 19 Cooper-Torczon sec. 9.3 Spring 2014 Jim Hogg - UW - CSE - P501

Def-Use (DU) Chains Common dataflow analysis problem: Find all sites where a variable is used ( =x ) Find all sites where that variable was last defined (x= ) Traditional solution: Def-Use (DU) chains – additional data structure on top of flowgraph Link each def (assigns-to) of a variable to all uses Use-Def (UD) chains Link each use of a variable to its def Spring 2014 Jim Hogg - UW - CSE - P501

Def-Use (DU) Chains Two DU chains intersect entry z>1 x=1 x=2 y=x+1 z=x-3 x=4 z=x+7 exit Spring 2014 Jim Hogg - UW - CSE - P501

DU-Chain Drawbacks Expensive if a typical variable has D defs and U uses, total cost is O(D*U) ie: O(n2) where n is number of statements in function Better if cost varied as O(n) Unrelated uses of same variable are mixed together. This complicates analysis. Eg: Variable v is used over and over; but uses are disjoint Register allocated might think it is live across all uses Spring 2014 Jim Hogg - UW - CSE - P501

SSA: Static Single Assignment IR form where each variable has only one def in the function Equivalent, in a source program, to never using same variable for different purposes static single definition. But that def can be in a loop that is executed dynamically many times In a loop, it really is the same variable each time, so makes sense Separates values from where they are stored in memory Before After: Manual, source SSA format int x = 0; while (x < a.length) a[x] = i+7; for(x = 42; x >= 0; --x) b[x] = a[x] * 3; int x = 0; while (x < a.length) a[x] = i+7; for(int x1 = 42; x1 >= 0; --x1) b[x1] = a[x1] * 3; Spring 2014 Jim Hogg - UW - CSE - P501

SSA in Basic Blocks We’ve seen this before when we looked at Value Numbering Original SSA a = x + y b = a – 1 a = y + b b = x * 4 a = a + b a1 = x0 + y0 b1 = a1 – 1 a2 = y0 + b1 b2 = x0 * 4 a3 = a2 + b2 Bump the SSA number each time a variable is def'd x0 is the value assigned before the current function - eg: input parameter Spring 2014 Jim Hogg - UW - CSE - P501

Merge Points Main issue Solution Meaning How to handle merge points in the flowgraph? Solution introduce a Φ-function a3 = Φ(a1, a2) Meaning a3 is assigned either a1 or a2 depending on which control path was used to reach the Φ-function Spring 2014 Jim Hogg - UW - CSE - P501

Example Original SSA b = M[x] a = 0 b1 = M[x0] a1 = 0 if b < 4 c = a + b Original b1 = M[x0] a1 = 0 if b1 < 4 a2 = b1 a3 = Φ(a1, a2) c1 = a3 + b1 SSA Spring 2014 Jim Hogg - UW - CSE - P501

How Does Φ know what to Pick? It doesn’t When we translate the program to executable form, we add code to copy either value to a common location on each incoming edge For analysis, all we may need to know is the connection of uses to defs – no need to “execute” anything After Before a1 = a = a1 a2 = a = a2 a1 = a2 = a3 = Φ(a1, a2) a3 = Φ(a1, a2) = a Spring 2014

Example With Loop Original SSA a1 = 0 a = 0 Notes: b = a + 1 c = c + b a = b * 2 if a < N return c Original a1 = 0 a3 = Φ(a1, a2) b1 = Φ(b0, b2) c2 = Φ(c0, c1) b2 = a3 + 1 c1 = c2 + b2 a2 = b2 * 2 if a2 < N return c1 SSA Notes: a0, b0, c0 are initial values of a, b, c b1 is dead – can delete later c is live on entry – either input parameter or uninitialized Spring 2014 Jim Hogg - UW - CSE - P501

What does SSA 'buy' us? No need for DU or UD Implicit in SSA entry entry No need for DU or UD Implicit in SSA Compact representation z>1 z1>1 x=1 z>2 x=2 x1=1 z1>2 x2=2 y=x+1 z=x-3 x=4 y=x1+1 x3=φ(x1,x2) z2=x3-3 x4=4 z=x+7 z3=x4+7 exit exit SSA form is "recent" (80s) Prevalent in real compilers for {} languages Spring 2014

Converting To SSA Form Basic idea First, add Φ-functions Then rename all defs and uses of variables by adding subscripts Spring 2014 Jim Hogg - UW - CSE - P501

Inserting Φ-Functions Add Φ-functions for every variable in the function, at every join point? This is called "maximal SSA" But Wastes lots of memory Slows compile time Not needed in many cases Spring 2014 Jim Hogg - UW - CSE - P501

Path-convergence criterion Insert a Φ-function for variable a at block B3 when: There are blocks B1 and B2, both containing definitions of a, and B1  B2 There are paths from B1 to B3 and from B2 to B3 These paths have no common nodes other than B3 B3 is not in both paths prior to the end (eg: loop), altho' it may appear in one of them a1 = a2 = B1 B2 a3 = Φ(a1, a2) B3 Spring 2014

Details The start node of the flowgraph is considered to define every variable (a0, b0, etc) even if to “undefined” Each Φ-function itself defines a variable, so keep adding Φ-functions until things converge Spring 2014 Jim Hogg - UW - CSE - P501

Dominators and SSA One property of SSA is that defs dominate uses. More specifically: If an = Φ(… ai …) is in block B, then the definition of ai dominates the ith predecessor of B If a is used in a non-Φ statement in block B, then the definition of a dominates block B a1 = a2 a3 = Φ(a1, a2) Spring 2014 B

Dominance Frontier (1) To get a practical algorithm for placing Φ-functions, we need to avoid looking at all combinations of nodes leading to a join-node Instead, use the dominator tree in the flow graph Spring 2014 Jim Hogg - UW - CSE - P501

Dominance Frontier (2) Definitions x strictly dominates y if x dominates y and x  y x dom y && x  y => x sdom y The dominance frontier of x is the set of all nodes z such that x dominates a predecessor of z, but x does not dominate z Dominance Frontier is the border between dominated and undominated nodes; it passes thru the undominated set Leave x in all directions; stop when you hit undominated nodes x x dom {x, y} x sdom {y} z DF(x) y z Spring 2014 Jim Hogg - UW - CSE - P501

Example 1 2 5 9 3 10 11 7 6 12 4 8 13 Appel fig. 19.5. Spring 2014 Jim Hogg - UW - CSE - P501

Node 5 dominates ... 5 dom {5,6,7,8} 5 sdom {6,7,8} 1 2 5 9 3 10 11 7 12 Appel fig. 19.5. 4 8 13 Spring 2014 Jim Hogg - UW - CSE - P501

Node 5 Dominance Frontier ... DF(5) 1 5 dom {5,6,7,8} 5 sdom {6,7,8} 2 x 5 9 3 10 11 7 6 12 4 8 Appel fig. 19.5. y DF(x) = {z |  y  pred(z), x dom y, x !dom z, x !sdom z } 13 z DF(5) excludes 5 itself (backedge 85) because 5 !sdmon 5 Spring 2014 Jim Hogg - UW - CSE - P501

Dominance Frontier Criterion If node x def's variable a, then every node in DF(x) needs a Φ-function for a Since the Φ-function itself is a def, this needs to be iterated until it reaches a fixed-point Theorem: this algorithm places exactly the same set of Φ-functions as the path criterion given previously Spring 2014 Jim Hogg - UW - CSE - P501

Placing Φ-Functions: Details Compute the DF for each node in the flowgraph Insert just enough Φ-functions to satisfy the criterion. Use a worklist algorithm to avoid revisiting nodes Walk the dominator tree and rename the different definitions of variable a to be a1, a2, a3, … Spring 2014 Jim Hogg - UW - CSE - P501

Efficient Dominator Tree Computation Goal: SSA makes optimizing compilers faster since we can find defs/uses without expensive bit-vector algorithms So, need to be able to compute SSA form quickly Computation of SSA from dominator trees is efficient, but… Spring 2014 Jim Hogg - UW - CSE - P501

Lengauer-Tarjan Algorithm Iterative, set-based algorithm for finding dominator trees is slow in worst case Lengauer-Tarjan is near linear time (in number of nodes) Uses depth-first spanning tree from start node of control flow graph See books for details Spring 2014 Jim Hogg - UW - CSE - P501

SSA Optimizations Given the SSA form, what can we do with it? First, what do we know? - what info is kept in the SSA graph? Spring 2014 Jim Hogg - UW - CSE - P501

SSA Data Structures Instruction Variable Block links to containing block, next and previous instructions, variables defined, variables used. Instruction kinds: ordinary, Φ-function, fetch, store, branch Variable link to def (instruction) and use sites Block list of contained instructions, ordered list of predecessors, successor(s) Spring 2014 Jim Hogg - UW - CSE - P501

Dead-Code Elimination (DCE) Many optimizations become easy to recognize in SSA form. Eg: A variable is live <=> list of uses is not empty To remove dead code: while (there is some variable v with no uses) { if (instruction that def's v has no side effects) { delete instruction } Need to remove this instruction from list of uses for its operands: may kill those variables too Spring 2014 Jim Hogg - UW - CSE - P501

Simple Constant Prop. In v = c, if c is a constant, then: Then update every use of variable v to use constant c instant In v = Φ(c1, c2 … cn), if all ci’s have same constant value c, replace instruction with v = c Incorporate copy prop, constant folding, and others in the same worklist algorithm Spring 2014 Jim Hogg - UW - CSE - P501

Simple Constant Propagation W = list of all instruction in SSA function while W is not empty { remove some instructions I from W if I is v = Φ(c, c, …, c), replace I with v = c if I is v = c { delete I from the function foreach instruction T that uses v { substitute c for v in T add T to W } Spring 2014 Jim Hogg - UW - CSE - P501

Converting Back from SSA Unfortunately, real machines do not include a Φ instruction So after analysis, optimization, and transformation, need to convert back to a “Φ-less” form for execution Spring 2014 Jim Hogg - UW - CSE - P501

Translating Φ-functions The meaning of x = Φ(x1, x2 … xn) is: set x = x1 if arriving on edge 1 set x = x2 if arriving on edge 2 etc So, foreach i, insert x = xi at the end of predecessor block i Rely on copy prop. and coalescing in register allocation to eliminate redundant moves Spring 2014 Jim Hogg - UW - CSE - P501

SSA Wrapup More details in recent compiler books (but not the new Dragon book!) Allows efficient implementation of many optimizations Used in many new compiler (eg: LLVM) & retrofitted into many older ones (GCC) Not a silver bullet – some optimizations still need non-SSA forms Spring 2014 Jim Hogg - UW - CSE - P501