Building SSA Form, I 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at.

Slides:



Advertisements
Similar presentations
SSA and CPS CS153: Compilers Greg Morrisett. Monadic Form vs CFGs Consider CFG available exp. analysis: statement gen's kill's x:=v 1 p v 2 x:=v 1 p v.
Advertisements

1 SSA review Each definition has a unique name Each use refers to a single definition The compiler inserts  -functions at points where different control.
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Operator Strength Reduction From Cooper, Simpson, & Vick, “Operator Strength Reduction”, ACM TOPLAS, 23(5), See also § of EaC2e. 1COMP 512,
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs Data-flow Analysis – today’s class –Classic analyses.
SSA.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.
A Deeper Look at Data-flow Analysis Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.
SSA-Based Constant Propagation, SCP, SCCP, & the Issue of Combining Optimizations 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon,
The Last Lecture Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission.
Lazy Code Motion Comp 512 Spring 2011
Lazy Code Motion C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.
Advanced Compiler Design – Assignment 1 SSA Construction & Destruction Michael Fäs (Original Slides by Luca Della Toffola)
Introduction to Code Optimization Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
CS745: SSA© Seth Copen Goldstein & Todd C. Mowry Static Single Assignment.
Lecture #17, June 5, 2007 Static Single Assignment phi nodes Dominators Dominance Frontiers Dominance Frontiers when inserting phi nodes.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
CS 412/413 Spring 2007Introduction to Compilers1 Lecture 29: Control Flow Analysis 9 Apr 07 CS412/413 Introduction to Compilers Tim Teitelbaum.
Introduction to Optimization Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Instruction Scheduling II: Beyond Basic Blocks Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp.
CSE P501 – Compiler Construction
Global Common Subexpression Elimination with Data-flow Analysis Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Code Optimization, Part III Global Methods Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Structural Data-flow Analysis Algorithms: Allen-Cocke Interval Analysis Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students.
Introduction to Optimization, II Value Numbering & Larger Scopes Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Static Single Assignment.
Dominators, control-dependence and SSA form. Organization Dominator relation of CFGs –postdominator relation Dominator tree Computing dominator relation.
Using SSA Dead Code Elimination & Constant Propagation C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Building SSA Form, III 1COMP 512, Rice University This lecture presents the problems inherent in out- of-SSA translation and some ways to solve them. Copyright.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Advanced Compilers CMPSCI 710 Spring 2003 Dominators, etc. Emery Berger University.
Global Redundancy Elimination: Computing Available Expressions Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.
Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Generating SSA Form (mostly from Morgan). Why is SSA form useful? For many dataflow problems, SSA form enables sparse dataflow analysis that –yields the.
Building SSA Form (A mildly abridged account) For the full story, see the lecture notes for COMP 512 (lecture 8) and.
Terminology, Principles, and Concerns, III With examples from DOM (Ch 9) and DVNT (Ch 10) Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved.
Dead Code Elimination This lecture presents the algorithm Dead from EaC2e, Chapter 10. That algorithm derives, in turn, from Rob Shillner’s unpublished.
1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.
Introduction to SSA Data-flow Analysis Revisited – Static Single Assignment (SSA) Form Liberally Borrowed from U. Delaware and Cooper and Torczon Text.
Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Single Static Assignment Intermediate Representation (or SSA IR) Many examples and pictures taken from Wikipedia.
Definition-Use Chains
Introduction to Optimization
Building SSA Form (A mildly abridged account)
Efficiently Computing SSA
Topic 10: Dataflow Analysis
Introduction to Optimization
Factored Use-Def Chains and Static Single Assignment Forms
Building SSA Form COMP 512 Rice University Houston, Texas Fall 2003
Optimization through Redundancy Elimination: Value Numbering at Different Scopes COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith.
Static Single Assignment Form (SSA)
Optimizations using SSA
Dominator Tree First BB is the root node, each node
EECS 583 – Class 7 Static Single Assignment Form
Introduction to Optimization
Static Single Assignment
Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II
Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/4/2019 CPEG421-05S/Topic5.
Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.
Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/17/2019 CPEG421-05S/Topic5.
EECS 583 – Class 7 Static Single Assignment Form
The Partitioning Algorithm for Detecting Congruent Expressions COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper.
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Presentation transcript:

Building SSA Form, I 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Comp 512 Spring 2011

Previous Lectures Looked at iterative data-flow analysis  Round-robin and iterative algorithms  D OM, L IVE, A VAIL, V ERY B USY, C ONSTANTS, M OD, R EACHES  Looked at D EF -U SE chains Looked at solving constant propagation two ways  Sets in a traditional data-flow framework solved on the CFG  Adding value attributes to names and treating the D EF -U SE chains as a sparse evaluation graph Insights behind SSA  The right name space can simplify optimization (LVN, SVN)  Sparse evaluation graphs can reduce cost of analysis  Today: we can discover the spots where the analysis should compute new information COMP 512, Rice University2

3 Constant Propagation over D EF -U SE Chains Birth points x  x  a + b x  y - z x  13 z  x * q s  w - x Computes  of three values here Computes  of four values here Value is born here  y - z Value is born here  y - z  13 Value is born  y - z  13  a+b We should be able to compute the values that we need with fewer meet operations, if only we can find these birth points. Need to identify birth points Need to insert some artifact to force the evaluation to follow the birth points Enter Static Single Assignment form

COMP 512, Rice University4 Constant Propagation over D EF -U SE Chains Making Birth Points Explicit x 0  x 5  a + b x 1  y - z x 3  13 z  x 4 * q s  w - x 6 There are three birth points for x *

COMP 512, Rice University5 Constant Propagation over D EF -U SE Chains Making Birth Points Explicit x 0  x 5  a + b x 1  y - z x 3  13 z  x 4 * q s  w - x 6 x 2  Ø(x 1,x 0 ) x 4  Ø(x 3,x 2 ) x 6  Ø(x 5,x 4 ) Each needs a definition to reconcile the values of x Insert a ϕ -function at each birth point Rename values so each name is defined once Now, each use refers to one definition  Static Single Assignment Form *

COMP 512, Rice University6 DEF-USE chains on SSA form One def per use Meets occur at ø functions For C ONSTANTS, 3 binary  ’s Constant Propagation over D EF -U SE Chains Making Birth Points Explicit x 0  x 5  a + b x 1  y - z x 3  13 z  x 4 * q s  w - x 6 x 2  Ø(x 1,x 0 ) x 4  Ø(x 3,x 2 ) x 6  Ø(x 5,x 4 ) * How do we build SSA form? Simple algorithm 1. Insert a ϕ at each join point for each name 2. Rename to get single definition & single use This produces Correct SSA form More ϕ ’s than any other known algorithm for SSA construction The rest is optimization (!)

COMP 512, Rice University7 Building Static Single Assignment Form SSA-form Each name is defined exactly once Each use refers to exactly one name What’s hard Straight-line code is trivial Splits in the CFG are trivial Joins in the CFG are hard Building SSA Form Insert ϕ -functions at birth points of values Rename all values for uniqueness A ϕ -function is a special kind of copy that selects one of its parameters. The choice of parameter is governed by the CFG edge along which control reached the current block. Few machines implement a ϕ -function directly in hardware. y 1 ...y 2 ... y 3  Ø(y 1,y 2 ) *

COMP 512, Rice University8 SSA Construction Algorithm ( High-level sketch ) 1.Insert ϕ -functions 2.Rename values … that’s all... … of course, there is some bookkeeping to be done... *

COMP 512, Rice University9 SSA Construction Algorithm ( The simplest algorithm ) 1. Insert ϕ -functions at every join for every name 2. Solve reaching definitions 3. Rename each use to the def that reaches it ( will be unique ) What’s wrong with this approach Too many ϕ -functions ( precision ) Too many ϕ -functions ( space ) Too many ϕ -functions ( time ) Need to relate edges to ϕ -functions parameters ( bookkeeping ) To do better, we need a more complex approach Builds a version of SSA with the maximal number of ϕ - functions

COMP 512, Rice University10 SSA Construction Algorithm ( Detailed sketch ) 1. Insert ϕ -functions a.) calculate dominance frontiers b.) find global names for each name, build a list of blocks that define it c.) insert ϕ -functions  global name n  block b in which n is assigned  block d in b’s dominance frontier insert a ϕ -function for n in d add d to n’s list of defining blocks { Creates the iterated dominance frontier This adds to the worklist ! Use a checklist to avoid putting blocks on the worklist twice; keep another checklist to avoid inserting the same ϕ -function twice. Compute list of blocks where each name is assigned & use as a worklist Moderately complex *

COMP 512, Rice University11 SSA Construction Algorithm ( Detailed sketch ) 2. Rename variables in a pre-order walk over dominator tree (use an array of stacks, one stack per global name) Staring with the root block, b a.) generate unique names for each ϕ -function and push them on the appropriate stacks b.) rewrite each operation in the block i. Rewrite uses of global names with the current version (from the stack) ii. Rewrite definition by inventing & pushing new name c.) fill in ϕ -function parameters of successor blocks d.) recurse on b’s children in the dominator tree e.) pop names generated in b from stacks 1 counter per name for subscripts Need the end-of-block name for this path Reset the state *

COMP 512, Rice University12 SSA Construction Algorithm ( Low-level detail ) Computing Dominance First step in ϕ -function insertion computes dominance Recall that n dominates m iff n is on every path from n 0 to m  Every node dominates itself  n’s immediate dominator is its closest dominator, ID OM (n) † D OM (n 0 ) = { n 0 } D OM (n) = { n }  (  p  preds(n) D OM (p)) Computing DOM These equations form a rapid data-flow framework Iterative algorithm will solve them in d(G) + 3 passes  Each pass does N unions & E intersections,  E is O(N 2 ),  O(N 2 ) work † ID OM (n ) ≠ n, unless n is n 0, by convention. Initially, D OM (n) = N,  n≠n 0 Original idea: R.T. Prosser. “Applications of Boolean matrices to the analysis of flow diagrams,” Proceedings of the Eastern Joint Computer Conference, Spartan Books, New York, pages

COMP 512, Rice University13 Example B1B1 B2B2 B3B3 B4B4 B5B5 B6B6 B7B7 B0B0 Flow Graph Progress of iterative solution for D OM Results of iterative solution for D OM *

COMP 512, Rice University14 Example Dominance Tree Progress of iterative solution for D OM Results of iterative solution for D OM B1B1 B2B2 B3B3 B4B4 B5B5 B6B6 B7B7 B0B0 There are asymptotically faster algorithms. With the right data structures, the iterative algorithm can be made faster. See Cooper, Harvey, & Kennedy, on the web site.

Dominance Frontiers & Inserting ϕ -functions Where does an assignment in block n induce ϕ –functions? n DOM m ⇒ no need for a ϕ –function in m  Definition in n blocks any previous definition from reaching m If m has multiple predecessors, and n dominates one of them, but not all of them, then m needs a ϕ –function for each definition in n More formally, m is in the dominance frontier of n if and only if 1. ∃ p ∈ preds(m) such that n ∈ DOM(p), and 2. n does not strictly dominate m (n ∉ DOM(m) – { m }) This notion of dominance frontier is precisely what we need to insert ϕ –functions: a def in block n induces a ϕ –function in each block in DF(n). COMP 512, Rice University15 “strict” dominance allows a ϕ –function at the head of a single-block loop.

COMP 512, Rice University16 Example Dominance Frontiers Dominance Frontiers & ϕ -Function Insertion A definition at n forces a ϕ -function at m iff n  D OM (m) but n  D OM (p) for some p  preds(m) DF(n ) is fringe just beyond region n dominates B1B1 B2B2 B3B3 B4B4 B5B5 B6B6 B7B7 B0B0 *  in 1 forces ϕ -function in DF(1) = Ø (halt ) x ... x  ϕ (...) DF(4) is {6}, so  in 4 forces ϕ -function in 6 x  ϕ (...)  in 6 forces ϕ -function in DF(6) = {7} x  ϕ (...)  in 7 forces ϕ -function in DF(7) = {1} For each assignment, we insert the ϕ -functions

COMP 512, Rice University17 Example Dominance Frontiers Computing Dominance Frontiers Only join points are in DF(n) for some n Leads to a simple, intuitive algorithm for computing dominance frontiers For each join point x ( i.e., |preds(x)| > 1 ) For each CFG predecessor p of x Run from p to ID OM (x) in the dominator tree, & add x to DF(n) for each n from p up to but not ID OM (x) B1B1 B2B2 B3B3 B4B4 B5B5 B6B6 B7B7 B0B0 * For some applications, we need post-dominance, the post-dominator tree, and reverse dominance frontiers, RDF(n) > Just dominance on the reverse CFG > Reverse the edges & add unique exit node We will use these in dead code elimination

COMP 512, Rice University18 SSA Construction Algorithm ( Reminder ) 1.Insert ϕ -functions at every join for every name a.) calculate dominance frontiers b.) find global names for each name, build a list of blocks that define it c.) insert ϕ -functions  global name n  block b in which n is assigned  block d in b’s dominance frontier insert a ϕ -function for n in d add d to n’s list of defining blocks * Needs a little more detail Step 1.b is not in the original set of algorithms It produces an SSA form with fewer ϕ -functions

COMP 512, Rice University19 SSA Construction Algorithm Finding global names Different between two forms of SSA Minimal uses all names Semi-pruned uses names that are live on entry to some block  Shrinks name space & number of ϕ -functions  Pays for itself in compile-time speed For each “global name”, need a list of blocks where it is defined  Drives ϕ -function insertion  b defines x implies a ϕ -function for x in every c  DF(b) Pruned SSA adds a test to see if x is live at insertion point Otherwise, cannot need a ϕ -function Occasionally, building pruned is faster than building semi-pruned. Any algorithm that has non-linear behavior in the number of ϕ -functions will have a size where pruned is the SSA flavor of choice.

a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) y  a+b z  c+d i  i+1 B7B7 i > 100 i  B0B0 b  c  d  B2B2 a  Ø(a,a) b  Ø(b,b) c  Ø(c,c) d  Ø(d,d) i  Ø(i,i) a  c  B1B1 a  d  B3B3 B4B4 c  B5B5 d  Ø(d,d) c  Ø(c,c) b  B6B6 i > 100 With all the ϕ -functions Lots of new ops Renaming is next Assume a, b, c, & d defined before B 0 Example Excluding local names avoids Ø’s for y & z 20COMP 512, Rice University

21 SSA Construction Algorithm ( Less high-level sketch ) 2.Rename variables in a pre-order walk over dominator tree (use an array of stacks, one stack per global name) Staring with the root block, b a.) generate unique names for each ϕ -function and push them on the appropriate stacks b.) rewrite each operation in the block i. Rewrite uses of global names with the current version (from the stack) ii. Rewrite definition by inventing & pushing new name c.) fill in ϕ -function parameters of successor blocks d.) recurse on b’s children in the dominator tree e.) pop names generated in b from stacks 1 counter per name for subscripts Need the end-of-block name for this path Reset the state