Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

Slides:



Advertisements
Similar presentations
SSA and CPS CS153: Compilers Greg Morrisett. Monadic Form vs CFGs Consider CFG available exp. analysis: statement gen's kill's x:=v 1 p v 2 x:=v 1 p v.
Advertisements

1 SSA review Each definition has a unique name Each use refers to a single definition The compiler inserts  -functions at points where different control.
8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
- 1 - Dominator Tree BB0 BB1 BB2BB3 BB4 BB6 BB7 BB5 BB0 BB1 BB2BB3 BB4 BB6 BB5 BB7 BBDOM0 10,1 20,1,2 30,1,3 BBDOM 40,1,3,4 50,1,3,5 60,1,3,6 70,1,7 Dom.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs Data-flow Analysis – today’s class –Classic analyses.
SSA.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.
Stanford University CS243 Winter 2006 Wei Li 1 SSA.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.
Advanced Compiler Design – Assignment 1 SSA Construction & Destruction Michael Fäs (Original Slides by Luca Della Toffola)
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
1 CS 201 Compiler Construction Lecture 5 Code Optimizations: Copy Propagation & Elimination.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
CS745: SSA© Seth Copen Goldstein & Todd C. Mowry Static Single Assignment.
Code Generation Professor Yihjia Tsai Tamkang University.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Advanced Compilers CMPSCI 710 Spring 2003 Computing SSA Emery Berger University.
Lecture #17, June 5, 2007 Static Single Assignment phi nodes Dominators Dominance Frontiers Dominance Frontiers when inserting phi nodes.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
2015/6/24\course\cpeg421-10F\Topic1-b.ppt1 Topic 1b: Flow Analysis Some slides come from Prof. J. N. Amaral
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
CS 412/413 Spring 2007Introduction to Compilers1 Lecture 29: Control Flow Analysis 9 Apr 07 CS412/413 Introduction to Compilers Tim Teitelbaum.
1 CS 201 Compiler Construction Data Flow Analysis.
CSE P501 – Compiler Construction
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Static Single Assignment.
Building SSA Form, III 1COMP 512, Rice University This lecture presents the problems inherent in out- of-SSA translation and some ways to solve them. Copyright.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Advanced Compilers CMPSCI 710 Spring 2003 Dominators, etc. Emery Berger University.
Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Building SSA Form (A mildly abridged account) For the full story, see the lecture notes for COMP 512 (lecture 8) and.
Dead Code Elimination This lecture presents the algorithm Dead from EaC2e, Chapter 10. That algorithm derives, in turn, from Rob Shillner’s unpublished.
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.
Introduction to SSA Data-flow Analysis Revisited – Static Single Assignment (SSA) Form Liberally Borrowed from U. Delaware and Cooper and Torczon Text.
Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment John Cavazos.
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Building SSA Form, I 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2011 Data flow analysis John Cavazos University.
Single Static Assignment Intermediate Representation (or SSA IR) Many examples and pictures taken from Wikipedia.
Data Flow Analysis Suman Jana
Static Single Assignment
© Seth Copen Goldstein & Todd C. Mowry
Building SSA Form (A mildly abridged account)
Efficiently Computing SSA
Princeton University Spring 2016
Topic 10: Dataflow Analysis
Factored Use-Def Chains and Static Single Assignment Forms
Building SSA Form COMP 512 Rice University Houston, Texas Fall 2003
CSC D70: Compiler Optimization Static Single Assignment (SSA)
Optimization through Redundancy Elimination: Value Numbering at Different Scopes COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith.
Static Single Assignment Form (SSA)
Optimizations using SSA
Dominator Tree First BB is the root node, each node
Data Flow Analysis Compiler Design
EECS 583 – Class 7 Static Single Assignment Form
Static Single Assignment
EECS 583 – Class 8 Finish SSA, Start on Classic Optimization
Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/4/2019 CPEG421-05S/Topic5.
Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.
Reference These slides, with minor modification and some deletion, come from U. of Delaware – and the web, of course. 4/17/2019 CPEG421-05S/Topic5.
EECS 583 – Class 7 Static Single Assignment Form
CSC D70: Compiler Optimization Static Single Assignment (SSA)
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Optimizing Compilers CISC 673 Spring 2009 Data flow analysis
Presentation transcript:

Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II John Cavazos University of Delaware

What is SSA? Many data-flow problems have been formulated To limit number of analyses, use single analysis to perform multiple transformations  SSA Compiler optimization algorithms enhanced or enabled by SSA: Constant propagation Dead code elimination Global value numbering Partial redundancy elimination Register allocation

SSA Construction Algorithm (High-level sketch) 1. Insert  -functions 2. Rename values

SSA Construction Algorithm (Less high-level sketch) Insert  -functions at every join for every name Solve reaching definitions Rename each use to def that reaches it (will be unique) What’s wrong with this approach Too many  -functions! To do better, we need a more complex approach Builds maximal SSA

SSA Construction Algorithm (Less high-level sketch) 1. Insert  -functions a.) calculate dominance frontiers b.) find global names for each name, build list of blocks that define it c.) insert  -functions

Insert  -functions global name n worklist ← Block(n) // blocks in which n is assigned  block b ∈ worklist  block d in b’s dominance frontier insert a  -function for n in d add d to worklist

Computing Dominance Frontiers Only join points are in DF(n) for some n Simple algorithm for computing dominance frontiers For each join point x (i.e., |preds(x)| > 1) For each CFG predecessor of x Run up to IDOM(x ) in dominator tree, add x to DF(n) for each n between x and IDOM(x )

Dominance Frontiers B1 B2 B3 B4 B5 B6 B7 B0 B1 B2 B3 B4 B5 B6 B7 B0 For each join point x For each CFG pred of x Run to IDOM(x ) in dom tree, add x to DF(n) for each n between x and IDOM(x ) Flow Graph Dominance Tree n B0 B1 B2 B3 B4 B5 B6 B7 DOM(n) 0,1 0,1,2 0,1,3 0,1,3,4 0,1,3,5 0,1,3,6 0,1,7 IDOM(n) -- 1 3 DF(n) 7 6

Dominance Frontiers & -Function Insertion B1 B2 B3 B4 B5 B6 B7 B0 A definition at n forces a -function at m iff n  DOM(m) but n  DOM(p) for some p  preds(m) DF(n ) is fringe just beyond region n dominates x (...)  in 7 forces -function in DF(7) = {1} x ... x (...) DF(4) is {6}, so  in 4 forces -function in 6 x (...)  in 6 forces -function in DF(6) = {7} Dominance Frontiers  in 1 forces -function in DF(1) = Ø (halt ) For each assignment, we insert the  -functions

Excluding local names avoids ’s for y & z a  (a,a) b  (b,b) c  (c,c) d  (d,d) y  a+b z  c+d i  i+1 B7 i > 100 i  ... B0 b  ... c  ... d  ... B2 c   (c,c) i  (i,i) a  ... B1 B3 B4 B5 B6 With all the -functions Lots of new ops Renaming is next Assume a, b, c, & d defined before B0

SSA Construction Algorithm (Less high-level sketch) 2. Rename variables in a pre-order walk over dominator tree (uses counter and a stack per global name) Staring with the root block, b a.) generate unique names for each  -function and push them on the appropriate stacks

SSA Construction Algorithm (Less high-level sketch) Rename variables (cont’d) b.) rewrite each operation in the block i. Rewrite uses of global names with the current version (from the stack) ii. Rewrite definition by creating & pushing new name c.) fill in  -function parameters of successor blocks d.) recurse on b’s children in the dominator tree e.) <on exit from block b > pop names generated in b from stacks

SSA Construction Algorithm Adding all the details ... Rename(b) for each  -function in b, x  (…) rename x as NewName(x) for each operation “x  y op z” in b rewrite y as top(stack[y]) rewrite z as top(stack[z]) rewrite x as NewName(x) for each successor of b in the CFG rewrite appropriate  parameters for each successor s of b in dom. tree Rename(s) pop(stack[x]) for each global name i counter[i]  0 stack[i]   call Rename(n0) NewName(n) i  counter[n] counter[n]  counter[n] + 1 push ni onto stack[n] return ni

Before processing B0 Assume a, b, c, & d defined before B0 a  (a,a) b  (b,b) c  (c,c) d  (d,d) y  a+b z  c+d i  i+1 B7 i > 100 i  ... B0 b  ... c  ... d  ... B2 i  (i,i) a  ... B1 B3 B4 B5 B6 Before processing B0 Assume a, b, c, & d defined before B0 i has not been defined a b c d i Counters Stacks 1 1 1 1 a0 b0 c0 d0 20

End of B0 a b c d i Counters Stacks 1 1 1 1 1 21 a0 b0 c0 d0 i0 B0 a  (a0,a) b  (b0,b) c  (c0,c) d  (d0,d) i  (i0,i) a  ... c  ... B1 End of B0 b  ... c  ... d  ... B2 a  ... d  ... B3 d  ... B4 c  ... B5 d  (d,d) c  (c,c) b  ... B6 a b c d i Counters Stacks 1 1 1 1 1 B7 a  (a,a) b  (b,b) c  (c,c) d  (d,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 i > 100 21

End of B1 a b c d i Counters Stacks 3 2 3 2 2 22 a0 b0 c0 d0 i0 a1 b1 a1  (a0,a) b1  (b0,b) c1  (c0,c) d1  (d0,d) i1  (i0,i) a2  ... c2  ... B1 End of B1 b  ... c  ... d  ... B2 a  ... d  ... B3 d  ... B4 c  ... B5 d  (d,d) c  (c,c) b  ... B6 a b c d i Counters Stacks 3 2 3 2 2 B7 a  (a,a) b  (b,b) c  (c,c) d  (d,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 i > 100 22

End of B2 a b c d i Counters Stacks 3 3 4 3 2 23 a0 b0 c0 d0 i0 a1 b1 a1  (a0,a) b1  (b0,b) c1  (c0,c) d1  (d0,d) i1  (i0,i) a2  ... c2  ... B1 End of B2 b2  ... c3  ... d2  ... B2 a  ... d  ... B3 d  ... B4 c  ... B5 d  (d,d) c  (c,c) b  ... B6 a b c d i Counters Stacks 3 3 4 3 2 B7 a  (a2,a) b  (b2,b) c  (c3,c) d  (d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 b2 c2 d2 c3 i > 100 23

Before starting B3 a b c d i Counters Stacks 3 3 4 3 2 24 a0 b0 c0 d0 a1  (a0,a) b1  (b0,b) c1  (c0,c) d1  (d0,d) i1  (i0,i) a2  ... c2  ... B1 Before starting B3 b2  ... c3  ... d2  ... B2 a  ... d  ... B3 d  ... B4 c  ... B5 d  (d,d) c  (c,c) b  ... B6 a b c d i Counters Stacks 3 3 4 3 2 B7 a  (a2,a) b  (b2,b) c  (c3,c) d  (d2,d) y  a+b z  c+d i  i+1 i ≤ 100 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 i > 100 24

End of B3 a b c d i Counters Stacks 4 3 4 4 2 25 a0 b0 c0 d0 i0 a1 b1 a1  (a0,a) b1  (b0,b) c1  (c0,c) d1  (d0,d) i1  (i0,i) a2  ... c2  ... B1 End of B3 b2  ... c3  ... d2  ... B2 a3  ... d3  ... B3 d  ... B4 c  ... B5 d  (d,d) c  (c,c) b  ... B6 a b c d i Counters Stacks 4 3 4 4 2 B7 a  (a2,a) b  (b2,b) c  (c3,c) d  (d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 d3 a3 i > 100 25

End of B4 a b c d i Counters Stacks 4 3 4 5 2 26 a0 b0 c0 d0 i0 a1 b1 a1  (a0,a) b1  (b0,b) c1  (c0,c) d1  (d0,d) i1  (i0,i) a2  ... c2  ... B1 End of B4 b2  ... c3  ... d2  ... B2 a3  ... d3  ... B3 d4  ... B4 c  ... B5 d  (d4,d) c  (c2,c) b  ... B6 a b c d i Counters Stacks 4 3 4 5 2 B7 a  (a2,a) b  (b2,b) c  (c3,c) d  (d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 d3 a3 d4 i > 100 26

End of B5 a b c d i Counters Stacks 4 3 5 5 2 27 a0 b0 c0 d0 i0 a1 b1 a1  (a0,a) b1  (b0,b) c1  (c0,c) d1  (d0,d) i1  (i0,i) a2  ... c2  ... B1 End of B5 b2  ... c3  ... d2  ... B2 a3  ... d3  ... B3 d4  ... B4 c4  ... B5 d  (d4,d3) c  (c2,c4) b  ... B6 a b c d i Counters Stacks 4 3 5 5 2 B7 a  (a2,a) b  (b2,b) c  (c3,c) d  (d2,d) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 d3 a3 c4 i > 100 27

End of B6 a b c d i Counters Stacks 4 4 6 6 2 28 a0 b0 c0 d0 i0 a1 b1 a1  (a0,a) b1  (b0,b) c1  (c0,c) d1  (d0,d) i1  (i0,i) a2  ... c2  ... B1 End of B6 b2  ... c3  ... d2  ... B2 a3  ... d3  ... B3 d4  ... B4 c4  ... B5 d5  (d4,d3) c5  (c2,c4) b3  ... B6 a b c d i Counters Stacks 4 4 6 6 2 B7 a  (a2,a3) b  (b2,b3) c  (c3,c5) d  (d2,d5) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 b3 c2 d3 a3 c5 d5 i > 100 28

Before B7 a b c d i Counters Stacks 4 4 6 6 2 29 a0 b0 c0 d0 i0 a1 b1 a1  (a0,a) b1  (b0,b) c1  (c0,c) d1  (d0,d) i1  (i0,i) a2  ... c2  ... B1 Before B7 b2  ... c3  ... d2  ... B2 a3  ... d3  ... B3 d4  ... B4 c4  ... B5 d5  (d4,d3) c5  (c2,c4) b3  ... B6 a b c d i Counters Stacks 4 4 6 6 2 B7 a  (a2,a3) b  (b2,b3) c  (c3,c5) d  (d2,d5) y  a+b z  c+d i  i+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 c2 i > 100 29

End of B7 a b c d i Counters Stacks 5 5 7 7 3 30 a0 b0 c0 d0 i0 a1 b1 a1  (a0,a4) b1  (b0,b4) c1  (c0,c6) d1  (d0,d6) i1  (i0,i2) a2  ... c2  ... B1 End of B7 b2  ... c3  ... d2  ... B2 a3  ... d3  ... B3 d4  ... B4 c4  ... B5 d5  (d4,d3) c5  (c2,c4) b3  ... B6 a b c d i Counters Stacks 5 5 7 7 3 B7 a4  (a2,a3) b4  (b2,b3) c6  (c3,c5) d6  (d2,d5) y  a4+b4 z  c6+d6 i2  i1+1 a0 b0 c0 d0 i0 a1 b1 c1 d1 i1 a2 b4 c2 d6 i2 a4 c6 i > 100 30

After renaming Semi-pruned SSA form We’re done … B0 i0  ... i > 100 a1  (a0,a4) b1  (b0,b4) c1  (c0,c6) d1  (d0,d6) i1  (i0,i2) a2  ... c2  ... B1 After renaming Semi-pruned SSA form We’re done … b2  ... c3  ... d2  ... B2 a3  ... d3  ... B3 d4  ... B4 c4  ... B5 d5  (d4,d3) c5  (c2,c4) b3  ... B6 B7 a4  (a2,a3) b4  (b2,b3) c6  (c3,c5) d6  (d2,d5) y  a4+b4 z  c6+d6 i2  i1+1 Semi-pruned  only names live in 2 or more blocks are “global names”. i > 100 31

SSA Construction Algorithm (Pruned SSA) What’s this “pruned SSA” stuff? Minimal SSA still contains extraneous  -functions Inserts some -functions where they are dead Would like to avoid inserting them

SSA Construction Algorithm (Two Ideas) Semi-pruned SSA: discard names used in only one block Significant reduction in total number of  -functions Needs only local Live information (cheap to compute) Pruned SSA: only insert  -functions where their value is live Inserts even fewer  -functions, but costs more to do Requires global Live variable analysis (more expensive) In practice, both are simple modifications to step 1.

SSA Deconstruction At some point, we need executable code Few machines implement  operations Need to fix up the flow of values Basic idea Insert copies -function pred’s Simple algorithm Works in most cases Adds lots of copies Many of them coalesce away X17 (x10,x11) ...  x17 ... ...  x17 X17  x10 X17  x11