Static Single Assignment Form in the COINS Compiler Infrastructure Masataka Sassa, Toshiharu Nakaya, Masaki Kohama, Takeaki Fukuoka and Masahito Takahashi.

Slides:



Advertisements
Similar presentations
Topic G: Static Single-Assignment Form José Nelson Amaral
Advertisements

Static Single-Assignment ? ? Introduction: Over last few years [1991] SSA has been Stablished as… Intermediate program representation.
SSA and CPS CS153: Compilers Greg Morrisett. Monadic Form vs CFGs Consider CFG available exp. analysis: statement gen's kill's x:=v 1 p v 2 x:=v 1 p v.
Comparison and Evaluation of Back Translation Algorithms for Static Single Assignment Form Masataka Sassa #, Masaki Kohama + and Yo Ito # # Dept. of Mathematical.
8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
Synopsys University Courseware Copyright © 2012 Synopsys, Inc. All rights reserved. Compiler Optimization and Code Generation Lecture - 3 Developed By:
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
6. Intermediate Representation
P3 / 2004 Register Allocation. Kostis Sagonas 2 Spring 2004 Outline What is register allocation Webs Interference Graphs Graph coloring Spilling Live-Range.
1 Authors: Vugranam C. Sreedhar, Roy Dz-Ching Ju, David M. Gilles and Vatsa Santhanam Reader: Pushpinder Kaur Chouhan Translating Out of Static Single.
Course Outline Traditional Static Program Analysis Software Testing
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs Data-flow Analysis – today’s class –Classic analyses.
Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.
SSA.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
Register Allocation CS 671 March 27, CS 671 – Spring Register Allocation - Motivation Consider adding two numbers together: Advantages: Fewer.
Program Representations. Representing programs Goals.
AUTOMATIC GENERATION OF CODE OPTIMIZERS FROM FORMAL SPECIFICATIONS Vineeth Kumar Paleri Regional Engineering College, calicut Kerala, India. (Currently,
Introduction to Advanced Topics Chapter 1 Mooly Sagiv Schrierber
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
TM Pro64™: Performance Compilers For IA-64™ Jim Dehnert Principal Engineer 5 June 2000.
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
Recap from last time Saw several examples of optimizations –Constant folding –Constant Prop –Copy Prop –Common Sub-expression Elim –Partial Redundancy.
Cpeg421-08S/final-review1 Course Review Tom St. John.
Improving code generation. Better code generation requires greater context Over expressions: optimal ordering of subtrees Over basic blocks: Common subexpression.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Advanced Compilers CMPSCI 710 Spring 2003 Computing SSA Emery Berger University.
1 Intermediate representation Goals: encode knowledge about the program facilitate analysis facilitate retargeting facilitate optimization scanning parsing.
1/20 Data Communication Estimation and Reduction for Reconfigurable Systems Adam Kaplan Philip Brisk Ryan Kastner Computer Science Elec. and Computer Engineering.
CMPUT Compiler Design and Optimization1 CMPUT680 - Winter 2006 Topic H: SSA for Predicated Code José Nelson Amaral
Register Allocation (via graph coloring)
Register Allocation (via graph coloring). Lecture Outline Memory Hierarchy Management Register Allocation –Register interference graph –Graph coloring.
Intermediate Code. Local Optimizations
1 Liveness analysis and Register Allocation Cheng-Chia Chen.
Improving Code Generation Honors Compilers April 16 th 2002.
Prof. Fateman CS164 Lecture 211 Local Optimizations Lecture 21.
Register Allocation and Spilling via Graph Coloring G. J. Chaitin IBM Research, 1982.
Optimizing Compilers Nai-Wei Lin Department of Computer Science and Information Engineering National Chung Cheng University.
CSE P501 – Compiler Construction
Flex Compiler Compiler Case Study By Mee Ka Chang.
Static Single Assignment Form in the COINS Compiler Infrastructure - Current Status and Background - Masataka Sassa, Toshiharu Nakaya, Masaki Kohama (Tokyo.
1 Code optimization “Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code”
Building SSA Form, III 1COMP 512, Rice University This lecture presents the problems inherent in out- of-SSA translation and some ways to solve them. Copyright.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Static Single Assignment John Cavazos.
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
Synopsys University Courseware Copyright © 2012 Synopsys, Inc. All rights reserved. Compiler Optimization and Code Generation Lecture - 1 Developed By:
Detecting Equality of Variables in Programs Bowen Alpern, Mark N. Wegman, F. Kenneth Zadeck Presented by: Abdulrahman Mahmoud.
Dead Code Elimination This lecture presents the algorithm Dead from EaC2e, Chapter 10. That algorithm derives, in turn, from Rob Shillner’s unpublished.
Hy-C A Compiler Retargetable for Single-Chip Heterogeneous Multiprocessors Philip Sweany 8/27/2010.
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.
PLC '06 Experience in Testing Compiler Optimizers Using Comparison Checking Masataka Sassa and Daijiro Sudo Dept. of Mathematical and Computing Sciences.
CS412/413 Introduction to Compilers and Translators April 2, 1999 Lecture 24: Introduction to Optimization.
Optimization Simone Campanoni
Credible Compilation With Pointers Martin Rinard and Darko Marinov Laboratory for Computer Science Massachusetts Institute of Technology.
Single Static Assignment Intermediate Representation (or SSA IR) Many examples and pictures taken from Wikipedia.
High-level optimization Jakub Yaghob
Code Optimization.
Introduction to Advanced Topics Chapter 1 Text Book: Advanced compiler Design implementation By Steven S Muchnick (Elsevier)
Static Single Assignment
Intermediate Representations
Intermediate Representations
Optimizations using SSA
Building SSA Harry Xu CS 142 (b) 04/22/2018.
COINS‥ a COmpiler INfraStructure
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Presentation transcript:

Static Single Assignment Form in the COINS Compiler Infrastructure Masataka Sassa, Toshiharu Nakaya, Masaki Kohama, Takeaki Fukuoka and Masahito Takahashi (Tokyo Institute of Technology)

0. COINS infrastructure and the SSA form 1. Current status of optimization using SSA form in COINS infrastructure 2. A comparison of two major algorithms for translating from normal form into SSA form 3. A comparison of two major algorithms for translating back from SSA form into normal form Outline Background Static single assignment (SSA) form facilitates compiler optimizations. Compiler infrastructure facilitates compiler development.

0. COINS infrastructure and Static Single Assignment Form (SSA Form)

COINS compiler infrastructure Multiple source languages Retargetable Two intermediate form, HIR and LIR Optimizations Parallelization C generation, source-to- source translation Written in Java 2000~ developed by Japanese institutions under Grant of the Ministry High Level Intermediate Representation (HIR) Basic analyzer & optimizer HIR to LIR Low Level Intermediate Representation (LIR) SSA optimizer Code generator C frontend Advanced optimizer Basic parallelizer frontend C generation SPARC SIMD parallelizer x86 New machine Fortran C New language COpenMP Fortran frontend C generation C

1: a = x + y 2: a = a + 3 3: b = x + y Static Single Assignment (SSA) Form (a)Normal (conventional) form (source program or internal form) 1: a 1 = x 0 + y 0 2: a 2 = a : b 1 = x 0 + y 0 (b) SSA form SSA form is a recently proposed internal representation where each use of a variable has a single definition point. Indices are attached to variables so that their definitions become unique.

1: a = x + y 2: a = a + 3 3: b = x + y Optimization in Static Single Assignment (SSA) Form (a) Normal form 1: a 1 = x 0 + y 0 2: a 2 = a : b 1 = x 0 + y 0 (b) SSA form 1: a 1 = x 0 + y 0 2: a 2 = a : b 1 = a 1 (c) After SSA form optimization 1: a1 = x0 + y0 2: a2 = a : b1 = a1 (d) Optimized normal form SSA translation Optimization in SSA form (common subexpression elimination) SSA back translation SSA form is becoming increasingly popular in compilers, since it is suited for clear handling of dataflow analysis and optimization.

x1 = 1x2 = 2 x3 =   (x1;L1, x2:L2) … = x3 x = 1x = 2 … = x L3 L2L1 L2 L3 (b) SSA form(a) Normal form Translating into SSA form (SSA translation)

x1=… = x1 y1=… z1=… x2=… = x2 y2=… z2=… = y1 = y2 x3=  (x1,x2) y3=  (y1,y2) z3=  (z1,z2) = z3 x1=… = x1 y1=… z1=… x2=… = x2 y2=… z2=… = y1 = y2 y3=  (y1,y2) z3=  (z1,z2) = z3 x1=… = x1 y1=… z1=… x2=… = x2 y2=… z2=… = y1 = y2 z3=  (z1,z2) = z3 x =… = x y =… z =… x =… = x y =… z =… = y = z Minimal SSA form Semi-pruned SSA formPruned SSA form Normal form Translating into SSA form (SSA translation)

x1 = 1x2 = 2 x3 =   (x1;L1, x2:L2) … = x3 x1 = 1 x3 = x1 x2 = 2 x3 = x2 … = x3 L3 L2L1 L2 L3 (a) SSA form(b) Normal form Translating back from SSA form (SSA back translation)

1. SSA form module in the COINS compiler infrastructure

High Level Intermediate Representation (HIR) Basic analyzer & optimizer HIR to LIR Low Level Intermediate Representation (LIR) SSA optimizer Code generator C frontend Advanced optimizer Basic parallelizer frontend C generation SPARC SIMD parallelizer x86 New machine Fortran C New language COpenMP Fortran frontend COINS compiler infrastructure C generation C

Low level Intermediate Representation (LIR) SSA optimization module Code generation Source program object code LIR to SSA translation (3 variations) LIR in SSA SSA basic optimization com subexp elimination copy propagation cond const propagation dead code elimination Optimized LIR in SSA SSA to LIR back translation (2 variations) + 2 coalescing 12,000 lines transformation on SSA copy folding dead phi elim edge splitting SSA optimization module in COINS

Outline of SSA module in COINS Translation into and back from SSA form on Low Level Intermediate Representation (LIR) ‐ SSA translation: Use dominance frontier [Cytron et al. 91] ‐ SSA back translation: [Sreedhar et al. 99] ‐ Basic optimization on SSA form: dead code elimination, copy propagation, common subexpression elimination, conditional constant propagation Useful transformation as an infrastructure for SSA form optimization ‐ Copy folding at SSA translation time, critical edge removal on control flow graph … ‐ Each variation and transformation can be made selectively Preliminary result ‐ 1.43 times faster than COINS w/o optimization ‐ 1.25 times faster than gcc w/o optimization

2. A comparison of two major algorithms for SSA translation Algorithm by Cytron [1991] Dominance frontier Algorithm by Sreedhar [1995] DJ-graph Comparison made to decide the algorithm to be included in COINS

x1 = 1x2 = 2 x3 =   (x1;L1, x2:L2) … = x3 x = 1x = 2 … = x L3 L2L1 L2 L3 (b) SSA form(a) Normal form Translating into SSA form (SSA translation)

No. of nodes of control flow graph Translation time (milli sec) Cytron Sreedhar Usual programs (The gap is due to the garbage collection)

(b) ladder graph(a) nested loop Peculiar programs

No. of nodes of control flow graph Translation time (milli sec) Cytron Sreedhar Nested loop programs

No. of nodes of control flow graph Translation time (milli sec) Cytron Sreedhar Ladder graph programs

3. A comparison of two major algorithms for SSA back translation Algorithm by Briggs [1998] Insert copy statements Algorithm by Sreedhar [1999] Eliminate interference There have been no studies of comparison Comparison made on COINS

x1 = 1x2 = 2 x3 =   (x1;L1, x2:L2) … = x3 x1 = 1 x3 = x1 x2 = 2 x3 = x2 … = x3 L3 L2L1 L2 L3 (a) SSA form(b) Normal form Translating back from SSA form (SSA back translation)

x0 = 1 x1 =  (x0, x2) y = x1 x2 = 2 return y x0 = 1 x1 =  (x0, x2) x2 = 2 return x1 x0 = 1 x1 = x0 x2 = 2 x1 = x2 return x1 not correct block1 block3 block2 block1 block3 block2 block1 block3 block2 Copy propagation Back translation by na ï ve method Problems of naïve SSA back translation (lost copy problem)

x0 = 1 x1 =  (x0, x2) x2 = 2 return x1 x0 = 1 x1 = x0 x2 = 2 x1 = x2 return temp block1 block3 block2 block1 block3 block2 temp = x1 (a) SSA form(b) normal form after back translation live range of x1 live range of temp To remedy these problems... (i) SSA back translation algorithm by Briggs

x0 = 1 x1 =  (x0, x2) x2 = 2 return x1 live range of x0 x1 x2 x0 = 1 x1’ =  (x0, x2) x1 = x1’ x2 = 2 return x1 {x0, x1 ’, x2}  A x1 = A A = 2 A = 1 block1 block3 block2 block1 block3 block2 return x1 (a) SSA form(b) eliminating interference (c) normal form after back translation live range of x0 x1' x2 block1 block3 block2 (ii) SSA back translation algorithm by Sreedhar

SSA form BriggsBriggs + Coalescing Sreedhar Lost copy031 (1) Simple ordering052 (2) Swap075 (5)3 (3) Swap-lost0107 (7)4 (4) do096 (4)4 (2) fib040 (0) GCD095 (2) Selection Sort090 (0) Hige Swap083 (3)4 (4) No. of copies (no. of copies in loops) Empirical comparison of SSA back translation

Previous work: SSA form in compiler infrastructure SUIF (Stanford Univ.): no SSA form machine SUIF (Harvard Univ.): only one optimization in SSA form Scale (Univ. Massachusetts): a couple of SSA form optimizations. But it generates only C programs, and cannot generate machine code like in COINS. GCC: some attempts but experimental Only COINS will have full support of SSA form as a compiler infrastructure

Summary SSA form module of the COINS infrastructure Empirical comparison of algorithms for SSA translation gave criterion to make a good choice Empirical comparison of algorithms for SSA back translation clarified there is no single algorithm which gives optimal result Hope COINS and its SSA module help the compiler writer to compare/evaluate/add optimization methods