Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,

Slides:



Advertisements
Similar presentations
8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
Advertisements

School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Operator Strength Reduction From Cooper, Simpson, & Vick, “Operator Strength Reduction”, ACM TOPLAS, 23(5), See also § of EaC2e. 1COMP 512,
Lecture 11: Code Optimization CS 540 George Mason University.
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
SSA.
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
Program Representations. Representing programs Goals.
Code Motion of Control Structures From the paper by Cytron, Lowry, and Zadeck, COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda.
SSA-Based Constant Propagation, SCP, SCCP, & the Issue of Combining Optimizations 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon,
The Last Lecture Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission.
Partial Redundancy Elimination & Lazy Code Motion
Lazy Code Motion C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.
Loop Invariant Code Motion — classical approaches — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students.
Introduction to Code Optimization Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
Code Shape III Booleans, Relationals, & Control flow Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
4/23/09Prof. Hilfinger CS 164 Lecture 261 IL for Arrays & Local Optimizations Lecture 26 (Adapted from notes by R. Bodik and G. Necula)
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
Intermediate Code. Local Optimizations
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
Introduction to Optimization Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Prof. Fateman CS164 Lecture 211 Local Optimizations Lecture 21.
Instruction Scheduling II: Beyond Basic Blocks Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp.
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
Global Common Subexpression Elimination with Data-flow Analysis Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Code Optimization, Part III Global Methods Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
What’s in an optimizing compiler?
Introduction to Optimization, II Value Numbering & Larger Scopes Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
First Principles (with examples from value numbering) C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Using SSA Dead Code Elimination & Constant Propagation C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Order from Chaos — the big picture — C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.
Operator Strength Reduction C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students.
1 Code optimization “Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code”
Global Redundancy Elimination: Computing Available Expressions Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.
Algebraic Reassociation of Expressions Briggs & Cooper, “Effective Partial Redundancy Elimination,” Proceedings of the ACM SIGPLAN 1994 Conference on Programming.
Dead Code Elimination This lecture presents the algorithm Dead from EaC2e, Chapter 10. That algorithm derives, in turn, from Rob Shillner’s unpublished.
Boolean & Relational Values Control-flow Constructs Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in.
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.
Terminology, Principles, and Concerns, II With examples from superlocal value numbering (Ch 8 in EaC2e) Copyright 2011, Keith D. Cooper & Linda Torczon,
Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.
3/2/2016© Hal Perkins & UW CSES-1 CSE P 501 – Compilers Optimizing Transformations Hal Perkins Autumn 2009.
Profile Guided Code Positioning C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.
Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.
Code Optimization Overview and Examples
Introduction to Optimization
Finding Global Redundancies with Hopcroft’s DFA Minimization Algorithm
Princeton University Spring 2016
Introduction to Optimization
Optimizing Transformations Hal Perkins Autumn 2011
Wrapping Up Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit.
Building SSA Form COMP 512 Rice University Houston, Texas Fall 2003
Code Shape III Booleans, Relationals, & Control flow
Optimizing Transformations Hal Perkins Winter 2008
Optimization through Redundancy Elimination: Value Numbering at Different Scopes COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith.
The Last Lecture COMP 512 Rice University Houston, Texas Fall 2003
Optimizations using SSA
Introduction to Optimization
Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II
Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.
Algebraic Reassociation of Expressions COMP 512 Rice University Houston, Texas Fall 2003 P. Briggs & K.D. Cooper, “Effective Partial Redundancy Elimination,”
The Partitioning Algorithm for Detecting Congruent Expressions COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper.
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Presentation transcript:

Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. This primary algorithm described in this lecture is due to Rob Shillner, when he worked in the MSCP group at Rice University. To our knowledge, the only published version of the algorithm is in Chapter 10 of EaC.

COMP 512, Fall Dead Code Elimination Three distinct problems Useless operations  Any operation whose value is not used in some visible way  Use the SSA-based mark/sweep algorithm (D EAD ) Useless control flow  Branches to branches, empty blocks  Simple CFG-based algorithm ( Shillner’s C LEAN ) Unreachable blocks  No path from n 0 to b  b cannot execute  Simple graph reachability problem

COMP 512, Fall Eliminating Useless Control Flow The Problem After optimization, the CFG can contain empty blocks “Empty” blocks still end with either a branch or a jump Produces jump to jump, which wastes time & space Need to simplify the CFG & eliminate these The Algorithm (C LEAN ) Use four distinct transformations Apply them in a carefully selected order Iterate until done Devised by Rob Shillner ( 1992 ), documented by John Lu ( 1994 ) We must carefully distinguish between these Branch is conditional Jump is absolute

COMP 512, Fall Eliminating Useless Control Flow Transformations B1B1 B2B2 B1B1 B2B2 Eliminating redundant branches Both sides of branch target B i Neither block must be empty Replace it with a jump to B i Simple rewrite of last op in B 1 How does this happen? Rewriting other branches How do we find it? Check each branch Branch, not a jump *

COMP 512, Fall Eliminating Useless Control Flow Transformations Merging an empty block Empty B 1 ends in a jump Coalesce B 1 with B 2 Move B 1 ’s incoming edges Eliminates extraneous jump Faster, smaller code How does this happen? Eliminate operations in B 1 How do we find it? Test for empty block Eliminating empty blocks B2B2 B1B1 B2B2 empty

COMP 512, Fall Eliminating Useless Control Flow TransformationsCoalescing blocks Neither block must be empty B 1 ends with a jump B 2 has 1 predecessor Combine the two blocks Eliminates a jump How does this happen? Simplifying edges out of B 1 How do we find it? Check target of jump |preds | Combining non-empty blocks B 1 B 2 B1B1 B2B2 B 1 and B 2 should be a single basic block If one executes, both execute, in linear order. *

COMP 512, Fall Eliminating Useless Control Flow TransformationsJump to a branch B 1 ends with jump, B 2 is empty Eliminates pointless jump Copy branch into end of B 1 Might make B 2 unreachable How does this happen? Eliminating operations in B 2 How do we find this? Jump to empty block Hoisting branches from empty blocks B1B1 B2B2 empty B1B1 B2B2

COMP 512, Fall Eliminating Useless Control Flow Putting the transformations together Process the blocks in postorder  Clean up B i ’s successors before B i  Simplifies implementation & understanding At each node, apply transformations in a fixed order  Eliminate redundant branch  Eliminate empty block  Merge block with successor  Hoist branch from empty successor May need to iterate  Postorder  unprocessed successors along back edges  Can bound iterations, but a deriving tight bound is hard  Must recompute postorder between iterations Creates a jump that should be checked in the same pass { Handle jump *  Eliminates an edge  Eliminates a node  Eliminates node & edge  Adds an edge

COMP 512, Fall Eliminating Useless Control Flow What about an empty loop? By itself, C LEAN cannot eliminate the loop Loop body branches to itself  Branch is not redundant  Doesn’t end with a jump  Hoisting does not help Key is to eliminate self-loop  Add a new transformation?  Then, B 1 merges with B 2 B0B0 B2B2 B1B1 * Targets two distinct blocks! B0B0 B2B2 B1B1 * B0B0 B2B2 B1B1 * New transformation must recognize that B 1 is empty. Presumably, it has code to test exit condition & (probably) increment an induction variable. This requires looking at code inside B 1 and doing some sophisticated pattern matching. This seems awfully complicated. *

COMP 512, Fall Eliminating Useless Control Flow What about an empty loop? How to eliminate ?  Pattern matching ?  Useless code elimination ? B0B0 B2B2 B1B1 *

COMP 512, Fall Eliminating Useless Control Flow What about an empty loop? How to eliminate ?  Pattern matching ?  Useless code elimination ? What does D EAD do to B 1 ?  Remember, it is empty  B 1 has only one exit  So, B 1  RDF(B 2 )  B 1 ’s branch is useless  D EAD rewrites it as a jump B0B0 B2B2 B1B1 *

COMP 512, Fall Eliminating Useless Control Flow What about an empty loop? How to eliminate ?  Pattern matching ?  Useless code elimination ? What does D EAD do to B 1 ?  Remember, it is empty  B 1 has only one exit  So, B 1  RDF(B 2 )  B 1 ’s branch is useless  D EAD rewrites it as a jump D EAD converts it to a form where C LEAN handles it ! B0B0 B2B2 B1B1 B0B0 B2B2 B1B1 D EAD

COMP 512, Fall Eliminating Useless Control Flow The Algorithm CleanPass() for each block i, in postorder if i ends in a branch then if both targets are identical then rewrite with a jump if i ends in a jump to j then if i is empty then merge i with j else if j has only one predecessor merge i with j else if j is empty & j has a branch then rewrite i’s jump with j’s branch Clean() until CFG stops changing compute postorder CleanPass() Summary Simple, structural algorithm Limited transformation set Cooperates with D EAD In practice, its quite fast

COMP 512, Fall The Problem Block with no entering edge Situation created by other optimizations The Cure Compute reachability & delete unreachable code Simple mark/sweep algorithm on CFG Mark during computation of postorder, reverse postorder... In M SCP, importing I LOC does this (every time) Eliminating Unreachable Code

COMP 512, Fall Dead Code Elimination Summary Useless Computations  D EAD Useless Control-flow  C LEAN Unreachable Blocks  Simple housekeeping Other Techniques Constant propagation can eliminate branches Algebraic identities eliminate some operations Redundancy elimination  Creates useless operations, or  Eliminates them

COMP 512, Fall Next lecture: Algebraic Reassociation Where are we? Machine Independent Redundancy Redundancy elim. Partial red. elim. Consolidation Code motion Loop-invariant c.m. Consolidation Global Scheduling [Click] Constant propagation Useless code Dead code elim. Partial d.c.e. Constant propagation Algebraic identities Create opportunities Reassociation Replication Specialization Replication Strength Reduction Method caching Heap  stack allocation Tail recursion elimination *

COMP 512, Fall The Lab Due end of semester You may work in teams of two people Either  Build SSA and implement 1 SSA-based optimization, or  Use our SSA implementation and build 2 optimizations (1 must be based on SSA) D EAD is too easy, so D EAD and C LEAN count as 1 optimization  MSCP infrastructure already removes unreachable code Possible optimizations  Constant propagation, strength reduction, lazy code motion, D EAD /C LEAN, D VNT, A WZ partitioning, others to be negotiated...