Cleaning up the CFG: Eliminating useless nodes & edges


Slide 1: Cleaning up the CFG: Eliminating useless nodes & edges

This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due to Rob Shillner, when he worked on the MSCP at Rice.

COMP 512, Rice University, Spring 2011

Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved.

Slide 2: Dead Code Elimination

Three distinct problems

Useless operations
  - Any operation whose value is not used in some visible way
  - Use the SSA-based mark/sweep algorithm (DEAD)
Useless control flow
  - Branches to branches, empty blocks
  - Simple CFG-based algorithm (Shillner’s CLEAN)
Unreachable blocks
  - No path from n0 to b ⇒ b cannot execute
  - Simple graph reachability problem

Slide 3: Eliminating Useless Control Flow

The Problem
  - After optimization, the CFG can contain empty blocks
  - “Empty” blocks still end with either a branch or a jump
  - Produces a jump to a jump, which wastes time & space
  - Need to simplify the CFG & eliminate these

The Algorithm (CLEAN)
  - Use four distinct transformations
  - Apply them in a carefully selected order
  - Iterate until done
  - Devised by Rob Shillner (1992), documented by John Lu (1994)

We must distinguish between a branch and a jump: a branch is conditional, a jump is absolute.

Slide 4: Eliminating Useless Control Flow

Transformation: eliminating redundant branches

[Figure: blocks B1 and B2, before and after the transformation]

  - Both sides of the branch target Bi
  - Neither block needs to be empty
  - Replace the branch with a jump to Bi
  - Simple rewrite of the last op in B1

How does this happen? Rewriting other branches.
How do we find it? Check each branch (a branch, not a jump).
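The rewrite on this slide can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the terminator encoding `("branch", taken, not_taken)` / `("jump", target)` and the name `fold_redundant_branch` are hypothetical, and the branch condition is left out because the transformation never inspects it.

```python
def fold_redundant_branch(term):
    """Rewrite a branch whose two targets coincide as a jump.

    Assumed (hypothetical) terminator encoding:
      ("branch", taken, not_taken)  conditional
      ("jump", target)              absolute
    Any other terminator is returned unchanged.
    """
    if term[0] == "branch" and term[1] == term[2]:
        return ("jump", term[1])
    return term


# Both sides target B2, so the branch becomes a jump to B2.
print(fold_redundant_branch(("branch", "B2", "B2")))
# Two distinct targets: the branch is left alone.
print(fold_redundant_branch(("branch", "B2", "B3")))
```

The result is a jump, which is why the later transformations that look for jumps should be tried on the same block in the same pass.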

Slide 5: Eliminating Useless Control Flow

Transformation: eliminating empty blocks (merging an empty block)

[Figure: empty block B1 and block B2, before and after the transformation]

  - Empty B1 ends in a jump
  - Coalesce B1 with B2
  - Move B1’s incoming edges
  - Eliminates extraneous jump
  - Faster, smaller code

How does this happen? Eliminating operations in B1.
How do we find it? Test for an empty block.
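A minimal Python sketch of this transformation follows. The CFG encoding (a dict from block name to `{"ops": [...], "term": (...)}`) and the function name `remove_empty_block` are assumptions made for illustration, not part of the lecture. The sketch also assumes the caller never passes the entry block, since removing it would require choosing a new entry.

```python
def remove_empty_block(cfg, b1):
    """Remove empty block b1 that ends in a jump, moving its incoming edges.

    Assumed (hypothetical) encoding: cfg[name] = {"ops": [...], "term": t},
    where t is ("jump", target), ("branch", taken, not_taken) or ("ret",).
    Returns True if the CFG changed.
    """
    term = cfg[b1]["term"]
    if cfg[b1]["ops"] or term[0] != "jump" or term[1] == b1:
        return False
    b2 = term[1]
    # Retarget every edge that entered b1 so that it enters b2 instead.
    for p in cfg:
        t = cfg[p]["term"]
        cfg[p]["term"] = (t[0],) + tuple(b2 if x == b1 else x for x in t[1:])
    del cfg[b1]
    return True


cfg = {
    "B0": {"ops": ["a"], "term": ("branch", "B1", "B3")},
    "B1": {"ops": [], "term": ("jump", "B2")},
    "B3": {"ops": ["b"], "term": ("jump", "B2")},
    "B2": {"ops": ["c"], "term": ("ret",)},
}
remove_empty_block(cfg, "B1")
print(cfg["B0"]["term"])
```

After the call, B0's branch targets B2 directly and B1 no longer exists.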

Slide 6: Eliminating Useless Control Flow

Transformation: combining non-empty blocks (coalescing blocks)

[Figure: blocks B1 and B2, before and after the transformation]

  - Neither block needs to be empty
  - B1 ends with a jump
  - B2 has 1 predecessor
  - Combine the two blocks
  - Eliminates a jump

How does this happen? Simplifying edges out of B1.
How do we find it? Check |preds| of the jump's target.

B1 and B2 should be a single basic block. If one executes, both execute, in linear order.
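The coalescing step might look like the following Python sketch. The CFG encoding and the name `coalesce` are hypothetical, and the predecessor list is recomputed by scanning the whole CFG, which is simple but not how a production compiler would do it. The sketch assumes B2 is not the entry block.

```python
def coalesce(cfg, b1):
    """Fold b1's jump target into b1 when b1 is its only predecessor.

    Assumed (hypothetical) encoding: cfg[name] = {"ops": [...], "term": t},
    where t is ("jump", target), ("branch", taken, not_taken) or ("ret",).
    Returns True if the CFG changed.
    """
    term = cfg[b1]["term"]
    if term[0] != "jump" or term[1] == b1:
        return False
    b2 = term[1]
    preds = [p for p in cfg if b2 in cfg[p]["term"][1:]]
    if preds != [b1]:
        return False
    # b2 can only run right after b1, so the two form one basic block.
    cfg[b1]["ops"] = cfg[b1]["ops"] + cfg[b2]["ops"]
    cfg[b1]["term"] = cfg[b2]["term"]
    del cfg[b2]
    return True


cfg = {
    "B1": {"ops": ["a"], "term": ("jump", "B2")},
    "B2": {"ops": ["b"], "term": ("ret",)},
}
coalesce(cfg, "B1")
print(cfg)
```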

Slide 7: Eliminating Useless Control Flow

Transformation: hoisting branches from empty blocks (jump to a branch)

[Figure: block B1 and empty block B2, before and after the transformation]

  - B1 ends with a jump, B2 is empty
  - Eliminates pointless jump
  - Copy branch into end of B1
  - Might make B2 unreachable

How does this happen? Eliminating operations in B2.
How do we find this? Jump to an empty block.
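As a rough illustration, hoisting can be written as below. The encoding and the name `hoist_branch` are assumptions for this sketch. Note that B2 is not deleted here: it may still have other predecessors, and if it does not, it becomes unreachable and is left for the separate unreachable-code pass.

```python
def hoist_branch(cfg, b1):
    """Replace b1's jump to an empty, branch-ending block with that branch.

    Assumed (hypothetical) encoding: cfg[name] = {"ops": [...], "term": t},
    where t is ("jump", target), ("branch", taken, not_taken) or ("ret",).
    Returns True if the CFG changed.
    """
    term = cfg[b1]["term"]
    if term[0] != "jump":
        return False
    b2 = term[1]
    if cfg[b2]["ops"] or cfg[b2]["term"][0] != "branch":
        return False
    # Copy the branch; b2 itself stays in the CFG, possibly unreachable.
    cfg[b1]["term"] = cfg[b2]["term"]
    return True


cfg = {
    "B1": {"ops": ["a"], "term": ("jump", "B2")},
    "B2": {"ops": [], "term": ("branch", "B3", "B4")},
    "B3": {"ops": ["b"], "term": ("ret",)},
    "B4": {"ops": ["c"], "term": ("ret",)},
}
hoist_branch(cfg, "B1")
print(cfg["B1"]["term"])
```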

Slide 8: Eliminating Useless Control Flow

Putting the transformations together

Process the blocks in postorder
  - Clean up Bi’s successors before Bi
  - Simplifies implementation & understanding

At each node, apply transformations in a fixed order
  - Eliminate redundant branch (eliminates an edge); this creates a jump that should be checked in the same pass
  - Eliminate empty block (eliminates a node)
  - Merge block with successor (eliminates node & edge)
  - Hoist branch from empty successor (adds an edge)
The last three transformations handle a jump.

May need to iterate
  - Postorder ⇒ unprocessed successors along back edges
  - Can bound iterations, but deriving a tight bound is hard
  - Must recompute postorder between iterations
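To make the ordering concrete, here is a small Python sketch of a depth-first postorder computation. The successor-map encoding and the name `postorder` are assumptions, and the recursive formulation is chosen for brevity; a very deep CFG would need an explicit stack.

```python
def postorder(succs, entry):
    """Return the blocks reachable from entry in depth-first postorder.

    succs is an assumed (hypothetical) encoding: block name -> list of
    successor names.
    """
    order, seen = [], set()

    def visit(b):
        seen.add(b)
        for s in succs[b]:
            if s not in seen:
                visit(s)
        order.append(b)  # emitted only after all unvisited successors

    visit(entry)
    return order


# B2 -> B1 is a back edge: B2 is emitted before its successor B1.
succs = {"B0": ["B1"], "B1": ["B2"], "B2": ["B1", "B3"], "B3": []}
print(postorder(succs, "B0"))
```

In the example, B2 is processed while its successor B1 is still unprocessed. That is the back-edge case the slide mentions, and it is one reason a single pass may not suffice.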

Slide 9: Eliminating Useless Control Flow

What about an empty loop?

[Figure: blocks B0, B1, and B2; the loop body B1 branches to itself]

By itself, CLEAN cannot eliminate the loop
  - Loop body branches to itself
  - Branch is not redundant (it targets two distinct blocks!)
  - Doesn’t end with a jump
  - Hoisting does not help

Key is to eliminate self-loop
  - Add a new transformation?
  - Then, B1 merges with B2

New transformation must recognize that B1 is empty. Presumably, it has code to test exit condition & (probably) increment an induction variable. This requires looking at code inside B1 and doing some sophisticated pattern matching. This seems awfully complicated.

Slide 10: Eliminating Useless Control Flow

What about an empty loop?

[Figure: blocks B0, B1, and B2; the loop body B1 branches to itself]

How to eliminate the self-loop?
  - Pattern matching?
  - Useless code elimination?

Slide 11: Eliminating Useless Control Flow

What about an empty loop?

[Figure: blocks B0, B1, and B2; the loop body B1 branches to itself]

How to eliminate the self-loop?
  - Pattern matching?
  - Useless code elimination?

What does DEAD do to B1?
  - Remember, it is empty
  - Contains only the branch
  - B1 has only one exit
  - So, B1 ∉ RDF(B2) ⇒ B1’s branch is useless
  - DEAD rewrites it as a jump to B2

Slide 12: Using SSA: Dead code elimination

Mark
  for each op i
    clear i's mark
    if i is critical then
      mark i
      add i to WorkList
  while (WorkList ≠ Ø)
    remove i from WorkList    (i has form “x ← y op z”)
    if def(y) is not marked then
      mark def(y)
      add def(y) to WorkList
    if def(z) is not marked then
      mark def(z)
      add def(z) to WorkList
    for each b ∈ RDF(block(i))
      mark the block-ending branch in b
      add it to WorkList

Sweep
  for each op i
    if i is not marked then
      if i is a branch then
        rewrite with a jump to i's nearest useful post-dominator
      if i is not a jump then
        delete i

Notes:
  - Eliminates some branches
  - Reconnects dead branches to the remaining live code
  - Find useful post-dominator by walking post-dom tree
    > Entry & exit nodes are useful
  - Slide callout: Shillner added this clause
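The Mark and Sweep phases above can be sketched in Python as follows. This is a simplified illustration, not the MSCP implementation: the op encoding, the helper `op`, and the function name `dead` are hypothetical, and the reverse dominance frontiers (`rdf`) and immediate post-dominators (`ipdom`) are supplied by hand rather than computed. A block counts as useful here if it holds at least one marked op, and the walk up the post-dominator tree stops at a block with no `ipdom` entry (the exit).

```python
def op(ident, block, kind, dest=None, uses=(), critical=False, targets=()):
    """Build one operation record (hypothetical encoding for this sketch)."""
    return {"id": ident, "block": block, "kind": kind, "dest": dest,
            "uses": list(uses), "critical": critical, "targets": list(targets)}


def dead(ops, rdf, ipdom):
    """Mark/sweep over SSA ops; rdf and ipdom are assumed to be precomputed."""
    def_of = {o["dest"]: o for o in ops if o["dest"] is not None}
    branch_of = {o["block"]: o for o in ops if o["kind"] == "branch"}
    marked, worklist = set(), []

    def mark(o):
        if o["id"] not in marked:
            marked.add(o["id"])
            worklist.append(o)

    # Mark: start from critical ops, then follow definitions and the
    # branches that the op's block is control dependent on.
    for o in ops:
        if o["critical"]:
            mark(o)
    while worklist:
        o = worklist.pop()
        for name in o["uses"]:
            if name in def_of:
                mark(def_of[name])
        for b in rdf.get(o["block"], ()):
            if b in branch_of:
                mark(branch_of[b])

    # Sweep: keep marked ops and all jumps; turn unmarked branches into
    # jumps to the nearest useful post-dominator; drop everything else.
    useful = {o["block"] for o in ops if o["id"] in marked}
    kept = []
    for o in ops:
        if o["id"] in marked or o["kind"] == "jump":
            kept.append(o)
        elif o["kind"] == "branch":
            target = ipdom[o["block"]]
            while target not in useful and target in ipdom:
                target = ipdom[target]
            kept.append({**o, "kind": "jump", "uses": [], "targets": [target]})
    return kept


# The empty loop: B0 -> B1, B1 -> {B1, B2}; only B2's op is critical.
loop = [
    op(0, "B0", "op", dest="x"),
    op(1, "B0", "jump", targets=["B1"]),
    op(2, "B1", "op", dest="c"),
    op(3, "B1", "branch", uses=["c"], targets=["B1", "B2"]),
    op(4, "B2", "op", uses=["x"], critical=True),
]
rdf = {"B0": [], "B1": ["B1"], "B2": []}   # hand-supplied assumption
ipdom = {"B0": "B1", "B1": "B2"}           # hand-supplied assumption
for o in dead(loop, rdf, ipdom):
    print(o["block"], o["kind"], o["targets"])
```

In the example, nothing useful is control dependent on B1's branch, so the branch is never marked. The sweep rewrites it as a jump to B2 and deletes the test that fed it.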

Slide 13: Eliminating Useless Control Flow

What about an empty loop?

[Figure: blocks B0, B1, and B2 before and after DEAD; DEAD replaces B1's branch with a jump to B2]

How to eliminate the self-loop?
  - Pattern matching?
  - Useless code elimination?

What does DEAD do to B1?
  - Remember, it is empty
  - Contains only the branch
  - B1 has only one exit
  - So, B1 ∉ RDF(B2) ⇒ B1’s branch is useless
  - DEAD rewrites it as a jump to B2

DEAD converts it to a form where CLEAN handles it!

Slide 14: Eliminating Useless Control Flow

The Algorithm

CleanPass()
  for each block i, in postorder
    if i ends in a branch then
      if both targets are identical then
        rewrite with a jump
    if i ends in a jump to j then
      if i is empty then
        merge i with j
      else if j has only one predecessor
        merge i with j
      else if j is empty & j has a branch then
        rewrite i's jump with j's branch

Clean()
  until CFG stops changing
    compute postorder
    CleanPass()

Summary
  - Simple, structural algorithm
  - Limited transformation set
  - Cooperates with DEAD
  - In practice, it's quite fast
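The pseudocode above translates fairly directly into Python. The following is a sketch under stated assumptions rather than a reference implementation: the CFG encoding and all function names are hypothetical, branch conditions are omitted, the entry block is never removed, predecessors are recomputed by scanning the whole CFG, and unreachable blocks are left for a separate pass.

```python
def successors(term):
    """Targets of ("jump", t), ("branch", t, f) or ("ret",)."""
    return list(term[1:])


def postorder(cfg, entry):
    order, seen = [], set()

    def visit(b):
        seen.add(b)
        for s in successors(cfg[b]["term"]):
            if s not in seen:
                visit(s)
        order.append(b)

    visit(entry)
    return order


def predecessors(cfg, b):
    return [p for p in cfg if b in successors(cfg[p]["term"])]


def clean_pass(cfg, entry):
    changed = False
    for i in postorder(cfg, entry):
        if i not in cfg:          # removed earlier in this pass
            continue
        term = cfg[i]["term"]
        if term[0] == "branch" and term[1] == term[2]:
            term = cfg[i]["term"] = ("jump", term[1])   # redundant branch
            changed = True
        if term[0] != "jump" or term[1] == i:
            continue
        j = term[1]
        if not cfg[i]["ops"] and i != entry:
            # Empty block: move i's incoming edges to j and drop i.
            for p in predecessors(cfg, i):
                t = cfg[p]["term"]
                cfg[p]["term"] = (t[0],) + tuple(j if x == i else x for x in t[1:])
            del cfg[i]
            changed = True
        elif predecessors(cfg, j) == [i] and j != entry:
            # j runs only after i: combine them into one block.
            cfg[i]["ops"] += cfg[j]["ops"]
            cfg[i]["term"] = cfg[j]["term"]
            del cfg[j]
            changed = True
        elif not cfg[j]["ops"] and cfg[j]["term"][0] == "branch":
            cfg[i]["term"] = cfg[j]["term"]              # hoist j's branch
            changed = True
    return changed


def clean(cfg, entry):
    # Each call to clean_pass recomputes postorder on the current CFG.
    while clean_pass(cfg, entry):
        pass
    return cfg


# The empty loop after DEAD has rewritten B1's branch as a jump to B2.
cfg = {
    "B0": {"ops": ["x <- 1"], "term": ("jump", "B1")},
    "B1": {"ops": [], "term": ("jump", "B2")},
    "B2": {"ops": ["print x"], "term": ("ret",)},
}
print(clean(cfg, "B0"))
```

On the example, the first pass removes the empty B1 and then folds B2 into B0; the second pass finds nothing to change and the loop stops.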

Slide 15: Eliminating Unreachable Code

The Problem
  - Block with no entering edge
  - Situation created by other optimizations

The Cure
  - Compute reachability & delete unreachable code
  - Simple mark/sweep algorithm on CFG
  - Mark during computation of postorder, reverse postorder, ...
  - In MSCP, importing ILOC does this (every time)
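The mark/sweep idea on this slide can be sketched as a plain graph traversal. The successor-map encoding and the name `remove_unreachable` are assumptions for illustration; the sketch marks with an explicit stack instead of piggybacking on a postorder computation.

```python
def remove_unreachable(succs, entry):
    """Return a copy of the CFG holding only blocks reachable from entry.

    succs is an assumed (hypothetical) encoding: block name -> list of
    successor names.
    """
    marked, stack = set(), [entry]
    while stack:                      # mark phase
        b = stack.pop()
        if b in marked:
            continue
        marked.add(b)
        stack.extend(succs[b])
    # Sweep phase: keep only marked blocks.
    return {b: list(ss) for b, ss in succs.items() if b in marked}


# B2 has an outgoing edge but no path from B0 reaches it.
succs = {"B0": ["B1"], "B1": [], "B2": ["B1"]}
print(remove_unreachable(succs, "B0"))
```

A block with no entering edge is the simplest case, but the traversal also removes unreachable cycles, where every block has a predecessor yet none can execute.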

Slide 16: Dead Code Elimination

Summary
  - Useless computations ⇒ DEAD
  - Useless control flow ⇒ CLEAN
  - Unreachable blocks ⇒ Simple housekeeping

Other Techniques
  - Constant propagation can eliminate branches
  - Algebraic identities eliminate some operations
  - Redundancy elimination
      - Creates useless operations, or
      - Eliminates them