Code Motion of Control Structures From the paper by Cytron, Lowry, and Zadeck, 1986 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda.

Slides:



Advertisements
Similar presentations
Code Optimization, Part II Regional Techniques Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp.
Advertisements

Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Operator Strength Reduction From Cooper, Simpson, & Vick, “Operator Strength Reduction”, ACM TOPLAS, 23(5), See also § of EaC2e. 1COMP 512,
Lecture 11: Code Optimization CS 540 George Mason University.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.
SSA.
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
A Deeper Look at Data-flow Analysis Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University.
SSA-Based Constant Propagation, SCP, SCCP, & the Issue of Combining Optimizations 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon,
The Last Lecture Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission.
1 CS 201 Compiler Construction Lecture 7 Code Optimizations: Partial Redundancy Elimination.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Partial Redundancy Elimination Guo, Yao.
Lazy Code Motion Comp 512 Spring 2011
Partial Redundancy Elimination & Lazy Code Motion
Lazy Code Motion C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.
Λλ Fernando Magno Quintão Pereira P ROGRAMMING L ANGUAGES L ABORATORY Universidade Federal de Minas Gerais - Department of Computer Science P ROGRAM A.
Example in SSA X := Y op Z in out F X := Y op Z (in) = in [ { X ! Y op Z } X :=  (Y,Z) in 0 out F X :=   (in 0, in 1 ) = (in 0 Å in 1 ) [ { X ! E |
Loop Invariant Code Motion — classical approaches — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students.
Introduction to Code Optimization Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
CS 536 Spring Global Optimizations Lecture 23.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
CS 412/413 Spring 2007Introduction to Compilers1 Lecture 29: Control Flow Analysis 9 Apr 07 CS412/413 Introduction to Compilers Tim Teitelbaum.
1 Copy Propagation What does it mean? – Given an assignment x = y, replace later uses of x with uses of y, provided there are no intervening assignments.
Introduction to Optimization Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Instruction Scheduling II: Beyond Basic Blocks Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Loops Guo, Yao.
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Eliminating Memory References Joshua Dunfield Alina Oprea.
1 Region-Based Data Flow Analysis. 2 Loops Loops in programs deserve special treatment Because programs spend most of their time executing loops, improving.
Dynamic Optimization as typified by the Dynamo System See “Dynamo: A Transparent Dynamic Optimization System”, V. Bala, E. Duesterwald, and S. Banerjia,
1 CS 201 Compiler Construction Data Flow Analysis.
Code Optimization, Part III Global Methods Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Structural Data-flow Analysis Algorithms: Allen-Cocke Interval Analysis Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students.
Introduction to Optimization, II Value Numbering & Larger Scopes Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
First Principles (with examples from value numbering) C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Dataflow Analysis Topic today Data flow analysis: Section 3 of Representation and Analysis Paper (Section 3) NOTE we finished through slide 30 on Friday.
Using SSA Dead Code Elimination & Constant Propagation C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Building SSA Form, III 1COMP 512, Rice University This lecture presents the problems inherent in out- of-SSA translation and some ways to solve them. Copyright.
Top Down Parsing - Part I Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Global Redundancy Elimination: Computing Available Expressions Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.
Cleaning up the CFG Eliminating useless nodes & edges C OMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon,
Algebraic Reassociation of Expressions Briggs & Cooper, “Effective Partial Redundancy Elimination,” Proceedings of the ACM SIGPLAN 1994 Conference on Programming.
Dead Code Elimination This lecture presents the algorithm Dead from EaC2e, Chapter 10. That algorithm derives, in turn, from Rob Shillner’s unpublished.
Boolean & Relational Values Control-flow Constructs Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in.
Terminology, Principles, and Concerns, II With examples from superlocal value numbering (Ch 8 in EaC2e) Copyright 2011, Keith D. Cooper & Linda Torczon,
Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Building SSA Form, I 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at.
Optimization Simone Campanoni
Eliminating Array Bounds Checks — and related problems — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved.
Definition-Use Chains
Introduction to Optimization
Lecture 5 Partial Redundancy Elimination
Finding Global Redundancies with Hopcroft’s DFA Minimization Algorithm
Princeton University Spring 2016
Topic 10: Dataflow Analysis
Introduction to Optimization
Optimization through Redundancy Elimination: Value Numbering at Different Scopes COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith.
Optimizations using SSA
Introduction to Optimization
Algebraic Reassociation of Expressions COMP 512 Rice University Houston, Texas Fall 2003 P. Briggs & K.D. Cooper, “Effective Partial Redundancy Elimination,”
The Partitioning Algorithm for Detecting Congruent Expressions COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper.
Presentation transcript:

Code Motion of Control Structures From the paper by Cytron, Lowry, and Zadeck, COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Comp 512 Spring 2011

Background Code Motion algorithms come in three basic flavors Ad-hoc techniques, such as code hoisting & sinking, & GCSE  Opportunities found with data-flow information  Little attention paid to placement Data-flow techniques, such as partial redundancy elimination ( PRE ) & lazy code motion ( LCM )  Opportunities found with data-flow information  Placement computed with scrupulous care Graphical techniques built around the structure of the CFG  Opportunities found using control dependeence  Placement dictated by control dependence (structure of CFG ) This lecture looks at a technique from that third category Popularity of PRE/LCM has caused these techniques to be ignored COMP 512, Rice University2 This paper compares against PRE, because it predates LCM by about six years.

COMP 512, Rice University3 Background Lazy Code Motion is (probably) the dominant technique Conditional control-flow inside loop causes problems LCM does not lengthen any path ( conservative not speculative ) … x + y … y + z LCM can move y + z, but not x + y … y + z … x + y After LCM Weaknesses of PRE/LCM (CLZ paper’s list ) 1. Control flow 2. Moving assignments 3. Speculative motion 4. Iterated code motion

COMP 512, Rice University4 Background Cytron, Lowry, & Zadeck “Code motion of control structures in high-level languages” Paper predates the papers that define SSA form This paper includes most of the critical ideas of SSA form We will recast it into modern terminology Loops Algorithm operates on loops, inside out …  Paper refers to intervals; think of them as SCCs of the CFG hanging from interval header ( loops, not maximal intervals ) Each interval (loop) has a landing pad on entry & on exit  Guarded region at the loop’s entry, another on exit path Traverse each interval in dominated topological order  Think of this as reverse postorder … Formally, a dominated topological order has the property that if n and m are both successors of l in topological order, and n dom m, then n precedes m

COMP 512, Rice University5 Landing Pads A B C F D Interval in the original CFG E CFG from paper, redrawn to have test at both top and bottom of the loop; see EaC2e § 7.8.2

COMP 512, Rice University6 Landing Pads A B C F D Interval in the original CFG E CFG from paper, redrawn to have test at both top and bottom of the loop; see EaC2e § Interval

COMP 512, Rice University7 Landing Pads A B C F D Interval in the original CFG E CFG from paper, redrawn to have test at both top and bottom of the loop; see EaC2e § A BC F D E LP 1 LP 3 LP 2 Augmented with landing pads Modern view: split critical edges Landing pad for backward code motion Landing pads for forward code motion LP 4 Extra landing pad generated by splitting critical edge; not needed by CLZ

COMP 512, Rice University8 Code Motion Two algorithms Strict — guaranteed not to lengthen paths  Restrictions similar to PRE & LCM Nonstrict — can (and does) lengthen some paths  “Speculative” code motion  Can introduce unsafe behavior Big Picture Visit intervals in innermost to outermost order Copy movable control-flow code into entry or exit landing pad Copy movable statements into landing pad Interval is summarized and treated (in surrounding interval) as an atomic operation Need a carefully constructed name space ( SSA names ) Should not use this one unless you modify it to make it safe!

COMP 512, Rice University9 Strict Algorithm ∀ interval I, in order from innermost to outermost ∀ statement s in I, in reverse postorder Test s to see if it is movable If s is movable then Copy s into the entry landing pad If s is a control structure then replace copy in loop with a bit test else delete original statement from the loop Need a usable test for movability Dominated total order

COMP 512, Rice University10 Strict Algorithm S is movable if and only if s is not control dependent on any immovable test Definitions that reach uses in s originate outside the loop  i.e., s is loop invariant s defines x and the algorithm has seen neither a  for x in the loop nor an immovable definition for x y is control dependent on x if and only if  a path from x to y such that every node in the path except x and y are post-dominated by y, and x is not post-dominated by y i  … while(i<n) do if (i<3) then j  5 else j  7 if (k = 0) then m  6 i  i + 1 end …

COMP 512, Rice University11 Strict Algorithm ∀ interval I, in order from innermost to outermost ∀ statement s in I, in reverse postorder test each statement s to see if it is movable if s is movable & a control structure then copy s into the entry landing pad replace copy in loop with a bit test else if s is movable & not a control structure then move s to the equivalent place in entry landing pad clean up any unused control structures Details Copy control-structure to the landing pad and redirect any loop exits to the bottom of the landing pad

COMP 512, Rice University12 Strict Algorithm How can they just delete the assignments in the loop? LCM moves expressions but not assignments  It needs to ensure that the original name gets the invariant value on each iteration  Those assignments would make control structures useful CLZ moves assignments & deletes empty control structures  Remember their renaming scheme (SSA)?  It creates a unique name for the definition & makes this particular kind of motion safe ! … if it could move the assignment out of the control structure … One key rationale for SSA: the assignment cannot be killed, so this kind of code motion becomes safe (1986 paper)

COMP 512, Rice University13 Non-strict Algorithm Some control structures are not movable Their behavior varies from iteration to iteration Moving them might  Raise an exception that the unmovable guard might prevent  Lengthen the computation if the guard is always false Nonetheless, moving them might be profitable … CLZ also give a nonstrict algorithm that moves these operations If s depends on an immovable test, it is placed (in landing pad) where it will execute independent of the outcome of that test Algorithm can still guard s with any movable tests … Unsafe!

COMP 512, Rice University14 Removing Redundancy Further improvement is possible After combining (hoisting), it needs no guard in landing pad  May render control-flow structure useless CLZ present new algorithm for “combining” such statements  Might just use hoisting, sinking, or adapt SSAPre Not moveable z  x + y

COMP 512, Rice University15 Reflections CLZ uses just one pass over each interval  Accounts for 2 nd, 3 rd, … order effects ( dto, rpo )  Would need to iterate LCM to achieve same results.. SSA name space is critical to correctness  Allows motion of assignments  Allows nonstrict movement of invariants Other techniques can achieve similar effects to COMMON  Peel the first iteration of the loop & apply strong techniques for redundancy & for constant propagation ( SCCP )  Hoisting, sinking, maybe a derivative of SSAPre SSA Pre ( Chow et al., PLDI 97 )  Produced same results as PRE or LCM on SSA  Maybe they should have tried for better results, a la CLZ  Peel each loop, use SSA Pre, then SCCP, DEAD, CLEAN

COMP 512, Rice University16 Significance This paper is important Underpinnings and motivations of SSA, years before SSA paper Code motion of control structures is still not commonly done  Community has seized on LCM as right paradigm  CLZ algorithms challenge the wisdom of that opinion  The SSA name space is a critical part of the big picture This work deserves wider recognition and use

End of Lecture Extra Slides Start Here COMP 512, Rice University17

COMP 512, Rice University18 Background Last lecture, we looked at Lazy Code Motion (LCM), which moves redundant and partially redundant code out of loops Conditional control-flow inside loop causes problems LCM does not lengthen any path ( conservative not speculative ) … x + y … y + z LCM can move y + z, but not x + y … y + z … x + y After LCM Weaknesses of PRE/LCM (CLZ paper’s list ) 1. Control flow 2. Moving assignments 3. Speculative motion 4. Iterated code motion