Presentation is loading. Please wait.

Presentation is loading. Please wait.

Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

Similar presentations


Presentation on theme: "Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled."— Presentation transcript:

1 Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Comp 512 Spring 2011

2 COMP 512, Rice University2 Optimization The subject is confusing Whole notion of optimality Incredible number of transformations Odd, inconsistent terminology Maybe this stuff is inherently hard Many intractable problems Many NP-complete problems Much overlap between problems and between solutions If optimization wasn’t confusing, why take C OMP 512 ? { Value numbering Redundancy elimination Common subexpressions Cooper McKinley, & Torczon cite 237 distinct papers in the survey!

3 COMP 512, Rice University3 Optimization A brief catalog (circa 1982 ) And this was before the literature exploded

4 COMP 512, Rice University4 Optimization The literature throws fuel on the fire Terminology is non-standard & non-intuitive Explanations are terse and incomplete Little comparative data that is believable No sense of perspective Papers give conflicting advice An example – Is inline substitution profitable? Holler’s thesis: it almost always helps Hall’s thesis: it occasionally helps, but has lots of problems MacFarland’s thesis: it causes instruction cache misses Reality lies somewhere in the middle  Waterman showed that program-specific heuristics win { Revival Partially-dead code Forward propagation } Not all those papers can be the best !

5 COMP 512, Rice University5 Optimization Improvement should be objective Easy to quantify Produce concrete improvements Taking measurements seems easy Code either gets better or it gets worse But, … Linear-time heuristics for hard problems Unforeseen consequences & poorly understood interactions “Obvious wins” have non-obvious downsides Multiple ways to achieve the same end Experimental computer science takes a lot of work

6 COMP 512, Rice University6 The Role of Comp 512 Bringing order out of chaos Provide a framework for thinking about optimization Differentiate analysis from transformation † Think about how things help, not what they do Goal: a rational approach to the subject matter Objective criteria for evaluating ideas & papers Bring high school level science back into the game † The Comp 512 Motto: Knowledge alone does not make code run faster. You have to change the code to make it run faster.

7 COMP 512, Rice University7 Classic Taxonomy Machine independent transformations Applicable across a broad range of machines Decrease ratio of overhead to real work Reduce running time or space Examples: dead code elimination Machine dependent transformations Capitalize on specific machine properties Improve the mapping from IR to this machine Might use an exotic instruction ( shift the reg. window for a loop ) Example: instruction scheduling

8 COMP 512, Rice University8 Classic Taxonomy Distinction is not always clear Replacing multiply with shifts and adds Eliminating a redundant expression The truth is somewhat muddled Machine independent means that we deliberately & knowingly ignore target-specific constraints Machine dependent means that we explicitly consider target- specific constraints Redundancy elimination might fit in either category  Versions that consider register pressure

9 COMP 512, Rice University9 The Comp 512 Taxonomy An effects-based classification (for speed) Five machine-independent ways to speed up code  Eliminate a redundant computation  Move code to a place where it executes less often  Eliminate dead code  Specialize a computation based on context  Enable another transformation Three machine-dependent ways to speed up the code  Manage or hide latency  Take advantage of special hardware features  Manage finite resources For scalar optimization, this covers most of them

10 COMP 512, Rice University10 Dead code Dead code elim. Partial d.c.e. Constant propagation Algebraic identities The Comp 512 Taxonomy Machine Independent From §6 of Cooper, McKinley, & Torczon Redundancy Redundancy elim. Partial red. elim. Consolidation Code motion Loop-invariant c.m. Consolidation Global Scheduling [Click] Constant propagation Create opportunities Reassociation Replication Specialization Replication Strength Reduction Method caching Heap  stack allocation Tail recursion elimination

11 COMP 512, Rice University11 The Comp 512 Taxonomy Machine Dependent From §6 of Cooper, McKinley, & Torczon Hide latency Scheduling Blocking references Prefetching Code layout Data packing Manage resources Allocate (registers, tlb slots) Schedule Data packing Coloring memory locations Special features Instruction selection Peephole optimization

12 COMP 512, Rice University12 What have we seen so far? Redundancy elimination  LVN, SVN, DVN  It is a category in taxonomy by itself Loop Unrolling  Form of specialization Dead store elimination  Form of dead code elimination Block Placement  Form of latency hiding & resource management Inline substitution  Form of specialization

13 COMP 512, Rice University13 What about Fortran H? (Lecture 2) Eliminate a redundant computation  Commoning Move code to a place where it executes less often  Backward motion Eliminate dead code  (must have done it, but don’t talk about it) Specialize a computation based on context  Strength reduction Enable another transformation  Reassociation Manage or hide latency Take advantage of special hardware features Manage finite resources  Register allocation } Didn’t really talk about these, if they did them *

14 COMP 512, Rice University14 Scope of Optimization ( another axis ) Local Handles individual basic blocks  Maximal length sequence of straight line code  Each of the b i’ s is a basic block Basic blocks are easy to analyze Can prove strongest results Code quality suffers at block boundaries Local methods Value numbering, instruction scheduling b4b4 b5b5 b6b6 b1b1 b3b3 b2b2 *

15 COMP 512, Rice University15 Superlocal Handles extended basic blocks  Sequence of blocks where each has a unique predecessor  Use results for b i to help with b j Analysis & transformation over larger region Fewer rough edges Can make it efficient by reusing results Superlocal methods Value numbering, instruction scheduling Scope of Optimization b4b4 b5b5 b6b6 b1b1 b3b3 b2b2 *

16 COMP 512, Rice University16 Regional Arbitrary subset of blocks ( loop nests, dominator subtrees ) Use results from one block to improve others Limiting scope can increase focus on performance critical regions Can eliminate some global impediments Regional methods Loop xforms (unroll, fuse, interchange, strip mining, blocking), DVNT, OSR, register promotion, prefetch insertion, software pipelining, trace scheduling... Scope of Optimization b1b1 b2b2 b3b3 b4b4 b5b5 Remember Fortran H *

17 COMP 512, Rice University17 Whole procedure ( global or intraprocedural ) Handles entire procedure Make decisions based on global knowledge & global benefit No rough edges inside procedure (tied to compilation unit) Classic data-flow analysis Global methods CSE (A VAIL, P RE, L CM ), constant propagation, G CRA, dead code elim., hoisting, copy coalescing,... Scope of Optimization b4b4 b5b5 b6b6 b1b1 b3b3 b2b2

18 COMP 512, Rice University18 Whole program ( interprocedural ) Handles more than one procedure, up to entire program Create even larger scopes for optimization Limited interactions between procedures  Parameters + global variables Analysis problems are harder Opportunities are different Issues for compiler structure Interprocedural methods Inline substitution, procedure cloning, constant propagation, using whole program analysis to support global xforms Scope of Optimization b4b4 b5b5 b6b6 b1b1 b3b3 b2b2 b4b4 b5b5 b6b6 b1b1 b3b3 b2b2 b4b4 b5b5 b6b6 b1b1 b3b3 b2b2

19 COMP 512, Rice University19 Decision Complexity Yet another axis on which to categorize optimizations Constant time Low-order polynomial time Hard Problems LVN, SVN, DVNT Block placement Tree balancing ( ILP ) Loop unrollingReassociation Inline substitution Spill Choice in RA Copy Coalescing

20 COMP 512, Rice University20 Near-term Roadmap Data-flow analysis Deeper treatment of iterative data-flow theory Quick look at other solvers Static single assignment form Construction and destruction of SSA Example algorithms that use SSA Populating the Taxonomy More transformations, more transformations, more … Try to fill in the odd corners of the taxonomy


Download ppt "Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled."

Similar presentations


Ads by Google