Compiler Optimizations ECE 454 Computer Systems Programming Topics: The Role of the Compiler Common Compiler (Automatic) Code Optimizations Cristiana Amza.

Slides:



Advertisements
Similar presentations
CSC 4181 Compiler Construction Code Generation & Optimization.
Advertisements

Code Optimization and Performance Chapter 5 CS 105 Tour of the Black Holes of Computing.
Synopsys University Courseware Copyright © 2012 Synopsys, Inc. All rights reserved. Compiler Optimization and Code Generation Lecture - 3 Developed By:
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
ECE 454 Computer Systems Programming Compiler and Optimization (I) Ding Yuan ECE Dept., University of Toronto
Computer Architecture Lecture 7 Compiler Considerations and Optimizations.
Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
1 Code Optimization Code produced by compilation algorithms can often be improved (ideally optimized) in terms of run-time speed and the amount of memory.
Program Representations. Representing programs Goals.
Introduction to Advanced Topics Chapter 1 Mooly Sagiv Schrierber
Recap from last time We were trying to do Common Subexpression Elimination Compute expressions that are available at each program point.
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
Recap from last time Saw several examples of optimizations –Constant folding –Constant Prop –Copy Prop –Common Sub-expression Elim –Partial Redundancy.
Improving code generation. Better code generation requires greater context Over expressions: optimal ordering of subtrees Over basic blocks: Common subexpression.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
ECE1724F Compiler Primer Sept. 18, 2002.
9. Optimization Marcus Denker. 2 © Marcus Denker Optimization Roadmap  Introduction  Optimizations in the Back-end  The Optimizer  SSA Optimizations.
Previous finals up on the web page use them as practice problems look at them early.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
Hardware-Software Interface Machine Program Performance = t cyc x CPI x code size X Available resources statically fixed Designed to support wide variety.
Intermediate Code. Local Optimizations
Improving Code Generation Honors Compilers April 16 th 2002.
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
Prof. Fateman CS164 Lecture 211 Local Optimizations Lecture 21.
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
PSUCS322 HM 1 Languages and Compiler Design II IR Code Optimization Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU.
Precision Going back to constant prop, in what cases would we lose precision?
Optimizing Compilers Nai-Wei Lin Department of Computer Science and Information Engineering National Chung Cheng University.
Compiler Code Optimizations. Introduction Introduction Optimized codeOptimized code Executes faster Executes faster efficient memory usage efficient memory.
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
What’s in an optimizing compiler?
Code Optimization 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture of a.
Lengthening Traces to Improve Opportunities for Dynamic Optimization Chuck Zhao, Cristiana Amza, Greg Steffan, University of Toronto Youfeng Wu Intel Research.
1 Code optimization “Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code”
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
University of Amsterdam Computer Systems – optimizing program performance Arnoud Visser 1 Computer Systems Optimizing program performance.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Overview of Compilers and JikesRVM John.
Introduction to ECE 454 Computer Systems Programming Topics: Lecture topics and assignments Profiling rudiments Lab schedule and rationale Cristiana Amza.
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.
Programming for Performance CS 740 Oct. 4, 2000 Topics How architecture impacts your programs How (and how not) to tune your code.
CS 3214 Computer Systems Godmar Back Lecture 8. Announcements Stay tuned for Project 2 & Exercise 4 Project 1 due Sep 16 Auto-fail rule 1: –Need at least.
3/2/2016© Hal Perkins & UW CSES-1 CSE P 501 – Compilers Optimizing Transformations Hal Perkins Autumn 2009.
CS412/413 Introduction to Compilers and Translators April 2, 1999 Lecture 24: Introduction to Optimization.
©SoftMoore ConsultingSlide 1 Code Optimization. ©SoftMoore ConsultingSlide 2 Code Optimization Code generation techniques and transformations that result.
Credible Compilation With Pointers Martin Rinard and Darko Marinov Laboratory for Computer Science Massachusetts Institute of Technology.
Code Optimization Code produced by compilation algorithms can often be improved (ideally optimized) in terms of run-time speed and the amount of memory.
Code Optimization Overview and Examples
High-level optimization Jakub Yaghob
Code Optimization.
Optimization Code Optimization ©SoftMoore Consulting.
CS 3114 Many of the following slides are taken with permission from
Code Optimization I: Machine Independent Optimizations
Princeton University Spring 2016
Machine-Independent Optimization
Optimizing Transformations Hal Perkins Autumn 2011
Optimizing Transformations Hal Perkins Winter 2008
Compiler Code Optimizations
Code Optimization Overview and Examples Control Flow Graph
Optimization 薛智文 (textbook ch# 9) 薛智文 96 Spring.
Intermediate Code Generation
Lecture 19: Code Optimisation
Tour of common optimizations
Tour of common optimizations
CS 3114 Many of the following slides are taken with permission from
Machine-Independent Optimization
Tour of common optimizations
Code Optimization.
Presentation transcript:

Compiler Optimizations ECE 454 Computer Systems Programming Topics: The Role of the Compiler Common Compiler (Automatic) Code Optimizations Cristiana Amza

– 2 – Inside a Basic Compiler Front End High-level language (C, C++, Java) Low-level language (IA64) HLL IR Code Generator LLL Intermediate Representation (similar to assembly)

– 3 – Inside an optimizing compiler Front EndOptimizer High-level language (C, C++, Java) Low-level language (IA32) HLLIR (Improved) Code Generator LLL

– 4 – Control Flow Graph: (how a compiler sees your program) add … L1: add … add … branch L2 add … L2: add … branch L1 return … Example IR: Basic Blocks: Basic Block: a group of consecutive instructions with a single entry point and a single exit point add … L1: add … add … branch L2 add … L2: add … branch L1 return …

– 5 – Performance Optimization: 3 Requirements  Preserve correctness the speed of an incorrect program is irrelevant  On average improve performance Optimized may be worse than original if unlucky  Be “worth the effort” Is this example worth it? 1 person-year of work to implement compiler optimization 2x increase in compilation time 0.1% improvement in speed

– 6 – How do optimizations improve performance? Execution_time = num_instructions * CPI Fewer cycles per instruction  Schedule instructions to avoid dependencies  Improve cache/memory behavior Eg., locality Fewer instructions  Eg: Target special/new instructions

– 7 – Role of Optimizing Compilers Provide efficient mapping of program to machine eliminating minor inefficiencies code selection and ordering register allocation Don’t (usually) improve asymptotic efficiency up to programmer to select best overall algorithm big-O savings are (often) more important than constant factors but constant factors also matter

– 8 – Limitations of Optimizing Compilers Operate Under Fundamental Constraints Must not cause any change in program behavior under any possible condition Most analysis is performed only within procedures inter-procedural analysis is too expensive in most cases Most analysis is based only on static information compiler has difficulty anticipating run-time inputs When in doubt, the compiler must be conservative

– 9 – Role of the Programmer How should I write my programs, given that I have a good, optimizing compiler? Don’t: Smash Code into Oblivion Hard to read, maintain, & assure correctnessDo: Select best algorithm Write code that’s readable & maintainable Procedures, recursion Even though these factors can slow down code Eliminate optimization blockers Allows compiler to do its job Focus on Inner Loops Do detailed optimizations where code will be executed repeatedly Will get most performance gain here

– 10 – Compiler Optimizations Machine independent (apply equally well to most CPUs) Constant propagation Constant folding Common Subexpression Elimination Dead Code Elimination Loop Invariant Code Motion Function Inlining Machine dependent (apply differently to different CPUs) Instruction Scheduling Loop unrolling Parallel unrolling Could do these manually, better if compiler does them Many optimizations make code less readable/maintainable

– 11 – Constant Propagation (CP) a = 5; b = 3; : n = a + b; for (i = 0 ; i < n ; ++i) { : } Replace variables with constants when possible n = 5 + 3

– 12 – Constant Folding (CF) Evaluate expressions containing constants Can lead to further optimization  Eg., another round of constant propagation : n = 5 + 3; for (i = 0 ; i < n ; ++i) { : } n = 8 8

– 13 – Common Sub-expression Elimination (CSE) a = c * d; : d = (c * d + t) * u  Try to only compute a given expression once (assuming the variables have not been modified) a = c * d; : d = (a + t) * u

– 14 – Dead Code Elimination (DCE) debug = 0; // set to False : if (debug) { : } a = f(b);  Compiler can determine if certain code will never execute:  Compiler will remove that code  You don’t have to worry about such code impacting performance  i.e., you are more free to have readable/debugable programs!  debug = 0; : a = f(b);

– 15 – Loop Invariant Code Motion (LICM) for (i=0; i < 100 ; ++i) { for (j=0; j < 100 ; ++j) { for (k=0 ; k < 100 ; ++k) { a[i][j][k] = i*j*k; } Loop invariant: value does not change across iterations LICM: move invariant code out of the loop Big performance wins:  Inner loop will execute 1,000,000 times  Moving code out of inner loop results in big savings  for (i = 0; i < 100 ; ++i) { t1 = a[i]; for (j = 0; j < 100 ; ++j) { tmp = i * j; t2 = t1[j]; for (k = 0 ; k < 100 ; ++k) { t2[k] = tmp * k; }

Function Inlining main(){ … x = foo(x); … } foo(int z){ int m = 5; return z + m; } main(){ … { int foo_z = x; int m = 5; int foo_return = foo_z + m; x = foo_return; } … } main(){ … x = x + 5; … } Code size can decrease if small procedure body and few calls can increase if big procedure body and many calls Performance eliminates call/return overhead can expose potential optimizations can be hard on instruction-cache if many copies made As a Programmer: a good compiler should inline for best performance feel free to use procedure calls to make your code readable!

– 17 – Loop Unrolling reduces loop overhead Fewer adds to update j Fewer loop condition tests enables more aggressive instruction scheduling more instructions for scheduler to move around j = 0; while (j < 100){ a[j] = b[j+1]; j += 1; } j = 0; while (j < 99){ a[j] = b[j+1]; a[j+1] = b[j+2]; j += 2; }

– 18 – Summary: gcc Optimization Levels -g: Include debug information, no optimization-O0: Default, no optimization-O1: Do optimizations that don’t take too long CP, CF, CSE, DCE, LICM, inlining small functions-O2: Take longer optimizing, more aggressive scheduling-O3: Make space/speed trade-offs: loop unrolling, more inlining-Os: Optimize program size