Research in Compilers and Introduction to Loop Transformations Part I: Compiler Research Tomofumi Yuki EJCP 2016 June 29, Lille.


1 Research in Compilers and Introduction to Loop Transformations Part I: Compiler Research
Tomofumi Yuki EJCP 2016 June 29, Lille

2 Background
Defended Ph.D. in C.S. in October 2012
  Colorado State University
  Advisor: Dr. Sanjay Rajopadhye
Currently Inria Chargé de Recherche
  Rhône-Alpes, ENS Lyon
Optimizing compiler + programming language
  static analysis (polyhedral model)
  parallel programming models
  High-Level Synthesis
Notes: I will start with a short reminder of my background. I did my PhD work at CSU under the supervision of Dr. Sanjay Rajopadhye. Currently, I am a post-doc at Inria Rennes, within the CAIRN team. My past work is in optimizing compilers and programming languages. In particular, I have worked on static analyses, parallel programming models, and high-level synthesis.
EJCP 2016, June 29, Lille

3 What is this Course About?
Research in compilers
  a bit about the compiler itself
Understand compiler research
  what are the problems?
  what are the techniques?
  what are the applications?
  maybe do research in compilers later on!
Be able to (partially) understand work by “compiler people” at conferences.
Notes: Many domains are related to compilation, far more than a typical starting student may think, and a basic understanding of the context helps when you meet people at conferences. We will not go into parsing or lexing.

4 What is a Compiler?
What does compiler mean to you?

5 Compiler Advances
Old compiler vs recent compiler
  modern architecture
  gcc -O3 vs gcc -O0
How much speedup by compiler alone after 45 years of research?

6 Proebsting’s Law
Compiler Advances Double Computing Power Every 18 Years
Someone actually tried it: On Proebsting’s Law, Kevin Scott, 2001
  SPEC95, compared against -O0
  3.3x for int
  8.1x for float
HW gives 60%/year
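The rates on this slide can be put on a common per-year scale with a little arithmetic. The sketch below is only illustrative: the function names are made up for this example, and pairing the 3.3x and 8.1x gaps with a 45-year span is an assumption borrowed from the "3~8x difference after 45 years" summary on the Compiler Advances slides, not a figure taken from Scott's report.

```python
import math

def annual_rate_from_doubling(years_to_double):
    """Annual growth rate implied by doubling every `years_to_double` years."""
    return 2 ** (1 / years_to_double) - 1

def doubling_time(annual_rate):
    """Years needed to double at a given annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)

def annual_rate_from_speedup(speedup, years):
    """Annual growth rate implied by a total speedup over `years` years."""
    return speedup ** (1 / years) - 1

# Proebsting's Law: compilers double performance every 18 years.
compiler_rate = annual_rate_from_doubling(18)

# Hardware: 60% per year, expressed as a doubling time.
hardware_doubling = doubling_time(0.60)

# Assumption: the measured gaps are spread over 45 years.
int_rate = annual_rate_from_speedup(3.3, 45)
float_rate = annual_rate_from_speedup(8.1, 45)

print(f"compiler rate (18-year doubling): {compiler_rate:.1%} per year")
print(f"hardware doubling time at 60%/year: {hardware_doubling:.2f} years")
print(f"implied rate for int (3.3x): {int_rate:.1%} per year")
print(f"implied rate for float (8.1x): {float_rate:.1%} per year")
```

Under these assumptions the compiler contribution is a few percent per year, while hardware at 60% per year doubles in roughly a year and a half.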

7 Compiler Advances
Old compiler vs recent compiler
  modern architecture
  gcc -O3 vs gcc -O0
3~8x difference after 45 years
Not so much?

8 Compiler Advances
Old compiler vs recent compiler
  modern architecture
  gcc -O3 vs gcc -O0
3~8x difference after 45 years
Not so much?
“The most remarkable accomplishment by far of the compiler field is the widespread use of high-level languages.”
  by Mary Hall, David Padua, and Keshav Pingali [Compiler Research: The Next 50 Years, 2009]

9 Earlier Accomplishments
Getting efficient assembly
  register allocation
  instruction scheduling
  ...
High-level language features
  object-orientation
  dynamic types
  automated memory management

10 What is Left?
Parallelism
  multi-cores, GPUs, ...
  language features for parallelism
Security/Reliability
  verification
  certified compilers
Power/Energy
  data movement
  voltage scaling

11 Agenda for today
Part I: What is Compiler Research?
Part II: Compiler Optimizations
Lab: Introduction to Loop Transformations

12 What is a Compiler?
Bridge between “source” and “target”
[Figure labels: source, target]

13 Compiler vs Assembler
What are the differences?
[Figure: source, compile, target (object/machine code); assembly, assemble, object]

14 Compiler vs Assembler
Compiler
  Many possible targets (semi-portable)
  Many decisions are taken
Assembler
  Specialized output (non-portable)
  Usually a “translation”

15 Goals of the Compiler
Higher abstraction
  No more writing assembly!
  enables language features
    loops, functions, classes, aspects, ...
Performance while increasing productivity
  speed, space, energy, ...
  compiler optimizations

16 Productivity vs Performance
Higher Abstraction ≈ Less Performance
[Figure: Python, Java, C, Fortran, Assembly placed on Abstraction vs Performance axes]

17 Productivity vs Performance
How much can you regain?
[Figure: Python, Java, C, Fortran, Assembly placed on Abstraction vs Performance axes; Python, Java, C and Fortran each appear twice]

18 Productivity vs Performance
How sloppy can you write code?
[Figure: Python, Java, C, Fortran, Assembly placed on Abstraction vs Performance axes; Python, Java, C and Fortran each appear twice]

19 Compiler Research
Branch of Programming Languages
  Program Analysis, Transformations
  Formal Semantics
  Type Theory
  Runtime Systems
  Compilers
  ...

20 Current Uses of Compiler
Optimization
  important for vendors
  many things are better left to the compiler
    parallelism, energy, resiliency, ...
Code Analysis
  IDEs
  static vs dynamic analysis
New Architecture
  IBM Cell, GPU, Xeon-Phi, ...

21 Examples
Two classical compiler optimizations
  register allocation
  instruction scheduling

22 Case 1: Register Allocation
Classical optimization problem
  C = A + B;
  D = B + C;
naïve translation (3 registers, 8 instructions)
  load %r1, A
  load %r2, B
  add %r3, %r1, %r2
  store %r3, C
  load %r1, B
  load %r2, C
  add %r3, %r1, %r2
  store %r3, D
smart compilation (2 registers, 6 instructions)
  load %r1, A
  load %r2, B
  add %r1, %r1, %r2
  store %r1, C
  add %r1, %r2, %r1
  store %r1, D

23 Register Allocation in 5min.
Often viewed as graph coloring
  Live Range: when a value is “in use”
  Interference: both values are “in use”
    e.g., two operands of an instruction
  Coloring: conflicting nodes to different reg.
Example
  c = a + b;    add %r1, %r1, %r2
  d = b + c;    add %r1, %r2, %r1
[Figures: Live Range Analysis and Interference Graph over a, b, c, d]
Assume unbounded number of registers and load memory into “virtual” registers.
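The graph-coloring view can be sketched in a few lines of Python for straight-line code. This is a minimal sketch under stated assumptions, not a production allocator: the program encoding (a list of `(dest, sources)` pairs), the explicit loads of `a` and `b`, the assumption that nothing is live after the last instruction, and the function names `interference_graph` and `greedy_coloring` are all introduced here for illustration.

```python
def interference_graph(program):
    """Build an interference graph from straight-line code.

    `program` is a list of (dest, sources) pairs. A backward pass tracks
    the set of live values; each defined value interferes with every
    other value that is live right after its definition.
    """
    graph = {}
    for dest, srcs in program:
        for v in [dest] + srcs:
            graph.setdefault(v, set())

    live = set()  # assumption: nothing is live after the last instruction
    for dest, srcs in reversed(program):
        for other in live - {dest}:
            graph[dest].add(other)
            graph[other].add(dest)
        live = (live - {dest}) | set(srcs)
    return graph


def greedy_coloring(graph, order):
    """Give each node the smallest color not used by a colored neighbor."""
    colors = {}
    for v in order:
        used = {colors[n] for n in graph[v] if n in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors


# c = a + b; d = b + c; with a and b loaded into virtual registers first.
program = [
    ("a", []),          # a = load A
    ("b", []),          # b = load B
    ("c", ["a", "b"]),  # c = a + b
    ("d", ["b", "c"]),  # d = b + c
]

graph = interference_graph(program)
colors = greedy_coloring(graph, ["a", "b", "c", "d"])
num_registers = max(colors.values()) + 1

for v in ["a", "b", "c", "d"]:
    print(f"{v} -> %r{colors[v] + 1}")
print("registers needed:", num_registers)
```

On this example the coloring gives `a`, `c` and `d` the same register and `b` a second one, which matches the two-register code on the slide. Greedy coloring is only a heuristic; it is used here because it is short, not because it is what any particular compiler does.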

24 Register Allocation in 5min.
Registers are limited
  c = a + b;
  d = b + c;
  x = c + d;
  y = a + x;
[Figures: live ranges and interference graph over a, b, c, d, x, y]
Live Range Splitting
  a = load A;
  c = a + b;
  d = b + c;
  x = c + d;
  z = load A;
  y = z + x;
[Figures: live ranges and interference graph over a, b, c, d, x, y, z]
Assume unbounded number of registers and load memory into “virtual” registers.
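One way to see what live range splitting buys is to measure register pressure, the largest number of values that are live at the same time. The sketch below is illustrative only: the `(dest, sources)` encoding, the explicit load of `b`, the assumption that `y` is dead after the last instruction, and the name `max_register_pressure` are all choices made for this example.

```python
def max_register_pressure(program):
    """Largest number of simultaneously live values in straight-line code.

    `program` is a list of (dest, sources) pairs. The pass walks backward,
    and at each instruction counts the values live after it plus the
    value it defines.
    """
    live = set()  # assumption: nothing is live after the last instruction
    pressure = 0
    for dest, srcs in reversed(program):
        pressure = max(pressure, len(live | {dest}))
        live = (live - {dest}) | set(srcs)
    return pressure


# Original program: a stays live from its load until y = a + x.
original = [
    ("a", []),          # a = load A
    ("b", []),          # b = load B
    ("c", ["a", "b"]),  # c = a + b
    ("d", ["b", "c"]),  # d = b + c
    ("x", ["c", "d"]),  # x = c + d
    ("y", ["a", "x"]),  # y = a + x
]

# Split version: a's range ends after c = a + b, and A is reloaded as z.
split = [
    ("b", []),          # b = load B
    ("a", []),          # a = load A
    ("c", ["a", "b"]),  # c = a + b
    ("d", ["b", "c"]),  # d = b + c
    ("x", ["c", "d"]),  # x = c + d
    ("z", []),          # z = load A
    ("y", ["z", "x"]),  # y = z + x
]

original_pressure = max_register_pressure(original)
split_pressure = max_register_pressure(split)
print("pressure without splitting:", original_pressure)
print("pressure with splitting:", split_pressure)
```

In this model the original program needs three registers and the split version needs two, at the cost of one extra load.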

25 Research in Register Allocation
How to do a good allocation
  which variables to split
  which values to spill
How to do it fast?
  Graph-coloring is expensive
  Just-in-Time compilation
“Solved”

26 Case 2: Instruction Scheduling
Another classical problem
  X = A * B * C;
  Y = D * E * F;
naïve translation
  R = A * B;
  X = R * C;
  S = D * E;
  Y = S * F;
  Pipeline Stall (if mult. takes 2 cycles)
smart compilation
  R = A * B;
  S = D * E;
  X = R * C;
  Y = S * F;
Also done in hardware (out-of-order)
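The effect of the two orderings can be checked with a toy pipeline model. The model is an assumption made for this sketch, not a description of any real processor: one instruction issues per cycle, in order, every multiply takes 2 cycles, inputs A to F are available from the start, and an instruction waits until its operands are ready. The name `total_cycles` is hypothetical.

```python
def total_cycles(program, latency=2):
    """Cycles until the last result is ready on a toy in-order pipeline.

    `program` is a list of (dest, sources) pairs. One instruction issues
    per cycle; an instruction stalls until all of its sources are ready.
    Values that are never defined in the program are ready at cycle 0.
    """
    ready = {}      # value name -> cycle at which it becomes available
    next_slot = 0   # earliest cycle at which the next instruction may issue
    finish = 0
    for dest, srcs in program:
        issue = max([next_slot] + [ready.get(s, 0) for s in srcs])
        ready[dest] = issue + latency
        next_slot = issue + 1
        finish = max(finish, ready[dest])
    return finish


naive = [
    ("R", ["A", "B"]),
    ("X", ["R", "C"]),  # must wait for R: stall
    ("S", ["D", "E"]),
    ("Y", ["S", "F"]),  # must wait for S: stall
]

smart = [
    ("R", ["A", "B"]),
    ("S", ["D", "E"]),  # independent work fills the wait for R
    ("X", ["R", "C"]),
    ("Y", ["S", "F"]),
]

naive_cycles = total_cycles(naive)
smart_cycles = total_cycles(smart)
print("naive order:", naive_cycles, "cycles")
print("smart order:", smart_cycles, "cycles")
```

Under this model the naïve order finishes in 7 cycles and the reordered one in 5, because the independent multiply is issued while the first result is still in flight.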

27 Research in Instruction Scheduling
Not much anymore for speed/parallelism
  beaten to death
  hardware does it for you
Remains interesting in specific contexts
  faster methods for JIT
  energy optimization
  “predictable” execution
  in-order cores, VLIW, etc.

28 Case 1+2: Phase Ordering
Yet another classical problem
  practically no solution
Given optimization A and B
  A after B vs A before B
  which order is better?
  can you solve the problem globally?
Parallelism requires more memory
  trade-off: register pressure vs parallelism

29 Job Market
Where do they work?
  IBM
  Mathworks
  Amazon
  Xilinx
  start-ups
Many opportunities in France
  Grenoble
  Many start-ups

