University of Massachusetts, Amherst, Department of Computer Science. John Cavazos, Architecture and Language Implementation Lab. Thesis Seminar.


1 Learning for Optimizing Compilers
John Cavazos
Architecture and Language Implementation Lab
Department of Computer Science, University of Massachusetts, Amherst
Thesis Seminar

2 Motivation
- Compiler writers have a difficult task:
  - optimizations are NP-hard
  - computer architectures are complex
  - computer architects need rapid evaluation
- Generating heuristics manually is slow, complicated, and ad hoc.

3 Propose Supervised Learning
- Induces heuristics automatically
- Training examples: <a, b, c, …, z, label>
  - a, b, c, …, z: properties of the problem
  - label: proper decision to make
- Two objectives:
  - Minimize error
  - Prefer less complicated function
- LOCO (Learning for Optimizing COmpilers)
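The two objectives on this slide can be illustrated with a small sketch. It assumes a training instance is a feature tuple plus a label, and that "prefer less complicated function" is modeled as a small penalty per rule condition. The names `Instance`, `Candidate`, `select`, and the `penalty` value are hypothetical and are not taken from LOCO itself.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Features = Tuple[float, ...]


@dataclass(frozen=True)
class Instance:
    features: Features  # a, b, c, ..., z: properties of the problem
    label: str          # the proper decision to make


@dataclass(frozen=True)
class Candidate:
    predict: Callable[[Features], str]  # candidate heuristic function
    complexity: int                     # assumed measure: number of conditions


def error_rate(candidate: Candidate, data: Sequence[Instance]) -> float:
    """Fraction of training instances the candidate labels incorrectly."""
    wrong = sum(candidate.predict(i.features) != i.label for i in data)
    return wrong / len(data)


def select(candidates: Sequence[Candidate], data: Sequence[Instance],
           penalty: float = 0.01) -> Candidate:
    """Pick the candidate with the lowest error, penalizing complexity."""
    return min(candidates,
               key=lambda c: error_rate(c, data) + penalty * c.complexity)
```

With two candidates that fit the training data equally well, the penalty makes `select` return the one with fewer conditions.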

4 Benefits of Supervised Learning
- Heuristic construction sped up
- Determines relative importance of features
- Effective heuristics
  - Comparable to hand-tuned heuristics
- Theoretically sound
  - Traditional approach ad hoc

5 Taxonomy of Compiler Heuristics
1. What order to apply optimizations: phase-ordering heuristics
2. When to optimize: filters
3. Which optimization algorithm to apply: hybrid optimizations
4. How to optimize: priority functions

6 The LOCO Methodology
- Determine class of heuristic
- Generate raw data
  - Instrument compiler
- Process raw data
  - Thresholds
  - Generates training data
- Induce heuristic
- Integrate into compiler

7 The LOCO Methodology
[Diagram labels: Instrumented Compiler (generate raw learning data); Process raw data (thresholding); Training Set; Supervised Learning (rule induction, induces heuristic); Production Compiler; LOCO]

8 Experimental Setup
- Java JIT compiler: Jikes RVM 2.0.2
- PowerPC: 533 MHz G4, model 7410
- Case Study 1: SPEC JVM benchmarks
- Case Study 2: Scientific benchmarks
  - Scheduling improves by 4% or more

9 Case Study 1: Hybrid Register Allocation

10 Motivation
- Register allocation: important
  - Effective use of registers
- Different algorithms to choose from
  - Graph coloring: possibly expensive
  - Linear scan: not always effective
- Which algorithm to apply?

11 Solution
- Features predict which algorithm to use
- Heuristic function controls allocator
  - Reduces cost significantly
  - Retains most benefit
- Successful with simple features
- Applicable to other optimizations

12 Hybrid Register Allocation

13 Features of Methods
Features                         | Meaning
---------------------------------|---------------------------------------------------------
Out, In, and Exception Out Edges | Out, in, and exception out edges in CFG (total, avg)
Live on Entry, Live on Exit      | Number of edges live on entry and exit (total, min, max)
Insts and Blocks                 | Number of instructions and blocks in method (total)
Block size                       | Size of blocks (max, min, avg)
Intervals                        | Number of live intervals (max, total, avg)
Symbolics                        | Number of symbolics (total, avg)

14 Hybrid Register Allocation

15 Inducing Heuristic Controller
- For each block generate raw training data
  - Features of method
  - Additional spills incurred
  - Cost of allocation algorithms
- Process raw data to generate training set
- Leave-one-out cross-validation
- Output of LOCO = heuristic controller
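A minimal sketch of leave-one-out cross-validation follows, assuming the unit left out is a whole benchmark (the slide does not say what is held out). The `induce` argument stands for any learner. `induce_majority` is a deliberately trivial stand-in, not the rule induction used in the talk, and all names are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Instance = Tuple[Tuple[float, ...], str]        # (features, label)
Heuristic = Callable[[Tuple[float, ...]], str]  # features -> decision


def leave_one_out(data_by_benchmark: Dict[str, List[Instance]],
                  induce: Callable[[List[Instance]], Heuristic]
                  ) -> Dict[str, float]:
    """Train on all benchmarks but one, then test on the held-out one."""
    accuracy = {}
    for held_out, test_set in data_by_benchmark.items():
        training = [inst
                    for name, insts in data_by_benchmark.items()
                    if name != held_out
                    for inst in insts]
        heuristic = induce(training)
        correct = sum(heuristic(f) == label for f, label in test_set)
        accuracy[held_out] = correct / len(test_set) if test_set else 0.0
    return accuracy


def induce_majority(training: List[Instance]) -> Heuristic:
    """Stand-in learner: always predict the most common training label."""
    most_common = Counter(label for _, label in training).most_common(1)[0][0]
    return lambda features: most_common
```

Holding out an entire benchmark means a heuristic is never tested on code it was trained on, which is the point of the technique.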

16 Labeling Training Instances
- Two factors:
  - Cost of register allocation
  - Spill benefit of different allocators
- Prefer graph coloring
  - If benefit above threshold
- Prefer linear scan
  - If graph coloring cost above threshold
  - No spill benefit

17 Motivation for Threshold Technique
- Noise reduction technique
- Simplifies learning
  - Removes cases of fine distinction
  - Separation by a threshold gap
- For example: T = 10%
  - Model estimates improvement by 10%
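The threshold gap can be sketched as a labeling function that drops instances falling between "clearly beneficial" and "not beneficial". The exact boundaries used here (improvement at or above T is positive, improvement at or below zero is negative) are an assumption for illustration, as are the function and label names.

```python
from typing import Optional


def label_with_gap(baseline_cycles: float, optimized_cycles: float,
                   threshold: float = 0.10) -> Optional[str]:
    """Label an instance, or return None when it falls inside the gap."""
    if baseline_cycles <= 0:
        raise ValueError("baseline_cycles must be positive")
    improvement = (baseline_cycles - optimized_cycles) / baseline_cycles
    if improvement >= threshold:
        return "optimize"   # clearly beneficial
    if improvement <= 0:
        return "skip"       # no benefit at all
    return None             # fine distinction: drop from the training set
```

For example, with T = 10% a block estimated at 100 cycles before and 95 cycles after optimization falls in the gap and is not used for training.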

18 Thresholding
[Figure labels: Cost Threshold (0.5); Spill Threshold (8192); regions: Graph Coloring, Linear Scan, No Instance]

19 Labeling Training Instances
    if (LS_Spill - GC_Spill > Spill_Threshold)
        print "GC";                  // high spill benefit
    else if (LS_Cost / GC_Cost > Cost_Threshold)
        print "LS";                  // high cost
    else if (LS_Spill - GC_Spill <= 0)
        print "LS";                  // no spill benefit
    else
        { /* no label */ }           // skip training instance
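The labeling rule on this slide can be written as a small runnable function. This sketch copies the comparisons, including their directions, from the slide as transcribed. The default thresholds are the values shown on the Thresholding slide (8192 and 0.5). The function name, parameter names, and the zero-cost guard are hypothetical additions.

```python
from typing import Optional


def label_instance(ls_spill: float, gc_spill: float,
                   ls_cost: float, gc_cost: float,
                   spill_threshold: float = 8192,
                   cost_threshold: float = 0.5) -> Optional[str]:
    """Return "GC", "LS", or None (instance is skipped)."""
    if gc_cost <= 0:
        raise ValueError("gc_cost must be positive")  # guard added for the sketch
    spill_benefit = ls_spill - gc_spill
    if spill_benefit > spill_threshold:
        return "GC"   # high spill benefit
    if ls_cost / gc_cost > cost_threshold:
        return "LS"   # high cost
    if spill_benefit <= 0:
        return "LS"   # no spill benefit
    return None       # no label: skip this training instance
```

The order of the tests matters: a large spill benefit selects graph coloring before cost is considered at all.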

20 Threshold Example

21 Spill Loads (Opt Level 3, 8 Regs)

22 Benchmark Running Times (Opt Level 3, 8 Regs)

23 Register Allocation Stats (Opt Level 3, 8 Regs)
REG ALG | Run Time | Allocation Cost
--------|----------|----------------
GC      | 91.9%    | 100%
B0C0    | 93.4%    | 83.0%
B8kC0   | 93.1%    | 71.2%
B64kC0  | 93.7%    | 66.7%
B0C50   | 93.3%    | 82.4%
B8kC50  | 94.0%    | 40.9%
B64kC50 | 96.6%    | 27.9%
LS      | 100%     | 13.0%
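The table can be read as a trade-off between running time and allocation cost. The sketch below assumes, based only on which rows read 100%, that run time is normalized to linear scan (LS) and allocation cost to graph coloring (GC). It computes each configuration's reduction in percentage points. The function name is hypothetical.

```python
# Rows copied from the slide: (configuration, run time %, allocation cost %).
STATS = [
    ("GC", 91.9, 100.0),
    ("B0C0", 93.4, 83.0),
    ("B8kC0", 93.1, 71.2),
    ("B64kC0", 93.7, 66.7),
    ("B0C50", 93.3, 82.4),
    ("B8kC50", 94.0, 40.9),
    ("B64kC50", 96.6, 27.9),
    ("LS", 100.0, 13.0),
]


def summarize(stats):
    """Map each configuration to (run-time reduction vs. LS,
    allocation-cost reduction vs. GC), in percentage points."""
    run_ls = next(run for name, run, _ in stats if name == "LS")
    cost_gc = next(cost for name, _, cost in stats if name == "GC")
    return {name: (round(run_ls - run, 1), round(cost_gc - cost, 1))
            for name, run, cost in stats}
```

Under that reading, `summarize(STATS)["B8kC50"]` gives a 6.0-point run-time reduction and a 59.1-point allocation-cost reduction.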

24 Register Allocation Cost (Opt Level 3, 8 Regs)

25 Hybrid Register Allocation is Successful
- Significantly reduce register allocation time
  - Reduced allocation time by 60%
- Preserve benefit of graph coloring
  - Achieved 93% of graph coloring benefit
- LOCO effective for this heuristic

26 Case Study 2: Instruction Scheduling Filters

27 Motivation
- Instruction scheduling: important
  - Improvements over 15%
- But:
  - Expensive
  - Frequently not beneficial
- Problem: Can we predict which blocks benefit from scheduling?

28 Solution
- Features of block predict when to schedule
- Heuristic controls scheduling
  - Reduces cost of scheduling
  - Retains benefit of scheduling
- Successful with simple features
- Filter for applying scheduler

29 An Optimization Filter

30 Features of Block
Features                          | Kind            | Meaning
----------------------------------|-----------------|------------------------------------------------
BBLen                             | Block size      | Number of instructions
Load, Store, Branch, Call, Return | Operation       | Fraction of that type of instruction
Integer, Float, System            | Functional unit | Fraction of instructions that execute on that FU
PEI, GC, Yield, Thread            | Hazard          | Fraction of that type of hazard instruction
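These features are cheap because one pass over the block suffices. The sketch below assumes a hypothetical block representation in which each instruction is a set of category tags. The tag names mirror the table, but the representation and the function name are not taken from Jikes RVM.

```python
from typing import Dict, List, Set

# Category tags mirroring the table: operation, functional unit, hazard.
CATEGORIES = ("load", "store", "branch", "call", "return",
              "integer", "float", "system",
              "pei", "gc", "yield", "thread")


def block_features(block: List[Set[str]]) -> Dict[str, float]:
    """Compute BBLen plus the fraction of instructions in each category."""
    n = len(block)
    features = {"bblen": float(n)}
    for category in CATEGORIES:
        count = sum(category in inst for inst in block)
        features[category] = count / n if n else 0.0
    return features
```

An instruction can carry several tags at once, for example a load that executes on the integer unit and is also a potentially excepting instruction (PEI).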

31 Inducing a Filter
- Construct cheap-to-compute features of a block
- Obtain training instances that include:
  - Features of the block
  - Labels (scheduling benefit to block)
- Induce a filter using LOCO
  - We used rule induction
- Use the filter to control when compiler schedules
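As a rough illustration of inducing and then using a filter, the sketch below learns a single rule of the form "schedule if feature >= threshold" by exhaustive search. This one-condition learner is a simplified stand-in for the rule induction mentioned on the slide, and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]


def induce_stump(instances: List[Tuple[Features, bool]]) -> Tuple[str, float]:
    """Find the (feature, threshold) pair with the fewest training errors
    for the rule: schedule the block if features[feature] >= threshold."""
    best = None  # (errors, feature name, threshold)
    for name in sorted(instances[0][0]):
        for threshold in sorted({f[name] for f, _ in instances}):
            errors = sum((f[name] >= threshold) != label
                         for f, label in instances)
            if best is None or errors < best[0]:
                best = (errors, name, threshold)
    _, name, threshold = best
    return name, threshold


def make_filter(name: str, threshold: float) -> Callable[[Features], bool]:
    """Turn the induced rule into a predicate the compiler can call."""
    return lambda features: features[name] >= threshold
```

In the compiler, the predicate would guard the scheduler, so that a block is scheduled only when the filter returns true.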

32 Block Timing Estimator
- Estimate of cycles to execute block
- Simple model of real machine
- Determines cost of block in isolation
- Relative cycle differences important
  - Not absolute cycle counts
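To show why relative differences are enough, here is a toy estimator. It assumes a single-issue, in-order machine where an instruction stalls until its operands are ready. This machine model, the latencies, and the instruction names are invented for illustration and are not the estimator described in the talk.

```python
from typing import Dict, List, Sequence


def estimate_cycles(order: Sequence[str],
                    latency: Dict[str, int],
                    deps: Dict[str, List[str]]) -> int:
    """Estimate cycles for one block in isolation.

    Issues at most one instruction per cycle, in the given order. Each
    instruction waits for the results of the instructions it depends on,
    which must appear earlier in the order.
    """
    ready: Dict[str, int] = {}  # instruction -> cycle its result is available
    cycle = 0                   # earliest cycle the next instruction can issue
    for inst in order:
        start = max([cycle] + [ready[d] for d in deps.get(inst, [])])
        ready[inst] = start + latency[inst]
        cycle = start + 1
    return max(ready.values(), default=0)


LATENCY = {"ld1": 3, "add1": 1, "ld2": 3, "add2": 1}
DEPS = {"add1": ["ld1"], "add2": ["ld2"]}
unscheduled = estimate_cycles(["ld1", "add1", "ld2", "add2"], LATENCY, DEPS)
scheduled = estimate_cycles(["ld1", "ld2", "add1", "add2"], LATENCY, DEPS)
```

Here the unscheduled order takes 8 estimated cycles and the scheduled order 5. Only the relative difference (37.5%) would feed into labeling, so the model does not need to predict absolute cycle counts accurately.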

33 Labeling using Thresholds

34 Running Time with Filtering

35 Running Time with Filtering

36 Running Time with Filtering

37 Scheduling Time with Filtering

38 Scheduling Time with Filtering

39 Filtering Statistics

40 Filters are Successful
- Significantly reduce scheduling time
  - Reduced scheduling time by 75%
- Preserve benefit of scheduling
  - Achieved 93% of scheduling benefit
- LOCO effective for this heuristic

41 Related Work
- Supervised learning
  - Loop unrolling and tiling
- Genetic algorithms
  - Hyperblocks, register allocation, prefetching (MIT)
  - Application-specific compilation strategy (Rice)
- Reinforcement learning
  - Used to induce heuristic for scheduling (UMass)
- We argue LOCO is better

42 Future Work
- More work on filters
  - Inlining and SSA-based opts
- More work on hybrid optimizations
  - Garbage collection
- More work on priority functions
  - Register allocation spill heuristic
- Use LOCO anywhere a heuristic is used

43 Conclusion
- LOCO effective at constructing heuristics
  - Faster than most alternatives
- LOCO can lead to insights
  - More readable than other alternatives
- LOCO heuristics competitive
  - Comparable to hand-tuned heuristics
- LOCO easier to use

44 Spill Loads (Opt Level 1, 8 Regs)

45 Register Allocation Cost (Opt Level 1, 8 Regs)

46 Benchmark Running Times (Opt Level 1, 8 Regs)

