University of Massachusetts, Amherst, Department of Computer Science. Emery Berger. Advanced Compilers, CMPSCI 710.


1 Advanced Compilers, CMPSCI 710, Spring 2003: Balanced Scheduling
Emery Berger, University of Massachusetts, Amherst

2 Topics
Last time: instruction scheduling (Gibbons & Muchnick)
This time: balanced scheduling (Kerns & Eggers)

3 List Scheduling, Redux
Build the dependence dag.
Choose instructions from the ready list.
Schedule using heuristics [Gibbons & Muchnick]:
  Instruction with greatest latency
  Instruction with most successors
  Instruction on critical path

4 Fly in the Ointment
When scheduling loads, the scheduler assumes a hit in the primary cache.
On older architectures this makes sense: execution stalls on a cache miss.
But newer architectures are nonblocking: the processor executes other instructions while a load is in progress.
Good, because it creates more ILP, but…

5 Scheduling Options
Now what?
Assume a cache miss takes N cycles (N is typically 10 or more).
Do we schedule a load:
  Anticipating a 1-cycle delay (a hit)? Optimistic.
  Or an N-cycle delay (a miss)? Pessimistic.

6 Optimistic vs. Pessimistic
Optimistic: fine for hits, inferior for misses
Pessimistic: fine for hits, better for misses
Optimistic schedule: L0 X2 X1 X3 X4
Pessimistic schedule: L0 X2 X3 X1 X4
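The two schedules on this slide can be compared with a small stall counter. This is only a sketch: the dependence dag from the slide's figure is not in the transcript, so the dag below (X1 uses L0; X3 uses X2; X4 uses X1 and X3) is an assumption chosen to be consistent with the two orderings, and the machine model (in-order, single-issue, one-cycle non-load instructions) and the helper names `stall_cycles` and `latencies` are also hypothetical.

```python
def stall_cycles(schedule, preds, latency):
    """Count interlock cycles on an in-order, single-issue machine.

    schedule: instruction names in issue order
    preds:    name -> list of instructions it depends on
    latency:  name -> cycles until that instruction's result is usable
    """
    issue = {}   # name -> cycle at which the instruction issued
    cycle = 0    # next free issue slot
    stalls = 0
    for instr in schedule:
        # Earliest cycle at which all operands are available.
        ready = max((issue[p] + latency[p] for p in preds.get(instr, ())),
                    default=0)
        start = max(cycle, ready)
        stalls += start - cycle   # idle slots spent waiting for operands
        issue[instr] = start
        cycle = start + 1
    return stalls


# Assumed dag (hypothetical; the slide's figure is not available).
PREDS = {"X1": ["L0"], "X3": ["X2"], "X4": ["X1", "X3"]}
OPTIMISTIC = ["L0", "X2", "X1", "X3", "X4"]
PESSIMISTIC = ["L0", "X2", "X3", "X1", "X4"]


def latencies(load_latency):
    """One-cycle latency for everything except the load L0."""
    return {"L0": load_latency, "X1": 1, "X2": 1, "X3": 1, "X4": 1}


if __name__ == "__main__":
    for load_latency in (1, 3, 10):
        lat = latencies(load_latency)
        print(load_latency,
              stall_cycles(OPTIMISTIC, PREDS, lat),
              stall_cycles(PESSIMISTIC, PREDS, lat))
```

Under these assumptions both schedules run without stalls on a hit, and the pessimistic ordering loses fewer cycles on a miss because one more independent instruction sits between L0 and X1.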

7 Optimistic vs. Pessimistic, Multiple Loads
Optimistic: better for hits, same for misses
Pessimistic: worse for hits, same for misses
Optimistic schedule: L1 X1 L2 X2 X3
Pessimistic schedule: L1 X1 X2 L2 X3

8 Balanced Scheduling
Key insights:
  No fixed estimate of memory latency is best.
  Schedule based on the parallelism available in the code (load-level parallelism).
Balanced scheduling:
  Computes each load's weight separately.
  Takes other possible instructions into account.
  Spaces out loads, using available instructions as "filler".

9 Balanced Scheduling, Example
Maximizes the distance between L0 and X1, which is good in case of a miss.
Balanced schedule: L0 X2 X3 X1 X4

10 Balanced Scheduling, Example
W: load instruction weight
W=5: over-estimate (greedy schedule)
W=1: under-estimate (lazy schedule)
Balanced scheduler: W=3 (= load-level parallelism)

11 Balanced Scheduling, Results
Always achieves fewest interlocks

12 Algorithm Idea
Examine each instruction i in the dag.
Determine which loads can run in parallel with i.
Use all (or part) of i's execution time to cover the latency of those loads.

13 Balanced Scheduling, Weight Calculation
Time complexity?

14 Balanced Scheduling, Example
Locate the longest load paths in the connected components.
Add 1/(# of loads) to each load's weight.

15 Balanced Scheduling, Example II
Consider instruction X1.
Locate the longest load paths in the connected components.
Add 1/(# of loads) to each load's weight: the "contributions of X1".
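The weight calculation described on the last few slides can be sketched as follows. This is one reading of the slide text, not the original pseudocode (the algorithm figure is not in the transcript): for each instruction i, take the instructions that are neither ancestors nor descendants of i, split them into connected components, find the largest number of loads on any path in each component, and add 1/(that number) to every load in the component. The starting weight of 1 per instruction, the function names, and the example dag are assumptions.

```python
from collections import defaultdict


def max_loads_on_path(component, succ, loads):
    """Largest number of loads on any dag path that stays inside component."""
    memo = {}

    def best(n):
        if n not in memo:
            tail = max((best(m) for m in succ[n] if m in component), default=0)
            memo[n] = (1 if n in loads else 0) + tail
        return memo[n]

    return max(best(n) for n in component)


def reachable(start, adjacency):
    """All nodes reachable from start (excluding start itself)."""
    seen, stack = set(), [start]
    while stack:
        for m in adjacency[stack.pop()]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return seen


def balanced_weights(nodes, edges, loads):
    succ, pred = defaultdict(set), defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
        pred[b].add(a)

    weight = {n: 1.0 for n in nodes}  # assumed starting weight of 1
    for i in nodes:
        # Instructions that may execute in parallel with i.
        unvisited = set(nodes) - reachable(i, succ) - reachable(i, pred) - {i}
        while unvisited:
            # Grow one connected component, ignoring edge direction.
            component, stack = set(), [unvisited.pop()]
            while stack:
                n = stack.pop()
                component.add(n)
                for m in (succ[n] | pred[n]) & unvisited:
                    unvisited.discard(m)
                    stack.append(m)
            chances = max_loads_on_path(component, succ, loads)
            if chances:
                for load in component & loads:
                    weight[load] += 1.0 / chances  # contribution of i
    return weight


# Hypothetical dag: L0 -> X1 -> X4 and X2 -> X3 -> X4, with L0 the only load.
NODES = ["L0", "X1", "X2", "X3", "X4"]
EDGES = [("L0", "X1"), ("X1", "X4"), ("X2", "X3"), ("X3", "X4")]

if __name__ == "__main__":
    print(balanced_weights(NODES, EDGES, loads={"L0"}))
```

In this assumed dag only X2 and X3 are independent of L0, and each contributes 1 to its weight. As for the slide's time-complexity question, this sketch does a reachability pass and a component pass per instruction, so it is roughly quadratic in the size of the dag.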

16 Balanced Scheduling, All Weights

17 Balanced Scheduling Algorithm
After computing weights, perform list scheduling where:
  Priority = weight plus max priority of successors
  Break ties:
    Largest delta between consumed & defined registers
    Rank based on successors in dag that would be exposed
    Select instruction generated earliest
Bottom-up scheduler: reverse order, schedule from leaves toward roots

18 Balanced Scheduling, Example I
Balanced schedule: L0 X2 X3 X1 X4

19 Balanced Scheduling, Example II

20 Limitations
Performed after register allocation
  But: introduces false dependences
  Reuse of registers ⇒ dag has extra edges
  Can be fixed with software register renaming
Had to modify gcc's RTL
Approach required manual pipelining
  Profile-based feedback…
Benchmark based on FORTRAN converted to C with f2c
  Can't disambiguate memory
  Adds many edges to dag
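The point about register reuse can be illustrated with a small dependence builder. This is a generic sketch, not the gcc implementation: instructions are hypothetical (name, destination register, source registers) triples, only register dependences are tracked, and the two four-instruction sequences are invented for illustration.

```python
def dependence_edges(instrs):
    """Return (producer, consumer, kind) register dependences.

    kind is "true" (read after write), "anti" (write after read),
    or "output" (write after write).
    """
    last_write = {}   # register -> most recent writer
    readers = {}      # register -> readers since that write
    edges = set()
    for name, dest, srcs in instrs:
        for reg in srcs:
            if reg in last_write:
                edges.add((last_write[reg], name, "true"))
            readers.setdefault(reg, []).append(name)
        if dest is not None:
            for reader in readers.get(dest, []):
                if reader != name:
                    edges.add((reader, name, "anti"))
            if dest in last_write:
                edges.add((last_write[dest], name, "output"))
            last_write[dest] = name
            readers[dest] = []
    return edges


def false_edges(instrs):
    """Edges that exist only because a register name was reused."""
    return {e for e in dependence_edges(instrs) if e[2] != "true"}


# After register allocation: the second load reuses r1.
REUSED = [
    ("L0", "r1", ()),
    ("X1", "r2", ("r1",)),
    ("L1", "r1", ()),
    ("X2", "r3", ("r1",)),
]

# Same code with the second load renamed to a fresh register r4.
RENAMED = [
    ("L0", "r1", ()),
    ("X1", "r2", ("r1",)),
    ("L1", "r4", ()),
    ("X2", "r3", ("r4",)),
]

if __name__ == "__main__":
    print(sorted(false_edges(REUSED)))
    print(sorted(false_edges(RENAMED)))
```

In the reused version the second load is ordered after both L0 and X1, so it cannot be hoisted to overlap with the first load. Renaming removes those two edges and leaves only the true dependences.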

21 "Workaround": Simulate Fortran
Modify code to avoid aliases
Improves results, but incorrect!
Needs advanced alias analysis

22 Empirical Results
Evaluated using simulation
3% to 18% improvement over regular scheduler across different models
Mean: 9.9%
Unfortunately: no results presented without the above-mentioned modifications…

23 Conclusion
Balanced scheduling:
  Spreads out instructions to cover load latency
  Based on exploitable load-level parallelism
  Effective at improving performance
    Modulo methodological limitations…
  Not so great for C/C++, possibly useful for Java
Next time: interprocedural analysis
ACDI: Ch. 19, pp. 607-636, 641-656

