Hyesoon Kim, Onur Mutlu, Jared Stark*, Yale N. Patt


1 Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution
Hyesoon Kim, Onur Mutlu, Jared Stark*, Yale N. Patt
Electrical and Computer Engineering, The University of Texas at Austin
*Oregon Microarchitecture Lab, Intel Corporation

2 Talk Outline
Problem
Wish Branches
Experimental Methodology
Results
Conclusion

3 Predicated Execution
Source code:
    if (cond) { b = 0; } else { b = 1; }
Normal branch code (A branches to C when taken (T) and falls through to B when not taken (N)):
    A:          p1 = (cond)
                branch p1, TARGET
    B:          mov b, 1
                jmp JOIN
    C: TARGET:  mov b, 0
    D: JOIN:    add x, b, 1
Predicated code:
    A:  p1 = (cond)
    B:  (!p1) mov b, 1
    C:  (p1) mov b, 0
    D:  add x, b, 1
Predication converts a control flow dependency into a data dependency.
Pro: Eliminates hard-to-predict branches.
Cons: (1) Blocks B and C are fetched all the time. (2) Dependent instructions wait until p1 is resolved.
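
As a rough illustration of the if-conversion on this slide, the following Python sketch models both code versions and records which basic blocks each one fetches. The function names and the returned block lists are assumptions made for illustration; the slide itself only gives the pseudo-assembly.

```python
def branch_version(cond):
    """Normal branch code: only one of blocks B or C is fetched."""
    fetched = ["A"]
    p1 = cond                  # p1 = (cond)
    if p1:                     # branch p1, TARGET
        fetched.append("C")
        b = 0                  # TARGET: mov b, 0
    else:
        fetched.append("B")
        b = 1                  # mov b, 1 ; jmp JOIN
    fetched.append("D")
    x = b + 1                  # add x, b, 1
    return x, fetched


def predicated_version(cond):
    """Predicated code: B and C are always fetched; an instruction whose
    predicate is false has no effect."""
    fetched = ["A", "B", "C", "D"]
    p1 = cond                  # p1 = (cond)
    b = None
    if not p1:                 # (!p1) mov b, 1
        b = 1
    if p1:                     # (p1) mov b, 0
        b = 0
    x = b + 1                  # add x, b, 1 (depends on p1 through b)
    return x, fetched
```

Both versions compute the same value of x; the predicated one trades the branch for always fetching B and C.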

4 The Overhead of Predicated Execution
[Figure residue: chart labeled "non-predicated"; annotated values: -2%, 16%, 13%]
Predicated code:
    A:  p1 = (cond)
    B:  (!p1) mov b, 1
    C:  (p1) mov b, 0
    D:  add x, b, 1
Same code with the predicate value substituted (p1 = 1):
    A:  p1 = (cond)
    B:  (0) mov b, 1
    C:  (1) mov b, 0
If all overhead were ideally eliminated, predicated execution would improve average execution time by 16%.

5 The Problem
Due to the predication overhead, predicated execution sometimes reduces performance.
Branch misprediction characteristics depend on run-time behavior: input set, control-flow path, and phase behavior.
The compiler cannot accurately estimate the run-time behavior of branches.

6 Talk Outline
Problem
Wish Branches
Experimental Methodology
Results
Conclusion

7 Wish Branches
A new type of control flow instruction, with three types: wish jump, wish join, and wish loop.
The compiler generates code (with wish branches) that can be executed either as predicated code or as non-predicated code (normal branch code).
The hardware decides at run-time whether to execute predicated code or normal branch code, based on the confidence of the branch prediction.
Easy to predict: normal branch code.
Hard to predict: predicated code.

8 Wish Jump/Join
Normal branch code:
    A:          p1 = (cond)
                branch p1, TARGET
    B:          mov b, 1
                jmp JOIN
    C: TARGET:  mov b, 0
    D: JOIN:
Predicated code:
    A:  p1 = (cond)
    B:  (!p1) mov b, 1
    C:  (p1) mov b, 0
    D:
Wish jump/join code:
    A:          p1 = (cond)
                wish.jump p1 TARGET
    B:          (!p1) mov b, 1
                wish.join !p1 JOIN
    C: TARGET:  (p1) mov b, 0
    D: JOIN:
High confidence: the wish jump is predicted like a normal branch (taken or not-taken). On the not-taken path, block B runs and the wish join is taken. Predicates are treated as true:
    B:          (1) mov b, 1
                wish.join (1) JOIN
    C: TARGET:  (1) mov b, 0
Low confidence: the wish jump and wish join behave as nops, so blocks B and C both execute as predicated code.
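
A minimal sketch of the run-time choice described on this slide, written as a behavioral model in Python. The function name, the argument names, and the returned tuple are hypothetical. The flush on a high-confidence misprediction is an assumption that follows from that mode behaving like normal branch code.

```python
def wish_jump_join(cond, predicted_taken, high_confidence):
    """Model one execution of the wish jump/join hammock.

    Returns (b, fetched_blocks, pipeline_flush).
    """
    p1 = cond                                  # p1 = (cond)
    if high_confidence:
        # Executed like normal branch code: follow the prediction.
        if predicted_taken:
            fetched = ["A", "C", "D"]          # wish.jump taken to TARGET
        else:
            fetched = ["A", "B", "D"]          # wish.join taken to JOIN
        flush = predicted_taken != p1          # misprediction costs a flush
    else:
        # Executed as predicated code: both sides are fetched,
        # so the branch outcome cannot cause a flush.
        fetched = ["A", "B", "C", "D"]
        flush = False
    b = 0 if p1 else 1                         # architectural result is mode-independent
    return b, fetched, flush
```

The point of the model is that the value of b never depends on the mode; only the fetched blocks and the flush cost do.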

9 Wish Loop
Source code:
    do { a++; i++; } while (i<N);
Normal backward branch code:
    H:
    X: LOOP:  add a, a, 1
              add i, i, 1
              p1 = (i<N)
              branch p1, LOOP
    Y: EXIT:
Wish loop code:
    H:        mov p1, 1
    X: LOOP:  (p1) add a, a, 1
              (p1) add i, i, 1
              (p1) p1 = (cond)
              wish.loop p1, LOOP
    Y: EXIT:
High confidence: the wish loop is executed like a normal backward branch (taken (T) back to X, not taken (N) to Y).
Low confidence: the loop body is executed as predicated code.
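
The wish loop code can be sketched in Python as below, assuming the front-end fetches a fixed number of iterations in predicated (low-confidence) mode. The `fetched_iterations` parameter and both function names are illustrative and not part of the slides; `cond` is taken to be `i < n`, as in the normal backward branch code.

```python
def do_while_reference(a, i, n):
    """Reference semantics: do { a++; i++; } while (i < n);"""
    while True:
        a += 1
        i += 1
        if not (i < n):
            return a, i


def wish_loop_predicated(a, i, n, fetched_iterations):
    """Run the predicated loop body for `fetched_iterations` fetched copies.

    Returns (a, i, useful_iterations).
    """
    p1 = True                        # mov p1, 1
    useful = 0
    for _ in range(fetched_iterations):
        if p1:
            a += 1                   # (p1) add a, a, 1
            i += 1                   # (p1) add i, i, 1
            p1 = i < n               # (p1) p1 = (cond)
            useful += 1
        # Once p1 is false, every predicated instruction in the
        # body has no effect, so extra iterations change nothing.
    return a, i, useful
```

For example, fetching five iterations of a loop that should run three times still yields the same a and i as the reference loop, which is why extra fetched iterations do not require a pipeline flush.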

10 Mispredicted Case 1: Early-Exit
Correct execution: H X1 X2 X3 Y (loop branch outcomes: T T N)
Early-exit (low confidence): H X1 X2 Y (T N), flush pipeline, then X3 Y (N)
Compared to normal branch code: predicate data dependency and one extra instruction (-)

11 Mispredicted Case 2: Late-Exit
Correct execution: H X1 X2 X3 Y (loop branch outcomes: T T N)
Late-exit (low confidence): H X1 X2 X3 X4 X5 Y (T T T T N); the extra iterations X4 and X5 become nops
Compared to normal branch code:
pro: reduces the flush penalty (+++)
cons: predicate data dependency and one extra instruction (-)

12 Mispredicted Case 3: No-Exit
Correct execution: H X1 X2 X3 Y (loop branch outcomes: T T N)
No-exit (low confidence): H X1 X2 X3 X4 X5 X6 (T T T T T T), flush pipeline, then Y
Compared to normal branch code: predicate data dependency and one extra instruction (-)
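
The three low-confidence misprediction cases for a wish loop (early-exit, late-exit, no-exit) can be summarized with a small classifier. This is only a sketch: the function name is hypothetical, and using `None` to mean "the front-end had not yet predicted an exit when the branch resolved" is a modeling assumption.

```python
def classify_wish_loop(actual_iterations, predicted_iterations):
    """Return (case, pipeline_flush) for a low-confidence wish loop.

    predicted_iterations is the number of iterations fetched before the
    predictor exits the loop, or None if it never predicted an exit.
    """
    if predicted_iterations is None:
        return "no-exit", True        # still fetching the loop: flush
    if predicted_iterations == actual_iterations:
        return "correct", False
    if predicted_iterations < actual_iterations:
        return "early-exit", True     # left the loop too soon: flush
    return "late-exit", False         # extra iterations act as nops
```

Using the slide's example of three correct iterations, `classify_wish_loop(3, 5)` corresponds to the late-exit trace H X1 X2 X3 X4 X5 Y, the only misprediction case that avoids the flush.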

13 Advantages/Disadvantages of Wish Branches
Advantages compared to predicated execution:
Reduce the overhead of predication
Increase the benefits of predicated code by allowing the compiler to generate more aggressively predicated code
Provide a mechanism to exploit predication to reduce the branch misprediction penalty for backward branches (wish loops)
Make predicated code less dependent on machine configuration (e.g., branch predictor)

14 Advantages/Disadvantages of Wish Branches
Disadvantages compared to predicated execution:
Extra branch instructions use machine resources
Extra branch instructions increase the contention for branch predictor table entries
May constrain the compiler's scope for code optimizations

15 Wish Branch Support
ISA Support: predicated execution, wish branch instructions
Compiler Support: wish branch generation algorithms. The compiler needs to decide which branches are predicated, which are converted to wish branches, and which stay as normal branches.
Hardware Support: confidence estimator; front-end and branch misprediction detection/recovery module

16 Talk Outline
Problem
Wish Branches
Experimental Methodology
Results
Conclusion

17 Experimental Infrastructure
Tool flow: Source Code -> IA-64 Compiler (ORC) -> IA-64 Binary -> Trace generation module -> IA-64 Trace -> Micro-op Translator -> µops -> Micro-op Simulator
IA-64 provides full support for predication.
IA-64 traces are converted to micro-ops to simulate an out-of-order superscalar processor model.

18 Simulation Methodology
Benchmarks: nine SPEC 2000 integer benchmarks
Baseline Processor Configuration
Front End:
    Large and accurate branch predictor (64KB hybrid branch predictor: gshare + local)
    Minimum 30-cycle branch misprediction penalty
    64KB, 2-cycle latency I-cache
Execution Core:
    8-wide out-of-order processor
    512-entry instruction window
Confidence Estimator:
    1KB, tagged, 16-bit history JRS confidence estimator (Jacobsen et al., MICRO-29)
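
For readers unfamiliar with the JRS confidence estimator, the sketch below shows the general idea in Python: a table of resetting counters, indexed by the branch address combined with global history, that reports high confidence only after a run of correct predictions. The table size, counter width, threshold, index hash, and the absence of tags are all assumptions chosen for brevity; they do not reproduce the 1KB tagged configuration listed on the slide.

```python
class JRSConfidenceEstimator:
    """Simplified, untagged resetting-counter confidence estimator."""

    def __init__(self, index_bits=10, counter_bits=4, threshold=15,
                 history_bits=16):
        self.index_mask = (1 << index_bits) - 1
        self.max_count = (1 << counter_bits) - 1
        self.threshold = threshold
        self.history_mask = (1 << history_bits) - 1
        self.table = [0] * (1 << index_bits)
        self.history = 0             # global branch history register

    def _index(self, pc):
        # Hypothetical hash: XOR the PC with the history, keep the low bits.
        return (pc ^ self.history) & self.index_mask

    def is_high_confidence(self, pc):
        return self.table[self._index(pc)] >= self.threshold

    def update(self, pc, prediction_correct, taken):
        idx = self._index(pc)
        if prediction_correct:
            # Count consecutive correct predictions, saturating.
            self.table[idx] = min(self.table[idx] + 1, self.max_count)
        else:
            # A single misprediction resets the counter.
            self.table[idx] = 0
        self.history = ((self.history << 1) | int(taken)) & self.history_mask
```

In a wish-branch front-end, a result of `False` from `is_high_confidence` would select predicated execution for that branch, and `True` would select normal branch prediction.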

19 Talk Outline
Problem
Wish Branches
Experimental Methodology
Results
Conclusion

20 Performance Improvement
[Figure residue: chart labeled "non-predicated"; annotated values: -4%, 14%, 2.02, 8%, 24%]
Improvements annotated on the slide (three sets):
    16% over conditional branch prediction, 11% over selective-predication, 7% over aggressive-predication (all w/o mcf)
    14% over conditional branch prediction, 13% over selective-predication, 16% over aggressive-predication
    12% over conditional branch prediction, 11% over selective-predication, 13% over aggressive-predication
SELECTIVE-PREDICATION: branches are selectively predicated using compile-time cost-benefit analysis
AGGRESSIVE-PREDICATION: all branches that are suitable for if-conversion are predicated

21 Talk Outline
Problem
Wish Branches
Experimental Methodology
Results
Conclusion

22 Conclusion
New control flow instructions: wish branches (jump/join/loop)
Wish branches improve performance by dividing the work of predication between the compiler and the microarchitecture
    Compiler: analyzes the control-flow graph and generates code
    Microarchitecture: makes the run-time decision to use predication
Wish branches provide significant performance benefits
    16% compared to conditional branch prediction
    13% compared to selectively predicated code
Wish branches can make predicated execution more viable and effective in high-performance processors by enabling adaptive and aggressive predicated execution

