Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution
Hyesoon Kim, Onur Mutlu, Jared Stark*, Yale N. Patt
Electrical and Computer Engineering, The University of Texas at Austin
*Oregon Microarchitecture Lab, Intel Corporation

2 Talk Outline
• Problem
• Wish Branches
• Experimental Methodology
• Results
• Conclusion

3 Predicated Execution
Convert control flow dependency to data dependency.

Source code:
    if (cond) { b = 0; } else { b = 1; }

Normal branch code (A branches to C if taken, falls through to B if not taken; both paths join at D):
    A:  p1 = (cond)
        branch p1, TARGET
    B:  mov b, 1
        jmp JOIN
    C:  TARGET: mov b, 0
    D:  JOIN: add x, b, 1

Predicated code:
    A:  p1 = (cond)
    B:  (!p1) mov b, 1
    C:  (p1) mov b, 0
    D:  add x, b, 1

Pro: eliminates hard-to-predict branches.
Cons: (1) blocks B and C are fetched all the time; (2) execution must wait until p1 is resolved.
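The conversion on this slide can be sketched in a few lines of Python. This is only an illustrative model, not code from the talk: `guarded_mov`, `branch_code`, and `predicated_code` are hypothetical names, and a guarded move is modeled as "write the new value only if the predicate is true".

```python
def guarded_mov(pred, old, new):
    """Model of a predicated move: the destination changes only if pred is true."""
    return new if pred else old


def branch_code(cond):
    """Normal branch code: only one of the two moves is executed."""
    if cond:
        b = 0          # TARGET: mov b, 0
    else:
        b = 1          # mov b, 1
    return b + 1       # add x, b, 1


def predicated_code(cond, b=0):
    """Predicated code: both moves are always fetched; the predicate picks the result."""
    p1 = bool(cond)                  # p1 = (cond)
    b = guarded_mov(not p1, b, 1)    # (!p1) mov b, 1
    b = guarded_mov(p1, b, 0)        # (p1)  mov b, 0
    return b + 1                     # add x, b, 1
```

Both functions return the same value for any `cond`. The difference the slide points at is that `predicated_code` always goes through both moves and cannot finish either one until `p1` is known.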

4 The Overhead of Predicated Execution
Predicated code:
    A:  p1 = (cond)
    B:  (!p1) mov b, 1
    C:  (p1) mov b, 0
    D:  add x, b, 1
The slide also shows the same code with the predicates replaced by their values:
    p1 = (cond)
    (0) mov b, 1
    (1) mov b, 0
If all overhead is ideally eliminated, predicated execution would provide a 16% improvement in average execution time.
[Chart, baseline labeled non-predicated; recovered data labels: -2%, 13%, 16%]

5 The Problem
• Because of the predication overhead, predicated execution sometimes reduces performance.
• Branch misprediction characteristics depend on run-time behavior: input set, control-flow path, and phase behavior. The compiler cannot accurately estimate this run-time behavior of branches.

6 Talk Outline
• Problem
• Wish Branches
• Experimental Methodology
• Results
• Conclusion

7 Wish Branches
• A new type of control flow instruction, in 3 types: wish jump, wish join, and wish loop
• The compiler generates code (with wish branches) that can be executed either as predicated code or as non-predicated code (normal branch code)
• The hardware decides at run-time whether to execute predicated code or normal branch code, based on the confidence of the branch prediction
  • Easy to predict: normal branch code
  • Hard to predict: predicated code

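8 Wish Jump/Join
Normal branch code (A branches to C if taken, falls through to B if not taken; both paths join at D):
    A:  p1 = (cond)
        branch p1, TARGET
    B:  mov b, 1
        jmp JOIN
    C:  TARGET: mov b, 0
    D:  JOIN:

Predicated code:
    A:  p1 = (cond)
    B:  (!p1) mov b, 1
    C:  (p1) mov b, 0

Wish jump/join code (wish jump in A, wish join in B):
    A:  p1 = (cond)
        wish.jump p1 TARGET
    B:  (!p1) mov b, 1
        wish.join !p1 JOIN
    C:  TARGET: (p1) mov b, 0
    D:  JOIN:

High confidence: the wish jump is predicted taken or not-taken like a normal branch, and the predicates on the fetched path are shown as (1):
    B:  (1) mov b, 1
        wish.join (1) JOIN
    C:  TARGET: (1) mov b, 0
Low confidence: the slide labels this case nop; blocks B and C are both fetched and execute as predicated code.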
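The run-time choice between the two modes can be sketched as a small Python model. It is a simplification under stated assumptions, not the hardware described in the talk: `run_wish_jump_join` is a hypothetical name, blocks are just the labels A to D, and a misprediction in high-confidence mode is modeled as a flush followed by a refetch of the correct side.

```python
def run_wish_jump_join(cond, predicted_taken, high_confidence):
    """Model one dynamic instance of the wish jump/join code.

    Returns (b, fetched_blocks, flushed).
    """
    p1 = bool(cond)
    if high_confidence:
        # Behave like normal branch code: fetch only the predicted side.
        fetched = ["A", "C", "D"] if predicted_taken else ["A", "B", "D"]
        flushed = predicted_taken != p1
        if flushed:
            # The wrong side was fetched: discard it and refetch the correct side.
            fetched += ["C", "D"] if p1 else ["B", "D"]
    else:
        # Behave like predicated code: neither wish branch redirects fetch,
        # so B and C are both fetched and the predicate selects the result.
        fetched = ["A", "B", "C", "D"]
        flushed = False
    b = 0 if p1 else 1
    return b, fetched, flushed
```

In this model a low-confidence instance never flushes but always pays for fetching both B and C, while a high-confidence instance fetches one side and flushes only when the prediction turns out wrong.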

9 Wish Loop (Low Confidence)
Loop body X branches back to itself when taken (T) and exits to Y when not taken (N).
    LOOP: add a, a, 1
          add i, i, 1
          p1 = (i
[The rest of this listing is truncated in the source.]

10 Mispredicted Case 1: Early-Exit
Correct execution: X1 (T) -> X2 (T) -> X3 (N) -> Y
Early-exit (low confidence): X1 (T) -> X2 (N) -> Y, flush pipeline, then X3 (N) -> Y
Compared to normal branch code: predicate data dependency and one extra instruction (-)

11 Mispredicted Case 2: Late-Exit
Correct execution: X1 (T) -> X2 (T) -> X3 (N) -> Y
Late-exit (low confidence): X1 (T) -> X2 (T) -> X3 (T) -> X4 (T) -> X5 (N) -> Y ..., with the extra iterations X4 and X5 marked nop
Compared to normal branch code:
pro: reduce flush penalty (+++)
cons: predicate data dependency and one extra instruction (-)

12 Mispredicted Case 3: No-Exit
Correct execution: X1 (T) -> X2 (T) -> X3 (N) -> Y
No-exit (low confidence): X1 (T) -> X2 (T) -> X3 (T) -> X4 (T) -> X5 (T) -> X6 (T) -> ..., flush pipeline, then Y
Compared to normal branch code: predicate data dependency and one extra instruction (-)
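The three misprediction cases for a low-confidence wish loop can be summarized in a short Python sketch. This is an illustrative classification only: `classify_wish_loop` and its parameters are hypothetical, and "no-exit" is modeled by passing `None` to mean the front end was still fetching the loop when the real exit was resolved.

```python
def classify_wish_loop(actual_iters, predicted_iters):
    """Classify one execution of a low-confidence wish loop.

    actual_iters:    iterations the loop really runs.
    predicted_iters: iterations fetched before the front end left the loop,
                     or None if it had not left when the exit was resolved.
    """
    if predicted_iters is None:
        # No-exit: fetch never left the loop, so the pipeline is flushed.
        return {"case": "no-exit", "flush": True, "nop_iters": 0}
    if predicted_iters == actual_iters:
        return {"case": "correct", "flush": False, "nop_iters": 0}
    if predicted_iters < actual_iters:
        # Early-exit: the missing iterations must be fetched after a flush.
        return {"case": "early-exit", "flush": True, "nop_iters": 0}
    # Late-exit: the extra iterations are predicated off and act as nops.
    return {"case": "late-exit", "flush": False,
            "nop_iters": predicted_iters - actual_iters}
```

With the slide's example of three real iterations, `classify_wish_loop(3, 2)` is the early-exit case, `classify_wish_loop(3, 5)` is the late-exit case with two nop iterations, and `classify_wish_loop(3, None)` is the no-exit case. Only the late-exit case avoids the flush, which is where the slides place the benefit.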

13 Advantages/Disadvantages of Wish Branches
Advantages compared to predicated execution:
• Reduce the overhead of predication
• Increase the benefits of predicated code by allowing the compiler to generate more aggressively predicated code
• Provide a mechanism to exploit predication to reduce the branch misprediction penalty for backward branches (wish loops)
• Make predicated code less dependent on machine configuration (e.g., the branch predictor)

14 Advantages/Disadvantages of Wish Branches
Disadvantages compared to predicated execution:
• Extra branch instructions use machine resources
• Extra branch instructions increase the contention for branch predictor table entries
• May constrain the compiler's scope for code optimizations

15 Wish Branch Support
• ISA support: predicated execution, wish branch instructions
• Compiler support: wish branch generation algorithms. The compiler needs to decide which branches are predicated, which are converted to wish branches, and which stay as normal branches
• Hardware support: confidence estimator; front-end and branch misprediction detection/recovery module

16 Talk Outline
• Problem
• Wish Branches
• Experimental Methodology
• Results
• Conclusion

17 Experimental Infrastructure
• IA-64 provides full support for predication
• Convert IA-64 traces to micro-ops to simulate an out-of-order superscalar processor model
Tool flow: Source Code -> IA-64 Compiler (ORC) -> IA-64 Binary -> Trace generation module -> IA-64 Trace -> Micro-op Translator -> µops -> Micro-op Simulator

18 Simulation Methodology
• Nine SPEC 2000 integer benchmarks
• Baseline processor configuration
  Front end:
  • Large and accurate branch predictor (64KB hybrid branch predictor: gshare + local)
  • Minimum 30-cycle branch misprediction penalty
  • 64KB, 2-cycle latency I-cache
  Execution core:
  • 8-wide out-of-order processor
  • 512-entry instruction window
  Confidence estimator:
  • 1KB tagged 16-bit history JRS confidence estimator (Jacobsen et al., MICRO-29)
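A JRS-style confidence estimator can be sketched as a table of resetting counters. The sketch below is a minimal illustration, not the configuration simulated in the talk: the table size, counter width, threshold, and the PC-XOR-history index are assumptions, tags are omitted, and `JRSConfidenceEstimator` is a hypothetical class name.

```python
class JRSConfidenceEstimator:
    """Resetting-counter confidence estimator (simplified, untagged sketch)."""

    def __init__(self, index_bits=10, history_bits=16, counter_max=15, threshold=15):
        # All sizes here are illustrative assumptions.
        self.index_mask = (1 << index_bits) - 1
        self.history_mask = (1 << history_bits) - 1
        self.counter_max = counter_max
        self.threshold = threshold
        self.table = [0] * (1 << index_bits)
        self.history = 0  # global branch outcome history

    def _index(self, pc):
        # Combine the branch address with the history to pick a counter.
        return (pc ^ self.history) & self.index_mask

    def is_high_confidence(self, pc):
        # High confidence means many correct predictions since the last miss.
        return self.table[self._index(pc)] >= self.threshold

    def update(self, pc, prediction_correct, taken):
        i = self._index(pc)  # use the same history the prediction saw
        if prediction_correct:
            self.table[i] = min(self.table[i] + 1, self.counter_max)
        else:
            self.table[i] = 0  # a misprediction resets the counter
        self.history = ((self.history << 1) | int(taken)) & self.history_mask
```

A front end using this sketch would call `is_high_confidence(pc)` when it fetches a wish branch, run normal branch code if the result is true and predicated code otherwise, and then call `update` once the branch resolves.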

19 Talk Outline
• Problem
• Wish Branches
• Experimental Methodology
• Results
• Conclusion

20 Performance Improvement
SELECTIVE-PREDICATION: branches are selectively predicated using compile-time cost-benefit analysis
AGGRESSIVE-PREDICATION: all branches that are suitable for if-conversion are predicated
Callouts on the chart, in the order they appear:
• 16% over conditional branch prediction, 11% over selective-predication, 7% over aggressive-predication (all w/o mcf)
• 14% over conditional branch prediction, 13% over selective-predication, 16% over aggressive-predication
• 12% over conditional branch prediction, 11% over selective-predication, 13% over aggressive-predication
[Chart, baseline labeled non-predicated; recovered data labels: 24%, 8%, 14%, -4%, 2.02]

21 Talk Outline
• Problem
• Wish Branches
• Experimental Methodology
• Results
• Conclusion

22 Conclusion
• New control flow instructions: wish branches (jump/join/loop)
• Wish branches improve performance by dividing the work of predication between the compiler and the microarchitecture
  • Compiler: analyzes the control-flow graph and generates code
  • Microarchitecture: makes the run-time decision to use predication
• Wish branches provide significant performance benefits
  • 16% compared to conditional branch prediction
  • 13% compared to selectively predicated code
• Wish branches can make predicated execution more viable and effective in high-performance processors by enabling adaptive and aggressive predicated execution
