The Inner Most Loop Iteration counter: a new dimension in branch history. André Seznec, Joshua San Miguel, Jorge Albericio.


1 The Inner Most Loop Iteration counter: a new dimension in branch history
André Seznec, Joshua San Miguel, Jorge Albericio

2 For 25 years, branch predictors exploit:
- Local history predictors
    while (..) { if ((X % 3) || (X % 5)) {..} X++; }
- Global history predictors
    if (X < -2) {..}
    if (X > 1) {..}
    if (X == 0) {..}

3 In practice, on real hardware
Just global history predictors
- plus a loop predictor (sometimes)
Local history is not very efficient
- CBP4: ~5% misprediction reduction
- a mess to implement

4 The messy management of speculative local history
(Figure labels: Local History Table; update at commit time; to prediction tables; window of inflight branches holding B h4, B h3, B h2, B h1; speculative history for the most recent occurrence of branch B.)
Several (many) instances of the same branch inflight: wrong history → wrong prediction

5 State-of-the-art global history predictors
Neural predictors:
- Piecewise linear, Hashed perceptron, SNAP, GEHL
TAGE-GSC:
- TAGE + a neural predictor
TAGE-GSC = TAGE-SC-L minus the local history and loop predictor components

6 Neural predictors
(Figure: PC + global history index the predictor tables, whose outputs are added; prediction = sign of the sum.)
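The adder-based scheme on this slide can be illustrated with a rough behavioural sketch. It is not the authors' implementation: the table size, the history lengths, the hash function, the counter range and the update threshold below are all assumptions chosen for readability.

```python
class NeuralPredictor:
    """Toy GEHL-style predictor: prediction = sign of a sum of signed counters."""

    def __init__(self, hist_lengths=(0, 2, 4, 8), log_size=10, threshold=8):
        self.hist_lengths = hist_lengths   # assumed global-history lengths, one per table
        self.size = 1 << log_size          # assumed number of counters per table
        self.threshold = threshold         # assumed confidence threshold for updates
        self.tables = [[0] * self.size for _ in hist_lengths]
        self.ghist = 0                     # global history, newest outcome in bit 0

    def _index(self, pc, length):
        # Hypothetical hash of the PC with the `length` most recent outcomes.
        hist = self.ghist & ((1 << length) - 1)
        return (pc ^ (hist * 0x9E3779B1) ^ length) % self.size

    def _sum(self, pc):
        return sum(t[self._index(pc, n)]
                   for t, n in zip(self.tables, self.hist_lengths))

    def predict(self, pc):
        # Predict taken when the sum of the selected counters is non-negative.
        return self._sum(pc) >= 0

    def update(self, pc, taken):
        total = self._sum(pc)
        # Train on a misprediction or when the sum is below the threshold.
        if (total >= 0) != taken or abs(total) < self.threshold:
            for t, n in zip(self.tables, self.hist_lengths):
                i = self._index(pc, n)
                t[i] = max(-32, min(31, t[i] + (1 if taken else -1)))
        self.ghist = ((self.ghist << 1) | int(taken)) & 0xFFFFFFFF
```

Calling `update(pc, taken)` after each resolved branch trains the tables; `predict(pc)` reads them. Adding another component to such a predictor amounts to adding one more counter to the sum.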

7 TAGE-GSC
(Figure: PC + global history feed the (main) TAGE predictor; its prediction + confidence feed the statistical corrector (Stat. Cor.).)
The statistical corrector is just a neural predictor with the TAGE prediction as an input

8 How predictors work
Evers98: branch B is correlated with a few past branches
- Not so many paths from correlators to B
- Try to capture every path to B
Kind of a brute-force approach

9 How to identify correlator branches
The loop predictor does it smoothly for loops
Albericio et al. 2014: correlation in multidimensional loops

10 Wormhole branch prediction
Albericio et al., MICRO 2014
- Correlation in multidimensional loops
    for (i = 0; i < Nmax; i++)
      for (j = 0; j < Mmax; j++) {
        if (A[j+i] > 0) {..}                /* j+i = Const: same output */
        if (B[i][j] - B[i-1][j] > 0) {..}   /* j = Const: weak correlation */
        if (C[j] > 0) {..}                  /* j = Const: strong correlation */
      }
Correlation with neighboring iterations, but in the previous outer iteration
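The "j+i = Const, same output" case can be checked with a few lines of simulation. This is only an illustration of the observation: the array contents are random values invented for the example, and `branch_outcomes` and `same_as_previous_outer` are hypothetical helper names.

```python
import random


def branch_outcomes(nmax, mmax, seed=0):
    """Outcome matrix Out[i][j] of the branch `if (A[j+i] > 0)`."""
    rng = random.Random(seed)
    # Made-up data: any array works, the correlation comes from the indexing.
    a = [rng.randint(-5, 5) for _ in range(nmax + mmax)]
    return [[a[j + i] > 0 for j in range(mmax)] for i in range(nmax)]


def same_as_previous_outer(out):
    """True if Out[i][j] == Out[i-1][j+1] wherever both are defined."""
    return all(out[i][j] == out[i - 1][j + 1]
               for i in range(1, len(out))
               for j in range(len(out[i]) - 1))
```

Because `A[j+i]` and `A[(j+1)+(i-1)]` are the same element, the outcome in iteration `j` repeats the outcome of the neighboring iteration `j+1` of the previous outer iteration, whatever the data.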

11 Wormhole predictor: a side predictor
Monitor hard-to-predict branches:
- in a loop with constant iteration number N (use the loop pred.)
- Monitor the local history for this branch
- Very long local history
- Predict with a few bits in the local history (from the previous outer iteration)
(Figure: local history spanning outer iteration i and outer iteration i-1, with positions J-1, J, J+1 marked at distance N.)
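To make "a few bits in the local history" concrete, the sketch below picks the bits that belong to the previous outer iteration. It assumes the branch executes exactly once per inner iteration and that the loop has a constant trip count `n`; the function name and the list representation of the history are illustrative choices, not part of the Wormhole design.

```python
def previous_outer_bits(local_history, n):
    """Select the outcomes of the previous outer iteration.

    local_history[0] is the most recent outcome of the branch, i.e. Out[i][j-1];
    local_history[k] is the outcome k+1 executions ago.
    n is the (constant) number of inner iterations per outer iteration.
    """
    return {
        "Out[i-1][j+1]": local_history[n - 2],  # n-1 executions ago
        "Out[i-1][j]": local_history[n - 1],    # n executions ago
        "Out[i-1][j-1]": local_history[n],      # n+1 executions ago
    }
```

The history must therefore be at least `n + 1` bits long, which is why the slide speaks of a very long local history.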

12 Wormhole predictor + state-of-the-art global history predictor
Captures correlation with a small number of entries
- On a few branches
- On a few benchmarks
- CBP4 traces: 2 benchmarks / 40
- CBP3 traces: 2 benchmarks / 40
But quite efficient on those traces

13 Wormhole predictor: not worth the implementation
- Requires a loop predictor
- Requires the branch to be executed on each iteration
- Unresolved issue of speculative local history management
But let us keep the seminal observation

14 Let us analyze the problem
Correlation to be captured is:
- For branches in the inner most loop
- With neighboring iterations, but in previous outer iteration(s)
Would be nice to determine the iteration number!
    for (i = 0; i < Nmax; i++)
      for (j = 0; j < Mmax; j++) {
        if (A[j+i] > 0) {..}
        if (B[i][j] - B[i-1][j] > 0) {..}
        if (C[j] > 0) {..}
      }

15 The Inner Most Loop Iteration counter
Most loops end with a conditional backward branch
…B0…B1…B3…B4…B5…B6
    if (backward) {
      if (taken) IMLIcount++;
      else IMLIcount = 0;
    }
Perfectly counts the iteration numbers for the inner most loop
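The update rule on this slide translates almost directly into code. In the sketch below, "backward" is taken to mean that the branch target is below the branch PC, and the counter saturates at `max_count`; both the function signature and the saturation value are assumptions made for the example.

```python
def update_imli(imli_count, pc, target, taken, max_count=1023):
    """Return the new IMLI counter after one conditional branch.

    A branch is treated as backward when target < pc (assumption).
    max_count models a finite-width hardware counter (assumed width).
    """
    if target < pc:                      # backward conditional branch
        if taken:
            return min(imli_count + 1, max_count)
        return 0                         # loop exit: restart the count
    return imli_count                    # forward branches leave it unchanged
```

Since the counter depends only on the direction and outcome of backward branches, a speculative copy can be checkpointed and restored like any other small piece of global state.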

16 Same Iteration Correlation: IMLI-SIC component
A predictor table indexed with IMLIcount and PC
Just added to the neural part of the predictor
(Figure: the IMLI SIC table output feeds the adder.)
    for (i = 0; i < Nmax; i++)
      for (j = 0; j < Mmax; j++) {
        if (A[j+i] > 0) {..}
        if (B[i][j] - B[i-1][j] > 0) {..}
        if (C[j] > 0) {..}
      }
Correlation with Out[..][j]
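A minimal sketch of such a table follows, assuming signed saturating counters and a simple multiplicative hash of the PC and IMLIcount. The table size, counter range and hash are illustrative assumptions; the slide only states that the table is indexed with IMLIcount and PC and that its output is added to the neural sum.

```python
class ImliSic:
    """Toy IMLI-SIC table: one signed counter per (PC, IMLIcount) hash."""

    def __init__(self, log_size=9):
        self.size = 1 << log_size        # assumed table size
        self.table = [0] * self.size

    def _index(self, pc, imli_count):
        # Hypothetical hash; any mix of PC and iteration number would do.
        return (pc ^ (imli_count * 0x9E3779B1)) % self.size

    def read(self, pc, imli_count):
        # This value is what gets added to the neural predictor's sum.
        return self.table[self._index(pc, imli_count)]

    def update(self, pc, imli_count, taken):
        i = self._index(pc, imli_count)
        self.table[i] = max(-32, min(31, self.table[i] + (1 if taken else -1)))
```

A branch such as `if (C[j] > 0)` maps each inner iteration `j` to its own counter, so the counter learns the outcome that the same iteration produced in earlier outer iterations.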

17 IMLI-SIC component
A simple add-on to TAGE-GSC or GEHL:
- Brings higher accuracy than WH (Wormhole)
- Also captures most of the (small) benefit of the loop predictor
- Get rid of the loop predictor!
Speculative IMLI counter easy to manage!
Works on different benchmarks than WH!

18 What remains from WH?
    for (i = 0; i < Mmax; i++)
      for (j = 0; j < Nmax; j++) {
        if (B[i-j] > 0) {..}                 /* Branch 1 */
        if (A[j] > 0) { A[j] = -A[j]; ..}    /* Branch 2 */
      }
Branch 1: correlation with Out[i-1][j-1]
Branch 2: correlation Out[i][j] = 1 - Out[i-1][j]
Not the exact correlations but their forms

19 IMLI-OH component
(Figure labels: IMLI History table (IMLI OH) indexed with (PC<<6) + IMLI; PIPE; PC; prediction counter; adder combining IMLI SIC and IMLI OH.)
Provides Out[i-1][j] and Out[i-1][j-1]
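One way to read this slide is sketched below, under stated assumptions. The outcome history table is indexed with `(PC << 6) + IMLIcount` as on the slide. The PIPE array is assumed to be indexed with the PC and to hold the value that was overwritten by the previous update, which is `Out[i-1][j-1]` when the next iteration reads it. The table sizes, the counter range and the counter index are illustrative choices.

```python
class ImliOh:
    """Toy IMLI-OH: counters selected by Out[i-1][j] and Out[i-1][j-1]."""

    def __init__(self, log_hist=10, log_pipe=4, log_ctr=8):
        self.hist = [False] * (1 << log_hist)   # outcome per (PC, IMLIcount)
        self.pipe = [False] * (1 << log_pipe)   # delayed outcome per PC (assumed)
        self.ctr = [0] * (1 << log_ctr)         # signed prediction counters

    def _hist_index(self, pc, imli_count):
        return ((pc << 6) + imli_count) % len(self.hist)

    def _ctr_index(self, pc, imli_count):
        same = int(self.hist[self._hist_index(pc, imli_count)])  # Out[i-1][j]
        prev = int(self.pipe[pc % len(self.pipe)])               # Out[i-1][j-1]
        return ((pc << 2) | (same << 1) | prev) % len(self.ctr)

    def read(self, pc, imli_count):
        # Value to be added to the neural predictor's sum.
        return self.ctr[self._ctr_index(pc, imli_count)]

    def update(self, pc, imli_count, taken):
        c = self._ctr_index(pc, imli_count)
        self.ctr[c] = max(-32, min(31, self.ctr[c] + (1 if taken else -1)))
        h = self._hist_index(pc, imli_count)
        # The old Out[i-1][j] becomes Out[i-1][(j+1)-1] for the next iteration.
        self.pipe[pc % len(self.pipe)] = self.hist[h]
        self.hist[h] = taken
```

With this arrangement a branch of the form `Out[i][j] = 1 - Out[i-1][j]` becomes predictable after a few outer iterations, because the counter selected by `Out[i-1][j]` always sees the opposite outcome.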

20 Yes, but IMLI-OH uses local history?
Several (many) instances of the same branch inflight: wrong history → wrong prediction
Instances of the branch with equal IMLI counter: wrong history → read wrong IMLI OH entries
The targeted branches feature large iteration numbers
- Use of effective OH history: same (PC, IMLIcount) = already committed
The other branches don't suffer:
- the beauty of neural predictors

21 Accuracy improvement on TAGE-GSC
80 benchmarks (CBP3 + CBP4)
6-7% misprediction reduction on average

22 Shrinking the potential benefit of local history
Add local history + loop predictor
- Over TAGE-GSC: 5-6% misp. reduction
- Over TAGE-GSC-IMLI: 3-4% misp. reduction
Loop predictor alone?
- < 0.5% misp. reduction

23 Summary
Fundamental observation by Albericio et al.:
- Correlation in multidimensional loops
IMLI-based components for TAGE-based and neural predictors
- Simple implementation
- Simple management of speculative states
- Directly suitable for hardware implementation

