Out of Order SuperScalar Ankit Sethia Daya Shanker Gaurav Chadha Kuldeep Singh.


1 Out of Order SuperScalar
Ankit Sethia, Daya Shanker, Gaurav Chadha, Kuldeep Singh

2 Basic Design
- Out of order (T3)
- 2-way superscalar
- Reservation stations (RS): 16
- Reorder buffer (ROB) entries: 64 (also tested with 8)
- Physical register file (PRF) entries: 64
- ALUs: 2
- Multipliers: 2
- SystemVerilog was used for the design and was helpful in designing.
- We had only 5 synthesis runs.

3 Advanced Features
- 2-way superscalar
- Instruction prefetcher
- Stride prefetcher
- Return address stack (RAS)
- Load store queue (4 loads, 4 stores)
- BTB, local branch predictor
- Non-blocking D-cache

Attempted Features
- Unconditional branch resolution in the IF stage

4 LSQ
- Out-of-order load launch, once dependencies on preceding stores are resolved.
- Data is forwarded from the store queue to the load structure.
- The load structure is not a queue.
- An auxiliary load queue holds outstanding loads.
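The load-launch rule above can be sketched as a small behavioral model in Python. This is only an illustration under stated assumptions, not the team's RTL: the names `Store` and `try_issue_load` are hypothetical, all accesses are assumed to be full-word and aligned, and a load conservatively stalls if it meets an older store whose address is still unknown before it finds a match.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Store:
    """One store-queue entry (hypothetical layout)."""
    addr: Optional[int]  # None until the store address has been computed
    data: Optional[int]  # None until the store data is ready


def try_issue_load(load_addr: int,
                   older_stores: List[Store]) -> Tuple[str, Optional[int]]:
    """Decide what a load may do, given the stores that precede it.

    older_stores is ordered oldest first. Returns one of:
      ("stall", None)    a dependency on a preceding store is unresolved
      ("forward", value) the youngest matching store supplies the data
      ("memory", None)   no preceding store matches; go to the D-cache
    """
    # Scan from the youngest older store back to the oldest one.
    for st in reversed(older_stores):
        if st.addr is None:
            return ("stall", None)       # unknown address: might alias
        if st.addr == load_addr:
            if st.data is None:
                return ("stall", None)   # match found, data not ready yet
            return ("forward", st.data)  # store-to-load forwarding
    return ("memory", None)
```

For example, `try_issue_load(0x10, [Store(0x10, 5), Store(0x20, 7)])` forwards the value 5 from the older store instead of reading memory.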

5 DCache
- Handles hit-under-miss and miss-under-miss.
- Supports up to 16 outstanding load requests.
- Priority order: eviction first, then the current request, then outstanding misses.
- Has the highest priority among requests to memory.
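The priority order on this slide amounts to a fixed-priority arbiter. A minimal Python sketch follows; the request names and the `arbitrate` function are hypothetical labels chosen for illustration, and the real design would implement this as combinational logic.

```python
from typing import Optional, Set

# Highest priority first, following the order given on the slide.
PRIORITY = ("eviction", "current_request", "outstanding_miss")


def arbitrate(pending: Set[str]) -> Optional[str]:
    """Return the highest-priority pending request, or None if idle."""
    for source in PRIORITY:
        if source in pending:
            return source
    return None
```

With both an outstanding miss and an eviction pending, `arbitrate({"outstanding_miss", "eviction"})` selects the eviction.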

6 Features Contd.
- Heavy instruction prefetching
  - 60 at the max
  - Varied a lot
- BTB / branch predictor
  - 2-bit local branch predictor
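A 2-bit predictor is commonly built from saturating counters, which the following Python sketch illustrates. The slides do not give the table size, the indexing scheme, or the initial counter state, so the 64 entries, the `pc >> 2` index, and the weakly-not-taken reset value are all assumptions, as is reading "local" to mean one counter per PC-indexed entry.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters (0..3); values >= 2 predict taken."""

    def __init__(self, entries: int = 64):  # table size is an assumption
        self.entries = entries
        self.counters = [1] * entries        # start weakly not-taken

    def _index(self, pc: int) -> int:
        # Drop the low 2 bits, assuming 4-byte aligned instructions.
        return (pc >> 2) % self.entries

    def predict(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

Because the counter saturates, a single not-taken outcome after a run of taken branches (for instance a loop exit) does not immediately flip the prediction.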

7 Features Contd.
- Unconditional branch resolution in the IF stage
  - Calculate the next PC for br/bsr in the IF stage
- RAS
  - Many difficulties in implementing the RAS
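Both ideas on this slide can be sketched in a few lines of Python. Everything concrete here is an assumption for illustration: the slides do not state the ISA encoding, so `direct_target` assumes 4-byte instructions and a signed displacement counted in instructions, and the RAS depth of 8 and its drop-oldest overflow policy are hypothetical choices.

```python
from typing import List, Optional


def direct_target(pc: int, disp: int) -> int:
    """Next PC of an unconditional branch, computed in fetch.

    Assumes the target is relative to the following instruction (pc + 4)
    and that disp is a signed count of 4-byte instructions.
    """
    return pc + 4 + 4 * disp


class ReturnAddressStack:
    """Small stack of predicted return addresses."""

    def __init__(self, depth: int = 8):  # depth is an assumption
        self.depth = depth
        self.stack: List[int] = []

    def on_call(self, pc: int) -> None:
        if len(self.stack) == self.depth:
            self.stack.pop(0)            # overflow: drop the oldest entry
        self.stack.append(pc + 4)        # return to the instruction after the call

    def on_return(self) -> Optional[int]:
        # None means there is no prediction available.
        return self.stack.pop() if self.stack else None
```

One known source of RAS difficulty in out-of-order designs is that pushes and pops happen speculatively, so the stack has to be repaired after a branch squash; this sketch does not model that recovery.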

8 Stride Prefetcher
- A data prefetching mechanism that prefetches data for stride-based access patterns.
- Can handle up to four loads.
- Keeps a table of four non-stride loads that may be present.
- Third-highest priority among requests to memory.
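As a rough illustration of stride detection, the Python sketch below tracks up to four loads by PC and issues a prefetch once the same non-zero stride has been seen twice in a row. The table layout, the confirmation rule, and the oldest-inserted replacement policy are assumptions, and the separate table of non-stride loads mentioned on the slide is not modeled.

```python
from collections import OrderedDict
from typing import Optional


class StridePrefetcher:
    """Per-load stride detector tracking at most max_loads load PCs."""

    def __init__(self, max_loads: int = 4):
        self.max_loads = max_loads
        # load PC -> (last address, last observed stride)
        self.table: "OrderedDict[int, tuple]" = OrderedDict()

    def access(self, pc: int, addr: int) -> Optional[int]:
        """Record a load; return an address to prefetch, or None."""
        prefetch = None
        if pc in self.table:
            last_addr, last_stride = self.table[pc]
            stride = addr - last_addr
            # Prefetch only when the same non-zero stride repeats.
            if stride != 0 and stride == last_stride:
                prefetch = addr + stride
            self.table[pc] = (addr, stride)
        else:
            if len(self.table) == self.max_loads:
                self.table.popitem(last=False)  # evict the oldest-inserted load
            self.table[pc] = (addr, 0)
        return prefetch
```

A load at PC 0x40 touching addresses 100, 108, 116 produces no prefetch for the first two accesses and a prefetch of address 124 on the third.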

9 Results
- Final clock period after synthesis: 6.7 ns
- All 33 benchmarks passed in simulation and synthesis
- CPI varies from 0.59 to 5.00

10 Results

11 Interesting Bugs
- In the I-cache, the input address does not change but the data has changed, so fetching stops.
- Eviction during a branch squash: same reason (reset).
- A speculative load with an invalid address gets a continuous NACK from the cache controller.

12 Suggestions
- SystemVerilog was really helpful:
  - always_comb (no inferred latches), always_ff
  - Don't worry about wire vs. reg; use the logic type.
  - Structures, multidimensional arrays, literal assignment.
- Queues are used a lot (ROB, LSQ, DCache, prefetcher), so a robust queue could be provided beforehand.
- We faced problems with bottom-up synthesis; this could be made into a tutorial section.
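To make the queue suggestion concrete, here is a behavioral Python model of the kind of circular queue that structures like a ROB are often built on. It is only a sketch: the class name and interface are hypothetical, and a hardware version would express the same head, tail, and count registers in SystemVerilog.

```python
class CircularQueue:
    """Fixed-depth FIFO with head/tail pointers and an occupancy count."""

    def __init__(self, depth: int):
        self.depth = depth
        self.slots = [None] * depth
        self.head = 0    # next entry to pop
        self.tail = 0    # next free slot to push into
        self.count = 0   # occupancy; distinguishes full from empty

    def full(self) -> bool:
        return self.count == self.depth

    def empty(self) -> bool:
        return self.count == 0

    def push(self, item) -> bool:
        """Insert at the tail; returns False (and does nothing) when full."""
        if self.full():
            return False
        self.slots[self.tail] = item
        self.tail = (self.tail + 1) % self.depth
        self.count += 1
        return True

    def pop(self):
        """Remove from the head; returns None when empty."""
        if self.empty():
            return None
        item = self.slots[self.head]
        self.head = (self.head + 1) % self.depth
        self.count -= 1
        return item
```

Keeping an explicit count avoids the ambiguity of `head == tail`, which otherwise holds for both the full and the empty state.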


