IA-64 Microarchitecture --- Itanium Processor


1 IA-64 Microarchitecture --- Itanium Processor
Jun Feng, Jun Xie, Huafeng Lü

2 Outline Introduction Pipeline Issue Performance Comparison Summary

3 Itanium Processor
First implementation of the IA-64 architecture
Compiler-based exploitation of instruction-level parallelism (ILP)
Also has many features of a superscalar processor

4

5

6 10-stage Pipeline
Front-end
Instruction delivery
Operand delivery
Execution

7

8 Front-end
Stages: IPG, Fetch, Rotate
Prefetches up to 32 bytes per cycle (2 bundles) into a prefetch buffer that holds up to 8 bundles
Branch prediction uses a multilevel adaptive predictor
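
The bundle arithmetic behind these figures can be sketched as follows. An IA-64 bundle is 128 bits (16 bytes): a 5-bit template in the low bits followed by three 41-bit instruction slots, so 32 bytes per cycle corresponds to two bundles, or six instructions. The function and constant names below are illustrative and do not come from the slides.

```python
BUNDLE_BITS = 128
TEMPLATE_BITS = 5
SLOT_BITS = 41
SLOTS_PER_BUNDLE = 3
BUNDLE_BYTES = BUNDLE_BITS // 8  # 16 bytes per bundle


def split_bundle(bundle: int):
    """Split a 128-bit bundle (given as an int) into (template, [slot0, slot1, slot2])."""
    if not 0 <= bundle < (1 << BUNDLE_BITS):
        raise ValueError("bundle must fit in 128 bits")
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = []
    for i in range(SLOTS_PER_BUNDLE):
        # Slot i starts right after the template and the i earlier slots.
        shift = TEMPLATE_BITS + i * SLOT_BITS
        slots.append((bundle >> shift) & ((1 << SLOT_BITS) - 1))
    return template, slots


def bundles_per_cycle(fetch_bytes: int = 32) -> int:
    """Number of whole bundles delivered by one fetch of fetch_bytes."""
    return fetch_bytes // BUNDLE_BYTES


def buffer_capacity_instructions(buffer_bundles: int = 8) -> int:
    """Instruction slots held by a prefetch buffer of buffer_bundles bundles."""
    return buffer_bundles * SLOTS_PER_BUNDLE
```

With the slide's numbers, the 8-bundle prefetch buffer holds 24 instruction slots, which is four cycles of peak fetch bandwidth.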

9

10 Instruction delivery
Stages: EXP and REN
Distributes up to 6 instructions to the 9 functional units
Implements register renaming for both register rotation and register stacking
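
Register rotation can be illustrated with a simplified model. The sketch assumes a single rotating region starting at r32 and one rename base (rrb) that is decremented each time the loop rotates, so a value written to logical register r32 is visible as r33 one iteration later. The class and method names are hypothetical. Real hardware keeps separate rename bases for general, floating-point, and predicate registers, and register stacking adds a further offset that is not modeled here.

```python
class RotatingRegion:
    """Toy model of IA-64 register rotation for one register class."""

    def __init__(self, first: int = 32, size: int = 8):
        self.first = first  # first rotating logical register (assumed r32)
        self.size = size    # size of the rotating region (assumed 8)
        self.rrb = 0        # rename base, decremented on each rotation

    def physical(self, logical: int) -> int:
        """Map a logical register number to a physical one."""
        if not self.first <= logical < self.first + self.size:
            return logical  # static registers are not renamed
        return self.first + (logical - self.first + self.rrb) % self.size

    def rotate(self) -> None:
        """Rotate the region by one position, as a loop branch would."""
        self.rrb = (self.rrb - 1) % self.size


region = RotatingRegion()
written = region.physical(32)   # iteration i writes logical r32
region.rotate()
read_next = region.physical(33)  # iteration i+1 reads the same value as r33
```

Because renaming is only an add and a modulo on the register number, it fits in a pipeline stage without a full rename table, which is one reason the slides group it with instruction delivery.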

11

12 Operand delivery
Stages: WLD and REG
Accesses the register file
Performs register bypassing
Accesses and updates a register scoreboard
Checks predicate dependences
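
The scoreboard idea can be sketched as a toy model: each in-flight instruction marks its destination register as pending, and a later instruction stalls while any of its source registers is still pending. The class below is an illustrative assumption, not the Itanium's actual scoreboard logic; bypassing and predicate checks are left out.

```python
class Scoreboard:
    """Toy register scoreboard: tracks registers whose results are not ready."""

    def __init__(self):
        self.pending = {}  # register name -> cycles until the result is ready

    def can_issue(self, sources) -> bool:
        """An instruction may issue only if none of its sources is pending."""
        return all(reg not in self.pending for reg in sources)

    def issue(self, dest: str, latency: int) -> None:
        """Record that dest will be written after latency cycles."""
        self.pending[dest] = latency

    def tick(self) -> None:
        """Advance one cycle and release registers whose results arrived."""
        for reg in list(self.pending):
            self.pending[reg] -= 1
            if self.pending[reg] <= 0:
                del self.pending[reg]


sb = Scoreboard()
sb.issue("r4", latency=2)          # e.g. a load writing r4
stalled = not sb.can_issue(["r4"])  # a consumer of r4 must wait
sb.tick()
sb.tick()
ready = sb.can_issue(["r4"])        # after two cycles the consumer can go
```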

13

14 Execution
Stages: EXE, DET and WRB
Executes instructions through the ALUs and load/store units
Detects exceptions and posts NaTs
Retires instructions and performs write-back
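
A NaT ("Not a Thing") marks a register whose speculative load would have faulted; the fault is deferred, the NaT propagates through dependent operations, and a later check decides whether recovery is needed. The sketch below models only that behavior. The names and the dictionary standing in for memory are assumptions made for illustration.

```python
NAT = object()  # sentinel standing in for a register's NaT bit


def speculative_load(memory: dict, addr: int):
    """Return the value at addr, or NAT instead of raising on a bad address."""
    return memory.get(addr, NAT)


def add(a, b):
    """Add two register values; a NaT in either operand yields NaT."""
    if a is NAT or b is NAT:
        return NAT
    return a + b


def needs_recovery(value) -> bool:
    """Model of a speculation check: True means run the recovery code."""
    return value is NAT


memory = {0x100: 7}
good = add(speculative_load(memory, 0x100), 1)
bad = add(speculative_load(memory, 0x999), 1)  # deferred fault, no exception
```

Deferring the fault is what lets the compiler hoist a load above a branch: if the branch goes the other way, the NaT is never checked and no exception is ever raised.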

15

16

17 Integer Performance
SPECint benchmark: Itanium is considerably slower than Alpha and Pentium 4, reaching only 60% of the P4 score and 68% of the Alpha score
Itanium: HP rx4610, 800 MHz, 4 MB off-chip L3 cache
Alpha 21264: Compaq GS320, 1 GHz, on-chip L2 cache
Pentium 4: Compaq Precision 330, 2 GHz, 256 KB on-chip L2 cache
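
The three systems run at very different clock rates, so it can help to look at the same ratios per clock cycle. This normalization is not made in the slides; it is a rough supplementary calculation that uses only the score ratios and clock rates listed above, and it ignores differences in caches and compilers.

```python
def per_clock_ratio(score_ratio: float, own_mhz: float, other_mhz: float) -> float:
    """Convert a score ratio (own / other) into a per-clock ratio."""
    return score_ratio * other_mhz / own_mhz


ITANIUM_MHZ = 800

# SPECint ratios from the slide: 60% of Pentium 4 (2 GHz), 68% of Alpha (1 GHz).
vs_p4 = per_clock_ratio(0.60, ITANIUM_MHZ, 2000)
vs_alpha = per_clock_ratio(0.68, ITANIUM_MHZ, 1000)
```

Under these assumptions Itanium does about 1.5 times the integer work per clock of the Pentium 4 but only about 0.85 times that of the Alpha, so the absolute gap to the Pentium 4 comes largely from clock rate.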

18 Floating Point Performance
SPECfp benchmarks: a different story. Itanium is faster than Alpha and Pentium 4, at 108% of the P4 score and 120% of the Alpha score
Itanium: HP rx4610, 800 MHz, 4 MB off-chip L3 cache
Alpha 21264: Compaq GS320, 1 GHz, on-chip L2 cache
Pentium 4: Compaq Precision 330, 2 GHz, on-chip L2 cache

19 Discussion on SPECfp
Floating-point applications: competitive
  higher degrees of ILP
  aggressive memory system
Art benchmark: 4 times the Pentium 4 score
Alpha: outperforms when tuned
In terms of power: worse than P4, at 56% of its floating-point performance per watt
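
The performance-per-watt figure can be combined with the SPECfp ratio from the floating-point slide to estimate the implied power ratio. This is a back-of-the-envelope sketch: it assumes the 56% figure is measured against the Pentium 4 and refers to the same SPECfp comparison as the 108% score ratio, which the slides do not state explicitly.

```python
def implied_power_ratio(perf_ratio: float, perf_per_watt_ratio: float) -> float:
    """Power ratio implied by a performance ratio and a performance-per-watt ratio.

    perf_per_watt_ratio = perf_ratio / power_ratio, so
    power_ratio = perf_ratio / perf_per_watt_ratio.
    """
    return perf_ratio / perf_per_watt_ratio


# Assumed inputs: 108% of P4 SPECfp score, 56% of P4 performance per watt.
power_vs_p4 = implied_power_ratio(1.08, 0.56)
```

Under that assumption the Itanium system would draw a little under twice the power of the Pentium 4 system for its 8% floating-point advantage.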

20 Summary By Us
Good floating-point performance
Poor integer performance
Overall: not as good as Intel has advertised

21 Conclusion
Large code size
Only static instruction-level parallelism
Cannot manage cache misses/hits flexibly
Lack of applications

