Superscalar Architecture Design Framework for DSP Operations Rehan Ahmed.

1 Superscalar Architecture Design Framework for DSP Operations Rehan Ahmed

2 Overview An optimization tool that alters superscalar architectural configuration parameters to suit a given DSP application. It alters the architectural blocks (number of ALUs, cache size, etc.).

3 Motivation Gives designers an initial idea of what their design should look like. Particularly useful for software-defined radio applications.

4 Optimizations can target both power consumption and speed.
Target function: SimpleScalar, Wattch
Stage 1: Search and optimization algorithm (simulated annealing)
Stage 2: Heuristic approach

5 Simulated Annealing

6 Simulated Annealing Parameter set
Sr No | Parameter | Configuration
1 | IFQ | 1, 2, 4, 16, 32
2 | Branch Table | 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384
3 | RAS | 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192
4 | BTB | 16 4, 32 4, 64 4, 128 4, 256 4, 512 4, 1024 4, 2048 4, 4096 4, 8192 4
5 | Decode Width | 1, 2, 4, 16, 32
6 | Issue Width | 1, 2, 4, 16, 32
7 | Commit Width | 1, 2, 4, 16, 32
8 | RUU | 8, 16, 32, 64, 128, 256, 512, 1024
9 | LSQ | 8, 16, 32, 64, 128, 256, 512, 1024
10 | I Cache | 4:32:4:l, 8:32:4:l, 16:32:4:l, 32:32:4:l, 64:32:4:l, 128:32:4:l, 256:32:4:l, 1024:32:4:l, 2048:32:4:l, 8192:32:4:l
11 | D Cache | 4:32:4:l, 8:32:4:l, 16:32:4:l, 32:32:4:l, 64:32:4:l, 128:32:4:l, 256:32:4:l, 1024:32:4:l, 2048:32:4:l
12 | Bus Width | 4, 8, 16, 32, 64
13 | I TLB | 1:1024:4:l, 2:1024:4:l, 4:1024:4:l, 8:1024:4:l, 16:1024:4:l, 32:1024:4:l, 64:1024:4:l, 128:1024:4:l
14 | D TLB | 1:1024:4:l, 2:1024:4:l, 4:1024:4:l, 8:1024:4:l, 16:1024:4:l, 32:1024:4:l, 64:1024:4:l, 128:1024:4:l
15 | I ALU | 1, 2, 4, 8
16 | I Mul/Div | 1, 2, 4, 8
17 | Memory Ports | 1, 2, 4, 8
18 | FP ALU | 1, 2, 4, 8
19 | FP Mul/Div | 1, 2, 4, 8
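A minimal sketch of how simulated annealing could search a discrete parameter set like the one above. The slides do not give the cost function, cooling schedule, or move rule, so `toy_cost`, the geometric cooling, and the "step to an adjacent value" neighbour move are all assumptions made for illustration. In the actual framework the cost would presumably come from a simulator run rather than a formula.

```python
import math
import random

# Four rows of the parameter set above (values copied from the table).
PARAM_SET = {
    "ifq": [1, 2, 4, 16, 32],
    "issue_width": [1, 2, 4, 16, 32],
    "ruu": [8, 16, 32, 64, 128, 256, 512, 1024],
    "int_alu": [1, 2, 4, 8],
}


def toy_cost(config):
    """Hypothetical stand-in for a simulator run; lower is better."""
    # Made-up trade-off: larger units reduce "delay" but add "power".
    delay = sum(1.0 / v for v in config.values())
    power = sum(math.log2(v) for v in config.values())
    return delay + 0.1 * power


def neighbour(config, rng):
    """Move one randomly chosen parameter to an adjacent allowed value."""
    new = dict(config)
    name = rng.choice(sorted(PARAM_SET))
    values = PARAM_SET[name]
    i = values.index(new[name])
    j = min(max(i + rng.choice([-1, 1]), 0), len(values) - 1)
    new[name] = values[j]
    return new


def simulated_annealing(cost, steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    rng = random.Random(seed)
    current = {k: rng.choice(v) for k, v in sorted(PARAM_SET.items())}
    current_cost = cost(current)
    best, best_cost = dict(current), current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        cand = neighbour(current, rng)
        cand_cost = cost(cand)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with prob exp(-delta/t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
    return best, best_cost
```

Calling `simulated_annealing(toy_cost)` returns the best configuration seen and its cost. Restricting moves to adjacent table entries keeps every candidate inside the allowed parameter set.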

7

8 The final configuration from simulated annealing is further optimized using the heuristic approach. The heuristic approach is based on the operating principle of superscalar architecture.

9
Sr No | Configuration Change | Monitored Result | dir=0 | dir=1
1 | Branch Table | Branch_Misses | Incr | Dec
2 | BTB | Gain | Incr | Dec
3 | Return Address Stack | Gain | Incr | Dec
4 | IFQ, Exec Win, I ALU | IFQ_full, Eff_Gain, IPB | Incr | Dec
5 | I ALU | Gain | Incr | Dec
6 | I Mul/Div | Gain | Incr | Dec
7 | FP ALU | Gain | Incr | Dec
8 | FP Mul/Div | Gain | Incr | Dec
9 | RUU | Gain | Dec | Inc
10 | LSQ | Gain | Dec | Inc
11 | I-Compress | Gain | En |
12 | I-Cache | Gain | Dec | Inc
13 | D-Cache | Gain | Dec | Inc
14 | Instruction TLB | Gain | Dec | Inc
15 | Data TLB | Gain | Dec | Inc
16 | Bus Width | Gain | Inc | Dec
17 | Memory To System Ports | Gain | Inc | Dec
18 | Exit Stage | Gain | Nil |
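The table above can be read as a sequence of directed single-parameter adjustments. The sketch below shows one possible interpretation, not the author's implementation: `dir` selects which action column applies, and a change is kept only when a score improves. The `evaluate` callback, the acceptance rule, and the names `RULES`, `step`, and `heuristic_pass` are hypothetical; the slides do not define how "Gain" is computed.

```python
# Two rows of the table, with their allowed values from the parameter set
# and their (dir=0, dir=1) actions.
RULES = {
    "int_alu": {"values": [1, 2, 4, 8], "actions": ("incr", "dec")},
    "ruu": {
        "values": [8, 16, 32, 64, 128, 256, 512, 1024],
        "actions": ("dec", "incr"),
    },
}


def step(values, current, action):
    """Return the adjacent allowed value, or None at the end of the list."""
    i = values.index(current) + (1 if action == "incr" else -1)
    return values[i] if 0 <= i < len(values) else None


def heuristic_pass(config, evaluate, direction):
    """Try each rule once; keep a change only if evaluate() improves."""
    best = dict(config)
    best_score = evaluate(best)
    for name, rule in RULES.items():
        new_value = step(rule["values"], best[name], rule["actions"][direction])
        if new_value is None:
            continue  # already at the smallest or largest allowed value
        trial = dict(best, **{name: new_value})
        score = evaluate(trial)
        if score > best_score:  # "Gain" read here as a higher score (assumption)
            best, best_score = trial, score
    return best, best_score
```

With a toy score such as `lambda c: -abs(c["int_alu"] - 4) - abs(c["ruu"] - 16)`, a `direction=0` pass starting from `{"int_alu": 2, "ruu": 32}` accepts both moves, while a `direction=1` pass rejects both and returns the starting configuration.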

10 Optimization Results IFFT Operation Scale=40 (High precedence given to efficiency)

11

12 Results Summary
Optimized configuration performance measures:
Instructions per Cycle: 1.1934
Average Power per Instruction: 4.6744
Instructions per second (1 GHz): 1.193421 G
Transistor Count: 10,645,929
Transistor Count for Pentium III: 9,500,000
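The throughput figure follows from the IPC and the stated 1 GHz clock, and the transistor counts can be compared directly. A small check of that arithmetic, using only the numbers reported above (the variable names are illustrative):

```python
ipc = 1.1934             # instructions per cycle, as reported
clock_hz = 1e9           # 1 GHz clock, as stated on the slide
ips = ipc * clock_hz     # instructions per second

transistors = 10_645_929
pentium3 = 9_500_000
ratio = transistors / pentium3  # size relative to the Pentium III figure

print(f"{ips / 1e9:.4f} G instructions/s")
print(f"{ratio:.3f}x the Pentium III transistor count")
```

The rounded IPC gives about 1.1934 G instructions per second, consistent with the 1.193421 G on the slide, and the optimized configuration uses roughly 12% more transistors than the quoted Pentium III count.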

13 IFFT Configuration
Parameter | Value
Instruction Fetch Queue | 32
Branch Table Size | 32768
Return Address Stack | 16
Branch Target Buffer | 1024
Instruction Decode Width | 32
Instruction Issue Width | 2
Instruction Commit Width | 32
Register Update Unit | 16
Load Store Queue | 8
D Cache | 2 KB
I Cache | 4 KB
Memory Bus Width | 64 bytes
Instruction TLB | 32 KB
Data TLB | 16 KB
Integer ALUs | 4
Integer Mul/Div | 1
Memory to System Ports | 2
Floating Point ALU | 1
Floating Point Mul/Div | 4
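The cache and TLB sizes here can be related to the colon-separated strings in the simulated annealing parameter set. The sketch below assumes those strings use SimpleScalar-style fields, `<nsets>:<bsize>:<assoc>:<repl>`, which the slides do not state explicitly; `cache_bytes` is a hypothetical helper. Which string the optimizer actually selected is not given, so the specs below are simply the entries whose product matches the reported sizes.

```python
def cache_bytes(spec):
    """Return nsets * bsize * assoc for an 'nsets:bsize:assoc:repl' string."""
    nsets, bsize, assoc, _repl = spec.split(":")
    return int(nsets) * int(bsize) * int(assoc)


# Entries from the parameter set whose sizes match the values reported above.
candidates = {
    "D Cache": "16:32:4:l",
    "I Cache": "32:32:4:l",
    "Instruction TLB": "8:1024:4:l",
    "Data TLB": "4:1024:4:l",
}

for name, spec in candidates.items():
    print(f"{name}: {spec} = {cache_bytes(spec) // 1024} KB")
```

Under that assumption the four entries come out to 2 KB, 4 KB, 32 KB, and 16 KB, matching the D Cache, I Cache, Instruction TLB, and Data TLB rows.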

