CA406 Computer Architecture Pipelines... continued.


1 CA406 Computer Architecture Pipelines... continued

2 Pipelines - Data Hazards
  Code:
    lw  $4, 0($1)
    add $15, $1, $1
    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)
  The last four instructions all depend on $2, the result produced by the sub!
  MIPS instructions have the format: op dest, src a, src b

3 Pipelines - Data Hazards
  Examine the pipeline (ignore the first 2 instructions!)
  $2 is only updated in time for the final add!

4 Pipelines - Data Hazards
  Compiler solution: insert NOOPs
  Inefficient!
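As a rough illustration of the NOOP approach, the Python sketch below pads the five dependent instructions from slide 2 so that every consumer issues far enough behind its producer. The spacing `MIN_DISTANCE = 3` is an assumption (a 5-stage pipeline with no forwarding, whose register file is written before it is read within a cycle), chosen to agree with the note that $2 is only updated in time for the add. The function name `insert_noops` and the tuple layout are illustrative, not part of the lecture.

```python
# Each instruction: (text, destination register or None, source registers).
PROGRAM = [
    ("sub $2, $1, $3",  "$2",  ["$1", "$3"]),
    ("and $12, $2, $5", "$12", ["$2", "$5"]),
    ("or $13, $6, $2",  "$13", ["$6", "$2"]),
    ("add $14, $2, $2", "$14", ["$2", "$2"]),
    ("sw $15, 100($2)", None,  ["$15", "$2"]),
]

# Assumed: a consumer must issue at least this many slots after its producer.
MIN_DISTANCE = 3


def insert_noops(program, min_distance=MIN_DISTANCE):
    """Return the instruction texts with NOOPs padded in where needed."""
    out = []          # scheduled slots, including NOOPs
    last_write = {}   # register -> slot index of its most recent writer
    for text, dest, srcs in program:
        # Earliest slot at which every source register is safely readable.
        earliest = len(out)
        for reg in srcs:
            if reg in last_write:
                earliest = max(earliest, last_write[reg] + min_distance)
        while len(out) < earliest:
            out.append("noop")
        if dest is not None:
            last_write[dest] = len(out)
        out.append(text)
    return out


schedule = insert_noops(PROGRAM)
```

Under these assumptions the schedule contains two NOOPs, both directly after the sub; the remaining instructions are already far enough behind it. Those two slots do no useful work, which is the inefficiency the slide points at.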

5 Pipelines - Data Hazards
  Second compiler solution: reorder
  Original order:
    lw  $4, 0($1)
    add $15, $1, $1
    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)
  Reordered:
    sub $2, $1, $3        (read: $1, $3; written: $2)
    lw  $4, 0($1)
    add $15, $1, $1
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)
  These two (lw and add) must not define $1 or $3!

6 Pipelines - Data Hazards
  Second compiler solution: reorder
    sub $2, $1, $3        (read: $1, $3; written: $2)
    lw  $4, 0($1)
    add $15, $1, $1
    and $12, $2, $5       (first use of $2)
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)

7 Pipelines - Data Hazards
  Compiler analyses dependencies
    Register definitions
    Register use
    Read After Write (RAW) dependency
    No dependencies: instruction can be moved!
  Code:
    sub $2, $1, $3        ($2 written)
    lw  $4, 0($1)
    add $15, $1, $1
    and $12, $2, $5       (use of $2)
    or  $13, $6, $2       (use of $2)
    add $14, $2, $2       (use of $2)
    sw  $15, 100($2)      (use of $2)
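The definition/use analysis described above can be sketched in a few lines of Python. This is only an illustration over the reordered listing: the tuple layout and the name `raw_dependencies` are assumptions, and only register dependencies are tracked (memory dependencies between loads and stores are ignored).

```python
# Reordered program: (text, destination register or None, source registers).
PROGRAM = [
    ("sub $2, $1, $3",  "$2",  ["$1", "$3"]),
    ("lw $4, 0($1)",    "$4",  ["$1"]),
    ("add $15, $1, $1", "$15", ["$1", "$1"]),
    ("and $12, $2, $5", "$12", ["$2", "$5"]),
    ("or $13, $6, $2",  "$13", ["$6", "$2"]),
    ("add $14, $2, $2", "$14", ["$2", "$2"]),
    ("sw $15, 100($2)", None,  ["$15", "$2"]),
]


def raw_dependencies(program):
    """Return (producer index, consumer index, register) for each RAW pair."""
    last_def = {}   # register -> index of the instruction that last defined it
    deps = []
    for i, (_, dest, srcs) in enumerate(program):
        # dict.fromkeys removes duplicate sources while keeping their order.
        for reg in dict.fromkeys(srcs):
            if reg in last_def:
                deps.append((last_def[reg], i, reg))
        if dest is not None:
            last_def[dest] = i
    return deps


deps = raw_dependencies(PROGRAM)
first_use_of_2 = min(c for p, c, r in deps if r == "$2")
```

For this listing the first use of $2 is at index 3, three slots after the sub at index 0, and neither lw nor add appears as a consumer, which is why they can sit between the definition of $2 and its uses.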

8 Pipelines - Data Hazards
  Hardware solution: value forwarding
    Hardware (scoreboard) detects dependency
    Forwards result from WB to EX for subsequent use
    Done in hardware: transparent to software!
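A minimal Python sketch of the forwarding decision follows. It assumes the common textbook arrangement in which the EX stage can take an operand from the instruction one ahead (now in MEM), from the instruction two ahead (now in WB), or from the register file; the lecture's own datapath may differ. The name `forward_source` and its string return values are hypothetical.

```python
def forward_source(src_reg, mem_dest, wb_dest):
    """Choose where the EX stage should take the operand `src_reg` from.

    mem_dest / wb_dest are the destination registers of the instructions
    currently in the MEM and WB stages (None if they write no register).
    """
    if src_reg == "$0":
        return "regfile"      # $0 is hardwired to zero in MIPS
    if mem_dest == src_reg:
        return "MEM"          # most recent producer takes priority
    if wb_dest == src_reg:
        return "WB"
    return "regfile"


# sub $2, $1, $3 followed directly by and $12, $2, $5:
# when the and reaches EX, the sub is in MEM.
and_operand = forward_source("$2", mem_dest="$2", wb_dest=None)

# or $13, $6, $2 is two behind the sub: the sub is in WB, the and in MEM.
or_operand = forward_source("$2", mem_dest="$12", wb_dest="$2")
```

Checking the MEM-stage producer first matters when both in-flight instructions write the same register, since the younger result is the one the program expects to read.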

9 Data Hazards - classification
  Read after Write (RAW)
    Instruction 1 must write before instruction 2 reads
  Write after Write (WAW)
    Instructions 1 and 2 both write
    Instruction 2 must write after 1
  Write after Read (WAR)
    Instruction 1 reads
    Instruction 2 writes (overwrites)
    Instruction 2 must not write before 1 reads
  Reordering algorithms must consider all three!
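The three classes can be checked mechanically for any ordered pair of instructions. The Python sketch below is an illustration only: `classify` and `can_swap` are hypothetical names, instructions are reduced to a destination register and a list of source registers, and memory dependencies are ignored.

```python
def classify(first, second):
    """Return the set of hazards between `first` and a later `second`.

    Each instruction is (destination register or None, source registers).
    """
    d1, s1 = first
    d2, s2 = second
    hazards = set()
    if d1 is not None and d1 in s2:
        hazards.add("RAW")    # second reads what first writes
    if d1 is not None and d1 == d2:
        hazards.add("WAW")    # both write the same register
    if d2 is not None and d2 in s1:
        hazards.add("WAR")    # second overwrites what first reads
    return hazards


def can_swap(first, second):
    """Two adjacent instructions may be exchanged only if no hazard links them."""
    return not classify(first, second)


sub_ = ("$2", ["$1", "$3"])     # sub $2, $1, $3
and_ = ("$12", ["$2", "$5"])    # and $12, $2, $5
lw_ = ("$4", ["$1"])            # lw  $4, 0($1)

raw_example = classify(sub_, and_)
war_example = classify(and_, sub_)   # hypothetical order: and first, then sub
```

A reordering pass along these lines would refuse to exchange sub and and (RAW) but would allow lw and sub to be exchanged, matching the earlier reordering example.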

10 Lecture 5 - Key Points
  Data Hazards
    RAW - most common
    WAW
    WAR
  Compiler
    Looks for dependencies, then re-orders
  Hardware
    Scoreboard: monitors dependencies, ensures correct operation
    Value forwarding hardware: forwards results from EX stage

11 Pipelines - Exceptions
  Caused by overflow, underflow
  Example: add $1, $2, $1
    Overflow detected in EX stage
    Causes jump to exception handler as branch - remainder of pipeline flushed
    but Compiler needs original $1 causing overflow
    so register must not be overwritten
    EX stage needs to squash WB operation
  Precise Exception problem - more later!

12 Pipelines - Depth
  Pipeline can’t be too deep
    Hazards are frequent => many stalls in deep pipelines
  [Figure: Relative Performance (0.5 to 2.5) against Pipeline Depth (1, 2, 4, 8, 16); the deep end is labelled "Too Deep!"]

13 Pipelines - Depth
  Pipeline can’t be too deep
    Hazards are frequent => many stalls in deep pipelines
  [Figure: Relative Performance (0.5 to 2.5) against Pipeline Depth (1, 2, 4, 8, 16); labels "Too Deep!" and "Superpipelined"]

14 CISC and pipelines
  High Speed CISC processors are pipelined
    Overlap IF, EX
  Variable instruction length, running time (number of microcode cycles)
    => pipeline imbalance
    => “backup” in pipe stages
    => complicate hazard detection
  Complex addressing modes
    => auto-increment updates address register
    => multiple memory accesses required
    => smooth pipeline flow more difficult!

15 Instruction Queues
  Vital performance determinant
    Rate of instruction fetch
  High Performance processors
    Fetch multiple instructions in each cycle
    2 - 4 common
    Use wide datapath to memory
    PowerPC 604: 128 bits = 4 instructions
  Despatch unit
    Examine dependencies
    Determine which instructions can be despatched

16 Instruction Queues
  Q “matches” fetch/despatch rates
  General strategy for matching Producers - Consumers
    Use of FIFO-style queues
    Absorb asynchronous delivery / consumption rates
    Provides elasticity in pipelines
  [Figure: Producer -> FIFO -> Consumer, with differing instantaneous rates]
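The elasticity a FIFO provides can be seen in a small Python simulation. Everything numeric here is an assumption made for illustration: a producer that fetches 4 instructions every other cycle, a consumer that wants 2 per cycle, and the two queue capacities being compared. The name `run_queue` is hypothetical.

```python
from collections import deque


def run_queue(fetch_per_cycle, despatch_per_cycle, capacity):
    """Simulate a FIFO between fetch (producer) and despatch (consumer).

    Returns the number of instructions despatched in each cycle.
    """
    queue = deque()
    despatched = []
    next_id = 0
    for fetched, wanted in zip(fetch_per_cycle, despatch_per_cycle):
        # Producer: enqueue what was fetched, limited by free queue space.
        for _ in range(min(fetched, capacity - len(queue))):
            queue.append(next_id)
            next_id += 1
        # Consumer: take up to `wanted` instructions from the head.
        taken = min(wanted, len(queue))
        for _ in range(taken):
            queue.popleft()
        despatched.append(taken)
    return despatched


fetch = [4, 0, 4, 0]        # bursty producer (assumed pattern)
despatch = [2, 2, 2, 2]     # steady consumer (assumed pattern)

deep = run_queue(fetch, despatch, capacity=8)
shallow = run_queue(fetch, despatch, capacity=2)
```

With the deeper queue the consumer is kept busy every cycle; with a queue of only two entries half of each fetch burst is lost to back-pressure and the consumer idles on alternate cycles, although the average fetch and despatch rates are identical in both runs.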

17 Superscalar Processors
  Multiple Functional Units
    PowerPC 604: 6-way superscalar
      Branch Unit
      Load/Store Unit
      3 Integer Units
      Floating Point Unit
  Despatch Unit
    Sends “ready” instructions to all free units
    PowerPC 604:
      potential: 4 instructions/cycle (pipeline lengths are different!)
      reality: 2-3 instructions/cycle? (program dependent!)
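Why the achieved rate falls short of the potential 4 instructions/cycle can be sketched with a toy despatch model in Python. The unit counts come from the slide; everything else is a simplifying assumption and not a description of the real PowerPC 604: despatch is strictly in order, every unit finishes in one cycle, and all instructions are "ready" (no data dependencies). The names `despatch_cycle` and `cycles_to_drain` are hypothetical.

```python
# Functional units per class, as listed on the slide.
UNITS = {"branch": 1, "loadstore": 1, "integer": 3, "float": 1}
MAX_DESPATCH = 4   # potential instructions per cycle


def despatch_cycle(queue):
    """Return how many instructions at the head of `queue` go out this cycle.

    `queue` is a list of unit-class names. Despatch stops at the first
    instruction whose unit class has no free unit left (in-order assumption).
    """
    free = dict(UNITS)
    n = 0
    while n < len(queue) and n < MAX_DESPATCH and free[queue[n]] > 0:
        free[queue[n]] -= 1
        n += 1
    return n


def cycles_to_drain(instructions):
    """Count the cycles needed to despatch every instruction."""
    queue = list(instructions)
    cycles = 0
    while queue:
        queue = queue[despatch_cycle(queue):]
        cycles += 1
    return cycles


well_mixed = ["integer", "integer", "loadstore", "float"]
load_heavy = ["loadstore", "loadstore", "integer", "integer"]
```

In this model the well-mixed group leaves in a single cycle, while the load-heavy group needs two because the second load must wait for the only Load/Store unit. The instruction mix alone already makes throughput program dependent, before any data hazards are considered.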

