CSCE 212 Chapter 6 Enhancing Performance with Pipelining Instructor: Jason D. Bakos.


1 CSCE 212 Chapter 6 Enhancing Performance with Pipelining Instructor: Jason D. Bakos

2 Pipelining

3 MIPS Pipeline
Basic idea:
– Execute multiple instructions in parallel
– Split instruction execution into 5 stages
– Instructions execute in “assembly-line” fashion
[Figure: pipelined datapath with five stages (fetch, decode, execute, memory, write back). Pipeline registers between stages: instruction register (op/func); A, B registers with control for execute/memory/wb and rs/rt/rd; R register with control for memory/wb and rs/rt/rd; MDR register with control for wb and rs/rt/rd.]
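The benefit of the assembly-line idea can be sketched with a cycle count. This is a minimal illustration, assuming an ideal pipeline with no stalls, one cycle per stage, and an unpipelined baseline that spends one cycle in each stage per instruction; the function names are hypothetical and not from the slides.

```python
def pipelined_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """Cycles for an ideal pipeline: fill the pipe once, then finish one instruction per cycle."""
    if num_instructions <= 0:
        return 0
    return num_stages + num_instructions - 1


def unpipelined_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """Cycles when each instruction passes through every stage before the next one starts."""
    return max(num_instructions, 0) * num_stages


if __name__ == "__main__":
    n = 8
    print(pipelined_cycles(n), unpipelined_cycles(n))
```

As the instruction count grows, the ratio of the two counts approaches the number of stages, which is the upper bound on speedup under these assumptions.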

4 Pipelined MIPS

5 Pipelined MIPS

6 Pipelined Control

7 Pipelined Control

8 Pipelined Control

9 MIPS ISA
MIPS pipeline stages:
– Fetch (F): read next instruction from memory, increment address counter (assume 1 cycle to access memory)
– Decode (D): read register operands, decode instruction into control signals, compute branch target
– Execute (E): execute arithmetic, resolve branches
– Memory (M): perform load/store accesses to memory, take branches (assume 1 cycle to access memory)
– Write back (W): write arithmetic and load results to register file

10 Hazards
Hazards are data-flow problems that arise as a result of pipelining. They limit the amount of parallelism and sometimes induce “penalties” that prevent the pipeline from completing one instruction per clock cycle.
– Structural hazards: two operations require the same piece of hardware
  – Can be overcome by adding hardware
– Control hazards: conditional control instructions are not resolved until late in the pipeline, so subsequent instruction fetches must be predicted
  – Fetched instructions are flushed if the prediction does not hold (make sure no state change has occurred)
  – Branch hazards can be handled with dynamic prediction/speculation or a branch delay slot
– Data hazards: an instruction in one pipeline stage is dependent on data computed in another pipeline stage

11 Hazards
Data hazards
– Register values are read in decode and written during write-back
  – A RAW hazard occurs when dependent instructions are separated by fewer than 2 slots
– Examples (each column is one case; the stage each instruction occupies in the same cycle is shown in parentheses):
    ADD $2,$X,$X (E)    ADD $2,$X,$X (M)    ADD $2,$3,$4 (W)
    ADD $X,$2,$X (D)    …                   …
    …                   ADD $X,$2,$X (D)    …
    …                   …                   ADD $X,$2,$3 (D)
– In most cases, data is generated in the same stage in which it is required (EX)
Data forwarding
    ADD $2,$X,$X (M)    ADD $2,$X,$X (W)    ADD $2,$3,$4 (out-of-pipe)
    ADD $X,$2,$X (E)    …                   …
    …                   ADD $X,$2,$X (E)    …
    …                   …                   ADD $X,$2,$3 (E)
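The RAW cases on this slide can be summarized as a stall count. The sketch below is illustrative only and rests on assumptions not stated on the slide: the producer is an ALU instruction (result available at the end of EX), and the register file can be written and then read within the same cycle, so a consumer in decode sees a producer in write-back. The function name and the `distance` convention (1 means the consumer immediately follows the producer) are hypothetical.

```python
def raw_stall_cycles(distance: int, forwarding: bool) -> int:
    """Stall cycles for a consumer issued `distance` instructions after an ALU producer.

    Without forwarding, the consumer's decode must line up with (or follow) the
    producer's write-back, which takes a distance of at least 3.
    With forwarding, ALU results are passed directly to the EX stage, so no stall.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    if forwarding:
        return 0
    return max(0, 3 - distance)


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, raw_stall_cycles(d, forwarding=False), raw_stall_cycles(d, forwarding=True))
```

A distance of 3 corresponds to two slots between the dependent instructions, which matches the slide's statement that a hazard occurs with fewer than 2 slots of separation.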

12 “Load” Hazards
Stalls are required when data is not produced in the same stage in which a subsequent instruction needs it
– Example:
    LW  $2, 0($X)   (M)
    ADD $X, $2      (E)
When this occurs, insert a “bubble” into the EX stage and stall F and D:
    LW  $2, 0($X)   (W)
    NOOP            (M)
    ADD $X, $2      (E)
– Forward from W to E
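The load-hazard condition above can be expressed as a small check. This is a sketch under assumptions: the check is made while the load is in EX and the dependent instruction is in decode (one cycle before the situation drawn on the slide), and the `Instr` record and the register names in the usage lines are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                 # mnemonic, e.g. "lw" or "add"
    dest: Optional[str]     # destination register, or None if nothing is written
    srcs: Tuple[str, ...]   # source registers


def needs_load_stall(in_execute: Instr, in_decode: Instr) -> bool:
    """True when the instruction in decode reads the register a load in EX will write.

    The loaded value does not exist until the end of the memory stage, so one
    bubble is needed even with forwarding.
    """
    return in_execute.op == "lw" and in_execute.dest in in_decode.srcs


if __name__ == "__main__":
    load = Instr("lw", "$2", ("$8",))
    use = Instr("add", "$9", ("$2", "$10"))
    print(needs_load_stall(load, use))
```

When the check fires, F and D hold their instructions for one cycle and a NOOP enters EX, after which the value is forwarded from W to E as on the slide.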

13 Data Hazards: Forwarding

14 Data Hazards: Stalling for Load Hazard

15 Control Hazards
Need to make a branch decision based on data that has yet to be produced:
    add  $2,$3,$4
    beqz $2,loop
In which stage is the branch resolved?
Approaches:
– Stall: insert bubbles after all branches
– Always predict untaken: if the branch is taken, instructions entering DEC and EX (and MEM?) are converted to NOOPs
– Branch delay slot: the instruction following the branch is always executed
– Dynamic branch predictors

16 Control Hazards
Instructions are fetched every clock cycle
Branch decisions happen in the EX stage
Solutions:
– Assume branch not taken (on a taken branch, flush IF, ID, EX by inserting a nop into the pipeline registers on the clock edge)
– Reduce the delay by moving the branch decision up
  – Requires additional hardware (comparators, etc.)
  – Might increase cycle time, since register read and branch resolution are now in series and must be performed in half a cycle to allow for parallel register writes
  – Requires forwarding and stall hardware for new data hazards
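The cost of the predict-not-taken approach can be estimated with a standard back-of-the-envelope formula. The sketch below is illustrative: the function name and the numbers in the usage lines are hypothetical, and the flush penalty is left as a parameter because it depends on the stage in which the branch is resolved.

```python
def effective_cpi(base_cpi: float, branch_fraction: float,
                  taken_fraction: float, flush_penalty: int) -> float:
    """Average CPI when every taken branch costs `flush_penalty` flushed cycles.

    branch_fraction: fraction of executed instructions that are branches
    taken_fraction:  fraction of those branches that are taken (mispredicted
                     under an always-not-taken policy)
    """
    return base_cpi + branch_fraction * taken_fraction * flush_penalty


if __name__ == "__main__":
    # Hypothetical workload: 20% branches, half of them taken.
    print(effective_cpi(1.0, 0.2, 0.5, flush_penalty=2))
    print(effective_cpi(1.0, 0.2, 0.5, flush_penalty=1))
```

Moving the branch decision to an earlier stage lowers `flush_penalty`, which is the motivation for the second solution on the slide.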

17 Example
Cycle:              1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
add  $6,$5,$2       F  D  E  M  W
lw   $7,0($6)          F  D  E  M  W
addi $7,$7,10             F  D     E  M  W
add  $6,$4,$2                   F  D  E  M  W
sw   $7,0($6)                      F  D  E  M  W
addi $2,$2,4                          F  D  E  M  W
blt  $2,$3,loop                          F  D  E  M  W
add  $6,$5,$2                                     F  D  E  M  W
8 instructions, 15 - 4 = 11 cycles, CPI = 11/8
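The cycle count in this example can be reproduced with a short scheduling sketch. It is not the instructor's method; it assumes full forwarding, a one-cycle bubble after a load whose result is used by the next instruction, and a two-cycle penalty for the taken `blt` (branch resolved in EX). The tuple layout and names are hypothetical.

```python
from fractions import Fraction

# (mnemonic, destination register or None, source registers)
PROGRAM = [
    ("add",  "$6", ("$5", "$2")),
    ("lw",   "$7", ("$6",)),
    ("addi", "$7", ("$7",)),
    ("add",  "$6", ("$4", "$2")),
    ("sw",   None, ("$7", "$6")),
    ("addi", "$2", ("$2",)),
    ("blt",  None, ("$2", "$3")),
    ("add",  "$6", ("$5", "$2")),
]

PIPELINE_DEPTH = 5
BRANCH_PENALTY = 2  # assumption: taken branch resolved in EX, two fetch slots lost


def writeback_cycles(program, taken_branches=("blt",)):
    """Return the cycle in which each instruction reaches write-back."""
    result = []
    slot = 1       # cycle in which the next instruction effectively starts
    prev = None
    for op, dest, srcs in program:
        if prev is not None:
            prev_op, prev_dest, _ = prev
            if prev_op == "lw" and prev_dest in srcs:
                slot += 1                  # load-use bubble
            if prev_op in taken_branches:
                slot += BRANCH_PENALTY     # flushed fetches after a taken branch
        result.append(slot + PIPELINE_DEPTH - 1)
        slot += 1
        prev = (op, dest, srcs)
    return result


if __name__ == "__main__":
    wb = writeback_cycles(PROGRAM)
    total = wb[-1]
    cpi = Fraction(total - (PIPELINE_DEPTH - 1), len(PROGRAM))
    print(wb, total, cpi)
```

Under these assumptions the last instruction writes back in cycle 15, and subtracting the 4 fill cycles gives the slide's 11 cycles for 8 instructions.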

18 Moving up Branch Resolution

19 Moving up Branch Resolution

20 Scheduling the Branch Delay Slot

21 Dynamic Branch Prediction
Assume taken/not-taken (static)
– Loops have branches that are usually taken
When wrong, we flush pipeline stages
Deeper pipelines have higher branch penalties (misprediction penalty)
Solution:
– Look up address of branch to check if branch was previously taken
– One-bit schemes
– Two-bit schemes (must be wrong twice to change prediction)
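A two-bit scheme is commonly built from a saturating counter per branch address. The sketch below is one such arrangement, offered as an illustration rather than the design from the slides: the class name, the dictionary used as the prediction table, and the initial "strongly taken" state are assumptions.

```python
class TwoBitPredictor:
    """Per-branch 2-bit saturating counters: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, initial_state: int = 3):
        self.initial_state = initial_state
        self.counters = {}  # branch address -> counter state (0..3)

    def predict(self, branch_addr: int) -> bool:
        return self.counters.get(branch_addr, self.initial_state) >= 2

    def update(self, branch_addr: int, taken: bool) -> None:
        state = self.counters.get(branch_addr, self.initial_state)
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = min(3, state + 1) if taken else max(0, state - 1)
        self.counters[branch_addr] = state


def count_mispredictions(predictor, branch_addr, outcomes):
    """Run a sequence of actual outcomes through the predictor and count wrong guesses."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict(branch_addr) != taken:
            wrong += 1
        predictor.update(branch_addr, taken)
    return wrong


if __name__ == "__main__":
    # A loop branch taken three times, then falling through, run twice.
    loop_outcomes = [True, True, True, False] * 2
    print(count_mispredictions(TwoBitPredictor(), 0x400, loop_outcomes))
```

With this pattern the predictor is wrong only on each loop exit, because a single not-taken outcome moves the counter from strongly taken to weakly taken without flipping the prediction.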

