Presentation is loading. Please wait.

Presentation is loading. Please wait.

Morgan Kaufmann Publishers The Processor

Similar presentations


Presentation on theme: "Morgan Kaufmann Publishers The Processor"— Presentation transcript:

1 Morgan Kaufmann Publishers The Processor
10 November, 2018 Chapter 4 The Processor Chapter 4 — The Processor

2 Morgan Kaufmann Publishers
10 November, 2018 Pipeline Hazards Situations that prevent starting the next instruction in the next cycle Structural hazard A required resource is busy hardware cannot support the combination of instructions that we want to execute in the same clock cycle. Data hazard Need to wait for previous instruction to complete its data read/write Control hazard Deciding on control action depends on previous instruction Chapter 4 — The Processor

3 Morgan Kaufmann Publishers
10 November, 2018 Structural Hazards Conflict for use of a resource In LEGv8 pipeline with a single memory Load/store requires data access Instruction fetch would have to stall for that cycle Would cause a pipeline “bubble” Hence, pipelined datapaths require separate instruction/data memories Or separate instruction/data caches Chapter 4 — The Processor

4 Morgan Kaufmann Publishers
10 November, 2018 Data Hazards An instruction depends on completion of data access by a previous instruction ADD X19, X0, X1 SUB X2, X19, X3 The add instruction doesn’t write its result until the fifth stage, i.e. have to waste three clock cycles in the pipeline. Chapter 4 — The Processor

5 Forwarding (aka Bypassing)
Morgan Kaufmann Publishers Forwarding (aka Bypassing) 10 November, 2018 Use result when it is computed Don’t wait for it to be stored in a register Requires extra connections in the datapath Forwarding/bypassing: adding extra hardware to retrieve the missing item early from the internal resources Chapter 4 — The Processor

6 Morgan Kaufmann Publishers
Load-Use Data Hazard 10 November, 2018 Can’t always avoid stalls by forwarding If value not computed when needed Can’t forward backward in time! Called pipeline stall or bubble Chapter 4 — The Processor

7 Code Scheduling to Avoid Stalls
Morgan Kaufmann Publishers 10 November, 2018 Code Scheduling to Avoid Stalls Reorder code to avoid use of load result in the next instruction C code for A = B + E; C = B + F; LDUR X1, [X0,#0] LDUR X2, [X0,#8] ADD X3, X1, X2 STUR X3, [X0,#24] LDUR X4, [X0,#16] ADD X5, X1, X4 STUR X5, [X0,#32] LDUR X1, [X0,#0] LDUR X2, [X0,#8] LDUR X4, [X0,#16] ADD X3, X1, X2 STUR X3, [X0,#24] ADD X5, X1, X4 STUR X5, [X0,#32] stall stall 13 cycles 11 cycles Chapter 4 — The Processor

8 Morgan Kaufmann Publishers
10 November, 2018 Control Hazards Branch determines flow of control Fetching next instruction depends on branch outcome Pipeline can’t always fetch correct instruction Still working on ID stage of branch In LEGv8 pipeline Need to compare registers and compute target early in the pipeline Add hardware to do it in ID stage extra hardware can test a register, calculate the branch address, and update the PC during the second stage of the pipeline Chapter 4 — The Processor

9 Morgan Kaufmann Publishers
Stall on Branch 10 November, 2018 Wait until branch outcome determined before fetching next instruction Pipeline showing stalling on every conditional branch as solution to control hazards Chapter 4 — The Processor

10 Morgan Kaufmann Publishers
10 November, 2018 Branch Prediction Longer pipelines can’t readily determine branch outcome early Stall penalty becomes unacceptable Predict outcome of branch Only stall if prediction is wrong In LEGv8 pipeline Can predict branches not taken Fetch instruction after branch, with no delay Chapter 4 — The Processor

11 Branch Prediction

12 More-Realistic Branch Prediction
Morgan Kaufmann Publishers 10 November, 2018 More-Realistic Branch Prediction Static branch prediction Conditional branches predicted as taken and some as untaken Based on typical branch behavior Example: loop and if-statement branches Predict backward branches taken Predict forward branches not taken Dynamic branch prediction Hardware measures actual branch behavior e.g., record recent history of each branch as taken or untaken and then use the recent past behavior to predict the future Assume future behavior will continue the trend When wrong, stall while re-fetching, and update history Chapter 4 — The Processor

13 Morgan Kaufmann Publishers
Pipeline Summary 10 November, 2018 The BIG Picture Pipelining improves performance by increasing instruction throughput Executes multiple instructions in parallel Each instruction has the same latency Subject to hazards Structure, data, control Instruction set design affects complexity of pipeline implementation Chapter 4 — The Processor

14 LEGv8 Pipelined Datapath
Morgan Kaufmann Publishers LEGv8 Pipelined Datapath 10 November, 2018 §4.6 Pipelined Datapath and Control PC update leads to control hazards MEM Right-to-left flow leads to hazards WB WB leads to data hazards Chapter 4 — The Processor

15 Morgan Kaufmann Publishers
Pipeline registers 10 November, 2018 Need registers between stages To hold information produced in previous cycle Chapter 4 — The Processor

16 Morgan Kaufmann Publishers
10 November, 2018 Pipeline Operation Cycle-by-cycle flow of instructions through the pipelined datapath “Single-clock-cycle” pipeline diagram Shows pipeline usage in a single cycle Highlight resources used c.f. “multi-clock-cycle” diagram Graph of operation over time We’ll look at “single-clock-cycle” diagrams for load & store Chapter 4 — The Processor

17 Morgan Kaufmann Publishers
10 November, 2018 IF for Load, Store, … Instruction is read from memory using the address in the PC and then placed in the IF/ID pipeline register. The PC address is incremented by 4 and then written back into the PC to be ready for the next clock cycle. This incremented address is also saved in the IF/ID pipeline register Chapter 4 — The Processor

18 Morgan Kaufmann Publishers
ID for Load, Store, … 10 November, 2018 The instruction portion of the IF/ID pipeline register supplying the immediate field is sign-extended to 64 bits, and the register numbers to read the two registers. All three values are stored in the ID/EX pipeline register, along with the incremented PC address. Chapter 4 — The Processor

19 Morgan Kaufmann Publishers
EX for Load 10 November, 2018 The load instruction reads the contents of a register and the sign-extended immediate from the ID/EX pipeline register and adds them using the ALU. That sum is placed in the EX/MEM pipeline register. Chapter 4 — The Processor

20 Morgan Kaufmann Publishers
MEM for Load 10 November, 2018 The load instruction reads the data memory using the address from the EX/MEM pipeline register and loads the data into the MEM/WB pipeline register. Chapter 4 — The Processor

21 Morgan Kaufmann Publishers
WB for Load 10 November, 2018 Read the data from the MEM/WB pipeline register and writ it into the register file in the middle of the figure. Wrong register number Chapter 4 — The Processor

22 Morgan Kaufmann Publishers
Pipeline registers 10 November, 2018 Need registers between stages To hold information produced in previous cycle Chapter 4 — The Processor

23 Morgan Kaufmann Publishers
10 November, 2018 Pipeline Operation Cycle-by-cycle flow of instructions through the pipelined datapath “Single-clock-cycle” pipeline diagram Shows pipeline usage in a single cycle Highlight resources used c.f. “multi-clock-cycle” diagram Graph of operation over time We’ll look at “single-clock-cycle” diagrams for load & store Chapter 4 — The Processor

24 Morgan Kaufmann Publishers
10 November, 2018 IF for Load, Store, … Instruction is read from memory using the address in the PC and then placed in the IF/ID pipeline register. The PC address is incremented by 4 and then written back into the PC to be ready for the next clock cycle. This incremented address is also saved in the IF/ID pipeline register Chapter 4 — The Processor

25 Morgan Kaufmann Publishers
ID for Load, Store, … 10 November, 2018 The instruction portion of the IF/ID pipeline register supplying the immediate field is sign-extended to 64 bits, and the register numbers to read the two registers. All three values are stored in the ID/EX pipeline register, along with the incremented PC address. Chapter 4 — The Processor

26 Morgan Kaufmann Publishers
EX for Load 10 November, 2018 The load instruction reads the contents of a register and the sign-extended immediate from the ID/EX pipeline register and adds them using the ALU. That sum is placed in the EX/MEM pipeline register. Chapter 4 — The Processor

27 Morgan Kaufmann Publishers
MEM for Load 10 November, 2018 The load instruction reads the data memory using the address from the EX/MEM pipeline register and loads the data into the MEM/WB pipeline register. Chapter 4 — The Processor

28 Morgan Kaufmann Publishers
WB for Load 10 November, 2018 Read the data from the MEM/WB pipeline register and write it into the register file in the middle of the figure. Which instruction supplies the write register number? The instruction in the IF/ID pipeline register supplies the write register number, yet this instruction occurs considerably after the load instruction! Wrong register number Chapter 4 — The Processor

29 Corrected Datapath for Load
Morgan Kaufmann Publishers Corrected Datapath for Load 10 November, 2018 Preserve the destination register number in the load instruction Load must pass the register number from the ID/EX through EX/MEM to the MEM/WB pipeline register for use in the WB stage. In other words, to share the pipelined datapath, we need to preserve the instruction read during the IF stage, so each pipeline register contains a portion of the instruction needed for that stage and later stages. Chapter 4 — The Processor

30 Corrected Datapath for Load
Hardware used in all five stages

31 Morgan Kaufmann Publishers
EX for Store 10 November, 2018 The first two stages – Instruction Fetch and Instruction Decode/ register read are the same as Load instruction In the third stage, the effective address is placed in the EX/MEM pipeline register. Chapter 4 — The Processor

32 Morgan Kaufmann Publishers
MEM for Store 10 November, 2018 The data is written to memory in the fourth stage. The register containing the data to be stored was read in an earlier stage and stored in ID/EX. The only way to make the data available during the MEM stage is to place the data into the EX/MEM pipeline register in the EX stage Chapter 4 — The Processor

33 Morgan Kaufmann Publishers
WB for Store 10 November, 2018 Nothing happens in the write-back stage Chapter 4 — The Processor

34 Multi-Cycle Pipeline Diagram
Morgan Kaufmann Publishers Multi-Cycle Pipeline Diagram 10 November, 2018 Form showing resource usage Chapter 4 — The Processor

35 Multi-Cycle Pipeline Diagram
Morgan Kaufmann Publishers Multi-Cycle Pipeline Diagram 10 November, 2018 Traditional form Chapter 4 — The Processor

36 Single-Cycle Pipeline Diagram
Morgan Kaufmann Publishers Single-Cycle Pipeline Diagram 10 November, 2018 State of pipeline in a given cycle Chapter 4 — The Processor


Download ppt "Morgan Kaufmann Publishers The Processor"

Similar presentations


Ads by Google