Pipelining CS365 Lecture 9. D. Barbara Pipeline CS465


1 Pipelining CS365 Lecture 9

2 Outline
- Today's topic
  - Pipelining is an implementation technique in which multiple instructions are overlapped in execution
  - Subset of MIPS instructions: lw, sw, and, or, add, sub, slt, beq
- Outline
  - Pipeline high-level introduction: stages, hazards
  - Pipelined datapath and control design

3 Pipelining is Natural!
- Laundry example
  - Ann, Brian, Cathy, Dave (A, B, C, D) each have one load of clothes to wash, dry, and fold
  - Washer takes 30 minutes
  - Dryer takes 40 minutes
  - "Folder" takes 20 minutes

4 Sequential Laundry
[Figure: task order A, B, C, D against time from 6 PM to midnight; each load takes 30 + 40 + 20 minutes, one load after another]
- Sequential laundry takes 6 hours for 4 loads
- If they learned pipelining, how long would laundry take?

5 Pipelined Laundry
[Figure: task order A, B, C, D against time from 6 PM; the 30, 40, and 20 minute stages of successive loads overlap]
- Start work ASAP
- Pipelined laundry takes 3.5 hours for 4 loads
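
The two laundry totals can be checked with a short simulation. This is only a sketch: the stage times come from the slides, while the function names and the assumption that a finished load can wait between machines without blocking the previous machine are illustrative choices, not part of the lecture.

```python
# Stage durations in minutes, taken from the laundry slides.
STAGES = {"wash": 30, "dry": 40, "fold": 20}
LOADS = 4  # Ann, Brian, Cathy, Dave


def sequential_minutes(stages, loads):
    """Each load finishes every stage before the next load starts."""
    return loads * sum(stages.values())


def pipelined_minutes(stages, loads):
    """A load enters a stage once it has left the previous stage
    and the stage itself is free (assumed: loads may wait in between)."""
    stage_free_at = [0] * len(stages)  # time at which each stage becomes free
    finish = 0
    for _ in range(loads):
        ready = 0  # time at which this load is ready for its next stage
        for i, duration in enumerate(stages.values()):
            start = max(ready, stage_free_at[i])
            ready = start + duration
            stage_free_at[i] = ready
        finish = ready
    return finish


print(sequential_minutes(STAGES, LOADS) / 60, "hours sequential")
print(pipelined_minutes(STAGES, LOADS) / 60, "hours pipelined")
```

The pipelined total works out to 30 + 4 x 40 + 20 minutes: after the first wash, the 40-minute dryer is the stage every load has to queue for.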

6 Pipelining Lessons (I)
- Multiple tasks operate simultaneously using different resources
- Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload
- Pipeline rate is limited by the slowest pipeline stage
- Unbalanced lengths of pipeline stages reduce speedup
[Figure: pipelined laundry timeline for loads A, B, C, D starting at 6 PM, with 30, 40, and 20 minute stages]

7 Pipelining Lessons (II)
- Potential speedup = number of pipeline stages
- Time to "fill" the pipeline and time to "drain" it reduce speedup (startup and wind-down)
- Stall for dependencies
[Figure: pipelined laundry timeline for loads A, B, C, D starting at 6 PM, with 30, 40, and 20 minute stages]

8 Five Stages of Workload
[Figure: a Load occupying Ifetch, Reg/Dec, Exec, Mem, Wr in cycles 1 to 5]
- Ifetch: Instruction Fetch
  - Fetch the instruction from the Instruction Memory
- Reg/Dec: Registers Fetch and Instruction Decode
- Exec: Calculate the memory address
- Mem: Read the data from the Data Memory
- Wr: Write the data back to the register file

9 Single Cycle, Multi-Cycle, Pipeline
[Figure: clock timing of Load, Store, and R-type under three implementations. Single cycle implementation: Load in cycle 1, Store in cycle 2, with wasted time. Multiple cycle implementation: Load takes Ifetch, Reg, Exec, Mem, Wr; Store takes Ifetch, Reg, Exec, Mem; R-type begins with Ifetch; cycles 1 to 10 shown. Pipeline implementation: Load, Store, and R-type overlap, each passing through Ifetch, Reg, Exec, Mem, Wr]

10 Why Pipeline? (Performance)
- Suppose we execute 100 instructions
- Single cycle machine
  - 45 (ns/cycle) x 1 (CPI) x 100 (inst) = 4500 ns
- Multicycle machine
  - 10 (ns/cycle) x 4.4 (CPI, due to inst mix) x 100 (inst) = 4400 ns
- Ideal pipelined machine
  - 10 (ns/cycle) x (1 (CPI) x 100 (inst) + 4 cycle drain) = 1040 ns
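
The three totals above follow from one formula: cycle time x (CPI x instruction count + extra cycles). A minimal sketch that reproduces them is shown here; the function name `exec_time_ns` is an illustrative choice, and `Fraction` is used only so that the CPI of 4.4 is handled exactly.

```python
from fractions import Fraction


def exec_time_ns(cycle_ns, cpi, n_instr, extra_cycles=0):
    """Execution time = cycle time x (CPI x instruction count + extra cycles)."""
    return cycle_ns * (cpi * n_instr + extra_cycles)


N = 100  # instructions, as on the slide

single_cycle = exec_time_ns(45, 1, N)
multicycle = exec_time_ns(10, Fraction("4.4"), N)
pipelined = exec_time_ns(10, 1, N, extra_cycles=4)  # 4 cycles to drain

for name, t in [("single cycle", single_cycle),
                ("multicycle", multicycle),
                ("pipelined", pipelined)]:
    ratio = float(single_cycle / t)
    print(f"{name:<12} {float(t):7.0f} ns   {ratio:.2f}x vs single cycle")
```

With these numbers the ideal pipeline is a little over four times faster than either alternative, slightly short of the five-stage ideal because of the drain cycles.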

11 Pipelining Throughput
- Ideal speedup is the number of stages in the pipeline; in practice:
  - Pipeline stage times are limited by the slowest resource, either the ALU or memory access
  - Fill and drain time
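
The effect of fill and drain time on speedup can be made concrete. The sketch below assumes perfectly balanced stages and no stalls, and compares against an unpipelined machine that spends one cycle per stage on every instruction; both function names are illustrative.

```python
def pipeline_cycles(n_instr: int, n_stages: int) -> int:
    """Cycles to run n_instr instructions: fill the pipeline once,
    then complete one instruction per cycle."""
    return n_stages + n_instr - 1


def speedup(n_instr: int, n_stages: int) -> float:
    """Speedup over an unpipelined machine needing n_stages cycles per instruction."""
    return (n_instr * n_stages) / pipeline_cycles(n_instr, n_stages)


for n in (1, 10, 100, 10_000):
    print(f"{n:>6} instructions: speedup {speedup(n, 5):.2f}")
```

A single instruction gains nothing, and the speedup only approaches the stage count as the instruction stream gets long enough to amortize the fill and drain cycles.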

12 Why Pipeline? (Resource)
[Figure: instruction order (Inst 0 to Inst 4) against time in clock cycles; each instruction uses Im, Reg, ALU, Dm, Reg in successive cycles]

13 Pipeline Hazards
- Hazards prevent the next instruction from executing during its designated clock cycle
- Structural hazards: attempt to use the same resource two different ways at the same time
  - E.g., a combined washer/dryer would be a structural hazard, or the folder busy doing something else (watching TV)
  - One memory port
- Data hazards: attempt to use data before it is ready
  - E.g., one sock of a pair in the dryer and one in the washer; can't fold until you get the sock from the washer through the dryer
  - Instruction depends on the result of a prior instruction still in the pipeline
- Control hazards: attempt to make a decision before the condition is evaluated
  - Branch instructions

14 Structural Hazard: One Memory
[Figure: instruction order (Load, Instr 1 to Instr 4) against time in clock cycles; every instruction uses Mem, Reg, ALU, Mem, Reg in successive cycles]
- Solution 1: add more HW
- Hazards can always be resolved by waiting

15 Structural Hazard: One Memory
[Figure: instruction order (Load, Instr 1, Instr 2, stall, Instr 3) against time in clock cycles; a bubble is inserted before Instr 3]
- Hazards can always be resolved by waiting

16 Data Hazard Example
- Data hazard: an instruction depends on the result of a previous instruction still in the pipeline

  add r1, r2, r3
  sub r4, r1, r3
  and r6, r1, r7
  or  r8, r1, r9
  xor r10, r1, r11

17 Data Hazard Example
[Figure: add r1,r2,r3; sub r4,r1,r3; and r6,r1,r7; or r8,r1,r9; xor r10,r1,r11 against time in clock cycles, with stages IF, ID/RF, EX, MEM, WB]
- Dependences backward in time are hazards
- Compilers can help, but it gets messy and difficult

18 Data Hazard Solution
[Figure: add r1,r2,r3; sub r4,r1,r3; and r6,r1,r7; or r8,r1,r9; xor r10,r1,r11 against time in clock cycles, with stages IF, ID/RF, EX, MEM, WB]
- Solution: "forward" result from one stage to another
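
Which of the five instructions actually need the forwarded value can be found mechanically. The sketch below is one possible check, not the lecture's method: it assumes the five-stage pipeline from the earlier slides and a register file that is written in the first half of a cycle and read in the second half, so only the two instructions directly behind the producer read a stale register. The `Instr` type and `raw_hazards` function are hypothetical names.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: Tuple[str, ...]


# The sequence from the data hazard slides.
PROGRAM = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r1", "r3")),
    Instr("and", "r6", ("r1", "r7")),
    Instr("or", "r8", ("r1", "r9")),
    Instr("xor", "r10", ("r1", "r11")),
]


def raw_hazards(program, window=2):
    """Return (producer index, consumer index, register) for every consumer
    that reads a register within `window` instructions of its producer,
    i.e. before the producer has written it back (assumed window of 2)."""
    hazards = []
    for i, producer in enumerate(program):
        for j in range(i + 1, min(i + 1 + window, len(program))):
            consumer = program[j]
            if producer.dest in consumer.srcs:
                hazards.append((i, j, producer.dest))
            if consumer.dest == producer.dest:
                break  # a newer value replaces the producer's result
    return hazards


for i, j, reg in raw_hazards(PROGRAM):
    print(f"{PROGRAM[j].op} reads {reg} before {PROGRAM[i].op} has written it back")
```

Under these assumptions only `sub` and `and` are flagged; these are the cases forwarding has to cover, while `or` and `xor` read r1 from the register file normally.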

19 Data Hazard Even with Forwarding
[Figure: lw r1,0(r2) followed by sub r4,r1,r3 against time in clock cycles, with stages IF, ID/RF, EX, MEM, WB]
- Can't go back in time! Must delay/stall instructions dependent on loads

20 Data Hazard Even with Forwarding
[Figure: lw r1,0(r2) followed by sub r4,r1,r3 against time in clock cycles, with stages IF, ID/RF, EX, MEM, WB; a stall is inserted before sub]
- Must delay/stall instructions dependent on loads
- Sometimes the instruction sequence can be reordered to avoid pipeline stalls
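
The load-use stall and the reordering remedy can be illustrated with a small counter. This is a sketch under stated assumptions: full forwarding is available, so the only stall left is an instruction reading a register loaded by the `lw` directly before it. The third instruction (`add r5, r6, r7`) is a hypothetical independent instruction added here for illustration; it does not appear on the slides.

```python
def load_use_stalls(program):
    """Count one-cycle stalls, assuming full forwarding: a stall is needed only
    when an instruction reads the register loaded by the lw directly before it."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        op, dest, _ = prev
        if op == "lw" and dest in cur[2]:
            stalls += 1
    return stalls


# Each instruction is (opcode, destination register, source registers).
original = [
    ("lw", "r1", ("r2",)),         # lw  r1, 0(r2)
    ("sub", "r4", ("r1", "r3")),   # sub r4, r1, r3  (uses the loaded value)
    ("add", "r5", ("r6", "r7")),   # hypothetical independent instruction
]

# Move the independent add between the load and its consumer.
reordered = [original[0], original[2], original[1]]

print("stalls before reordering:", load_use_stalls(original))
print("stalls after reordering: ", load_use_stalls(reordered))
```

The reordering is legal here only because the moved instruction neither reads nor writes any register used by the other two.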

21 Control Hazards
- Branch instructions may change execution flow
- Suppose we can do decoding/branch decision/branch target computation at stage 2
  - Still introduces a 1-cycle stall
  - Implementation details later

22 Control Hazard Solution: Predict
- Predict: guess one direction, then back up if wrong
- Impact: 0 lost cycles per branch instruction if right, 1 if wrong
- Need to "squash" and restart the following instruction if wrong
- Prediction scheme
  - Random prediction: correct about 50% of the time
  - History-based prediction: correct about 90% of the time
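
The impact figures above turn into an expected cost per branch. The following is a minimal sketch using the slide's accuracies and its 1-cycle misprediction penalty; the function name is an illustrative choice.

```python
def lost_cycles_per_branch(accuracy, penalty_cycles=1):
    """Expected cycles lost per branch: nothing when the guess is right,
    penalty_cycles when it is wrong."""
    return (1.0 - accuracy) * penalty_cycles


for name, accuracy in [("random", 0.50), ("history-based", 0.90)]:
    cost = lost_cycles_per_branch(accuracy)
    print(f"{name:<14} {cost:.2f} cycles lost per branch on average")
```

Compared with always stalling one cycle on every branch, even a random guess halves the average loss under these assumptions.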

23 Control Hazard Solution: Predict

24 Pipeline Overview Summary
- Pipelining is a fundamental concept
  - Multiple steps using distinct resources
- Utilize capabilities of the datapath by pipelined instruction processing
  - Start next instruction while working on the current one
  - Detect and resolve hazards
    - Structural hazards, data hazards, control hazards
    - All hazards can be solved by stalling
    - Other approaches: forwarding, prediction, reordering
- In modern processors, what really makes it hard:
  - Exception handling
  - Out-of-order execution
- Next: datapath design for pipelining

25 Single Cycle Datapath

26 Multi Cycle Datapath
- Divide the work into stages; internal registers

27 Single-Cycle Pipeline Diagram
- What do we need to add to split the datapath into stages?

28 Pipelined Datapath
- How many bits are stored in each pipeline register? 64, 128, 97, 64
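
One breakdown that is consistent with these four totals is sketched below. The register names (IF/ID, ID/EX, EX/MEM, MEM/WB) and the field lists are assumptions based on the standard 32-bit MIPS pipelined datapath in Patterson & Hennessy, before control bits and the destination register number are added; the slide itself gives only the totals.

```python
# Assumed contents of each pipeline register, in bits (32-bit MIPS datapath,
# data fields only; control signals and the write-register number are omitted).
PIPELINE_REGISTERS = {
    "IF/ID": {"PC+4": 32, "instruction": 32},
    "ID/EX": {"PC+4": 32, "read data 1": 32, "read data 2": 32,
              "sign-extended immediate": 32},
    "EX/MEM": {"branch target": 32, "ALU zero flag": 1, "ALU result": 32,
               "read data 2": 32},
    "MEM/WB": {"memory read data": 32, "ALU result": 32},
}

widths = {name: sum(fields.values()) for name, fields in PIPELINE_REGISTERS.items()}

for name, bits in widths.items():
    print(f"{name:<7} {bits:>3} bits")
```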

29 Observations
- 5-stage pipeline
  - IF, ID, EX, MEM, WB
- Left-to-right flow of instructions
  - Instructions and data move generally from left to right
  - Two exceptions: WB stage and the selection of PC
    - May lead to data hazards and control hazards
- Why is there no pipeline register at the end of the WB stage?
  - Last stage must update either register file, or memory, or PC

30 Pipelining the Load Instruction
[Figure: 1st lw, 2nd lw, 3rd lw each pass through Ifetch, Reg/Dec, Exec, Mem, Wr in consecutive cycles; cycles 1 to 7 shown]
- The five independent functional units in the pipeline datapath are:
  - Instruction Memory for the IF stage
  - Register File's Read Ports (busA and busB) for the ID stage
  - ALU for the EXE stage
  - Data Memory for the MEM stage
  - Register File's Write port (bus W) for the WB stage

31 The Four Stages of R-type
[Figure: an R-type instruction occupying Ifetch, Reg/Dec, Exec, Wr in cycles 1 to 4]
- IF: Instruction Fetch
  - Fetch the instruction from the Instruction Memory
- ID: Registers Fetch and Instruction Decode
- EXE: ALU operates on the two register operands
- WB: Write the ALU output back to the register file

32 Pipelining R-type and Load Instruction
[Figure: R-type, R-type, Load, R-type, R-type in consecutive cycles, cycles 1 to 9 shown; R-type uses Ifetch, Reg/Dec, Exec, Wr and Load uses Ifetch, Reg/Dec, Exec, Mem, Wr. Oops! We have a problem!]
- We have a pipeline conflict or structural hazard:
  - Two instructions try to write to the register file at the same time!
  - Only one write port

33 Important Observation
- Each functional unit can only be used once per instruction
- Each functional unit must be used at the same stage for all instructions
- Delay R-type's register write by one cycle:
  - Now R-type instructions also use Reg File's write port at Stage 5
  - Mem stage is a NO-OP stage: nothing is being done
[Figure: R-type, Store, and Beq each shown with stages 1 to 5: Ifetch, Reg/Dec, Exec, Mem, Wr]
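
The write-port conflict and its fix can be reproduced with a few lines. This sketch assumes one instruction is issued per cycle starting at cycle 1 and that the register write happens in an instruction's last stage; the function name and the dictionary layout are illustrative.

```python
from collections import defaultdict


def write_port_conflicts(kinds, stages):
    """kinds: instruction types, issued one per cycle starting at cycle 1.
    stages: number of pipeline stages per type; the register write is
    assumed to happen in the last stage.
    Returns {cycle: [instruction indices]} for cycles with more than one write."""
    writers = defaultdict(list)
    for i, kind in enumerate(kinds):
        issue_cycle = i + 1
        write_cycle = issue_cycle + stages[kind] - 1
        writers[write_cycle].append(i)
    return {cycle: idx for cycle, idx in writers.items() if len(idx) > 1}


# The instruction mix from the "Oops! We have a problem!" slide.
SEQUENCE = ["R-type", "R-type", "Load", "R-type", "R-type"]

before = write_port_conflicts(SEQUENCE, {"R-type": 4, "Load": 5})
after = write_port_conflicts(SEQUENCE, {"R-type": 5, "Load": 5})

print("4-stage R-type:", before)  # the Load and the R-type behind it collide
print("5-stage R-type:", after)   # no two writes share a cycle
```

Padding R-type with an idle Mem stage costs nothing in throughput: each instruction still completes one cycle after the previous one.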

34 Pipelined Execution
[Figure: R-type, R-type, Load, R-type, R-type in consecutive cycles, cycles 1 to 9 shown; every instruction now uses Ifetch, Reg/Dec, Exec, Mem, Wr]
- All instruction types have five pipeline stages
- Some stages may be wasted for some instructions

35 Pipelined Execution of Load Instruction

36 Pipelined Execution of Load Instruction

37 Pipelined Execution of Load Instruction

38 Pipelined Execution of Load Instruction

39 Pipelined Execution of Load Instruction

40 Pipelined Execution of Store Instruction

41 Pipelined Execution of Store Instruction

42 Observations from Load and Store
- Pass information needed from an earlier stage to a later stage
- Each logical component of the datapath (such as IM, Reg read ports, ALU, DM, Reg write port) can be used only within a single pipeline stage; otherwise, we would have a structural hazard
- A bug in the pipelined datapath for load. Can you tell?

43 Modified Datapath
- For basic R-Type, LW/SW, and BEQ

44 Pipelined Execution for Multiple Instr.

45 Pipelined Execution for Multiple Instr.

46 Pipelined Execution for Multiple Instr.

47 Pipelined Execution for Multiple Instr.

48 Pipelined Execution for Multiple Instr.

49 Pipelined Execution for Multiple Instr.

50 Pipelined Datapath Control
- Fig. 6.22

51 Overview on Datapath Control
- For the subset of instructions under consideration
- ALUOp = 00 for Add, 01 for Sub, and 10 for R-type

52 Observations
- No write control for all pipeline registers and PC since they are updated at every clock cycle
- To specify the control for the pipeline, set the control values during each pipeline stage
- Control lines can be divided into 5 groups:
  - IF: none
  - ID: none
  - ALU: RegDst, ALUOp, ALUSrc
  - MEM: Branch, MemRead, MemWrite
  - WB: MemtoReg, RegWrite
- Group these nine control lines (ALUOp is two bits wide) into 3 subsets:
  - ALUControl, MEMControl, WBControl
- Control signals are generated at the ID stage; how do we pass them to the other stages?

53 Pass Control Signals
- Extend the pipeline registers to include control information
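
One way to picture the extended pipeline registers is that each one carries only the control groups still needed downstream. The sketch below uses the three groups from the Observations slide; the register names (ID/EX, EX/MEM, MEM/WB), the bit widths, and the `lw` control values are assumptions taken from the standard Patterson & Hennessy MIPS design rather than from these slides.

```python
# Width of each control signal in bits (assumed: ALUOp is 2 bits, the rest 1).
BITS = {"RegDst": 1, "ALUOp": 2, "ALUSrc": 1,
        "Branch": 1, "MemRead": 1, "MemWrite": 1,
        "MemtoReg": 1, "RegWrite": 1}

# The three control subsets named on the Observations slide.
GROUPS = {
    "ALU": ("RegDst", "ALUOp", "ALUSrc"),
    "MEM": ("Branch", "MemRead", "MemWrite"),
    "WB": ("MemtoReg", "RegWrite"),
}

# Each pipeline register carries the groups that later stages still need.
CARRIED = {
    "ID/EX": ("ALU", "MEM", "WB"),
    "EX/MEM": ("MEM", "WB"),
    "MEM/WB": ("WB",),
}


def control_bits(register):
    """Number of control bits stored in the given pipeline register."""
    return sum(BITS[sig] for group in CARRIED[register] for sig in GROUPS[group])


def carried(control, register):
    """Subset of an instruction's control word held in the given register."""
    return {group: control[group] for group in CARRIED[register]}


# Assumed control word for lw, generated once in the ID stage.
LW_CONTROL = {
    "ALU": {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
    "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
    "WB": {"MemtoReg": 1, "RegWrite": 1},
}

for reg in CARRIED:
    print(f"{reg:<7} {control_bits(reg)} control bits: {carried(LW_CONTROL, reg)}")
```

Each group is dropped as soon as its stage has consumed it, so the control portion of the registers shrinks from nine bits to five to two.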

54 The Complete Pipelined Datapath
- Fig 6.27

55 Example Pipeline Execution
- Show the five instructions going through the pipeline:

  lw  $10, 20($1)
  sub $11, $2, $3
  and $12, $4, $5
  or  $13, $6, $7
  add $14, $8, $9

- Note that these instructions are independent of each other!
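
Because the five instructions are independent, their progress through the pipeline is fully determined by issue order. A small sketch that prints the resulting stage-by-clock table is given here; the stage names follow the IF, ID, EX, MEM, WB naming used earlier, and `stage_at` is a hypothetical helper.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

PROGRAM = [
    "lw  $10, 20($1)",
    "sub $11, $2, $3",
    "and $12, $4, $5",
    "or  $13, $6, $7",
    "add $14, $8, $9",
]


def stage_at(instr_index, clock):
    """Stage occupied by instruction instr_index (0-based) in the given clock
    (1-based), or None if it has not started or has already finished."""
    s = clock - 1 - instr_index
    return STAGES[s] if 0 <= s < len(STAGES) else None


# The last instruction enters at clock 5 and needs 5 stages.
total_clocks = len(PROGRAM) + len(STAGES) - 1

print(" " * 18 + "".join(f"{c:>5}" for c in range(1, total_clocks + 1)))
for i, text in enumerate(PROGRAM):
    row = "".join(f"{stage_at(i, c) or '':>5}" for c in range(1, total_clocks + 1))
    print(f"{text:<18}{row}")
```

The table spans nine clocks, and clock 5 is the only one in which all five stages are busy at once.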

56 Clock 1

57 Clock 2

58 Clock 3

59 Clock 4

60 Clock 5

61 Clock 6

62 Clock 7

63 Clock 8

64 Clock 9

65 Summary
- Overview of pipeline
  - Stages
  - Hazards
- Pipelined datapath
  - Pipeline registers
  - Pipelined execution
- Pipelined control
  - Different signals for different stages
  - Propagate control signals

66 Next Lecture
- Topic:
  - Pipeline hazards and solutions
  - Exception handling
- Reading
  - Patterson & Hennessy Ch6.4-6.9

