ECE 232 L22.Pipeline3.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 22 Pipelining,

1 ECE 232 L22.Pipeline3.1 Adapted from Patterson 97 ©UCB. Copyright 1998 Morgan Kaufmann Publishers
ECE 232 Hardware Organization and Design
Lecture 22: Pipelining, data and control hazards
Maciej Ciesielski
www.ecs.umass.edu/ece/labs/vlsicad/ece232/spr2002/index_232.html

2 Outline
°Pipelining hazards, review
 - Structural hazards
 - Control hazards
 - Data hazards
°Pipelined control
°Delayed branches
 - Means to resolve control hazards

3 The Big Picture: Where are We Now?
°The Five Classic Components of a Computer
°Today's Topics:
 - Recap last lecture: hazards
 - Hazards/Forwarding
 - Pipelined Control
[Figure labels: Processor (Control, Datapath), Memory, Input, Output; Pipelined datapath]

4 Recap: Can pipelining get us into trouble?
°Yes: Pipeline Hazards
°structural hazards: attempt to use the same resource in two different ways at the same time
 - e.g., multiple memory accesses, multiple register writes
 - solutions: multiple memories, stretch pipeline
°control hazards: attempt to make a decision before the condition is evaluated
 - e.g., any conditional branch
 - solutions: prediction, delayed branch
°data hazards: attempt to use an item before it is ready, e.g.
 - add r1, r2, r3; sub r4, r1, r5
 - lw r6, 0(r7); or r8, r6, r9
 - solutions: forwarding/bypassing, stall/bubble

5 Recap
°Pipelining is a fundamental concept
 - multiple steps using distinct resources
°Utilize capabilities of the Datapath by pipelined instruction processing
 - start next instruction while working on the current one
 - limited by length of longest stage (plus fill/flush)
 - detect and resolve hazards
°What makes it easy
 - all instructions are the same length
 - just a few instruction formats
 - memory operands appear only in loads and stores
°Hazards make it hard
°We'll build a simple pipeline and look at these issues

6 Recap: Pipelined Datapath with Data Stationary Control
°Control fields carried down the pipe with the instruction: Operand Register Selects, ALU Op, MEM Op, Result Reg Select and Enable
°Example instruction: lw $2,20($5)
°Branch: PC <= PC + 4 + immed
[Figure: pipelined datapath with IAU, PC, npc, I mem, Regs, A, B, alu, S, D mem, m; each pipeline register carries the fields im, op, rw, n]

7 Details of "Data Stationary Control"
°The Main Control generates the control signals during Reg/Dec
 - Control signals for Exec (ExtOp, ALUSrc, ...) are used 1 cycle later
 - Control signals for Mem (MemWr, Branch) are used 2 cycles later
 - Control signals for Wr (MemtoReg, RegWr) are used 3 cycles later
[Figure: pipeline registers IF/ID, ID/Ex, Ex/Mem, Mem/Wr. Main Control in Reg/Dec produces ExtOp, ALUOp, RegDst, ALUSrc, Branch, MemWr, MemtoReg, RegWr. ExtOp, ALUOp, RegDst, ALUSrc are used in Exec; Branch, MemWr continue to Mem; MemtoReg, RegWr continue to Wr]
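The timing on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `trace_control`, the dictionary layout, and the concrete signal values for `lw` are assumptions, not part of the lecture. Only the grouping of signals by stage and the 1/2/3-cycle delays come from the slide.

```python
# Illustrative control word for lw. The grouping (ex / mem / wr) follows the
# slide; the concrete values are assumptions made for this sketch.
LW = {
    "ex":  {"ExtOp": "sign", "ALUSrc": 1, "ALUOp": "add", "RegDst": 0},
    "mem": {"MemWr": 0, "Branch": 0},
    "wr":  {"MemtoReg": 1, "RegWr": 1},
}


def trace_control(instrs):
    """Return (cycle, stage, name, signals) for every use of a control group.

    instrs[k] is a (name, control) pair that is decoded (Reg/Dec) in cycle k.
    """
    id_ex = ex_mem = mem_wr = None      # the three downstream pipeline registers
    log = []
    for cycle in range(len(instrs) + 3):
        # Each stage reads the group held in the register that feeds it.
        if id_ex:
            log.append((cycle, "Exec", id_ex[0], id_ex[1]["ex"]))
        if ex_mem:
            log.append((cycle, "Mem", ex_mem[0], ex_mem[1]["mem"]))
        if mem_wr:
            log.append((cycle, "Wr", mem_wr[0], mem_wr[1]["wr"]))
        # Clock edge: control moves one register down, alongside the data.
        mem_wr, ex_mem = ex_mem, id_ex
        id_ex = instrs[cycle] if cycle < len(instrs) else None
    return log


for cycle, stage, name, signals in trace_control([("lw", LW)]):
    print(cycle, stage, name, signals)
```

With `lw` decoded in cycle 0, its Exec group is consumed in cycle 1, its Mem group in cycle 2, and its Wr group in cycle 3.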

8 Pipeline Hazards
°Structural hazard
°Control hazard
°Data hazards
 - RAW (read after write) Data Hazard
 - WAW (write after write) Data Hazard
 - WAR (write after read) Data Hazard
[Figure: overlapping stage timelines for each case, built from the stages I-Fetch, DCD, MemOpFetch, OpFetch, Exec, Store, Jump and IF, DCD, OF, EX, Mem, WB, RS]

9 Data Hazards
°Avoid some "by design"
 - eliminate WAR by always fetching operands early (DCD) in pipe
 - eliminate WAW by doing all WBs in order (last stage, static)
°Detect and resolve remaining ones
 - stall or forward (if possible)
[Figure: stage timelines for the data hazard cases, as on the Pipeline Hazards slide]

10 Hazard Detection
°Suppose instruction i is about to be issued and a predecessor instruction j is in the instruction pipeline.
°A RAW hazard exists on register ρ if ρ ∈ Rregs( i ) ∩ Wregs( j )
 - Keep a record of pending writes (for inst's in the pipe) and compare with operand regs of current instruction.
 - When an instruction issues, reserve its result register.
 - When an operation completes, remove its write reservation.
°A WAW hazard exists on register ρ if ρ ∈ Wregs( i ) ∩ Wregs( j )
°A WAR hazard exists on register ρ if ρ ∈ Wregs( i ) ∩ Rregs( j )
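The three conditions are plain set intersections, so they can be checked directly. The sketch below is an illustration under that reading; the function name `classify_hazards` and the use of register-name strings are assumptions. It is applied to the pair `add r1,r2,r3; sub r4,r1,r5` from the recap slide.

```python
def classify_hazards(i_reads, i_writes, j_reads, j_writes):
    """Hazards between issuing instruction i and in-flight predecessor j.

    Each argument is an iterable of register names. The result maps each
    hazard kind to the set of registers on which it exists (empty = none).
    """
    i_reads, i_writes = set(i_reads), set(i_writes)
    j_reads, j_writes = set(j_reads), set(j_writes)
    return {
        "RAW": i_reads & j_writes,    # i reads what j has yet to write
        "WAW": i_writes & j_writes,   # both write the same register
        "WAR": i_writes & j_reads,    # i writes what j has yet to read
    }


# j = add r1, r2, r3 (in the pipe); i = sub r4, r1, r5 (about to issue)
print(classify_hazards(i_reads={"r1", "r5"}, i_writes={"r4"},
                       j_reads={"r2", "r3"}, j_writes={"r1"}))
```

For this pair only the RAW set is non-empty, and it contains `r1`.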

11 Record of Pending Writes
°Current operand registers (rs, rt)
°Pending writes (rw and regW in the ex, mem, and wb stages)
°hazard <= ((rs == rw ex) & regW ex) OR
           ((rs == rw mem) & regW mem) OR
           ((rs == rw wb) & regW wb) OR
           ((rt == rw ex) & regW ex) OR
           ((rt == rw mem) & regW mem) OR
           ((rt == rw wb) & regW wb)
[Figure: pipelined datapath (IAU, PC, npc, I mem, Regs, A, B, alu, S, D mem, m) with the fields op, rw, rs, rt in decode and op, rw, n carried in the later pipeline registers]
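The hazard expression translates almost term for term into code. The following is a minimal sketch, assuming each later stage exposes its destination register number and write-enable bit; the names `StageWrite` and `hazard` are invented for the illustration. Special handling of a hard-wired zero register is not modeled, because the slide's expression does not include it.

```python
from typing import NamedTuple


class StageWrite(NamedTuple):
    rw: int        # destination register number held in this stage
    reg_w: bool    # True if the instruction in this stage will write rw


def hazard(rs, rt, ex, mem, wb):
    """True if rs or rt matches a pending write in EX, MEM, or WB."""
    return any(
        stage.reg_w and stage.rw in (rs, rt)
        for stage in (ex, mem, wb)
    )


idle = StageWrite(rw=0, reg_w=False)

# lw r6, 0(r7) is in EX while or r8, r6, r9 is being decoded (rs=6, rt=9).
print(hazard(rs=6, rt=9, ex=StageWrite(6, True), mem=idle, wb=idle))
```

The `any(...)` over three stages and two operands expands to the same six OR-ed terms as the slide.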

12 Resolve RAW by forwarding
°Detect the nearest valid write to an operand register and forward its value into the operand latches, bypassing the remainder of the pipe
°Increase muxes to add paths from pipeline registers
°Data Forwarding = Data Bypassing
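"Nearest valid write" amounts to a priority search from the youngest in-flight instruction to the oldest. The sketch below illustrates that selection only. The function name `forward_operand` and the tuple layout are assumptions, and it further assumes that every pending entry already holds its result value, which is not true for a load that is still in EX (that case needs a stall).

```python
def forward_operand(reg, regfile_value, pending):
    """Pick the value to latch for operand register `reg`.

    pending: list of (rw, reg_w, value) tuples ordered from the nearest
    stage (EX) to the farthest (WB). The first valid match wins, which is
    the most recent write in program order.
    """
    for rw, reg_w, value in pending:
        if reg_w and rw == reg:
            return value          # bypass: take it from the pipeline register
    return regfile_value          # no pending write: the register file is current


# Two in-flight writes to r1: the one in EX (value 7) is newer than the one in WB.
pending = [(1, True, 7), (4, True, 99), (1, True, 3)]
print(forward_operand(1, regfile_value=0, pending=pending))
```

In hardware this search is the wider operand mux the slide mentions, with the select lines driven by the same comparisons used for hazard detection.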

13 Stall: Freeze above & Bubble Below
[Figure: the pipelined datapath with pending-write fields (op, rw, rs, rt; op, rw, n in later stages). The stages above the hazard are marked "freeze" and the stage below receives a "bubble"]
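A small cycle-by-cycle model shows what "freeze above, bubble below" does to the pipe contents. This is a sketch under stated assumptions: a five-stage pipe (IF, ID, EX, MEM, WB), no forwarding, and the pending-writes check applied against EX, MEM, and WB as in the hazard expression shown earlier. The function name `simulate` and the tuple encoding of instructions are invented for the example.

```python
def simulate(program):
    """Run a 5-stage pipeline (IF, ID, EX, MEM, WB) that resolves RAW by stalling.

    program: list of (text, dest_reg_or_None, source_regs).
    Returns one [IF, ID, EX, MEM, WB] snapshot of instruction texts per
    cycle; None marks an empty slot or a bubble.
    """
    pipe = [None] * 5
    pc = 0
    trace = []
    while True:
        in_id = pipe[1]
        # Hazard: the instruction in ID reads a register that an instruction
        # in EX, MEM, or WB has not finished writing.
        stall = in_id is not None and any(
            older is not None and older[1] is not None and older[1] in in_id[2]
            for older in pipe[2:]
        )
        if stall:
            # Freeze IF and ID, push a bubble into EX, let the rest drain.
            pipe = [pipe[0], pipe[1], None, pipe[2], pipe[3]]
        else:
            fetched = program[pc] if pc < len(program) else None
            if fetched is not None:
                pc += 1
            pipe = [fetched, pipe[0], pipe[1], pipe[2], pipe[3]]
        if pc >= len(program) and all(slot is None for slot in pipe):
            return trace
        trace.append([slot[0] if slot else None for slot in pipe])


program = [
    ("add r1, r2, r3", "r1", ("r2", "r3")),
    ("sub r4, r1, r5", "r4", ("r1", "r5")),
]
for cycle, stages in enumerate(simulate(program), start=1):
    print(cycle, stages)
```

In this model `sub` waits in ID until `add` has left WB, so the dependent pair takes three more cycles than an independent pair would. Forwarding removes those bubbles for ALU results.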

14 Recap: Control Diagram
°Common to all instructions: IR <- Mem[PC]; PC <- PC+4; then A <- R[rs]; B <- R[rt]
°Per-instruction paths:
 - S <- A + B; R[rd] <- S
 - S <- A + SX; M <- Mem[S]; R[rd] <- M
 - S <- A or ZX; R[rt] <- S
 - S <- A + SX; Mem[S] <- B
 - If Cond PC <- PC+SX
 - M <- S
[Figure: datapath with Inst. Mem, IR, PC, Next PC, Reg File, A, B, Exec, Equal, S, Mem Access, Data Mem, D, M, Reg. File]

15 Datapath + Data Stationary Control - old
[Figure: datapath with Inst. Mem, IR, PC, Next PC, Reg File, A, B, Exec, S, Mem Access, Data Mem, D, M, Reg. File. Decode extracts op, rs, rt, fun, im and produces the control groups ex, me, wb plus rw and v; Mem Ctrl and WB Ctrl consume them as the fields move down the pipeline registers: (ex, me, wb, rw, v), then (me, wb, rw, v), then (wb, rw, v)]

16 Datapath + Data Stationary Control - new
°Do you see the difference? (location of comparator (=) unit and next PC computation)
°What does it buy us?
[Figure: the same datapath with the comparator (=) and Next PC logic moved; PC = 10, all pipeline slots empty (n)]

17 Let's Try it Out
°Execute these instructions on a pipelined machine (these addresses are octal)
 10   lw   r1, r2(35)
 14   addI r2, r2, 3
 20   sub  r3, r4, r5
 24   beq  r6, r7, 100
 30   ori  r8, r9, 17
 34   add  r10, r11, r12
 ...
 100  and  r13, r14, 15

18 Start: Fetch 10
[Figure: PC = 10; the instruction at 10 is in IF; all later pipeline slots are empty (n)]

19 Fetch 14, Decode 10
[Figure: PC = 14; IR holds lw r1, r2(35) in ID; the instruction at 14 is in IF; later slots are empty (n)]

20 Fetch 20, Decode 14, Exec 10
[Figure: PC = 20; IR holds addI r2, r2, 3 in ID; lw r1 is in EX with A = r2 and immediate 35; later slots are empty (n)]

21 Fetch 24, Decode 20, Exec 14, Mem 10
[Figure: PC = 24; IR holds sub r3, r4, r5 in ID (register reads 4 and 5); addI r2, r2, 3 is in EX with A = r2 and immediate 3; lw r1 is in Mem with S = r2+35; the WB slot is empty (n)]

22 Fetch 30, Dcd 24, Ex 20, Mem 14, WB 10
°Note: Single delayed branch: always execute ori after beq
°Assume: branch taken
[Figure: PC = 30, Next PC = 100; IR holds beq r6, r7, 100 in ID (register reads 6 and 7); sub r3 is in EX with A = r4, B = r5; addI r2 is in Mem with S = r2+3; lw r1 is in WB with M[r2+35]]

23 Fetch 100, Dcd 30, Ex 24, Mem 20, WB 14
[Figure: PC = 100; IR holds ori r8, r9, 17 in ID (register read 9); beq is in EX with A = r6, B = r7; sub r3 is in Mem with S = r4-r5; addI r2 is in WB with r2+3; r1 = M[r2+35] has been written to the register file]

24 Fetch 104, Dcd 100, Ex 30, Mem 24, WB 20
°Fill it in yourself!
[Figure: blank datapath to complete; PC = ___]

25 Fetch 110, Dcd 104, Ex 100, Mem 30, WB 24
°Fill it in yourself!
[Figure: blank datapath to complete; PC = ___]

26 Fetch 114, Dcd 110, Ex 104, Mem 100, WB 30
[Figure: blank datapath to complete; PC = ___]
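The fetch addresses in the slide titles of this walk-through (10, 14, 20, 24, 30, 100, 104, 110, 114, all octal) follow from the single delayed branch. The sketch below reproduces that sequence under stated assumptions: the branch is taken, its operand 100 is treated as the absolute target address as in the listing, and a branch inside a delay slot is not modeled. The function name `fetch_sequence` and the dictionary encoding of the program are invented for the illustration.

```python
def fetch_sequence(program, start, taken, count):
    """Return the first `count` fetch addresses with one branch delay slot.

    program: dict mapping address -> (opcode, branch_target_or_None).
    taken:   whether every beq is assumed taken.
    """
    pc = start
    pending_target = None       # target of a taken branch fetched last cycle
    fetched = []
    for _ in range(count):
        fetched.append(pc)
        opcode, target = program.get(pc, ("nop", None))
        next_pc = pc + 4
        if pending_target is not None:
            # The instruction just fetched is the delay slot; redirect after it.
            next_pc = pending_target
            pending_target = None
        if opcode == "beq" and taken:
            pending_target = target
        pc = next_pc
    return fetched


# Addresses are octal, as on the slides.
program = {
    0o10: ("lw", None), 0o14: ("addI", None), 0o20: ("sub", None),
    0o24: ("beq", 0o100), 0o30: ("ori", None), 0o34: ("add", None),
    0o100: ("and", None),
}
print([oct(a) for a in fetch_sequence(program, 0o10, taken=True, count=9)])
```

The `ori` at 30 is fetched and executed even though the branch is taken, and the `add` at 34 is never fetched.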

27 Issues in Pipelined design
°Pipelining
 - Limitation: issue rate, FU stalls, FU depth
°Super-pipeline
 - Issue one instruction per (fast) cycle
 - ALU takes multiple cycles
 - Limitation: clock skew, FU stalls, FU depth
°Super-scalar
 - Issue multiple scalar instructions per cycle
 - Limitation: hazard resolution
°VLIW ("EPIC")
 - Each instruction specifies multiple scalar operations
 - Compiler determines parallelism
 - Limitation: packing
°Vector operations
 - Each instruction specifies series of identical operations
 - Limitation: applicability
[Figure: IF, D, Ex, M, W stage timelines illustrating each organization]

28 Partitioned Instruction Issue (simple Superscalar)
°Independent Int and FP issue to separate pipelines
°Single Issue Total Time = Int Time + FP Time
°Max Speedup: Total Time / MAX(Int Time, FP Time)
[Figure: I-Cache feeds Inst Issue and Bypass; Int Reg and FP Reg connect over the Operand / Result Busses to the Int Unit, Load / Store Unit (with D-Cache), FP Add, and FP Mul]
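The speedup bound is easy to evaluate for a given instruction mix. The sketch below is a direct transcription of the two formulas on the slide; the function name `max_speedup` and the example times are assumptions chosen for illustration.

```python
def max_speedup(int_time, fp_time):
    """Upper bound on speedup from issuing Int and FP work in parallel.

    Single-issue time is int_time + fp_time; with perfect overlap the
    program can finish no sooner than the longer of the two.
    """
    if int_time <= 0 or fp_time <= 0:
        raise ValueError("both times must be positive")
    total_time = int_time + fp_time
    return total_time / max(int_time, fp_time)


print(max_speedup(50, 50))   # balanced mix: the bound reaches 2.0
print(max_speedup(90, 10))   # mostly integer work: the bound is close to 1
```

The bound is largest when the integer and floating-point times are equal, and it approaches 1 as the mix becomes dominated by one kind of work.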

29 Multiple Pipes/ Harder Superscalar
°Issues:
 - Reg. File ports
 - Detecting Data Dependences
 - Bypassing
 - RAW Hazard
 - WAR Hazard
 - Multiple load/store ops?
 - Branches
[Figure: two instruction registers (IR0, IR1) feeding two pipes, each with A, B, R, T latches, sharing the Register File and the D$]

30 Summary
°Pipelines pass control information down the pipe just as data moves down pipe
°Forwarding/Stalls handled by local control
°Exceptions stop the pipeline
°MIPS I instruction set architecture made pipeline visible (delayed branch, delayed load)
°More performance from deeper pipelines, parallelism

