Presentation is loading. Please wait.

Presentation is loading. Please wait.

Pipelined Datapath Lecture notes from MKP, H. H. Lee and S. Yalamanchili.

Similar presentations


Presentation on theme: "Pipelined Datapath Lecture notes from MKP, H. H. Lee and S. Yalamanchili."— Presentation transcript:

1 Pipelined Datapath Lecture notes from MKP, H. H. Lee and S. Yalamanchili

2 Reading Sections 4.5 – 4.10 Practice Problems: 1, 3, 8, 12
Note: Appendices A-E in the hardcopy text correspond to chapters 7-11 in the online text.

3 Morgan Kaufmann Publishers
9 May, 2019 Pipeline Performance Assume time for stages is 100ps for register read or write 200ps for other stages Compare pipelined datapath with single-cycle datapath Instr Instr fetch Register read ALU op Memory access Register write Total time lw 200ps 100 ps 800ps sw 700ps R-format 600ps beq 500ps Chapter 4 — The Processor

4 Morgan Kaufmann Publishers
9 May, 2019 Pipeline Performance Single-cycle (Tc= 800ps) Pipelined (Tc= 200ps) Chapter 4 — The Processor

5 Morgan Kaufmann Publishers
9 May, 2019 Pipeline Speedup If all stages are balanced i.e., all take the same time If not balanced, speedup is less Speedup due to increased throughput Latency (time for each instruction) does not decrease Chapter 4 — The Processor

6 Basic Idea All instructions are 32-bits
Few & regular instruction formats Alignment of memory operands

7 Pipelining What makes it easy All instructions are the same length
Simple instruction formats Memory operands appear only in loads and stores What makes it hard? structural hazards: suppose we had only one memory control hazards: need to worry about branch instructions data hazards: an instruction depends on a previous instruction What really makes it hard: exception handling trying to improve performance with out-of-order execution, etc.

8 Morgan Kaufmann Publishers
9 May, 2019 Pipeline registers Need registers between stages To hold information produced in previous cycle Pipeline stage execution time Chapter 4 — The Processor

9 Graphically Representing Pipelines
2 4 6 8 10 Time lw IF ID EX MEM WB IF ID MEM WB EX add Shading indicates the unit is being used by the instruction Shading on the right half of the register file (ID or WB) or memory means the element is being read in that stage Shading on the left half means the element is being written in that stage

10 Graphically Representing Pipelines
Can help with answering questions like: how many cycles does it take to execute this code? what is the ALU doing during cycle 4? use this representation to help understand datapaths T i m e ( i n c l o c k c y c l e s ) P r o g r a m C C 1 C C 2 C C 3 C C 4 C C 5 C C 6 e x e c u t i o n o r d e r ( i n i n s t r u c t i o n s ) l w $ 1 , 2 ( $ 1 ) I M R e g A L U D M R e g s u b $ 1 1 , $ 2 , $ 3 I M R e g A L U D M R e g

11 Structural Hazard 2 4 6 8 10 Time lw add sub add
IF ID EX MEM WB IF ID MEM WB EX add IF ID MEM WB EX sub IF ID MEM WB EX add Need to separate instruction and data memory

12 Morgan Kaufmann Publishers
9 May, 2019 IF for Load, Store, … Pipeline stage execution time Chapter 4 — The Processor

13 Morgan Kaufmann Publishers
9 May, 2019 ID for Load, Store, … Pipeline stage execution time Chapter 4 — The Processor

14 Morgan Kaufmann Publishers
9 May, 2019 EX for Load Pipeline stage execution time Chapter 4 — The Processor

15 Morgan Kaufmann Publishers
9 May, 2019 MEM for Load Pipeline stage execution time Chapter 4 — The Processor

16 Morgan Kaufmann Publishers
9 May, 2019 WB for Load Wrong register number Chapter 4 — The Processor

17 Corrected Datapath for Load
Morgan Kaufmann Publishers 9 May, 2019 Corrected Datapath for Load Pipeline stage execution time Chapter 4 — The Processor

18 Morgan Kaufmann Publishers
9 May, 2019 EX for Store Pipeline stage execution time Chapter 4 — The Processor

19 Morgan Kaufmann Publishers
9 May, 2019 MEM for Store Pipeline stage execution time Chapter 4 — The Processor

20 Morgan Kaufmann Publishers
9 May, 2019 WB for Store Pipeline stage execution time Chapter 4 — The Processor

21 Note what is happening in the register file
Pipelining Example add $14, $5, $6 lw $13, 24($1) add $12, $3, $4 sub $11, $2, $3 lw $10, 20($1) Note what is happening in the register file M u x 1 I F / I D I D / E X E X / M E M M E M / W B A d d 4 A d d A d d r e s u l t S h i f t l e f t 2 n R e a d P C A d d r e s s o t i r e g i s t e r 1 c R e a d u t r s R e a d d a t a 1 Z e r o I n s t r u c t i o n n I r e g i s t e r 2 R e g i s t e r s A L U m e m o r y R e a d A L U W r i t e d a t a 2 r e s u l t A d d r e s s R e a d r e g i s t e r M d a t a 1 u D a t a M W r i t e x u m e m o r y d a t a x 1 W r i t e d a t a 1 6 3 2 S i g n e x t e n d I n s t r u c t i o n [ 2 1 6 ] M I n s t r u c t i o n u [ 1 5 1 1 ] x 1 Pipeline stage execution time R e g D s t

22 Pipelined Control (Simplified)
Morgan Kaufmann Publishers 9 May, 2019 Pipelined Control (Simplified) Chapter 4 — The Processor

23 Morgan Kaufmann Publishers
9 May, 2019 Pipelined Control Control signals derived from instruction As in single-cycle implementation Pass control signals along like data Chapter 4 — The Processor

24 Morgan Kaufmann Publishers
9 May, 2019 Pipelined Control Chapter 4 — The Processor

25 Datapath with Control IF: lw $10, 9($1) P C I n s t r u c i o m e y A
[ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f E X F / IF: lw $10, 9($1)

26 Datapath with Control IF: sub $11, $2, $3 ID: lw $10, 9($1) “lw” 11
m e y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f X F / E IF: sub $11, $2, $3 ID: lw $10, 9($1) 11 010 0001 “lw”

27 Datapath with Control ID: sub $11, $2, $3 EX: lw $10, 9($1)
m e y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f X F / E 11 010 00 ID: sub $11, $2, $3 EX: lw $10, 9($1) IF: and $12, $4, $5 10 000 1100 “sub”

28 Datapath with Control EX: sub $11, $2, $3 MEM: lw $10, 9($1)
y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f X F / E 10 000 EX: sub $11, $2, $3 MEM: lw $10, 9($1) ID: and $12, $4, $5 1100 IF: or $13, $6, $7 11 “and”

29 Datapath with Control MEM: sub $11, .. WB: lw $10, 9($1)
y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f X F / E 10 000 MEM: sub $11, .. WB: lw $10, 9($1) EX: and $12, $4, $5 1100 ID: or $13, $6, $7 “or” IF: add $14, $8, $9

30 Datapath with Control WB: sub $11, .. MEM: and $12… EX: or $13, $6, $7
y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f X F / E 10 000 WB: sub $11, .. MEM: and $12… 1100 EX: or $13, $6, $7 “add” ID: add $14, $8, $9 IF: xxxx

31 Datapath with Control WB: and $12… MEM: or $13, .. EX: add $14, $8, $9
s t r u c i o m e y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f F / E X 10 000 WB: and $12… MEM: or $13, .. EX: add $14, $8, $9 IF: xxxx ID: xxxx

32 Datapath with Control MEM: add $14, .. EX: xxxx IF: xxxx ID: xxxx
s t r u c i o m e y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f F / E X MEM: add $14, .. 10 EX: xxxx IF: xxxx ID: xxxx WB: or $13…

33 Datapath with Control IF: xxxx ID: xxxx EX: xxxx MEM: xxxx
WB: add $14.. P C I n s t r u c i o m e y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f F / E X

34 Morgan Kaufmann Publishers
9 May, 2019 Data Hazards (4.7) An instruction depends on completion of data access by a previous instruction add $s0, $t0, $t1 sub $t2, $s0, $t3 Chapter 4 — The Processor

35 Dependencies Problem with starting next instruction before first is finished dependencies that “go backward in time” are data hazards

36 Software Solution Have compiler guarantee no hazards
Where do we insert the “nops” ? sub $2, $1, $3 and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 sw $15, 100($2) Problem: this really slows us down!

37 Morgan Kaufmann Publishers
9 May, 2019 A Better Solution Consider this sequence: sub $2, $1,$3 and $12,$2,$5 or $13,$6,$2 add $14,$2,$2 sw $15,100($2) We can resolve hazards with forwarding How do we detect when to forward? Chapter 4 — The Processor

38 Dependencies & Forwarding
Morgan Kaufmann Publishers 9 May, 2019 Dependencies & Forwarding Do not wait for results to be written to the register file – find them in the pipeline  forward to ALU Chapter 4 — The Processor

39 Forwarding MEM: sub $11, .. WB: lw $10, 9($1) EX: and $6, $4, $5
P C I n s t r u c i o m e y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f X F / E 10 000 MEM: sub $11, .. WB: lw $10, 9($1) EX: and $6, $4, $5 1100 ID: or $13, $6, $7 “or” IF: add $14, $8, $9

40 Forwarding (simplified)
ID/EX EX/MEM MEM/WB Register File ALU Data Memory MUX

41 Forwarding (from EX/MEM)
ID/EX EX/MEM MEM/WB MUX Register File ALU Data Memory MUX MUX

42 Forwarding (from MEM/WB)
ID/EX EX/MEM MEM/WB MUX Register File ALU Data Memory MUX MUX

43 Forwarding (operand selection)
ID/EX EX/MEM MEM/WB MUX Register File ALU Data Memory MUX MUX Forwarding Unit

44 Forwarding (operand propagation)
ID/EX EX/MEM MEM/WB MUX Register File ALU Data Memory MUX MUX Rd MUX Rt EX/MEM Rd Rt Forwarding Unit Rs MEM/WB Rd Combinational Logic!

45 Detecting the Need to Forward
Morgan Kaufmann Publishers 9 May, 2019 Detecting the Need to Forward Pass register numbers along pipeline e.g., ID/EX.RegisterRs = register number for Rs sitting in ID/EX pipeline register ALU operand register numbers in EX stage are given by ID/EX.RegisterRs, ID/EX.RegisterRt Data hazards when 1a. EX/MEM.RegisterRd = ID/EX.RegisterRs 1b. EX/MEM.RegisterRd = ID/EX.RegisterRt 2a. MEM/WB.RegisterRd = ID/EX.RegisterRs 2b. MEM/WB.RegisterRd = ID/EX.RegisterRt Fwd from EX/MEM pipeline reg Fwd from MEM/WB pipeline reg Chapter 4 — The Processor

46 Detecting the Need to Forward
Morgan Kaufmann Publishers 9 May, 2019 Detecting the Need to Forward But only if forwarding instruction will write to a register! EX/MEM.RegWrite, MEM/WB.RegWrite And only if Rd for that instruction is not $zero EX/MEM.RegisterRd ≠ 0, MEM/WB.RegisterRd ≠ 0 Chapter 4 — The Processor

47 Morgan Kaufmann Publishers
9 May, 2019 Forwarding Paths Chapter 4 — The Processor

48 Forwarding Conditions
Morgan Kaufmann Publishers 9 May, 2019 Forwarding Conditions EX hazard if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10 if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10 MEM hazard if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01 if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01 Chapter 4 — The Processor

49 Morgan Kaufmann Publishers
9 May, 2019 Double Data Hazard Consider the sequence: add $1,$1,$2 add $1,$1,$3 add $1,$1,$4 Both hazards occur Want to use the most recent Revise MEM hazard condition Only forward if EX hazard condition isn’t true Chapter 4 — The Processor

50 Revised Forwarding Condition
Morgan Kaufmann Publishers 9 May, 2019 Revised Forwarding Condition MEM hazard if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01 if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01 Checking precedence of EX hazard Chapter 4 — The Processor

51 Datapath with Forwarding
Morgan Kaufmann Publishers 9 May, 2019 Datapath with Forwarding Chapter 4 — The Processor

52 Concurrent Execution Correct execution is about managing dependencies
Producer-consumer Structural (using the same hardware component) We will come across other types of dependencies later!

53 Morgan Kaufmann Publishers
9 May, 2019 Load-Use Data Hazard Need to stall for one cycle Chapter 4 — The Processor

54 Forwarding MEM: lw $11, 0($2) WB: lw $10, 9($1) EX: and $6, $4, $11
P C I n s t r u c i o m e y A d [ 2 1 6 ] M R g L U O p B a h D S 4 3 5 x l W Z f X F / E 10 000 MEM: lw $11, 0($2) WB: lw $10, 9($1) EX: and $6, $4, $11 1100 ID: or $13, $6, $7 “or” IF: add $14, $8, $9

55 Load-Use Hazard Detection
Morgan Kaufmann Publishers 9 May, 2019 Load-Use Hazard Detection Check when using instruction is decoded in ID stage ALU operand register numbers in ID stage are given by IF/ID.RegisterRs, IF/ID.RegisterRt Load-use hazard when ID/EX.MemRead and ((ID/EX.RegisterRt = IF/ID.RegisterRs) or (ID/EX.RegisterRt = IF/ID.RegisterRt)) If detected, stall and insert bubble Chapter 4 — The Processor

56 Code Scheduling to Avoid Stalls
Morgan Kaufmann Publishers 9 May, 2019 Code Scheduling to Avoid Stalls Reorder code to avoid use of load result in the next instruction C code for A = B + E; C = B + F; lw $t1, 0($t0) lw $t2, 4($t0) add $t3, $t1, $t2 sw $t3, 12($t0) lw $t4, 8($t0) add $t5, $t1, $t4 sw $t5, 16($t0) lw $t1, 0($t0) lw $t2, 4($t0) lw $t4, 8($t0) add $t3, $t1, $t2 sw $t3, 12($t0) add $t5, $t1, $t4 sw $t5, 16($t0) stall stall 13 cycles 11 cycles Chapter 4 — The Processor

57 How to Stall the Pipeline
Morgan Kaufmann Publishers 9 May, 2019 How to Stall the Pipeline Force control values in ID/EX register to 0 EX, MEM and WB perform a nop (no-operation) Prevent update of PC and IF/ID register Using instruction is decoded again Following instruction is fetched again 1-cycle stall allows MEM to read data for lw Can subsequently forward to EX stage Chapter 4 — The Processor

58 Stall/Bubble in the Pipeline
Morgan Kaufmann Publishers 9 May, 2019 Stall/Bubble in the Pipeline Stall inserted here Chapter 4 — The Processor

59 Stall/Bubble in the Pipeline
Morgan Kaufmann Publishers 9 May, 2019 Stall/Bubble in the Pipeline Or, more accurately… Chapter 4 — The Processor

60 Datapath with Hazard Detection
Morgan Kaufmann Publishers 9 May, 2019 Datapath with Hazard Detection Pipeline stage execution time ALUSrc mux is missing! Chapter 4 — The Processor

61 Morgan Kaufmann Publishers
9 May, 2019 Control Hazards (4.8) Branch instruction determines flow of control Fetching next instruction depends on branch outcome Pipeline cannot always fetch correct instruction Still working on ID stage of branch In MIPS pipeline Need to compare registers and determine the branch condition Chapter 4 — The Processor

62 Morgan Kaufmann Publishers
9 May, 2019 Branch Hazards If branch outcome determined in MEM Flush these instructions (Set control values to 0) PC Chapter 4 — The Processor

63 Morgan Kaufmann Publishers
9 May, 2019 Reducing Branch Delay Move hardware to determine outcome to ID stage Target address adder Register comparator Add IF.Flush signal to squash IF/ID register Example: branch taken 36: sub $10, $4, $8 40: beq $1, $3, 72 44: and $12, $2, $5 48: or $13, $2, $6 52: add $14, $4, $2 56: slt $15, $6, $ : lw $4, 50($7) Chapter 4 — The Processor

64 Morgan Kaufmann Publishers
9 May, 2019 Example: Branch Taken Chapter 4 — The Processor

65 Morgan Kaufmann Publishers
9 May, 2019 Example: Branch Taken Chapter 4 — The Processor

66 Data Hazards for Branches
Morgan Kaufmann Publishers 9 May, 2019 Data Hazards for Branches If a comparison register is a destination of 2nd or 3rd preceding ALU instruction IF ID EX MEM WB add $1, $2, $3 IF ID EX MEM WB add $4, $5, $6 IF ID EX MEM WB IF ID EX MEM WB beq $1, $4, target Can resolve using forwarding to ID Need to add forwarding logic! Chapter 4 — The Processor

67 Data Hazards for Branches
Morgan Kaufmann Publishers 9 May, 2019 Data Hazards for Branches If a comparison register is a destination of preceding ALU instruction or 2nd preceding load instruction Need 1 stall cycle IF ID EX MEM WB lw $1, addr IF ID EX MEM WB add $4, $5, $6 beq stalled IF ID beq $1, $4, target ID EX MEM WB Chapter 4 — The Processor

68 Data Hazards for Branches
Morgan Kaufmann Publishers 9 May, 2019 Data Hazards for Branches If a comparison register is a destination of immediately preceding load instruction Need 2 stall cycles IF ID EX MEM WB lw $1, addr beq stalled IF ID beq stalled ID beq $1, $0, target ID EX MEM WB Chapter 4 — The Processor

69 Delay Slot (MIPS) Expose pipeline
Load and jump/branch entail a “delay slot” The instruction right after the jump or branch is executed before the jump/branch jal function_A add $4, $5, $6 ; executed before jmp lw $12, 8($4) ; executed after return Jump/branch and the delay slot instruction are considered “indivisible” In the delay slot, the compiler needs to schedule A useful instruction (either before the jmp, or after the jmp w/o side effect) otherwise a NOP

70 Morgan Kaufmann Publishers
9 May, 2019 Branch Prediction Longer pipelines cannot readily determine branch outcome early Stall penalty becomes unacceptable Predict outcome of branch Only stall if prediction is wrong In MIPS pipeline Can predict branches not taken Fetch instruction after branch, with no delay Chapter 4 — The Processor

71 MIPS with Predict Not Taken
Morgan Kaufmann Publishers 9 May, 2019 MIPS with Predict Not Taken Prediction correct Prediction incorrect Chapter 4 — The Processor

72 1-Bit Predictor: Shortcoming
Morgan Kaufmann Publishers 9 May, 2019 1-Bit Predictor: Shortcoming Inner loop branches mispredicted twice! outer: … … inner: … beq …, …, inner … beq …, …, outer Mispredict as taken on last iteration of inner loop Then mispredict as not taken on first iteration of inner loop next time around Chapter 4 — The Processor

73 2-Bit Predictor: State Machine
Morgan Kaufmann Publishers 9 May, 2019 2-Bit Predictor: State Machine Only change prediction on two successive mispredictions Chapter 4 — The Processor

74 More-Realistic Branch Prediction
Morgan Kaufmann Publishers 9 May, 2019 More-Realistic Branch Prediction Static branch prediction Based on typical branch behavior Example: loop and if-statement branches Predict backward branches taken Predict forward branches not taken Dynamic branch prediction Hardware measures actual branch behavior e.g., record recent history of each branch Assume future behavior will continue the trend When wrong, stall while re-fetching, and update history Chapter 4 — The Processor

75 Instruction Level Parallelism (ILP)
AMD Bobcat ECE 6100 Later in this course ECE 6100 Instruction Level Parallelism (ILP) Later in this course

76 Intel Sandy Bridge bdti.com

77 Exceptions and Interrupts (4.9)
Morgan Kaufmann Publishers 9 May, 2019 Exceptions and Interrupts (4.9) “Unexpected” events requiring change in flow of control Different ISAs use the terms differently Exception Arises within the CPU e.g., undefined opcode, overflow, syscall, … Interrupt From an external I/O controller Updates to the data path Recording the cause of the exception and transferring control to the OS Consider the impact of hardware modifications on the critical path Chapter 4 — The Processor

78 Morgan Kaufmann Publishers
9 May, 2019 Handling Exceptions In MIPS, exceptions managed by a System Control Coprocessor (CP0) Save PC of offending (or interrupted) instruction In MIPS: Exception Program Counter (EPC) Save indication of the problem In MIPS: Cause register We’ll assume 1-bit 0 for undefined opcode, 1 for overflow Jump to handler at Chapter 4 — The Processor

79 Exception Handling: Operations
Add two registers to the datapath EPC and Cause registers Add a state for each exception condition Use the ALU to compute the EPC contents Write the Cause register with exception condition Update the PC with OS handler address Generate control signals for each operation See Appendix A.7 for details of MIPS 2000/3000 implementation

80 The OS Interactions The MIPS 32 Status and Cause Registers
31 15 8 4 1 Status Register Interrupt Mask User mode Excep level int enable 31 15 8 6 2 Cause Register Branch Delay Pending Interrupts Exception Code Operating System handlers interrogate these registers Manage all state saving requirements Read example in A.7

81 An Alternate Mechanism
Morgan Kaufmann Publishers 9 May, 2019 An Alternate Mechanism Vectored Interrupts Handler address determined by the cause Example: Undefined opcode: C Overflow: C …: C Instructions either Deal with the interrupt, or Jump to real handler Chapter 4 — The Processor

82 Morgan Kaufmann Publishers
9 May, 2019 Handler Actions Read cause, and transfer to relevant handler Determine action required If restartable Take corrective action use EPC to return to program Otherwise Terminate program Report error using EPC, cause, … Chapter 4 — The Processor

83 Exceptions in a Pipeline
Morgan Kaufmann Publishers 9 May, 2019 Exceptions in a Pipeline Another form of control hazard Consider overflow on add in EX stage add $1, $2, $1 Prevent $1 from being clobbered Complete previous instructions Flush add and subsequent instructions Set Cause and EPC register values Transfer control to handler Similar to mispredicted branch Use much of the same hardware Chapter 4 — The Processor

84 Pipeline with Exceptions
Morgan Kaufmann Publishers 9 May, 2019 Pipeline with Exceptions Chapter 4 — The Processor

85 Morgan Kaufmann Publishers
9 May, 2019 Exception Properties Restartable exceptions Pipeline can flush the instruction Handler executes, then returns to the instruction Re-fetched and executed from scratch PC saved in EPC register Identifies causing instruction Actually PC + 4 is saved Handler must adjust Chapter 4 — The Processor

86 Morgan Kaufmann Publishers
9 May, 2019 Exception Example Exception on add in 40 sub $11, $2, $4 44 and $12, $2, $5 48 or $13, $2, $6 4C add $1, $2, $1 50 slt $15, $6, $7 54 lw $16, 50($7) … Handler sw $25, 1000($0) sw $26, 1004($0) … Chapter 4 — The Processor

87 Morgan Kaufmann Publishers
9 May, 2019 Exception Example Chapter 4 — The Processor

88 Morgan Kaufmann Publishers
9 May, 2019 Exception Example Chapter 4 — The Processor

89 Morgan Kaufmann Publishers
9 May, 2019 Multiple Exceptions Pipelining overlaps multiple instructions Could have multiple exceptions at once Simple approach: deal with exception from earliest instruction Flush subsequent instructions “Precise” exceptions In complex pipelines Multiple instructions issued per cycle Out-of-order completion Maintaining precise exceptions is difficult! Chapter 4 — The Processor

90 Morgan Kaufmann Publishers
9 May, 2019 Imprecise Exceptions Just stop pipeline and save state Including exception cause(s) Let the handler work out Which instruction(s) had exceptions Which to complete or flush May require “manual” completion Simplifies hardware, but more complex handler software Not feasible for complex multiple-issue out-of-order pipelines Chapter 4 — The Processor

91 Performance How do we assess the impact of stall cycles?
How close do we approach the ideal of one instruction per cycle execution time? Back to the CPI model!

92 Recall: Program Execution time
Number of instruction classes ~= Instruction_count * CPIavg * clock_cycle_time technology algorithms/compiler architecture Relative frequency

93 Study Guide Given a code block, and initial register values (those that are accessed) be able to determine state of all pipeline registers at some future clock cycle. Determine the size of each pipeline register Track pipeline state in the case of forwarding and branches Compute the number of cycles to execute a code block Modify the datapath to include forwarding and hazard detection for branches (this is trickier and time consuming but well worth it)

94 Study Guide (cont.) Schedule code (manually) to improve performance, for example to eliminate hazards and fill delay slots Modify the data path to add new instructions such as j Modify the data path to accommodate a two cycle data memory access, i.e., the data memory itself is a two cycle pipeline Modify the forwarding and hazard control logic Given a code sequence, be able to compute the number of stall cycles

95 Study Guide (cont.) Track the state of the 2-bit branch predictor over a sequence of branches in a code segment, for example a for-loop Show the pipeline state before and after an exception has taken place.

96 Glossary Branch prediction Branch hazards Branch delay Control hazard
Data hazard Delay slot Dynamic instruction issue Forwarding Imprecise exception Instruction scheduling Instruction level parallelism (ILP) Load-to-use hazard Pipeline bubbles Stall cycles Static instruction issue Structural hazard


Download ppt "Pipelined Datapath Lecture notes from MKP, H. H. Lee and S. Yalamanchili."

Similar presentations


Ads by Google