1 COMP 206: Computer Architecture and Implementation
Montek Singh
Wed, Oct 5, 2005
Topic: Instruction-Level Parallelism (Dynamic Scheduling: Scoreboarding)


2 Reading
• Appendix A (Crosscutting Issues)

3 What is a Scoreboard?
• A scoreboard is a table maintained by the hardware:
  - keeps track of instructions being fetched, issued, executed, etc.
  - keeps track of the resources (functional units and operands) they use or need
  - keeps track of which instructions modify which registers
• uses this information to dynamically schedule instructions
• very similar to a pen-and-paper calculation
• simple step-by-step procedure, easily implemented in hardware

4 MIPS with a Scoreboard

5 Dynamic Scheduling with a Scoreboard
• Original development in the CDC 6600
• Simplified example in HP3 for MIPS FP operations (read Section A.8)
  - Uses neither renaming nor forwarding: values always move from registers to functional units, and from functional units back to registers
  - However, write-back of results happens as soon as possible, not in a statically scheduled slot
  - Out-of-order completion can give rise to WAR and WAW hazards
  - Remember: the machine “knows” the original program order (needed for hazard detection)
• Machine model
  - 2 FP multipliers (10 cycles), 1 FP adder (2 cycles), 1 FP divider (40 cycles), all non-pipelined
  - 1 integer unit for everything else (incl. memory references)

6 Scoreboard Implications
• Out-of-order completion: WAW, WAR hazards?
  - for WAW: stall in Issue until the previous write completes
  - for WAR: stall in Write Result until the previous read completes
• Need to have multiple instructions in the execution phase: multiple execution units or pipelined execution units
• Scoreboard keeps track of dependences and the state of operations
• Scoreboard replaces ID, EX, WB with 4 stages

7 Four Stages of Scoreboard Control
1. Issue: decode the instruction and check for structural hazards (ID1)
  - If the functional unit is free and there is no WAW hazard with another active instruction, the scoreboard issues the instruction to the functional unit and updates its internal data structure.
  - If a structural or WAW hazard exists, instruction issue stalls; unless there is buffering between fetch and issue, no further instructions can issue until these hazards are cleared.
2. Read operands: wait until there are no data hazards, then read (ID2)
  - A source operand is available if no earlier issued active instruction is going to write it.
  - When all source operands are available, the scoreboard tells the functional unit to proceed to read the operands from the registers and begin execution.
  - Thus, the scoreboard resolves RAW hazards dynamically in this step; instructions may be sent into execution out of order.
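
The Issue and Read Operands conditions above can be sketched as two small predicates. This is only an illustrative sketch, not the hardware logic from the slides: the `Active` record, its fields, and the unit names `Mult1` and `Add` are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Active:
    """An issued instruction that has not yet written its result (hypothetical record)."""
    unit: str    # functional unit it occupies
    dest: str    # destination register
    order: int   # position in program order

def can_issue(unit, dest, active):
    """Issue (ID1): the unit must be free and no active instruction may write dest."""
    structural = any(a.unit == unit for a in active)
    waw = any(a.dest == dest for a in active)
    return not structural and not waw

def operands_ready(sources, order, active):
    """Read operands (ID2): no earlier active instruction is going to write a source."""
    return not any(a.dest in sources and a.order < order for a in active)

# One active instruction: a multiply on Mult1 that will write F0.
active = [Active(unit="Mult1", dest="F0", order=0)]

print(can_issue("Add", "F8", active))             # True: unit free, no WAW
print(can_issue("Mult1", "F2", active))           # False: structural hazard
print(can_issue("Add", "F0", active))             # False: WAW hazard on F0
print(operands_ready({"F0", "F6"}, 1, active))    # False: RAW hazard on F0
print(operands_ready({"F2", "F6"}, 1, active))    # True
```

Keeping the two checks separate mirrors the split of the old ID stage into ID1 and ID2: an instruction can pass the first check and still wait at the second.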

8 Four Stages of Scoreboard Control (cont.)
3. Execution: operate on operands
  - The functional unit begins execution upon receiving operands.
  - When the result is ready, it notifies the scoreboard.
4. Write Result: finish execution (WB)
  - Once the scoreboard is aware that the functional unit has completed execution, it checks for WAR hazards.
  - If there is no WAR hazard, it writes the result.
  - If there is a WAR hazard, it stalls the completing instruction.
  - Example:
      DIV.D F0,F2,F4
      ADD.D F10,F0,F8
      SUB.D F8,F8,F14
    The CDC 6600 scoreboard would stall SUB.D until ADD.D reads its operands.
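
The WAR check in the Write Result stage can be illustrated with the three-instruction example above. This is a minimal sketch under a simplifying assumption: instead of the scoreboard's per-unit fields, it uses a hypothetical `has_read` list that records which instructions have already read their operands.

```python
# (name, destination, sources) in program order, taken from the example above.
PROGRAM = [
    ("DIV.D", "F0",  ("F2", "F4")),
    ("ADD.D", "F10", ("F0", "F8")),
    ("SUB.D", "F8",  ("F8", "F14")),
]

def war_stall(i, program, has_read):
    """True if instruction i must wait in Write Result: some earlier
    instruction uses i's destination as a source and has not read it yet."""
    dest = program[i][1]
    return any(dest in program[j][2] and not has_read[j] for j in range(i))

# DIV.D is executing, ADD.D is waiting for F0, SUB.D has finished executing.
has_read = [True, False, True]
print(war_stall(2, PROGRAM, has_read))   # True: ADD.D still needs the old F8

# Once DIV.D has written F0 and ADD.D has read its operands, SUB.D may write.
has_read[1] = True
print(war_stall(2, PROGRAM, has_read))   # False
```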

9 Three Parts of the Scoreboard
1. Instruction status: which of the 4 steps the instruction is in
2. Functional unit (FU) status: indicates the state of the FU; nine fields for each functional unit
  - Busy: whether the unit is busy or not
  - Op: operation to perform in the unit (e.g., + or -)
  - Fi: destination register
  - Fj, Fk: source registers
  - Qj, Qk: functional units producing source registers Fj, Fk
  - Rj, Rk: flags indicating when Fj, Fk are ready (and not yet read)
3. Register result status: indicates which functional unit will write each register, if any; blank when no pending instruction will write that register
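
One possible way to picture the three parts as data structures is sketched below. The class names, the lowercase field names, and the unit names are assumptions chosen for illustration; the snapshot at the end is a made-up state, not one of the slides' tables.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, Optional

@dataclass
class FUStatus:
    """The nine per-unit fields listed above."""
    busy: bool = False
    op: Optional[str] = None
    fi: Optional[str] = None   # destination register
    fj: Optional[str] = None   # source register 1
    fk: Optional[str] = None   # source register 2
    qj: Optional[str] = None   # unit producing fj (None if no producer is pending)
    qk: Optional[str] = None   # unit producing fk
    rj: bool = False           # fj ready and not yet read
    rk: bool = False           # fk ready and not yet read

@dataclass
class Scoreboard:
    instruction_status: Dict[str, str] = field(default_factory=dict)  # instruction -> stage
    fu_status: Dict[str, FUStatus] = field(default_factory=dict)      # unit -> fields
    register_result: Dict[str, str] = field(default_factory=dict)     # register -> writing unit

# Unit names follow the machine model (2 multipliers, 1 adder, 1 divider, 1 integer unit).
UNITS = ("Integer", "Mult1", "Mult2", "Add", "Divide")
sb = Scoreboard(fu_status={u: FUStatus() for u in UNITS})

# Hypothetical snapshot: MUL.D F0,F2,F4 has issued while a load into F2
# is still pending on the Integer unit.
sb.instruction_status["MUL.D F0,F2,F4"] = "Issue"
sb.fu_status["Mult1"] = FUStatus(busy=True, op="MUL.D", fi="F0", fj="F2", fk="F4",
                                 qj="Integer", qk=None, rj=False, rk=True)
sb.register_result.update({"F2": "Integer", "F0": "Mult1"})

print(len(fields(FUStatus)))            # 9
print(sb.register_result.get("F4"))     # None: no pending writer for F4
```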

10 Scoreboard Example, Cycle 0

11 Scoreboard Example, Cycle 1: First LD issues

12 Scoreboard Example, Cycle 2: Structural hazard on Integer unit; second LD stalls in IF stage

13 Scoreboard Example, Cycle 3: Second LD is still stalled

14 Scoreboard Example, Cycle 4: Second LD still stalled; first LD done

15 Scoreboard Example, Cycle 5: Second LD issues as the structural hazard on Integer unit has cleared

16 Scoreboard Example, Cycle 6: MULT issues

17 Scoreboard Example, Cycle 7: SUBD issues; MULT stalled on LD

18 Scoreboard Example, Cycle 8a: DIVD issues; SUBD stalled on LD

19 Scoreboard Example, Cycle 8b: LD writes F2; MULT and SUBD enabled

20 Scoreboard Example, Cycle 9: MULT and SUBD read operands and enter execution

21 Scoreboard Example, Cycle 10: Structural hazard on Add unit stalls the final ADDD

22 Scoreboard Example, Cycle 11: SUBD and MULT are still in execution

23 Scoreboard Example, Cycle 12: SUBD writes results; Add unit free; structural hazard resolves

24 Scoreboard Example, Cycle 13: Note WAR hazard between DIVD and ADDD

25 Scoreboard Example, Cycle 14: MULT still executing; DIVD stalled on F0 (RAW hazard)

26 Scoreboard Example, Cycle 15: MULT still executing

27 Scoreboard Example, Cycle 16: ADDD completes execution, ready to write result into F6

28 Scoreboard Example, Cycle 17: WAR hazard: ADDD stalls in Write Result stage

29 Scoreboard Example, Cycle 18: DIVD stalled (RAW hazard on F0), ADDD stalled (WAR hazard on F6)

30 Scoreboard Example, Cycle 19: MULT completes execution

31 Scoreboard Example, Cycle 20: MULT writes result; DIVD can proceed to read operands at next cycle

32 Scoreboard Example, Cycle 21: DIVD reads operands; WAR hazard on F6 is resolved

33 Scoreboard Example, Cycle 22: 40-cycle divide! ADDD completes writing of result

34 Scoreboard Example, Cycle 61: DIVD completes execution; ready to write result
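
The cycle-by-cycle walk-through can be approximated with a small timing model. This is a sketch under stated assumptions, not the scoreboard tables from the slides (which did not survive in this transcript). The full operand lists are assumed to be those of the usual textbook sequence; the transcript itself only confirms that the second LD writes F2, MULT writes F0, ADDD writes F6, and DIVD reads F0 and F6. The one-cycle latency of the integer unit is inferred from the first LD finishing in cycle 4. The model follows program order directly instead of keeping Fj/Qj/Rj fields.

```python
LATENCY = {"int": 1, "mult": 10, "add": 2, "div": 40}   # execution cycles per unit type
UNITS = {"int": 1, "mult": 2, "add": 1, "div": 1}       # number of units of each type

# (text, unit type, destination, sources); operands are assumed, see the note above.
PROGRAM = [
    ("L.D   F6,34(R2)", "int",  "F6",  ("R2",)),
    ("L.D   F2,45(R3)", "int",  "F2",  ("R3",)),
    ("MUL.D F0,F2,F4",  "mult", "F0",  ("F2", "F4")),
    ("SUB.D F8,F6,F2",  "add",  "F8",  ("F6", "F2")),
    ("DIV.D F10,F0,F6", "div",  "F10", ("F0", "F6")),
    ("ADD.D F6,F8,F2",  "add",  "F6",  ("F8", "F2")),
]

def simulate(program):
    """Return, per instruction, the cycle in which each of the four stages finishes."""
    times = [{"issue": None, "read": None, "exec": None, "write": None} for _ in program]

    def done_before(j, stage, cycle):
        # A stage only has an effect on other instructions in later cycles.
        when = times[j][stage]
        return when is not None and when < cycle

    cycle = 0
    while any(row["write"] is None for row in times):
        cycle += 1
        for i, (_, kind, dest, srcs) in enumerate(program):
            row = times[i]
            if row["issue"] is None:
                # In-order issue, at most one instruction per cycle.
                if i == 0 or done_before(i - 1, "issue", cycle):
                    active = [j for j in range(i) if not done_before(j, "write", cycle)]
                    unit_free = sum(program[j][1] == kind for j in active) < UNITS[kind]
                    waw = any(program[j][2] == dest for j in active)
                    if unit_free and not waw:
                        row["issue"] = cycle
                break  # nothing behind an unissued instruction can issue
            elif row["read"] is None:
                # RAW: every earlier writer of a source must already have written.
                if all(done_before(j, "write", cycle)
                       for j in range(i) if program[j][2] in srcs):
                    row["read"] = cycle
            elif row["exec"] is None:
                if cycle == row["read"] + LATENCY[kind]:
                    row["exec"] = cycle
            elif row["write"] is None:
                # WAR: every earlier reader of the destination must already have read.
                if all(done_before(j, "read", cycle)
                       for j in range(i) if dest in program[j][3]):
                    row["write"] = cycle
    return times

print(f"{'instruction':18s} issue  read  exec write")
for (text, *_), row in zip(PROGRAM, simulate(PROGRAM)):
    print(f"{text:18s} {row['issue']:>5} {row['read']:>5} {row['exec']:>5} {row['write']:>5}")
```

Under these assumptions the sketch should agree with the slide notes: the second LD issues in cycle 5, ADDD issues in cycle 13 and writes in cycle 22, and DIVD reads its operands in cycle 21 and completes execution in cycle 61.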

35 Scoreboard Summary
• CDC designers measured a performance improvement of 1.7 for compiled FORTRAN code and 2.5 for assembly
  - No pipeline scheduling in software
  - Slow memory (no cache)
• Limitations of the 6600 scoreboard
  - No forwarding
  - Limited to instructions in a basic block (small issue window)
  - Number of functional units (structural hazards)
  - Wait for WAR hazards
  - Prevent WAW hazards

36 Scoreboard: Bookkeeping Actions

Instruction status: Issue
  Wait until:  not Busy[FU] and not Result[D]
  Bookkeeping: Busy[FU] ← yes; Op[FU] ← op; Fi[FU] ← D; Fj[FU] ← S1; Fk[FU] ← S2; Qj[FU] ← Result[S1]; Qk[FU] ← Result[S2]; Rj ← not Qj; Rk ← not Qk; Result[D] ← FU

Instruction status: Read Operands
  Wait until:  Rj and Rk
  Bookkeeping: Rj ← No; Rk ← No; Qj ← 0; Qk ← 0

Instruction status: Execution Complete
  Wait until:  Functional unit done
  Bookkeeping: (none)

Instruction status: Write Result
  Wait until:  ∀f ((Fj[f] ≠ Fi[FU] or Rj[f] = No) & (Fk[f] ≠ Fi[FU] or Rk[f] = No))
  Bookkeeping: ∀f (if Qj[f] = FU then Rj[f] ← yes); ∀f (if Qk[f] = FU then Rk[f] ← yes); Result[Fi[FU]] ← 0; Busy[FU] ← No
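
The bookkeeping rules above translate fairly directly into table updates. The following is a minimal sketch, assuming the scoreboard is held in plain Python dictionaries; the function names and the use of `None` in place of the table's 0/blank entries are choices made for this example only.

```python
def new_scoreboard(unit_names):
    """Empty FU status table plus an empty register result table."""
    blank = {"Busy": False, "Op": None, "Fi": None, "Fj": None, "Fk": None,
             "Qj": None, "Qk": None, "Rj": False, "Rk": False}
    return {"units": {u: dict(blank) for u in unit_names}, "result": {}}

def can_issue(sb, fu, d):
    # Wait until: not Busy[FU] and not Result[D]
    return not sb["units"][fu]["Busy"] and sb["result"].get(d) is None

def issue(sb, fu, op, d, s1, s2):
    # Look up the producers before Result[D] is overwritten (D may equal a source).
    qj, qk = sb["result"].get(s1), sb["result"].get(s2)
    sb["units"][fu].update(Busy=True, Op=op, Fi=d, Fj=s1, Fk=s2,
                           Qj=qj, Qk=qk, Rj=qj is None, Rk=qk is None)
    sb["result"][d] = fu

def can_read(sb, fu):
    # Wait until: Rj and Rk
    return sb["units"][fu]["Rj"] and sb["units"][fu]["Rk"]

def read_operands(sb, fu):
    sb["units"][fu].update(Rj=False, Rk=False, Qj=None, Qk=None)

def can_write(sb, fu):
    # Wait until: for all f, (Fj[f] != Fi[FU] or Rj[f] = No) and (Fk[f] != Fi[FU] or Rk[f] = No)
    d = sb["units"][fu]["Fi"]
    return all((f["Fj"] != d or not f["Rj"]) and (f["Fk"] != d or not f["Rk"])
               for f in sb["units"].values())

def write_result(sb, fu):
    for f in sb["units"].values():
        if f["Qj"] == fu:
            f["Rj"] = True
        if f["Qk"] == fu:
            f["Rk"] = True
    sb["result"][sb["units"][fu]["Fi"]] = None
    sb["units"][fu]["Busy"] = False

# A load into F2 followed by a multiply that needs F2 (illustrative operands).
sb = new_scoreboard(["Integer", "Mult1", "Add"])
issue(sb, "Integer", "L.D", "F2", "R3", None)
issue(sb, "Mult1", "MUL.D", "F0", "F2", "F4")
print(can_read(sb, "Mult1"))        # False: F2 is still owed by the Integer unit
print(can_issue(sb, "Add", "F0"))   # False: WAW hazard with the multiply
read_operands(sb, "Integer")
print(can_write(sb, "Integer"))     # True: nobody is waiting to read an old F2
write_result(sb, "Integer")
print(can_read(sb, "Mult1"))        # True: F2 has been written
```

Each `can_*` function is the "Wait until" column and each action function is the "Bookkeeping" column, so the two can be checked against the table row by row.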

