
1. WAW Hazards In FP MIPS Pipeline

[Pipeline timing diagram, clock cycles 1-18, for the sequence below. Stages shown: IF, ID, EX, M1-M7, A1-A4, MEM, WB, plus stall cycles.]

L.D F4, 0(R2)
MUL.D F0, F4, F6
ADD.D F2, F0, F8
L.D F2, 0(R2)

● Although this looks like a useless sequence of instructions (L.D overwrites F2 immediately after ADD.D writes it), we must detect the WAW hazard and make sure the later value ends up in the register
● One approach (shown) is to delay the write stage of the later instruction
● Another approach is to stamp out the result of the earlier instruction so that it is never written into memory or a register

2. Write Port Structural Hazards In FP MIPS Pipeline

[Pipeline timing diagram, clock cycles 1-11: MUL.D (IF, ID, M1-M7, MEM, WB), ADD.D (IF, ID, A1-A4, MEM, WB) and L.D (IF, ID, EX, MEM, WB), interleaved with other instructions that follow the five-stage IF, ID, EX, MEM, WB path.]

● Detect the write port hazard in the ID stage: use a shift register that indicates when already issued instructions will use the write port, and shift this reservation register one bit at each clock cycle
● We could then insert stalls right after the ID stage
● An alternative solution would insert the stall before MEM or WB
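The reservation scheme on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the actual hardware: the class name `WritePortReservation`, the `schedule` helper, the 16-bit depth, the choice of cycle 2 for the first ID stage, and the ID-to-WB distances (obtained by counting stage names in the diagram) are all assumptions made for the example.

```python
class WritePortReservation:
    """Shift register with one bit per future clock cycle.

    Bit k set means: an already issued instruction will use the single
    register-file write port k cycles from now.
    """

    def __init__(self, depth=16):      # depth is an assumed size
        self.depth = depth
        self.bits = 0

    def try_issue(self, cycles_until_wb):
        """Reserve the write port; return False if the instruction must stall in ID."""
        if not 0 < cycles_until_wb < self.depth:
            raise ValueError("latency does not fit in the shift register")
        mask = 1 << cycles_until_wb
        if self.bits & mask:
            return False               # port already reserved for that cycle
        self.bits |= mask
        return True

    def tick(self):
        """Advance one clock cycle: every reservation moves one bit closer."""
        self.bits >>= 1


# Assumed number of cycles from ID to WB, counted from the stage names:
# MUL.D: M1..M7, MEM, WB; ADD.D: A1..A4, MEM, WB; L.D: EX, MEM, WB
ID_TO_WB = {"MUL.D": 9, "ADD.D": 6, "L.D": 3}


def schedule(ops, first_id_cycle=2):
    """Issue ops in order, one ID slot per cycle; return (op, id_cycle, wb_cycle)."""
    port = WritePortReservation()
    cycle = first_id_cycle
    result = []
    for op in ops:
        while not port.try_issue(ID_TO_WB[op]):
            port.tick()                # stall in ID for one cycle
            cycle += 1
        result.append((op, cycle, cycle + ID_TO_WB[op]))
        port.tick()
        cycle += 1
    return result
```

Under these assumptions, `schedule(["MUL.D", "L.D", "L.D", "ADD.D"])` holds ADD.D in ID for one extra cycle, because it would otherwise reach WB in the same cycle as MUL.D.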

3. Data Hazards In FP MIPS Pipeline

● Since FP and integer operations use different registers, we need only consider moves and loads/stores as potential sources of hazards between FP and integer instructions
● The pipeline checks in ID for:
  ● Structural hazards: DIV unit and write port
  ● RAW hazards: wait until source registers are not listed as pending destinations
  ● WAW hazards: determine whether any instruction in an EX stage has the same destination register; if so, stall the current instruction
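The three ID-stage checks can be written as a single function. The sketch below is a simplification under stated assumptions: `Instr`, `id_stall_reason` and its parameters are hypothetical names, and one set of pending destination registers stands in for both the RAW and the WAW check.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def id_stall_reason(instr, pending_dests, divider_busy=False,
                    write_port_conflict=False):
    """Return why `instr` must stall in ID, or None if it may proceed.

    pending_dests: destination registers of instructions still executing
    (assumed to be tracked by the pipeline control logic).
    """
    # Structural hazards: the unpipelined divider and the single write port
    if instr.op == "DIV.D" and divider_busy:
        return "structural: divider busy"
    if write_port_conflict:
        return "structural: write port"
    # RAW: a source register is still listed as a pending destination
    if any(src in pending_dests for src in instr.srcs):
        return "RAW"
    # WAW: an executing instruction has the same destination register
    if instr.dest is not None and instr.dest in pending_dests:
        return "WAW"
    return None
```

For example, `id_stall_reason(Instr("L.D", "F2", ("R2",)), {"F2"})` reports a WAW stall while an earlier ADD.D that writes F2 is still in its execution stages.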

4. Maintaining Precise Exceptions In FP MIPS Pipeline

● Out-of-order completion is possible:

DIV.D F0, F2, F4
ADD.D F10, F10, F8
SUB.D F12, F12, F14

● Option 1: A history file keeps track of the original values of registers/memory
● Option 2: A future file keeps track of new values; registers/memory are updated when all previous instructions have completed
● Option 3: Proceed only if it is certain that no previous instruction will cause an exception
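Option 1 can be illustrated with a small sketch. It is a toy model under stated assumptions, not a description of real hardware: `HistoryFile`, `write` and `rollback` are hypothetical names, registers are a plain dictionary, and each instruction is identified by its position in program order.

```python
class HistoryFile:
    """Remembers the old value of every register write so it can be undone."""

    def __init__(self, registers):
        self.registers = registers
        self.log = []   # (program_order, register, old_value), in completion order

    def write(self, order, reg, value):
        """Perform a register write and record the value it replaces."""
        self.log.append((order, reg, self.registers[reg]))
        self.registers[reg] = value

    def rollback(self, faulting_order):
        """Undo, newest first, every write made by instructions that follow
        the faulting instruction in program order."""
        for order, reg, old_value in reversed(self.log):
            if order > faulting_order:
                self.registers[reg] = old_value
        self.log = [e for e in self.log if e[0] <= faulting_order]
```

In the slide's sequence, ADD.D (position 1) and SUB.D (position 2) may complete before DIV.D (position 0). If DIV.D then raises an exception, `rollback(0)` restores F10 and F12 to the values they held before ADD.D and SUB.D wrote them.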

5. Instruction Level Parallelism

● The amount of parallelism within a basic block is very small
● We must exploit parallelism across multiple basic blocks
● Pipelining
● Out-of-order execution

6. Dependencies

● If two instructions are independent, they can be executed in parallel
● Otherwise they must execute in order, although they may partially overlap
● Types of dependencies:
  ● Data (true) dependencies
  ● Name dependencies
  ● Control dependencies

7. Data Dependencies

● Instruction j is data dependent on instruction i if either:
  ● Instruction i produces a result that may be used by instruction j, or
  ● Instruction j is data dependent on instruction k, and instruction k is data dependent on instruction i

LOOP: L.D F0, 0(R1)
      ADD.D F4, F0, F2
      S.D F4, 0(R1)
      DADDUI R1, R1, #-8
      BNE R1, R2, LOOP

● What effect do we get if we move the branch condition test to the EX phase?
● Is this a RAW, WAW or WAR hazard?
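The two-part definition above (direct use of a result, or a chain of such uses) can be checked mechanically. The sketch below is a simplified illustration: the function names and the `(text, dest, srcs)` tuple encoding are assumptions, only register dependences within one loop iteration are tracked, and dependences through memory are ignored.

```python
# One iteration of the loop from the slide: (text, destination, sources)
LOOP = [
    ("L.D F0, 0(R1)",      "F0", ("R1",)),
    ("ADD.D F4, F0, F2",   "F4", ("F0", "F2")),
    ("S.D F4, 0(R1)",      None, ("F4", "R1")),
    ("DADDUI R1, R1, #-8", "R1", ("R1",)),
    ("BNE R1, R2, LOOP",   None, ("R1", "R2")),
]


def direct_deps(program):
    """Map each instruction index to the producers of its source registers."""
    last_writer = {}
    deps = {}
    for idx, (_, dest, srcs) in enumerate(program):
        deps[idx] = {last_writer[s] for s in srcs if s in last_writer}
        if dest is not None:
            last_writer[dest] = idx
    return deps


def data_dependent(program, j, i):
    """True if instruction j depends on instruction i, directly or via a chain."""
    deps = direct_deps(program)
    stack, seen = [j], set()
    while stack:
        for producer in deps[stack.pop()]:
            if producer == i:
                return True
            if producer not in seen:
                seen.add(producer)
                stack.append(producer)
    return False
```

Here `data_dependent(LOOP, 2, 0)` holds through the chain L.D, ADD.D, S.D, and `data_dependent(LOOP, 4, 3)` holds because BNE reads the R1 value that DADDUI produces.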

8. Data Dependencies

● Data dependencies can be overcome by:
  ● Leaving the dependence in place but avoiding the hazard
  ● Eliminating the dependence by transforming the code

9. Name Dependencies

● Instructions i and j use the same register or memory location
  ● Antidependence: instruction j writes a location that instruction i reads
  ● Output dependence: instruction j writes a location that instruction i writes
● Since there is no data flow between the instructions, they can be renamed and executed in parallel
● Is this a RAW, WAW or WAR hazard?
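A short sketch shows how renaming removes name dependences while leaving true data flow intact. The encoding of instructions as `(op, dest, srcs)` tuples, the function names, and the fresh names `T0`, `T1`, ... are assumptions made for the illustration; instruction i is taken to come before instruction j in program order.

```python
import itertools


def name_dependences(earlier, later):
    """Name dependences of `later` (instruction j) on `earlier` (instruction i)."""
    _, dest_i, srcs_i = earlier
    _, dest_j, _ = later
    found = set()
    if dest_j is not None and dest_j in srcs_i:
        found.add("antidependence (WAR)")      # j writes what i reads
    if dest_j is not None and dest_j == dest_i:
        found.add("output dependence (WAW)")   # j writes what i writes
    return found


def rename(program):
    """Give every written register a fresh name; sources follow the latest name."""
    latest = {}                                # original name -> current fresh name
    fresh = (f"T{n}" for n in itertools.count())
    renamed = []
    for op, dest, srcs in program:
        new_srcs = tuple(latest.get(s, s) for s in srcs)
        new_dest = None
        if dest is not None:
            new_dest = next(fresh)
            latest[dest] = new_dest
        renamed.append((op, new_dest, new_srcs))
    return renamed
```

For the pair DIV.D F10, F0, F6 followed by ADD.D F6, F8, F2, `name_dependences` reports an antidependence on F6. After `rename`, ADD.D writes a fresh register and the pair no longer conflicts.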

10. Control Dependencies

● Branches incur some penalty: while the target and condition are being evaluated, we cannot be sure which instruction is next
● We have to guess
● We have to reorder instructions so that we execute useful instructions while waiting for the branch
● The main goal is not to affect the correctness of the program

11. Control Dependencies

● Preserve data flow and exception behavior
● Instructions after the branch depend on the branch and on all instructions prior to it

   DADDU R1, R2, R3
   BEQZ R4, L
   DSUBU R1, R5, R6
L: …
   OR R7, R1, R8

● Instruction reordering should not cause exception reordering

    DADDU R2, R3, R4
    BEQZ R2, L1
    LW R1, 0(R2)
L1:

● Only those exceptions are allowed that would surely occur

12. Dynamic Scheduling

● The techniques we have learned so far are static scheduling techniques: forwarding, delayed branches, flush pipeline, predict taken, predict untaken
● The compiler detects dependencies and schedules instruction execution to minimize hazards
● The pipeline executes instructions in order, detects hazards and inserts stalls
● Dynamic scheduling overcomes data hazards by out-of-order execution

13. Out-of-Order Execution

● If some instruction is stalled, check the following instructions to see whether they can proceed (they have no hazards with previous instructions)
● Check for structural and data hazards
● An instruction can be issued as soon as its operands are available
● Out-of-order issue means out-of-order completion and the possibility of WAR and WAW hazards, and problems with exception handling

14. Out-of-Order Execution

● Now is the time to forget that there were 5 stages in the MIPS pipeline, since we will redesign the pipeline in the next slides!

15. Out-of-Order Execution

● To allow out-of-order execution we split the ID stage into two stages:
  ● Issue: decode the instruction, check for structural hazards
  ● Read operands: wait until there are no data hazards
● All instructions pass through the issue stage in order, but they may be reordered in the read operands stage
● Since multiple instructions will be in the EX stage, we need multiple functional units or pipelined functional units (we assume the former)

16. Scoreboarding

● 2 FP multipliers, 1 FP adder, 1 FP divide unit, 1 integer ALU unit

[Block diagram: the scoreboard connected to the functional units (FP mult, FP add, FP div, ALU) and the registers.]

17. Scoreboarding

● Every instruction goes through the scoreboard, which determines whether the instruction has all its operands available
  ● If yes, the instruction can proceed
  ● If no, the scoreboard monitors every change in the hardware to detect when the instruction can proceed
● The scoreboard also controls when the instruction can write its destination register

18. Scoreboarding

● The following four steps replace the ID, EX and WB steps:
  ● ID: Issue. If a functional unit for the instruction is free and no other active instruction has the same destination register (WAW), the instruction can proceed; otherwise it stalls
  ● ID: Read operands. A source operand is available if no earlier instruction plans to write it
  ● EX: Execute. Once execution is complete, this stage notifies the scoreboard
  ● WB: Write results. The scoreboard checks for WAR hazards and may stall the write back
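The checks in these four steps can be expressed over the functional unit status table used in the example slides (columns Busy, Op, Fi, Fj, Fk, Qj, Qk, Rj, Rk). The sketch below is one possible encoding, not the definitive one: the class and function names are assumptions, and an operand slot that is unused (for example a load with a single source) is treated as ready.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UnitStatus:
    busy: bool = False
    op: Optional[str] = None
    fi: Optional[str] = None   # destination register
    fj: Optional[str] = None   # first source register
    fk: Optional[str] = None   # second source register
    qj: Optional[str] = None   # unit that will produce fj (None: no producer)
    qk: Optional[str] = None   # unit that will produce fk
    rj: bool = False           # fj is ready and has not been read yet
    rk: bool = False           # fk is ready and has not been read yet


def can_issue(units: Dict[str, UnitStatus], result_status, unit, dest):
    """Issue: the unit is free and no active instruction writes `dest` (WAW)."""
    return not units[unit].busy and result_status.get(dest) is None


def can_read_operands(units, unit):
    """Read operands: every source in use is marked ready."""
    u = units[unit]
    return (u.fj is None or u.rj) and (u.fk is None or u.rk)


def can_write_result(units, unit):
    """Write result: no other unit still has to read our destination (WAR)."""
    fi = units[unit].fi
    return all(
        (u.fj != fi or not u.rj) and (u.fk != fi or not u.rk)
        for name, u in units.items() if name != unit
    )
```

With the state shown at time 17 of the example (Add holds ADD.D F6, F8, F2; Divide holds DIV.D F10, F0, F6 with Rk = Yes), `can_write_result(units, "Add")` is false until the divide has read F6.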

19. Scoreboarding

● Operands are always read from the register file; no advantage is taken of forwarding
● This is not a large penalty, as the write occurs immediately after execution and not after the MEM stage
● The read operand and write result stages cannot overlap, so we have 1 cycle of latency

20-42. Scoreboarding example

[Slides 20 to 42 each show a snapshot of three tables at one clock cycle:
● Instruction status, with columns Issue, Read operands, Execution complete, Write result
● Functional unit status, with columns Busy, Op, Fi, Fj, Fk, Qj, Qk, Rj, Rk and rows Integer ALU, FP Mult1, FP Mult2, FP Add, FP Div
● Register result status, giving the functional unit that will write each of F0, F2, F4, ..., F12]

Instruction sequence:

L.D F6, 34(R2)
L.D F2, 45(R3)
MUL.D F0, F2, F4
SUB.D F8, F6, F2
DIV.D F10, F0, F6
ADD.D F6, F8, F2

Functional unit entries that appear in the snapshots:

● Integer: Load, Fi = F6, Fk = R2 (first load); later Load, Fi = F2, Fk = R3 (second load)
● Mult1: Mult, Fi = F0, Fj = F2, Fk = F4, Qj = Integer
● Add: Sub, Fi = F8, Fj = F6, Fk = F2, Qk = Integer; later Add, Fi = F6, Fj = F8, Fk = F2
● Divide: Div, Fi = F10, Fj = F0, Fk = F6, Qj = Mult1

From time 10 on, the snapshots also show the cycle in which each running instruction completes execution: 19 for MUL.D, 11 for SUB.D, 16 for ADD.D and 61 for DIV.D.

20. Time = 1
● Issue first load

21. Time = 2
● First load reads operands
● Second load cannot be issued due to structural hazard

22. Time = 3
● First load completes execution
● Second load cannot be issued due to structural hazard

23. Time = 4
● First load writes the result and frees ALU
● Second load cannot be issued due to structural hazard

24. Time = 5
● Second load is issued

25. Time = 6
● Second load reads operands
● Mult is issued

26. Time = 7
● Second load completes execution
● Mult is stalled waiting for F2
● Sub is issued

27. Time = 8
● Second load writes result
● Mult is stalled waiting for F2
● Sub is stalled waiting for F2
● Div is issued

28. Time = 9
● Mult reads operands
● Sub reads operands
● Div is stalled waiting for F0
● Add cannot be issued due to structural hazard

29. Time = 10
● Mult in execution (1 out of 10)
● Sub in execution (1 out of 2)
● Div is stalled waiting for F0
● Add cannot be issued due to structural hazard

30. Time = 11
● Mult in execution (2 out of 10)
● Sub completes execution
● Div is stalled waiting for F0
● Add cannot be issued due to structural hazard

31. Time = 12
● Mult in execution (3 out of 10)
● Sub writes result, frees adder
● Div is stalled waiting for F0
● Add cannot be issued due to structural hazard

32. Time = 13
● Mult in execution (4 out of 10)
● Div is stalled waiting for F0
● Add is issued

33. Time = 14
● Mult in execution (5 out of 10)
● Div is stalled waiting for F0
● Add reads operands

34. Time = 15
● Mult in execution (6 out of 10)
● Div is stalled waiting for F0
● Add in execution (1 out of 2)

35. Time = 16
● Mult in execution (7 out of 10)
● Div is stalled waiting for F0
● Add completes execution

36. Time = 17
● Mult in execution (8 out of 10)
● Div is stalled waiting for F0
● Add is stalled, WAR hazard

37. Time = 19
● Mult completes execution
● Div is stalled waiting for F0
● Add is stalled, WAR hazard

38. Time = 20
● Mult writes result
● Div is stalled waiting for F0
● Add is stalled, WAR hazard

39. Time = 21
● Div reads operands
● Add is stalled, WAR hazard

40. Time = 22
● Div in execution (1 out of 40)
● Add writes result

41. Time = 61
● Div completes execution

42. Time = 62
● Div writes result
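The trace above can be reproduced with a small cycle-by-cycle model. This is a sketch under stated assumptions rather than a faithful scoreboard: the execution latencies (1, 10, 2, 2 and 40 cycles) are read off the slide captions, the names `Instr`, `simulate` and `before` are invented for the example, and instead of the three status tables the model stores, for each instruction, the cycle in which it passed each step. Every decision in cycle t only looks at events from earlier cycles, which gives the one-cycle gap between a write and a dependent read.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed execution latencies (cycles) and functional units per opcode
LATENCY = {"L.D": 1, "MUL.D": 10, "SUB.D": 2, "ADD.D": 2, "DIV.D": 40}
UNITS = {"L.D": ["Integer"], "MUL.D": ["Mult1", "Mult2"],
         "SUB.D": ["Add"], "ADD.D": ["Add"], "DIV.D": ["Divide"]}


@dataclass
class Instr:
    op: str
    dest: str
    srcs: Tuple[str, ...]
    unit: Optional[str] = None
    issue: Optional[int] = None
    read: Optional[int] = None
    done: Optional[int] = None    # cycle in which execution completes
    write: Optional[int] = None


def before(time, t):
    """True if an event happened in a cycle strictly before t."""
    return time is not None and time < t


def simulate(program):
    t = 0
    while any(ins.write is None for ins in program):
        t += 1
        for k, ins in enumerate(program):
            earlier = program[:k]
            if ins.issue is None:
                # Issue: in order, one per cycle; needs a free unit and no WAW
                busy = {j.unit for j in program
                        if j.issue is not None and not before(j.write, t)}
                free = [u for u in UNITS[ins.op] if u not in busy]
                waw = any(j.dest == ins.dest and not before(j.write, t)
                          for j in earlier)
                if free and not waw:
                    ins.unit, ins.issue = free[0], t
                break                     # later instructions are not issued yet
            if ins.read is None:
                # Read operands: all earlier producers of the sources have written
                ready = all(before(j.write, t)
                            for j in earlier if j.dest in ins.srcs)
                if before(ins.issue, t) and ready:
                    ins.read = t
                    ins.done = t + LATENCY[ins.op]
            elif ins.write is None:
                # Write result: stall while an earlier instruction still has
                # to read the old value of the destination (WAR)
                war = any(ins.dest in j.srcs and not before(j.read, t)
                          for j in earlier)
                if before(ins.done, t) and not war:
                    ins.write = t
    return [(i.op, i.issue, i.read, i.done, i.write) for i in program]


PROGRAM = [
    Instr("L.D", "F6", ("R2",)),
    Instr("L.D", "F2", ("R3",)),
    Instr("MUL.D", "F0", ("F2", "F4")),
    Instr("SUB.D", "F8", ("F6", "F2")),
    Instr("DIV.D", "F10", ("F0", "F6")),
    Instr("ADD.D", "F6", ("F8", "F2")),
]
```

Calling `simulate(PROGRAM)` returns one tuple per instruction with its issue, read operands, execution complete and write result cycles, which can be compared against the captions of the snapshots.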

