Presentation is loading. Please wait.

Presentation is loading. Please wait.

RTL for the SRC pipeline registers

Similar presentations


Presentation on theme: "RTL for the SRC pipeline registers"— Presentation transcript:

1 RTL for the SRC pipeline registers
IR2 M[PC] Instruction Fetch PC2 PC+4 X3 l-s2:(rel2:PC2,disp2:(rb=0):?,(rb!=0):R[rb]),brl2:PC2,alu2:R[rb] Y3  l-s2:(rel2:c1,disp2:c2),branch2:,alu2:(imm2:c2,!imm2:R[rc]) MD3 store2:R[ra],IR3  IR2,stop2:Run  0, PC  !branch2:PC+4,branch2:(cond(IR2,R[rc]):R[rb];!cond(IR2,R[rc]):PC+4; Decode and Operand Read Z4  (I-s3:X3+Y3, brl3 :X3, Alu3:X3 op Y3 MD4  MD3; IR4  IR3; ALU Operation Z5  (load4:M[Z4],ladr4~branch4~alu4:Z4), store4 : (M[Z4]  MD4), IR5 IR4 Memory Access regwrite5 : (R[ra]  Z5); Register Writeback

2 Review

3 CS501 Advanced Computer Architecture
Lecture 20 Dr.Noor Muhammad Sheikh

4 CS501 Advanced Computer Architecture

5 Instruction propagation through the pipeline
Consider the following SRC code segment flowing through the pipeline. The instructions along with their addresses are 200: add r1, r2, r3 204: ld r5, [4+r7] 208: br r6 212: str r4, 56 400

6 Instruction propagation through the pipeline
Instruction Fetch Add instruction enters the pipeline.The value in PC is incremented from 200 to 204. Only the highlighted portion of pipeline is active Decode and Operand Read ALU Operation Memory Access Register Writeback

7 Decode and Operand Read
First clock cycle Mp1 Instruction Memory MUX PC Add is fetched Inc4 IR2 regwrite Register File R[rb] R[rc] R[ra] opcode ra rb rc c PC2 Decode and Operand Read MUX Mp3 Mp2 MUX Mp4 MUX Y3 X3 IR3 MD3 decoder ALU Operation ALU IR4 Z4 MD4 Data Memory Memory Access MUX Mp5 IR5 Z5 Register Writeback

8 Instruction propagation through the pipeline
ld instruction enters the pipeline.The value in PC is incremented from 204 to 208. add moves to decode stage. Its operands are fetched from the register file and moved to X3 and Y3 at the end of clock cycle Instruction Fetch Decode and Operand Read ALU Operation Memory Access Register Writeback

9 Second clock cycle … Mp1 Instruction MUX Memory PC
Register File R[rb] R[rc] R[ra] regwrite PC2 opcode ra rb rc c PC Inc4 MUX IR2 Ld instruction is fetched Operands for add are being fetched MUX Mp3 Mp2 MUX Mp4 MUX Y3 X3 IR3 MD3 decoder ALU Operation ALU IR4 Z4 MD4 Data Memory Memory Access MUX Mp5 IR5 Z5 Register Writeback

10 Instruction propagation through the pipeline
Instruction Fetch br instruction enters the pipeline.The value in PC is incremented from 208 to 212. ld moves to decode stage. The operands are fetched to calculate the displacement address. Decode and Operand Read add operation is being executed. The results are written to Z4 at the trailing edge of clock. ALU Operation Memory Access Register Writeback

11 third clock cycle … Mp1 Instruction MUX Memory PC
Register File R[rb] R[rc] R[ra] regwrite PC2 opcode ra rb rc c PC Inc4 MUX IR2 Br instruction is being fetched Ld instruction decoded MUX Mp3 Mp2 MUX Mp4 MUX Y3 X3 IR3 MD3 decoder Add operation in execution ALU IR4 Z4 MD4 Data Memory Memory Access MUX Mp5 IR5 Z5 Register Writeback

12 Instruction propagation through the pipeline
Instruction Fetch str instruction enters the pipeline.The value in PC is incremented from 212 to 216. Br is in the decode stage. Since this branch is always true, the contents of PC are modified to new address. Decode and Operand Read The address is being calculated here for ld. The results are written to Z4. ALU Operation add does not access memory. The result is written to Z5 at the trailing edge of clock. Memory Access Register Writeback

13 fourth clock cycle … Mp1 Instruction MUX Memory PC
Register File R[rb] R[rc] R[ra] regwrite PC2 opcode ra rb rc c PC Inc4 MUX IR2 Str is being fetched Br address calculated MUX Mp3 Mp2 MUX Mp4 MUX Y3 X3 IR3 MD3 decoder Ld address calculated ALU IR4 Z4 MD4 Data Memory No memory access by add MUX Mp5 IR5 Z5 Register Writeback

14 Instruction propagation through the pipeline
Instruction Fetch The instruction at address 400 enters the pipeline. The value in PC is incremented from 400 to 404. str is in the decode stage. The operands are being Fetched for address calculation to X3 and Y3. Decode and Operand Read Br instruction just propagates through this stage without any calculation. ALU Operation Ld accesses data memory at the address specified in Z4 and result stored in Z5 at falling edge of clock. Memory Access The result of addition is written into register r1 . add instruction completes. Register Writeback

15 fifth clock cycle … Mp1 Instruction MUX Memory PC
Register File R[rb] R[rc] R[ra] regwrite PC2 opcode ra rb rc c PC Inc4 MUX IR2 New instruction fetched str decode and operand Read MUX Mp3 Mp2 MUX Mp4 MUX Y3 X3 IR3 MD3 decoder br moves through the stage ALU IR4 Z4 MD4 Data Memory ld accesses data memory MUX Mp5 IR5 Z5 add completes

16 Pipelined execution: alternate notation
Instruction Fetch Instruction decode ALU Operation Memory Access Register Writeback CC1 add r1,r2,r3 CC2 ld r5,(4+r7) add r1,r2,r3 CC3 br r6 ld r5,(4+r7) add r1,r2,r3 CC4 str r4,56 br r6 ld r5,(4+r7) add r1,r2,r3 CC5 _ add r1,r2,r3 str r4,56 br r6 ld r5,(4+r7)

17 Data Hazard An example of this RAW (read after write) data hazard is;
200: add r2, r3, r4 204: sub r7, r2, r6 The register r2 is written in clock cycle 5 hence the sub instruction cannot proceed beyond stage 2 until the add leaves the pipeline.

18 Solutions to data hazard
Detection Correction Delay or stalls Data forwarding

19 Data Forwarding Data can be forwarded to the next instruction from the stage where it is available without waiting for the completion of the instruction Data normally required at stage 2 (operand fetch) Data earliest available at stage 3 output( ALU result) or stage 4 output ( memory access result) Designing a data forwarding unit requires the study of dependency distances

20 Data Dependence Distances
The following table shows instruction pair hazard interaction Write to register file Data Available Normal/Earliest stage Instruction class alu load ladr brl 6/4 6/5 6/2 2/3 4/1 4/2 store(rb) store(ra) 2/4 branch 2/2 4/3 Data Available Normal/Earliest stage Read from register file Normal/forwarded No hazard

21 Notes: Without forwarding, the minimum spacing required between two data dependent instructions to avoid hazard is four The load instruction has a minimum distance of two from all other instructions except branch Branch delays cannot be removed even with forwarding.

22 Compiler solution to hazards
Hazards can be detected by the compiler, by analyzing the instruction sequences and dependencies Compiler inserts bubbles (nop instruction) between two instructions that form a hazard. The compiler solution to hazards is complex, expensive and not very efficient as compared to the hardware solution

23 SRC hazard correction: Pipeline stalls
Consider the following sequence of instruction going through the SRC pipeline 200: shl r6, r3, 2 204: str r3, 32 208: sub r2, r4,r5 212: add r1,r2,r3 216: ld r7, 48 There is a data hazard between instruction three and four that can be resolved by using pipeline stalls or bubbles

24 SRC hazard correction: Pipeline stalls
Instruction Fetch Instruction decode ALU Operation Memory Access Register Writeback CC1 Ld r7,48 Add r1,r2,r3 Sub r2,r4,r5 Str r3,32 Shl r6,r3,2 CC2 Ld r7,48 Add r1,r2,r3 nop Sub r2,r4,r5 Str r3,32 CC3 nop nop Sub r2,r4,r5 Ld r7,48 Add r1,r2,r3 nop nop CC4 Ld r7,48 Add r1,r2,3 nop nop nop CC5 new Ld r7,48 Add r1,r2,3

25 RTL for hazard detection and pipeline stall
The following RTL expression detects data hazard between stage 2 and 3, then stalls stage 1 and 2 by inserting a bubble in stage 3 alu3&alu2&((ra3=rb2)~((ra3=rc2)&!imm2)): (pause2, pause1, op30) Detects hazard Inserts a nop instruction

26 RTL for hazard detection and pipeline stall
Following is the complete RTL for detecting hazards among alu instructions in different stages of the pipeline Data Hazard between RTL for detection and stalling Stage 2 and 3 alu3&alu2&((ra3=rb2)~((ra3=rc2)&!imm2)): (pause2, pause1, op30) Stage 2 and 4 alu4&alu2&((ra4=rb2)~((ra4=rc2)&!imm2)): Stage 4 and 5 alu5&alu2&((ra5=rb2)~((ra5=rc2)&!imm2)):

27 Data Hazards

28 Pipelining Hazards

29 Branch Hazards

30 Structural Hazards

31 RTL for SRC Pipelining

32 RTL for SRC Pipelining


Download ppt "RTL for the SRC pipeline registers"

Similar presentations


Ads by Google