COMP541 Datapaths II & Single-Cycle MIPS


1 COMP541 Datapaths II & Single-Cycle MIPS
Montek Singh Apr 2, 2012

2 Topics
Complete the datapath
Add control to it
Create a full single-cycle MIPS!
Reading: Chapter 7
Review MIPS assembly language: Chapter 6 of course textbook, or Patterson & Hennessy (inside flap)

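3 Top-Level CPU (MIPS)
[Figure: top-level block diagram. The MIPS processor takes clk and reset, sends pc[31:2] to the Instr Memory and receives instr; it sends memwrite, dataadr, and writedata to the Data Memory (also clocked by clk) and receives readdata.]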

4 Top-Level CPU: Verilog
module top(input clk, reset, output … );  // add signals here for debugging

  wire [31:0] pc, instr, readdata, writedata, dataadr;
  wire        memwrite;

  mips mips(clk, reset, pc, instr, memwrite, dataadr, writedata, readdata);  // processor
  imem imem(pc[31:2], instr);                                  // instr memory
  dmem dmem(clk, memwrite, dataadr, writedata, readdata);      // data memory

endmodule
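The top-level module instantiates imem and dmem without showing them. A minimal sketch of what they could look like follows; the 64-word size, the internal array name RAM, and the port names a, rd, we, wd are assumptions for illustration, chosen only so the port order matches the instantiations above.

```verilog
// Hypothetical memory models for simulation (sizes and names are assumptions).
module imem(input  [29:0] a,      // word address, i.e. pc[31:2]
            output [31:0] rd);
  reg [31:0] RAM [0:63];          // 64 words; a testbench can preload it with $readmemh
  assign rd = RAM[a[5:0]];        // combinational read using the low 6 address bits
endmodule

module dmem(input         clk, we,
            input  [31:0] a, wd,
            output [31:0] rd);
  reg [31:0] RAM [0:63];          // 64 words of data memory
  assign rd = RAM[a[7:2]];        // byte address -> word index, combinational read
  always @(posedge clk)
    if (we) RAM[a[7:2]] <= wd;    // synchronous write when memwrite is asserted
endmodule
```

Reads are combinational and writes are clocked, which is what a single-cycle design needs: lw must see its data within the same cycle, while sw commits at the clock edge.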

5 Top Level Schematic (ISE)
[Figure: ISE schematic showing the imem, MIPS, and dmem blocks]

6 One level down: Inside MIPS
module mips(input         clk, reset,
            output [31:0] pc,
            input  [31:0] instr,
            output        memwrite,
            output [31:0] aluout, writedata,
            input  [31:0] readdata);

  wire       memtoreg, branch, pcsrc, alusrc, regdst, regwrite, jump;
  wire [4:0] alucontrol;  // depends on your ALU
  wire [3:0] flags;       // flags = {Z, V, C, N}

  controller c(instr[31:26], instr[5:0], flags,
               memtoreg, memwrite, pcsrc,
               alusrc, regdst, regwrite, jump,
               alucontrol);
  datapath dp(clk, reset, memtoreg, pcsrc,
              alusrc, regdst, regwrite, jump,
              alucontrol, flags, pc, instr,
              aluout, writedata, readdata);
endmodule

7 A Note on Flags
Book’s design only uses Z (zero)
  simple version of MIPS
  allows beq, bne, slt type of tests
Our design uses { Z, V, C, N } flags
  Z = zero, V = overflow, C = carry out, N = negative
  Allows richer variety of instructions (see next slide)
Wherever you see “zero” in these slides, it should probably read “flags”

8 A Note on Flags
4 flags produced by ALU:
  Z (zero): result is = 0 (big NOR gate)
  N (negative): result is < 0 (S_(N-1), the top bit of the result)
  C (carry): indicates that the most significant position produced a carry, e.g., “1 + (-1)” (carry from last FA)
  V (overflow): indicates answer doesn’t fit precisely
To compare A and B, perform A - B and use condition codes:
  Signed comparison:
    LT   N⊕V
    LE   Z+(N⊕V)
    EQ   Z
    NE   ~Z
    GE   ~(N⊕V)
    GT   ~(Z+(N⊕V))
  Unsigned comparison:
    LTU  C
    LEU  C+Z
    GEU  ~C
    GTU  ~(C+Z)
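As a hedged illustration of the signed condition codes, the sketch below derives each comparison result from the flags of A - B, using the flags = {Z, V, C, N} ordering from these slides and N xor V for "less than". The module name signed_compare and its port names are made up for this example. The unsigned cases are left out on purpose, because their polarity depends on whether the ALU's C flag means "carry out" or "borrow" for a subtraction.

```verilog
// Sketch: signed comparison results from the flags of A - B.
// flags = {Z, V, C, N}: flags[3] = Z, flags[2] = V, flags[1] = C, flags[0] = N.
module signed_compare(input  [3:0] flags,
                      output       lt, le, eq, ne, ge, gt);
  wire z = flags[3];
  wire v = flags[2];
  wire n = flags[0];

  assign lt = n ^ v;          // LT: N xor V
  assign le = z | (n ^ v);    // LE: Z + (N xor V)
  assign eq = z;              // EQ: Z
  assign ne = ~z;             // NE: ~Z
  assign ge = ~(n ^ v);       // GE: ~(N xor V)
  assign gt = ~(z | (n ^ v)); // GT: ~(Z + (N xor V))
endmodule
```

A controller could AND one of these outputs with branch to form PCSrc for a conditional branch of that type.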

9 Datapath
[Figure: datapath diagram; the ALU produces flags(3:0)]

10 MIPS State Elements
We’ll fill out the datapath and control logic for basic single-cycle MIPS:
  first the datapath
  then the control logic

11 Single-Cycle Datapath: lw
Let’s start by implementing lw instruction

12 Single-Cycle Datapath: lw
First consider executing lw. How does lw work?
STEP 1: Fetch instruction

13 Single-Cycle Datapath: lw
STEP 2: Read source operands from register file

14 Single-Cycle Datapath: lw
STEP 3: Sign-extend the immediate

15 Single-Cycle Datapath: lw
STEP 4: Compute the memory address (note the control signal)

16 Single-Cycle Datapath: lw
STEP 5: Read data from memory and write it back to register file

17 Single-Cycle Datapath: lw
STEP 6: Determine the address of the next instruction
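The combinational parts of the lw steps can be sketched in a few lines. This is only an illustration under assumed names: the module lw_path and the port rs_value (the register-file read of instr[25:21]) are hypothetical, and the instruction fetch, register file, and data memory are left out.

```verilog
// Sketch of the combinational pieces used by lw (names are assumptions).
module lw_path(input  [31:0] pc, instr,
               input  [31:0] rs_value,   // base register value read in STEP 2
               output [31:0] signimm,    // STEP 3: sign-extended immediate
               output [31:0] mem_addr,   // STEP 4: base + offset
               output [31:0] pcplus4);   // STEP 6: address of the next instruction
  assign signimm  = {{16{instr[15]}}, instr[15:0]};  // replicate the sign bit 16 times
  assign mem_addr = rs_value + signimm;              // the ALU performs an add for lw
  assign pcplus4  = pc + 32'd4;
endmodule
```

All three outputs settle within the same clock cycle; the "steps" are simply the order in which signals propagate through the logic.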

18 Let’s be Clear: CPU is Single-Cycle!
Although the slides said “STEP” …
… all that stuff is executed in one cycle!!!
Let’s look at sw next …
… and then R-type instructions

19 Single-Cycle Datapath: sw
Write data in rt to memory
Nothing is written back into the register file

20 Single-Cycle Datapath: R-type instr
R-Type instructions:
  Read from rs and rt
  Write ALUResult to register file
  Write to rd (instead of rt)
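Supporting R-type instructions next to lw comes down to two multiplexers: one picks the destination register (rd or rt), the other picks the value written back (ALU result or memory data). A minimal sketch, assuming the control signal names regdst and memtoreg used elsewhere in these slides; the module name writeback_select is hypothetical.

```verilog
// Sketch: the two selections that distinguish R-type write-back from lw.
module writeback_select(input  [31:0] instr, aluout, readdata,
                        input         regdst, memtoreg,
                        output [4:0]  writereg,
                        output [31:0] result);
  assign writereg = regdst   ? instr[15:11] : instr[20:16];  // rd for R-type, rt for lw
  assign result   = memtoreg ? readdata     : aluout;        // memory data for lw, ALU result otherwise
endmodule
```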

21 Single-Cycle Datapath: beq
Determine whether values in rs and rt are equal
Calculate branch target address:
  BTA = (sign-extended immediate << 2) + (PC+4)
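The branch target calculation and the next-PC selection can be sketched as follows. The module name branch_next_pc and the signal names pcbranch and pcnext are assumptions for illustration; pcsrc is the control signal that is asserted when the branch is taken.

```verilog
// Sketch: branch target address and next-PC selection for beq.
module branch_next_pc(input  [31:0] pcplus4, signimm,
                      input         pcsrc,
                      output [31:0] pcbranch, pcnext);
  // BTA = (sign-extended immediate << 2) + (PC + 4)
  assign pcbranch = pcplus4 + {signimm[29:0], 2'b00};
  // Take the branch target only when pcsrc is asserted
  assign pcnext   = pcsrc ? pcbranch : pcplus4;
endmodule
```

The shift by 2 is just wiring (drop the top two bits, append two zeros), so it costs no logic.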

22 Complete Single-Cycle Processor (w/control)

23 Note: Difference due to Flags
Our Control Unit will be slightly different …
… because of the extra flags
All flags (Z, V, C, N) are inputs to the control unit
Signals such as PCSrc are produced inside the control unit

24 Control Unit
Generally as shown below, but with some differences because our ALU is more sophisticated
[Figure: control unit block diagram with flags[3:0] as an input and PCSrc as an output. Annotations: “Note: This will be different for our full-feature ALU!” and “Note: This will be 5 bits for our full-feature ALU!”]

25 Review: Lightweight ALU from book
F2:0   Function
000    A & B
001    A | B
010    A + B
011    not used
100    A & ~B
101    A | ~B
110    A - B
111    SLT
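One possible Verilog rendering of this function table is sketched below. It is an illustration rather than the course's or the book's exact code: the port names are assumptions, SLT is taken as the sign bit of A - B (so it ignores overflow), and the unused code 011 is simply left to fall into the SLT arm.

```verilog
// Sketch of a 32-bit ALU that follows the lightweight function table.
module alu(input      [31:0] a, b,
           input      [2:0]  f,
           output reg [31:0] y,
           output            zero);
  wire [31:0] bb  = f[2] ? ~b : b;   // f[2] selects B or ~B
  wire [31:0] sum = a + bb + f[2];   // A + B, or A + ~B + 1 = A - B

  always @(*)
    case (f[1:0])
      2'b00: y = a & bb;             // 000: A & B,  100: A & ~B
      2'b01: y = a | bb;             // 001: A | B,  101: A | ~B
      2'b10: y = sum;                // 010: A + B,  110: A - B
      2'b11: y = {31'b0, sum[31]};   // 111: SLT (sign bit of A - B)
    endcase

  assign zero = (y == 32'b0);        // the Z flag used by beq
endmodule
```

Sharing one adder between add, subtract, and SLT is the main design point: f[2] both inverts B and supplies the carry-in.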

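27 Review: Our “full feature” ALU
Full-feature ALU from COMP411: inputs A and B, 5-bit ALUFN
[Table: ALUFN encoding per operation, with columns Sub, Bool, Shft, Math, OP. Operations listed: A+B, A-B, B<<A, B>>A, B>>>A, A & B, A | B, A ^ B, A | B]
[Figure: ALU block diagram with Add/Sub, Bidirectional Barrel Shifter, and Boolean units selected by Sub, Bool, Shft, Math; outputs are the result R and the flags V, C, N, Z]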

28 Review: R-Type instructions
Register-type
3 register operands:
  rs, rt: source registers
  rd: destination register
Other fields:
  op: the operation code or opcode (0 for R-type instructions)
  funct: the function; together, op and funct tell the computer which operation to perform
  shamt: the shift amount for shift instructions, otherwise it is 0

29 Controller (2 modules)
module controller(input  [5:0] op, funct,
                  input  [3:0] flags,
                  output       memtoreg, memwrite,
                  output       pcsrc, alusrc,
                  output       regdst, regwrite,
                  output       jump,
                  output [2:0] alucontrol);  // 5 bits for our ALU!!

  wire [1:0] aluop;  // This will be different for our ALU
  wire       branch;

  maindec md(op, memtoreg, memwrite, branch,
             alusrc, regdst, regwrite, jump, aluop);
  aludec  ad(funct, aluop, alucontrol);

  assign pcsrc = branch & flags[3];  // flags = {Z, V, C, N}
endmodule

30 Main Decoder
This entire coding may be different in our design
module maindec(input  [5:0] op,
               output       memtoreg, memwrite,
               output       branch, alusrc,
               output       regdst, regwrite,
               output       jump,
               output [1:0] aluop);  // different for our ALU

  reg [8:0] controls;

  assign {regwrite, regdst, alusrc, branch, memwrite,
          memtoreg, jump, aluop} = controls;

  always @(*)
    case(op)
      6'b000000: controls <= 9'b110000010; //Rtype
      6'b100011: controls <= 9'b101001000; //LW
      6'b101011: controls <= 9'b001010000; //SW
      6'b000100: controls <= 9'b000100001; //BEQ
      6'b001000: controls <= 9'b101000000; //ADDI
      6'b000010: controls <= 9'b000000100; //J
      default:   controls <= 9'bxxxxxxxxx; //???
    endcase
endmodule
Why do this?

31 ALU Decoder
This entire coding will be different in our design
module aludec(input      [5:0] funct,
              input      [1:0] aluop,
              output reg [2:0] alucontrol);  // 5 bits for our ALU!!

  always @(*)
    case(aluop)
      2'b00: alucontrol <= 3'b010;  // add
      2'b01: alucontrol <= 3'b110;  // sub
      default: case(funct)          // RTYPE
          6'b100000: alucontrol <= 3'b010; // ADD
          6'b100010: alucontrol <= 3'b110; // SUB
          6'b100100: alucontrol <= 3'b000; // AND
          6'b100101: alucontrol <= 3'b001; // OR
          6'b101010: alucontrol <= 3'b111; // SLT
          default:   alucontrol <= 3'bxxx; // ???
        endcase
    endcase
endmodule

32 Control Unit: ALU Decoder
This entire coding will be different in our design

ALUOp1:0   Meaning
00         Add
01         Subtract
10         Look at Funct
11         Not Used

ALUOp1:0   Funct          ALUControl2:0
00         X              010 (Add)
X1         X              110 (Subtract)
1X         100000 (add)   010 (Add)
1X         100010 (sub)   110 (Subtract)
1X         100100 (and)   000 (And)
1X         100101 (or)    001 (Or)
1X         101010 (slt)   111 (SLT)

33 Control Unit: Main Decoder
Instruction  Op5:0   RegWrite  RegDst  AluSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01

34 Note on controller
The actual number and names of control signals may be somewhat different in our/your design compared to the one given in the book, because we are implementing more features/instructions.
SO BE VERY CAREFUL WHEN YOU DESIGN YOUR CPU!

35 Single-Cycle Datapath Example: or

36 Extended Functionality: addi
No change to datapath

37 Control Unit: addi
Instruction  Op5:0   RegWrite  RegDst  AluSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01
addi         001000  1         0       1       0       0         0         00

38 Adding Jumps: j

39 Control Unit: Main Decoder
Instruction  Op5:0   RegWrite  RegDst  AluSrc  Branch  MemWrite  MemtoReg  ALUOp1:0  Jump
R-type       000000  1         1       0       0       0         0         10        0
lw           100011  1         0       1       0       0         1         00        0
sw           101011  0         X       1       0       1         X         00        0
beq          000100  0         X       0       1       0         X         01        0
j            000010  0         X       X       X       0         X         XX        1

40 Review: Processor Performance
Program Execution Time
  = (# instructions)(cycles/instruction)(seconds/cycle)
  = # instructions x CPI x Tc

41 Single-Cycle Performance
Tc is limited by the critical path (lw)

42 Single-Cycle Performance
Single-cycle critical path:
  Tc = t_pcq_PC + t_mem + max(t_RFread, t_sext + t_mux) + t_ALU + t_mem + t_mux + t_RFsetup
In most implementations, limiting paths are: memory, ALU, register file.
  Tc = t_pcq_PC + 2 t_mem + t_RFread + t_ALU + t_RFsetup + t_mux

43 Single-Cycle Performance Example
Tc = t_pcq_PC + 2 t_mem + t_RFread + t_mux + t_ALU + t_RFsetup
   = [30 + 2(250) + 150 + 25 + 200 + 20] ps
   = 925 ps
What’s the max clock frequency?
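As a worked step for the question above: the maximum clock frequency is the reciprocal of the cycle time, assuming the 925 ps critical path is the only constraint on the clock.

```latex
f_{\max} = \frac{1}{T_c}
         = \frac{1}{925 \times 10^{-12}\ \text{s}}
         \approx 1.08 \times 10^{9}\ \text{Hz}
         = 1.08\ \text{GHz}
```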

44 Single-Cycle Performance Example
For a program with 100 billion instructions executing on a single-cycle MIPS processor,
Execution Time = # instructions x CPI x Tc
               = (100 × 10^9)(1)(925 × 10^-12 s)
               = 92.5 seconds

45 Next Time
Next class: We’ll look at multi-cycle MIPS; adding functionality to our design
Next lab: Implement single-cycle CPU!

