Computer Architecture: A Constructive Approach. One-Cycle Implementation. Joel Emer, Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology.

1 Computer Architecture: A Constructive Approach. One-Cycle Implementation. Joel Emer, Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology. February 29, 2012. http://csg.csail.mit.edu/6.s078

2 The MIPS ISA
Processor state:
- 32 32-bit GPRs; R0 always contains 0
- PC, the program counter
- some other special registers
Data types:
- 8-bit byte, 16-bit half word
- 32-bit word for integers
Load/store style instruction set:
- data addressing modes: immediate and indexed
- branch addressing modes: PC-relative and register-indirect
- byte-addressable memory, big-endian mode
All instructions are 32 bits.
Floating point, memory management and other system instructions are not included in the SMIPS subset.
[Figure: register state, r0 - r31 and PC]

3 Instruction formats
Only three formats, but the fields are used differently by different types of instructions (field widths in bits):
    I-type: opcode (6) | rs (5) | rt (5) | immediate (16)
    R-type: opcode (6) | rs (5) | rt (5) | rd (5) | shamt (5) | func (6)
    J-type: opcode (6) | target (26)
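The three formats above can be sketched as plain bit slicing. This is a minimal illustration in Python (chosen only so it can be run directly; the lecture's own code is Bluespec). The function name `fields` and the dictionary layout are assumptions made for this example, and the bit positions follow from the field widths listed on the slide.

```python
def fields(inst: int) -> dict:
    """Split a 32-bit instruction word into the fields of all three formats."""
    inst &= 0xFFFFFFFF
    return {
        "opcode": (inst >> 26) & 0x3F,   # bits 31:26 (all formats)
        "rs":     (inst >> 21) & 0x1F,   # bits 25:21 (I- and R-type)
        "rt":     (inst >> 16) & 0x1F,   # bits 20:16 (I- and R-type)
        "rd":     (inst >> 11) & 0x1F,   # bits 15:11 (R-type)
        "shamt":  (inst >> 6) & 0x1F,    # bits 10:6  (R-type)
        "func":   inst & 0x3F,           # bits 5:0   (R-type)
        "imm":    inst & 0xFFFF,         # bits 15:0  (I-type)
        "target": inst & 0x3FFFFFF,      # bits 25:0  (J-type)
    }

# An R-type word with opcode 0, rs=1, rt=2, rd=3, shamt=0, func=0b100001
word = (1 << 21) | (2 << 16) | (3 << 11) | 0b100001
f = fields(word)
print(f["rs"], f["rt"], f["rd"], bin(f["func"]))
```

Every field is extracted for every word; which of them are meaningful depends on the format, which is exactly why the decoder later has to look at the opcode first.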

4 Instruction formats cont.
Computational instructions:
    R-type: 0 (6) | rs (5) | rt (5) | rd (5) | 0 (5) | func (6)        rd <- (rs) func (rt)
    I-type: opcode (6) | rs (5) | rt (5) | immediate (16)              rt <- (rs) op immediate
Load/store instructions:
    opcode (6) | rs (5) | rt (5) | displacement (16), at bits 31:26, 25:21, 20:16 and 15:0
    addressing mode: (rs) + displacement
    rs is the base register; rt is the destination of a load or the source for a store.

5 Control Instructions
Conditional (on GPR) PC-relative branch: BEQZ, BNEZ
    format: opcode (6) | rs (5) | - (5) | offset (16)
    target address = (offset in words) × 4 + (PC+4)
    range: 128 KB
Unconditional register-indirect jumps: JR, JALR
    format: opcode (6) | rs (5) | - (5) | - (16)
Unconditional absolute jumps: J, JAL
    format: opcode (6) | target (26)
    target address = {PC[31:28], target × 4}
    range: 256 MB
Jump-and-link stores PC+4 into the link register (R31).
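The two target-address formulas and their ranges can be checked with a little arithmetic. The sketch below is Python rather than Bluespec so it can be run as is; the function names are hypothetical, and it assumes the 16-bit branch offset is signed, so the 128 KB range is read as reaching roughly 128 KB in each direction.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(x: int) -> int:
    """Interpret the low 16 bits of x as a two's-complement number."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def branch_target(pc: int, offset: int) -> int:
    """(offset in words) * 4 + (PC + 4), wrapped to 32 bits."""
    return (sign_extend16(offset) * 4 + pc + 4) & MASK32

def jump_target(pc: int, target: int) -> int:
    """{PC[31:28], target * 4}: top 4 bits of the PC, then a 28-bit byte offset."""
    return (pc & 0xF0000000) | ((target & 0x3FFFFFF) << 2)

# Reach of each form, in bytes
branch_span = (1 << 16) * 4   # 16-bit word offset: 256 KB in total, i.e. about 128 KB each way
jump_span = (1 << 26) * 4     # 26-bit word index: 256 MB region selected by PC[31:28]

print(hex(branch_target(0x1000, 2)), hex(jump_target(0x40001234, 0x10)))
```

A jump can only reach addresses that share the top four bits of the current PC, which is why the 256 MB figure describes a region rather than a distance.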

6 Instruction Execution
Execution of an instruction involves:
1. Instruction fetch
2. Decode
3. Register fetch
4. ALU operation
5. Memory operation (optional)
6. Write back
and the computation of the next instruction address.

7 Implementing an ISA
- Instruction fetch requires an instruction memory and a PC.
- Decode requires understanding the instruction format.
- Register fetch requires interaction with a register file with a specific number of read/write ports.
- The ALU must be able to carry out the specified ops.
- Memory operations require a data memory.
- Write-back requires interaction with the register file.
- Updating the PC requires arithmetic ops to calculate the next pc and the branch condition.

8 A single-cycle implementation
A single-cycle MIPS implementation requires:
- a register file with 2 read ports and a write port
- an instruction memory, separate from data memory, so that we can fetch an instruction as well as perform a data operation (load/store) on the memory
[Figure: CPU containing pc, rf and the fetch & execute logic, connected to iMem and dMem]

9 Single-Cycle SMIPS
[Figure: datapath with PC, +4, Inst Memory, Decode, Register File, Execute, Data Memory]
The datapath is shown only for convenience; it will be derived automatically from the high-level textual description.

10 Single-Cycle SMIPS code structure
    module mkProc(Proc);
      // State
      Reg#(Addr) pc <- mkRegU;
      RFile rf <- mkRFile;
      Memory imem <- mkMemory;
      Memory dmem <- mkMemory;
      …

11 Single-Cycle SMIPS code structure
    …
    rule fetchExecute;
      // fetch
      let instResp = imem(MemReq{…});
      // decode
      let decInst = decode(instResp);
      // register read
      let rVal1 = rf[…];
      let rVal2 = …
      // execute
      let execInst = exec(decInst, pc, rVal1, rVal2);
      // memory (optional)
      if (instType Ld || St) data = dmem(…)
      // writeback
      if (instType Alu || Ld) rf.wr(rDst, data);
      pc <= execInst.brTaken ? execInst.addr : pc + 4;
    endrule
    endmodule;

12 Decoding
- dInst is all the information we need to do the later phases of execution on the instruction.
- Not all fields of dInst are defined in each case.
- We may decide to pass more information from decode to execute for efficiency; in that case the definition of the type of the decoded instruction has to be adjusted accordingly.

13 Instruction grouping
- Many instructions have the same implementation except for the ALU function they invoke.
- We can group such instructions and reduce the amount of code we have to write.
- Example: R-Type ALU, I-Type ALU, Br Type, Memory Type, …

14 Decoding Instructions: extract fields needed for execution from each instruction
[Figure: decode block that takes the instruction and produces instType, aluFunc, branchComp, rDst, rSrc1, rSrc2, imm and immValid, drawing on bit ranges 31:26, 5:0, 20:16, 15:11, 25:21, 15:0 and 25:0, with an ext unit on the immediate]
Lots of pure combinational logic: it will be derived automatically from the high-level description.

15 Decoding Instructions: input-output types
[Figure: the decode block from the previous slide, annotated with types]
Input: instruction : Bit#(32)
Output: DecodedInst, with fields
    iType : IType
    aluFunc : AluFunc
    brComp : BrType
    rDst, rSrc1, rSrc2 : Rindex
    imm : Bit#(32)
    immValid : Bool
Mux control logic is not shown.

16 Typedefs
    typedef enum {Alu, Ld, St, J, Jr, Jal, Jalr, Br} IType deriving(Bits, Eq);
    typedef enum {Eq, Neq, Le, Lt, Ge, Gt} BrType deriving(Bits, Eq);
    typedef enum {Add, Sub, And, Or, Xor, Nor, Slt, Sltu, LShift, RShift, Sra} AluFunc deriving(Bits, Eq);
    typedef enum {RAlu, IAlu, LdSt, J, Jr, Br} InstDType deriving(Bits, Eq);

17 Decode Function
    function DecodedInst decode(Bit#(32) inst);
      DecodedInst dInst = ?;        // initially undefined
      let opcode = inst[31:26];
      let rs     = inst[25:21];
      let rt     = inst[20:16];
      let rd     = inst[15:11];
      let funct  = inst[5:0];
      let imm    = inst[15:0];
      let target = inst[25:0];
      case (getInstDType(inst))
        ...
      endcase
      return dInst;
    endfunction

18 Instruction Encoding
    Bit#(6) opADDIU = 6'b001001;
    Bit#(6) opSLTI  = 6'b001010;
    Bit#(6) opLW    = 6'b100011;
    Bit#(6) opSW    = 6'b101011;
    Bit#(6) opJ     = 6'b000010;
    Bit#(6) opBEQ   = 6'b000100;
    …
    Bit#(6) opFUNC  = 6'b000000;
    Bit#(6) fcADDU  = 6'b100001;
    Bit#(6) fcAND   = 6'b100100;
    Bit#(6) fcJR    = 6'b001000;
    …
    Bit#(6) opRT    = 6'b000001;
    Bit#(5) rtBLTZ  = 5'b00000;     // 5-bit rt field, so the type is Bit#(5)
    Bit#(5) rtBGEZ  = 5'b00100;

19 Instruction Grouping
    function InstDType getInstDType(Bit#(32) inst);
      return case (inst[31:26])
        opADDIU, opSLTI, …: IAlu;
        opLW, opSW: LdSt;
        opJ, opJAL: J;
        opBEQ, opRT, …: Br;
        opFUNC: case (inst[5:0])
                  fcADDU, fcAND, …: RAlu;
                  fcJR, fcJALR: Jr;
                endcase;
      endcase;
    endfunction
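The grouping idea can be tried out with a small table-driven model. This is a sketch in Python, not the lecture's Bluespec; it includes only the encodings spelled out on the "Instruction Encoding" slide (the entries elided there with "…" are left out), and the names `BY_OPCODE`, `BY_FUNC` and `get_inst_dtype` are made up for the example.

```python
# Opcode and function encodings copied from the "Instruction Encoding" slide.
OP_ADDIU, OP_SLTI = 0b001001, 0b001010
OP_LW, OP_SW = 0b100011, 0b101011
OP_J, OP_BEQ = 0b000010, 0b000100
OP_FUNC, OP_RT = 0b000000, 0b000001
FC_ADDU, FC_AND, FC_JR = 0b100001, 0b100100, 0b001000

# Group by opcode first; only opFUNC needs a second lookup on the func field.
BY_OPCODE = {
    OP_ADDIU: "IAlu", OP_SLTI: "IAlu",
    OP_LW: "LdSt", OP_SW: "LdSt",
    OP_J: "J",
    OP_BEQ: "Br", OP_RT: "Br",
}
BY_FUNC = {FC_ADDU: "RAlu", FC_AND: "RAlu", FC_JR: "Jr"}

def get_inst_dtype(inst: int):
    """Return the group name for an instruction word, or None if it is not in the tables."""
    opcode = (inst >> 26) & 0x3F
    if opcode == OP_FUNC:
        return BY_FUNC.get(inst & 0x3F)
    return BY_OPCODE.get(opcode)

print(get_inst_dtype(0x8C220004), get_inst_dtype(0x00221821))
```

Returning None for an unlisted encoding is a choice made for this sketch; the Bluespec case expression on the slide has no default arm.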

20 Decoding Instructions: R-Type ALU
    case (getInstDType(inst))
      …
      RAlu: begin
        dInst.iType = Alu;
        dInst.aluFunc = case (funct)
          fcADDU: Add;
          fcSUBU: Sub;
          ...
        endcase;
        dInst.rDst = rd;
        dInst.rSrc1 = rs;
        dInst.rSrc2 = rt;
        dInst.immValid = False;
      end

21 Decoding Instructions: I-Type ALU
    case (getInstDType(inst))
      …
      IAlu: begin
        dInst.iType = Alu;
        dInst.aluFunc = case (opcode)
          opADDIU: Add;
          ...
        endcase;
        dInst.rDst = rt;
        dInst.rSrc1 = rs;
        dInst.imm = signedIAlu(opcode) ? signExtend(imm) : zeroExtend(imm);
        dInst.immValid = True;
      end
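The choice between signExtend and zeroExtend matters whenever bit 15 of the immediate is set. A minimal Python sketch of the two operations on a 16-bit immediate widened to 32 bits follows; the function names mirror the Bluespec ones, but the signatures here are assumptions for illustration.

```python
def zero_extend(value: int, from_bits: int = 16) -> int:
    """Keep the low from_bits bits; the upper bits become 0."""
    return value & ((1 << from_bits) - 1)

def sign_extend(value: int, from_bits: int = 16, to_bits: int = 32) -> int:
    """Copy the sign bit of a from_bits-wide value into the upper bits."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value & (1 << (from_bits - 1)):
        # Set every bit from from_bits up to to_bits - 1.
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value

print(hex(sign_extend(0x8000)), hex(zero_extend(0x8000)))
```

For immediates with bit 15 clear the two functions agree, so the distinction only shows up for negative constants and large unsigned masks.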

22 Decoding Instructions: Load & Store
    case (getInstDType(inst))
      …
      LdSt: begin
        dInst.iType = opcode==opLW ? Ld : St;
        dInst.aluFunc = Add;
        dInst.rDst = rt;
        dInst.rSrc1 = rs;
        dInst.rSrc2 = rt;
        dInst.imm = signExtend(imm);
        dInst.immValid = True;
      end

23 Decoding Instructions: Jump
    case (getInstDType(inst))
      …
      J: begin
        dInst.iType = opcode==opJ ? J : Jal;
        dInst.rDst = 31;
        dInst.imm = zeroExtend({target, 2'b00});
        dInst.immValid = False;
      end
      Jr: begin
        dInst.iType = funct==fcJR ? Jr : Jalr;
        dInst.rDst = rd;
        dInst.rSrc1 = rs;
        dInst.immValid = False;
      end

24 Decoding Instructions: Branch
    case (getInstDType(inst))
      …
      Br: begin
        dInst.iType = Br;
        dInst.brComp = case (opcode)
          opBEQ : Eq;
          opBLEZ: Le;
          ...
        endcase;
        dInst.rSrc1 = rs;
        dInst.rSrc2 = rt;
        dInst.imm = signExtend({imm, 2'b00});
        dInst.immValid = False;
      end

25 Reading Registers
[Figure: register file RF with read inputs RSrc1, RSrc2 and outputs RVal1, RVal2]
Reading the registers is pure combinational logic.

26 Executing Instructions
[Figure: execute block containing the ALU, ALUBr and Branch Address units; inputs dInst, rVal1, rVal2, pc; outputs iType, rDst, data, addr, brTaken]
- Pure combinational logic.
- addr: either for memory reference or branch target
- data: either for rf write or St

27 Some Useful Functions
    function Bool memType (IType i);
      return (i==Ld || i==St);
    endfunction

    function Bool regWriteType (IType i);
      return (i==Alu || i==Ld || i==Jal || i==Jalr);
    endfunction

    function Bool controlType (IType i);
      return (i==J || i==Jr || i==Jal || i==Jalr || i==Br);
    endfunction

28 Execute Function
    function ExecInst exec(DecodedInst dInst, Data rVal1, Data rVal2, Addr pc);
      ExecInst einst = ?;
      Data aluVal2 = (dInst.immValid) ? dInst.imm : rVal2;
      let aluRes = alu(rVal1, aluVal2, dInst.aluFunc);
      let brAddr = brAddrCalc(pc, rVal1, dInst.iType, dInst.imm);
      einst.iType = dInst.iType;
      einst.addr = memType(dInst.iType) ? aluRes : brAddr;
      einst.data = dInst.iType==St ? rVal2 : aluRes;
      einst.brTaken = aluBr(rVal1, aluVal2, dInst.brComp);
      einst.rDst = dInst.rDst;
      return einst;
    endfunction

29 ALU
    function Data alu(Data a, Data b, AluFunc func);
      Data res = case(func)
        Add   : add(a, b);
        Sub   : subtract(a, b);
        And   : (a & b);
        Or    : (a | b);
        Xor   : (a ^ b);
        Nor   : ~(a | b);
        Slt   : setLessThan(a, b);
        Sltu  : setLessThanUnsigned(a, b);
        LShift: logicalShiftLeft(a, b[4:0]);
        RShift: logicalShiftRight(a, b[4:0]);
        Sra   : signedShiftRight(a, b[4:0]);
      endcase;
      return res;
    endfunction
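To experiment with the ALU's behaviour outside a Bluespec toolchain, the function can be modelled in a few lines of Python. This is a behavioural sketch only: it assumes 32-bit data, takes the operation as a string named after the AluFunc tags, and assumes the helper functions on the slide (add, setLessThan, signedShiftRight and so on) have their usual meanings.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x: int) -> int:
    """Read a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

# One entry per AluFunc tag; shifts use only the low 5 bits of b, like b[4:0].
ALU_OPS = {
    "Add":    lambda a, b: a + b,
    "Sub":    lambda a, b: a - b,
    "And":    lambda a, b: a & b,
    "Or":     lambda a, b: a | b,
    "Xor":    lambda a, b: a ^ b,
    "Nor":    lambda a, b: ~(a | b),
    "Slt":    lambda a, b: int(to_signed(a) < to_signed(b)),
    "Sltu":   lambda a, b: int(a < b),
    "LShift": lambda a, b: a << (b & 0x1F),
    "RShift": lambda a, b: a >> (b & 0x1F),
    "Sra":    lambda a, b: to_signed(a) >> (b & 0x1F),
}

def alu(a: int, b: int, func: str) -> int:
    """Apply one ALU operation to 32-bit operands and wrap the result to 32 bits."""
    return ALU_OPS[func](a & MASK32, b & MASK32) & MASK32

print(hex(alu(0x80000000, 4, "Sra")), hex(alu(0x80000000, 4, "RShift")))
```

The Slt/Sltu and Sra/RShift pairs differ only in whether the operands are read as signed, which is the part most easily got wrong when the data type itself carries no sign.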

30 Branch Resolution
    function Bool aluBr(Data a, Data b, BrType brComp);
      Bool brTaken = case(brComp)
        Eq  : (a == b);
        Neq : (a != b);
        Le  : signedLE(a, 0);
        Lt  : signedLT(a, 0);
        Ge  : signedGE(a, 0);
        Gt  : signedGT(a, 0);
      endcase;
      return brTaken;
    endfunction
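Note that only Eq and Neq compare the two operands; the other four conditions compare the first operand against zero as a signed value. A small Python model makes this easy to check. It is a sketch that assumes 32-bit data and uses strings for the BrType tags.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x: int) -> int:
    """Read a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def alu_br(a: int, b: int, br_comp: str) -> bool:
    """Decide whether a branch is taken, mirroring the case expression on the slide."""
    sa = to_signed(a)
    if br_comp == "Eq":
        return (a & MASK32) == (b & MASK32)
    if br_comp == "Neq":
        return (a & MASK32) != (b & MASK32)
    if br_comp == "Le":
        return sa <= 0
    if br_comp == "Lt":
        return sa < 0
    if br_comp == "Ge":
        return sa >= 0
    if br_comp == "Gt":
        return sa > 0
    raise ValueError("unknown branch comparison: " + br_comp)

print(alu_br(0xFFFFFFFF, 0, "Lt"), alu_br(0, 0, "Gt"))
```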

31 Branch Address Calculation
    function Addr brAddrCalc(Addr pc, Data val, IType iType, Data imm);
      let targetAddr = case (iType)
        J, Jal   : {pc[31:28], imm[27:0]};
        Jr, Jalr : val;
        default  : pc + imm;
      endcase;
      return targetAddr;
    endfunction
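The three cases of brAddrCalc can be mirrored directly in Python for quick experiments. This sketch follows the slide as written, including `pc + imm` in the default case (the earlier Control Instructions slide defines the branch target relative to PC+4). It assumes 32-bit addresses and an immediate that the decoder has already shifted and extended.

```python
MASK32 = 0xFFFFFFFF

def br_addr_calc(pc: int, val: int, i_type: str, imm: int) -> int:
    """Compute the next-PC candidate for a control instruction."""
    if i_type in ("J", "Jal"):
        # {pc[31:28], imm[27:0]}: keep the top 4 bits of the PC
        return (pc & 0xF0000000) | (imm & 0x0FFFFFFF)
    if i_type in ("Jr", "Jalr"):
        # register-indirect: the target is the value read from the register
        return val & MASK32
    # default: PC-relative, wrapped to 32 bits
    return (pc + imm) & MASK32

print(hex(br_addr_calc(0x40001000, 0, "J", 0x40)))
```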

32 Single-Cycle SMIPS
    module mkProc(Proc);
      Reg#(Addr) pc <- mkRegU;
      RFile rf <- mkRFile;
      Memory mem <- mkTwoPortedMemory;
      let iMem = mem.iport;
      let dMem = mem.dport;

      rule doProc;
        let inst = iMem(MemReq{op: Ld, addr: pc, data: ?});
        let dInst = decode(inst);
        Data rVal1 = rf.rd1(dInst.rSrc1);
        Data rVal2 = rf.rd2(dInst.rSrc2);

33 Single-Cycle SMIPS cont.
        let eInst = exec(dInst, rVal1, rVal2, pc);
        if (memType(eInst.iType))
          eInst.data <- dMem(MemReq{op: eInst.iType==Ld ? Ld : St,
                                    addr: eInst.addr, data: eInst.data});
        // state updates
        if (regWriteType(eInst.iType))
          rf.wr(eInst.rDst, eInst.data);
        pc <= eInst.brTaken ? eInst.addr : pc + 4;
      endrule
    endmodule
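The whole rule can be mimicked by a tiny software model in which one call to `step` plays the role of one clock cycle. The Python sketch below is not the lecture's processor: it handles only six instructions whose encodings appear on the "Instruction Encoding" slide, folds decode and execute into one if-chain, and makes several labelled assumptions (word-granular dictionaries for memory, writes to R0 discarded, a taken branch going to pc + imm as in brAddrCalc). The names `Proc`, `i_type` and `r_type` are hypothetical.

```python
MASK32 = 0xFFFFFFFF
OP_FUNC, OP_J, OP_BEQ = 0b000000, 0b000010, 0b000100
OP_ADDIU, OP_LW, OP_SW = 0b001001, 0b100011, 0b101011
FC_ADDU = 0b100001

def sext16(x):
    """Sign-extend the low 16 bits of x to a 32-bit pattern."""
    x &= 0xFFFF
    return x | 0xFFFF0000 if x & 0x8000 else x

def i_type(op, rs, rt, imm):
    """Assemble an I-type word (helper for building test programs)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def r_type(rs, rt, rd, funct):
    """Assemble an R-type word with opcode 0 and shamt 0."""
    return (rs << 21) | (rt << 16) | (rd << 11) | funct

class Proc:
    def __init__(self, program):
        self.pc = 0
        self.rf = [0] * 32
        self.imem = {4 * i: w for i, w in enumerate(program)}  # byte address -> word
        self.dmem = {}                                         # assumption: sparse, reads default to 0

    def write_reg(self, r, value):
        if r != 0:                    # assumption: R0 stays 0, writes to it are dropped
            self.rf[r] = value & MASK32

    def step(self):
        inst = self.imem[self.pc]                                    # fetch
        op, rs, rt = (inst >> 26) & 63, (inst >> 21) & 31, (inst >> 16) & 31
        rd, funct, imm = (inst >> 11) & 31, inst & 63, sext16(inst)  # decode
        v1, v2 = self.rf[rs], self.rf[rt]                            # register read
        next_pc = (self.pc + 4) & MASK32
        if op == OP_FUNC and funct == FC_ADDU:                       # execute, memory, write-back
            self.write_reg(rd, v1 + v2)
        elif op == OP_ADDIU:
            self.write_reg(rt, v1 + imm)
        elif op == OP_LW:
            self.write_reg(rt, self.dmem.get((v1 + imm) & MASK32, 0))
        elif op == OP_SW:
            self.dmem[(v1 + imm) & MASK32] = v2
        elif op == OP_BEQ:
            if v1 == v2:
                next_pc = (self.pc + (imm << 2)) & MASK32
        elif op == OP_J:
            next_pc = (self.pc & 0xF0000000) | ((inst & 0x3FFFFFF) << 2)
        else:
            raise ValueError("instruction not modelled: " + hex(inst))
        self.pc = next_pc                                            # PC update

# r1 = 5; r2 = 7; r3 = r1 + r2; mem[16] = r3; r4 = mem[16]
p = Proc([i_type(OP_ADDIU, 0, 1, 5), i_type(OP_ADDIU, 0, 2, 7),
          r_type(1, 2, 3, FC_ADDU), i_type(OP_SW, 0, 3, 16), i_type(OP_LW, 0, 4, 16)])
for _ in range(5):
    p.step()
print(p.rf[3], p.dmem[16], p.rf[4], p.pc)
```

As in the Bluespec rule, all state changes (register file, data memory, pc) happen once per step, after the combinational work of that step is done.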

34 Single-Cycle SMIPS
[Figure: datapath with PC, +4, Inst Memory, Decode, Register File, Execute, Data Memory]
The whole system was described using one rule; lots of big combinational functions. Performance?

35 Harvard-Style Datapath for MIPS (old way)
[Figure: Harvard-style datapath with PC, Inst. Memory (addr, inst), GPRs (rs1, rs2, ws, wd, rd1, rd2, we), Imm Ext, ALU, ALU Control, Data Memory (addr, wdata, rdata, we) and a PC+4 adder (0x4), driven by the control signals RegWrite, MemWrite, WBSrc, RegDst, BSrc, ExtSel, OpSel, OpCode, zero? and PCSrc (pc+4 / br / rind / jabs)]

36 Hardwired Control Table (old way)
    Opcode      ExtSel   BSrc  OpSel  MemW  RegW  WBSrc  RegDst  PCSrc
    ALU         *        Reg   Func   no    yes   ALU    rd      pc+4
    ALUi        sExt16   Imm   Op     no    yes   ALU    rt      pc+4
    ALUiu       uExt16   Imm   Op     no    yes   ALU    rt      pc+4
    LW          sExt16   Imm   +      no    yes   Mem    rt      pc+4
    SW          sExt16   Imm   +      yes   no    *      *       pc+4
    BEQZ z=0    sExt16   *     0?     no    no    *      *       br
    BEQZ z=1    sExt16   *     0?     no    no    *      *       pc+4
    J           *        *     *      no    no    *      *       jabs
    JAL         *        *     *      no    yes   PC     R31     jabs
    JR          *        *     *      no    no    *      *       rind
    JALR        *        *     *      no    yes   PC     R31     rind
BSrc = Reg / Imm
WBSrc = ALU / Mem / PC
RegDst = rt / rd / R31
PCSrc = pc+4 / br / rind / jabs

