Computer Architecture: A Constructive Approach One-Cycle Implementation Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts Institute.

Slides:



Advertisements
Similar presentations
Multi-cycle Implementations Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology January 13, 2012L6-1
Advertisements

CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Pipelining III Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture MIPS Review & Pipelining I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture ISAs and MIPS Steve Ko Computer Sciences and Engineering University at Buffalo.
The Processor: Datapath & Control
January 26, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 3 - From CISC to RISC Krste Asanovic Electrical Engineering and.
Savio Chau Single Cycle Controller Design Last Time: Discussed the Designing of a Single Cycle Datapath Control Datapath Memory Processor (CPU) Input Output.
January 26, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 3 - From CISC to RISC Krste Asanovic Electrical Engineering and.
January 31, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 4 - Pipelining Krste Asanovic Electrical Engineering and Computer.
Computer Structure - Datapath and Control Goal: Design a Datapath  We will design the datapath of a processor that includes a subset of the MIPS instruction.
The Processor 2 Andreas Klappenecker CPSC321 Computer Architecture.
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved. Digital Architectures1 Machine instructions execution steps (1) FETCH = Read the instruction.
Computer Architecture: A Constructive Approach Instruction Representation Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute.
Computer Architecture: A Constructive Approach Branch Prediction - 1 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of.
Constructive Computer Architecture Interrupts/Exceptions/Faults Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Interrupts / Exceptions / Faults Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology April 30, 2012L21-1
Non-Pipelined Processors Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology March 6, 2013
Asanovic/Devadas Spring Simple Instruction Pipelining Krste Asanovic Laboratory for Computer Science Massachusetts Institute of Technology.
COSC 3430 L08 Basic MIPS Architecture.1 COSC 3430 Computer Architecture Lecture 08 Processors Single cycle Datapath PH 3: Sections
Constructive Computer Architecture Combinational circuits-2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
6.S078 - Computer Architecture: A Constructive Approach Introduction to SMIPS Li-Shiuan Peh Computer Science & Artificial Intelligence Lab. Massachusetts.
Computer Organization & Programming Chapter 6 Single Datapath CPU Architecture.
CS 152 Computer Architecture and Engineering Lecture 3 - From CISC to RISC Krste Asanovic Electrical Engineering and Computer Sciences University of California.
Electrical and Computer Engineering University of Cyprus LAB 2: MIPS.
CS2100 Computer Organisation The Processor: Datapath (AY2015/6) Semester 1.
Computer Architecture and Design – ECEN 350 Part 6 [Some slides adapted from A. Sprintson, M. Irwin, D. Paterson and others]
Datapath and Control Unit Design
1 A single-cycle MIPS processor  An instruction set architecture is an interface that defines the hardware operations which are available to software.
Constructive Computer Architecture Virtual Memory and Interrupts Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Computer Organization Rabie A. Ramadan Lecture 3.
Computer Architecture: A Constructive Approach Branch Prediction - 2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of.
EE 3755 Datapath Presented by Dr. Alexander Skavantzos.
COM181 Computer Hardware Lecture 6: The MIPs CPU.
Single Cycle Controller Design
MIPS Processor.
Lecture 5. MIPS Processor Design
Computer Architecture: A Constructive Approach Implementing SMIPS Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Constructive Computer Architecture Interrupts/Exceptions/Faults Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Computer Architecture Lecture 6.  Our implementation of the MIPS is simplified memory-reference instructions: lw, sw arithmetic-logical instructions:
Access the Instruction from Memory
CS161 – Design and Architecture of Computer Systems
Electrical and Computer Engineering University of Cyprus
CS 230: Computer Organization and Assembly Language
Non-Pipelined Processors
ELEN 468 Advanced Logic Design
Combinational ALU Constructive Computer Architecture Arvind
Designing MIPS Processor (Single-Cycle) Presentation G
Non-Pipelined Processors
CSCI206 - Computer Organization & Programming
Non-Pipelined Processors - 2
Non-Pipelined Processors
Non-Pipelined Processors
Single-Cycle CPU DataPath.
Non-Pipelined and Pipelined Processors
Multi-cycle SMIPS Implementations
Topic 5: Processor Architecture Implementation Methodology
Rocky K. C. Chang 6 November 2017
Composing the Elements
The Processor Lecture 3.2: Building a Datapath with Control
Topic 5: Processor Architecture
Lecture 9. MIPS Processor Design – Decoding and Execution
COMS 361 Computer Organization
COSC 2021: Computer Organization Instructor: Dr. Amir Asif
CS/COE0447 Computer Organization & Assembly Language
Non-Pipelined Processors
Branch Predictor Interface
The Processor: Datapath & Control.
COMS 361 Computer Organization
Processor: Datapath and Control
Presentation transcript:

Computer Architecture: A Constructive Approach One-Cycle Implementation Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology February 29, 2012L5-1

The MIPS ISA Processor State bit GPRs, R0 always contains a 0 PC, the program counter some other special registers Data types 8-bit byte, 16-bit half word 32-bit word for integers Load/Store style instruction set data addressing modes- immediate & indexed branch addressing modes- PC relative & register indirect All instructions are 32 bits Byte addressable memory- big endian mode February 29, 2012 L4-2http://csg.csail.mit.edu/6.s078 r0 - r31 PC  Registers Floating point, memory management and other systems instructions are not includes in the SMIPS subset

Instruction formats Only three formats but the fields are used differently by different types of instructions opcode rsrt immediate I-type opcode rsrt rd shamt func R-type 6 26 opcode target J-type February 29, 2012 L4-3http://csg.csail.mit.edu/6.s078

Instruction formats cont Computational Instructions Load/Store Instructions opcode rsrt immediate rt  (rs) op immediate rsrt rd 0 func rd  (rs) func (rt) rs is the base register rt is the destination of a Load or the source for a Store addressing mode opcode rsrt displacement (rs) + displacement February 29, 2012 L4-4http://csg.csail.mit.edu/6.s078

Control Instructions Conditional (on GPR) PC-relative branch target address = (offset in words)4 + (PC+4) range: 128 KB range Unconditional register-indirect jumps Unconditional absolute jumps target address = {PC, target4} range : 256 MB range opcode rs offset BEQZ, BNEZ 6 26 opcode target J, JAL opcode rs JR, JALR jump-&-link stores PC+4 into the link register (R31) February 29, 2012 L4-5http://csg.csail.mit.edu/6.s078

Instruction Execution  Execution of an instruction involves  1. Instruction fetch  2. Decode  3. Register fetch  4. ALU operation  5. Memory operation (optional)  6. Write back  and the computation of the next instruction address February 29, 2012 L4-6http://csg.csail.mit.edu/6.s078

Implementing an ISA Instructions fetch requires an Instruction memory, PC Decode requires understanding the instruction format Register Fetch requires interaction with a register file with a specific number of read/write ports ALU must have the ability to carry out the specified ops Memory operations requires a data memory Write-back requires interaction with the register file Update the PC requires arithmetic ops to calculate pc and condition February 29, 2012 L4-7http://csg.csail.mit.edu/6.s078

A single-cycle implementation A single-cycle MIPS implementation requires: A register file with 2 read ports and a write port An instruction memory, separate from data memory so that we can fetch an instruction as well as perform a data operation (Load/store) on the memory fetch & execute pc iMemdMem rf CPU February 29, 2012 L4-8http://csg.csail.mit.edu/6.s078

Single-Cycle SMIPS PC Inst Memory Decode Register File Execute Data Memory +4 February 29, 2012 L4-9http://csg.csail.mit.edu/6.s078 Datapath is shown only for convenience; it will be derived automatically from the high- level textual description

Single-Cycle SMIPS code structure module mkProc(Proc); // State Reg#(Addr) pc <- mkRegU; RFile rf <- mkRFile; Memory imem <- mkMemory; Memory dmem <- mkMemory; … February 29, 2012 L4-10http://csg.csail.mit.edu/6.s078

Single-Cycle SMIPS code structure … rule fetchExecute; //fetch let instResp = imem(MemReq{…}); //decode let decInst = decode(instResp); //register read let rVal1 = rf[…]; let rVal2 = … //execute let execInst = exec(decInst, pc, rVal1, rVal2); //memory (optional) if (instType Ld || St) data = dmem(…) //writeback if(instType Alu || Ld) rf.wr(rDst,data); pc <= execInst.brTaken ? execInst.addr : pc + 4; endrule endmodule; February 29, 2012 L4-11http://csg.csail.mit.edu/6.s078

Decoding dInst is all the information we need to do the later phases of execution on the instruction Not all fields of dInst are defined in each case We may decide to pass more information from decode to execute for efficiency– in that case the definition of the type of the decoded instruction has to be adjusted accordingly February 29, 2012 L5-12http://csg.csail.mit.edu/6.s078

Instruction grouping Many instructions have the same implementation except for the ALU function they invoke We can group such instructions and reduce the amount of code we have to write Example: R-Type ALU, I-Type ALU, Br Type, Memory Type, … February 29, 2012 L5-13http://csg.csail.mit.edu/6.s078

immValid Decoding Instructions: extract fields needed for execution from each instruction instruction branchComp rDst rSrc1 rSrc2 imm ext aluFunc instType 31:26 5:0 31:26 20:16 15:11 25:21 20:16 15:0 25:0 Lot of pure combinational logic: will be derived automatically from the high-level description decode February 29, 2012 L5-14http://csg.csail.mit.edu/6.s078

decode Decoding Instructions: input-output types instruction brComp rDst rSrc1 rSrc2 imm ext aluFunc iType 31:26, 5:0 31:26 5:0 31:26 20:16 15:11 25:21 20:16 15:0 25:0 Type DecodedInst Bit#(32) IType AluFunc Rindex BrType Bit#(32) Mux control logic not shown immValid Bool February 29, 2012 L7-15http://csg.csail.mit.edu/6.s078

Typedefs typedef enum {Alu, Ld, St, J, Jr, Jal, Jalr, Br} IType deriving(Bits, Eq); typedef enum {Eq, Neq, Le, Lt, Ge, Gt} BrType deriving(Bits, Eq); typedef enum {Add, Sub, And, Or, Xor, Nor, Slt, Sltu, LShift, RShift, Sra} AluFunc deriving(Bits, Eq); typedef enum {RAlu, IAlu, LdSt, J, Jr, Br} InstDType deriving(Bits, Eq); February 29, 2012 L7-16http://csg.csail.mit.edu/6.s078

Decode Function function DecodedInst decode(Bit#(32) inst); DecodedInst dInst = ?; let opcode = inst[ 31 : 26 ]; let rs = inst[ 25 : 21 ]; let rt = inst[ 20 : 16 ]; let rd = inst[ 15 : 11 ]; let funct = inst[ 5 : 0 ]; let imm = inst[ 15 : 0 ]; let target = inst[ 25 : 0 ]; case (getInstDType(inst))... endcase return dInst; endfunction February 29, 2012 L7-17http://csg.csail.mit.edu/6.s078  initially undefined

Instruction Encoding Bit#(6) opADDIU = 6’b001001; Bit#(6) opSLTI = 6’b001010; Bit#(6) opLW = 6’b100011; Bit#(6) opSW = 6’b101011; Bit#(6) opJ = 6’b000010; Bit#(6) opBEQ = 6’b000100; … Bit#(6) opFUNC = 6’b000000; Bit#(6) fcADDU = 6’b100001; Bit#(6) fcAND = 6’b100100; Bit#(6) fcJR = 6’b001000; … Bit#(6) opRT = 6’b000001; Bit#(6) rtBLTZ = 5’b00000; Bit#(6) rtBGEZ = 5’b00100; February 29, 2012 L7-18http://csg.csail.mit.edu/6.s078

Instruction Grouping function InstDType getInstDType(Bit#(32) inst) return case (inst[31:26]) opADDIU, opSLTI, …: IAlu; opLW, opSW: LdSt; opJ, opJAL: J; opBEQ, opRT, …: Br; opFUNC: case (inst[5:0]) fcADDU, fcAND, …: RAlu; fcJR, fcJALR: Jr; endcase; endfunction February 29, 2012 L7-19http://csg.csail.mit.edu/6.s078

Decoding Instructions: R-Type ALU case (getInstDType(opcode)) … RAlu: begin dInst.iType = Alu; dInst.aluFunc = case (funct) fcADDU: Add; fcSUBU: Sub;... endcase; dInst.rDst = rd; dInst.rSrc1 = rs; dInst.rSrc2 = rt; dInst.immValid = False; end February 29, 2012 L7-20http://csg.csail.mit.edu/6.s078

Decoding Instructions: I-Type ALU case (getInstDType(opcode)) … IAlu: begin dInst.iType = Alu; dInst.aluFunc = case (opcode) opADDIU: Add;... endcase; dInst.rDst = rt; dInst.rSrc1 = rs; dInst.imm = signedIAlu(opcode) ? signExtend(imm): zeroExtend(imm); dInst.immValid = True; end February 29, 2012 L7-21http://csg.csail.mit.edu/6.s078

Decoding Instructions: Load & Store case (getInstDType(opcode)) … LdSt: begin dInst.iType = opcode==opLW ? Ld : St; dInst.aluFunc = Add; dInst.rDst = rt; dInst.rSrc1 = rs; dInst.rSrc2 = rt; dInst.imm = signExtned(imm); dInst.immValid = True; end February 29, 2012 L7-22http://csg.csail.mit.edu/6.s078

Decoding Instructions: Jump case (getInstDType(opcode)) … J: begin dInst.iType = opcode==opJ ? J : Jal; dInst.rDst = 31; dInst.imm = zeroExtend({target, 2’b00}); dInst.immValid = False; end Jr: begin dInst.iType = funct==fcJR ? Jr : Jalr; dInst.rDst = rd; dInst.rSrc1 = rs; dInst.immValid = False; end February 29, 2012 L7-23http://csg.csail.mit.edu/6.s078

Decoding Instructions: Branch case (getInstDType(opcode)) … Br: begin dInst.iType = Br; dInst.brComp = case (opcode) opBEQ : EQ; opBLEZ: LE;... endcase; dInst.rSrc1 = rs; dInst.rSrc2 = rt; dInst.imm = signExtend({imm, 2’b00}); dInst.immValid = False; end February 29, 2012 L7-24http://csg.csail.mit.edu/6.s078

Reading Registers Read registers RVal2 RVal1 RF RSrc2 Pure combinational logic February 29, 2012 L7-25http://csg.csail.mit.edu/6.s078 RSrc1

Executing Instructions execute dInst addr brTaken data iType rDst ALU Branch Address pc rVal2 rVal1 Pure combinational logic either for memory reference or branch target either for rf write or St February 29, 2012 L7-26http://csg.csail.mit.edu/6.s078 ALUBr

Some Useful Functions function Bool memType (IType i) return (i==Ld || i == St); endfunction function Bool regWriteType (IType i) return (i==Alu || i==Ld || i==Jal || i==Jalr); endfunction function Bool controlType (IType i) return (i==J || i==Jr || i==Jal || i==Jalr || i==Br); endfunction February 29, 2012 L7-27http://csg.csail.mit.edu/6.s078

Execute Function function ExecInst exec(DecodedInst dInst, Data rVal1, Data rVal2, Addr pc); ExecInst einst = ?; Data aluVal2 = (dInst.immValid)? dInst.imm : rVal2 let aluRes = alu(rVal1, aluVal2, dInst.aluFunc); let brAddr = brAddrCal(pc, rVal1, dInst.iType, dInst.imm); einst.itype = dInst.iType; einst.addr = (memType(dInst.iType)? aluRes : brAddr; einst.data = dInst.iType==St ? rVal2 : aluRes; einst.brTaken = aluBr(rVal1, aluVal2, dInst.brComp); einst.rDst = dInst.rDst; return einst; endfunction February 29, 2012 L7-28http://csg.csail.mit.edu/6.s078

ALU function Data alu(Data a, Data b, Func func); Data res = case(func) Add : add(a, b); Sub : subtract(a, b); And : (a & b); Or : (a | b); Xor : (a ^ b); Nor : ~(a | b); Slt : setLessThan(a, b); Sltu : setLessThanUnsigned(a, b); LShift: logicalShiftLeft(a, b[4:0]); RShift: logicalShiftRight(a, b[4:0]); Sra : signedShiftRight(a, b[4:0]); endcase; return res; endfunction February 29, 2012 L7-29http://csg.csail.mit.edu/6.s078

Branch Resolution function Bool aluBr(Data a, Data b, BrType brComp); Bool brTaken = case(brComp) Eq : (a == b); Neq : (a != b); Le : signedLE(a, 0); Lt : signedLT(a, 0); Ge : signedGE(a, 0); Gt : signedGT(a, 0); endcase; return brTaken; endfunction February 29, 2012 L7-30http://csg.csail.mit.edu/6.s078

Branch Address Calculation function Addr brAddrCalc(Address pc, Data val, IType iType, Data imm); let targetAddr = case (iType) J, Jal : {pc[31:28], imm[27:0]}; Jr, Jalr : val; default : pc + imm; endcase; return targetAddr; endfunction February 29, 2012 L7-31http://csg.csail.mit.edu/6.s078

Single-Cycle SMIPS module mkProc(Proc); Reg#(Addr) pc <- mkRegU; RFile rf <- mkRFile; Memory mem <- mkTwoPortedMemory; let iMem = mem.iport ; let dMem = mem.dport; rule doProc; let inst = iMem(MemReq{op: Ld, addr: pc, data: ?}); let dInst = decode(inst); Data rVal1 = rf.rd1(dInst.rSrc1); Data rVal2 = rf.rd2(dInst.rSrc2); February 29, 2012 L7-32http://csg.csail.mit.edu/6.s078

Single-Cycle SMIPS cont let eInst = exec(dInst, rVal1, rVal2, pc); if(memType(eInst.iType)) eInst.data <- dMem(MemReq{ op: eInst.iType==Ld ? Ld : St, addr: eInst.addr, data: eInst.data}); if(regWriteType(eInst.iType)) rf.wr(eInst.rDst, eInst.data); pc <= eInst.brTaken ? eInst.addr : pc + 4; endrule endmodule February 29, 2012 L7-33http://csg.csail.mit.edu/6.s078  state updates

Single-Cycle SMIPS PC Inst Memory Decode Register File Execute Data Memory +4 The whole system was described using one rule; lots of big combinational functions performance? February 29, 2012 L5-34http://csg.csail.mit.edu/6.s078

Harvard-Style Datapath for MIPS February 29, x4 RegWrite Add clk WBSrcMemWrite addr wdata rdata Data Memory we RegDst BSrc ExtSelOpCode z OpSel clk zero? clk addr inst Inst. Memory PC rd1 GPRs rs1 rs2 ws wd rd2 we Imm Ext ALU Control 31 PCSrc br rind jabs pc+4 old way

Hardwired Control Table OpcodeExtSelBSrcOpSelMemWRegWWBSrcRegDstPCSrc ALU ALUi ALUiu LW SW BEQZ z=0 BEQZ z=1 J JAL JR JALR BSrc = Reg / ImmWBSrc = ALU / Mem / PC RegDst = rt / rd / R31PCSrc = pc+4 / br / rind / jabs ***no yesrindPC R31 rind***no ** jabs ** * no yesPCR31 jabs * * * no ** pc+4sExt 16 *0?no ** brsExt 16 *0?no ** pc+4sExt 16 Imm+yesno** pc+4ImmOpnoyesALUrt pc+4*RegFuncnoyesALUrd sExt 16 ImmOppc+4noyesALUrt pc+4sExt 16 Imm+noyesMemrt uExt 16 February 29, 2012 old way