ELEN 350 Multi-Cycle Datapath Adapted from the lecture notes of John Kubiatowicz (UCB) and Hank Walker (TAMU)

Abstract View of our single cycle processor
° Looks like an FSM with PC as state
[Figure: single-cycle datapath. Stages: Next PC, PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access (Data Mem), Reg. Wrt. Control Unit inputs: op, fun, Equal. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr]

What’s wrong with our CPI=1 processor?
° All instructions take as much time as the slowest
° Long cycle time
° Real memory is not as nice as our idealized memory: it cannot always get the job done in one (short) cycle
[Figure: path through the datapath for each instruction class (Arithmetic & Logical, Load, Store, Branch), built from PC, Inst Memory, Reg File, mux, ALU, Data Mem, cmp and setup time, with the critical path marked]

Reducing Cycle Time
° Cut the combinational dependency graph and insert a register / latch
° Do the same work in two fast cycles, rather than one slow one
° May be able to short-circuit the path and remove some components for some instructions!
[Figure: storage element -> Combinational Logic -> storage element, redrawn as storage element -> Combinational Logic (A) -> storage element -> Combinational Logic (B) -> storage element]
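As a rough illustration of the trade-off above, the sketch below compares one long combinational path with the same logic cut in two by a register. All delay values and the helper names (`cycle_time`, `latency`) are hypothetical and chosen only to make the arithmetic visible; none of them come from the lecture.

```python
# Hypothetical delays in nanoseconds; none of these numbers come from the slides.
REG_OVERHEAD = 1.0   # clock-to-Q plus setup time of a storage element
LOGIC_A = 6.0        # delay of Combinational Logic (A)
LOGIC_B = 4.0        # delay of Combinational Logic (B)


def cycle_time(stage_delays, reg_overhead=REG_OVERHEAD):
    """Clock period is set by the slowest stage plus register overhead."""
    return max(stage_delays) + reg_overhead


def latency(stage_delays, stages_used, reg_overhead=REG_OVERHEAD):
    """Time for one operation that occupies `stages_used` clock cycles."""
    return stages_used * cycle_time(stage_delays, reg_overhead)


single = cycle_time([LOGIC_A + LOGIC_B])   # one slow cycle
split = cycle_time([LOGIC_A, LOGIC_B])     # two fast cycles

print(f"single-cycle period:       {single} ns")
print(f"split period:              {split} ns")
print(f"operation using A and B:   {latency([LOGIC_A, LOGIC_B], 2)} ns")
print(f"operation using only A:    {latency([LOGIC_A, LOGIC_B], 1)} ns")
```

With these made-up numbers, an operation that needs both blocks is slightly slower after the cut, because it pays the register overhead twice. The gain comes from operations that can skip a block and from the shorter clock period.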

Partitioning the Single-Cycle Datapath
° Add registers between smallest steps
° Place enables on all registers
[Figure: partitioned datapath. Stages: Next PC, PC, Instruction Fetch, Operand Fetch, Exec, Mem Access (Data Mem), Result Store (Reg. File). Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, Equal]

Example Multicycle Datapath
° Critical path?
[Figure: multicycle datapath. Stages: Next PC, PC, Instruction Fetch, Operand Fetch (Reg File, Ext), ALU, Mem Access (Data Mem), Result Store (Reg. File). Added registers: IR, A, B, E, S, M. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, MemToReg, Equal]

R-type (add, sub, ...)
° Logical register transfers
    ADDU    R[rd] <– R[rs] + R[rt]; PC <– PC + 4
° Register transfers by cycle
    1. IR <– MEM[pc]
    2. A <– R[rs]; B <– R[rt]
    3. S <– A + B
    4. R[rd] <– S; PC <– PC + 4
[Figure: datapath units (Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access, Data Mem, M, Reg. File, Next PC, PC) laid out against Time]

Logical immed
° Logical register transfers
    ORI     R[rt] <– R[rs] OR ZExt(Im16); PC <– PC + 4
° Register transfers by cycle
    1. IR <– MEM[pc]
    2. A <– R[rs]; B <– R[rt]
    3. S <– A or ZExt(Im16)
    4. R[rt] <– S; PC <– PC + 4
[Figure: datapath units (Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access, Data Mem, M, Reg. File, Next PC, PC) laid out against Time]

Load
° Logical register transfers
    LW      R[rt] <– MEM[R[rs] + SExt(Im16)]; PC <– PC + 4
° Register transfers by cycle
    1. IR <– MEM[pc]
    2. A <– R[rs]; B <– R[rt]
    3. S <– A + SExt(Im16)
    4. M <– MEM[S]
    5. R[rt] <– M; PC <– PC + 4
[Figure: datapath units (Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access, Data Mem, M, Reg. File, Next PC, PC) laid out against Time]

Store
° Logical register transfers
    SW      MEM[R[rs] + SExt(Im16)] <– R[rt]; PC <– PC + 4
° Register transfers by cycle
    1. IR <– MEM[pc]
    2. A <– R[rs]; B <– R[rt]
    3. S <– A + SExt(Im16)
    4. MEM[S] <– B; PC <– PC + 4
[Figure: datapath units (Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access, Data Mem, M, Reg. File, Next PC, PC) laid out against Time]

Branch
° Logical register transfers
    BEQ     if R[rs] == R[rt]
            then PC <– PC + 4 + SExt(Im16) || 00
            else PC <– PC + 4
° Register transfers by cycle
    1. IR <– MEM[pc]
    2. E <– (R[rs] = R[rt])
    3. if !E then PC <– PC + 4
       else PC <– PC + 4 + SExt(Im16) || 00
[Figure: datapath units (Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access, Data Mem, M, Reg. File, Next PC, PC) laid out against Time]
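The per-cycle register transfers for the five instruction types can be traced in software. The following is a minimal sketch, not the course's implementation: the pre-decoded tuple format `(op, rs, rt, rd, imm16)`, the dictionary used as memory, the function names and the small test program are all assumptions made for illustration. The program is stepped in list order rather than fetched by PC, so only the PC arithmetic is modeled.

```python
MASK32 = 0xFFFFFFFF


def sext16(imm):
    """Sign-extend a 16-bit immediate."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def execute(instr, regs, mem, pc):
    """Apply one instruction's register transfers; return (next_pc, cycles).

    `instr` is a hypothetical pre-decoded tuple (op, rs, rt, rd, imm16);
    `mem` maps byte addresses to 32-bit words.
    """
    op, rs, rt, rd, imm = instr          # cycle 1: IR <- MEM[pc]
    a, b = regs[rs], regs[rt]            # cycle 2: A <- R[rs]; B <- R[rt]
    if op == "BEQ":
        e = a == b                       # cycle 2: E <- (R[rs] = R[rt])
        target = pc + 4 + (sext16(imm) << 2)
        return (target if e else pc + 4) & MASK32, 3   # cycle 3: PC update
    if op == "ADDU":
        s = (a + b) & MASK32             # cycle 3: S <- A + B
        regs[rd] = s                     # cycle 4: R[rd] <- S
        cycles = 4
    elif op == "ORI":
        s = a | (imm & 0xFFFF)           # cycle 3: S <- A or ZExt(Im16)
        regs[rt] = s                     # cycle 4: R[rt] <- S
        cycles = 4
    elif op == "LW":
        s = (a + sext16(imm)) & MASK32   # cycle 3: S <- A + SExt(Im16)
        m = mem[s]                       # cycle 4: M <- MEM[S]
        regs[rt] = m                     # cycle 5: R[rt] <- M
        cycles = 5
    elif op == "SW":
        s = (a + sext16(imm)) & MASK32   # cycle 3: S <- A + SExt(Im16)
        mem[s] = b                       # cycle 4: MEM[S] <- B
        cycles = 4
    else:
        raise ValueError(f"unsupported instruction: {op}")
    regs[0] = 0                          # register 0 always reads as zero
    return (pc + 4) & MASK32, cycles     # PC <- PC + 4 in the final cycle


regs = [0] * 32
regs[1], regs[2] = 5, 7
mem = {12: 99}
program = [
    ("ADDU", 1, 2, 3, 0),       # R[3] <- R[1] + R[2]
    ("LW", 0, 4, 0, 12),        # R[4] <- MEM[12]
    ("SW", 0, 3, 0, 16),        # MEM[16] <- R[3]
    ("BEQ", 1, 1, 0, 0xFFFE),   # taken branch, offset of -2 words
]
pc, total_cycles = 0, 0
for instr in program:
    pc, cycles = execute(instr, regs, mem, pc)
    total_cycles += cycles
print(regs[3], regs[4], mem[16], pc, total_cycles)
```

The cycle counts returned here (4 for arithmetic/logical and store, 5 for load, 3 for branch) are the step counts listed on the slides above.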

Performance Evaluation
° What is the average CPI?
    - state diagram gives CPI for each instruction type
    - workload gives frequency of each type

Type          CPI_i for type   Frequency   CPI_i x freq_i
Arith/Logic   4                40%         1.6
Load          5                30%         1.5
Store         4                10%         0.4
Branch        3                20%         0.6
Average CPI:                               4.1
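The average in the table is a frequency-weighted sum, which the short sketch below reproduces. The CPI and frequency values are the ones from the table; the dictionary layout and the function name `average_cpi` are illustrative choices.

```python
# (CPI for the type, fraction of the instruction mix), copied from the table
workload = {
    "Arith/Logic": (4, 0.40),
    "Load": (5, 0.30),
    "Store": (4, 0.10),
    "Branch": (3, 0.20),
}


def average_cpi(mix):
    """Sum of CPI_i * freq_i over all instruction types."""
    total_freq = sum(freq for _, freq in mix.values())
    if abs(total_freq - 1.0) > 1e-9:
        raise ValueError("instruction frequencies must sum to 1")
    return sum(cpi * freq for cpi, freq in mix.values())


for name, (cpi, freq) in workload.items():
    print(f"{name:12s} {cpi} x {freq:.2f} = {cpi * freq:.1f}")
print(f"Average CPI: {average_cpi(workload):.1f}")
```

Changing the mix shows how sensitive the result is to loads: each additional percentage point of loads, taken from the branch share, raises the average CPI by 0.02.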

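Verilog Implementation (IM)
[Figure: Inst. Mem block with PC as input and IR as output]

module IM(IR, PC, clk, IRen);
  output [31:0] IR;
  input  [31:0] PC;
  input         clk, IRen;
  reg    [31:0] IR;
  reg    [31:0] mem[0:1023];      // 1024 instruction words
  wire   [31:0] IR_next;

  // OK, but slow:
  //   always @(posedge clk)
  //     IR = mem[PC[11:2]];

  // Word index: drop the two byte-offset bits; 10 bits address 1024 words
  assign IR_next = mem[PC[11:2]];

  // Load IR on the rising clock edge, only when enabled
  always @(posedge clk)
    if (IRen) IR <= IR_next;
endmodule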

Verilog Implementation (REGS)
[Figure: Reg File block with output registers A, B and E]

module REGS(A, B, E, RA, RB, RW, W, RegWr, clk, REGSen);
  output [31:0] A, B;
  output        E;                // A == B
  input  [4:0]  RA, RB, RW;
  input  [31:0] W;
  input         RegWr, clk, REGSen;
  reg    [31:0] A, B;
  reg           E;
  wire          E_next;
  reg    [31:0] regs[0:31];

  // Operands read combinationally from the register file
  wire   [31:0] A_next = regs[RA];
  wire   [31:0] B_next = regs[RB];
  assign E_next = (A_next == B_next) ? 1 : 0;

  always @(posedge clk) begin
    if (REGSen == 1) begin
      A <= A_next;
      B <= B_next;
      E <= E_next;
      if (RegWr == 1'b1) regs[RW] <= W;
      regs[0] <= 0;               // register 0 always reads as zero
    end
  end
endmodule

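Verilog Implementation (ALU)
[Figure: Exec block with inputs A and B and output register S]

module ALU(S, A, B, ALUCtr, clk, ALUen);
  output [31:0] S;
  input  [31:0] A, B;
  input  [2:0]  ALUCtr;
  input         clk, ALUen;
  reg    [31:0] S, S_next;

  // Combinational result, selected by ALUCtr
  always @(A or B or ALUCtr) begin
    if (ALUCtr == 3'h0) S_next = A + B;
    ...
  end

  // Capture the result on the rising clock edge, only when enabled
  always @(posedge clk) begin
    if (ALUen == 1) S <= S_next;
  end
endmodule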

Control
° State specifies control points for Register Transfer
° Transfer occurs upon entering state (rising edge)
[Figure: Current State register, Next State Logic driven by the inputs, and Output Logic producing the output control signals]

State Machine for multicycle MIPS
[Figure: state diagram; the register transfers in each state are listed below]

“start / instruction fetch”:   IR <= MEM[PC]
“decode / operand fetch”:      A <= R[rs]; B <= R[rt]; E <= R[rt]==R[rs]

          Execute                 Memory                       Write-back
R-type    S <= A fun B                                         R[rd] <= S; PC <= PC + 4
ORi       S <= A or ZX                                         R[rt] <= S; PC <= PC + 4
LW        S <= A + SX             M <= MEM[S]                  R[rt] <= M; PC <= PC + 4
SW        S <= A + SX             MEM[S] <= B; PC <= PC + 4
BEQ       PC <= Next(PC,Equal)
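The state diagram can also be written down as a next-state table and walked for each instruction class. This is a sketch only: the state names (`R_EXEC`, `LW_MEM` and so on) are invented here for readability and do not appear in the slides.

```python
# State entered after DECODE, chosen by instruction class
DISPATCH = {
    "R-type": "R_EXEC",
    "ORi": "ORI_EXEC",
    "LW": "LW_EXEC",
    "SW": "SW_EXEC",
    "BEQ": "BEQ_EXEC",
}

# States with a single, unconditional successor
SUCCESSOR = {
    "FETCH": "DECODE",
    "R_EXEC": "R_WB", "R_WB": "FETCH",
    "ORI_EXEC": "ORI_WB", "ORI_WB": "FETCH",
    "LW_EXEC": "LW_MEM", "LW_MEM": "LW_WB", "LW_WB": "FETCH",
    "SW_EXEC": "SW_MEM", "SW_MEM": "FETCH",
    "BEQ_EXEC": "FETCH",
}


def next_state(state, op):
    """Next-state function: only DECODE looks at the instruction class."""
    return DISPATCH[op] if state == "DECODE" else SUCCESSOR[state]


def states_visited(op):
    """States traversed by one instruction, starting and ending at FETCH."""
    state, trace = "FETCH", []
    while True:
        trace.append(state)
        state = next_state(state, op)
        if state == "FETCH":
            return trace


for op in DISPATCH:
    trace = states_visited(op)
    print(f"{op:7s} {len(trace)} cycles: {' -> '.join(trace)}")
```

The number of states visited per instruction is its CPI, which is how the state diagram feeds the performance evaluation.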

State Machine that Generates Control Signals
[Figure: the multicycle MIPS state diagram (IR <= MEM[PC]; A <= R[rs], B <= R[rt]; then the Execute, Memory and Write-back states for R-type, ORi, LW, SW and BEQ), annotated with the control signals asserted in each state]
° “start, instruction fetch”: IRen
° “decode”: REGSen
° Execute: ALUCtr, ALUen
° Write-back: RegDst, RegWr, PCen

State Machine Implementation in Verilog 1

module CTRL(clk, rst, opcode, IRen, REGSen, ALUen, ALUCtr, REGDst, REGWr, PCen);
  input        clk, rst;
  input  [5:0] opcode;
  // Outputs are driven by continuous assignments, so they stay nets
  output       IRen, REGSen, ALUen, REGDst, REGWr, PCen;
  output [2:0] ALUCtr;            // same width as the ALU's ALUCtr input
  reg    [3:0] state, next_state;

  parameter [3:0] START = 0, DECODE = 1, RTYPE_1 = 2, RTYPE_2 = 3; // other states omitted

State Machine in Verilog 2

  // State register with asynchronous, active-low reset
  always @(posedge clk or negedge rst) begin
    if (!rst) state <= START;
    else      state <= next_state;
  end

  // Next-state logic
  always @(opcode or state) begin
    next_state = START;           // default, so no latch is inferred
    case (state)
      START:   next_state = DECODE;
      DECODE:  if (opcode == 6'h00)      next_state = RTYPE_1;  // R-type
               else if (opcode == 6'h0d) next_state = ORI;      // MIPS ori opcode
               else if (opcode == 6'h23) next_state = LW;       // MIPS lw opcode
               // other states omitted
      RTYPE_1: next_state = RTYPE_2;
      RTYPE_2: next_state = START;
    endcase
  end

State Machine in Verilog 3

  assign IRen   = (state == START)  ? 1 : 0;
  assign REGSen = (state == DECODE) ? 1 : 0;
  assign ALUen  = (state == RTYPE_1 || state == ORI || state == LW || state == SW) ? 1 : 0;

Assigning States
[Figure: the multicycle MIPS state diagram (“start, instruction fetch”, “decode”, Execute, Memory, Write-back; branches for R-type, ORi, LW, SW and BEQ) with a state code assigned to each state]

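(Mostly) Detailed Control Specification (missing => 0)
[Table: one row per controller state; most cell values did not survive extraction.
 Columns: State | Op field | Eq | Next | IR en | PC sel | Ops: A, B, E | Exec: Ex, Sr, ALU, S | Mem: R, W, M | Write-Back: M-R, Wr, Dst
 Rows: state 0000 (Op field ??????, Eq ?); rows keyed on BEQ, R-type, ORI, LW and SW with Eq = x; then row groups R:, ORi:, LW:, SW: and BEQ:
 ALU column: fun for R-type, or for ORi, add for LW, add for SW
 Note on the slide: "all same in Moore machine"]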

Controller Design Alternative: Microprogramming
° The state machines defining the controller for an instruction set processor are highly structured
° Use this structure to construct a simple “microsequencer”
° Control reduces to programming this very simple device => microprogramming
[Figure: a sequencer with a micro-PC selects a microinstruction, which supplies datapath control and sequencer control]
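To make the idea concrete, the sketch below models a microsequencer as a table of microinstructions stepped by a micro-PC. Everything about the encoding is an assumption for illustration: the three sequencing actions ("seq", "dispatch", "fetch"), the labels, the table layout and the choice to cover only R-type and LW are not from the slides. Only the control signal names (IRen, REGSen, ALUen, RegDst, RegWr, PCen, MemRd) are taken from the deck.

```python
# Each microinstruction: (label, datapath control signals, sequencer control).
# Sequencer control (hypothetical encoding):
#   "seq"      -> micro-PC + 1
#   "dispatch" -> jump to the entry point for the current opcode
#   "fetch"    -> instruction finished, return to microinstruction 0
MICROPROGRAM = [
    ("Fetch",   {"IRen"},                    "seq"),
    ("Decode",  {"REGSen"},                  "dispatch"),
    ("Rtype",   {"ALUen"},                   "seq"),
    ("RtypeWB", {"RegDst", "RegWr", "PCen"}, "fetch"),
    ("LW",      {"ALUen"},                   "seq"),
    ("LWmem",   {"MemRd"},                   "seq"),
    ("LWwb",    {"RegWr", "PCen"},           "fetch"),
]

# Dispatch table: instruction class -> index of its first microinstruction
DISPATCH_ROM = {"R-type": 2, "LW": 4}


def run_microprogram(op):
    """Return the (label, signals) issued for one instruction of class `op`."""
    micro_pc, trace = 0, []
    while True:
        label, signals, sequencing = MICROPROGRAM[micro_pc]
        trace.append((label, sorted(signals)))
        if sequencing == "seq":
            micro_pc += 1
        elif sequencing == "dispatch":
            micro_pc = DISPATCH_ROM[op]
        else:  # "fetch"
            return trace


for op in DISPATCH_ROM:
    print(op)
    for label, signals in run_microprogram(op):
        print(f"  {label:8s} {', '.join(signals)}")
```

Adding an instruction in this model means appending rows to the table and one dispatch entry, rather than redesigning next-state logic, which is the appeal of the approach.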