1 EECC550 - Shaaban #1 Lec # 5 Winter 2003 1-6-2004 CPU Design Steps
1. Analyze instruction set operations using independent ISA => RTN => datapath requirements.
2. Select required datapath components, connections & establish clock methodology.
3. Assemble datapath meeting the requirements.
4. Analyze the implementation of each instruction to determine setting of control points that effects the register transfer.
5. Design & assemble the control logic. (Chapter 5.4)

2 Single Cycle MIPS Datapath: CPI = 1, Long Clock Cycle (Jump Not Included)
[Figure: single-cycle MIPS datapath.]

3 Drawbacks of Single-Cycle Processor
- Long cycle time. All instructions must take as much time as the slowest:
  - The cycle time is set by load and is longer than any other instruction needs.
- Real memory is not as well-behaved as idealized memory:
  - Cannot always complete data access in one (short) cycle.
- Cannot pipeline (overlap) the processing of one instruction with the previous instructions:
  - (Instruction pipelining, Chapter 6.)

4 Abstract View of Single Cycle CPU
[Figure: block diagram with PC / Next PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access (Data Mem), Reg. Wrt and Result Store. Main Control and ALU control take the op and fun fields and drive ALUctr, RegDst, ALUSrc, ExtOp, MemWr, MemRd, RegWr, Branch and Jump; Equal is also shown.]
One CPU clock cycle duration: C = 8 ns. One instruction per cycle: CPI = 1.

5 Single Cycle Instruction Timing
[Figure: one timing bar per instruction class (Arithmetic & Logical, Load, Store, Branch), each built from segments labeled PC, Inst Memory, Reg File, mux, ALU, cmp, Data Mem and setup. The critical path is marked; it determines the CPU clock cycle, C.]

6 Clock Cycle Time & Critical Path
- Critical path: the slowest path between any two storage devices.
- Clock cycle time is a function of the critical path, and must be greater than:
  - Clock-to-Q + Longest Path through the Combinational Logic + Setup
- One CPU clock cycle duration: C = 8 ns here.
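The bound above can be sketched numerically. This is a minimal illustration, not taken from the slides: the function name and all delay values are hypothetical, chosen in picoseconds only so that the load path is the slowest and the bound comes out at the 8 ns quoted on the slide.

```python
def min_cycle_time_ps(clk_to_q_ps, logic_paths_ps, setup_ps):
    """Lower bound on the clock period:
    Clock-to-Q + longest combinational path + setup time."""
    return clk_to_q_ps + max(logic_paths_ps.values()) + setup_ps


# Hypothetical combinational delays per instruction class, in picoseconds.
paths_ps = {"arith_logic": 5500, "load": 7500, "store": 7000, "branch": 4500}

# Hypothetical clock-to-Q (200 ps) and setup (300 ps) times.
bound_ps = min_cycle_time_ps(200, paths_ps, 300)
critical = max(paths_ps, key=paths_ps.get)
print(critical, bound_ps)
```

Integer picoseconds are used so the sum is exact; the clock period must be strictly greater than the returned bound.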

7 Reducing Cycle Time: Multi-Cycle Design
- Cut the combinational dependency graph by inserting registers / latches.
- The same work is done in two or more shorter cycles, rather than one long cycle.
[Figure: one block of acyclic combinational logic between storage elements => two blocks, (A) and (B), with a storage element inserted between them.]

8 Instruction Processing Steps
- Instruction Fetch: obtain instruction from program storage. Instruction ← Mem[PC]
- Instruction Decode: determine instruction type; obtain operands from registers.
- Execute: compute result value or status.
- Result Store: store result in register/memory if needed (usually called Write Back).
- Next Instruction: update program counter to address of next instruction. PC ← PC + 4 (for MIPS)
Instruction fetch and the program counter update are common steps for all instructions.

9 Partitioning The Single Cycle Datapath
Add registers between steps to break into cycles:
1. Instruction Fetch Cycle (IF)
2. Instruction Decode Cycle (ID)
3. Execution Cycle (EX)
4. Data Memory Access Cycle (MEM)
5. Write back Cycle (WB)
[Figure: the single-cycle blocks (PC / Next PC, Instruction Fetch, Operand Fetch, Exec, Mem Access / Data Mem, Reg. File, Result Store) with control signals ALUctr, RegDst, ALUSrc, ExtOp, MemWr, MemRd, RegWr, Branch and Jump.]

10 Example Multi-cycle Datapath
[Figure: datapath with PC / Next PC, Ext, ALU, Reg. File, Mem Access / Data Mem and the added registers IR, A, B, R, M; control signals ALUctr, RegDst, ALUSrc, ExtOp, MemToReg, RegWr, MemWr, MemRd, Branch and Jump; Equal goes to the control unit.]
Registers added (register write enable control lines not shown):
- IR: Instruction register.
- A, B: Two registers to hold operands read from the register file.
- R (or ALUOut): holds the output of the main ALU.
- M (or Memory data register, MDR): holds data read from data memory.
CPU clock cycle time: worst cycle delay = C = 2 ns (ignoring MUX, CLK-Q delays):
- Instruction Fetch (IF): 2 ns
- Instruction Decode (ID): 1 ns
- Execution (EX): 2 ns
- Memory (MEM): 2 ns
- Write Back (WB): 1 ns
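As a quick check of the cycle-time claim, the sketch below takes the five stage delays listed on this slide and picks the worst one as the multi-cycle clock period. The variable names are illustrative, and MUX and clock-to-Q delays are ignored, as on the slide.

```python
# Stage delays in ns, copied from the slide.
stage_delay_ns = {"IF": 2, "ID": 1, "EX": 2, "MEM": 2, "WB": 1}

# Every cycle must be long enough for the slowest stage.
multi_cycle_ns = max(stage_delay_ns.values())

# Doing all five steps back to back in one long cycle would take their sum.
all_stages_ns = sum(stage_delay_ns.values())

print(multi_cycle_ns, all_stages_ns)  # 2 8
```

The sum of the five delays is 8 ns, the same value as the single-cycle C quoted earlier in the lecture.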

11 Operations (Dependent RTN) for Each Cycle
R-Type:
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs]; B ← R[rt]
  EX:  R ← A + B
  WB:  R[rd] ← R; PC ← PC + 4
Logic Immediate:
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs]
  EX:  R ← A OR ZeroExt[imm16]
  WB:  R[rt] ← R; PC ← PC + 4
Load:
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs]
  EX:  R ← A + SignEx(Im16)
  MEM: M ← Mem[R]
  WB:  R[rt] ← M; PC ← PC + 4
Store:
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs]; B ← R[rt]
  EX:  R ← A + SignEx(Im16)
  MEM: Mem[R] ← B; PC ← PC + 4
Branch:
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs]; B ← R[rt]
  EX:  Zero ← R[rs] - R[rt]
       If Zero = 1: PC ← PC + 4 + (SignExt(imm16) x 4)
       else (i.e. Zero = 0): PC ← PC + 4
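The register transfers on this slide can be traced with a small behavioral sketch. It is only an illustration under stated assumptions: the `execute` function, the `cpu` dictionary layout and the sample values are hypothetical, the instruction fields are passed in already decoded instead of being fetched into IR, and 32-bit wraparound is ignored.

```python
def sign_ext16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute(cpu, kind, rs=0, rt=0, rd=0, imm16=0):
    """Apply the slide's per-cycle transfers for one instruction.

    cpu holds 'PC', 'reg' (list of 32 ints) and 'mem' (dict: address -> word).
    Returns the number of cycles the instruction used.
    """
    reg, mem = cpu["reg"], cpu["mem"]
    # IF: IR <- Mem[PC]  (fields arrive as arguments in this sketch)
    # ID: A <- R[rs]; B <- R[rt]
    a, b = reg[rs], reg[rt]
    if kind == "add":                    # R-type
        r = a + b                        # EX:  R <- A + B
        reg[rd] = r                      # WB:  R[rd] <- R
        cpu["PC"] += 4                   #      PC <- PC + 4
        return 4
    if kind == "ori":                    # logic immediate
        r = a | (imm16 & 0xFFFF)         # EX:  R <- A OR ZeroExt(imm16)
        reg[rt] = r                      # WB:  R[rt] <- R
        cpu["PC"] += 4
        return 4
    if kind == "lw":
        r = a + sign_ext16(imm16)        # EX:  R <- A + SignExt(imm16)
        m = mem[r]                       # MEM: M <- Mem[R]
        reg[rt] = m                      # WB:  R[rt] <- M
        cpu["PC"] += 4
        return 5
    if kind == "sw":
        r = a + sign_ext16(imm16)        # EX:  R <- A + SignExt(imm16)
        mem[r] = b                       # MEM: Mem[R] <- B
        cpu["PC"] += 4
        return 4
    if kind == "beq":
        zero = (a - b) == 0              # EX:  Zero <- R[rs] - R[rt]
        offset = sign_ext16(imm16) * 4 if zero else 0
        cpu["PC"] += 4 + offset          #      taken or fall-through
        return 3
    raise ValueError(f"unsupported instruction kind: {kind}")


# Hypothetical three-instruction run: lw, add, then a taken backward branch.
cpu = {"PC": 0, "reg": [0] * 32, "mem": {104: 7}}
cpu["reg"][1] = 100
cycles = execute(cpu, "lw", rs=1, rt=2, imm16=4)
cycles += execute(cpu, "add", rs=2, rt=2, rd=3)
cycles += execute(cpu, "beq", rs=3, rt=3, imm16=0xFFFE)
print(cpu["reg"][3], cpu["PC"], cycles)
```

The cycle counts returned per instruction match the number of stages each class uses in the table above.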

12 MIPS Multi-Cycle Datapath: Five Cycles of Load
Load: Cycle 1 = IF, Cycle 2 = ID, Cycle 3 = EX, Cycle 4 = MEM, Cycle 5 = WB
1. Instruction Fetch (IF): Fetch the instruction from instruction memory.
2. Instruction Decode (ID): Operand register fetch and instruction decode.
3. Execute (EX): Calculate the effective memory address.
4. Memory (MEM): Read the data from the data memory.
5. Write Back (WB): Write the loaded data to the register file. Update PC.

13 Multi-cycle Datapath Instruction CPI
- R-Type/Immediate: require four cycles, CPI = 4 (IF, ID, EX, WB)
- Loads: require five cycles, CPI = 5 (IF, ID, EX, MEM, WB)
- Stores: require four cycles, CPI = 4 (IF, ID, EX, MEM)
- Branches/Jumps: require three cycles, CPI = 3 (IF, ID, EX)
Average program CPI: 3 ≤ CPI ≤ 5, depending on program profile (instruction mix).
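The per-class CPI values follow directly from the list of cycles each class uses. A minimal sketch (the dictionary name and keys are illustrative) that derives them and the bounds on the average:

```python
# Cycles used by each instruction class, as listed on the slide.
CYCLES_USED = {
    "r_type_or_immediate": ["IF", "ID", "EX", "WB"],
    "load": ["IF", "ID", "EX", "MEM", "WB"],
    "store": ["IF", "ID", "EX", "MEM"],
    "branch_or_jump": ["IF", "ID", "EX"],
}

# CPI of a class is simply the number of cycles it passes through.
cpi = {name: len(stages) for name, stages in CYCLES_USED.items()}

# Any weighted average of these values lies between the smallest and largest.
lowest, highest = min(cpi.values()), max(cpi.values())
print(cpi, lowest, highest)
```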

14 Single Cycle Vs. Multi-Cycle CPU
Single-Cycle CPU: CPI = 1, C = 8 ns (125 MHz)
  One million instructions take I x CPI x C = 10^6 x 1 x 8 x 10^-9 = 8 msec
Multi-Cycle CPU: CPI = 3 to 5, C = 2 ns
  One million instructions take from 10^6 x 3 x 2 x 10^-9 = 6 msec to 10^6 x 5 x 2 x 10^-9 = 10 msec, depending on instruction mix used.
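The three figures on this slide all come from the same product, I x CPI x C. A small sketch that reproduces them (the function name is illustrative; the inputs are the slide's values):

```python
def execution_time_ms(instruction_count, cpi, cycle_ns):
    """CPU time = I x CPI x C, converted from nanoseconds to milliseconds."""
    return instruction_count * cpi * cycle_ns / 1_000_000


single_cycle = execution_time_ms(1_000_000, cpi=1, cycle_ns=8)
multi_best = execution_time_ms(1_000_000, cpi=3, cycle_ns=2)
multi_worst = execution_time_ms(1_000_000, cpi=5, cycle_ns=2)
print(single_cycle, multi_best, multi_worst)  # 8.0 6.0 10.0
```

Whether the multi-cycle design comes out ahead therefore depends on where the program's average CPI falls between 3 and 5.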

15 Finite State Machine (FSM) Control Model
- State specifies control points for Register Transfer.
- Transfer occurs upon exiting state (falling edge).
[Figure: State X, annotated with its register transfer and control points, leads to a next state that depends on input. The control state register feeds the next-state logic and the output logic; inputs are conditions, outputs are control points.]

16 Control Specification For Multi-cycle CPU
Finite State Machine (FSM) - State Transition Diagram
- "Instruction fetch": IR ← MEM[PC]
- "Decode / operand fetch": A ← R[rs]; B ← R[rt]
- R-type: Execute: R ← A fun B. Write-back: R[rd] ← R; PC ← PC + 4
- ORi: Execute: R ← A or ZX. Write-back: R[rt] ← R; PC ← PC + 4
- LW: Execute: R ← A + SX. Memory: M ← MEM[R]. Write-back: R[rt] ← M; PC ← PC + 4
- SW: Execute: R ← A + SX. Memory: MEM[R] ← B; PC ← PC + 4
- BEQ & Zero: PC ← PC + 4 + SX || 00
- BEQ & ~Zero: PC ← PC + 4
The last state of every instruction returns to instruction fetch.
13 states: 4 State Flip-Flops needed.
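The flip-flop count follows from binary state encoding: n flip-flops can distinguish 2^n states. A minimal sketch, assuming a plain binary encoding (the helper name is hypothetical):

```python
def state_flip_flops(num_states):
    """Minimum flip-flops for a binary-encoded FSM with num_states states,
    i.e. the smallest n with 2**n >= num_states."""
    if num_states < 1:
        raise ValueError("an FSM needs at least one state")
    return (num_states - 1).bit_length()


print(state_flip_flops(13))  # 4
```

Other encodings, such as one-hot, would use more flip-flops in exchange for simpler next-state logic; the slide's count corresponds to the binary case.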

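17 Traditional FSM Controller
[Figure: the current state, the opcode (op) and the Equal condition feed a truth (transition) table with columns state, op, cond, next state and control points. The table produces the next state, which is loaded back into the state register, and the control points, which go to the datapath. Bus widths labeled 6, 4 and 11.]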

18 Traditional FSM Controller
datapath + state diagram => control
- Translate RTN statements into control points.
- Assign states.
- Implement the controller.

19 Mapping RTNs To Control Points Examples & State Assignments
State  Code  Step                       RTN
0      0000  "instruction fetch"        IR ← MEM[PC]
1      0001  "decode / operand fetch"   A ← R[rs]; B ← R[rt]
2      0010  BEQ & Zero                 PC ← PC + 4 + SX || 00
3      0011  BEQ & ~Zero                PC ← PC + 4
4      0100  R-type, Execute            R ← A fun B
5      0101  R-type, Write-back         R[rd] ← R; PC ← PC + 4
6      0110  ORi, Execute               R ← A or ZX
7      0111  ORi, Write-back            R[rt] ← R; PC ← PC + 4
8      1000  LW, Execute                R ← A + SX
9      1001  LW, Memory                 M ← MEM[R]
10     1010  LW, Write-back             R[rt] ← M; PC ← PC + 4
11     1011  SW, Execute                R ← A + SX
12     1100  SW, Memory                 MEM[R] ← B; PC ← PC + 4
Control point examples:
- Instruction fetch: imem_rd, IRen
- Decode / operand fetch: Aen, Ben
- R-type execute: ALUfun, Sen
- R-type write-back: RegDst, RegWr, PCen
The last state of every instruction returns to instruction fetch state 0000.

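20 Detailed Control Specification - State Transition Table
Column groups on the slide: Current State | Op field | Z | Next State | IR (en) | PC (en, sel) | Ops (A, B) | Exec (Ex, Sr, ALU, S) | Mem (R, W, M) | Write-Back (M-R, Wr, Dst). Only the non-blank control entries of each row are listed.

Current  Op field  Z  Next  Control entries                                     Group
0000     ??????    ?  0001  IR en = 1                                           IF
0001     BEQ       0  0011  A = 1, B = 1                                        ID
0001     BEQ       1  0010  A = 1, B = 1                                        ID
0001     R-type    x  0100  A = 1, B = 1                                        ID
0001     orI       x  0110  A = 1, B = 1                                        ID
0001     LW        x  1000  A = 1, B = 1                                        ID
0001     SW        x  1011  A = 1, B = 1                                        ID
0010     xxxxxx    x  0000  PC en = 1, sel = 1                                  BEQ
0011     xxxxxx    x  0000  PC en = 1, sel = 0                                  BEQ
0100     xxxxxx    x  0101  Ex = 0, Sr = 1, ALU = fun, S = 1                    R
0101     xxxxxx    x  0000  PC en = 1, sel = 0; M-R = 0, Wr = 1, Dst = 1        R
0110     xxxxxx    x  0111  Ex = 0, Sr = 0, ALU = or, S = 1                     ORI
0111     xxxxxx    x  0000  PC en = 1, sel = 0; M-R = 0, Wr = 1, Dst = 0        ORI
1000     xxxxxx    x  1001  Ex = 1, Sr = 0, ALU = add, S = 1                    LW
1001     xxxxxx    x  1010  R = 1, W = 0, M = 1                                 LW
1010     xxxxxx    x  0000  PC en = 1, sel = 0; M-R = 1, Wr = 1, Dst = 0        LW
1011     xxxxxx    x  1100  Ex = 1, Sr = 0, ALU = add, S = 1                    SW
1100     xxxxxx    x  0000  PC en = 1, sel = 0; R = 0, W = 1                    SW

Slide annotation: can be combined in one state.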
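The next-state part of this table can be sketched as a lookup. The code below is an illustration only: it encodes the current-state, op, Z and next-state columns from the table, leaves out the control outputs, and uses hypothetical names (`next_state`, `trace`, `DISPATCH`, `UNCONDITIONAL`).

```python
# Next state out of decode (0001) for classes that do not depend on Z.
DISPATCH = {"R-type": "0100", "ORi": "0110", "LW": "1000", "SW": "1011"}

# States whose successor depends on neither the op field nor Z.
UNCONDITIONAL = {
    "0000": "0001",
    "0010": "0000", "0011": "0000",
    "0100": "0101", "0101": "0000",
    "0110": "0111", "0111": "0000",
    "1000": "1001", "1001": "1010", "1010": "0000",
    "1011": "1100", "1100": "0000",
}


def next_state(state, op, zero):
    """Return the next state code for the given state, op class and Z bit."""
    if state == "0001":
        if op == "BEQ":
            return "0010" if zero else "0011"
        return DISPATCH[op]
    return UNCONDITIONAL[state]


def trace(op, zero=0):
    """List the states one instruction visits, starting at fetch (0000)."""
    states = ["0000"]
    while True:
        following = next_state(states[-1], op, zero)
        if following == "0000":          # back at fetch: instruction is done
            return states
        states.append(following)


for op in ("R-type", "ORi", "LW", "SW", "BEQ"):
    print(op, trace(op, zero=1), len(trace(op, zero=1)))
```

The length of each trace is the number of cycles the instruction takes, which agrees with the per-class CPI values given earlier (4, 4, 5, 4 and 3).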

21 Alternative Multiple Cycle Datapath (In Textbook)
Minimizes hardware: 1 memory, 1 ALU

22 Alternative Multiple Cycle Datapath (In Textbook)
- Shared instruction/data memory unit.
- A single ALU shared among instructions.
- Shared units require additional or widened multiplexors.
- Temporary registers to hold data between clock cycles of the instruction:
  - Additional registers: Instruction Register (IR), Memory Data Register (MDR), A, B, ALUOut

23 Alternative Multiple Cycle Datapath With Control Lines (Fig 5.33 In Textbook)
(ORI not supported, Jump supported)
[Figure: datapath with control lines; annotations mark PC + 4 and Branch Target.]

24 Operations In Each Cycle
Common to all instructions:
  IF:  IR ← Mem[PC]; PC ← PC + 4
  ID:  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
R-Type:
  EX:  ALUout ← A + B
  WB:  R[rd] ← ALUout
Logic Immediate:
  EX:  ALUout ← A OR ZeroExt[imm16]
  WB:  R[rt] ← ALUout
Load:
  EX:  ALUout ← A + SignEx(Im16)
  MEM: M ← Mem[ALUout]
  WB:  R[rt] ← M
Store:
  EX:  ALUout ← A + SignEx(Im16)
  MEM: Mem[ALUout] ← B
Branch:
  EX:  If Equal = 1: PC ← ALUout

25 High-Level View of Finite State Machine Control
- First steps are independent of the instruction class.
- Then a series of sequences that depend on the instruction opcode.
- Then the control returns to fetch a new instruction.
- Each box in the figure represents one or several states.

26 Instruction Fetch (IF) and Decode (ID) FSM States
[Figure: IF and ID states.]

27 Load/Store Instructions FSM States
[Figure: EX, MEM and WB states.]

28 R-Type Instructions FSM States
[Figure: EX and WB states.]

29 Jump Instruction Single EX State; Branch Instruction Single EX State
[Figure: EX states.]

30 FSM State Transition Diagram (From Book)
[Figure: complete state diagram covering IF, ID, EX, MEM and WB.]

31 Finite State Machine (FSM) Specification
Code  Step                    RTN
0000  "instruction fetch"     IR ← MEM[PC]; PC ← PC + 4
0001  "decode"                A ← R[rs]; B ← R[rt]; ALUout ← PC + SX
0010  BEQ, Execute            If A = B then PC ← ALUout
0100  R-type, Execute         ALUout ← A fun B
0101  R-type, Write-back      R[rd] ← ALUout
0110  ORi, Execute            ALUout ← A op ZX
0111  ORi, Write-back         R[rt] ← ALUout
1000  LW, Execute             ALUout ← A + SX
1001  LW, Memory              M ← MEM[ALUout]
1010  LW, Write-back          R[rt] ← M
1011  SW, Execute             ALUout ← A + SX
1100  SW, Memory              MEM[ALUout] ← B
The last state of every instruction returns to instruction fetch.

32 MIPS Multi-cycle Datapath Performance Evaluation
What is the average CPI?
- State diagram gives CPI for each instruction type.
- Workload (program) below gives frequency of each type.

Type          CPI_i for type   Frequency   CPI_i x freq_i
Arith/Logic   4                40%         1.6
Load          5                30%         1.5
Store         4                10%         0.4
Branch        3                20%         0.6
Average CPI:                               4.1

Better than CPI = 5 if all instructions took the same number of clock cycles (5).
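The average in the table is a frequency-weighted sum of the per-type CPI values. A short sketch that recomputes it from the slide's workload (the function and variable names are illustrative; frequencies are kept as whole percentages so the arithmetic stays exact until the final division):

```python
# (type, CPI for type, frequency in percent), copied from the slide's table.
workload = [
    ("Arith/Logic", 4, 40),
    ("Load", 5, 30),
    ("Store", 4, 10),
    ("Branch", 3, 20),
]


def average_cpi(mix):
    """Weighted average CPI = sum(CPI_i x freq_i) over all instruction types."""
    total_percent = sum(percent for _, _, percent in mix)
    if total_percent != 100:
        raise ValueError("frequencies must add up to 100%")
    return sum(cpi * percent for _, cpi, percent in mix) / 100


print(average_cpi(workload))  # 4.1
```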

