1
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
The single cycle CPU
2
Performance of Single-Cycle Machines

Unit delays:
  Memory unit                     2 ns
  ALU and adders                  2 ns
  Register file (read or write)   1 ns

Class      Fetch  Decode  ALU  Memory  Write Back  Total
R-format   2      1       2    0       1           6 ns
LW         2      1       2    2       1           8 ns
SW         2      1       2    2       -           7 ns
Branch     2      1       2    -       -           5 ns
Jump       2      -       -    -       -           2 ns
3
What if we had a variable CK cycle?
Let's check the following scenario: R-type: 44%, LW: 24%, SW: 12%, Branch: 18%, Jump: 2%
I - number of instructions in the program
T - time of the CK cycle
CPI - number of CK cycles per instruction (=1)
Execution = I*T*CPI
With a variable clock, the average T per instruction is 8*24% + 7*12% + 6*44% + 5*18% + 2*2% = 6.3 ns
4
The result:
EXE_single / EXE_variable = (T_single clock * I) / (T_variable clock * I) = T_single clock / T_variable clock = 8 / 6.3
We get a ratio of 1.27. The ratio is higher when more complicated instructions, e.g., floating-point instructions, are also implemented.
Since building a variable-CK circuit is too complicated, we instead want each instruction to take as many short CK cycles as it requires.
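The comparison above can be checked with a few lines of Python. This is only a sketch: the latencies and the instruction mix are the ones given on the slides, while the names `latency_ns`, `mix` and `average_cycle_ns` are made up for the illustration. Without rounding, the average is 6.34 ns and the ratio is about 1.26; the 1.27 above comes from using the rounded 6.3 ns.

```python
# Per-class latency (ns) and instruction mix, as given on the slides.
latency_ns = {"R-type": 6, "lw": 8, "sw": 7, "branch": 5, "jump": 2}
mix = {"R-type": 0.44, "lw": 0.24, "sw": 0.12, "branch": 0.18, "jump": 0.02}

def average_cycle_ns(latency, weights):
    """Weighted average time per instruction with a variable clock."""
    return sum(latency[name] * weights[name] for name in latency)

# A single fixed clock must fit the slowest instruction (lw).
t_single = max(latency_ns.values())
t_variable = average_cycle_ns(latency_ns, mix)

# The instruction count I cancels, so the ratio depends on T only.
ratio = t_single / t_variable
print(f"T_single = {t_single} ns, T_variable = {t_variable:.2f} ns, ratio = {ratio:.2f}")
```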
5
Multicycle Approach
The idea of the multi-cycle approach:
- We save time, since each instruction takes only the necessary number of CK cycles (which are about 5 times shorter than the original CK cycle).
- We also save components, since we can use the same component in different phases of the same instruction.
6
Building a Multi-Cycle CPU:
- Split the instruction into steps (phases).
- Make sure that the steps are balanced (same time required).
- Reduce the job done at each step. In each step only one chore is done.
- At the end of each CK cycle, store the result of the current step to be used by the next step. So, add more internal registers for storing the intermediate results.
7
A single cycle CPU capable of R-type & lw/sw instructions (data & control)
[Datapath diagram: PC, +4 adder, instruction memory, register file (Rs = [25:21], Rt = [20:16], Rd), 16->32 sign extension of [15:0], ALU with ALU control driven by funct = [5:0], data memory (Address, D.In, D.Out); control driven by opcode [31:26], with RegWrite and MemWrite signals.]
8
A single cycle CPU capable of R-type & lw/sw instructions - Data Path only
[Datapath diagram: the same components without the control logic; the lw and sw paths are marked.]
9
Timing of a single cycle CPU
10
Timing of a lw instruction in a single cycle CPU and in a multi-cycle CPU
We want to replace a long single CK cycle with 5 short ones.
[Timing diagram. Single cycle: within one CK, the PC (0x400000), the instruction memory data, Rs/Rt (ALU inputs), the ALU output (address) and the data memory output settle one after another through fetch, decode, execute, memory and write back. Multi-cycle: PC, IR, A/B, ALUout and MDR are each loaded at the end of one of the five short cycles (fetch, decode, execute, memory, write back).]
11
Therefore we should add registers to the single cycle CPU shown below:
[Datapath diagram: single cycle data path with PC, +4 adder, instruction memory, register file, 16->32 sign extension, ALU and data memory.]
12
Adding registers to "split" the instruction into 5 stages:
[Datapath diagram: the same data path with the added registers IR, A, B, ALUout and MDR and a PCWrite enable on the PC; the cycle boundaries are numbered 0 to 5.]
13
Here is the book's version of the multi-cycle CPU:
Only PC and IR have write enable signals.
All other registers hold data for a single cycle.
14
Here is our version of a multi-cycle CPU capable of R-type & lw/sw & branch instructions
[Datapath diagram: PC, a single instruction & data memory, IR, MDR, register file (Rs = IR[25:21], Rt = IR[20:16], Rd), A, B, 16->32 sign extension of IR[15:0], shift left by 2, constant 4, ALU and ALUout.]
15
Let us explain the multi-cycle CPU
- First we'll look at a CPU capable of performing only R-type instructions
- Then, we'll add the lw instruction
- And the sw instruction
- Then, the beq instruction
- And finally, the j instruction
16
Let us remind ourselves how a single cycle CPU capable of performing R-type instructions works. Here you see the data path and the timing of an R-type instruction.
[Datapath diagram: PC, +4 adder, instruction memory, register file (Rs = [25:21], Rt = [20:16], Rd = [15:11]), ALU; opcode = [31:26], funct = [5:0].]
17
A single cycle CPU demo: R-type instruction
[Datapath diagram: PC, +4, instruction memory, register file (Rs, Rt, Rd) and ALU.]
18
A multi cycle CPU capable of performing R-type instructions
[Datapath diagram: PC, instruction & data memory, IR, register file (Rs = IR[25:21], Rt = IR[20:16], Rd), A, B, ALU and ALUout.]
19
A multi cycle CPU capable of R-type instructions: fetch
[Same datapath diagram, fetch step highlighted (cycle 0 to 1).]
20
A multi cycle CPU capable of R-type instructions: decode
[Same datapath diagram, decode step highlighted (cycle 1 to 2).]
21
A multi cycle CPU capable of R-type instructions: execute
[Same datapath diagram, execute step highlighted (cycle 2 to 3).]
22
A multi cycle CPU capable of R-type instructions: write back
[Same datapath diagram, write back into Rd highlighted (cycle 3 to 4).]
23
Timing of an R-type instruction in a single cycle CPU and in a multi-cycle CPU
[Timing diagram. Single cycle: PC (0x400000), memory output (the instruction), Rs/Rt (ALU inputs), ALU output (data = result of the calculation) and GPR input settle within one CK through fetch, decode, execute and write back. Multi-cycle: PC, IR, A/B and ALUout are loaded in cycles 0 to 3, and cycle 4 is cycle 0 of the next instruction.]
24
An R-type instruction takes 4 CKs. At the rising edge of CK:
fetch:      IR = M(PC)   (IRWrite)
decode:     A = Rs, B = Rt
execute:    ALUout = A op B
write back: Rd = ALUout
The state diagram: IR=M(PC) -> A=Rs, B=Rt -> ALUout = A op B -> Rd=ALUout
[Timing diagram: PC, Mem data, IR, A/B (GPR outputs) and ALUout (ALU output) for the previous, current and next instruction.]
25
A multi-cycle CPU capable of R-type instructions (PC calc.)
[Datapath diagram: the R-type multi-cycle data path with a constant 4 added as an ALU input for the PC calculation.]
26
At the rising edge of CK: PC = PC+4 (PCWrite), ALUout = A op B, Rd = ALUout
[Timing diagram: PC changes from the current PC to the next PC = current PC+4 at the end of fetch; Mem data, IR, A/B (GPR outputs) and ALUout (ALU output) are shown for the previous, current and next instruction.]
27
A multi cycle CPU capable of R-type instructions: fetch
[Datapath diagram: during fetch the ALU adds 4 to the PC.]
28
The state diagram of a CPU capable of R-type instructions only
Fetch (0):  IR = M(PC), PC = PC+4
Decode (1): A = Rs, B = Rt
ALU (6):    ALUout = A op B   (R-type)
WBR (7):    Rd = ALUout
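The four states can be traced in software. The following is a minimal sketch, not the hardware itself: memory is modelled as a Python dict keyed by address, the register file as a list, and the function name `run_r_type` and the `FUNCT` table are hypothetical. The funct codes are the standard MIPS encodings for add, sub, and, or.

```python
import operator

# funct field (IR[5:0]) for a few R-type operations; standard MIPS encodings.
FUNCT = {0x20: operator.add, 0x22: operator.sub,
         0x24: operator.and_, 0x25: operator.or_}

def run_r_type(memory, regs, pc):
    """Walk one R-type instruction through states 0, 1, 6 and 7."""
    # Fetch (0): IR = M(PC), PC = PC + 4
    ir = memory[pc]
    pc = (pc + 4) & 0xFFFFFFFF
    # Decode (1): A = Rs, B = Rt
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    # ALU (6): ALUout = A op B
    alu_out = FUNCT[ir & 0x3F](a, b) & 0xFFFFFFFF
    # WBR (7): Rd = ALUout
    regs[(ir >> 11) & 0x1F] = alu_out
    return pc, 4  # next PC and the number of clock cycles used

memory = {0x400000: 0x00221820}  # add $3, $1, $2
regs = [0] * 32
regs[1], regs[2] = 7, 5
next_pc, cycles = run_r_type(memory, regs, 0x400000)
```

Each comment names the state whose register transfer the line models; in hardware every one of these steps ends at a rising clock edge.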
29
The state diagram of a CPU capable of R-type and lw instructions
States added for lw, entered from Decode (1):
AdrCmp (2): ALUout = A + sext(imm)
Load (3):   MDR = M(ALUout)
WB (4):     Rt = MDR
The R-type path Decode (1) -> ALU (6) -> WBR (7) is unchanged.
30
We added registers to "split" the instruction into 5 stages. Let's discuss the lw instruction.
[Datapath diagram: single cycle data path with the added registers IR, A, B, ALUout and MDR and the PCWrite enable; cycle boundaries numbered 0 to 5.]
In the single-cycle CPU we kept the "data flow" from left to right. Here we change that a little since, as we'll see, we use some parts of the CPU more than once during the same instruction. So we prefer to move the data memory. All parts related to lw only are blue.
31
First we draw a multi-cycle CPU capable of R-type & lw instructions:
[Datapath diagram: PC, instruction memory, data memory, IR, MDR, register file (Rs = IR[25:21], Rt = IR[20:16], Rd), A, B, 16->32 sign extension of IR[15:0], constant 4, ALU and ALUout.]
We just moved the data memory. All parts related to lw only are blue.
32
A multi-cycle CPU capable of R-type & lw instructions: fetch
[Same datapath diagram, fetch step highlighted.]
33
A multi-cycle CPU capable of R-type & lw instructions: decode
[Same datapath diagram, decode step highlighted.]
34
A multi-cycle CPU capable of R-type & lw instructions: AdrCmp
[Same datapath diagram, address computation step highlighted.]
35
A multi-cycle CPU capable of R-type & lw instructions: memory
[Same datapath diagram, memory read step highlighted.]
36
A multi-cycle CPU capable of R-type & lw instructions: WB
[Same datapath diagram, write back into Rt highlighted.]
37
Can we unite the instruction & data memories? (They are not used simultaneously, as they are in the single cycle CPU.)
[Datapath diagram: the R-type & lw data path with separate instruction and data memories.]
38
So here is a multi-cycle CPU capable of R-type & lw instructions using a single memory for instructions & data
[Datapath diagram: the same data path with one instruction & data memory.]
39
Timing of a lw instruction in a single cycle CPU and in a multi-cycle CPU
[Timing diagram. Single cycle: PC (0x400000), instruction memory data, Rs/Rt (ALU inputs), ALU output (address), data memory address and data settle within one CK. Multi-cycle: PC becomes PC+4 after fetch; IR, A/B, ALUout (data address) and MDR (data to Rt) are loaded in successive cycles.]
40
At the rising edge of CK:
fetch:      IR = M(PC), PC = PC+4   (PCWrite, IRWrite)
decode:     A = Rs, B = Rt
execute:    ALUout = A + sext(imm)
memory:     MDR = M(ALUout)
write back: Rt = MDR
[Timing diagram: PC, Mem data, IR, A/B (GPR outputs), ALUout (data address) and MDR (data to Rt) for the previous and current instruction.]
41
The state diagram of a CPU capable of R-type and lw instructions
Fetch (0):  IR = M(PC), PC = PC+4
Decode (1): A = Rs, B = Rt
ALU (6):    ALUout = A op B          (R-type)
WBR (7):    Rd = ALUout
AdrCmp (2): ALUout = A + sext(imm)   (lw)
Load (3):   MDR = M(ALUout)
WB (4):     Rt = MDR
42
A multi-cycle CPU capable of R-type & lw & sw instructions
[Datapath diagram: the single-memory multi-cycle data path; the lw and sw paths are marked.]
43
The state diagram of a CPU capable of R-type and lw and sw instructions
Fetch (0):  IR = M(PC), PC = PC+4
Decode (1): A = Rs, B = Rt
ALU (6):    ALUout = A op B          (R-type)
WBR (7):    Rd = ALUout
AdrCmp (2): ALUout = A + sext(imm)   (lw+sw)
Load (3):   MDR = M(ALUout)          (lw)
WB (4):     Rt = MDR
Store (5):  M(ALUout) = B            (sw)
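The lw and sw paths share the address computation and differ only afterwards. The sketch below models that in Python under the same simplifying assumptions as before (memory as a dict keyed by address, registers as a list); `sext16`, `run_lw` and `run_sw` are hypothetical names, and the Fetch step is assumed to have already placed the instruction word in `ir`.

```python
def sext16(value):
    """Sign-extend the low 16 bits of value (the Sext 16->32 block)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def run_lw(ir, regs, memory):
    """States after Fetch for lw: Decode, AdrCmp, Load, WB."""
    a = regs[(ir >> 21) & 0x1F]                # Decode (1): A = Rs
    alu_out = (a + sext16(ir)) & 0xFFFFFFFF    # AdrCmp (2): ALUout = A + sext(imm)
    mdr = memory[alu_out]                      # Load (3):   MDR = M(ALUout)
    regs[(ir >> 16) & 0x1F] = mdr              # WB (4):     Rt = MDR
    return 5                                   # clock cycles, including Fetch

def run_sw(ir, regs, memory):
    """States after Fetch for sw: Decode, AdrCmp, Store."""
    a = regs[(ir >> 21) & 0x1F]                # Decode (1): A = Rs
    b = regs[(ir >> 16) & 0x1F]                #             B = Rt
    alu_out = (a + sext16(ir)) & 0xFFFFFFFF    # AdrCmp (2): ALUout = A + sext(imm)
    memory[alu_out] = b                        # Store (5):  M(ALUout) = B
    return 4                                   # clock cycles, including Fetch

regs = [0] * 32
regs[1] = 0x1000
memory = {0x0FFC: 99}
lw_word = (0x23 << 26) | (1 << 21) | (2 << 16) | 0xFFFC   # lw $2, -4($1)
sw_word = (0x2B << 26) | (1 << 21) | (2 << 16) | 8        # sw $2, 8($1)
lw_cycles = run_lw(lw_word, regs, memory)
sw_cycles = run_sw(sw_word, regs, memory)
```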
44
A multi-cycle CPU capable of R-type & lw/sw & branch instructions
[Datapath diagram: the single-memory multi-cycle data path with a shift left by 2 after the sign extension.]
45
Adding the instruction beq to the state diagram:
Branch (8), entered from Decode (1) on beq: calc Rs - Rt (just to produce the zero signal)
If zero: calc PC = PC + sext(imm)<<2
If not zero: return to Fetch without changing the PC
[State diagram: Fetch (0), Decode (1), AdrCmp (2), Load (3), WB (4), Store (5), ALU (6), WBR (7), Branch (8).]
46
Adding the instruction beq to the state diagram, a more efficient way:
Let's use the decode state, in which the ALU is doing nothing, to compute the branch address. We'll have to store it for 1 more CK cycle, until we know whether to branch or not! (We store it in the ALUout reg.)
Decode (1): calc ALUout = PC + sext(imm)<<2
Branch (8): calc Rs - Rt. If zero, load the PC with the ALUout data, else do not load the PC
[State diagram: Fetch (0), Decode (1), AdrCmp (2), Load (3), WB (4), Store (5), ALU (6), WBR (7), Branch (8).]
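The more efficient beq sequence can be sketched as three steps in Python. This is an illustration under assumptions: `run_beq` and `sext16` are hypothetical names, the instruction word is passed in directly instead of being read from memory, and the PC used in Decode is taken to be the value already incremented by Fetch.

```python
def sext16(value):
    """Sign-extend the low 16 bits of value (the Sext 16->32 block)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def run_beq(ir, regs, pc):
    """Fetch, Decode and Branch for beq; returns (new PC, clock cycles)."""
    # Fetch (0): PC = PC + 4
    pc = (pc + 4) & 0xFFFFFFFF
    # Decode (1): A = Rs, B = Rt, and the idle ALU computes the branch address
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    alu_out = (pc + (sext16(ir) << 2)) & 0xFFFFFFFF
    # Branch (8): Rs - Rt produces the zero signal; load the PC only if zero
    zero = ((a - b) & 0xFFFFFFFF) == 0
    if zero:
        pc = alu_out
    return pc, 3

regs = [0] * 32
regs[1] = regs[2] = 42
beq_word = (0x04 << 26) | (1 << 21) | (2 << 16) | 3   # beq $1, $2, +3 words
taken_pc, cycles = run_beq(beq_word, regs, 0x400000)
regs[2] = 7
not_taken_pc, _ = run_beq(beq_word, regs, 0x400000)
```

Computing the target one cycle early costs nothing for other instructions, since ALUout is simply overwritten in their next state.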
47
A multi-cycle CPU capable of R-type & lw/sw & branch instructions
[Datapath diagram: the PC input selects between PC+4 and the branch address held in ALUout.]
48
Adding the instruction j to the state diagram:
Jump (9), entered from Decode (1) on j: PC = PC[31:28] || IR[25:0]<<2
[State diagram: Fetch (0), Decode (1), AdrCmp (2), Load (3), WB (4), Store (5), ALU (6), WBR (7), Branch (8), Jump (9).]
49
A multi-cycle CPU capable of R-type & lw/sw & branch & jump instructions
[Datapath diagram: the PC input selects between PC+4 (next address), the branch address, and the jump address IR[25:0]<<2 combined with PC[31:28].]
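The jump address is a pure bit concatenation, which a short Python sketch makes concrete. The function name `jump_target` is made up for the illustration; the masks follow directly from the field widths PC[31:28] and IR[25:0].

```python
def jump_target(pc, ir):
    """PC[31:28] || IR[25:0] << 2, as a 32-bit value."""
    upper = pc & 0xF0000000           # top 4 bits of the (already incremented) PC
    lower = (ir & 0x03FFFFFF) << 2    # 26-bit field shifted into a 28-bit byte offset
    return upper | lower

j_word = (0x02 << 26) | 0x100000      # j 0x00400000
target = jump_target(0x00400004, j_word)
```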
50
The phases (steps) of all instructions
[State diagram with the ten states numbered 0 to 9.]
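Counting the states each instruction class passes through gives its cycle count, and from that an average CPI. The sketch below is a worked example rather than a figure from the slides: the cycle counts are read off the state diagram (R-type 0-1-6-7, lw 0-1-2-3-4, sw 0-1-2-5, beq 0-1-8, j 0-1-9), the instruction mix is the one from the variable-clock scenario earlier, and the names are hypothetical.

```python
# States visited per class, read off the state diagram.
paths = {
    "R-type": [0, 1, 6, 7],
    "lw":     [0, 1, 2, 3, 4],
    "sw":     [0, 1, 2, 5],
    "branch": [0, 1, 8],
    "jump":   [0, 1, 9],
}
# Instruction mix from the variable-clock scenario.
mix = {"R-type": 0.44, "lw": 0.24, "sw": 0.12, "branch": 0.18, "jump": 0.02}

cycles = {name: len(path) for name, path in paths.items()}
average_cpi = sum(cycles[name] * mix[name] for name in cycles)
print(cycles, f"average CPI = {average_cpi:.2f}")
```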
51
MultiCycle implementation with Control
52
Final State Machine
53
The final state diagram:
Fetch (0) -> Decode (1) for every instruction
R-type: Decode -> ALU (6) -> WBR (7)
lw+sw:  Decode -> AdrCmp (2); lw: -> Load (3) -> WB (4); sw: -> Store (5)
beq:    Decode -> Branch (8)
j:      Decode -> Jump (9)
54
[Table: the control signal values in each state.]
55
Implementation: Finite State Machine for Control (The book's version)
56
The Control Finite State Machine:
[Block diagram: the next-state calculation logic takes Opcode = IR[31:26], zero, neg, etc. and the current state; the state register is clocked by ck; the outputs decoder turns the current state into the control signals.]
For 10 states coded 0-9, we need 4 bits, i.e., [S3,S2,S1,S0]
57
The control signals decoder
We just implement the table of slide 54:
Let's look at ALUSrcA: it is "0" in states 0 and 1 and it is "1" in states 2, 6 and 8. In all other states we don't care.
Let's look at PCWrite: it is "1" in states 0 and 9. In all other states it must be "0".
And so, we'll fill the table below and build the decoder.
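The two signals described above can be written as a small lookup, which is all the outputs decoder is. This is a sketch covering only ALUSrcA and PCWrite, the two signals whose values the text states; `None` stands for "don't care", and the names `ALU_SRC_A`, `PC_WRITE_STATES` and `decode_outputs` are hypothetical.

```python
# ALUSrcA per state; states not listed are "don't care".
ALU_SRC_A = {0: 0, 1: 0, 2: 1, 6: 1, 8: 1}
# PCWrite is 1 only in Fetch (0) and Jump (9); it must be 0 everywhere else.
PC_WRITE_STATES = {0, 9}

def decode_outputs(state):
    """Moore-style outputs: they depend on the current state only."""
    if not 0 <= state <= 9:
        raise ValueError(f"unknown state {state}")
    return {
        "ALUSrcA": ALU_SRC_A.get(state),          # None means don't care
        "PCWrite": int(state in PC_WRITE_STATES),
    }

table = {state: decode_outputs(state) for state in range(10)}
```

A don't care gives the logic minimizer freedom, whereas PCWrite has no don't cares because a stray 1 would corrupt the PC.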
58
The state machine "next state calc." logic
R-type=000000, lw=100011, sw=101011, beq=000100, bne=000101, lui=001111, j=000010, jal=000011, addi=001000
States: Fetch 0, Decode 1, AdrCmp 2, Load 3, WB 4, Store 5, ALU 6, WBR 7, Branch 8, Jump 9

opcode (IR31..IR26)   current state (S3S2S1S0)   next state (S3S2S1S0)
XXXXXX                0000                       0001
000000                0001                       0110                    R-type
XX0XXX                0010                       0011                    lw
XX1XXX                0010                       0101                    sw
1XXXXX                0001                       0010                    lw+sw
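The next-state logic can be prototyped as a function of the current state and the opcode before it is reduced to gates. The sketch below uses the state numbers and opcodes listed above and covers R-type, lw, sw, beq and j only; the function and constant names are hypothetical, and it uses full opcode comparisons rather than the minimized don't-care patterns of the table.

```python
FETCH, DECODE, ADRCMP, LOAD, WB, STORE, ALU, WBR, BRANCH, JUMP = range(10)

R_TYPE, LW, SW, BEQ, J = 0b000000, 0b100011, 0b101011, 0b000100, 0b000010

def next_state(state, opcode):
    """Next state from the current state and Opcode = IR[31:26]."""
    if state == FETCH:
        return DECODE                      # every instruction is decoded next
    if state == DECODE:
        return {R_TYPE: ALU, LW: ADRCMP, SW: ADRCMP,
                BEQ: BRANCH, J: JUMP}[opcode]
    if state == ADRCMP:
        return LOAD if opcode == LW else STORE
    if state == LOAD:
        return WB
    if state == ALU:
        return WBR
    return FETCH                           # WB, Store, WBR, Branch, Jump

def trace(opcode):
    """States visited by one instruction, starting at Fetch."""
    states = [FETCH]
    while True:
        following = next_state(states[-1], opcode)
        if following == FETCH:
            return states
        states.append(following)
```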
59
The Control Finite State Machine:
[Block diagram: next-state calculation logic (inputs: Opcode = IR[31:26] and the current state), state register clocked by ck, outputs decoder producing the control signals. The control signals taken directly from the decoder form a Moore machine. The write signal to the PC combines PCWrite, PCWriteCond and zero, which makes that part a Mealy machine.]
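The PC write signal is the one place where an output depends on an input (zero) as well as on the state. A minimal sketch, assuming the usual textbook gating in which PCWriteCond is ANDed with zero and the result is ORed with PCWrite; the function name is hypothetical.

```python
def pc_write_enable(pc_write, pc_write_cond, zero):
    """Write enable of the PC: unconditional write, or conditional write when zero is set."""
    return bool(pc_write or (pc_write_cond and zero))

# Branch state: PCWriteCond = 1, PCWrite = 0, so the PC loads only if Rs - Rt == 0.
branch_taken = pc_write_enable(0, 1, 1)
branch_not_taken = pc_write_enable(0, 1, 0)
```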
60
Microprogramming
61
Microinstruction
62
Microinstruction format
63
Interrupt and exception

Type of event                               From where?   MIPS terminology
I/O device request                          External      Interrupt
Invoke operating system from user program   Internal      Exception
Arithmetic overflow                         Internal      Exception
Using an undefined instruction              Internal      Exception
Hardware malfunctions                       Either        Exception or interrupt
64
Exceptions handling

Exception type          Exception vector address (in hex)
Undefined instruction   c0 00 00 00
Arithmetic overflow     c0 00 00 20

We have 2 ways to handle exceptions: Cause register or Vectored interrupts
MIPS - Cause register
65
Handling exceptions
[Diagram; surviving label: 10]
66
Handling exceptions
67
Handling interrupts:
[State diagram: the ten states Fetch (0) to Jump (9), plus SavePC (10) and JumpInt (11), entered from Fetch on int, and IRET, entered from Decode on iret.]
68
Handling an interrupt: remembering it in a FF until it is serviced
[Circuit diagram: a D flip-flop with D tied to "1", driven by the interrupt source; signals irq, eint, int (to the state machine) and clr_irq~.]
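The behaviour of such a latch can be modelled in a few lines of Python. The wiring here is an assumption, since only the signal names survive from the diagram: the interrupt source is taken to clock a "1" into the flip-flop (irq), int is taken to be irq gated by the enable eint, and clr_irq~ is taken to be an active-low clear applied once the interrupt is serviced. The class and method names are hypothetical.

```python
class InterruptLatch:
    """Software model of the FF that remembers a pending interrupt (assumed wiring)."""

    def __init__(self):
        self.irq = 0

    def source_edge(self):
        """Rising edge from the interrupt source clocks D = 1 into the FF."""
        self.irq = 1

    def clear(self, clr_irq_n):
        """clr_irq~ is active low: driving it to 0 clears the pending request."""
        if clr_irq_n == 0:
            self.irq = 0

    def int_signal(self, eint):
        """int (to the state machine): a pending request, if interrupts are enabled."""
        return self.irq & eint

latch = InterruptLatch()
latch.source_edge()
masked = latch.int_signal(eint=0)     # pending but disabled
pending = latch.int_signal(eint=1)    # pending and enabled
latch.clear(clr_irq_n=0)              # serviced
after_service = latch.int_signal(eint=1)
```

The point of the latch is that a short pulse from the source is not lost while the state machine is in the middle of an instruction.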
69
Jumping to the interrupt routine (at C0000000) on an interrupt; returning from the interrupt with Iret
[Timing diagram.]
70
Jumping to the interrupt routine (at C0000000) on an interrupt; returning from the interrupt with Iret
[Timing diagram with the irq and eint signals switching between 0 and 1.]
71
The state machine in action during interrupt
Normal instruction:            Fetch > decode > ex > wb
Interrupt accepted:            Fetch > Save_PC > JumpInt (to C0000000)
End of the interrupt routine:  Fetch > decode > Iret
72
End of multi-cycle implementation