CS161 – Design and Architecture of Computer Systems


1 CS161 – Design and Architecture of Computer Systems
Multi-Cycle CPU Design

2 Quick recap: Single Cycle Implementation
Calculate cycle time assuming negligible delays except: memory (2ns), ALU and adders (2ns), register file access (1ns)

3 Single Cycle – Steps of each instruction
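Inst. Type   Functional Units Used
R-type       Instruction fetch, Register read, ALU, Register write
Load         Instruction fetch, Register read, ALU, Memory access, Register write
Store        Instruction fetch, Register read, ALU, Memory access
Branch       Instruction fetch, Register read, ALU
Jump         Instruction fetch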

4 Single Cycle – How long is the cycle?
Inst. Type   Inst. Mem.   Reg. File (read)   ALU(s)   Data Mem.   Reg. File (write)   Total   Inst. %
R-type       2            1                  2        -           1                   6 ns    44
Load         2            1                  2        2           1                   8 ns    24
Store        2            1                  2        2           -                   7 ns    12
Branch       2            1                  2        -           -                   5 ns    18
Jump         2            -                  -        -           -                   2 ns    2
The cycle time must accommodate the longest operation: lw. Cycle time = 8 ns, and CPI = 1.
Suppose instead we could accommodate a variable number of cycles for each instruction, with a cycle time of 1 ns:
CPI = 6*44% + 8*24% + 7*12% + 5*18% + 2*2% = 6.34 ≈ 6.3
That is about 6.3 ns per instruction on average, instead of 8 ns.
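The comparison on this slide can be checked numerically. The following is a minimal sketch that uses the latencies and instruction mix from the table; the variable names are illustrative and not from the slides.

```python
# Per-instruction latency (ns) and instruction mix, taken from the table above.
latency_ns = {"R-type": 6, "Load": 8, "Store": 7, "Branch": 5, "Jump": 2}
mix = {"R-type": 0.44, "Load": 0.24, "Store": 0.12, "Branch": 0.18, "Jump": 0.02}

# Single-cycle design: every instruction pays for the slowest one (lw).
single_cycle_ns = max(latency_ns.values())

# Variable-length design with a 1 ns clock: each instruction takes as many
# 1 ns cycles as its latency, so the weighted sum is both the CPI and the
# average time per instruction in ns.
variable_cpi = sum(latency_ns[k] * mix[k] for k in latency_ns)

print(f"single-cycle: {single_cycle_ns} ns per instruction")
print(f"variable:     {variable_cpi:.2f} cycles of 1 ns on average")
print(f"speedup:      {single_cycle_ns / variable_cpi:.2f}x")
```

The weighted sum comes out at 6.34, which the slide rounds to 6.3.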

5 Single cycle execution problem
The cycle time depends on the most time-consuming instruction
  What happens if we implement a more complex instruction, e.g., a floating-point multiplication?
All resources are simultaneously active – there is no sharing of resources (waste of area and potentially a slower clock)
We’ll adopt a multi-cycle solution which allows us to
  Use a faster clock
  Reduce physical resources
  Determine clock cycle time independently of instruction processing time
Each instruction takes as many clock cycles as it needs (different instructions take different numbers of cycles)
  Multiple state transitions per instruction
  The states followed by each instruction are different

6 Multicycle Approach
We will be reusing functional units
  ALU used to compute address and to increment PC
  Memory used for instruction and data
At the end of a cycle, keep results in registers
  Additional registers
Now, control signals are NOT solely determined by the instruction bits!
Controls will be generated by a Finite State Machine (FSM)!

7 Review: finite state machines
A finite state machine consists of:
  a set of states
  a next-state function (determined by current state and the input)
  an output function (determined by current state and possibly input)
We’ll use a Moore machine (output based only on current state)

8 Multicycle Approach
Break up the instructions into steps, each step takes a cycle
  balance the amount of work to be done
  restrict each cycle to use only one major functional unit
At the end of a cycle
  store values for use in later cycles (easiest thing to do)
  introduce additional “internal” registers

9 Multicycle Datapath

10 Five Execution Steps
1. Instruction Fetch (IF)
2. Instruction Decode and Register Fetch (ID/RF)
3. Execution, Memory Address Computation, or Branch Completion (EX)
4. Memory Access or R-type instruction completion (M)
5. Write-back step (WB)
What is CPI in multicycle machines? INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!
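One way to approach the CPI question is to weight the cycle count of each instruction class by how often it occurs. The sketch below reuses the instruction mix from the single-cycle recap. The per-class cycle counts are an assumption read off the step descriptions in this deck (loads use all five steps, R-type and stores finish in step 4, branches and jumps finish in step 3).

```python
# Assumed cycles per instruction class (see lead-in), not stated as a table in the slides.
cycles = {"R-type": 4, "Load": 5, "Store": 4, "Branch": 3, "Jump": 3}

# Instruction mix from the single-cycle recap slide.
mix = {"R-type": 0.44, "Load": 0.24, "Store": 0.12, "Branch": 0.18, "Jump": 0.02}

# Average cycles per instruction for this particular mix.
cpi = sum(cycles[k] * mix[k] for k in cycles)

print(f"multicycle CPI for this mix: {cpi:.2f}")
```

The CPI depends on the workload: a program with more loads moves the average toward 5, and one dominated by branches and jumps moves it toward 3.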

11 Step 1: Instruction Fetch
Access memory w/ PC to fetch instruction and store it in the Instruction Register (IR)
Increment PC by 4 using the ALU and put the result back in the PC
  We can do this because the ALU is not busy in this cycle
Can be described concisely using RTL, "Register-Transfer Language":
  IR = Memory[PC];
  PC = PC + 4;

12 Datapath of Multicycle Implementation

13 Step 2: Instruction Decode & Register Fetch
Read registers rs and rt
  We read both of them regardless of necessity
Compute the branch address in case the instruction is a branch
  We can do this because the ALU is not busy
  ALUOut will keep the target address
RTL:
  A = Reg[IR[25-21]];
  B = Reg[IR[20-16]];
  ALUOut = PC + (sign-extend(IR[15-0]) << 2);
We have not set any control signal based on the instruction type yet
  We are busy "decoding" it in our control logic
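The RTL for this step can be tried out in software. This is a minimal sketch, not the hardware: the helper names (`bits`, `sign_extend16`, `decode_step`) are hypothetical, registers are modelled as a Python list, and the PC is assumed to have already been incremented by 4 in step 1.

```python
def bits(word, hi, lo):
    """Return bits hi..lo of word (inclusive), as in the RTL notation IR[hi-lo]."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def sign_extend16(value):
    """Interpret a 16-bit field as a signed integer."""
    return value - 0x10000 if value & 0x8000 else value

def decode_step(ir, pc, reg):
    """Step 2: read rs and rt, and speculatively compute the branch target."""
    a = reg[bits(ir, 25, 21)]          # A = Reg[IR[25-21]]
    b = reg[bits(ir, 20, 16)]          # B = Reg[IR[20-16]]
    offset = sign_extend16(bits(ir, 15, 0)) << 2
    alu_out = (pc + offset) & 0xFFFFFFFF
    return a, b, alu_out

# Example: beq $t2, $t3 with a 16-bit offset of -2 words.
# MIPS encoding: opcode 4, rs = 10 ($t2), rt = 11 ($t3).
ir = (4 << 26) | (10 << 21) | (11 << 16) | 0xFFFE
reg = [0] * 32
reg[10], reg[11] = 7, 9
a, b, alu_out = decode_step(ir, pc=0x104, reg=reg)
print(a, b, hex(alu_out))
```

Nothing here depends on the opcode, which mirrors the point on the slide: the same work is done for every instruction while the control logic is still decoding.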

14 Single Cycle Implementation

15 Datapath of Multicycle Implementation

16 Step 3 (instruction dependent)
ALU is performing one of three functions, based on instruction type
  Memory Reference: ALUOut = A + sign-extend(IR[15-0]);
  R-type: ALUOut = A op B;
  Branch: if (A==B) PC = ALUOut;
Jumps also complete in this step, without an ALU operation
  Jump: PC = {PC[31:28], IR[25:0], 2'b00};
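The four cases of step 3 can be written as one function that dispatches on the instruction class. This is a sketch under assumptions: `execute_step` and its `kind` strings are hypothetical names, and the R-type operation is passed in as a Python callable rather than decoded from the funct field.

```python
def execute_step(kind, a, b, ir, pc, alu_out, op=None):
    """Step 3. Returns (new_pc, new_alu_out) for the given instruction class."""
    if kind == "mem":
        # ALUOut = A + sign-extend(IR[15-0])
        imm = ir & 0xFFFF
        imm = imm - 0x10000 if imm & 0x8000 else imm
        return pc, (a + imm) & 0xFFFFFFFF
    if kind == "rtype":
        # ALUOut = A op B
        return pc, op(a, b) & 0xFFFFFFFF
    if kind == "branch":
        # if (A == B) PC = ALUOut  (ALUOut holds the target from step 2)
        return (alu_out if a == b else pc), alu_out
    if kind == "jump":
        # PC = {PC[31:28], IR[25:0], 2'b00}
        return (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2), alu_out
    raise ValueError(f"unknown instruction class: {kind}")

# Jump example: opcode 2 (j) with a 26-bit target field of 0x100.
j_ir = (2 << 26) | 0x100
new_pc, _ = execute_step("jump", 0, 0, j_ir, pc=0x10000004, alu_out=0)
print(hex(new_pc))

# Taken and not-taken branch examples.
taken_pc, _ = execute_step("branch", 5, 5, 0, pc=0x104, alu_out=0xFC)
fallthrough_pc, _ = execute_step("branch", 5, 6, 0, pc=0x104, alu_out=0xFC)
print(hex(taken_pc), hex(fallthrough_pc))
```

Branches and jumps update the PC here and are finished, which is why they take only three cycles.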

17 Datapath of Multicycle Implementation

18 Step 4 (R-type or memory-access)
Loads and stores access memory
  MDR = Memory[ALUOut]; or
  Memory[ALUOut] = B;
R-type instructions finish
  Reg[IR[15-11]] = ALUOut;

19 Datapath of Multicycle Implementation

20 Step 5: Write-back step
Reg[IR[20-16]] = MDR;
What about all the other instructions?
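Only loads reach step 5; going by the earlier step descriptions, the other instruction classes have already finished in step 3 or step 4. To see the five steps together, here is a minimal sketch that walks a single lw through them. The `run_lw` function and the dictionary-based state are illustrative assumptions; memory is modelled as a Python dict keyed by byte address.

```python
def run_lw(state):
    """Run one lw through the five steps; state holds PC, Reg, and Memory."""
    mem, reg = state["Memory"], state["Reg"]

    # Step 1 (IF): IR = Memory[PC]; PC = PC + 4
    ir = mem[state["PC"]]
    state["PC"] += 4

    # Step 2 (ID/RF): A = Reg[IR[25-21]]
    # (B and the speculative branch target are also produced, but lw ignores them.)
    a = reg[(ir >> 21) & 0x1F]

    # Step 3 (EX): ALUOut = A + sign-extend(IR[15-0])
    imm = ir & 0xFFFF
    imm = imm - 0x10000 if imm & 0x8000 else imm
    alu_out = a + imm

    # Step 4 (M): MDR = Memory[ALUOut]
    mdr = mem[alu_out]

    # Step 5 (WB): Reg[IR[20-16]] = MDR
    reg[(ir >> 16) & 0x1F] = mdr
    return state

# lw $t2, 0($t3): opcode 35, rs = 11 ($t3), rt = 10 ($t2), offset 0.
lw_ir = (35 << 26) | (11 << 21) | (10 << 16) | 0
state = {"PC": 0, "Reg": [0] * 32, "Memory": {0: lw_ir, 0x100: 42}}
state["Reg"][11] = 0x100
run_lw(state)
print(state["PC"], state["Reg"][10])
```

Each commented block corresponds to one clock cycle, and the local variables `ir`, `a`, `alu_out`, and `mdr` play the role of the internal registers that carry values from one cycle to the next.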

21 Summary:

22 Multicycle datapath & control

23 Example: lw, 1st cycle

24 Example: lw, 1st cycle

25 Example: lw, 2nd cycle

26 Example: lw, 2nd cycle

27 Example: lw, 3rd cycle

28 Example: lw, 3rd cycle

29 Example: lw, 4th cycle

30 Example: lw, 4th cycle

31 Example: lw, 5th cycle

32 Example: lw, 5th cycle

33 Example: J, 1st cycle

34 Example: J, 1st cycle

35 Example: J, 2nd cycle

36 Example: J, 2nd cycle

37 Example: J, 3rd cycle

38 Example: J, 3rd cycle

39 A Simple Example
How many cycles will it take to execute this code?
        lw  $t2, 0($t3)
        lw  $t3, 4($t3)
        beq $t2, $t3, Label   # assume not taken
        add $t5, $t2, $t3
        sw  $t5, 8($t3)
Label:  ...
What is going on during the 8th cycle of execution?
In what cycle does the actual addition of $t2 and $t3 take place?
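These questions can be worked out by laying the instructions end to end. The sketch below assumes the per-instruction cycle counts implied by the five steps (lw 5, sw 4, add 4, beq 3) and that the branch falls through, so all five instructions execute; `schedule` and `at_cycle` are hypothetical helper names.

```python
# Assumed cycle counts per mnemonic (see lead-in).
CYCLES = {"lw": 5, "sw": 4, "add": 4, "beq": 3}
program = ["lw", "lw", "beq", "add", "sw"]

def schedule(program):
    """Return (mnemonic, first_cycle, last_cycle) per instruction, 1-indexed."""
    out, start = [], 1
    for op in program:
        end = start + CYCLES[op] - 1
        out.append((op, start, end))
        start = end + 1
    return out

def at_cycle(sched, n):
    """Return (instruction index, mnemonic, step number) active in cycle n."""
    for i, (op, first, last) in enumerate(sched):
        if first <= n <= last:
            return i, op, n - first + 1
    return None

sched = schedule(program)
total = sched[-1][2]
add_ex_cycle = sched[3][1] + 2   # step 3 (EX) of the add

print("total cycles:", total)
print("cycle 8:", at_cycle(sched, 8))
print("add executes in cycle:", add_ex_cycle)
```

Under these assumptions the code takes 21 cycles, cycle 8 is step 3 of the second lw (its address computation), and the addition of $t2 and $t3 happens in cycle 16.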

40 Implementing the Control
Value of control signals is dependent upon:
  what instruction is being executed
  which step is being performed
Use the information we’ve accumulated to specify a finite state machine
  specify the finite state machine graphically, or
  use microprogramming

41 Graphical Specification of FSM
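States:
  0  Instruction Fetch (Start)
  1  Decode
  2  Mem. Address Compute
  3  Mem. Access: Read
  4  Write-Back
  5  Mem. Access: Write
  6  ALU Execute
  7  Complete R-type
  8  Branch
  9  Jump
Transitions:
  0 -> 1
  1 -> 2   Op = LW or SW
  1 -> 6   Op = R-type
  1 -> 8   Op = BEQ
  1 -> 9   Op = J
  2 -> 3   Op = LW
  2 -> 5   Op = SW
  3 -> 4
  6 -> 7
  4, 5, 7, 8, 9 -> 0
How many state bits will we need?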
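The state diagram can be expressed as a next-state table and traced per opcode. This is a sketch under assumptions: it uses the usual numbering with Instruction Fetch as state 0 and memory write as state 5, and treats every state without a listed successor as returning to state 0. The names `NEXT` and `trace` are illustrative.

```python
# Next-state function: state -> function of the opcode class.
NEXT = {
    0: lambda op: 1,
    1: lambda op: {"LW": 2, "SW": 2, "R-type": 6, "BEQ": 8, "J": 9}[op],
    2: lambda op: {"LW": 3, "SW": 5}[op],
    3: lambda op: 4,
    6: lambda op: 7,
}

def trace(op):
    """Return the list of states visited by one instruction of class op."""
    state, path = 0, [0]
    while True:
        # States 4, 5, 7, 8, 9 have no entry and fall back to state 0.
        state = NEXT.get(state, lambda _op: 0)(op)
        if state == 0:
            return path
        path.append(state)

NUM_STATES = 10
state_bits = (NUM_STATES - 1).bit_length()   # smallest width that can encode 0..9

for op in ["LW", "SW", "R-type", "BEQ", "J"]:
    print(op, trace(op))
print("state bits:", state_bits)
```

The length of each trace is the cycle count of that instruction class, and ten states need four state bits.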

42 Detailed FSM

43 Multi-cycle examples

44 JAL
Instruction: Jal #label
Interpretation:
  $ra = PC ($ra is register 31)
  PC = #label

45

46

47

48 (Op = ‘JAL’)
PCWrite
PCSource = 10
RegDest = 10
MemToReg = 10
RegWrite

49 LWI
Similar to HW2, but for Multi-cycle
Instruction: LWI Rt, Rd(Rs)
* Note: LWI is an R-type instruction
Interpretation: Reg[Rt] = Mem[Reg[Rd] + Reg[Rs]]

50

51

52

53 Need new register fetch stage to read rs, rd
RegSrc = 1
LWI is R-type
Can reuse state to calculate address
(Op = ‘LWI’)
Write to register rt

54 SWAP
Instruction: Swap Rs, Rt
Assume R-type instruction where Rs is in Rd field: Swap Rs, Rs, Rt
Interpretation (both right-hand sides use the values read before either write):
  Reg[Rs] = Reg[Rt]
  Reg[Rt] = Reg[Rs]

55

56

57

58 (Op = ‘SWAP’)
Reg[Rs] = Reg[Rt]: RegDst = 1, RegWrite, MemtoReg = 3
Reg[Rt] = Reg[Rs]: RegDst = 0, RegWrite, MemtoReg = 2
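The two write states only produce a swap if the second write still sees the old value of Rs. A plausible reading, sketched below, is that both operands are latched into the internal registers A and B during decode, and each write state then selects one of them. That the new MemtoReg values 3 and 2 select B and A respectively is an assumption, not something the slide states; `swap` is a hypothetical function name.

```python
def swap(reg, rs, rt):
    """Model the SWAP instruction as a decode step followed by two write states."""
    # Decode step: latch both operands into the internal registers A and B.
    a, b = reg[rs], reg[rt]
    # First write state: Reg[Rs] = old Reg[Rt] (held in B).
    reg[rs] = b
    # Second write state: Reg[Rt] = old Reg[Rs] (held in A, unaffected by the first write).
    reg[rt] = a
    return reg

reg = [0] * 32
reg[8], reg[9] = 111, 222
swap(reg, 8, 9)
print(reg[8], reg[9])
```

Reading Reg[Rs] again from the register file in the second state would return the value just written, so the latched copy is what makes the two-state sequence work.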

