Presentation is loading. Please wait.

Presentation is loading. Please wait.

The CPU Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making) Datapath: portion of the processor.

Similar presentations


Presentation on theme: "The CPU Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making) Datapath: portion of the processor."— Presentation transcript:

1

2 The CPU Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making) Datapath: portion of the processor which contains hardware necessary to perform operations required by the processor (the brawn) Control: portion of the processor (also in hardware) which tells the datapath what needs to be done (the brain)

3 The CPU Our implementation of the MIPS is simplified
memory-reference instructions: lw, sw arithmetic-logical instructions: add, sub, and, or, slt control flow instructions: beq, j Generic implementation use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC) decode the instruction (and read registers) execute the instruction All instructions (except j) use the ALU after reading the registers Fetch PC = PC+4 Decode Exec

4 A Small Set of Instructions
MIPS instruction formats and naming of the various fields. We will refer to this diagram later Seven R-format ALU instructions (add, sub, slt, and, or, xor, nor) Six I-format ALU instructions (lui, addi, slti, andi, ori, xori) Two I-format memory access instructions (lw, sw) Three I-format conditional branch instructions (bltz, beq, bne) Four unconditional jump instructions (j, jr, jal, syscall)

5 The MIPS Instruction Set
op 15 8 10 12 13 14 35 43 2 1 4 5 3 fn 32 34 42 36 37 38 39 Instruction Usage Load upper immediate lui rt,imm Add  add rd,rs,rt Subtract sub rd,rs,rt Set less than slt rd,rs,rt Add immediate  addi rt,rs,imm Set less than immediate slti rd,rs,imm AND and rd,rs,rt OR or rd,rs,rt XOR xor rd,rs,rt NOR nor rd,rs,rt AND immediate andi rt,rs,imm OR immediate ori rt,rs,imm XOR immediate xori rt,rs,imm Load word lw rt,imm(rs) Store word sw rt,imm(rs) Jump  j L Jump register jr rs Branch less than 0 bltz rs,L Branch equal beq rs,rt,L Branch not equal  bne rs,rt,L Jump and link jal L System call  syscall The MIPS Instruction Set Copy Arithmetic Logic Memory access Control transfer

6 Fetching Instructions
Fetching instructions involves reading the instruction from the Instruction Memory updating the PC to hold the address of the next instruction PC is updated every cycle, so it does not need an explicit write control signal Instruction Memory is read every cycle, so it doesn’t need an explicit read control signal Add 4 Instruction Memory Read Address PC Instruction

7 Decoding Instructions
Decoding instructions involves sending the fetched instruction’s opcode and function field bits to the control unit reading two values from the Register File Register File addresses are contained in the instruction Control Unit Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Data 2 Instruction

8 Executing R Format Operations
R format operations (add,sub,slt,and,or) perform the (op and funct) operation on values in rs and rt store the result back into the Register File (into location rd) The Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File R-type: 31 25 20 15 5 op rs rt rd funct shamt 10 RegWrite ALU control Read Addr 1 Read Data 1 Register File Read Addr 2 overflow Instruction zero ALU Write Addr Read Data 2 Write Data

9 Executing Load and Store Operations
Load and store operations involve compute memory address by adding the base register (read from the Register File during decode) to the 16-bit signed-extended offset field in the instruction store value (read from the Register File during decode) written to the Data Memory load value, read from the Data Memory, written to the Register File Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Data 2 ALU overflow zero ALU control RegWrite Data Memory Address Read Data Sign Extend MemWrite MemRead 16 32

10 Executing Branch Operations
Add Branch target address Add 4 Shift left 2 ALU control PC zero (to branch control logic) Read Addr 1 Read Data 1 Register File Branch operations involves compare the operands read from the Register File during decode for equality (zero ALU output) compute the branch target address by adding the updated PC to the 16-bit signed-extended offset field in the instr Read Addr 2 Instruction ALU Write Addr Read Data 2 Write Data Sign Extend 16 32

11 Jump and Branch Instructions
Unconditional jump and jump through register instructions j verify # go to mem loc named “verify” jr $ra # go to address that is in $ra; # $ra may hold a return address $ra is the symbolic name for reg. $31 (return address) (incremented) The jump instruction j of MiniMIPS is a J-type instruction which is shown along with how its effective target address is obtained. The jump register (jr) instruction is R-type, with its specified register often being $ra.

12 Executing Jump Operations
Jump operation involves replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits Add 4 4 Jump address Instruction Memory Shift left 2 28 Read Address PC Instruction 26

13 Creating a Single Datapath from the Parts
Assemble the datapath segments and add control lines and multiplexors as needed Single cycle design – fetch, decode and execute each instructions in one clock cycle no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., separate Instruction Memory and Data Memory, several adders) multiplexors needed at the input of shared elements with control lines to do the selection write signals to control writing to the Register File and Data Memory Cycle time is determined by length of the longest path

14 Morgan Kaufmann Publishers
24 April, 2017 The Main Control Unit Control signals derived from instruction R-type rs rt rd shamt funct 31:26 5:0 25:21 20:16 15:11 10:6 Load/ Store 35 or 43 rs rt address 31:26 25:21 20:16 15:0 4 rs rt address 31:26 25:21 20:16 15:0 Branch opcode always read read, except for load write for R-type and load sign-extend and add Chapter 4 — The Processor

15 Single Cycle Datapath with Control Unit
Add Add 1 4 Shift left 2 PCSrc ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Register File Instr[20-16] zero Read Addr 2 Data Memory Read Address PC Instr[31-0] Read Data 1 ALU Write Addr Read Data 2 1 Write Data Instr[ ] Write Data 1 Instr[15-0] Sign Extend ALU control 16 32 Instr[5-0]

16 R-type Instruction Data/Control Flow
Add Add 1 4 Shift left 2 PCSrc ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Register File zero Instr[20-16] Read Addr 2 Data Memory Read Address PC Instr[31-0] Read Data ALU 1 Write Addr Read Data 2 1 Write Data Write Data Instr[ ] 1 Instr[5-0] Sign Extend ALU control 16 32 Instr[5-0]

17 Deriving the Control Signals
Control signals for the single-cycle MicroMIPS implementation. Control signal 1 2 3 RegWrite Don’t write Write RegDst rt rd $31 MemtoReg Data out ALU out IncrPC ALUSrc (rt ) imm AddSub Add Subtract LogicFn1, LogicFn0 AND OR XOR NOR FnClass1, FnClass0 lui Set less Arithmetic Logic MemRead Don’t read Read MemWrite BrType1, BrType0 No branch beq bne bltz PCSrc jta (rs) SysCallAddr Reg file ALUop Data cache Next addr

18 Load Word Instruction Data/Control Flow
Add 1 Add 4 Shift left 2 PCSrc ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Register File Instr[20-16] zero Read Addr 2 Read Address Data Memory PC Instr[31-0] Read Data 1 ALU Write Addr Read Data 2 1 Write Data Instr[ ] Write Data 1 Instr[15-0] Store Word Instruction? Sign Extend ALU control 16 32 Instr[5-0]

19 Deriving the Control Signals
Control signals for the single-cycle MicroMIPS implementation. Control signal 1 2 3 RegWrite Don’t write Write RegDst rt rd $31 MemtoReg Data out ALU out IncrPC ALUSrc (rt ) imm AddSub Add Subtract LogicFn1, LogicFn0 AND OR XOR NOR FnClass1, FnClass0 lui Set less Arithmetic Logic MemRead Don’t read Read MemWrite BrType1, BrType0 No branch beq bne bltz PCSrc jta (rs) SysCallAddr Reg file ALUop Data cache Next addr

20 Branch Instruction Data/Control Flow
Add Add 1 4 Shift left 2 PCSrc ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Instr[20-16] Register File zero Read Addr 2 Data Memory Read Address PC Instr[31-0] Read Data 1 ALU Write Addr Read Data 2 1 Write Data Instr[ ] Write Data 1 Instr[15-0] Sign Extend ALU control 16 32 Instr[5-0]

21 Deriving the Control Signals
Control signals for the single-cycle MicroMIPS implementation. Control signal 1 2 3 RegWrite Don’t write Write RegDst rt rd $31 MemtoReg Data out ALU out IncrPC ALUSrc (rt ) imm AddSub Add Subtract LogicFn1, LogicFn0 AND OR XOR NOR FnClass1, FnClass0 lui Set less Arithmetic Logic MemRead Don’t read Read MemWrite BrType1, BrType0 No branch beq bne bltz PCSrc jta (rs) SysCallAddr Reg file ALUop Data cache Next addr

22 Jump Operation? 1 ovf zero 1 1 1 Add Add 4 Shift left 2 PCSrc ALUOp
Add Add 1 4 Shift left 2 PCSrc ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Register File Instr[20-16] zero Read Addr 2 Data Memory Read Address PC Instr[31-0] Read Data 1 ALU Write Addr Read Data 2 1 Write Data Instr[ ] Write Data 1 Instr[15-0] Sign Extend ALU control 16 32 Instr[5-0]

23 Adding the Jump Operation
Instr[25-0] 1 Shift left 2 28 32 26 PC+4[31-28] Add Add 1 4 Shift left 2 PCSrc Jump ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Register File Instr[20-16] zero Read Addr 2 Data Memory Read Address PC Instr[31-0] Read Data 1 ALU Write Addr Read Data 2 1 Write Data Instr[ ] Write Data 1 Instr[15-0] Sign Extend ALU control 16 32 Instr[5-0]

24 Can the Control Unit set all the control signals based solely on the opcode field of the instruction? Instr[25-0] 1 Shift left 2 28 32 26 PC+4[31-28] Add Add 1 4 Shift left 2 PCSrc Jump ALUOp Branch MemRead Instr[31-26] Control Unit MemtoReg MemWrite ALUSrc RegWrite RegDst ovf Instr[25-21] Read Addr 1 Instruction Memory Read Data 1 Address Register File Instr[20-16] zero Read Addr 2 Data Memory Read Address PC Instr[31-0] Read Data 1 ALU Write Addr Read Data 2 1 Write Data Instr[ ] Write Data 1 Instr[15-0] Sign Extend ALU control 16 32 Instr[5-0]

25 Control Signal Settings

26 Control Signal Generation
Auxiliary signals identifying instruction classes arithInst = addInst  subInst  sltInst  addiInst  sltiInst logicInst = andInst  orInst  xorInst  norInst  andiInst  oriInst  xoriInst immInst = luiInst  addiInst  sltiInst  andiInst  oriInst  xoriInst Example logic expressions for control signals RegWrite = luiInst  arithInst  logicInst  lwInst  jalInst ALUSrc = immInst  lwInst  swInst AddSub = subInst  sltInst  sltiInst DataRead = lwInst PCSrc0 = jInst  jalInst  syscallInst Control addInst subInst jInst sltInst .

27 13.6 Performance of the Single-Cycle Design
An example combinational-logic data path to compute z := (u + v)(w – x) / y Add/Sub latency 2 ns Multiply latency 6 ns Divide latency 15 ns Total latency 23 ns / + - y u v w x z Note that the divider gets its correct inputs after 9 ns, but this won’t cause a problem if we allow enough total time Beginning with inputs u, v, w, x, and y stored in registers, the entire computation can be completed in 25 ns, allowing 1 ns each for register readout and write

28 Performance Estimation for Single-Cycle MicroMIPS
Instruction access 2 ns Register read 1 ns ALU operation 2 ns Data cache access 2 ns Register write 1 ns Total 8 ns Single-cycle clock = 125 MHz R-type 44% 6 ns Load 24% 8 ns Store 12% 7 ns Branch 18% 5 ns Jump 2% 3 ns Weighted mean  6.36 ns The MicroMIPS data path unfolded (by depicting the register write step as a separate block) so as to better visualize the critical-path latencies.

29 How Good is Our Single-Cycle Design?
Clock rate of 125 MHz not impressive How does this compare with current processors on the market? Instruction access 2 ns Register read 1 ns ALU operation 2 ns Data cache access 2 ns Register write 1 ns Total 8 ns Single-cycle clock = 125 MHz Not bad, where latency is concerned A 2.5 GHz processor with 20 or so pipeline stages has a latency of about 0.4 ns/cycle  20 cycles = 8 ns Throughput, however, is much better for the pipelined processor: Up to 20 times better with single issue Perhaps up to 100 times better with multiple issue


Download ppt "The CPU Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making) Datapath: portion of the processor."

Similar presentations


Ads by Google