Morgan Kaufmann Publishers The Processor


1 Morgan Kaufmann Publishers The Processor
26 August, 2018
Chapter 4: The Processor

2 Introduction
§4.1 Introduction
CPU performance factors
  Instruction count: determined by ISA and compiler
  CPI and cycle time: determined by CPU hardware
We will examine two LEGv8 implementations
  A simplified version
  A more realistic pipelined version
Simple subset, shows most aspects
  Memory reference: LDUR, STUR
  Arithmetic/logical: ADD, SUB, AND, ORR
  Control transfer: compare and branch on zero (CBZ), branch (B)
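
The three performance factors on this slide combine in the usual relation CPU time = instruction count × CPI × clock cycle time. A minimal sketch of that arithmetic follows; the function name and the workload numbers are illustrative assumptions, not values from the slides.

```python
def cpu_time(instruction_count, cpi, cycle_time_s):
    """CPU time = instruction count x CPI x clock cycle time (seconds)."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical workload: one million instructions, CPI of 1 (as in a
# single-cycle design), and an assumed 800 ps clock period.
t = cpu_time(1_000_000, 1.0, 800e-12)
print(f"{t * 1e3:.3f} ms")
```

The ISA and compiler influence only the first argument, while the hardware implementation sets the other two.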

3 Instruction Execution
PC → instruction memory, fetch instruction
Register numbers → register file, read registers
Depending on instruction class
  Use ALU to calculate
    Arithmetic result
    Memory address for load/store
    Branch target address
  Access data memory for load/store
  PC ← target address or PC + 4

4 CPU Overview
[Figure: CPU overview]

5 Multiplexers
Can’t just join wires together
  Use multiplexers

6 Control
[Figure: control]

7 Logic Design Basics
§4.2 Logic Design Conventions
Information encoded in binary
  Low voltage = 0, high voltage = 1
  One wire per bit
  Multi-bit data encoded on multi-wire buses
Combinational elements
  Operate on data
  Output is a function of input
State (sequential) elements
  Store information

8 Combinational Elements
AND gate: Y = A & B
Adder: Y = A + B
Arithmetic/Logic Unit: Y = F(A, B)
Multiplexer: Y = S ? I1 : I0
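
The four combinational elements can be modeled as pure functions, since each output depends only on the current inputs. The sketch below assumes 64-bit buses represented as masked Python integers; the ALU operation names ("and", "or", "add", "sub") are illustrative labels, not signal names from the slides.

```python
MASK64 = (1 << 64) - 1  # model a 64-bit bus by masking Python integers


def and_gate(a, b):
    """Y = A & B, applied bitwise across the bus."""
    return a & b


def adder(a, b):
    """Y = A + B, wrapping around at 64 bits like a hardware adder."""
    return (a + b) & MASK64


def mux(s, i0, i1):
    """Y = S ? I1 : I0."""
    return i1 if s else i0


def alu(f, a, b):
    """Y = F(A, B), where f selects the function (hypothetical names)."""
    ops = {
        "and": a & b,
        "or": a | b,
        "add": (a + b) & MASK64,
        "sub": (a - b) & MASK64,
    }
    return ops[f]


print(mux(1, 5, 9))  # 9
```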

9 Sequential Elements
Register: stores data in a circuit
  Uses a clock signal to determine when to update the stored value
  Edge-triggered: update when Clk changes from 0 to 1

10 Sequential Elements
Register with write control
  Only updates on clock edge when write control input is 1
  Used when stored value is required later
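
A behavioral sketch of an edge-triggered register with a write control input is given here. The class and method names are hypothetical; the model only captures the rule that Q takes the value of D on a 0-to-1 clock transition when Write is 1.

```python
class Register:
    """Edge-triggered register with a write-enable input (behavioral model)."""

    def __init__(self, value=0):
        self.q = value       # stored value, visible on output Q
        self._prev_clk = 0   # last clock level seen, used to detect edges

    def tick(self, clk, d, write=1):
        """Present clock level, data input D, and Write; return Q."""
        rising_edge = self._prev_clk == 0 and clk == 1
        if rising_edge and write:
            self.q = d
        self._prev_clk = clk
        return self.q


r = Register()
r.tick(1, 7)            # rising edge, Write = 1: Q becomes 7
r.tick(1, 9)            # clock stays high, no edge: Q stays 7
r.tick(0, 9)            # falling edge: Q stays 7
r.tick(1, 9, write=0)   # rising edge but Write = 0: Q stays 7
print(r.q)              # 7
```

Leaving `write` at its default of 1 gives the plain register from the previous slide.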

11 Clocking Methodology
Combinational logic transforms data during clock cycles
  Between clock edges
  Input from state elements, output to state element
  Longest delay determines clock period
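
To illustrate the last point, the clock period must cover the slowest path from one state element to the next. The sketch below takes each path as a list of element delays; the delay values and path descriptions are made-up placeholders, not figures from the slides.

```python
def min_clock_period(path_delays_ps):
    """Return the longest total delay over all state-to-state paths."""
    return max(sum(element_delays) for element_delays in path_delays_ps)


# Hypothetical element delays in picoseconds for two paths.
paths = [
    [200, 100, 200, 100],       # e.g. fetch, register read, ALU, register write
    [200, 100, 200, 200, 100],  # the same path plus a data memory access
]
print(min_clock_period(paths))  # 800
```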

12 Building a Datapath
§4.3 Building a Datapath
Datapath
  Elements that process data and addresses in the CPU
  Registers, ALUs, muxes, memories, …
We will build a LEGv8 datapath incrementally
  Refining the overview design

13 Instruction Fetch
[Figure: instruction fetch datapath. Annotations: 64-bit register (the PC); increment by 4 for next instruction]
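
The fetch step can be sketched as a function that reads the instruction at the PC and computes PC + 4. Modeling instruction memory as a dictionary from byte address to 32-bit word is a simplifying assumption, and the instruction words below are placeholders rather than real encodings.

```python
MASK64 = (1 << 64) - 1  # the PC is a 64-bit register


def fetch(pc, instruction_memory):
    """Return the instruction at pc and the incremented PC (pc + 4)."""
    instruction = instruction_memory[pc]
    next_pc = (pc + 4) & MASK64
    return instruction, next_pc


# Placeholder instruction words keyed by byte address.
imem = {0: 0x11111111, 4: 0x22222222}
instruction, pc = fetch(0, imem)
print(hex(instruction), pc)
```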

14 R-Format Instructions
Read two register operands
Perform arithmetic/logical operation
Write register result

15 Load/Store Instructions
LDUR X1, [X2, offset_value] or STUR X1, [X2, offset_value]
Read register operands
Calculate the memory address by adding the base register X2 to the 9-bit signed offset
  Use the ALU, but first sign-extend the 9-bit offset field of the instruction to a 64-bit signed value
Load: read memory and write the value into the register file (register X1 here)
Store: read the register file (X1) and write the value to memory
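
The address calculation can be sketched as follows. The helper names are hypothetical, and data memory is modeled as a dictionary holding one 64-bit value per address, which ignores byte-level layout.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit


def effective_address(base, offset9):
    """Base register value plus the sign-extended 9-bit offset, modulo 2**64."""
    return (base + sign_extend(offset9, 9)) & MASK64


def ldur(regs, mem, rt, rn, offset9):
    """Load: read memory at the computed address, write register rt."""
    regs[rt] = mem[effective_address(regs[rn], offset9)]


def stur(regs, mem, rt, rn, offset9):
    """Store: read register rt, write it to memory at the computed address."""
    mem[effective_address(regs[rn], offset9)] = regs[rt]


regs = [0] * 32
regs[2] = 100
mem = {108: 42}
ldur(regs, mem, 1, 2, 8)      # LDUR X1, [X2, #8]
stur(regs, mem, 1, 2, 0x1F8)  # 0x1F8 is -8 as a 9-bit field
print(regs[1], mem[92])       # 42 42
```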

16 Branch Instructions
CBZ X1, offset
  The X1 register is tested for zero, and a 19-bit offset is used to compute the branch target address relative to the branch instruction address
Read register operand
  Use the ALU to pass the register value through and check the Zero output
Calculate target address
  Sign-extend the displacement
  Shift the offset field left by 2 bits, since the offset counts 4-byte words
  Add the result to the address of the branch instruction, which is the base for the branch address calculation
If the operand (X1) is zero, the branch target address becomes the new PC
If the operand is not zero, the incremented PC (PC + 4, computed during instruction fetch) replaces the current PC
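
The next-PC selection for CBZ can be sketched as below. The function names are hypothetical; the arithmetic follows the slide: sign-extend the 19-bit field, shift left by 2, and add to the address of the branch instruction.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit


def branch_target(pc, offset19):
    """Branch instruction address plus the word offset converted to bytes."""
    return (pc + (sign_extend(offset19, 19) << 2)) & MASK64


def next_pc_cbz(pc, reg_value, offset19):
    """Take the branch target if the tested register is zero, else PC + 4."""
    if reg_value == 0:
        return branch_target(pc, offset19)
    return (pc + 4) & MASK64


print(next_pc_cbz(1000, 0, 3))        # 1012: register is zero, branch taken
print(next_pc_cbz(1000, 5, 3))        # 1004: register is nonzero, fall through
print(next_pc_cbz(1000, 0, 0x7FFFF))  # 996: offset field of -1 word
```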

17 Datapath segment for branches
[Figure: datapath segment for branches. Annotations: shift left 2 just re-routes wires; sign extension replicates the sign-bit wire]

18 Composing the Elements
The simplest datapath executes all instructions in one clock cycle
  Each datapath element can only do one function at a time
  Hence, we need separate instruction and data memories
Use multiplexers where alternate data sources are used for different instructions

19 R-Type/Load/Store Datapath
[Figure: R-type/load/store datapath]

20 Full Datapath
[Figure: full datapath]

21 ALU Control
§4.4 A Simple Implementation Scheme
Load/store (LDUR/STUR): the ALU computes the memory address by addition
R-type instructions: the ALU performs one of four actions (AND, OR, subtract, or add), depending on the value of the 11-bit opcode field in the instruction
Compare and branch on zero (CBZ): the ALU just passes the register input value
Small control unit
  Input: the opcode field of the instruction and a 2-bit control field, called ALUOp, with the following values:
    00: add, for loads and stores
    01: pass input b, for CBZ
    10: determined by the operation encoded in the opcode field
  Output: a 4-bit signal that directly controls the ALU by generating one of the 6 combinations shown below

ALU control lines   Function
0000                AND
0001                OR
0010                add
0110                subtract
0111                pass input b
1100                NOR

22 ALU Control
ALU control inputs are based on the 2-bit ALUOp control and the 11-bit opcode
ALUOp bits are generated by the main control unit
Multiple levels of decoding is a common implementation technique
  Can reduce the size of the main control unit
  Can potentially reduce the latency of the control unit

Instruction  ALUOp  Operation                   Opcode field  ALU function  ALU control
LDUR         00     load register               XXXXXXXXXXX   add           0010
STUR         00     store register              XXXXXXXXXXX   add           0010
CBZ          01     compare and branch on zero  XXXXXXXXXXX   pass input b  0111
R-type       10     ADD                         10001011000   add           0010
R-type       10     SUB                         11001011000   subtract      0110
R-type       10     AND                         10001010000   AND           0000
R-type       10     ORR                         10101010000   OR            0001
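
The two-level decoding can be sketched as a small lookup. For brevity this sketch keys the R-type case on the instruction mnemonic instead of the opcode bits, which is a simplification: real control logic decodes the opcode field. The function and table names are hypothetical.

```python
# 4-bit ALU control values taken from the ALU control table.
R_TYPE_ALU_CONTROL = {
    "ADD": 0b0010,  # add
    "SUB": 0b0110,  # subtract
    "AND": 0b0000,  # AND
    "ORR": 0b0001,  # OR
}


def alu_control(alu_op, mnemonic=None):
    """Map the 2-bit ALUOp (plus the R-type operation) to ALU control lines."""
    if alu_op == 0b00:   # LDUR/STUR: add to form the memory address
        return 0b0010
    if alu_op == 0b01:   # CBZ: pass input b
        return 0b0111
    if alu_op == 0b10:   # R-type: the instruction selects the operation
        return R_TYPE_ALU_CONTROL[mnemonic]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")


print(f"{alu_control(0b10, 'SUB'):04b}")  # 0110
```

Because ALUOp alone settles the load, store, and CBZ cases, the main control unit never has to inspect the R-type operation bits itself.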

23 The Main Control Unit
Control signals are derived from the instruction
  Opcode field: 6 to 11 bits wide, in bit positions 31:26 to 31:21
  First register operand: bit positions 9:5 (Rn)
  Other register operand: bit positions 20:16 (Rm), 4:0 (Rt)
  Another operand: 19-bit offset (CBZ) or 9-bit offset (load/store)
  The destination register for R-type instructions (Rd) and for loads (Rt) is in bit positions 4:0
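
Extracting these fields from a 32-bit instruction word can be sketched with shifts and masks. The register field positions come from the slide; the positions used for the offsets (20:12 for the 9-bit load/store offset, 23:5 for the 19-bit CBZ offset) are taken from the LEGv8 D and CB formats and are not stated on the slide. The dictionary keys and the sample field values are illustrative.

```python
def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of word as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_fields(instruction):
    """Slice one 32-bit instruction into the fields the control unit uses."""
    return {
        "opcode11": bits(instruction, 31, 21),     # widest opcode view
        "Rm": bits(instruction, 20, 16),           # second register operand
        "Rn": bits(instruction, 9, 5),             # first register operand
        "Rd_or_Rt": bits(instruction, 4, 0),       # destination or Rt
        "offset9": bits(instruction, 20, 12),      # load/store offset (assumed position)
        "cbz_offset19": bits(instruction, 23, 5),  # CBZ offset (assumed position)
    }


# Build a word from sample field values: opcode 0x458, Rm = 21, Rn = 20, Rd = 9.
word = (0x458 << 21) | (21 << 16) | (20 << 5) | 9
fields = decode_fields(word)
print(fields["Rm"], fields["Rn"], fields["Rd_or_Rt"])  # 21 20 9
```

All fields are extracted in parallel regardless of instruction type; the control signals then decide which ones are actually used.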

