Basic Processing Unit (Chapter 7) http://www.pds.ewi.tudelft.nl/~iosup/Courses/2011_ti1400_7.ppt
Problem: How to Implement Computers? Data representation, conversion, and op. Instruction repr. and use Lectures 3,4,5,6 Programmable Devices Memory organization Program sequencing von Neumann archi. Instruction levels Lecture 2 Digital logic Memory elements Other building blocks (Multiplexer,Decoder) Finite State Machines Circuit Design Lecture 1 Why Computer Organization Matters? History of Computing (1642-2011) Lecture 0
Problem instruction ? y Decoder f ALU a y Reg
Lecture 2 Von Neumann Architecture Memory (Central) Processing Unit READ(X) READ(Y) ADD(X,Y,Z) WRITE(Z) X: 1 Y: 2 Z: 3 • TEMP_A: TEMP_B: RESULT: arithmetic unit Input IR: CONTROL Output PC:
The Processing Unit Basic Processing Cycle Types of Operations Control Mechanisms 5
Basic Processing Cycle Assume an instruction occupies a 32 bit single word in byte addressable memory Basic cycle to be implemented: 1. Fetch instruction pointed to by PC and put it into the Instruction Register (IR): [IR] M([PC]) 2. Increment PC: [PC] [PC] + 4 3. Perform actions as specified in IR
Organization CPU bus Decoder PC control MAR IR memory bus MDR Register file R0 Y R1 R2 ALU Rn-1 Z
Register gating CPU bus Y_in Const 4 Ri_in x Y x Select MUX Ri ALU x Z_in x Ri_out Z x Z_out
Register gating Edge-triggered D flip-flop Multiplexer R1_in R2_in R/W I R/W I R/W I C D Q C D Q C D Q C C C R1_out R2_out R3_out 1 bit of common bus line Tri-state gate: high impedance iff Ri_out=0, Q iff Ri_out=1
Multiple Datapaths Bus A R0 Y R1 R2 R3 ALU register file Bus C Bus B
The Processing Unit Basic Processing Cycle Types of Operations Control Mechanisms Register Transfer Fetch from Memory Store to Memory Arithmetic/Logic Ops. Complete Example Branching Ops. 11
2. Types of Operations Operation cycle includes: Transfer data from register to register or to ALU Fetch contents of memory location and put in one of the CPU registers Store contents of CPU register in memory location Perform arithmetic or logic operation
2.1. Register Transfers Copy contents of R1 to R3 1. Address_out=R1 2. R_out 3. Address_in=R3 4. R_in R_out CPU bus R0 Y R1 R2 ALU R3 Z register file 1. R1_out 2. R3_in Address_out R_in Address_in
2.2. Fetch from Memory (1) Memory bus Data lines Internal processor bus MDR_outE MDR_out x x MDR x x MDR_inE MDR_in
2.2. Fetch from memory (2) 1. MAR [Ri] e.g., Move (Ri),Rj 2. Start read on memory bus 3. Wait for MFC response 4. Load MDR from memory bus 5. Rj [MDR] Control Step 1 Memory Function Complete Control Step 2 Control Step 3 Address MAR Data MDR CPU Read Memory MFC
Fetch from memory (3) Signal Activation Sequence Internal processor bus Control Step 1. Ri_out, MAR_in, Read Control Step 2. MDR_inE, WMFC Control Step 3. MDR_out, Rj_in Ri_in x MDR_outE MDR_out Ri x x Memory bus Data lines MDR x x x Ri_out MDR_inE MDR_in
Timing of the Operation 2.2. Fetch from Memory (4) 1. Ri_out, MAR_in, Read 2. MDR_inE, WMFC 3. MDR_out, Rj_in Timing of the Operation 1 2 3 CLK MAR_in address New Address MAR to Mem.Bus Read MR Mem.Read Cmd. MDR_inE Data Mem.Bus to MDR Value MFC Mem.Fnc.Complete MDR_out
2.3. Store to Memory 1. Ri_out, MAR_in e.g., Move Rj,(Ri) 2. Rj_out, MDR_in, Write 3. MDR_outE, WMFC Address Data Memory CPU Write MFC
2.4. Arithmetic Operation ADD R3,R2,R1 Step Action 1. Address_out R1 Y_in R_out 2. Address_out R2 R_out F_alu “ADD” Z_in Address_in R3 Z_out R_in
Register Transfers 1. Address_out R1 Y_in R_out R_out CPU bus R0 ALU R3 Z register file Address_out
Arithmetic Operation ADD R3,R2,R1 Step Action 1. Address_out R1 Y_in R_out 2. Address_out R2 R_out F_alu “ADD” Z_in Address_in R3 Z_out R_in
Register Transfers 2. Address_out R2 R_out F_alu “ADD” Z_in CPU bus R_out R0 Y_in Y R1 R2 F_alu ALU R3 Z_in Z register file Address_out
Arithmetic Operation ADD R3,R2,R1 Step Action 1. Address_out R1 Y_in R_out 2. Address_out R2 R_out F_alu “ADD” Z_in Address_in R3 Z_out R_in
Register Transfers Address_in R3 Z_out R_in CPU bus R0 Y R1 R2 ALU R3 Z register file R_in Z_out Address_in
Steps in time CPU bus Step 1 2 3 Y_in Y Y_in ALU Z_in Z_in Z Z_out R_in Z_out
Timing Rising edge of clock turn output on trans- mission time delay through ALU set-up time hold time R_out data available at next register
The Processing Unit Basic Processing Cycle Types of Operations Control Mechanisms Register Transfer Fetch from Memory Store to Memory Arithmetic/Logic Ops. Complete Example Branching Ops. 27
2.5. Execution of a Complete Instruction 1. Fetch instruction 2. Fetch the operand 3. Perform operation 4. Store result Example ADD (R3),R1 [R1] M([R3]) + [R1]
Execution fetch (1) Step 1-3: Step Action Instruction fetch and PC update Step Action 1 PC_out, MAR_in, Read Set carry-in ALU F_alu = “ADD” Z_in Z_out, PC_in Wait for MFC 3 MDR_out, IR_in [PC] [PC ]+1 [IR] M([PC ]) Note: for architectures having PC:=PC+4 a different scheme must be used
Fetch instruction 1. PC_out, MAR_in, Read Set carry-in ALU F_alu = “ADD” Z_in Q Why MAR_in? MAR_in MAR PC_out IR PC ADD ALU MDR carry Z_in Z Read WFMC Q Why Set carry-in ALU?
Execution fetch (2) Step 1-3: instruction Step Action fetch and PC update Step Action 1 PC_out, MAR_in, Read Set carry-in ALU F_alu = “ADD” Z_in Z_out, PC_in Wait for MFC 3 MDR_out, IR_in [PC] [PC ]+1 [IR] M([PC ])
Fetch instruction . Z_out, PC_in Wait for MFC MAR_in MAR PC_in IR PC ALU MDR MDR_in Z Read Z_out Q What is read into MDR? WFMC
Execution fetch (3) Step 1-3: instruction Step Action fetch and PC update Step Action 1 PC_out, MAR_in, Read Set carry-in ALU F_alu = “ADD” Z_in Z_out, PC_in Wait for MFC 3 MDR_out, IR_in [PC ] [PC ]+1 [IR] M([PC ])
Fetch instruction 3. MDR_out, IR_in Q What is loaded into IR? MAR IR PC IR_in ALU MDR Z Read MDR_out WFMC
Execute Step Action 4 Address_out=R3, R_out MAR_in Read Y_in, Wait for MFC 6 MDR_out, Z_in F_alu = “ADD” 7 Address_in=R1, R_in Z_out, End Step 4 and 5: operand fetch Perform addition Store Result
Execute Q Role of Decoder? 4. R3_out MAR_in Read CPU bus Read PC control MAR IR memory bus MDR register file R0 Y R1 R2 ALU R3 Z
Execute Step Action 4 Address_out=R3, R_out MAR_in Read Y_in, Wait for MFC 6 MDR_out, Z_in F_alu = “ADD” 7 Address_in=R1, R_in Z_out, End Step 4 and 5: operand fetch Perform addition Store Result
Execute Q Where does MDR read from? . R1_out Y_in, Wait for MFC CPU bus WFMC Decoder PC control MAR IR memory bus MDR R0 Y R1 R2 ALU R3 Z register file
Execute Step Action 4 Address_out=R3, R_out MAR_in Read Y_in, Wait for MFC 6 MDR_out, Z_in F_alu = “ADD” 7 Address_in=R1, R_in Z_out, End Step 4 and 5: operand fetch Perform addition Store Result
Execute Q Who sets F_alu to ADD? 6. MDR_out, Z_in F_alu = “ADD” CPU bus Decoder PC control MAR IR memory bus MDR register file R0 Y R1 R2 Q Why Z_in? ALU R3 Z
Execute Step Action 4 Address_out=R3, R_out MAR_in Read Y_in, Wait for MFC 6 MDR_out, Z_in F_alu = “ADD” 7 Address_in=R1, R_in Z_out, End Step 4 and 5: operand fetch Perform addition Store Result
Execute Q Role of End? 7. R1_in Z_out, End CPU bus Decoder PC control MAR IR memory bus MDR register file R0 Y R1 R2 ALU R3 Z
The Processing Unit Basic Processing Cycle Types of Operations Control Mechanisms Register Transfer Fetch from Memory Store to Memory Arithmetic/Logic Ops. Complete Example Branching Ops. 43
2.6. Branching Jump PC+Offset Step Action 1-3 <instruction fetch as in previous example> PC_out, Y_in 5 Offset-field-IR_out F_alu = “ADD” Z_in 6 PC_in Z_out, End
Branching . PC_out, Y_in CPU bus Decoder PC control MAR IR memory bus MDR register file R0 Y R1 R2 ALU R3 Z
Branching Step Action 1-3 <instruction fetch as in previous example> PC_out, Y_in 5 Offset-field-IR_out F_alu = “ADD” Z_in 6 PC_in Z_out, End
Branching 5. Offset-field-IR_out F_alu = “ADD” Z_in CPU bus Decoder PC control MAR IR memory bus MDR register file R0 Y R1 R2 ALU R3 Z
Branching Step Action 1-3 <instruction fetch as in previous example> PC_out, Y_in 5 Offset-field-IR_out F_alu = “ADD” Z_in 6 PC_in Z_out, End
Branching 6. PC_in Z_out, End CPU bus Decoder PC control MAR IR memory bus MDR R0 Y R1 R2 ALU R3 Z register file
Conditional branching JN PC+Offset Step Action 1-3 <instruction fetch as in previous example> PC_out, Y_in If N=0 then End 5 Offset-field-IR_out F_alu = “ADD” Z_in 6 PC_in Z_out, End If not Negative
The Processing Unit Basic Processing Cycle Types of Operations Control Mechanisms Hardwired Micro-Programmed Q Who sets F_alu to ADD? 51
3. Control Mechanisms There are two basic control organizations: Hardwired control Micro-programmed control
The Processing Unit Basic Processing Cycle Types of Operations Control Mechanisms Hardwired Micro-Programmed Q Who sets F_alu to ADD? 53
3.1. Hardwired Control Control Unit Organization CLK Clock Control step counter IR Encoder/ Decoder Status Flags Condition Codes Control signals
3.1. Hardwired Control Separating decoding/encoding Clock Control step counter Reset Q Role of Run? Step decoder Only one set to 1 T_1 T_n Ins_1 IR Instruction decoder Encoder Status Flags Condition Codes Ins_n Run Z_in End
3.1. Hardwired Control Generation of control signals ADD BRanch T_6 T_5 T_1 Z_in = T_1 + T_6 . ADD + T_5 . BR time slot Z_in
3.1. Hardwired Control End signal Other example: End = T_7 . ADD + T_6 . BR +(T_6 . N + T_4 . /N) .BRN
PLAs PLA AND array OR array IR counter Flags Control signals
3.1. Hardwired Control Performance Performance is dependent on: Power of instructions Cycle time Number of cycles per instruction Performance improvement by: Multiple datapaths Instruction prefetching and pipelining Caches
Complete CPU Processor/CPU System Bus Integer unit Floating-point unit Instruction unit Integer unit Floating-point unit Integer unit Floating-point unit Instruction Cache Data Cache Bus Interface Processor/CPU System Bus Main Memory Input/ Output
3.2. Micro-programmed control All control bits are organized as memory Each memory location represents a control setting/word The word represents the state (0/1) of each control signal Memory words are called micro-instructions Micro-routines are sequences of micro-instructions Control store for all micro-routines Micro-program counter (uPC) to read control words sequentially
3.2. Micro-Programmed Control Examples of Micro-Instructions micro- PC_in MAR_in Addr_in Z_in ... instruction 1 0 1 00 1 ... 2 1 0 00 0 ... 3 0 0 01 0 ... ..
3.2. Micro-Programmed Control Example of a Micro-routine Address Micro-instruction 0 PC_out, MAR_in, Read, Set carry-in ALU, F_alu = “ADD”, Z_in 1 Z_out, PC_in, Wait for MFC 2 MDR_out, IR_in 3 Branch to starting address routine (here, 25) ........................................................................................................ 25 PC_out, Y_in, if N=0 then goto address 0 26 Offset-field-of-IR_out, F_alu = “ADD”, Z_in 27 Z_out, PC_in, End Fetch Instruction Test N bit New PC address
3.2. Micro-Programmed Control Basic organization Q Can this organization perform conditional branching operations? IR Starting address generator Clock micro-PC Control Signals Control Store
3.2. Micro-Programmed Control Detailed organization IR Status flags Starting/ Branching address generator Control codes Clock micro-PC Control Signals Control Store
3.2. Micro-Programmed Control micro-PC Operation Micro-PC is incremented by 1, except: After loading IR Micro-PC is set to first micro-instruction for executing machine instruction At End Micro-PC is set to first micro-instruction of instruction fetch routine (typically 0) At Branch instruction Micro-PC is set to the branch address
3.2. Micro-Programmed Control Why micro-programming? Flexibility emulation of different instruction sets on same hardware Support for powerful instructions
3.2. Micro-Programmed Control Structure micro-instructions Most simple organization: 1 bit per control signal However, Many bits needed (e.g., 80-120 bits) For many signals, only one is needed per cycle; hence they can be grouped Coding is possible: e.g., an address instead of a single control bit per register
3.2. Micro-Programmed Control Example F1 F2 F3 F4 F5 F6 F7 F8 Field 1(4 bits): Register address_in Field 2(4 bits): Register address_out Field 3(4 bits): Other registers_in Field 4(4 bits): Function ALU Field 5(2 bit) : Read/Write/Nop Field 6(1 bit) : Carry-in ALU Field 7(1 bit) : WMFC Field 8(1 bit) : End ............ ..............
3.2. Micro-Programmed Control Forms of organization Little coding: horizontal organization Large words Little decoding logic Fast Much coding: vertical organization Small control store Much decoding logic Slower Mixed organization
3.2. Micro-Programmed Control Horizontal/Vertical F0 F1 F2 F3 Horizontal R0 R1 R2 R3 F0 F1 Decoder Vertical R0 R1 R2 R3
3.2. Micro-Programmed Control Sequencing Thus far only branch after fetch No sharing of micro-code between micro-routines Micro-subroutines lead to more efficient control store
3.2. Micro-Programmed Control Multi-way branching Number of two-way branches disadvantage: slows down More than one branch address in micro-instruction disadvantage: more bits required bit-ORing if specified branch address
3.2. Micro-Programmed Control Example branch address x x x 0 0 micro-instruction OR y z part of IR x x x y z actual branch address
3.2. Micro-Programmed Control Example microroutine (1) ADD (Rsrc)+, Rdst Mode IR OP code 010 Rsrc Rdst 11 10 8 7 4 3 bit 8: direct/indirect bit 9,10: indexed (11) autodecrement(10) autoincrement(01) register(00) Instruction Format
3.2. Micro-Programmed Control Example microroutine (2) Address Micro-instruction 0 PC_out, MAR_in, Read, Set carry-in ALU, F_alu = “ADD”, Z_in Z_out, PC_in, Wait for MFC 2 MDR_out, IR_in 3 μBranch{μPC101 (from PLA); μPC_5,4[IR_10,9]; μPC_3{[not.IR_10].[not.IR_9].[IR_8]} ........................................................................................................... 121 Rsrc_out, MAR_in, Set carry-in ALU,Read, F_alu = “ADD”, Z_in 122 Z_out, Rscr_in 123 μBranch{μPC170; μPC_0[not.IR_8]}, WMFC 170 MDR_out, MAR_in, Read, WMFC 171 MDR_out, Y_in 172 Rdst_out, F_alu = “ADD”, Z_in 173 Z_out, Rdst_in, End use bits from IR for addressing mode FETCH autoincrement direct indirect
3.2. Micro-Programmed Control Micro branch address Mode IR OP code 010 Rsrc Rdst 11 10 9 8 7 4 3 /IR10./IR9.IR8 101 0 0 1 0 1 0 0 0 1 PLA 121
3.2. Micro-Programmed Control Micro branch address Mode IR OP code 010 Rsrc Rdst 11 10 8 7 4 3 /IR8 170 0 0 1 1 1 1 0 0 1 PLA 171
3.2. Micro-Programmed Control Next-address field (1) Micro-instruction contains address next micro-instruction Larger store needed Branch micro-instructions no longer needed
Microinstruction decoder Next-address field (2) Status flags IR Condition codes Decoding circuits micro-AR Control store Next address micro-IR Microinstruction decoder
3.2. Micro-Programmed Control Example F0 F1 F2 F3 F4 F5 F6 F7 F8 Field 0(8 bits): Next address Field 1(4 bits): Register address_in Field 2(4 bits): Register address_out Field 3(4 bits): Other registers_in Field 4(4 bits): Function ALU Field 5(2 bit) : Read/Write/Nop Field 6(1 bit) : Carry-in ALU Field 7(1 bit) : WMFC Field 8(1 bit) : End ............ PLA/ORing etc
3.2. Micro-Programmed Control Emulation A micro-program determines the machine instructions of a computer Suppose we have two computers M1 and M2 with different instruction sets By adapting the micro-program of M1, we can emulate M2
3.2. Micro-Programmed Control Organization Micro-program is often placed in ROM on CPU chip Some machines had writable control store, i.e. user could change instruction set