
1 Chapter 4: ISA 1 Chapter 4: Instruction Set Architectures CS140 Computer Organization These slides are derived from those of Null & Lobur + the work of others. Considerable material from previous years gleaned from Patterson & Hennessy.

2 Chapter 4: ISA 2 Chapter 4 Objectives & Introduction Objective: Learn the components common to every modern computer system. Be able to explain how each component contributes to program execution. Understand an ISA and how it relates to a real architecture. Know how the program assembly process works. Introduction: Chapter 1 presented a general overview of computer systems. Chapter 2 discussed how data is formatted, stored and manipulated. Chapter 3 described the fundamentals of digital circuits. With this, we can understand how computer components work, and how they fit together to create useful computer systems.

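3 Chapter 4: ISA 3 Computer Components (Figure: block diagram of the processor, chipset, memory, PCI controller, PCI bus, video controller, and clock, connected by the front-side bus (FSB). The figure labels point to Chapters 4 & 5, Chapter 6, and Chapter 7.)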

4 Chapter 4: ISA CPU Basics The computer's CPU fetches, decodes, and executes program instructions. The two principal parts of the CPU are the datapath and the control unit. –The datapath consists of an arithmetic-logic unit and storage units (registers) that are interconnected by a data bus that is also connected to main memory. –Various CPU components perform sequenced operations according to signals provided by the control unit. –The control unit determines which actions to carry out according to the values in a program counter register and a status register. It's like plumbing: the datapath is the pipes, and the control unit is the faucets and valves.

5 Chapter 4: ISA The Bus The CPU shares data with other system components by way of a data bus. –A bus is a set of wires that simultaneously convey a single bit along each line. Two types of buses are commonly found in computer systems: point-to-point and multipoint buses. This is a point-to-point bus configuration: (Figure: a parallel bus and a serial bus. The serial bus is deferred to later, when we discuss I/O.)

6 Chapter 4: ISA 6 Buses have data lines, control lines, and address lines. The data lines convey bits from one device to another, Control lines determine the direction of data flow, and when each device can access the bus. Address lines determine the location of the source or destination of the data. 4.3 The Bus

7 Chapter 4: ISA 7 4.3 The Bus A multipoint bus is shown below. Because a multipoint bus is a shared resource, access to it is controlled through protocols, which are built into the hardware and handled by the control lines.

8 Chapter 4: ISA 8 4.3 The Bus In a master-slave configuration, where more than one device can be the bus master, concurrent bus master requests must be arbitrated. Four categories of bus arbitration are: –Daisy chain: Permissions are passed from the highest-priority device to the lowest. –Centralized parallel: Each device is directly connected to an arbitration circuit. –Distributed using self-detection: Devices decide among themselves which gets the bus. –Distributed using collision-detection: Any device can try to use the bus. If its data collides with the data of another device, it tries again.

9 Chapter 4: ISA Clocks Every computer contains at least one clock that synchronizes the activities of its components. A fixed number of clock cycles is required to carry out each data movement or computational operation. The clock frequency, measured in megahertz or gigahertz, determines the speed with which all operations are carried out. Clock cycle time is the reciprocal of clock frequency. Typical Intel and typical PIC clocks look like this: –A 2 GHz clock has a cycle time of 0.5 nanoseconds. –An 8 MHz clock has a cycle time of 0.125 microseconds. One master clock provides the multiple frequencies used for various parts of the system.
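The reciprocal relationship between frequency and cycle time can be checked with a few lines of Python. This is only a sketch; the function name `cycle_time_seconds` is made up for illustration.

```python
def cycle_time_seconds(frequency_hz: float) -> float:
    """Clock cycle time is the reciprocal of the clock frequency."""
    return 1.0 / frequency_hz


# The two clocks from the slide: a 2 GHz clock and an 8 MHz clock.
print(cycle_time_seconds(2e9) * 1e9, "nanoseconds")    # about 0.5
print(cycle_time_seconds(8e6) * 1e6, "microseconds")   # about 0.125
```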

10 Chapter 4: ISA 10 4.4 Clocks Clock speed should not be confused with CPU performance. The CPU time required to run a program is given by the general performance equation: CPU time = seconds / program = (instructions / program) × (average cycles / instruction) × (seconds / cycle). We can improve CPU performance when we –reduce the number of instructions in a program, –reduce the number of cycles per instruction, or –reduce the number of nanoseconds per clock cycle.
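As a rough illustration of the performance equation (instruction count times average cycles per instruction times seconds per cycle), the sketch below compares two hypothetical machines. The instruction counts, CPI values, and clock rates are invented for the example.

```python
def cpu_time_seconds(instructions: float, avg_cpi: float, cycle_time_s: float) -> float:
    """CPU time = instructions x average cycles per instruction x seconds per cycle."""
    return instructions * avg_cpi * cycle_time_s


# Hypothetical numbers: the same program on a fast clock with a high CPI
# and on a slower clock with a low CPI.
fast_clock = cpu_time_seconds(1e9, avg_cpi=4.0, cycle_time_s=0.5e-9)   # 2 GHz
slow_clock = cpu_time_seconds(1e9, avg_cpi=1.5, cycle_time_s=1.0e-9)   # 1 GHz
print(fast_clock, slow_clock)
```

With these made-up figures the machine with the slower clock finishes first, which is why clock speed alone is a poor measure of performance.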

11 Chapter 4: ISA 11 Digression: Sequential Logic, Clocking Combinational circuits: no memory Output depends only on the inputs Sequential circuits: have memory How to ensure memory element is updated neither too soon, nor too late? Recall hardware circuits Flip/flop register is the writable memory element Gate propagation delay means result takes time to stabilize; Delay varies with inputs Must wait until result stable before we can write that output register to the next stage - otherwise garbage results. How to be certain ALU output is stable? Solution: let the inputs chatter and stabilize – THEN apply the clock. 4.4 Clocks

12 Chapter 4: ISA 12 4.4 Clocks Digression: Sequential Logic, Clocking ° Clock: free-running signal with a fixed cycle time (clock period). ° The clock determines when to write a memory element: –level-triggered: store while the clock is high (or low) –edge-triggered: store only on a clock edge ° We will use the negative (falling) edge-triggered methodology. (Figure: clock waveform showing the period, the rising edge, the falling edge, and the high (1) and low (0) levels.)

13 Chapter 4: ISA 13 Role of Clock in Processors single-cycle machine: does everything in one clock cycle instruction execution = up to 5 steps must complete 5th step before cycle ends clock signal instruction execution step 1/step 2/step 3/step 4/step 5 datapath stable register(s) written falling clock edge rising clock edge 4.4 Clocks

14 Chapter 4: ISA The Input/Output Subsystem A computer communicates with the outside world through its input/output (I/O) subsystem. I/O devices connect to the CPU through various interfaces. I/O can be memory-mapped, where the I/O device behaves like main memory from the CPU's point of view. Or I/O can be instruction-based, where the CPU has a specialized I/O instruction set.

15 Chapter 4: ISA Memory Organization Computer memory is a linear array of addressable storage cells that are similar to registers. Memory can be byte-addressable (most common), or word-addressable, where a word consists of two or more bytes. Memory is constructed of RAM chips, often referred to in terms of length × width. If the addressable unit of the machine is 8 bits, then a 4M × 8 RAM chip gives us 4M (2^22) 8-bit memory locations, or 4 megabytes.

16 Chapter 4: ISA Memory Organization How does the computer access the memory location that corresponds to a particular address? We see that 4 megabytes = 2^22 bytes. The memory locations for this memory are numbered 0 through 2^22 − 1. Thus, the memory bus of this system requires at least 22 address lines. –The address lines count from 0 to 2^22 − 1 in binary. Each line is either on or off, indicating the location of the desired memory element. (Table: powers of 2, ending with 2^20 = 1,048,576; 2^21 = 2,097,152; 2^22 = 4,194,304.)
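The address-line count is just the number of bits needed to write the largest address in binary. A minimal sketch, with a made-up function name:

```python
def address_lines_needed(num_locations: int) -> int:
    """Number of address lines needed to select any one of num_locations cells."""
    if num_locations < 1:
        raise ValueError("memory must have at least one location")
    # The highest address is num_locations - 1; count the bits needed to write it.
    return (num_locations - 1).bit_length()


four_megabytes = 4 * 1024 * 1024           # 2**22 byte-addressable locations
print(address_lines_needed(four_megabytes))
```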

17 Chapter 4: ISA 17 4.6 Memory Organization Physical memory usually consists of more than one RAM chip. Access is more efficient when memory is organized into banks of chips with the addresses interleaved across the chips. With low-order interleaving, the low-order bits of the address specify which memory bank contains the address of interest. (Figure: low-order interleaving of byte addresses across banks.)
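A small sketch of low-order interleaving follows. It assumes the number of banks is a power of two, so that "the low-order bits" are simply the address modulo the bank count; the function name and the choice of four banks are illustrative only.

```python
def low_order_interleave(address: int, num_banks: int) -> tuple[int, int]:
    """Split an address into (bank, offset within bank) using the low-order bits."""
    if num_banks <= 0 or num_banks & (num_banks - 1):
        raise ValueError("num_banks must be a power of two")
    bank = address % num_banks       # low-order bits select the bank
    offset = address // num_banks    # remaining high-order bits select the cell
    return bank, offset


# Consecutive addresses land in consecutive banks.
for addr in range(8):
    print(addr, low_order_interleave(addr, 4))
```

Because neighbouring addresses fall in different banks, a sequential access pattern can keep several banks busy at once.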

18 Chapter 4: ISA Interrupts The normal execution of a program is altered when an event of higher priority occurs. The CPU is alerted to such an event through an interrupt. Interrupts can be triggered by I/O requests, arithmetic errors (such as division by zero), or when an invalid instruction is encountered. For general-purpose systems, it is common to disable all interrupts during the time in which an interrupt is being processed. –Typically, this is achieved by setting a bit in the flags register. –Interrupts that are ignored in this case are called maskable. Nonmaskable interrupts are high-priority interrupts that cannot be ignored. ("The CPU is on fire" is nonmaskable.) Each interrupt is associated with a procedure that directs the actions of the CPU when an interrupt occurs. In Chapter 7 we'll look at interrupts in more detail.

19 Chapter 4: ISA 19 Interrupt processing involves adding another step to the fetch-decode-execute cycle as shown below. The next slide shows a flowchart of Process the interrupt. 4.7 Interrupts

20 Chapter 4: ISA Interrupts

21 Chapter 4: ISA Instruction Processing 1.Detour looking at Instruction Sets –How are instructions laid out so they can be simply decoded? –Going from PIC to MIPS 2. Detour looking at Hardware –Multiplexers, registers, and ALUs 3.The datapath –Looking at each stage –The datapath and some sample instructions.

22 Chapter 4: ISA 22 4.9 Instruction Processing Detour looking at Instruction Sets These are the 35 instructions available on the PIC processors we've used. There are four types of instructions, and the formats for these instructions are defined here. Byte-oriented instructions have a 00 here. Bit-oriented instructions have a 01 here. Goto and Call have a 10. Literal instructions have a 11.

23 Chapter 4: ISA 23 Detour looking at Instruction Sets 4.9 Instruction Processing There are four types of instructions and the formats for these instructions are defined here. PIC is a RISC (Reduced Instruction Set Computer.) A relatively simple set of rules can decipher these instructions. Conversely, Intel is a CISC (Complex Instruction Set Computer.) The wiring needed to decipher the thousands of Intel instructions is overwhelming.

24 Chapter 4: ISA 24 4.9 Instruction Processing Detour looking at Instruction Sets The MIPS Instruction Formats All MIPS instructions are 32 bits long. The three instruction formats: –R-type: op (6 bits), rs (5 bits), rt (5 bits), rd (5 bits), shamt (5 bits), funct (6 bits) –I-type: op (6 bits), rs (5 bits), rt (5 bits), immediate (16 bits) –J-type: op (6 bits), target address (26 bits) The different fields are: –op: operation of the instruction –rs, rt, rd: the source and destination register specifiers –shamt: shift amount –funct: selects the variant of the operation in the op field –address / immediate: address offset or immediate value –target address: target address of the jump instruction Hey, these are the same flavors as the PIC instructions. MIPS is a RISC. It has about 75 instructions.
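Because every field sits at a fixed bit position, decoding is just shifting and masking. The sketch below pulls all fields out of a 32-bit word; the function name `decode_mips` is an assumption for illustration, and the sample word 0x012a4020 is the encoding of add $8, $9, $10 that appears in the decoding listing later in these slides.

```python
def decode_mips(word: int) -> dict:
    """Extract every MIPS field from a 32-bit instruction word.

    Which fields are meaningful depends on the format (R, I, or J),
    which the hardware determines from the op field.
    """
    return {
        "op": (word >> 26) & 0x3F,     # bits 31..26, all formats
        "rs": (word >> 21) & 0x1F,     # bits 25..21, R- and I-type
        "rt": (word >> 16) & 0x1F,     # bits 20..16, R- and I-type
        "rd": (word >> 11) & 0x1F,     # bits 15..11, R-type
        "shamt": (word >> 6) & 0x1F,   # bits 10..6,  R-type
        "funct": word & 0x3F,          # bits 5..0,   R-type
        "imm16": word & 0xFFFF,        # bits 15..0,  I-type
        "target": word & 0x3FFFFFF,    # bits 25..0,  J-type
    }


fields = decode_mips(0x012A4020)       # add $8, $9, $10
print(fields["op"], fields["rs"], fields["rt"], fields["rd"], fields["funct"])
```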

25 Chapter 4: ISA 25 4.9 Instruction Processing Detour looking at Instruction Sets The MIPS-lite Subset for Today ADD and SUB (R-type: op, rs, rt, rd, shamt, funct): –addu rd, rs, rt –subu rd, rs, rt OR Immediate (I-type: op, rs, rt, 16-bit immediate): –ori rt, rs, imm16 LOAD / STORE Word (I-type): –lw rt, rs, imm16 –sw rt, rs, imm16 BRANCH (I-type): –beq rs, rt, imm16 MIPS-lite has 6 instructions.

26 Chapter 4: ISA 26 4.9 Instruction Processing Detour looking at Instruction Sets The MIPS-lite Subset for Today Things to Note: 1. Since each instruction is 32 bits (4 bytes), instruction addresses are multiples of 4. 2. These instructions access up to 3 registers. 3. The ori can use a 16-bit immediate. 4. Each register needs 5 bits to specify it. How many registers does this machine have?

27 Chapter 4: ISA 27 The MIPS-lite Subset for Today Detour looking at Instruction Sets 4.9 Instruction Processing

28 Chapter 4: ISA Instruction Processing 1.Detour looking at Instruction Sets –How are instructions laid out so they can be simply decoded? –Going from PIC to MIPS 2. Detour looking at Hardware –Multiplexers, registers, and ALUs 3.The datapath –Looking at each stage –The datapath and some sample instructions.

29 Chapter 4: ISA 29 4.9 Instruction Processing Detour looking at Hardware Basic Building Blocks D-Latches D-latch based on an SR-latch with NAND gates and a control input C: ° C = 0, no change of state: Q(t + Δt) = Q(t) ° C = 1, change is allowed: Q(t + Δt) = D(t) No indeterminate output.
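The two D-latch rules translate directly into a next-state function. This is a behavioral sketch only (it does not model the NAND gates or propagation delay), and the function name is made up.

```python
def d_latch_next(q: int, d: int, c: int) -> int:
    """Next state of a D-latch: hold Q while C = 0, follow D while C = 1."""
    return d if c == 1 else q


q = 0
q = d_latch_next(q, d=1, c=0)   # control low: state is held
print(q)
q = d_latch_next(q, d=1, c=1)   # control high: state follows D
print(q)
```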

30 Chapter 4: ISA 30 Basic Building Blocks Adder 32 A B Y Select MUX 32 A B Result OP ALU 32 A B Sum Carry Adder CarryIn ALU MUX Detour looking at Hardware 4.9 Instruction Processing

31 Chapter 4: ISA 31 4.9 Instruction Processing Detour looking at Hardware Basic Building Blocks Storage Element: Register File The register file consists of 32 registers of 32 bits each: –Two 32-bit output busses: busA and busB –One 32-bit input bus: busW A register is selected by: –RA (number) selects the register to put on busA (data) –RB (number) selects the register to put on busB (data) –RW (number) selects the register to be written via busW (data) when Write Enable is 1 Clock input (CLK): –The CLK input is a factor ONLY during a write operation –During a read operation, the register file behaves as a combinational logic block: RA or RB valid ⇒ busA or busB valid after access time. (Figure: register file with 5-bit inputs RW, RA, and RB, Write Enable, Clk, 32-bit busW in, and 32-bit busA and busB out.)
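A behavioral sketch of this register file is shown below: reads are combinational, and writes happen only on a clock edge when Write Enable is 1. The class and method names are assumptions for illustration, and MIPS-specific details (such as the special treatment of register 0) are deliberately not modeled.

```python
class RegisterFile:
    """32 registers of 32 bits, two read ports (busA, busB), one write port (busW)."""

    def __init__(self) -> None:
        self.regs = [0] * 32

    def read(self, ra: int, rb: int) -> tuple[int, int]:
        """Combinational read: returns the values driven onto busA and busB."""
        return self.regs[ra & 0x1F], self.regs[rb & 0x1F]

    def clock_edge(self, rw: int, bus_w: int, write_enable: int) -> None:
        """On a clock edge, write busW into register RW if Write Enable is 1."""
        if write_enable == 1:
            self.regs[rw & 0x1F] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(rw=8, bus_w=42, write_enable=1)   # written
rf.clock_edge(rw=9, bus_w=7, write_enable=0)    # ignored: Write Enable is 0
print(rf.read(8, 9))
```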

32 Chapter 4: ISA 32 4.9 Instruction Processing Detour looking at Hardware Basic Building Blocks Multiplexer: the 5 selector wires choose one of the 32 input voltages and send it to the output. Decoder: the 5 selector wires choose which of the 32 outputs will get the input voltage. (Figure: a multiplexer with input bits 0 to 31, a selector, and one output; a decoder with one input, a selector, and outputs 0 to 31.)
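The two blocks are mirror images of each other, which a short behavioral sketch makes visible. The function names are illustrative assumptions; the 5-bit selector is modeled as an integer from 0 to 31.

```python
def mux32(inputs: list[int], select: int) -> int:
    """32-to-1 multiplexer: the 5-bit selector picks which input reaches the output."""
    assert len(inputs) == 32
    return inputs[select & 0x1F]


def decoder32(value: int, select: int) -> list[int]:
    """1-to-32 decoder as described on the slide: the selector picks which output gets the input."""
    outputs = [0] * 32
    outputs[select & 0x1F] = value
    return outputs


inputs = [0] * 32
inputs[5] = 1
print(mux32(inputs, 5))            # input 5 is routed to the output
print(decoder32(1, 5).index(1))    # the input is routed to output 5
```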

33 Chapter 4: ISA 33 Multiplexer 0 31 Input words Selector Output …………… Side View End View Word 0 Word 31 Bit 0Bit 31 …………… End View Bit 0Bit 31 Now, each of the 32 inputs has 32 bits. There are 32 x 32 bits in and 1 x 32 bits out. This multiplexer is equivalent to 32 of those on the previous page Each of these input words COULD be a register! Basic Building Blocks Detour looking at Hardware 4.9 Instruction Processing

34 Chapter 4: ISA 34 Multiplexer 0 31 Input words Selector Output Basic Building Blocks Detour looking at Hardware 4.9 Instruction Processing Write Enable Clk busW 32 busA 32 busB 555 RWRARB bit Registers So this register file is just the multiplexer shown here.

35 Chapter 4: ISA 35 4.9 Instruction Processing Detour looking at Hardware Basic Building Blocks Storage Element: Idealized Memory Memory (idealized): –One input bus: Data In –One output bus: Data Out A memory word is selected by: –Address selects the word to put on Data Out –Write Enable = 1: Address selects the memory word to be written via the Data In bus Clock input (CLK): –The CLK input is a factor ONLY during a write operation –During a read operation, memory behaves as a combinational logic block: Address valid ⇒ Data Out valid after access time. (Figure: memory block with Address, 32-bit Data In, Write Enable, Clk, and 32-bit Data Out.)

36 Chapter 4: ISA Instruction Processing 1.Detour looking at Instruction Sets –How are instructions laid out so they can be simply decoded? –Going from PIC to MIPS 2. Detour looking at Hardware –Multiplexers, registers, and ALUs 3.The datapath –Looking at each stage –The datapath and some sample instructions.

37 Chapter 4: ISA Instruction Processing The fetch-decode-execute-store cycle is the series of steps that a computer carries out when it runs a program. We first have to fetch an instruction from memory and place it into the Instruction Register (IR). Once in the IR, it is decoded to determine what needs to be done next. If an immediate operand is involved in the operation, it is retrieved and prepared for execution. With everything in place, the instruction is executed. If a result is to be stored in memory, that's done next. If a result is placed in a register, that's the last stage. The next slide shows a flowchart of this process.

38 Chapter 4: ISA 38 Generic Steps: Datapath PC instruction memory +4 rt rs rd registers ALU Data memory imm 1. Instruction Fetch 2. Decode/ Register Read 3. Execute4. Memory 5. Reg. Write 4.9 Instruction Processing

39 Chapter 4: ISA 39 Stages of the Datapath (1/6) Problem: a single, atomic block which executes an instruction (performs all necessary operations beginning with fetching the instruction) would be too bulky and inefficient Solution: break up the process of executing an instruction into stages, and then connect the stages to create the whole datapath Smaller stages are easier to design Easy to optimize (change) one stage without touching the others 4.9 Instruction Processing PC instruction memory +4 rt rs rd registers ALU Data memory imm 1. Instruction Fetch 2. Decode/ Register Read 3. Execute4. Memory 5. Reg. Write

40 Chapter 4: ISA 40 Stages of the Datapath (2/6) There is a wide variety of MIPS instructions: so what general steps do they have in common? Stage 1: instruction fetch No matter what the instruction, the 32-bit instruction word must first be fetched from memory (the cache- memory hierarchy) Also, this is where we increment PC (that is, PC = PC + 4, to point to the next instruction: byte addressing so + 4) 4.9 Instruction Processing PC instruction memory +4 rt rs rd registers ALU Data memory imm 1. Instruction Fetch 2. Decode/ Register Read 3. Execute4. Memory 5. Reg. Write

41 Chapter 4: ISA 41 Stages of the Datapath (3/6) Stage 2: Instruction Decode upon fetching the instruction, we next gather data from the fields (decode all necessary instruction data) first, read the Opcode to determine instruction type and field lengths second, read in data from all necessary registers - for add, read two registers - for addi, read one register - for jal, no reads necessary 4.9 Instruction Processing PC instruction memory +4 rt rs rd registers ALU Data memory imm 1. Instruction Fetch 2. Decode/ Register Read 3. Execute4. Memory 5. Reg. Write

42 Chapter 4: ISA 42 Stages of the Datapath (4/6) ° Stage 3: ALU (Arithmetic-Logic Unit) the real work of most instructions is done here: arithmetic (+, -, *, /), shifting, logic (&, |), comparisons (slt) what about loads and stores? - lw $t0, 40($t1) - the address we are accessing in memory = the value in $t1 + the value 40 - so we do this addition in this stage 4.9 Instruction Processing PC instruction memory +4 rt rs rd registers ALU Data memory imm 1. Instruction Fetch 2. Decode/ Register Read 3. Execute4. Memory 5. Reg. Write

43 Chapter 4: ISA 43 Stages of the Datapath (5/6) ° Stage 4: Memory Access actually only the load and store instructions do anything during this stage; the others remain idle since these instructions have a unique step, we need this extra stage to account for them as a result of the cache system, this stage is expected to be just as fast (on average) as the others 4.9 Instruction Processing PC instruction memory +4 rt rs rd registers ALU Data memory imm 1. Instruction Fetch 2. Decode/ Register Read 3. Execute4. Memory 5. Reg. Write

44 Chapter 4: ISA 44 4.9 Instruction Processing Stages of the Datapath (6/6) ° Stage 5: Register Write most instructions write the result of some computation into a register examples: arithmetic, logical, shifts, loads, slt what about stores, branches, jumps? - they don't write anything into a register at the end - these remain idle during this fifth stage (Figure: five-stage datapath: 1. Instruction Fetch, 2. Decode / Register Read, 3. Execute, 4. Memory, 5. Reg. Write.)

45 Chapter 4: ISA 45 Generic Steps: Datapath PC instruction memory +4 rt rs rd registers ALU Data memory imm 1. Instruction Fetch 2. Decode/ Register Read 3. Execute4. Memory 5. Reg. Write 4.9 Instruction Processing

46 Chapter 4: ISA Instruction Processing 1.Detour looking at Instruction Sets –How are instructions laid out so they can be simply decoded? –Going from PIC to MIPS 2. Detour looking at Hardware –Multiplexers, registers, and ALUs 3.The datapath –Looking at each stage –The datapath and some sample instructions.

47 Chapter 4: ISA 47 4.9 Instruction Processing Datapath Walkthrough #1 - add add $r3, $r1, $r2 # r3 = r1 + r2 Stage 1: fetch this instruction, increment PC; Stage 2: decode to find it's an add, read registers $r1 and $r2; Stage 3: add the two values retrieved in Stage 2; Stage 4: idle (nothing to write to memory); Stage 5: write the result of Stage 3 into register $r3.

48 Chapter 4: ISA 48 PC instruction memory +4 registers ALU Data memory imm add r3, r1, r2 reg[1]+reg[2] reg[2] reg[1] Datapath Walkthrough #1 add $r3, $r1, $r2# r3 = r1+r2 4.9 Instruction Processing

49 Chapter 4: ISA 49 4.9 Instruction Processing Datapath Walkthrough #2 - sw sw $r3, 17($r1) Stage 1: fetch this instruction, increment PC; Stage 2: decode to find it's a sw, then read registers $r1 and $r3; Stage 3: add 17 to the value in register $r1 (retrieved in Stage 2); Stage 4: write the value in register $r3 (retrieved in Stage 2) into the memory address computed in Stage 3; Stage 5: idle (nothing to write into a register).

50 Chapter 4: ISA 50 PC instruction memory +4 registers ALU Data memory imm 3 1 x SW r3, 17(r1) reg[1] reg[1] MEM[r1+17]<=r3 reg[3] Datapath Walkthroughs #2 sw $r3, 17($r1) 4.9 Instruction Processing

51 Chapter 4: ISA 51 4.9 Instruction Processing Datapath Walkthrough #3 - lw lw $r3, 17($r1) Stage 1: fetch this instruction, increment PC; Stage 2: decode to find it's a lw, then read register $r1; Stage 3: add 17 to the value in register $r1 (retrieved in Stage 2); Stage 4: read the value from the memory address computed in Stage 3; Stage 5: write the value found in Stage 4 into register $r3. NOTE: This is the one instruction that requires all 5 stages.
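The three walkthroughs (add, sw, lw) can be replayed with a toy model that records what each of the five stages does. This is a sketch under simplifying assumptions: instructions are passed in as already-decoded Python tuples, registers and memory are plain dictionaries, and the function name `run_instruction` is invented for the example.

```python
def run_instruction(instr: tuple, regs: dict, mem: dict) -> list[str]:
    """Step one instruction through the five stages and return a per-stage trace."""
    op = instr[0]
    trace = ["1 fetch instruction, increment PC"]
    if op == "add":                        # ("add", rd, rs, rt)
        _, rd, rs, rt = instr
        a, b = regs[rs], regs[rt]          # Stage 2: read two registers
        trace.append("2 decode add, read rs and rt")
        result = a + b                     # Stage 3: ALU adds
        trace.append("3 ALU add")
        trace.append("4 idle")             # Stage 4: no memory access
        regs[rd] = result                  # Stage 5: register write
        trace.append("5 write rd")
    elif op == "sw":                       # ("sw", rt, imm, rs) for sw rt, imm(rs)
        _, rt, imm, rs = instr
        base, value = regs[rs], regs[rt]   # Stage 2: read base and data registers
        trace.append("2 decode sw, read rs and rt")
        address = base + imm               # Stage 3: ALU computes the address
        trace.append("3 ALU computes address")
        mem[address] = value               # Stage 4: memory write
        trace.append("4 write memory")
        trace.append("5 idle")             # Stage 5: nothing to write back
    elif op == "lw":                       # ("lw", rt, imm, rs) for lw rt, imm(rs)
        _, rt, imm, rs = instr
        base = regs[rs]                    # Stage 2: read base register
        trace.append("2 decode lw, read rs")
        address = base + imm               # Stage 3: ALU computes the address
        trace.append("3 ALU computes address")
        value = mem[address]               # Stage 4: memory read
        trace.append("4 read memory")
        regs[rt] = value                   # Stage 5: register write
        trace.append("5 write rt")
    else:
        raise ValueError(f"unsupported instruction: {op}")
    return trace


regs = {1: 100, 2: 5, 3: 0}
mem = {}
for line in run_instruction(("add", 3, 1, 2), regs, mem):
    print(line)
run_instruction(("sw", 3, 17, 1), regs, mem)
print(regs, mem)
```

Only the lw branch does real work in all five stages; add idles in stage 4 and sw idles in stage 5, matching the walkthroughs.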

52 Chapter 4: ISA 52 PC instruction memory +4 registers ALU Data memory imm 3 1 x LW r3, 17(r1) reg[1] reg[1] MEM[r1+17] Datapath Walkthroughs #3 lw $r3, 17($r1) 4.9 Instruction Processing

53 Chapter 4: ISA 53 Datapath Summary °The datapath based on data transfers required to perform instructions °A controller causes the right transfers to happen PC instruction memory +4 rt rs rd registers ALU Data memory imm Controller opcode, funct

54 Chapter 4: ISA Decoding & Control 1.An overview of how control works from instructions to electronics –How are instructions laid out so they can be simply decoded? –Going from PIC to MIPS 2. Control operations –Samples of load, store and branch 3.The fetch unit in detail using add as example. 4.Operation of controls using or immediate, store, branch

55 Chapter 4: ISA 55 4.13 Decoding Mapping Code onto the Datapath: How does it all work? We Start With Code
    #############################################################
    # This is code that mimics the following C program.
    # main( )
    # {
    #     printf( "Hello World\n" );
    # }
    #############################################################
            .text
            .globl main
    main:
            la      $a0, hello
            ori     $v0, $0, 4          # li $v0, 4
            add     $t0, $t1, $t2
            syscall
            jr      $ra
            .data
    hello:  .asciiz "Hello World\n"

56 Chapter 4: ISA 56 4.13 Decoding From that code we get a 32-bit equivalent binary representation.
    Address     Hex op-code   Mnemonic
    [0x ]       0x3c          lui $2, 4097        ; 12: la $a0, hello
    [0x ]       0x            ori $4, $2, 0       ;
    [0x ]       0x            ori $2, $0, 4       ; 14: ori $v0, $0, 4
    [0x c]      0x012a4020    add $8, $9, $10     ; 14: add $t0, $t1, $t2
    [0x ]       0x c          syscall             ; 15: syscall
    [0x ]       0x03e00008    jr $31              ; 16: jr $ra

57 Chapter 4: ISA 57 What the hardware looks like. Registers R0 R8 R31 MUX To ALU MUX From ALU MUX To ALU Op A B Out Select 5 wires 6 wires ovf c 4.13 Decoding

58 Chapter 4: ISA 58 What the hardware looks like. Regist ers R0 R8 R31 MUX To ALU MUX From ALU MUX To ALU A B Out Op 4.13 Decoding

59 Chapter 4: ISA 59 The hardware reads each of those fields. Registers R0 R8 R31 MUX To ALU MUX From ALU MUX To ALU Out A B 4.13 Decoding

60 Chapter 4: ISA 60 An Overview of the Implementation Data Out Clk 5 RwRaRb bit Registers Rd ALU Clk Data In Data Address Ideal Data Memory Instruction Address Ideal Instruction Memory Clk PC 5 Rs 5 Rt 32 A B Next Address Control Datapath Control Signals Conditions 4.13 Decoding & Control

61 Chapter 4: ISA Decoding & Control 1.An overview of how control works from instructions to electronics –How are instructions laid out so they can be simply decoded? –Going from PIC to MIPS 2. Control operations –Samples of load, store and branch 3.The fetch unit in detail using add as example. 4.Operation of controls using or immediate, store, branch

62 Chapter 4: ISA 62 4.13 Decoding & Control Control Samples Overview of the Instruction Fetch Unit The common operations: –Fetch the instruction: mem[PC] –Update the program counter: Sequential code: PC ← PC + 4 Branch and jump: PC ← something else (Figure: PC register clocked by Clk, next-address logic, and instruction memory producing the 32-bit instruction word.)

63 Chapter 4: ISA 63 4.13 Decoding & Control Control Samples Add & Subtract R[rd] ← R[rs] op R[rt]; Example: addu rd, rs, rt –Ra, Rb, and Rw come from the instruction's rs, rt, and rd fields –ALUctr and RegWr: control logic after decoding the instruction (Figure: R-type format op, rs, rt, rd, shamt, funct; register file with RegWr, busW, busA, and busB feeding the ALU, which is controlled by ALUctr.)

64 Chapter 4: ISA 64 4.13 Decoding & Control Control Samples Logical Operations With Immediate R[rt] ← R[rs] op ZeroExt[imm16] (Figure: I-type format op, rs, rt, 16-bit immediate; the datapath adds a RegDst mux that selects Rt or Rd as the write register, a ZeroExt unit for imm16, and an ALUSrc mux on the second ALU input.)

65 Chapter 4: ISA 65 4.13 Decoding & Control Control Samples Load Operations R[rt] ← Mem[R[rs] + SignExt[imm16]]; Example: lw rt, rs, imm16 (Figure: I-type format op, rs, rt, 16-bit immediate; the datapath adds an Extender controlled by ExtOp, a data memory with WrEn driven by MemWr, and a W_Src mux that selects the ALU result or the memory data for busW.)

66 Chapter 4: ISA 66 4.13 Decoding & Control Control Samples Store Operations Mem[R[rs] + SignExt[imm16]] ← R[rt]; Example: sw rt, rs, imm16 (Figure: I-type format op, rs, rt, 16-bit immediate; same datapath as for load, with busB connected to the Data In port of the data memory and MemWr enabling the write.)

67 Chapter 4: ISA 67 4.13 Decoding & Control Control Samples The Branch Instruction beq rs, rt, imm16 –mem[PC]: Fetch the instruction from memory –Equal ← R[rs] == R[rt]: Calculate the branch condition –if (Equal): Calculate the next instruction's address: PC ← PC + 4 + (SignExt(imm16) × 4) –else: PC ← PC + 4 (I-type format: op, rs, rt, 16-bit immediate.)
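The next-PC calculation for beq can be sketched as follows, assuming the usual MIPS rule that the sign-extended 16-bit immediate counts instructions (words) relative to PC + 4. The function names are made up for illustration.

```python
def sign_extend16(imm16: int) -> int:
    """Interpret the low 16 bits as a two's-complement signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc_beq(pc: int, rs_value: int, rt_value: int, imm16: int) -> int:
    """Next PC for beq: branch target if the registers are equal, else PC + 4."""
    if rs_value == rt_value:                                   # Equal condition
        return (pc + 4 + sign_extend16(imm16) * 4) & 0xFFFFFFFF
    return (pc + 4) & 0xFFFFFFFF


print(hex(next_pc_beq(0x00400000, 7, 7, 3)))        # taken, 3 instructions forward
print(hex(next_pc_beq(0x00400000, 7, 8, 3)))        # not taken
print(hex(next_pc_beq(0x00400000, 7, 7, 0xFFFF)))   # taken, offset of -1 instruction
```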

68 Chapter 4: ISA 68 Datapath for Branch Operations beq rs, rt, imm16 Datapath generates condition (equal) oprsrtimmediate bits16 bits5 bits 32 imm16 PC Clk 00 Adder Mux Adder 4 nPC_sel Clk busW RegWr 32 busA 32 busB 555 RwRaRb bit Register s Rs Rt Equal? Cond PC Ext Inst Address 4.13 Decoding & Control Control Samples

69 Chapter 4: ISA 69 Summary: A Single Cycle Datapath imm16 32 ALUctr Clk busW RegWr 32 busA 32 busB 555 RwRaRb bit Registers Rs Rt Rd RegDst Extender Mux imm16 ALUSrc ExtOp Mux MemtoReg Clk Data In WrEn 32 Adr Data Memory MemWr ALU Equal Instruction Imm16RdRtRs = Adder PC Clk 00 Mux 4 nPC_sel PC Ext Adr Inst Memory 4.13 Decoding & Control Rs, Rt, Rd and Imed16 hardwired into datapath from Fetch Unit We have everything except control signals (the underlined pieces)

70 Chapter 4: ISA 70 4.13 Decoding & Control Summary: Meaning of the Control Signals ° ExtOp: zero, sign ° ALUsrc: 0 ⇒ regB; 1 ⇒ immed ° ALUctr: add, sub, or ° MemWr: 1 ⇒ write memory ° MemtoReg: 0 ⇒ ALU; 1 ⇒ Mem ° RegDst: 0 ⇒ rt; 1 ⇒ rd ° RegWr: 1 ⇒ write register
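One way to see how these signals fit together is to tabulate a plausible setting for each MIPS-lite instruction. The table below is not taken from the slides; it is derived here from the register-transfer descriptions of each instruction, with None standing for "don't care" and nPC_sel = 1 meaning "branch".

```python
# Control settings derived from the register-transfer behavior of each instruction.
# None marks a don't-care value. This table is an illustrative assumption.
CONTROL = {
    "addu": dict(RegDst=1, ALUSrc=0, ExtOp=None, ALUctr="add", MemWr=0, MemtoReg=0, RegWr=1, nPC_sel=0),
    "subu": dict(RegDst=1, ALUSrc=0, ExtOp=None, ALUctr="sub", MemWr=0, MemtoReg=0, RegWr=1, nPC_sel=0),
    "ori": dict(RegDst=0, ALUSrc=1, ExtOp="zero", ALUctr="or", MemWr=0, MemtoReg=0, RegWr=1, nPC_sel=0),
    "lw": dict(RegDst=0, ALUSrc=1, ExtOp="sign", ALUctr="add", MemWr=0, MemtoReg=1, RegWr=1, nPC_sel=0),
    "sw": dict(RegDst=None, ALUSrc=1, ExtOp="sign", ALUctr="add", MemWr=1, MemtoReg=None, RegWr=0, nPC_sel=0),
    "beq": dict(RegDst=None, ALUSrc=0, ExtOp=None, ALUctr="sub", MemWr=0, MemtoReg=None, RegWr=0, nPC_sel=1),
}


def writes_register(mnemonic: str) -> bool:
    """True if the instruction updates the register file (RegWr = 1)."""
    return CONTROL[mnemonic]["RegWr"] == 1


for name, signals in CONTROL.items():
    print(name, signals)
```

Note that sw and beq are the only entries with RegWr = 0, and sw is the only one with MemWr = 1, which matches the idle stages seen in the datapath walkthroughs.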

71 Chapter 4: ISA Decoding & Control 1.An overview of how control works from instructions to electronics –How are instructions laid out so they can be simply decoded? –Going from PIC to MIPS 2. Control operations –Samples of load, store and branch 3.The fetch unit in detail using add as example. 4.Operation of controls using or immediate, store, branch

72 Chapter 4: ISA 72 4.13 Decoding & Control The add Instruction add rd, rs, rt –mem[PC]: Fetch the instruction from memory –R[rd] ← R[rs] + R[rt]: The actual operation –PC ← PC + 4: Calculate the next instruction's address (R-type format: op, rs, rt, rd, shamt, funct.)

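73 Chapter 4: ISA 73 4.13 Decoding & Control The Fetch Unit nPC_sel: 0 ⇒ PC ← PC + 4; 1 ⇒ PC ← PC + 4 + SignExt(Im16) || 00 (Figure: fetch unit with the PC register, instruction memory, two adders, the PC extender for imm16, and a mux controlled by nPC_sel.)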

74 Chapter 4: ISA 74 4.13 Decoding & Control Fetch Unit at Beginning (and End) of add Fetch the instruction from instruction memory: Instruction ← mem[PC] (This is the same for all instructions.) But wait until we get to branch instructions! (Figure: fetch unit with the PC register, instruction memory, adders, and the nPC_sel mux.)

75 Chapter 4: ISA Decoding & Control 1.An overview of how control works from instructions to electronics –How are instructions laid out so they can be simply decoded? –Going from PIC to MIPS 2. Control operations –Samples of load, store and branch 3.The fetch unit in detail using add as example. 4.Operation of controls using or immediate, store, branch

76 Chapter 4: ISA 76 4.13 Decoding & Control The Single Cycle Datapath during Or Immediate R[rt] ← R[rs] or ZeroExt(Imm16) (I-type format: op, rs, rt, immediate.)

77 Chapter 4: ISA 77 4.13 Decoding & Control The Single Cycle Datapath during Or Immediate R[rt] ← R[rs] or ZeroExt(Imm16) (I-type format: op, rs, rt, immediate.)

78 Chapter 4: ISA 78 4.13 Decoding & Control The Single Cycle Datapath during Store Data Memory {R[rs] + SignExt[imm16]} ← R[rt] (I-type format: op, rs, rt, immediate.)

79 Chapter 4: ISA 79 4.13 Decoding & Control The Single Cycle Datapath during Store Data Memory {R[rs] + SignExt[imm16]} ← R[rt] (I-type format: op, rs, rt, immediate.)

80 Chapter 4: ISA 80 4.13 Decoding & Control The Single Cycle Datapath during Branch if (R[rs] – R[rt] == 0) then Zero ← 1; else Zero ← 0 (I-type format: op, rs, rt, immediate.)

81 Chapter 4: ISA 81 4.13 Decoding & Control The instruction fetch unit at end of branch if (Zero == 1) then PC = PC + 4 + SignExt(imm16) × 4; else PC = PC + 4 ° What is the encoding of nPC_sel? Direct MUX select? Branch / not branch? ° Let's choose the second option. (Figure: fetch unit in which nPC_sel and Zero together drive the select input of the next-PC mux.)

82 Chapter 4: ISA 82 Summary: A Single Cycle Processor 4.13 Decoding & Control

83 Chapter 4: ISA 83 4.13 Decoding & Control Drawback of this Single Cycle Processor Long cycle time: –Cycle time must be long enough for the load instruction: PC's Clock-to-Q + Instruction Memory Access Time + Register File Access Time + ALU Delay (address calculation) + Data Memory Access Time + Register File Setup Time + Clock Skew The cycle time for load is much longer than needed for all other instructions.
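To make the drawback concrete, the sketch below sums the load path using invented delay values. Every number here is a placeholder chosen for illustration (in picoseconds), not a measurement of any real design; only the list of components comes from the slide.

```python
# Hypothetical component delays in picoseconds (illustrative placeholders only).
LOAD_PATH_PS = {
    "pc_clock_to_q": 50,
    "instruction_memory_access": 400,
    "register_file_access": 200,
    "alu_delay": 300,
    "data_memory_access": 400,
    "register_file_setup": 50,
    "clock_skew": 50,
}

# An add skips the data memory, so its path is shorter.
ADD_PATH_PS = {k: v for k, v in LOAD_PATH_PS.items() if k != "data_memory_access"}


def path_delay_ps(path: dict) -> int:
    """Total delay along a path is the sum of its component delays."""
    return sum(path.values())


# In a single-cycle design the clock period must cover the slowest instruction.
cycle_time_ps = max(path_delay_ps(LOAD_PATH_PS), path_delay_ps(ADD_PATH_PS))
print(path_delay_ps(LOAD_PATH_PS), path_delay_ps(ADD_PATH_PS), cycle_time_ps)
```

Whatever the real delays are, every instruction pays for the load path, because the single clock period must cover the slowest instruction.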

84 Chapter 4: ISA Real World Architectures We will look at an Intel architecture, which is a CISC machine and MIPS, which is a RISC machine. –CISC is an acronym for complex instruction set computer. –RISC stands for reduced instruction set computer.

85 Chapter 4: ISA Real World Architectures The classic Intel architecture, the 8086, is a CISC architecture. It was adopted by IBM for its famed PC. The 8086 operated on 16-bit data words and supported 20-bit memory addresses. Later, to lower costs, the 8-bit 8088 was introduced. Like the 8086, it used 20-bit memory addresses. What was the largest memory that the 8086 could address?

86 Chapter 4: ISA Real World Architectures The 8086 had four 16-bit general-purpose registers that could be accessed by the half-word. It also had a flags register, an instruction register, and a stack accessed through the values in two other registers, the base pointer and the stack pointer. The 8086 had no built-in floating-point processing. In 1980, Intel released the 8087 numeric coprocessor, but few users elected to install it because of its cost.

87 Chapter 4: ISA Real World Architectures In 1985, Intel introduced the 32-bit 80386. It also had no built-in floating-point unit. The 80486, introduced in 1989, was an 80386 that had built-in floating-point processing and cache memory. The 80386 and 80486 offered downward compatibility with the 8086 and 8088. Software written for the smaller-word systems was directed to use the lower 16 bits of the 32-bit registers.

88 Chapter 4: ISA Real World Architectures Currently, Intel's most advanced 32-bit microprocessor is the Pentium 4. It can run as fast as 3.8 GHz. This clock rate is nearly 800 times faster than the 4.77 MHz of the 8086. Speed-enhancing features include multilevel cache and instruction pipelining. Intel, along with many others, is marrying many of the ideas of RISC architectures with microprocessors that are largely CISC.

89 Chapter 4: ISA Real World Architectures The MIPS family of CPUs has been one of the most successful in its class. In 1986 the first MIPS CPU was announced. It had a 32-bit word size and could address 4GB of memory. Over the years, MIPS processors have been used in general purpose computers as well as in games. The MIPS architecture now offers 32- and 64-bit versions.

90 Chapter 4: ISA Real World Architectures MIPS was one of the first RISC microprocessors. The original MIPS architecture had only 55 different instructions, as compared with the 8086 which had over 100. MIPS was designed with performance in mind: It is a load/store architecture, meaning that only the load and store instructions can access memory. The large number of registers in the MIPS architecture keeps bus traffic to a minimum. How does this design affect performance?

91 Chapter 4: ISA 91 Chapter 4 Conclusion The major components of a computer system are its control unit, registers, memory, ALU, and datapath. A built-in clock keeps everything synchronized. Control units can be microprogrammed or hardwired. Hardwired control units give better performance, while microprogrammed units are more adaptable to changes. Computers run programs through iterative fetch-decode-execute cycles. Computers can run programs that are in machine language. The Intel architecture is an example of a CISC architecture; MIPS is an example of a RISC architecture.

