1
Important Notice for Lecturers
This file is provided as an example only. Lecturers are expected to modify / enhance slides to suit their teaching style. Lecturers are expected to cover the topics presented in these slides. Lecturers can export slides to another format if it suits their teaching style (but must cover the topics indicated in the slides). This file should not be used AS PROVIDED – you should modify it to suit your own needs! This slide should be deleted from this presentation. Provided by the FIT1001 SIG.
3
Lecture 4 CPU Internal Bus Organisation
FIT1001- Computer Systems Lecture 4 CPU Internal Bus Organisation
4
Lecture 4: Learning Objectives
Identify that registers, ALU, control unit and buses are components of a simple central processing unit Describe how the components are joined by buses to form a datapath Understand that clock pulses are used to regulate the operational timing of buses and the components in the datapath Describe the fetch–decode–execute cycle and explain how this is used to perform instructions in a simple digital computer program Demonstrate the operation of a simple computer using a simulator (MARIE)
5
Internal CPU Organisation
6
Internal CPU Organisation
SG1 presented a general overview of computer systems In SG2 we discussed how data is stored and manipulated by various computer system components SG3 described the fundamental components of digital circuits Having this background, we can now understand how computer components work, and how they fit together to create useful computer systems
7
Internal CPU Organisation
All computers have a central processing unit (CPU). The CPU fetches, decodes, and executes program instructions. The two principal parts of the CPU are:
Datapath: consists of an arithmetic-logic unit and storage units (registers) that are interconnected by a data bus that is also connected to main memory.
Control unit: provides the signals according to which the various CPU components perform their sequenced operations.
8
Internal CPU Organisation - CPU
Registers: hold data that can be readily accessed by the CPU. They can be implemented using D flip-flops; a 32-bit register requires 32 D flip-flops.
The arithmetic-logic unit (ALU): carries out logical and arithmetic operations as directed by the control unit. Its operations often affect the status register (e.g., overflow, carries).
The control unit (CU): acts as a policeman or traffic manager. It determines which actions to carry out according to the values in a program counter register and a status register.
9
Internal CPU Organisation - CPU
The CPU shares data with other system components by way of a data bus: a set of wires that acts as a shared, common data path connecting multiple subsystems within the system. A bus consists of multiple lines, which allow parallel movement of bits. At any one time, only one device may use the bus, which can result in a bottleneck. The speed of the bus is affected by its length and by the number of devices sharing it. Devices can be divided into two categories:
Master: a device that initiates an action.
Slave: a device that responds to an action.
10
Internal CPU Organisation - Buses
Two types of buses are commonly found:
Point-to-point: a bus that connects two specific components.
Multipoint: a bus shared by several devices. Because a multipoint bus is a shared resource, access to it is controlled through protocols, which are built into the hardware.
11
Internal CPU Organisation - Buses
Buses consist of: Data lines: convey bits from one device to another Control lines: determine the direction of data flow and when each device can access the bus Address lines: determine the location of the source or destination of the data
12
Internal CPU Organisation - Buses
Typical bus transactions: sending an address (for a read or write); transferring data from memory to a register; transferring data from a register to memory; I/O reads and writes from peripheral devices. Computers have a hierarchy of buses, and it is not uncommon to have two or more buses in the same system. In PCs, the main internal bus is called the system bus and connects the CPU, memory and all other internal components. External buses connect external devices, peripherals, expansion slots and I/O ports.
13
Internal CPU Organisation - Buses
Local buses connect a peripheral device directly to the CPU. They are high speed, but only a limited number of similar devices can be connected. To use a bus, a device must reserve it, and only a bus master can do so.
Bus master: a device that is allowed to initiate a transfer.
Bus slave: a device that responds to a bus master request.
E.g., a basic system may only allow the CPU to be bus master. In cases where more than one device can be the bus master, concurrent bus master requests must be arbitrated. Arbitration schemes must give priority to certain master devices while ensuring that lower-priority devices are not starved.
14
Internal CPU Organisation - Buses
Arbitration schemes fall into four categories: Daisy chain: Permissions are passed from the highest-priority device to the lowest. Centralized parallel: Each device is directly connected to an arbitration circuit. Distributed using self-detection: Devices decide which gets the bus among themselves. Distributed using collision-detection: Any device can try to use the bus. If its data collides with the data of another device, it tries again.
15
Internal CPU Organisation - Clocks
Every computer contains at least one clock that synchronizes the activities of its components. A fixed number of clock cycles are required to carry out each data movement or computational operation. The clock frequency, measured in megahertz (MHz) or gigahertz (GHz), determines the speed of all operations. Clock cycle time is the reciprocal of clock frequency: an 800 MHz clock has a cycle time of 1/800,000,000 s = 1.25 × 10^-9 s = 1.25 ns. Most machines are synchronous: they are controlled by a master clock signal, and a register must wait for the clock to tick before loading new data.
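The reciprocal relationship between clock frequency and cycle time can be checked with a few lines of Python. This is only an illustrative sketch; the function name `cycle_time_ns` is made up for this example.

```python
def cycle_time_ns(frequency_hz: float) -> float:
    """Return the clock cycle time in nanoseconds (reciprocal of the frequency)."""
    # 1 second = 1e9 nanoseconds, so cycle time in ns = 1e9 / frequency in Hz.
    return 1e9 / frequency_hz


# The 800 MHz clock from the slide.
print(cycle_time_ns(800e6), "ns")
```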
16
Internal CPU Organisation - Clocks
Clock speed should not be confused with CPU performance. The CPU time required to run a program is given by the general performance equation: CPU time = seconds / program = (instructions / program) × (average cycles / instruction) × (seconds / cycle). We see that we can improve CPU throughput when we reduce the number of instructions in a program, reduce the number of cycles per instruction, or reduce the number of nanoseconds per clock cycle.
17
Internal CPU Organisation - Clocks
E.g., two machines with the same clock speed may not execute instructions in the same number of cycles. A multiply operation takes 20 clock cycles on an Intel 286 but 1 clock cycle on an Intel Pentium. This implies that the Pentium is 20 times faster at multiplication (if both used the same internal clock). In general: multiplication requires more time than addition; floating-point operations require more cycles than integer operations; accessing memory takes longer than accessing registers. Generally the term clock refers to the master / system clock. Buses can have their own clocks, which are usually slower.
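As a rough illustration of the performance equation applied to the multiply example, the sketch below compares two machines that differ only in cycles per instruction. The instruction count and the 100 MHz clock are hypothetical values chosen for the example, not figures from the lecture.

```python
def cpu_time_seconds(instructions: int, cycles_per_instruction: float,
                     seconds_per_cycle: float) -> float:
    """CPU time = instructions x cycles per instruction x seconds per cycle."""
    return instructions * cycles_per_instruction * seconds_per_cycle


# Hypothetical workload: one million multiplies on a 100 MHz clock (assumed values).
multiplies = 1_000_000
seconds_per_cycle = 1 / 100e6

time_286 = cpu_time_seconds(multiplies, 20, seconds_per_cycle)     # 20 cycles per multiply
time_pentium = cpu_time_seconds(multiplies, 1, seconds_per_cycle)  # 1 cycle per multiply

speedup = time_286 / time_pentium
print(f"286: {time_286:.3f} s, Pentium: {time_pentium:.3f} s, speedup: {speedup:.0f}x")
```

With the same clock and the same instruction count, the ratio of CPU times is just the ratio of cycles per instruction.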
18
Internal CPU Organisation – I/O Subsystem
A computer communicates with the outside world through its input/output (I/O) subsystem I/O devices connect to the CPU through various interfaces Input devices: keyboards, mice, scanners, touch screens etc Output devices: monitors, printers, speakers etc These devices are not connected to the CPU directly Interface for data transfers Converts the system bus signals to and from formats acceptable to the given device CPU communicates via input/output registers Two methods: memory-mapped I/O / instruction-based I/O
19
Internal CPU Organisation – I/O Subsystem
Memory-mapped: the I/O device behaves like main memory from the CPU’s point of view. It is fast, but uses memory space.
Instruction-based: the CPU has specialised instructions that perform the input / output. This requires specific I/O instructions, which can only be used by the CPU.
20
Internal CPU Organisation - Memory
Memory consists of a linear array of addressable storage cells that are similar to registers. A memory address is always represented by an unsigned integer. Memory can be byte-addressable or word-addressable.
Byte-addressable: each byte has a unique address.
Word-addressable: each word (a series of bytes, e.g., 2 bytes) has a unique address. The byte with the lowest address determines the word address.
21
Internal CPU Organisation - Memory
Memory is constructed of RAM chips, referred to in terms of length × width. E.g., if the memory word size of a machine is 16 bits, a 4M × 16 RAM chip gives us 4M 16-bit memory locations. 4M = 2^2 × 2^20 = 2^22 = 4,194,304 unique locations (each word is 16 bits). Memory locations range from 0 to 4,194,303 as unsigned integers. 2^N addressable units of memory require N bits to address each unit. Thus, the memory bus of this system requires at least 22 address lines. The address lines “count” from 0 to 2^22 − 1 in binary. Each line is either “on” or “off”, indicating the location of the desired memory element.
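The number of address lines needed for a given number of addressable units can be computed directly. The helper below is a small sketch (the name `address_lines` is hypothetical) that returns the smallest N with 2^N at least equal to the number of locations.

```python
def address_lines(locations: int) -> int:
    """Smallest N such that 2**N >= locations (bits needed to address every unit)."""
    if locations < 1:
        raise ValueError("a memory needs at least one location")
    # The highest address is locations - 1; its bit length is the number of lines.
    return (locations - 1).bit_length()


print(address_lines(4 * 2**20))  # the 4M x 16 chip from the slide
print(address_lines(4 * 2**10))  # a 4K-word memory such as MARIE's
```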
22
Internal CPU Organisation - Memory
Physical memory usually consists of more than one RAM chip. Chips are combined into a single memory module. E.g., you want to build a 32K × 16 memory and all you have is 2K × 8 RAM chips: each row addresses 2K words but requires two chips to handle the full width of 16 bits, and 32K / 2K = 16 rows are needed, for 32 chips in total.
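The chip-count arithmetic generalises to any module and chip size. The following sketch assumes chips are simply tiled in rows (to cover the address range) and columns (to cover the word width); the function name is invented for illustration.

```python
def chips_needed(module_words: int, module_width: int,
                 chip_words: int, chip_width: int) -> tuple[int, int, int]:
    """Return (rows, chips_per_row, total_chips) for building a memory module."""
    rows = -(-module_words // chip_words)           # ceiling division: address range
    chips_per_row = -(-module_width // chip_width)  # ceiling division: word width
    return rows, chips_per_row, rows * chips_per_row


K = 1024
rows, per_row, total = chips_needed(32 * K, 16, 2 * K, 8)
print(rows, per_row, total)
```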
23
Internal CPU Organisation - Memory
A single shared memory module causes sequentialization of access Memory interleaving helps this by splitting memory across multiple memory modules Low-order interleaving: the low order bits of the address specify which memory bank contains the address of interest Places consecutive words of memory in different memory modules Eg., 32 addresses:
24
Internal CPU Organisation - Memory
High-order interleaving: the high order address bits specify which memory bank contains the address of interest Distributes the addresses so each module contains consecutive addresses Eg., 32 addresses:
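The two interleaving schemes differ only in which address bits select the module. The sketch below maps each of 32 addresses to a (module, offset) pair under both schemes; the choice of 8 modules of 4 words each is an assumption made for this example, since the slide's diagram is not reproduced here.

```python
TOTAL_WORDS = 32
NUM_MODULES = 8  # assumed for illustration: 8 modules of 4 words each
WORDS_PER_MODULE = TOTAL_WORDS // NUM_MODULES


def low_order(address: int) -> tuple[int, int]:
    """Low-order bits pick the module, so consecutive addresses hit different modules."""
    return address % NUM_MODULES, address // NUM_MODULES


def high_order(address: int) -> tuple[int, int]:
    """High-order bits pick the module, so each module holds consecutive addresses."""
    return address // WORDS_PER_MODULE, address % WORDS_PER_MODULE


for address in range(8, 12):
    print(address, "low-order:", low_order(address), "high-order:", high_order(address))
```

With low-order interleaving, addresses 8 to 11 land in four different modules; with high-order interleaving they all land in the same module.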
25
Internal CPU Organisation - Interrupts
The normal execution of a program is altered when an event of higher priority occurs. The CPU is alerted to such an event through an interrupt. Interrupts can be triggered by: I/O requests; arithmetic errors (such as division by zero); an invalid instruction being encountered. Each interrupt is associated with a procedure that directs the actions of the CPU when the interrupt occurs. Interrupts can be initiated by the user or the system. Interrupts can be of two types:
Maskable: can be disabled or ignored.
Non-maskable: high-priority interrupts that cannot be ignored.
26
MARIE
27
MARIE - Introduction We can now bring together many of the ideas that we have discussed to this point using a very simple model computer Our model computer, the Machine Architecture that is Really Intuitive and Easy (MARIE) was designed to illustrate basic computer system concepts While this system is too simple to do anything useful in the real world, a deep understanding of its functions will enable you to comprehend system architectures that are much more complex
28
MARIE - Introduction The MARIE architecture has the following characteristics: Binary, two's complement data representation Stored program, fixed word length data and instructions 4K words of word-addressable main memory 16-bit data words 16-bit instructions, 4 for the opcode and 12 for the address A 16-bit arithmetic logic unit (ALU) Seven registers for control and data movement
29
MARIE - Introduction
MARIE has seven registers:
Accumulator (AC): a 16-bit general-purpose register that holds data for the CPU to process.
Memory address register (MAR): a 12-bit register that holds the memory address of the data being referenced.
Memory buffer register (MBR): a 16-bit register that holds data after its retrieval from, or before its placement in, memory.
Program counter (PC): a 12-bit register that holds the memory address of the next program instruction to be executed.
30
MARIE - Introduction
Instruction register (IR): holds a copy of the next instruction to be executed, placed there immediately before its execution.
Input register (InREG): an 8-bit register that holds data that was read from an input device.
Output register (OutREG): an 8-bit register that holds data that is ready for writing to the output device.
Hidden register: a status flag register is present but is not displayed.
31
MARIE - Introduction The MARIE architecture:
32
MARIE - Introduction Bus connections
The registers are interconnected, and connected with main memory, through a common data bus. Each device on the bus is identified by a unique number that is set on the control lines whenever that device is required to carry out an operation. Separate connections are also provided between the AC and the MBR, and between the ALU and both the AC and the MBR. This permits data transfer between these devices without use of the main data bus.
33
MARIE - ISA
The Instruction Set Architecture (ISA) specifies the format of a machine's instructions and the primitive operations that the machine can perform. The ISA is an interface between a computer’s hardware and its software. Some ISAs include hundreds of different instructions for processing data and controlling program execution. The MARIE ISA consists of only thirteen instructions. Format of a MARIE instruction: a 4-bit opcode (bits 15 to 12) followed by a 12-bit address (bits 11 to 0).
34
MARIE - ISA ISAs generally consist of instructions for:
Processing / moving data. Controlling the execution sequence of a program. The fundamental MARIE instructions are:
35
MARIE - ISA
Bit pattern for a LOAD instruction as it would appear in the IR: we see that the opcode is 1, which is LOAD. The memory address from which to load the data is 3. Each of the instructions actually consists of a sequence of smaller instructions called microoperations. E.g., the LOAD instruction at a component level:
1. The address from the instruction is loaded into the MAR
2. Data in memory at the specified location is loaded into the MBR
3. The MBR is loaded into the AC
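Splitting a 16-bit MARIE instruction word into its 4-bit opcode and 12-bit address takes a shift and a mask. The sketch below decodes the word 0x1003, which corresponds to the LOAD example described above (opcode 1, address 3); the function name `decode` is just for illustration.

```python
def decode(word: int) -> tuple[int, int]:
    """Split a 16-bit MARIE instruction into (opcode, address)."""
    opcode = (word >> 12) & 0xF   # top 4 bits
    address = word & 0x0FFF       # low 12 bits
    return opcode, address


opcode, address = decode(0x1003)
print(f"word={0x1003:016b} opcode={opcode} address={address}")
```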
36
MARIE - ISA
The exact sequence of microoperations can be specified using register transfer language (RTL). In the MARIE RTL, M[X] indicates the actual data value stored in memory location X, and ← indicates the transfer of bytes to a register or memory location. Transfers from one register to another always involve a transfer onto and off the bus; these steps are omitted for clarity, but you need to be aware of them. The RTL for the LOAD instruction is:
MAR ← X
MBR ← M[MAR], AC ← MBR
37
MARIE - ISA
The RTL for the ADD instruction is:
MAR ← X
MBR ← M[MAR]
AC ← AC + MBR
38
MARIE - ISA
The SKIPCOND instruction skips the next instruction depending on the value of the AC. The RTL for this instruction is the most complex in the instruction set:
If IR[11 - 10] = 00 then
  If AC < 0 then PC ← PC + 1
else If IR[11 - 10] = 01 then
  If AC = 0 then PC ← PC + 1
else If IR[11 - 10] = 10 then
  If AC > 0 then PC ← PC + 1
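The SKIPCOND logic can be sketched as a function of the instruction word, the accumulator and the program counter. This sketch assumes the condition encoding used by the MARIE simulator, where the operands 000, 400 and 800 (hex) select "AC < 0", "AC = 0" and "AC > 0" respectively; the AC is passed as an ordinary signed Python integer for simplicity.

```python
def skipcond(ir: int, ac: int, pc: int) -> int:
    """Return the new PC after a SKIPCOND; ac is treated as a signed value."""
    condition = (ir >> 10) & 0b11  # bits 11-10 of the instruction
    skip = (
        (condition == 0b00 and ac < 0)
        or (condition == 0b01 and ac == 0)
        or (condition == 0b10 and ac > 0)
    )
    # PC is a 12-bit register, so wrap the increment.
    return (pc + 1) & 0xFFF if skip else pc


# SKIPCOND 000 with a negative AC skips the next instruction.
print(hex(skipcond(0x8000, -1, 0x110)))
```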
39
Instruction Processing
40
Instruction Processing
The fetch-decode-execute cycle is the series of steps that a computer carries out when it runs a program. The cycle repeats until all the code is processed. Steps:
1. Fetch an instruction from memory and place it into the IR
2. Once in the IR, it is decoded to determine what needs to be done next
3. If a memory value (operand) is involved in the operation, it is retrieved and placed into the MBR
4. With everything in place, the instruction is executed
41
Instruction Processing
42
Instruction Processing
Interrupts and I/O. Recall that MARIE has two registers to handle input / output. The timing used by these registers is important: each character must be loaded into the input register, and if the processor is fast and the input is slow, the same character may be read multiple times. This is solved by interrupt-driven I/O. When the CPU executes an input / output instruction, the relevant I/O device is notified and the CPU continues with other work. The I/O device then sends an interrupt to the CPU, which processes it and returns to the fetch-decode-execute cycle. This process requires:
1. An interrupt to indicate the I/O is complete
2. A means for the CPU to detour from the fetch-decode-execute cycle
43
Instruction Processing
Many computers check at the start of a cycle for a pending interrupt The I/O device will send an interrupt to the status register CPU will execute an interrupt routine that is determined by the type of interrupt External interrupt such as Ctrl-C Internal interrupts such as overflow / divide by zero Software interrupts
44
Instruction Processing
Processing an interrupt: the interrupt routine is processed via the fetch-decode-execute cycle. Once the interrupt routine is complete, the CPU switches back to the program it was running before. The CPU must return to the exact point in the previous program; therefore the contents of registers and status conditions must be saved.
45
Instruction Processing - A Simple Program
Consider the simple MARIE program given below A set of mnemonic instructions stored at addresses (hex):
46
Instruction Processing - A Simple Program
Let’s look at what happens inside the computer when the program runs This is the LOAD 104 instruction:
47
Instruction Processing - A Simple Program
The second instruction is ADD 105:
48
Instruction Processing - A Simple Program
The third instruction is STORE 106. The fourth instruction is a HALT. The contents of location 106 change to 000C in hexadecimal (12 in decimal).
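The walk-through above can be reproduced with a very small fetch-decode-execute loop. The sketch below is not the MARIE simulator: it models only the four instructions used here, takes the opcodes (LOAD = 1, STORE = 2, ADD = 3, HALT = 7) from the standard MARIE instruction set, and assumes the data values 0023 and FFE9 at locations 104 and 105, chosen so that the sum is the 000C shown above.

```python
def run_marie(memory: dict[int, int], pc: int = 0x100) -> int:
    """Run until HALT and return the final accumulator (a 16-bit word)."""
    ac = 0
    while True:
        # Fetch: copy the instruction at PC into the IR, then advance the PC.
        mar = pc
        ir = memory[mar]
        pc = (pc + 1) & 0xFFF
        # Decode: top 4 bits are the opcode, low 12 bits the operand address.
        opcode = ir >> 12
        mar = ir & 0xFFF
        # Execute.
        if opcode == 0x1:      # LOAD X:  MBR <- M[MAR], AC <- MBR
            mbr = memory[mar]
            ac = mbr
        elif opcode == 0x2:    # STORE X: MBR <- AC, M[MAR] <- MBR
            mbr = ac
            memory[mar] = mbr
        elif opcode == 0x3:    # ADD X:   MBR <- M[MAR], AC <- AC + MBR
            mbr = memory[mar]
            ac = (ac + mbr) & 0xFFFF
        elif opcode == 0x7:    # HALT
            return ac
        else:
            raise ValueError(f"opcode {opcode:X} is not modelled in this sketch")


memory = {
    0x100: 0x1104,  # LOAD 104
    0x101: 0x3105,  # ADD 105
    0x102: 0x2106,  # STORE 106
    0x103: 0x7000,  # HALT
    0x104: 0x0023,  # assumed data value (35)
    0x105: 0xFFE9,  # assumed data value (-23 in two's complement)
    0x106: 0x0000,  # result goes here
}
final_ac = run_marie(memory)
print(f"{memory[0x106]:04X}")  # prints 000C
```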
49
Assemblers
50
Assemblers
Mnemonic instructions such as LOAD 104 are easy for humans to write and understand, but impossible for computers to understand. Assemblers translate instructions that are comprehensible to humans into the machine language that is comprehensible to computers: they convert assembly language (mnemonics) into machine language (strings of binary values). An assembler reads a source file (assembly) and produces an object file (machine code).
51
Assemblers
Applying names or labels to assembly: substituting alphanumeric names for opcodes makes programming easier. Labels can also be given to particular memory addresses. Labels are nice for programmers, but more work is now required by the assembler.
52
Assemblers Assemblers create an object program file from mnemonic source code in two passes Reads entire program twice First pass: the assembler builds a set of correspondences called a symbol table / begins to translate the instructions For our program the first pass creates: Symbol Table Instructions
53
Assemblers Second pass: the assembler uses the symbol table to fill in the addresses and create the corresponding machine language instruction It would know that X is located at address 104 For our program the second pass creates:
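A two-pass assembler can be sketched in a few dozen lines. The version below is a simplification, not the MARIE assembler: it parses the source once up front and then makes the two passes over the parsed lines, it supports only four mnemonics plus the DEC and HEX directives, and the source program, its labels X, Y and Z, and the origin address 100 (hex) are assumptions chosen so that X lands at address 104 as stated above.

```python
OPCODES = {"LOAD": 0x1, "STORE": 0x2, "ADD": 0x3, "HALT": 0x7}  # subset only


def assemble(source: str, origin: int = 0x100) -> tuple[dict[str, int], list[int]]:
    """Return (symbol_table, machine_words) for a tiny MARIE-like source text."""
    parsed = []
    for raw in source.splitlines():
        text = raw.split("/", 1)[0].strip()  # "/" starts a comment
        if not text:
            continue
        label = None
        if "," in text:                      # "X, DEC 35" style label
            label, text = (part.strip() for part in text.split(",", 1))
        fields = text.split()
        operand = fields[1] if len(fields) > 1 else None
        parsed.append((label, fields[0].upper(), operand))

    # First pass: record the address of every label in the symbol table.
    symbols = {label: origin + i for i, (label, _, _) in enumerate(parsed) if label}

    # Second pass: emit machine words, resolving labels through the symbol table.
    words = []
    for _, mnemonic, operand in parsed:
        if mnemonic == "DEC":
            words.append(int(operand, 10) & 0xFFFF)
        elif mnemonic == "HEX":
            words.append(int(operand, 16) & 0xFFFF)
        else:
            if operand is None:
                address = 0
            elif operand in symbols:
                address = symbols[operand]
            else:
                address = int(operand, 16)   # literal hex address
            words.append((OPCODES[mnemonic] << 12) | address)
    return symbols, words


SOURCE = """
        LOAD  X      / assumed example program
        ADD   Y
        STORE Z
        HALT
X,      DEC   35
Y,      DEC   -23
Z,      HEX   0000
"""
symbols, words = assemble(SOURCE)
print({name: f"{addr:03X}" for name, addr in symbols.items()})
print([f"{word:04X}" for word in words])
```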
54
Assemblers
Assembler directives are instructions to the assembler that are not converted into machine instructions. They assist in specifying values. In MARIE, DEC = decimal and HEX = hexadecimal. For our program we have included two directives, DEC and HEX, that specify the radix of the constants. Comment delimiter: in MARIE this is / (so all text after / is ignored by the assembler).
55
Extending Instruction Set
56
Extending Our Instruction Set
We add some more instructions to make programming in MARIE easier. Together with the 9 original instructions, we now have 13 in the MARIE instruction set.
57
Extending Our Instruction Set
Indirect addressing: so far we have only used direct addressing mode, which means that the address of the operand is explicitly stated in the instruction. It is often useful to employ indirect addressing, where the address of the address of the operand is given in the instruction. If you have ever used pointers in a program, you are already familiar with indirect addressing. Instead of using the value found at location X as the actual operand, use the value found in X as a pointer to a new memory location that contains the data we wish to use in the instruction. E.g., the instruction AddI 400 means go to location 400, and if the value there is, say, 240, go to location 240 and get the operand.
58
Extending Our Instruction Set
The following RTL tells us what is happening at the register level for AddI:
MAR ← X
MBR ← M[MAR]
MAR ← MBR
MBR ← M[MAR]
AC ← AC + MBR
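The double memory access of AddI can be traced step by step in Python. This sketch follows the AddI 400 example, where location 400 holds the pointer 240; the operand value 7 stored at 240 and the starting AC of 5 are made-up values for illustration.

```python
def addi(memory: dict[int, int], ac: int, x: int) -> int:
    """Return the new AC after AddI X, following the register transfers one by one."""
    mar = x                   # MAR <- X
    mbr = memory[mar]         # MBR <- M[MAR]   (fetch the pointer)
    mar = mbr & 0xFFF         # MAR <- MBR      (the MAR is only 12 bits wide)
    mbr = memory[mar]         # MBR <- M[MAR]   (fetch the actual operand)
    return (ac + mbr) & 0xFFFF  # AC <- AC + MBR


memory = {0x400: 0x240, 0x240: 7}  # 7 is a hypothetical operand value
print(addi(memory, ac=5, x=0x400))
```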
59
Extending Our Instruction Set
Another programming tool is the use of subroutines. The jump-and-store instruction, JNS, gives us limited subroutine functionality. The details of the JNS instruction are given by the following RTL:
MBR ← PC
MAR ← X
M[MAR] ← MBR
MBR ← X
AC ← 1
AC ← AC + MBR
PC ← AC
60
Extending Our Instruction Set
The CLEAR instruction: all it does is set the contents of the accumulator to all zeroes. RTL for CLEAR:
AC ← 0
61
Extending Our Instruction Set
Program: Loop to add 5 numbers
100 |      LOAD Addr
101 |      STORE Next
102 |      LOAD Num
103 |      SUBT One
104 |      STORE Ctr
105 |      CLEAR
106 |Loop  LOAD Sum
107 |      ADDI Next
108 |      STORE Sum
109 |      LOAD Next
10A |      ADD One
10B |      STORE Next
10C |      LOAD Ctr
10D |      SUBT One
10E |      STORE Ctr
10F |      SKIPCOND 000
110 |      JUMP Loop
111 |      HALT
112 |Addr  HEX 118
113 |Next  HEX 0
114 |Num   DEC 5
115 |Sum   DEC 0
116 |Ctr   HEX 0
117 |One   DEC 1
118 |      DEC 10
119 |      DEC 15
11A |      DEC 2
11B |      DEC 25
11C |      DEC 30
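For readers more comfortable with a high-level language, the loop program corresponds roughly to the Python below. It is a paraphrase of the listing rather than a simulation: the variables mirror the labels Next, Ctr and Sum, and the data values are those at addresses 118 to 11C as given in the listing.

```python
# Data words at addresses 118-11C, as listed in the program.
memory = {0x118: 10, 0x119: 15, 0x11A: 2, 0x11B: 25, 0x11C: 30}

next_addr = 0x118   # LOAD Addr / STORE Next: pointer to the first number
ctr = 5 - 1         # LOAD Num / SUBT One / STORE Ctr: loop counter
total = 0           # Sum

while True:
    total += memory[next_addr]  # LOAD Sum / ADDI Next / STORE Sum (indirect add)
    next_addr += 1              # LOAD Next / ADD One / STORE Next
    ctr -= 1                    # LOAD Ctr / SUBT One / STORE Ctr
    if ctr < 0:                 # SKIPCOND 000 skips the JUMP once AC < 0
        break                   # HALT

print(total)  # prints 82
```

The counter starts at Num - 1 and the loop exits only once it goes negative, so the body runs exactly five times.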
62
Decoding – Hardwired vs Microprogrammed
63
Decoding
A computer’s control unit keeps things synchronised: it makes sure that bits flow to the correct components as the components are needed. There must be control signals to trigger events. When an Add is performed, we assume that the addition takes place because the control signals for the ALU are set to Add. Question: how do these control lines actually become asserted? There are two general ways in which a control unit can be implemented:
Microprogrammed control: a small program is placed into read-only memory in the microcontroller.
Hardwired control: controllers implement this program using digital logic components.
64
Decoding
Hardwired: physically connect all of the control lines to the actual machine instruction. Instructions are divided into fields, and different bits are combined with various digital logic components (which drive the control lines). The control unit is implemented using hardware: the digital circuit uses inputs to generate the control signals that drive various components. Advantage: very fast. Disadvantage: the instruction set and digital logic are locked.
65
Decoding
Microprogrammed: uses software for control. All machine instructions are fed through the microprogram, which converts each instruction into the correct control signals. The microprogram is stored in firmware (ROM, PROM or EPROM), also known as the control store. It converts machine instructions (binary) into control signals, with one subroutine for each machine instruction. Advantage: very flexible. Disadvantage: additional layer of interpretation.
66
Next Week Study Guide 5 Introduction to Instruction Set Architecture