Hardware structures – system bus, internal (operational) memory. Piotr Mielecki Ph. D. Introduction to Computer Systems (4)


1 Hardware structures – system bus, internal (operational) memory. Piotr Mielecki Ph. D. Introduction to Computer Systems (4)

2 1. System Bus concept and example implementation.
DEFINITION: The bus in the hardware structure of a computer consists of a set of wires which connect the CPU with the other parts of the computer (memory and Input / Output devices in a computer based on von Neumann's architecture).
The bus implemented in a particular computer design to connect its basic modules (not intended as a widely used standard like PCI, AGP etc.) is usually called a System Bus.
The basic problem is that the same data and address lines (pins) of the CPU chip are usually used to connect to different devices. The System Bus therefore has to separate from the CPU the devices which do not take part in a particular connection (access cycle), enabling only those which are exchanging data at that moment.

3 Each implementation of the bus consists of 3 types of wires:
- Address Bus – a set of m signals (usually named A0 to A(m-1)) which pass the binary address to the device accessed for a read (RD) or write (WR) operation (an address in physical memory or in Input / Output space). If the CPU has a 16-bit address bus (A0 to A15), for example, it can address up to 2^16 = 65,536 different cells in memory. A 32-bit address bus can address up to 4 GB (2^32 = 4,294,967,296) different bytes if the basic word in memory is 1 byte long.
- Data Bus – a set of n signals (D0 to D(n-1)) which pass the binary value to or from the device accessed for a read or write operation (physical memory or I/O port). The width of the Data Bus is usually (but not always) equal to the basic machine word length of the particular CPU (8, 16, 32 or 64 bits, for example) and determines the class of the CPU (8-bit CPU, 64-bit CPU etc.). The data signals are bidirectional (inputs and outputs).
- Control Bus – a set of logical signals used to drive particular devices and access cycles, for example distinguishing between memory and I/O or between reading and writing. The System Clock signal (CLK), which synchronizes all the devices, is also a part of the Control Bus.
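
The relation between the width of the Address Bus and the number of addressable cells can be checked with a few lines of Python. This is only an illustrative sketch; the function name `addressable_cells` is an assumption made for this example.

```python
def addressable_cells(address_lines: int) -> int:
    """Number of distinct addresses selectable with the given number of address lines."""
    return 2 ** address_lines


# 16-bit and 32-bit address buses, as in the examples above
for m in (16, 32):
    print(f"{m}-bit address bus: {addressable_cells(m):,} cells")
```

With 1-byte cells, the 32-bit result corresponds to the 4 GB mentioned above.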

4 Zilog Z-80 CPU (modification of Intel 8080) – example of a CPU's System Bus.
[Figure: Z-80 CPU pin diagram]
ADDRESS BUS: A0 – A15
DATA BUS: D0 – D7
CONTROL BUS (INPUTS): RESET, CLK, INT, NMI, WAIT, BUSRQ
CONTROL BUS (OUTPUTS): M1, RFSH, HALT, BUSACK, MREQ, IOREQ, RD, WR
Power supply: Vcc, GND

5 To complete each data exchange cycle between the CPU and another device, an appropriate sequence of signals (timing), synchronized with the CLK signal, is needed. The control signals Read (RD), Write (WR) and Memory Request (MREQ) of the Z-80 CPU, for example, drive the attached memory circuits. With MREQ and RD active, the memory enters read mode and completes the Memory Read access cycle or the Operation Code Fetch (M1) cycle.

6 The most important and interesting things seen on the diagram are:
- Control signals in most of the circuits used to build microcomputers are active in the low logical state (0 V). The signals MREQ and RD both have to be active to read from memory.
- The rising edge of the CLK pulse causes the signals provided by the memory circuit and seen on the Data Bus (lines D0 – D7) to be copied (latched) into the CPU's internal data buffer register (see Lecture 3).
- The lines D0 – D7 are the CPU's inputs in read cycles; the same lines are outputs in write cycles. When the CPU or another device attached to the bus has no data to write or read, it keeps its data lines in the third logical state (also called the high-impedance state), which is equivalent to being electrically cut off (R = ∞ Ω) from the external wires.
- Processors like the Intel 8080 and Zilog Z-80 were designed to cooperate with slow memory circuits, so they insert an entire CLK cycle (T2) between activating their MREQ and RD control signals and reading the data lines D0 – D7. This can be seen in both the M1 and Memory Read cycles.
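
The role of the third (high-impedance) state on shared data lines can be sketched in Python. This is a simplified logical model, not an electrical one: `HIGH_Z` and `resolve_bus` are hypothetical names introduced for the example, and each device is represented only by the value it drives, or by `HIGH_Z` when it is cut off from the bus.

```python
from typing import Optional, Sequence

HIGH_Z = None  # hypothetical marker: the device does not drive the data lines


def resolve_bus(drivers: Sequence[Optional[int]]) -> Optional[int]:
    """Return the value visible on the shared data lines.

    Returns HIGH_Z if no device drives the bus and raises ValueError
    if more than one device drives it at the same time (bus contention).
    """
    active = [value for value in drivers if value is not HIGH_Z]
    if len(active) > 1:
        raise ValueError("bus contention: more than one device drives the data lines")
    return active[0] if active else HIGH_Z


# CPU and I/O port are in the third state, the memory chip drives D0 - D7
print(resolve_bus([HIGH_Z, 0x3E, HIGH_Z]))
```

The model shows why only one device may leave the third state during an access cycle: with two active drivers the value on the lines would be undefined.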

7 Memory hardware can force the CPU to insert additional empty CLK cycles (wait states) if the memory circuits are not ready with data on the Data Bus. This is done through the CPU's WAIT input.
During the M1 cycle (T3 and T4 CLK pulses) the Z-80 CPU supports the Refresh cycle for dynamic RAM (Random Access Memory) integrated circuits (today this is not a common solution; D-RAM modules have their own circuitry to do so). It is done by sending the number of an entire row in the RAM structure on the 8 lower bits of the Address Bus (A0 – A7) and activating the RFSH control output. The short pulse of the MREQ signal causes the dynamic RAM circuitry (physically organized in a different mode this time) to rewrite the contents of one row of cells. The CPU's internal R register, which keeps the number of the current row in the D-RAM structure, is then incremented by 1 for the next Refresh cycle.
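
The behaviour of the refresh row counter can be sketched as follows. This is a minimal model under stated assumptions: the class name `RefreshCounter` is hypothetical, and the default counter width of 7 bits (128 rows) is an assumption of this sketch; the text above only says that the register is incremented by 1 after each Refresh cycle.

```python
class RefreshCounter:
    """Toy model of a refresh row counter such as the R register."""

    def __init__(self, row_bits: int = 7) -> None:
        # row_bits is an assumption: the counter wraps after 2**row_bits rows
        self.row_bits = row_bits
        self.r = 0

    def m1_cycle(self) -> int:
        """Return the row number put on the low address lines, then advance R."""
        row = self.r
        self.r = (self.r + 1) % (1 << self.row_bits)
        return row


counter = RefreshCounter()
rows = [counter.m1_cycle() for _ in range(3)]
print(rows)  # rows refreshed during three consecutive M1 cycles
```

Because every instruction starts with an M1 cycle, all rows are visited in turn while the program runs, without any dedicated refresh instructions.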

8 The Memory Write CPU cycle looks slightly different. This time the Write (WR) control signal, together with MREQ, causes the memory circuit (addressed by the Address Bus lines A0 – A15) to read the data which the CPU wants to write. The falling or rising edge of the WR signal should be used by the memory circuit to latch the data from the Data Bus. Finally, we can say that the System Bus (and any other bus) is defined by a set of wires and by the sequences of control signals which drive different hardware devices during data-exchange cycles.
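
The Memory Read and Memory Write cycles can be summarized in a small behavioural model. It is only a sketch: the class `Memory` and its `cycle` method are hypothetical, the control signals are passed as levels (0 = active, since they are active low) and the exact CLK timing and edge-triggered latching described above are deliberately left out.

```python
from typing import Optional


class Memory:
    """Toy model of a memory chip on the System Bus; *_n signals are active low."""

    def __init__(self, size: int = 1 << 16) -> None:
        self.cells = bytearray(size)

    def cycle(self, address: int, mreq_n: int, rd_n: int, wr_n: int,
              data: Optional[int] = None) -> Optional[int]:
        """Perform one access cycle and return the byte driven onto the Data Bus.

        Returns None when the memory keeps its data lines in high impedance.
        """
        if mreq_n != 0:
            return None  # MREQ inactive: the memory is not addressed at all
        if rd_n == 0 and wr_n == 0:
            raise ValueError("RD and WR must not be active at the same time")
        if rd_n == 0:
            return self.cells[address]  # Memory Read: memory drives D0 - D7
        if wr_n == 0:
            if data is None:
                raise ValueError("a write cycle needs a value on the Data Bus")
            self.cells[address] = data & 0xFF  # Memory Write: latch the byte
        return None


ram = Memory()
ram.cycle(0x8000, mreq_n=0, rd_n=1, wr_n=0, data=0x3E)  # Memory Write
value = ram.cycle(0x8000, mreq_n=0, rd_n=0, wr_n=1)     # Memory Read
print(hex(value))
```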

9 2. The operational memory – hardware implementation.
The internal memory (operational memory or primary level storage) should be implemented as a linear array of binary words (bytes, for example), each addressed by a unique binary address. It thus plays the role of memory in von Neumann's model of a computer.
To be compatible with the System Bus the memory circuitry should have:
- address inputs,
- data inputs / outputs,
- some control inputs (sometimes also outputs, like WAIT).
In most cases the set of control signals has to be passed through additional logic circuits (decoders, logic gates etc.) to match the particular CPU with the memory chips used.
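
The additional decoding logic between the CPU and the memory chips can be illustrated with a short sketch. The memory map used here is purely hypothetical (a 16 kB ROM at addresses 0x0000 – 0x3FFF and RAM in the rest of the 64 kB space), as are the function and signal names; the chip-select outputs are active low, like the control signals discussed earlier.

```python
def chip_selects(address: int, mreq_n: int) -> dict:
    """Derive active-low chip-select signals from MREQ and address lines A15, A14.

    Assumed (hypothetical) memory map: ROM at 0x0000-0x3FFF, RAM at 0x4000-0xFFFF.
    """
    memory_cycle = (mreq_n == 0)          # MREQ is active low
    top_bits = (address >> 14) & 0b11     # address lines A15 and A14
    return {
        "rom_cs_n": 0 if memory_cycle and top_bits == 0 else 1,
        "ram_cs_n": 0 if memory_cycle and top_bits != 0 else 1,
    }


print(chip_selects(0x0100, mreq_n=0))  # ROM selected
print(chip_selects(0x8000, mreq_n=0))  # RAM selected
print(chip_selects(0x8000, mreq_n=1))  # no memory cycle: nothing selected
```

In hardware the same function would typically be built from a decoder and a few gates; the code only shows the truth table such a circuit implements.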

10 Simplified diagram of the Z-80 CPU attached to memory with System Bus.

11 From the technological point of view we should distinguish between several types of memory circuits, for example:
- RAM (Random Access Memory) – circuits that can be both written and read. Von Neumann's model treats the entire memory as RAM. Most RAM implementations are electronic integrated circuits which lose their contents after power-off.
- SRAM (Static RAM) – RAM circuits which do not need a refresh cycle. They are much faster than Dynamic RAMs, but have a much lower level of integration. Today they are used primarily as cache buffers between the CPU and DRAM.
- DRAM (Dynamic RAM) – RAM circuits with a very high level of integration but a very short data retention time. They have to be refreshed with a special cycle provided by an external generator (or by the CPU). The access time (speed) of standard DRAMs is much worse than that of SRAM.

12 - SDRAM (Synchronous Dynamic RAM) – DRAM which has a synchronous interface, meaning that it waits for a clock signal before responding to its control inputs (normal DRAMs have an asynchronous interface, which means that they react as quickly as possible to changes in control inputs like RD or WR). The CLK signal is used to drive an internal finite state machine that pipelines incoming cycles. Pipelining means that the chip can accept a new access cycle before it has finished processing the previous one. In a pipelined write, the write cycle can be immediately followed by another cycle without waiting for the data to be written to the memory array. In a pipelined read, the requested data appears a fixed number of clock pulses after the read instruction; in the meantime additional access cycles can be sent to the memory.
- DDR RAM (Double Data Rate RAM) – SDRAM which reads or writes two words of data per clock cycle (one on the rising edge and one on the falling edge of the CLK pulse).
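
The benefit of pipelined reads can be shown with a strongly simplified timing model. The assumptions are labeled in the code: the first read command is issued at clock 0, data appears a fixed `latency` clocks after its command, a pipelined memory accepts one new command per clock, and a non-pipelined memory accepts the next command only when the previous data has appeared. The function name is hypothetical and no real SDRAM timing parameters are modelled.

```python
from typing import List


def data_ready_clocks(n_reads: int, latency: int, pipelined: bool) -> List[int]:
    """Clock numbers at which the data of consecutive reads appears.

    Simplified model (assumption): first command at clock 0, fixed latency.
    """
    if pipelined:
        # a new command is accepted on every clock, results follow back to back
        return [i + latency for i in range(n_reads)]
    # the next command is issued only after the previous data has arrived
    return [(i + 1) * latency for i in range(n_reads)]


print(data_ready_clocks(4, latency=3, pipelined=True))
print(data_ready_clocks(4, latency=3, pipelined=False))
```

In this model four pipelined reads finish at clock 6, while the same reads without pipelining finish at clock 12.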

13 - ROM (Read Only Memory) – a circuit which is written once (in the factory) and cannot be rewritten with other data.
- PROM (Programmable ROM) – a ROM chip which can be programmed by the user (not inside the computer, but with a special device called a programmer), but only once.
- E-PROM (Erasable PROM) – a ROM which can be programmed and erased many times, but with an external device (not inside the computer). Usually an ultra-violet (UV) lamp is the device which erases the contents of the E-PROM.
- EE-PROM (Electrically Erasable PROM) – a ROM integrated circuit which can be erased and programmed electrically (without a UV lamp), but still outside the computer.
- NV-RAM (Non-volatile RAM) – RAM which preserves its contents after the power is switched off. In older constructions a battery was mounted inside or outside the NV-RAM integrated circuit. Today flash technology makes this possible without a battery, but we still distinguish between NV-RAMs and flash memory because they are used differently: flash technology is more suitable for large mass storage replacing magnetic disks (SSD – Solid State Disk).

14 3. The operational memory – more advanced organizations.
In most of today's computers the operational memory is not organized exactly according to von Neumann's concept. One of the first well-known modifications was the segmentation of physical memory used by Intel in the 16-bit 8086 and 8088 processors (introduced in the first IBM-PC and XT in the early 1980s), now known as real mode addressing in today's Intel CPUs. The idea of segmentation came from the different roles that parts of memory can play:
- The code of the program is read-only in most cases and the CPU processes it instruction after instruction (sometimes with branches).
- The data processed (simple variables, arrays etc.) is read and written in a rather non-sequential way (variables are located at different addresses).

15 - The stack, processed with the CPU's PUSH and POP operations, is used in a different way than a normal data area (we need to PUSH something onto the stack before we can POP it back). The stack plays a very important part in the execution of the program, for example:
  - the return address is PUSHed onto the stack before calling a subroutine and POPped from the stack on return from the subroutine,
  - the parameters for the called subroutine are PUSHed onto the stack by the calling program and then POPped by the subroutine,
  - local (automatic) variables declared inside the subroutine are usually allocated on the stack.

16 Intel's designers assumed that up to 4 different segments of memory can be used by a program at the same time:
- Code Segment, only for instructions (please notice that it corresponds with the Harvard Architecture concept),
- Stack Segment, only for the (system or application) stack,
- Data Segment, for global variables and large blocks of memory (not automatic),
- Extra Segment, as an additional data segment.
Accordingly, the processor has four 16-bit segment registers to support addressing in these independent segments:
- CS pointing to the Code Segment,
- SS pointing to the Stack Segment,
- DS pointing to the Data Segment,
- ES pointing to the Extra Segment.

17 The Address Bus in these processors was 20 bits wide (it could address up to 1 MB of memory). The address itself was not quite linear. In each segment it was calculated as the sum of two 16-bit values (segment + displacement), with the segment value shifted left by 4 bits:
physical address = segment × 16 + displacement
To address the current instruction in the program, the CPU must have appropriate values in the CS register and in the dedicated index register responsible for the displacement within CS. This register is called the Instruction Pointer (IP) and its value is added to the value of CS in the way shown above. After each instruction cycle the value of IP is incremented by the length of the instruction code. The stack segment is addressed by the SS segment register, and the displacement within this segment is pointed to by the index register Stack Pointer (SP), etc.
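
The segment-plus-displacement calculation described above can be sketched as follows. The function name `physical_address` is an assumption of this example; the result is masked to 20 bits because the Address Bus has only 20 lines, so a sum that overflows wraps around to the beginning of memory.

```python
def physical_address(segment: int, displacement: int) -> int:
    """20-bit physical address from a 16-bit segment and a 16-bit displacement."""
    segment &= 0xFFFF
    displacement &= 0xFFFF
    # shift the segment left by 4 bits (multiply by 16), add the displacement,
    # and keep only the 20 bits that fit on the Address Bus
    return ((segment << 4) + displacement) & 0xFFFFF


# example: CS = 0x1234, IP = 0x0010 (hypothetical register values)
print(hex(physical_address(0x1234, 0x0010)))
```

Note that different segment and displacement pairs can point to the same physical cell, for example 0x1234:0x0010 and 0x1235:0x0000.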

18 This implementation of segmented memory (at most 16 fully separated 64 kB segments were possible) was too limited to support a multitasking operating system with several programs loaded into memory at the same time (although CP/M-86 and Concurrent DOS were designed, they were never widely used), so MS-DOS was a purely single-tasking system.
The first truly multitasking operating systems (OS/2, MS-Windows 3.0) were implemented on IBM-AT machines with Intel 80286 and newer processors, which could support memory in virtual (or protected) mode through more advanced hardware mechanisms.

19 4. Virtual memory – basic concepts.
The concept of virtual memory assumes that an application can see a memory space much larger than the physical memory installed in the computer. Using a technique called pagination (paging), the operating system, supported by hardware mechanisms implemented in the CPU, can map a desired fixed-length block (4096 bytes, for example) of this huge virtual space, called a page, onto a block of physical memory, called a frame.
[Figure: The idea of paginated virtual memory supported by a Page Table.]

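20 [Figure: the CPU issues a logical (virtual) address consisting of a page number P and a displacement D. The Page Table maps the page number P to a frame number F. The physical address, consisting of F and D, is sent to physical memory.]
Calculating the physical address in paginated virtual memory.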
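
The translation from a logical (virtual) address to a physical address can be sketched in a few lines. This is a minimal model assuming 4096-byte pages and a Page Table represented as a plain Python dictionary from page numbers to frame numbers; the names `translate` and `page_table` and the table contents are hypothetical.

```python
PAGE_SIZE = 4096          # bytes per page (and per frame)
OFFSET_BITS = 12          # 2**12 == 4096


def translate(virtual_address: int, page_table: dict) -> int:
    """Replace the page number P with the frame number F, keep the displacement D."""
    page = virtual_address >> OFFSET_BITS
    displacement = virtual_address & (PAGE_SIZE - 1)
    frame = page_table[page]  # raises KeyError if the page is not mapped
    return (frame << OFFSET_BITS) | displacement


page_table = {0: 5, 1: 2}   # hypothetical mapping: page number -> frame number
print(hex(translate(0x1ABC, page_table)))
```

Only the upper bits of the address are translated; the displacement within the page is passed through unchanged.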

21 The CPU has a hardware mechanism which can detect whether a desired page of virtual memory is present somewhere in the physical RAM (in any frame) or not (support for a valid flag in the Page Table record, for example).
If not (valid flag set to 0), the CPU raises an internal interrupt (exception) which starts a system routine to find at least one free frame in RAM and reload the desired page from the swap area (usually in mass storage).
If a free frame cannot be found, the system must choose one of the used frames, write its contents (page) to the swap area (if the frame was modified since it was last loaded from swap; a dirty flag in the Page Table record is often used to mark this) and then replace it with the desired page.
In operating systems which use paginated virtual memory (practically all of today's multitasking systems) the swap area is implemented as a special file (Windows) or as a separate disk partition (Linux, UNIX etc.). The swap operation takes a lot of time and makes access to memory much slower than a normal (real-mode) access cycle to physical memory. To minimize the number of swapping operations, advanced page management algorithms are implemented in operating systems.
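
The handling of the valid and dirty flags can be sketched as a small simulation. It is only an illustration under stated assumptions: the class names are hypothetical, the swap area is represented by a counter of write-backs, and the victim frame is chosen with a simple first-in first-out rule, which stands in for the more advanced algorithms used by real operating systems.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageEntry:
    """One Page Table record with the flags described above."""
    frame: Optional[int] = None
    valid: bool = False
    dirty: bool = False


class Pager:
    def __init__(self, n_pages: int, n_frames: int) -> None:
        self.table = [PageEntry() for _ in range(n_pages)]
        self.free_frames = deque(range(n_frames))
        self.loaded = deque()      # pages in load order (FIFO victim choice)
        self.page_faults = 0
        self.swap_writes = 0       # write-backs of modified pages to swap

    def access(self, page: int, write: bool = False) -> int:
        """Access a page and return the frame that holds it."""
        entry = self.table[page]
        if not entry.valid:                    # page fault: load from swap
            self.page_faults += 1
            entry.frame = self._get_frame()
            entry.valid, entry.dirty = True, False
            self.loaded.append(page)
        if write:
            entry.dirty = True
        return entry.frame

    def _get_frame(self) -> int:
        if self.free_frames:
            return self.free_frames.popleft()
        victim = self.table[self.loaded.popleft()]   # oldest loaded page
        if victim.dirty:
            self.swap_writes += 1                    # save modified contents
        frame = victim.frame
        victim.frame, victim.valid, victim.dirty = None, False, False
        return frame


pager = Pager(n_pages=4, n_frames=2)
for page, write in [(0, True), (1, False), (0, False), (2, False), (1, False), (0, False)]:
    pager.access(page, write)
print(pager.page_faults, pager.swap_writes)
```

In this run only the eviction of the modified page 0 causes a write to the swap area; the unmodified page 1 is simply discarded, because an identical copy already exists in swap.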

22 Another problem is the length of the Page Table. Normally a page is a block of memory 4 kB (4096 bytes) long. A 1 GB memory (1,073,741,824 bytes) would then be divided into 262,144 pages (or frames). Such a number of records in the Page Table also takes up space in memory, of course. To overcome this problem, Page Tables are implemented not as single, constant-length arrays but rather as lists or multilevel tables (Intel 80386 and newer processors, for example).
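
The size problem can be checked with simple arithmetic. The sketch below assumes 4096-byte pages as in the text; the record size of 4 bytes and the 10 + 10 + 12 bit split of a 32-bit address (modelled on two-level paging in 32-bit Intel processors) are assumptions of this example, and all function names are hypothetical.

```python
PAGE_SIZE = 4096
ENTRY_BYTES = 4   # assumption: one Page Table record takes 4 bytes


def page_count(memory_bytes: int) -> int:
    """Number of pages (or frames) needed to cover the given memory size."""
    return memory_bytes // PAGE_SIZE


def flat_table_bytes(memory_bytes: int) -> int:
    """Size of a single, constant-length Page Table covering the whole space."""
    return page_count(memory_bytes) * ENTRY_BYTES


def split_two_level(virtual_address: int) -> tuple:
    """Split a 32-bit address into (directory index, table index, displacement)."""
    directory = (virtual_address >> 22) & 0x3FF   # upper 10 bits
    table = (virtual_address >> 12) & 0x3FF       # middle 10 bits
    displacement = virtual_address & 0xFFF        # lower 12 bits
    return directory, table, displacement


print(page_count(2 ** 30))            # pages in 1 GB
print(flat_table_bytes(2 ** 32))      # flat table for a 4 GB virtual space
print(split_two_level(0x00403ABC))
```

With a multilevel organization, second-level tables have to exist only for those parts of the virtual space that are actually used, so the whole structure is usually much smaller than the flat table.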

