
1 Chapter 11 System Performance Enhancement

2 Basic Operation of a Computer
• Program is loaded into memory
• Instruction is fetched from memory
• Operands are decoded and the required data is fetched from the specified location (using the addressing mode built into the instruction)
• Operation corresponding to the instruction is executed
• An additional operand determines the return location for the result of the operation
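The steps on this slide can be sketched as a toy fetch-decode-execute loop. The four-field instruction format, the opcode names, and the `run` function are illustrative assumptions, not something the slides define.

```python
# Toy machine (hypothetical): each instruction is
# (opcode, source1_address, source2_address, destination_address).

def run(program, data):
    """Execute a list of instructions against a data memory (address -> value)."""
    memory = dict(data)   # copy so the caller's dictionary is left untouched
    pc = 0                # program counter: index of the next instruction
    while pc < len(program):
        instruction = program[pc]               # fetch the instruction
        pc += 1
        opcode, src1, src2, dest = instruction  # decode opcode and operands
        a, b = memory[src1], memory[src2]       # fetch data (direct addressing assumed)
        if opcode == "ADD":                     # execute the operation
            result = a + b
        elif opcode == "SUB":
            result = a - b
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        memory[dest] = result                   # extra operand names the result location
    return memory


program = [("ADD", 0, 1, 2), ("SUB", 2, 0, 3)]
final = run(program, {0: 5, 1: 7})
print(final[2], final[3])  # prints "12 7"
```

A real CPU interleaves these phases in hardware and supports several addressing modes; the loop only makes the order of the steps explicit.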

3 Performance
• CPU performs program instructions via a sequence of fetch-execute cycles
• Note: the fetch-execute (F-E) cycle consists of many phases
• Performance is degraded by delays in memory accesses

4 Performance Enhancement
• RISC Architecture - Reduced Instruction Set Computing
  - Simple instructions: easier to decode and to run in parallel
  - Limited memory access: only load and store instructions
  - Many registers, and compilers that optimize their use

5 Performance Enhancement
• Pipelining - overlap the processing of instructions so that more than one instruction is being worked on at a given time
• While one instruction is being fetched, another may be executing
• So pipelining performs the fetch and execute phases in parallel
• NOTE: Only 1 instruction at a time is actually being executed to completion
• Objective: start and finish one instruction per clock cycle: CPI (cycles per instruction) = 1
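A rough way to see why the objective is CPI = 1: in an idealized pipeline with k stages, the first instruction needs k cycles to complete and every later one finishes a cycle after its predecessor. The sketch below assumes one cycle per stage and no stalls or hazards; the function names are hypothetical.

```python
def pipelined_cycles(num_instructions, num_stages):
    """Cycles for an ideal pipeline: fill it once, then one completion per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def unpipelined_cycles(num_instructions, num_stages):
    """Cycles when each instruction passes through all stages before the next starts."""
    return num_instructions * num_stages


def cpi(num_instructions, num_stages):
    """Average cycles per instruction for the ideal pipeline."""
    return pipelined_cycles(num_instructions, num_stages) / num_instructions


print(unpipelined_cycles(4, 5))  # 20
print(pipelined_cycles(4, 5))    # 8
print(cpi(1000, 5))              # 1.004
```

As the instruction count grows, the fill time is amortized and the CPI approaches 1. Real pipelines fall short of this when branches or data dependencies force stalls.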

6 Pipelining - Fig. 10.23

7 Performance Enhancement
• Superscalar design - start and finish more than one instruction per clock cycle: CPI < 1
• Executes several operations at once
• Hardware duplication to support parallelism
• CPU may have an instruction fetch unit and several execution units operating in parallel
• Hardware schedules instructions to exploit parallelism
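The scheduling idea can be illustrated with a deliberately simplified model: issue instructions in program order, up to `width` per cycle, and stop a cycle early when an instruction needs a register written in that same cycle. The instruction encoding and the `issue_cycles` function are assumptions made for this sketch. It also assumes every instruction completes in one cycle and ignores pipeline fill, so it does not describe any particular CPU.

```python
def issue_cycles(instructions, width):
    """Count issue cycles for in-order issue of up to `width` instructions per cycle.

    instructions: list of (destination_register, (source_registers...)) tuples.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    cycles = 0
    i = 0
    while i < len(instructions):
        written = set()   # registers written by instructions issued in this cycle
        issued = 0
        while i < len(instructions) and issued < width:
            dest, sources = instructions[i]
            if written & set(sources) or dest in written:
                break     # depends on a same-cycle result: wait for the next cycle
            written.add(dest)
            issued += 1
            i += 1
        cycles += 1
    return cycles


program = [
    ("r1", ("r2", "r3")),
    ("r4", ("r5", "r6")),   # independent of the first: can issue alongside it
    ("r7", ("r1", "r4")),   # needs both earlier results
    ("r8", ("r7", "r2")),   # needs r7
]
print(issue_cycles(program, 1))  # 4
print(issue_cycles(program, 2))  # 3
```

With two execution units the four instructions issue in three cycles, a CPI of 0.75. The dependent pair at the end shows why duplicated hardware alone does not guarantee a CPI of 0.5.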

8 Other Means of Improving Performance
• Multiprocessing
• Faster Clock Speed
• Wider Instruction and Data Paths
• Longer Registers
• Faster Disk Access
• Memory Enhancements

9 Multiprocessing
• Increase the number of processors
• Multiprocessors - computers that have multiple CPUs within a single system, sharing memory and I/O devices
• Typically, 2-4 processors
• Tightly coupled system

10 Typical Multiprocessing System

11 Symmetrical Multiprocessing (SMP) Systems
• Each CPU operates independently
• Each CPU has access to all the system resources (memory and I/O)
• Any CPU can respond to an interrupt
• A program in memory can be executed by any CPU
• Each CPU has identical access to the OS
• Each CPU performs its own dispatch scheduling, that is, determining what program will execute
• Very controlled environment - CPUs, memory, I/O devices, and OS are designed to operate together, and communication is built into the system

12 Increase Clock Speed
• Faster clock speeds improve the overall speed of the system, since instruction cycle time is inversely proportional to clock speed
• Limitation - the ability of the CPU, buses, and other components to keep up
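The relation between clock speed and run time can be written as time = instruction count × CPI / clock rate. The sketch below adds a separate term for memory latency, which a faster CPU clock does not reduce, to illustrate the limitation. The function, its parameters, and all numbers are illustrative assumptions.

```python
def execution_time_seconds(instruction_count, cpi, clock_hz,
                           memory_accesses=0, memory_latency_s=0.0):
    """Estimate run time as CPU time plus time spent waiting on memory."""
    cpu_time = instruction_count * cpi / clock_hz       # shrinks as the clock speeds up
    memory_time = memory_accesses * memory_latency_s    # fixed by the memory, not the CPU
    return cpu_time + memory_time


# Doubling the clock halves the CPU-bound time...
print(execution_time_seconds(1_000_000_000, 1.0, 1e9))  # 1.0
print(execution_time_seconds(1_000_000_000, 1.0, 2e9))  # 0.5

# ...but not the time spent waiting on slower components.
slow = execution_time_seconds(1_000_000_000, 1.0, 1e9, 100_000_000, 5e-9)
fast = execution_time_seconds(1_000_000_000, 1.0, 2e9, 100_000_000, 5e-9)
print(slow / fast)  # less than 2
```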

13 Wider Instruction and Data Paths
• Ability to process more bits at a time improves performance
• CPU can fetch or store more data in a single operation
• CPU can fetch more instructions at a time
• Memory accesses are slow compared to CPU operations, so moving more bits per access improves performance

14 Longer Registers
• Longer registers (# of bits) within the CPU reduce the number of program steps needed to complete a calculation
• Example - using 16-bit registers for a 64-bit addition requires 4 additions, plus steps to handle carries between registers, and 4 moves to transfer the result to memory
• With 64-bit registers, only a single addition and a single move to memory via a wider internal bus are needed
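The 16-bit example can be made concrete by simulating the four partial additions and the carries between them. Python integers are unbounded, so the 16-bit registers are emulated with masks. The function name and the returned addition count are choices made for this sketch.

```python
MASK16 = 0xFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64_with_16bit_registers(a, b):
    """Add two unsigned 64-bit values using only 16-bit-wide additions.

    Returns (sum modulo 2**64, number of 16-bit additions performed).
    """
    result = 0
    carry = 0
    additions = 0
    for i in range(4):                       # four 16-bit pieces, least significant first
        a_part = (a >> (16 * i)) & MASK16
        b_part = (b >> (16 * i)) & MASK16
        total = a_part + b_part + carry      # one narrow addition with carry-in
        additions += 1
        carry = total >> 16                  # carry-out feeds the next piece
        result |= (total & MASK16) << (16 * i)
    return result, additions


a, b = 0x0000FFFFFFFFFFFF, 1
total, steps = add64_with_16bit_registers(a, b)
assert total == (a + b) & MASK64   # matches a single wide addition
print(steps)                       # 4
```

With 64-bit registers the loop collapses to the single expression `(a + b) & MASK64`, which is the saving the slide describes.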

15 Faster Disk Access
• Small improvements in disk access can significantly improve system performance
• Approach - data is distributed among multiple devices so that it can be accessed simultaneously from different devices
• Manufacturers continue to produce disk drives that are smaller and more densely packed
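One common way to distribute data across devices is round-robin striping, where consecutive logical blocks go to consecutive devices. The slide does not name a scheme, so striping is an assumption here, as are both function names. The sketch also assumes every device can transfer one block per step, independently of the others.

```python
def locate_block(logical_block, num_devices):
    """Map a logical block number to (device index, block index on that device)."""
    return logical_block % num_devices, logical_block // num_devices


def transfer_steps(num_blocks, num_devices):
    """Steps needed to read consecutive blocks when all devices work in parallel."""
    return -(-num_blocks // num_devices)   # ceiling division


print(locate_block(5, 4))      # (1, 1)
print(transfer_steps(8, 1))    # 8
print(transfer_steps(8, 4))    # 2
```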

16 Larger/Faster Memory
• Increased amounts of memory provide larger buffers that can be used to hold data and programs transferred from I/O devices
  - Reduces the number of disk accesses
• Faster memory reduces the number of wait states that must be inserted into the instruction cycle when a memory access takes place
• Time spent on memory access can be reduced via RISC architecture (more registers) and by providing wider memory data paths (8 bytes)

17 Memory
• DRAM - Dynamic RAM - inexpensive memory that requires less electrical power and is more compact, with more bits of memory in a single integrated circuit. Requires periodic refreshing.
• SRAM - Static RAM - 2-3 times faster, but more expensive and requires more chips
• Impractical to use SRAM for all of main memory
• Solution - Cache Memory

18 Cache Memory

19
• Cache memory is organized into blocks of 8-16 bytes each
• Each block holds an exact copy of data stored in main memory
• Each block has a tag that identifies the main-memory location of the data contained in the block
• 64 KB of cache with 8-byte blocks => 8,192 blocks of data
• A CPU request for memory is handled by the cache controller, which checks the tags for the desired location
  - Hit => data is in the cache
  - Miss => data is not present
• Read => transfer data from cache to CPU; Write => store data with tag in cache memory
• If Miss, data is copied from memory to the cache
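The tag check can be sketched with a dictionary keyed by tag. Here the tag is simply the address divided by the block size, and any block may hold any tag. The slide does not specify that placement policy or how the tag is derived, so both are assumptions, as is the `SimpleCache` class. Replacement when the cache is full is not modelled.

```python
BLOCK_SIZE = 8                          # bytes per block (the slide gives 8-16)
CACHE_SIZE = 64 * 1024                  # 64 KB
NUM_BLOCKS = CACHE_SIZE // BLOCK_SIZE   # 8192 blocks, matching the slide


class SimpleCache:
    """Read-only cache model: tag lookup, with a block copy on a miss."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # a bytes-like object standing in for RAM
        self.blocks = {}                # tag -> copy of the block's bytes
        self.hits = 0
        self.misses = 0

    def read_byte(self, address):
        tag = address // BLOCK_SIZE     # which main-memory block holds the address
        offset = address % BLOCK_SIZE   # position of the byte inside that block
        if tag in self.blocks:          # controller checks the tags
            self.hits += 1
        else:
            self.misses += 1            # miss: copy the whole block from memory
            start = tag * BLOCK_SIZE
            self.blocks[tag] = bytes(self.main_memory[start:start + BLOCK_SIZE])
        return self.blocks[tag][offset]


memory = bytearray(range(64))
cache = SimpleCache(memory)
cache.read_byte(10)                 # miss: block 1 (addresses 8-15) is loaded
cache.read_byte(11)                 # hit: same block
print(cache.hits, cache.misses)     # prints "1 1"
```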

20 Cache Illustration

22 Cache Situations
• Full cache and memory write: LRU (Least Recently Used) algorithm - replace the block that has not been accessed for the longest time
• Suppose the block to be replaced has been altered - first write the block to memory before replacement
• The cache controller manages the entire cache operation. The CPU is unaware of the cache's presence.
• Why does cache work? Locality of reference - empirical studies show that most well-written programs confine memory references to a few small regions of memory, e.g. sequential instructions, loops, small procedures, or array data. Hit ratios of 90%.
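LRU replacement and the write-before-replace rule can be sketched with an ordered dictionary that keeps the least recently used block at the front. Each block is treated as a single value for brevity. The class and method names are hypothetical, and real cache controllers implement this in hardware, often with approximations of LRU.

```python
from collections import OrderedDict


class LRUWriteBackCache:
    """Small cache model with LRU replacement and write-back of altered blocks."""

    def __init__(self, capacity, memory):
        self.capacity = capacity
        self.memory = memory            # dict: block number -> value (main memory)
        self.blocks = OrderedDict()     # block number -> [value, altered]; oldest first
        self.hits = 0
        self.misses = 0

    def _load(self, block):
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)      # now the most recently used block
            return
        self.misses += 1
        if len(self.blocks) >= self.capacity:   # cache full: evict the LRU block
            victim, (value, altered) = self.blocks.popitem(last=False)
            if altered:
                self.memory[victim] = value     # write back before replacement
        self.blocks[block] = [self.memory.get(block, 0), False]

    def read(self, block):
        self._load(block)
        return self.blocks[block][0]

    def write(self, block, value):
        self._load(block)
        self.blocks[block] = [value, True]      # mark the block as altered


memory = {0: 10, 1: 11, 2: 12}
cache = LRUWriteBackCache(2, memory)
cache.write(0, 99)      # block 0 is altered in the cache only
cache.read(1)
cache.read(0)           # hit: block 1 becomes the least recently used
cache.read(2)           # evicts block 1, which is unaltered, so nothing is written
cache.read(1)           # evicts block 0, which is written back first
print(memory[0])        # 99
```

A loop that keeps touching the same few blocks would hit on nearly every access, which is the locality-of-reference effect the slide describes.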

23 Two-Level Cache System

