Principles of Computer Architecture Chapter 4: Memory


Chapter Contents 1. The Memory Hierarchy 2. Random Access Memory 3. Chip Organization 4. Commercial Memory Modules 5. Read-Only Memory 6. Cache Memory 7. Virtual Memory

The Memory Hierarchy

Functional Behavior of a RAM Cell

Simplified RAM Chip Pinout

A Four-Word Memory with Four Bits per Word in a 2D Organization

A Simplified Representation of the Four-Word by Four-Bit RAM

Organization of a 64-Word by One-Bit RAM

Two Four-Word by Four-Bit RAMs are Used in Creating a Four-Word by Eight-Bit RAM

Two Four-Word by Four-Bit RAMs Make up an Eight-Word by Four-Bit RAM

Memory Packaging

Single-In-Line Memory Module Adapted from (Texas Instruments, MOS Memory: Commercial and Military Specifications Data Book, Texas Instruments, Literature Response Center, P.O. Box 172228, Denver, Colorado, 1991.)

DIMM: Dual Inline Memory Module At present, DIMMs are the standard way for memory to be packaged. Each DIMM has 84 gold-plated connectors on each side, for a total of 168 connectors. A DIMM can deliver 64 data bits at once. Typical DIMM capacities are 64 MB and up. A smaller DIMM used in notebooks is called a SO-DIMM. The average error rate of a module is one error every 10 years!

A ROM Stores Four Four-Bit Words

A Lookup Table (LUT) Implements an Eight-Bit ALU

Memory Cells with addresses: each cell contains a number of ordered bits (usually 8), the most common definition of a "byte". An n-bit byte can represent 2^n different "codes". Addresses are sequential bit patterns: an m-bit address can specify 2^m cells, e.g., a 32-bit address supports approximately 4G cells. Some machines use "byte-addressable" memory; others use "word-addressable" memory.

Quiz! 1) What are the sizes of the MAR and MDR in a machine having: a 4M x 32 RAM? an 18-MByte addressable RAM? a 24-Mbit, 16-bit-word RAM? 2) What is the size of the MAR needed to address a 28 MB memory? 3) If the IR has 16 bits, how many instructions can my machine have if the operands are coded on 9 bits?
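As a sanity check for the first sub-question, here is a minimal sketch assuming the MAR must be wide enough to address every word and the MDR holds one full word (the function name and values are illustrative, not part of the slides):

```python
import math

def mar_mdr_bits(num_words, bits_per_word):
    """MAR width = ceil(log2(#addressable units); MDR width = word size."""
    mar = math.ceil(math.log2(num_words))   # address width in bits
    mdr = bits_per_word                     # data width in bits
    return mar, mdr

# 4M x 32 RAM: 4M = 2**22 words of 32 bits each
print(mar_mdr_bits(4 * 2**20, 32))  # -> (22, 32)
```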

Computer Memory System Overview Characteristics of memory systems Location Capacity Unit of transfer Access method Performance Physical type Physical characteristics Organization

Location CPU: registers, control memory Internal: main memory External: secondary memory

Capacity Expressed in terms of words or bytes (1 byte = 8 bits). Word size is the natural unit of organization; sizes of 8, 16, and 32 bits are common, even 64 bits.

Unit of Transfer The number of data elements transferred at a time. Internal: usually governed by the data bus width. External: usually a block, which is much larger than a word.

Addressable Unit The smallest location which can be uniquely addressed, either a word or a byte. E.g., Motorola 68000: word = 16 bits, internal transfer unit = 16 bits, addressable unit = 8 bits (byte addressable). Let A = address length in bits and N = number of addressable units; then 2^A = N.

Access Methods (1) Sequential Data does not have a unique address Start at the beginning and read through in order Must read intermediate data items until the desired item is found Access time depends on location of data and previous location e.g. tape

Access Methods (2) Direct Individual blocks have unique addresses Access is by jumping to vicinity plus sequential search Access time depends on location and previous location e.g. disk

Access Methods (3) Random Individual addresses identify locations exactly Location can be selected randomly and addressed and accessed directly Access time is independent of location or previous access (i.e., constant) e.g. RAM

Access Methods (4) Associative A variation of random access Data is located by a comparison with contents of a portion of the store All words are searched simultaneously Access time is independent of location or previous access e.g. cache

Performance (1) Access time: the time between presenting the address and getting the valid data. For random access memory, it is the time to address the data unit and perform the transfer; for non-random access memory, it is the time to position the hardware mechanism at the desired location. Memory cycle time: applied primarily to random access memory; additional time may be required for the memory to "recover" before the next access. Cycle time = access time + recovery time.

Performance (2) Transfer rate R (bits per second): the rate at which data can be transferred into or out of memory. For random access memory, R = 1 / (memory cycle time). For non-random access memory, T_N = T_A + N/R, where T_N = average time to read or write N bits, T_A = average access time, and N = number of bits.
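A small worked sketch of the non-random-access formula above; the device parameters are illustrative assumptions, not values from the slides:

```python
def avg_rw_time(t_a, n_bits, rate_bps):
    """T_N = T_A + N / R : average time to read or write N bits."""
    return t_a + n_bits / rate_bps

# e.g. a device with a 10 ms average access time and a 1 Mbit/s transfer rate,
# reading a 4096-bit block:
print(avg_rw_time(10e-3, 4096, 1e6))  # ~0.0141 s
```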

Physical Types Semiconductor RAM Magnetic Disk & Tape Optical CD & DVD

Physical Characteristics Decay Volatility Erasability Power consumption

Organization Physical arrangement of bits into words Not always obvious e.g. interleaved

The Bottom Line How much? Capacity. How fast? Time is money. How expensive? Cost per bit.

Memory Hierarchy (1) The major design objective of a memory system is to provide adequate storage capacity at an acceptable level of performance and at a reasonable cost. Memory technologies: smaller access time → greater cost/bit; greater capacity → smaller cost/bit; greater capacity → greater access time. DILEMMA → Solution: MEMORY HIERARCHY.

Internal Memory

Several types of dynamic RAM chips exist. The oldest type still in use is FPM (Fast Page Mode) DRAM, organized internally as a matrix of bits, with RAS and CAS control signals used to select a bit. FPM DRAM is gradually being replaced by EDO (Extended Data Output) DRAM, which allows a second memory reference to begin before the previous one has completed, a simple form of pipelining that does not make an individual memory reference faster but does improve memory bandwidth.

Byte-ordering Big-endian: bytes are numbered left → right in each word. Little-endian: bytes are numbered right → left in each word. In both schemes, numeric values are stored left → right; in both schemes, strings are stored in byte order. This creates problems when machines with different byte orders communicate.
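A short sketch illustrating the two byte orders; the 32-bit value and the string are just illustrative examples:

```python
import struct

value = 0x12345678
big    = struct.pack(">I", value)  # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Strings are stored in byte order in both schemes, so they look the same:
print(struct.pack("4s", b"JIMS"))  # b'JIMS' either way
```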

Cache memory in a Computer System

Cache Memory

Address Locality principles Caches depend on two kinds of address locality to achieve their goal: Temporal locality: a recently referenced memory location is likely to be referenced again. Spatial locality: a neighbor of a recently referenced memory location is likely to be referenced.

If a word is read k times, the CPU needs 1 reference to slow memory and k-1 references to fast memory. Cache access time: c. Main memory access time: m. Hit ratio h: the fraction of all references that can be satisfied out of the cache; here h = (k-1)/k. Miss ratio: 1-h. Mean access time = c + (1-h)m.
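A minimal sketch of the mean-access-time model above; the timing values are illustrative assumptions, not from the slides:

```python
def mean_access_time(c, m, h):
    """Cache is always probed first: t = c + (1 - h) * m."""
    return c + (1 - h) * m

# e.g. cache access c = 1 ns, main memory m = 10 ns, a word read k = 10 times
k = 10
h = (k - 1) / k                     # hit ratio from the slide: h = (k-1)/k
print(mean_access_time(1, 10, h))   # 2.0 ns
```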

A system with three levels of cache.

Model for all caches Main memory is divided up into fixed-size blocks called cache lines. A cache line typically consists of 4 to 64 consecutive bytes. With a 32-byte line size, line 0 is bytes 0 to 31, line 1 is bytes 32 to 63, and so on. Two configurations are possible: 1. Direct-mapped caches 2. Set-associative caches

A Direct Mapping Scheme for Cache Memory Example with 2048 entries; each row in the cache contains exactly one cache line from main memory. With a 32-byte cache line, the cache can hold 64 KB. Each cache entry consists of 3 parts: 1. Valid bit: indicates whether the entry holds valid data; at boot, all entries are marked invalid. 2. Tag: a unique 16-bit value identifying the line of memory from which the data came. 3. Data: one cache line of 32 bytes.

In a direct-mapped cache, a given memory word can be stored in exactly one place within the cache. Given a memory address, there is only one place to look for it in the cache; if it is not there, it is not in the cache. For storing and retrieving data from the cache, the virtual address is broken into 4 components as follows:

32-bit virtual address TAG: corresponds to the tag bits stored in the cache entry. LINE: indicates which cache entry holds the corresponding data, if present. WORD: tells which word within a line is referenced. BYTE: usually not used; used only if a single byte is requested.

How it works When the CPU produces a memory address, the hardware extracts the 11 LINE bits to index the cache (selecting 1 of 2048 entries). If the entry is valid, the TAG of the memory address is compared to the Tag field in the cache entry. If they agree, there is a cache hit; if the cache entry is invalid or the tags do not match, there is a cache miss.
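A sketch of the lookup just described, for the 2048-entry, 32-byte-line example. The field widths (2-bit BYTE, 3-bit WORD, 11-bit LINE, 16-bit TAG) assume a 32-bit address and 4-byte words; the cache representation is a simplification for illustration:

```python
def lookup(address, cache):
    """Direct-mapped lookup: 16-bit TAG | 11-bit LINE | 3-bit WORD | 2-bit BYTE."""
    word = (address >> 2) & 0x7             # word within the 32-byte line
    line = (address >> 5) & 0x7FF           # selects 1 of 2048 entries
    tag  = (address >> 16) & 0xFFFF
    valid, stored_tag, data = cache[line]   # entry = (valid, tag, data words)
    if valid and stored_tag == tag:
        return data[word]                   # cache hit
    return None                             # cache miss: fetch line from memory

# at boot, all 2048 entries are invalid
cache = [(False, 0, [0] * 8) for _ in range(2048)]
print(lookup(0x0001_2345, cache))  # None -> miss
```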

If the cache entry is not valid or the tags do not match, the needed entry is not present (cache miss). In this case, the 32-byte cache line is fetched from memory and stored in the cache, replacing what was there. If the existing cache line had been modified since it was loaded, it must be written back to main memory before being discarded.

This mapping scheme puts consecutive memory lines in consecutive cache entries (up to 64 KB of contiguous data can be stored in the cache). However, two lines whose addresses differ by 64 KB (65,536 bytes) cannot be stored in the cache simultaneously.

Cache/Main-Memory Structure Main memory: 2^n addressable words; each word has a unique n-bit address; M fixed-length blocks of K words each → M = 2^n / K. Cache: C slots (lines) of K words each, with C << M.

Cache/Main-Memory Structure At any time, some subset of the blocks resides in the cache lines. Since C << M, each line includes a tag indicating which block is being stored; the tag is a portion of the address.


Elements of Cache Design Size Mapping Function Replacement Algorithm Write Policy Block Size Number of Caches

Size does matter Typical cache size: 1K - 512K. Cost: more cache is more expensive. Speed: more cache is faster (up to a point), but checking the cache for data takes time.

Mapping Function Algorithms for mapping main memory blocks to cache lines; needed because C << M. Approaches: Direct, Associative, Set-associative.

Mapping Function Example Cache of 64 KBytes; cache block of 4 bytes, i.e. the cache has 16K (2^14) lines of 4 bytes (why?). 16 MBytes of main memory, byte addressable, giving a 24-bit address (2^24 = 16M) and 4M blocks. C = 16K, M = 4M, C << M.

Direct Mapping (1) Each block of main memory maps to only one possible cache line, i.e. if a block is in the cache, it must be in one specific place. Mapping: i = j mod m, where i = cache line number, j = main memory block number, m = number of lines (i.e., C).

Direct Mapping (2) Example of the mapping with 16 blocks and 4 lines (reproduced by the sketch below):
Line   Blocks
0      0, 4, 8, 12
1      1, 5, 9, 13
2      2, 6, 10, 14
3      3, 7, 11, 15
Which block is in the line? No two blocks that map to the same line have the same Tag field in their address. Check the contents of the cache by finding the line and then checking the Tag.
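The table above follows directly from i = j mod m; a small sketch with m = 4 lines and 16 blocks:

```python
m = 4  # number of cache lines
for i in range(m):
    blocks = [j for j in range(16) if j % m == i]
    print(i, blocks)
# 0 [0, 4, 8, 12]
# 1 [1, 5, 9, 13]
# 2 [2, 6, 10, 14]
# 3 [3, 7, 11, 15]
```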

Direct Mapping - Address Structure The address has 3 fields. The least significant w bits identify a unique word in a block (or line); the most significant s bits specify one memory block. The MSBs are split into a cache line field of r bits, where m = 2^r (i.e., C = 2^r), and a tag of s-r (most significant) bits.

Direct Mapping Cache Line Table
Cache line   Main memory blocks held
0            0, m, 2m, …, 2^s - m
1            1, m+1, 2m+1, …, 2^s - m + 1
:            :
m-1          m-1, 2m-1, 3m-1, …, 2^s - 1

Direct Mapping Cache Organization

Direct Mapping Example (1) Tag (s-r) = 8 bits, Line or Slot (r) = 14 bits, Word (w) = 2 bits. 24-bit address: 2-bit word identifier (4-byte block), 22-bit block identifier, 8-bit tag (= 22 - 14), 14-bit slot or line. Again, no two blocks that map to the same line have the same Tag field; check the contents of the cache by finding the line and checking the Tag.

Exercise How many total bits are required for a direct-mapped cache with 16 KB of data and 4-word blocks, assuming a 32-bit address?
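One possible working, as a sketch: it assumes each cache entry stores the data, a tag, and a single valid bit, which is the usual way this exercise is counted but is not stated on the slide:

```python
address_bits = 32
block_words  = 4                     # 4-word blocks
word_bytes   = 4
data_bytes   = 16 * 1024             # 16 KB of data

blocks      = data_bytes // (block_words * word_bytes)   # 1024 = 2**10 lines
index_bits  = 10
offset_bits = 4                      # 2 bits word-in-block + 2 bits byte-in-word
tag_bits    = address_bits - index_bits - offset_bits     # 18

total_bits = blocks * (block_words * 32 + tag_bits + 1)   # data + tag + valid
print(total_bits)  # 150528 bits = 147 Kbits
```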

Direct Mapping Example (2) Q1: Where in the cache is the word from main memory location 16339D mapped? 16339D = 0001 0110 | 00 1100 1110 0111 | 01 (Tag 8 bits | Line 14 bits | Word 2 bits). Ans: Line 0CE7, Tag 16, word offset 1. Q2: Where in the cache is the word from main memory location ABCDEF mapped?
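The same field extraction can be checked mechanically; a sketch using the 8/14/2-bit split above:

```python
def split_24bit(addr):
    word = addr & 0x3              # 2-bit word-in-block
    line = (addr >> 2) & 0x3FFF    # 14-bit line (slot)
    tag  = (addr >> 16) & 0xFF     # 8-bit tag
    return tag, line, word

print([hex(x) for x in split_24bit(0x16339D)])  # ['0x16', '0xce7', '0x1']
print([hex(x) for x in split_24bit(0xABCDEF)])  # answer to Q2
```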

Direct Mapping Summary Address length = (s + w) bits. Number of addressable units = 2^(s+w) words or bytes. Block size = line size = 2^w words or bytes. Number of blocks in main memory = 2^(s+w) / 2^w = 2^s. Number of lines in cache = m = 2^r. Size of tag = (s - r) bits.

Exercise Consider a 64 KB direct-mapped cache with a 16-byte block size. Show how a 32-bit address is partitioned to access the cache.

Solution There are 64K / 16 = 4K = 4096 = 2^12 lines in the cache. The lower 2 bits are ignored (byte within a word); the next higher 2 bits give the position of the word within the block; the next higher 12 bits are the index; the remaining 16 bits are the tag.

Direct Mapping Example For a direct-mapped cache, each main memory block can be mapped to only one slot, but each slot can receive more than one block. Consider how an access to memory location (A035F014)16 is mapped to the cache for a 2^32-word memory. The memory is divided into 2^27 blocks of 2^5 = 32 words per block, and the cache consists of 2^14 slots:

If the addressed word is in the cache, it will be found in word (14)16 of slot (2F80)16, which will have a tag of (1406)16.
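A quick check of this mapping as a sketch (the memory here is word-addressable, so the low 5 bits select the word within the 32-word slot):

```python
addr = 0xA035F014
word = addr & 0x1F            # 5-bit word within the slot
slot = (addr >> 5) & 0x3FFF   # 14-bit slot number
tag  = addr >> 19             # remaining 13 tag bits
print(hex(tag), hex(slot), hex(word))  # 0x1406 0x2f80 0x14
```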

Exercise Consider a Cache with 64 blocks and a block size of 16 bytes. What block number does byte address 1200 map to?

Solution We know that: cache block = (block address) modulo (number of cache blocks), where block address = ⌊byte address / bytes per block⌋. Note that this block address is the block containing all byte addresses between ⌊byte address / bytes per block⌋ × bytes per block and ⌊byte address / bytes per block⌋ × bytes per block + (bytes per block - 1).

So, with 16 bytes per block, byte address 1200 gives block address 1200 / 16 = 75, which maps to cache block number (75 modulo 64) = 75 - 64 = 11. Note: this block holds all addresses between 1200 and 1215.
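The same computation as a sketch, parameterized by the cache geometry from the exercise:

```python
def cache_block(byte_address, bytes_per_block=16, num_blocks=64):
    block_address = byte_address // bytes_per_block
    first = block_address * bytes_per_block            # first byte in this block
    last  = first + bytes_per_block - 1                # last byte in this block
    return block_address % num_blocks, (first, last)

print(cache_block(1200))  # (11, (1200, 1215))
```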

Direct Mapping pros & cons Advantages: simple; inexpensive to implement. Disadvantage: fixed location for a given block → if a program repeatedly accesses 2 blocks that map to the same line, cache misses are very high → these blocks will be continually swapped in and out → the hit ratio will be low.