Principles of Computer Architecture Chapter 4: Memory


1 Principles of Computer Architecture Chapter 4: Memory

2 Chapter Contents 1. The Memory Hierarchy 2. Random Access Memory
3. Chip Organization 4. Commercial Memory Modules 5. Read-Only Memory 6. Cache Memory 7. Virtual Memory

3 The Memory Hierarchy

4 Functional Behavior of a RAM Cell

5 Simplified RAM Chip Pinout

6

7 A Four-Word Memory with Four Bits per Word in a 2D Organization

8 A Simplified Representation of the Four-Word by Four-Bit RAM

9 Organization of a 64-Word by One-Bit RAM

10 Two Four-Word by Four-Bit RAMs are Used in Creating a Four-Word by Eight-Bit RAM

11 Two Four-Word by Four-Bit RAMs Make up an Eight-Word by Four-Bit RAM

12

13 Memory Packaging

14 Single-In-Line Memory Module
Adapted from (Texas Instruments, MOS Memory: Commercial and Military Specifications Data Book, Texas Instruments Literature Response Center, P.O. Box, Denver, Colorado, 1991.)

15 DIMM: Dual Inline Memory Module
At present, DIMMs are the standard way for memory to be packaged. Each DIMM has 84 gold-plated connectors on each side, for a total of 168 connectors. A DIMM can deliver 64 data bits at once. Typical DIMM capacities are 64 MB and up. A smaller DIMM used in notebooks is called a SO-DIMM. The average error rate of a module is about 1 error every 10 years!

16 A ROM Stores Four Four-Bit Words

17 A Lookup Table (LUT) Implements an Eight-Bit ALU

18 Memory Cells with addresses: each cell contains ordered bits (usually 8), the most common definition of a "byte". An n-bit byte can represent 2^n different "codes". Addresses are sequential bit patterns: an m-bit address can specify 2^m cells, e.g. a 32-bit address supports approx. 4G cells. Some machines use "byte-addressable" memory; others use "word-addressable".

19

20

21 Quiz! 1) What are the sizes of the MAR and MDR in a machine having: a 4M x 32 RAM? an 18-Mbyte addressable RAM? a 24-Mbit, 16-bit-word RAM? 2) What is the size of the MAR needed to address a 28 MB memory? 3) If the IR has 16 bits, how many instructions can my machine have if the operands are coded on 9 bits?
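A hedged worked sketch (in Python) for the first and last questions, assuming 4M means 2^22 addressable words and that the opcode occupies whatever IR bits the 9-bit operand field leaves over:

```python
# Question 1, first case: a 4M x 32 RAM.
words = 4 * 2**20                  # 4M addressable words = 2^22
mar_bits = words.bit_length() - 1  # MAR must select any one word -> 22 bits
mdr_bits = 32                      # MDR holds one full 32-bit word
print(mar_bits, mdr_bits)          # 22 32

# Question 3: 16-bit IR, operands coded on 9 bits.
opcode_bits = 16 - 9               # remaining bits encode the opcode
print(2 ** opcode_bits)            # 128 possible instructions
```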

22 Computer Memory System Overview
Characteristics of memory systems: Location, Capacity, Unit of transfer, Access method, Performance, Physical type, Physical characteristics, Organization.

23 Location CPU: registers, control memory Internal: main memory
External: secondary memory

24 Capacity In terms of words or bytes (1 byte = 8 bits). Word size:
the natural unit of organization; sizes of 8, 16, and 32 bits are common, even 64 bits.

25 Unit of Transfer Number of data elements transferred at a time
Internal: usually governed by the data bus width. External: usually a block, which is much larger than a word.

26 Addressable Unit Smallest location which can be uniquely addressed
Word or byte. E.g., Motorola 68000: word = 16 bits; internal transfer unit = 16 bits; addressable unit = 8 bits (byte addressable). Let A = address length in bits and N = # of addressable units → 2^A = N.

27 Access Methods (1) Sequential Data does not have a unique address
Start at the beginning and read through in order; intermediate data items must be read until the desired item is found. Access time depends on the location of the data and on the previous location, e.g. tape.

28 Access Methods (2) Direct Individual blocks have unique addresses
Access is by jumping to vicinity plus sequential search Access time depends on location and previous location e.g. disk

29 Access Methods (3) Random
Individual addresses identify locations exactly Location can be selected randomly and addressed and accessed directly Access time is independent of location or previous access (i.e., constant) e.g. RAM

30 Access Methods (4) Associative A variation of random access
Data is located by a comparison with contents of a portion of the store All words are searched simultaneously Access time is independent of location or previous access e.g. cache

31 Performance (1) Access time Memory Cycle time
Access time: time between presenting the address and getting the valid data. For random-access memory: the time to address the data unit and perform the transfer. For non-random-access memory: the time to position the hardware mechanism at the desired position. Memory cycle time: primarily applied to random-access memory; time may be required for the memory to "recover" before the next access. Cycle time = access time + recovery time.

32 Performance (2) Transfer Rate: R bps
Rate at which data can be transferred into/out of memory. For random-access memory, R = 1/(memory cycle time). For non-random-access memory, T_N = T_A + N/R, where T_N = average time to read/write N bits, T_A = average access time, and N = number of bits.
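As a rough illustration of the non-random-access formula T_N = T_A + N/R (the timing values below are made up for the example):

```python
T_A = 0.010       # average access (positioning) time in seconds -- hypothetical
R = 8_000_000     # transfer rate in bits per second -- hypothetical
N = 4_000_000     # number of bits to read or write

T_N = T_A + N / R                # average time to transfer the N bits
print(f"T_N = {T_N:.3f} s")      # 0.510 s for these numbers
```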

33 Physical Types Semiconductor: RAM. Magnetic: disk & tape. Optical: CD & DVD.

34 Physical Characteristics
Decay Volatility Erasability Power consumption

35 Organization Physical arrangement of bits into words
Not always obvious e.g. interleaved

36 The Bottom Line How much? Capacity.
How fast? Time is money. How expensive? Cost/bit.

37 Memory Hierarchy (1) Major design objective of memory systems
Provision of adequate storage capacity at an acceptable level of performance and at a reasonable cost. Memory technologies: smaller access time → greater cost/bit; greater capacity → smaller cost/bit; greater capacity → greater access time. DILEMMA → Solution: MEMORY HIERARCHY.

38

39 Internal Memory

40

41 Several types of dynamic RAM chips exist
Several types of dynamic RAM chips exist. The oldest type still in use is FPM (Fast Page Mode) DRAM, organized internally as a matrix of bits with RAS and CAS control signals for bit access. FPM DRAM is gradually being replaced by EDO (Extended Data Output) DRAM, which allows a second memory reference to begin before the previous one has completed, a simple form of pipelining that does not make an individual memory reference faster but does improve memory bandwidth.

42 Byte-ordering: big-endian vs. little-endian
Big-endian: bytes numbered left → right in each word. Little-endian: bytes numbered right → left in each word. In both schemes, numeric values are stored left → right, and in both schemes strings are stored in byte order. Problems arise when machines using different orderings communicate.
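A small sketch, using Python's standard struct module, of how the same 32-bit integer is laid out in memory under each byte ordering (the example value is arbitrary):

```python
import struct

value = 0x01020304

print(struct.pack(">I", value).hex())  # big-endian layout:    01020304
print(struct.pack("<I", value).hex())  # little-endian layout: 04030201

# A string occupies its bytes in order on both kinds of machine,
# which is why the communication problem above shows up for numeric values.
print(b"JIM".hex())                    # 4a494d either way
```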

43

44 Cache memory in a Computer System

45 Cache Memory

46 Address Locality principles
Caches depend on two kinds of address locality to achieve their goal. Temporal locality: a recently referenced memory location is likely to be referenced again. Spatial locality: a neighbor of a recently referenced memory location is likely to be referenced.

47 If a word is read k times, the CPU will need 1 reference to slow memory and k-1 references to fast memory. Cache access time: c. Main memory access time: m. Hit ratio: h (the fraction of all references that can be satisfied out of the cache); in this model h = (k-1)/k. Miss ratio: 1-h. The mean access time = c + (1-h)m.
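A quick numeric sketch of the mean-access-time formula with hypothetical timings (c = 1 ns for the cache, m = 10 ns for main memory):

```python
c = 1.0    # cache access time in ns -- hypothetical
m = 10.0   # main memory access time in ns -- hypothetical

for k in (2, 10, 100):        # a word read k times
    h = (k - 1) / k           # hit ratio from the slide's model
    mean = c + (1 - h) * m    # mean access time
    print(f"k={k:3d}  h={h:.2f}  mean={mean:.2f} ns")
```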

48 A system with three levels of cache.

49 Model for all caches: main memory is divided up into fixed-size blocks called cache lines. A cache line typically consists of 4 to 64 consecutive bytes. So with a 32-byte line size, line 0 is bytes 0 to 31, line 1 is bytes 32 to 63, and so on. Two configurations are possible: 1- Direct-mapped caches 2- Set-associative caches

50 A Direct Mapping Scheme for Cache Memory
Example of 2048 entries; each row in the cache contains exactly one cache line from main memory. With a 32-byte cache line, the cache can hold 64 KB. Each cache entry consists of 3 parts: 1- Valid bit: indicates whether the entry holds valid data or not; at boot, all entries are marked invalid. 2- Tag: a unique 16-bit value identifying the line of memory from which the data came. 3- Data: one cache line of 32 bytes.

51

52 In a direct-mapped cache, a given memory word can be stored in exactly one place within the cache. Given a memory address, there is only one place to look for it in the cache; if it's not there, then it is not in the cache. For storing and retrieving data from the cache, the virtual address is broken into 4 components as follows:

53 The 32-bit virtual address is split into: TAG: corresponds to the tag bits stored in the cache entry. LINE: indicates which cache entry holds the corresponding data, if present. WORD: tells which word within a line is referenced. BYTE: usually not used; only needed if a single byte is requested.

54 How it works: when the CPU produces a memory address, the hardware extracts the 11 LINE bits to index the cache (1 of 2048 entries). If the entry is valid, the TAG of the memory address is compared to the Tag field in the cache entry. If they agree, there is a cache hit; if the cache entry is invalid or the tags do not match, there is a cache miss.
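A minimal sketch of the field extraction described here, assuming the layout implied by slides 50 and 53: a 16-bit TAG, an 11-bit LINE (2048 entries), and a 5-bit offset into the 32-byte line, split into a 3-bit WORD and a 2-bit BYTE field if words are 4 bytes:

```python
def split_address(addr: int):
    """Split a 32-bit address into TAG | LINE | WORD | BYTE fields
    (16 + 11 + 3 + 2 bits), per the direct-mapped cache on these slides."""
    byte = addr & 0x3               # lowest 2 bits: byte within word
    word = (addr >> 2) & 0x7        # next 3 bits: word within the 32-byte line
    line = (addr >> 5) & 0x7FF      # next 11 bits: one of 2048 cache entries
    tag  = (addr >> 16) & 0xFFFF    # top 16 bits: compared against the stored tag
    return tag, line, word, byte

print(split_address(0x0001F020))    # -> (1, 1921, 0, 0)
```

The hardware would then index entry `line`, check its valid bit, and compare `tag` against the stored tag to decide hit or miss, as the slide describes.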

55 If the cache entry is not valid or the tags do not match, the needed entry is not present (cache miss). In this case, the 32-byte cache line is fetched from memory and stored in the cache, replacing what was there. If the existing cache line had been modified since it was loaded, it must be written back to main memory before being discarded.

56 This mapping scheme puts consecutive memory lines in consecutive cache entries (up to 64 KB of contiguous data can be stored in the cache). However, two lines whose addresses differ by 64K (65,536) bytes cannot be stored in the cache simultaneously.

57 Cache/Main-Memory Structure
Main memory: 2^n addressable words; each word has a unique n-bit address. It consists of M fixed-length blocks of K words each → M = 2^n / K. Cache: C slots (lines) of K words each, with C << M.

58 Cache/Main-Memory Structure
At any time, some subset of the blocks resides in the lines. As C << M, each line includes a tag indicating which block is being stored; the tag is a portion of an address.

59 (line)

60 Elements of Cache Design
Size Mapping Function Replacement Algorithm Write Policy Block Size Number of Caches

61 Size does matter Usually 1K - 512K.
Cost: more cache is more expensive. Speed: more cache is faster (up to a point), but checking the cache for data takes time.

62 Mapping Function Algorithms for mapping main memory blocks to cache lines; needed, as C << M. Approaches: Direct, Associative, Set Associative.

63 Mapping Function Example
Cache of 64 KBytes; cache blocks of 4 bytes, i.e. the cache is 16K (2^14) lines of 4 bytes (why?). Main memory of 16 MBytes, byte addressable: 24-bit address (2^24 = 16M), i.e. 4M blocks. So C = 16K, M = 4M, C << M.

64 Direct Mapping (1) Each block of main memory maps to only one possible cache line, i.e. if a block is in the cache, it must be in one specific place. Mapping: i = j mod m, where i = cache line number, j = memory block number, m = number of lines (i.e., C).

65 Direct Mapping (2) Example of mapping: 16 blocks, 4 lines
Line 0 holds blocks 0, 4, 8, 12; line 1 holds blocks 1, 5, 9, 13; line 2 holds blocks 2, 6, 10, 14; line 3 holds blocks 3, 7, 11, 15. Which block is in the line? No two blocks that map to the same line have the same Tag field in their address, so check the contents of the cache by finding the line and then checking the Tag.
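The assignment above can be reproduced directly from the mapping i = j mod m; a tiny sketch:

```python
m = 4                          # number of cache lines
for j in range(16):            # memory blocks 0..15
    print(f"block {j:2d} -> line {j % m}")
# line 0 gets blocks 0, 4, 8, 12; line 1 gets 1, 5, 9, 13; and so on.
```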

66 Direct Mapping - Address Structure
The address is in 3 fields. The least significant w bits identify a unique word in a block (or line). The most significant s bits specify one memory block. The MSBs are further split into a cache line field of r bits, where m = 2^r (or C = 2^r), and a tag of s-r (most significant) bits.

67 Direct Mapping Cache Line Table
Cache line 0 holds main memory blocks 0, m, 2m, …, 2^s - m; line 1 holds blocks 1, m+1, 2m+1, …, 2^s - m + 1; …; line m-1 holds blocks m-1, 2m-1, 3m-1, …, 2^s - 1.

68 Direct Mapping Cache Organization

69 Direct Mapping Example (1)
Address fields: Tag (s-r) = 8 bits, Line or Slot (r) = 14 bits, Word (w) = 2 bits. 24-bit address; 2-bit word identifier (4-byte block); 22-bit block identifier; 8-bit tag (= 22 - 14); 14-bit slot or line. Again, no two blocks that map to the same line have the same Tag field; check the contents of the cache by finding the line and checking the Tag.

70 Exercise How many total bits are required for a direct-mapped cache with 16 KB of data and 4-word blocks, assuming a 32-bit address?
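A hedged worked answer, assuming 32-bit words and one valid bit per cache line (the usual textbook assumptions, not stated on the slide):

```python
lines = (16 * 1024) // 16        # 16 KB of data / 16-byte (4-word) blocks = 1024 lines
index_bits = 10                  # 2^10 = 1024 lines
offset_bits = 2 + 2              # 2 bits select the word, 2 bits the byte within it
tag_bits = 32 - index_bits - offset_bits   # 18 tag bits
bits_per_line = 4 * 32 + tag_bits + 1      # 128 data bits + tag + valid = 147
print(lines * bits_per_line)               # 150528 bits (about 18.4 KB) in total
```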

71

72 Direct Mapping Example (2)
Address fields: Tag 8 bits, Line 14 bits, Word 2 bits. Q1: Where in the cache is the word from main memory location 16339D mapped? Ans: Line 0CE7, Tag 16, word offset 1. Q2: Where in the cache is the word from main memory location ABCDEF mapped?
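A small sketch that reproduces the slide's decomposition of a 24-bit address into an 8-bit tag, 14-bit line, and 2-bit word offset:

```python
def fields_24bit(addr: int):
    word = addr & 0x3             # 2-bit word offset within the 4-byte block
    line = (addr >> 2) & 0x3FFF   # 14-bit cache line (slot) number
    tag  = (addr >> 16) & 0xFF    # 8-bit tag
    return tag, line, word

tag, line, word = fields_24bit(0x16339D)
print(hex(tag), hex(line), word)                  # 0x16 0xce7 1 -- matches Q1's answer
print([hex(f) for f in fields_24bit(0xABCDEF)])   # Q2: ['0xab', '0x337b', '0x3']
```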

73 Direct Mapping Summary
Address length = (s + w) bits. Number of addressable units = 2^(s+w) words or bytes. Block size = line size = 2^w words or bytes. Number of blocks in main memory = 2^(s+w) / 2^w = 2^s. Number of lines in cache = m = 2^r. Size of tag = (s - r) bits.

74 Exercise Consider a 64K direct-mapped cache with a 16-byte block size. Show how a 32-bit address is partitioned to access the cache.

75 Solution There are 64K/16 = 4K = 4096 = 2^12 lines in the cache.
Lower 2 bits ignored. Next higher 2 bits = position of word within block. Next higher 12 bits = index. Remaining 16 bits = tag.

76 Direct Mapping Example
• For a direct-mapped cache, each main memory block can be mapped to only one slot, but each slot can receive more than one block. Consider how an access to memory location (A035F014)_16 is mapped to the cache for a 2^32-word memory. The memory is divided into 2^27 blocks of 2^5 = 32 words per block, and the cache consists of 2^14 slots:

If the addressed word is in the cache, it will be found in word (14)_16 of slot (2F80)_16, which will have a tag of (1406)_16.
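A one-off check of this decomposition (5-bit word, 14-bit slot, 13-bit tag, as set up on the previous slide):

```python
addr = 0xA035F014
word = addr & 0x1F            # 5 bits: word within the 32-word block
slot = (addr >> 5) & 0x3FFF   # 14 bits: slot number
tag  = addr >> 19             # remaining 13 bits: tag
print(hex(word), hex(slot), hex(tag))   # 0x14 0x2f80 0x1406 -- as stated above
```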

78 Exercise Consider a cache with 64 blocks and a block size of 16 bytes. What block number does byte address 1200 map to?

79 Solution We know that the cache block number is given by (block address) modulo (number of cache blocks),
where the block address is (byte address) / (bytes per block). Note that this block address identifies the block containing all addresses between floor(byte address / bytes per block) × bytes per block and floor(byte address / bytes per block) × bytes per block + (bytes per block - 1).

80 So, with 16-byte blocks, byte address 1200 corresponds to block address
1200 / 16 = 75, which maps to cache block number (75 modulo 64) = 75 - 64 = 11. Note: this block contains all addresses between 1200 and 1215.

81 Direct Mapping pros & cons
Advantages: simple; inexpensive to implement. Disadvantage: fixed location for a given block → if a program repeatedly accesses 2 blocks that map to the same line, cache misses are very high → the blocks will be continually swapped in and out → the hit ratio will be low.

