Computer Architecture: Memory Organization



Types of Memory
Cache Memory: serves as a buffer for frequently accessed data. Small and fast, but high cost per bit.
RAM (Main Memory): stores the programs and data that the computer needs when executing a program.
Dynamic RAM (DRAM): uses tiny capacitors; must be refreshed every few milliseconds to retain the stored data.
Static RAM (SRAM): holds its data as long as power is applied; built from D flip-flops.

Types of Memory (Cont.)
ROM: stores critical information necessary to operate the system. Hardwired, so it cannot be reprogrammed.
Programmable Read-Only Memory (PROM): can be programmed once using appropriate equipment.
Erasable PROM (EPROM): can be programmed with a special tool, but must be erased in its entirety before it can be reprogrammed.
Electrically Erasable PROM (EEPROM): no special tools required; a portion can be erased on its own.

Memory Hierarchy
The idea: hide the slower memory behind the fast memory. Cost and performance play major roles in selecting a memory technology.

Hit vs. Miss
Hit: the requested data resides in a given level of memory.
Miss: the requested data is not found in the given level of memory.
Hit rate: the percentage of memory accesses found in a given level of memory.
Miss rate: the percentage of memory accesses not found in a given level of memory.

Hit vs. Miss (Cont.)
Hit time: the time required to access the requested information in a given level of memory.
Miss penalty: the time required to process a miss: replacing a block in an upper level of memory, plus the additional time needed to deliver the requested data to the processor.

Miss Scenario
The processor sends a request to the cache for location X. If found, it is a cache hit; if not, try the next level. When the location is found, load the whole block into the cache, hoping that the processor will access one of the neighboring locations next. One miss may thus lead to multiple hits: locality. Can we compute the average access time based on this memory hierarchy?

Average Access Time
Assume a memory hierarchy with three levels (L1, L2, and L3). What is the average memory access time?
h1: hit rate at L1; (1 − h1): miss rate at L1; t1: L1 access time.
h2: hit rate at L2; (1 − h2): miss rate at L2; t2: L2 access time.
h3: hit rate at L3 = 100% (the last level always hits); t3: L3 access time.
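One way to answer the slide's question, as a minimal sketch: assume sequential access (each level is probed only after a miss at the level above, so a miss pays that level's access time before moving on) and h3 = 100%. The hit rates and access times in the example are made-up illustrative numbers, not figures from the slides.

```python
def avg_access_time(h1, h2, t1, t2, t3):
    """Average access time for a three-level hierarchy under sequential
    access: a hit at a level pays the access times of that level and
    every level above it; the last level (L3) always hits."""
    return h1 * t1 + (1 - h1) * (h2 * (t1 + t2)
                                 + (1 - h2) * (t1 + t2 + t3))

# Illustrative numbers: h1 = 0.9, h2 = 0.95, t1 = 1, t2 = 10, t3 = 100
# gives an average of 2.5 time units.
```

Note how strongly the result is dominated by h1: even a 10% L1 miss rate more than doubles the average over the bare L1 access time.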

Locality of Reference
One miss may lead to multiple hits: locality.
Temporal locality: recently accessed items tend to be accessed again in the near future.
Spatial locality: when a given address has been referenced, it is likely that addresses near it will be referenced within a short period of time (for example, in arrays or loops).
Sequential locality (a special case of spatial locality): instructions tend to be accessed sequentially.

Cache Memory
Cache stores recently used data closer to the CPU. Your home is the cache and the main memory is the grocery store: you buy what will most probably be needed in the coming week.
How can the processor know which block(s) to bring into the cache? It cannot know in advance, but it can benefit from the locality concept.

Impact of Temporal Locality
Assume that a loop instruction is executed n times, that the first request caused a cache miss requiring t_m to load the requested block from the main memory into the cache, and that t_c is the cache access time.
What is the average access time? t_avg = (t_m + n·t_c) / n = t_c + t_m / n.
What does it mean? As n grows, the one-time miss cost is amortized and the average approaches the cache access time t_c.

Impact of Spatial Locality
Assume that m elements of a block are requested due to spatial locality, that the first request caused a cache miss requiring t_m to load the requested block from the main memory into the cache, and that t_c is the cache access time.
What is the average access time? t_avg = (t_m + m·t_c) / m = t_c + t_m / m.
What does it mean? The more neighboring elements are actually used, the cheaper the miss becomes per access.
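Both locality slides reduce to the same amortization formula, sketched here under the assumption that a single miss costs t_m on top of the per-access cache time t_c (the specific n, m, t_c, and t_m values below are illustrative, not from the slides):

```python
def loop_avg_time(k, t_c, t_m):
    """Average access time when one miss (extra cost t_m) is amortized
    over k accesses that each pay the cache access time t_c:
    t_avg = (t_m + k * t_c) / k = t_c + t_m / k."""
    return t_c + t_m / k

# Temporal locality: a loop body executed n = 100 times.
# Spatial locality: m = 8 words of the loaded block used in turn.
```

For example, with t_c = 1 and t_m = 50, amortizing over 100 loop iterations gives 1.5, while amortizing over 8 words of a block gives 7.25: the same miss, very different per-access cost.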

Cache Mapping Schemes

Cache memory is smaller than the main memory, so only a few blocks can be loaded into the cache at a time. The cache does not use the same memory addresses, so which block in the cache is equivalent to which block in memory? The processor uses the Memory Management Unit (MMU) to convert the requested memory address into a cache address.

Direct Mapping
Assigns cache mappings using a modular approach: j = i mod n, where j is the cache block number, i is the memory block number, and n is the number of cache blocks.
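The mapping rule is just a modulo, as this one-liner shows (the 10-block cache below mirrors the example on the next slide):

```python
def direct_map(i, n):
    """Direct mapping: memory block i always lands in cache block
    j = i mod n, where n is the number of cache blocks."""
    return i % n

# With a 10-block cache, memory blocks 5, 15, 25, ... all compete
# for the same cache block 5.
```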

Example
Given M memory blocks to be mapped to 10 cache blocks, show the direct mapping scheme. How do you know which memory block is currently in the cache?

Direct Mapping (Cont.)
Bits in the main memory address are divided into three fields.
Word: identifies a specific word within the block.
Block: identifies a unique block (slot) in the cache.
Tag: identifies which block from the main memory currently resides in that cache block.

Example
Consider, for example, the case of a main memory consisting of 4K blocks, a cache memory consisting of 128 blocks, and a block size of 16 words. Show the direct mapping and the main memory address format.
Word field: 4 bits (16 = 2^4 words per block). Block field: 7 bits (128 = 2^7 cache blocks). Tag field: 5 bits (4K = 2^12 memory blocks, and 12 − 7 = 5).
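The field widths follow mechanically from the sizes, all assumed to be powers of two. A small helper (a sketch, not part of the original slides) makes the computation explicit:

```python
import math

def direct_fields(mem_blocks, cache_blocks, words_per_block):
    """Bit widths (tag, block, word) of a main memory address under
    direct mapping, assuming all sizes are powers of two."""
    word = int(math.log2(words_per_block))    # word within the block
    block = int(math.log2(cache_blocks))      # cache slot
    tag = int(math.log2(mem_blocks)) - block  # which memory block is there
    return tag, block, word

# 4K memory blocks, 128 cache blocks, 16-word blocks -> (5, 7, 4)
```

The same function answers the group activity that follows; only the three size parameters change.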


Direct Mapping
Advantages: easy; does not require any search technique to find a block in the cache; replacement is straightforward.
Disadvantages: many blocks in main memory are mapped to the same cache block, so blocks may be evicted while other cache blocks sit empty: poor cache utilization.

Group Activity
Consider the case of a main memory consisting of 4K blocks, a cache memory consisting of 8 blocks, and a block size of 4 words. Show the direct mapping and the main memory address format.

Group Activity
Given the following direct mapping chart, what are the cache and memory locations required by the following addresses:

Fully Associative Mapping
Allows any memory block to be placed anywhere in the cache. A search technique is required to find the matching block number in the tag field.

Example
We have a main memory with 2^14 words, a cache with 16 blocks, and a block size of 8 words. How many tag and word field bits?
Word field: 3 bits (8 = 2^3 words per block). Tag field: 11 bits (2^14 / 8 = 2^11 = 2048 memory blocks).

Which main memory block is in the cache?
Naïve method: a tag field is associated with each cache block; compare the requested tag with each tag entry in the cache to check for a hit.
CAM (Content Addressable Memory): words can be fetched on the basis of their contents, rather than on the basis of their addresses or locations. For example: find the addresses of all "Smiths" in Dallas.

Fully Associative Mapping
Advantages: flexibility; good cache utilization.
Disadvantages: requires a tag search; an associative (parallel) search may require an extra hardware unit; requires a replacement strategy when the cache is full; expensive.
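To make the "required tag search" concrete, here is a sketch of a fully associative lookup. A real CAM compares all tags in parallel in hardware; the linear scan below only stands in for that behavior:

```python
def fa_lookup(cache_tags, tag):
    """Fully associative lookup: the requested tag may match any cache
    block, so every stored tag must be compared (in hardware, a CAM
    does all comparisons in parallel). Returns the slot index on a
    hit, or None on a miss."""
    for slot, stored_tag in enumerate(cache_tags):
        if stored_tag == tag:
            return slot  # hit
    return None          # miss
```

This is exactly the cost direct mapping avoids: there, the block field of the address selects a single slot, so only one tag comparison is ever needed.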

N-way Set Associative Mapping
Combines direct and fully associative mapping. The cache is divided into sets of blocks, all of the same size. Main memory blocks are mapped to a specific set by s = i mod S, where s is the set to which block i maps and S is the total number of sets. An incoming block can then be placed in any cache block within that set.
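The placement rule is direct mapping at the granularity of sets, sketched below (the 32-set example mirrors the 4-way group activity later in the deck):

```python
def set_map(i, S):
    """Set-associative placement: memory block i maps to set
    s = i mod S; within that set it may occupy any of the N ways."""
    return i % S

# With S = 32 sets, memory blocks 3, 35, 67, ... all map to set 3,
# but can live side by side in different ways of that set.
```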

N-way Set Associative Mapping
Tag field: uniquely identifies the targeted block within the determined set.
Word field: identifies the element (word) within the block that is requested by the processor.
Set field: identifies the set.


Group Activity
Compute the three parameters (Word, Set, and Tag) for a memory system with the following specification: the main memory is 4K blocks, the cache is 128 blocks, and the block size is 16 words. Assume that the system uses 4-way set-associative mapping.

Answer
Word field: 4 bits (16 = 2^4 words per block). Number of sets: 128 / 4 = 32, so the Set field is 5 bits. Tag field: 7 bits (4K = 2^12 memory blocks, and 12 − 5 = 7).
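The computation generalizes directly; this helper (a sketch, assuming all sizes are powers of two) reproduces the answer above:

```python
import math

def set_assoc_fields(mem_blocks, cache_blocks, words_per_block, ways):
    """Bit widths (tag, set, word) of a main memory address under
    N-way set-associative mapping."""
    word = int(math.log2(words_per_block))       # word within the block
    sets = cache_blocks // ways                  # S = cache blocks / N
    set_bits = int(math.log2(sets))              # which set
    tag = int(math.log2(mem_blocks)) - set_bits  # which block in the set
    return tag, set_bits, word

# 4K memory blocks, 128-block cache, 16-word blocks, 4-way -> (7, 5, 4)
```

Note the two extremes: with ways = 1 this degenerates to direct mapping, and with ways = cache_blocks (one set) it becomes fully associative.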

N-way Set Associative Mapping
Advantage: moderate cache utilization.
Disadvantage: still needs a tag search, although only within the set.

If the cache is full and a block must be replaced, which one should be replaced?

Cache Replacement Policies
Random: simple; requires a random generator.
First In First Out (FIFO): replace the block that has been in the cache the longest; requires keeping track of each block's lifetime.
Least Recently Used (LRU): replace the block that has gone unused for the longest time; requires keeping track of each block's access history.

Cache Replacement Policies (Cont.)
Most Recently Used (MRU): replace the block that was used most recently; requires keeping track of each block's access history.
Optimal: replace the block that will not be needed for the longest time; hypothetical, since it requires knowing the future.
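A minimal sketch of the LRU policy used in the example that follows: an OrderedDict keeps blocks in recency order, so the least recently used block is always at the front. This models only the replacement decision, not tags or data:

```python
from collections import OrderedDict

class LRUCache:
    """Fully associative cache of n_blocks blocks with LRU replacement.
    access() returns True on a hit, False on a miss."""
    def __init__(self, n_blocks):
        self.n = n_blocks
        self.blocks = OrderedDict()  # insertion order = recency order

    def access(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)   # hit: now most recent
            return True
        if len(self.blocks) >= self.n:
            self.blocks.popitem(last=False)  # full: evict LRU block
        self.blocks[block] = True            # load the new block
        return False
```

Swapping `popitem(last=False)` for `popitem(last=True)` would turn this into MRU replacement, which is the entire difference between the two policies.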

Example
Consider the case of a 4×8 two-dimensional array of numbers, A. Assume that each number in the array occupies one word and that the array elements are stored in column-major order in main memory starting at location 1000. The cache consists of eight blocks, each consisting of just two words. Assume also that, whenever needed, the LRU replacement policy is used. We would like to examine the changes in the cache when direct mapping is used as the following sequence of requests for array elements is made by the processor:

Array elements in the main memory

Conclusion
16 cache misses and not a single hit; 12 replacements; only 4 of the cache blocks are used.

Group Activity
Do the same for fully associative and 4-way set-associative mappings.