The Memory Hierarchy CPSC 321 Andreas Klappenecker.


Some Results from the Survey
Issues with the CS curriculum: CPSC 111 Computer Science Concepts & Programming, CPSC 310 Databases, CPSC 431 Software Engineering.
Something from the wish list: more C++, more software engineering, more focus on industry needs, less focus on industry needs.

Some Results from the Survey
Why (MIPS) assembly language? More detailed explanations of programming language xyz. Implement a slightly reduced version of the Pentium 4 or Athlon processors. Have another computer architecture class. Lack of information on the CS website about specialization...

Follow Up
CPSC 462 Microcomputer Systems. CPSC 410 Operating Systems. Go to seminars/lectures by Bjarne Stroustrup, Jaakko Jarvi, or Gabriel Dos Reis.

Today’s Menu
Caches

Memory
Current memory is largely implemented in CMOS technology. There are two alternatives:
SRAM: fast, but not area efficient; the value is stored in a pair of inverting gates.
DRAM: slower, but more area efficient; the value is stored as charge on a capacitor (and must be refreshed).

Static RAM

Dynamic RAM

Memory
Users want large and fast memories. SRAM is too expensive for main memory. DRAM is too slow for many purposes. Compromise: build a memory hierarchy.

Locality
If an item is referenced, then it will tend to be referenced again soon (temporal locality), and nearby data will tend to be referenced soon (spatial locality). Why does code have locality?
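
One way to see why code has locality is to look at the addresses touched by a simple summation loop. The sketch below is only an illustration: the function names, the 4-byte word size, and the concrete addresses are assumptions, not part of the original slides.

```python
def loop_trace(base, n, word=4, total_addr=0x1000):
    """Addresses touched by: for i in range(n): total += a[i] (hypothetical layout)."""
    trace = []
    for i in range(n):
        trace.append(base + i * word)  # read a[i]: the neighbour of the previous element
        trace.append(total_addr)       # update total: the same address in every iteration
    return trace


def classify(trace, word=4):
    """Count accesses that repeat an earlier address (temporal) or touch a
    direct neighbour of an earlier address (spatial)."""
    seen = set()
    temporal = spatial = 0
    for addr in trace:
        if addr in seen:
            temporal += 1
        elif addr - word in seen or addr + word in seen:
            spatial += 1
        seen.add(addr)
    return temporal, spatial


temporal, spatial = classify(loop_trace(base=0x2000, n=8))
print(temporal, spatial)
```

In this trace, every access except the first touch of `a[0]` and of `total` either repeats an address or lands next to one that was already used.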

Memory Hierarchy
The memory is organized as a hierarchy. The data in a level closer to the processor is a subset of the data in any level further away. The memory can consist of multiple levels, but data is typically copied between two adjacent levels at a time. Initially, we focus on two levels.

Memory Hierarchy

Two Level Hierarchy
Upper level (smaller and faster) and lower level (slower). A unit of information that is present or not within a level is called a block. If data requested by the processor is in the upper level, then this is called a hit; otherwise it is called a miss. If a miss occurs, then the data will be retrieved from the lower level. Typically, an entire block is transferred.

Cache
A cache represents some level of memory between the CPU and main memory. [More general definitions are often used.]

A Toy Example
Assumptions: suppose that processor requests are each one word, and that each block consists of one word.
Example: before the request, C = [X1, X2, …, Xn-1]. The processor requests Xn, which is not contained in C, so item Xn is brought from memory into the cache. After the request, C = [X1, X2, …, Xn-1, Xn].
Issues: what happens if the cache is full?
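
The toy example can be sketched in a few lines. This is a minimal sketch under the slide's assumptions (one-word requests, one-word blocks); the function name `request` and the string labels are hypothetical, and the "cache is full" case is deliberately left open, as on the slide.

```python
def request(cache, item):
    """Return True on a hit; on a miss, bring the item into the cache."""
    if item in cache:
        return True
    # Miss: fetch the item from the lower level and keep a copy.
    # The list is unbounded here, so a full cache is not modelled.
    cache.append(item)
    return False


cache = ["X1", "X2", "X3"]
first = request(cache, "X4")   # miss: X4 is not yet in the cache
second = request(cache, "X4")  # hit: X4 was brought in by the previous request
```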

Issues
How do we know whether the data item is in the cache? If it is, how do we find it? A simple strategy is the direct mapped cache: there is exactly one location where the data might be in the cache.

Direct Mapped Cache
Mapping: address modulo the number of blocks in the cache, x -> x mod B.
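
The mapping x -> x mod B can be written directly. The sketch below assumes block addresses (not byte addresses) and a hypothetical function name; when B is a power of two, the modulo is the same as keeping the low-order bits of the address.

```python
def cache_index(block_address, num_blocks):
    """Direct-mapped placement: x -> x mod B."""
    return block_address % num_blocks


# With B = 8 blocks, block addresses 1, 9, 17 and 25 all compete for index 1.
indices = [cache_index(x, 8) for x in (1, 9, 17, 25)]

# For a power-of-two B, x mod B equals the low-order log2(B) bits of x.
same_as_mask = all(cache_index(x, 8) == (x & 0b111) for x in range(64))
```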

Direct Mapped Cache
Cache with 1024 = 2^10 words. The tag from the cache is compared against the upper portion of the address. If tag = upper 20 bits and the valid bit is set, then we have a cache hit; otherwise it is a cache miss. What kind of locality are we taking advantage of?
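
The tag comparison can be sketched as follows. The sketch assumes 32-bit byte addresses and 4-byte words, so that an address splits into a 20-bit tag, a 10-bit index, and a 2-bit byte offset; that split, and all names in the code, are assumptions made for illustration. Only tags and valid bits are modelled, not the data.

```python
INDEX_BITS = 10   # 1024 = 2^10 one-word entries
OFFSET_BITS = 2   # assumed: 4-byte words, byte addressing


def split(address):
    """Split an address into (tag, index); the byte offset is dropped."""
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index


class DirectMappedCache:
    def __init__(self):
        self.valid = [False] * (1 << INDEX_BITS)
        self.tags = [0] * (1 << INDEX_BITS)

    def access(self, address):
        """Return True on a hit; on a miss, install the new tag."""
        tag, index = split(address)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit


cache = DirectMappedCache()
results = [
    cache.access(0x1234),         # miss: entry is not valid yet
    cache.access(0x1234),         # hit: same tag, valid bit set
    cache.access(0x1234 + 4096),  # miss: same index, different tag
    cache.access(0x1234),         # miss: the entry was replaced
]
```

Two addresses that are 4096 bytes apart (1024 words of 4 bytes) share an index in this layout, which is why the third access replaces the first entry.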

Direct Mapped Cache Example

Direct Mapped Cache
Taking advantage of spatial locality.

Hits vs. Misses
Read hits: this is what we want!
Read misses: stall the CPU, fetch the block from memory, deliver it to the cache, restart.
Write hits: either replace the data in both the cache and memory (write-through), or write the data only into the cache and write it back to memory later (write-back).
Write misses: read the entire block into the cache, then write the word.
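
The difference between the two write-hit policies shows up in the number of memory writes. The sketch below is a simplified model, not a description of any particular hardware: the class names, the dictionary-based cache and memory, and the explicit `evict` step are all assumptions.

```python
class WriteThroughCache:
    def __init__(self):
        self.cache = {}
        self.memory = {}
        self.memory_writes = 0

    def write(self, addr, value):
        # Write-through: update the cache and memory on every write.
        self.cache[addr] = value
        self.memory[addr] = value
        self.memory_writes += 1


class WriteBackCache:
    def __init__(self):
        self.cache = {}
        self.memory = {}
        self.dirty = set()
        self.memory_writes = 0

    def write(self, addr, value):
        # Write-back: update only the cache and remember that it is dirty.
        self.cache[addr] = value
        self.dirty.add(addr)

    def evict(self, addr):
        # Memory is updated only when a dirty entry leaves the cache.
        if addr in self.dirty:
            self.memory[addr] = self.cache[addr]
            self.memory_writes += 1
            self.dirty.discard(addr)
        self.cache.pop(addr, None)


wt, wb = WriteThroughCache(), WriteBackCache()
for value in (1, 2, 3):
    wt.write(0x40, value)
    wb.write(0x40, value)
writes_before_evict = wb.memory_writes
wb.evict(0x40)
```

Three writes to the same word cost three memory writes under write-through, but only one under write-back, at the price of memory being out of date until the eviction.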

Hits vs. Miss Example

What Block Size?
A larger block size tends to reduce cache misses, but the cache miss penalty increases. We need to balance these two constraints. How can we measure cache performance? How can we improve cache performance?
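
The effect of block size on the miss count can be explored with a small simulation. This is a sketch under stated assumptions: a direct-mapped cache with a fixed capacity of 16 words, word addresses, and a hypothetical function name. It counts misses only and does not model the miss penalty.

```python
def count_misses(word_addresses, cache_words, block_words):
    """Misses of a direct-mapped cache with fixed capacity and given block size."""
    num_blocks = cache_words // block_words
    resident = [None] * num_blocks      # block number currently held by each entry
    misses = 0
    for addr in word_addresses:
        block = addr // block_words     # which memory block the word belongs to
        index = block % num_blocks      # direct-mapped placement
        if resident[index] != block:
            resident[index] = block     # fetch the whole block on a miss
            misses += 1
    return misses


sequential = list(range(64))            # a scan with strong spatial locality
scan = [count_misses(sequential, 16, b) for b in (1, 4, 16)]

alternating = [0, 20] * 4               # two words that are far apart
ping_pong = [count_misses(alternating, 16, b) for b in (1, 4, 16)]
```

For the sequential scan, larger blocks cut the misses from 64 to 16 to 4. For the alternating pattern, the largest block size leaves only one block in the cache, so every access misses, which hints at why block size has to be balanced against the rest of the design.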

The performance of a cache depends on many parameters: memory stall clock cycles, read stall clock cycles, and write stall clock cycles.
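
These quantities are commonly related as follows in textbook treatments; the slide itself only names them, so the formulas below are an assumption about what is meant, and the function and parameter names are hypothetical. Memory stall cycles are taken as the sum of read stall cycles and write stall cycles, each estimated as accesses times miss rate times miss penalty, with an optional write-buffer term.

```python
def read_stall_cycles(reads, read_miss_rate, read_miss_penalty):
    """Reads per program x read miss rate x read miss penalty (in cycles)."""
    return reads * read_miss_rate * read_miss_penalty


def write_stall_cycles(writes, write_miss_rate, write_miss_penalty,
                       write_buffer_stalls=0):
    """Writes per program x write miss rate x write miss penalty, plus buffer stalls."""
    return writes * write_miss_rate * write_miss_penalty + write_buffer_stalls


def memory_stall_cycles(reads, read_miss_rate, read_miss_penalty,
                        writes, write_miss_rate, write_miss_penalty,
                        write_buffer_stalls=0):
    """Memory stall cycles = read stall cycles + write stall cycles."""
    return (read_stall_cycles(reads, read_miss_rate, read_miss_penalty)
            + write_stall_cycles(writes, write_miss_rate, write_miss_penalty,
                                 write_buffer_stalls))


# Made-up numbers, purely to exercise the formulas.
stalls = memory_stall_cycles(reads=1000, read_miss_rate=0.05, read_miss_penalty=40,
                             writes=500, write_miss_rate=0.1, write_miss_penalty=40)
```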

Cache Block Mapping
Direct mapped cache: a block goes in exactly one place in the cache.
Fully associative: a block can go anywhere in the cache. This makes a block difficult to find, so the entries are compared in parallel to speed up the search.

Cache Block Mapping
Set associative: each block maps to a unique set, and the block can be placed into any element of that set. The position is given by (block number) modulo (# of sets in cache). If the sets contain n elements, then the cache is called n-way set associative.
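
The set mapping can be sketched as a small simulator. The slides do not say which block is replaced when a set is full, so the least-recently-used (LRU) policy used here is an assumption, as are the class and method names. Only block numbers are tracked, not data.

```python
from collections import OrderedDict


class SetAssociativeCache:
    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # One ordered dict per set; the oldest entry is the LRU candidate.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, block_number):
        """Return True on a hit; on a miss, place the block in its set."""
        entries = self.sets[block_number % self.num_sets]  # (block number) mod (# of sets)
        if block_number in entries:
            entries.move_to_end(block_number)   # mark as most recently used
            return True
        if len(entries) >= self.ways:
            entries.popitem(last=False)         # assumed policy: evict the LRU block
        entries[block_number] = True
        return False


cache = SetAssociativeCache(num_sets=2, ways=2)
results = [cache.access(b) for b in (0, 2, 0, 4, 0, 2)]
```

Blocks 0, 2 and 4 all map to set 0. With two ways, block 0 survives the arrival of block 4 because block 2 was used less recently. Setting `ways=1` gives a direct mapped cache, and `num_sets=1` gives a fully associative one.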

Cache Types