15-447 Computer ArchitectureFall 2007 © November 14th, 2007 Majd F. Sakr CS-447– Computer Architecture.

Slides:



Advertisements
Similar presentations
Lecture 12 Reduce Miss Penalty and Hit Time
Advertisements

Memory Hierarchy CS465 Lecture 11. D. Barbara Memory CS465 2 Control Datapath Memory Processor Input Output Big Picture: Where are We Now?  The five.
CS2100 Computer Organisation Cache II (AY2014/2015) Semester 2.
CS 430 Computer Architecture 1 CS 430 – Computer Architecture Caches, Part II William J. Taffe using slides of David Patterson.
1 Lecture 20: Cache Hierarchies, Virtual Memory Today’s topics:  Cache hierarchies  Virtual memory Reminder:  Assignment 8 will be posted soon (due.
The Lord of the Cache Project 3. Caches Three common cache designs: Direct-Mapped store in exactly one cache line Fully Associative store in any cache.
CS61C L31 Caches II (1) Garcia, Fall 2006 © UCB GPUs >> CPUs?  Many are using graphics processing units on graphics cards for high-performance computing.
CS61C L22 Caches II (1) Garcia, Fall 2005 © UCB Lecturer PSOE, new dad Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine.
Lecturer PSOE Dan Garcia
How caches take advantage of Temporal locality
CS61C L23 Cache II (1) Chae, Summer 2008 © UCB Albert Chae, Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #23 – Cache II.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 32 – Caches III Prem Kumar of Northwestern has created a quantum inverter.
Modified from notes by Saeid Nooshabadi
CS61C L22 Caches III (1) A Carle, Summer 2006 © UCB inst.eecs.berkeley.edu/~cs61c/su06 CS61C : Machine Structures Lecture #22: Caches Andy.
CS61C L23 Caches I (1) Beamer, Summer 2007 © UCB Scott Beamer, Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #23 Cache I.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 31 – Caches II In this week’s Science, IBM researchers describe a new class.
CS61C L33 Caches III (1) Garcia, Spring 2007 © UCB Future of movies is 3D?  Dreamworks says they may exclusively release movies in this format. It’s based.
COMP3221: Microprocessors and Embedded Systems Lecture 26: Cache - II Lecturer: Hui Wu Session 2, 2005 Modified from.
CS61C L32 Caches II (1) Garcia, 2005 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures.
CS61C L32 Caches II (1) Garcia, Spring 2007 © UCB Experts weigh in on Quantum CPU  Most “profoundly skeptical” of the demo. D-Wave has provided almost.
Computer ArchitectureFall 2007 © November 12th, 2007 Majd F. Sakr CS-447– Computer Architecture.
CS61C L33 Caches III (1) Garcia © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture.
1  1998 Morgan Kaufmann Publishers Chapter Seven Large and Fast: Exploiting Memory Hierarchy.
EENG449b/Savvides Lec /13/04 April 13, 2004 Prof. Andreas Savvides Spring EENG 449bG/CPSC 439bG Computer.
COMP3221 lec34-Cache-II.1 Saeid Nooshabadi COMP 3221 Microprocessors and Embedded Systems Lectures 34: Cache Memory - II
CS61C L24 Cache II (1) Beamer, Summer 2007 © UCB Scott Beamer, Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #24 Cache II.
Cs 61C L17 Cache.1 Patterson Spring 99 ©UCB CS61C Cache Memory Lecture 17 March 31, 1999 Dave Patterson (http.cs.berkeley.edu/~patterson) www-inst.eecs.berkeley.edu/~cs61c/schedule.html.
1 COMP 206: Computer Architecture and Implementation Montek Singh Wed, Nov 9, 2005 Topic: Caches (contd.)
Computer ArchitectureFall 2008 © November 3 rd, 2008 Nael Abu-Ghazaleh CS-447– Computer.
CS61C L18 Cache2 © UC Regents 1 CS61C - Machine Structures Lecture 18 - Caches, Part II November 1, 2000 David Patterson
Memory: Virtual MemoryCSCE430/830 Memory Hierarchy: Virtual Memory CSCE430/830 Computer Architecture Lecturer: Prof. Hong Jiang Courtesy of Yifeng Zhu.
CS 61C L21 Caches II (1) Garcia, Spring 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine.
CS61C L32 Caches III (1) Garcia, Fall 2006 © UCB Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C.
Computer ArchitectureFall 2007 © November 12th, 2007 Majd F. Sakr CS-447– Computer Architecture.
Caches Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University See P&H 5.1, 5.2 (except writes)
Lecture 21 Last lecture Today’s lecture Cache Memory Virtual memory
CSC 4250 Computer Architectures December 5, 2006 Chapter 5. Memory Hierarchy.
Lecture Objectives: 1)Define set associative cache and fully associative cache. 2)Compare and contrast the performance of set associative caches, direct.
The Memory Hierarchy 21/05/2009Lecture 32_CA&O_Engr Umbreen Sabir.
CS 3410, Spring 2014 Computer Science Cornell University See P&H Chapter: , 5.8, 5.15.
CS1104 – Computer Organization PART 2: Computer Architecture Lecture 10 Memory Hierarchy.
CML CML CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics.
Lecture 08: Memory Hierarchy Cache Performance Kai Bu
Csci 211 Computer System Architecture – Review on Cache Memory Xiuzhen Cheng
CS2100 Computer Organisation Cache II (AY2015/6) Semester 1.
1 Chapter Seven. 2 Users want large and fast memories! SRAM access times are ns at cost of $100 to $250 per Mbyte. DRAM access times are ns.
Chapter 5 Memory III CSE 820. Michigan State University Computer Science and Engineering Miss Rate Reduction (cont’d)
Review °We would like to have the capacity of disk at the speed of the processor: unfortunately this is not feasible. °So we create a memory hierarchy:
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 14 – Caches III Google Glass may be one vision of the future of post-PC.
1 Chapter Seven. 2 Users want large and fast memories! SRAM access times are ns at cost of $100 to $250 per Mbyte. DRAM access times are ns.
LECTURE 12 Virtual Memory. VIRTUAL MEMORY Just as a cache can provide fast, easy access to recently-used code and data, main memory acts as a “cache”
COMP 3221: Microprocessors and Embedded Systems Lectures 27: Cache Memory - III Lecturer: Hui Wu Session 2, 2005 Modified.
Memory Hierarchy— Five Ways to Reduce Miss Penalty.
1 Memory Hierarchy Design Chapter 5. 2 Cache Systems CPUCache Main Memory Data object transfer Block transfer CPU 400MHz Main Memory 10MHz Bus 66MHz CPU.
CMSC 611: Advanced Computer Architecture
Improving Memory Access The Cache and Virtual Memory
CS61C : Machine Structures Lecture 6. 2
Memristor memory on its way (hopefully)
CS61C : Machine Structures Lecture 6. 2
Part V Memory System Design
Lecture 23: Cache, Memory, Virtual Memory
Lecturer PSOE Dan Garcia
Lecture 22: Cache Hierarchies, Memory
Lecture 22: Cache Hierarchies, Memory
Instructor Paul Pearce
Lecturer PSOE Dan Garcia
ECE232: Hardware Organization and Design
CS-447– Computer Architecture Lecture 20 Cache Memories
Some of the slides are adopted from David Patterson (UCB)
Csci 211 Computer System Architecture – Review on Cache Memory
Presentation transcript:

Computer ArchitectureFall 2007 © November 14th, 2007 Majd F. Sakr CS-447– Computer Architecture M,W 10-11:20am Lecture 21 Set Associative Cache

Computer ArchitectureFall 2007 © Review… °Mechanism for transparent movement of data among levels of a storage hierarchy set of address/value mappings address => index to set of candidate blocks compare desired address with tag service hit or miss -load new block and binding on miss Valid Tag 0x0-3 0x4-70x8-b0xc-f abcd address: tag index offset

Computer ArchitectureFall 2007 © Review… °Drawbacks of Larger Block Size Larger block size means larger miss penalty -on a miss, takes longer time to load a new block from next level If block size is too big relative to cache size, then there are too few blocks -Result: miss rate goes up °In general, minimize Average Access Time = Hit Time + Miss Penalty x Miss Rate

Computer ArchitectureFall 2007 © Review… °Hit Time = time to find and retrieve data from current level cache °Miss Penalty = average time to retrieve data on a current level miss (includes the possibility of misses on successive levels of memory hierarchy) °Hit Rate = % of requests that are found in current level cache °Miss Rate = 1 - Hit Rate

Computer ArchitectureFall 2007 © Types of Cache Misses (1/2) °“Three Cs” Model of Misses °1st C: Compulsory Misses occur when a program is first started cache does not contain any of that program’s data yet, so misses are bound to occur can’t be avoided easily, so won’t focus on these in this course

Computer ArchitectureFall 2007 © Types of Cache Misses (2/2) °2nd C: Conflict Misses miss that occurs because two distinct memory addresses map to the same cache location two blocks (which happen to map to the same location) can keep overwriting each other big problem in direct-mapped caches how do we lessen the effect of these? °Dealing with Conflict Misses Solution 1: Make the cache size bigger -Fails at some point Solution 2: Multiple distinct blocks can fit in the same cache Index?

Computer ArchitectureFall 2007 © Fully Associative Cache (1/3) °Memory address fields: Tag: same as before Offset: same as before Index: non-existant °What does this mean? no “rows”: any block can go anywhere in the cache must compare with all tags in entire cache to see if data is there

Computer ArchitectureFall 2007 © Fully Associative Cache (2/3) °Fully Associative Cache (e.g., 32 B block) compare tags in parallel Byte Offset : Cache Data B : Cache Tag (27 bits long) Valid : B 1B 31 : Cache Tag = = = = = :

Computer ArchitectureFall 2007 © Fully Associative Cache (3/3) °Benefit of Fully Assoc Cache No Conflict Misses (since data can go anywhere) °Drawbacks of Fully Assoc Cache Need hardware comparator for every single entry: if we have a 64KB of data in cache with 4B entries, we need 16K comparators: infeasible

Computer ArchitectureFall 2007 © Third Type of Cache Miss °Capacity Misses miss that occurs because the cache has a limited size miss that would not occur if we increase the size of the cache sketchy definition, so just get the general idea °This is the primary type of miss for Fully Associative caches.

Computer ArchitectureFall 2007 © N-Way Set Associative Cache (1/4) °Memory address fields: Tag: same as before Offset: same as before Index: points us to the correct “row” (called a set in this case) °So what’s the difference? each set contains multiple blocks once we’ve found correct set, must compare with all tags in that set to find our data

Computer ArchitectureFall 2007 © N-Way Set Associative Cache (2/4) °Summary: cache is direct-mapped w/respect to sets each set is fully associative basically N direct-mapped caches working in parallel: each has its own valid bit and data

Computer ArchitectureFall 2007 © N-Way Set Associative Cache (3/4) °Given memory address: Find correct set using Index value. Compare Tag with all Tag values in the determined set. If a match occurs, hit!, otherwise a miss. Finally, use the offset field as usual to find the desired data within the block.

Computer ArchitectureFall 2007 © N-Way Set Associative Cache (4/4) °What’s so great about this? even a 2-way set assoc cache avoids a lot of conflict misses hardware cost isn’t that bad: only need N comparators °In fact, for a cache with M blocks, it’s Direct-Mapped if it’s 1-way set assoc it’s Fully Assoc if it’s M-way set assoc so these two are just special cases of the more general set associative design

Computer ArchitectureFall 2007 © Associative Cache Example °Recall this is how a simple direct mapped cache looked. °This is also a 1-way set- associative cache! Memory Memory Address A B C D E F 4 Byte Direct Mapped Cache Cache Index

Computer ArchitectureFall 2007 © Associative Cache Example °Here’s a simple 2 way set associative cache. Memory Memory Address A B C D E F Cache Index