Cache Operation.

Lecture 5: Cache Operation. ECE 463/521, Fall 2002, Edward F. Gehringer. Based on notes by Drs. Eric Rotenberg & Tom Conte of NCSU.

Cache Parameters
SIZE = total amount of cache data storage, in bytes
BLOCKSIZE = total number of bytes in a single block
ASSOC = associativity, i.e., the number of blocks in a set

Cache Parameters (cont.)
Number of cache blocks in the cache: # blocks = SIZE / BLOCKSIZE
Number of sets in the cache: # sets = SIZE / (BLOCKSIZE x ASSOC)
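These two formulas can be checked with a few lines of Python (a sketch; the parameter values are those of the direct-mapped example later in these slides):

```python
# Cache geometry from the three parameters defined above.
SIZE = 256        # total data storage, in bytes
BLOCKSIZE = 32    # bytes per block
ASSOC = 1         # blocks per set (1 = direct-mapped)

num_blocks = SIZE // BLOCKSIZE            # 256 / 32 = 8 blocks
num_sets = SIZE // (BLOCKSIZE * ASSOC)    # 256 / (32 * 1) = 8 sets

print(num_blocks, num_sets)  # 8 8
```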

Address Fields
An address is divided into three fields, from bit 31 down to bit 0: tag | index | block offset.
Tag: compared to the tag(s) of the indexed cache block(s). If there is a match, the memory block is there (hit); if there isn't a match, the memory block is not there (miss).
Index: used to look up a "set", which contains one or more memory blocks. The number of blocks in a set is the "associativity".
Block offset: once the block is found, selects a particular byte or word of data in the block.

Address Fields (cont.)
Widths of the address fields (# bits), assuming 32-bit addresses:
# index bits = log2(# sets)
# block offset bits = log2(block size)
# tag bits = 32 - # index bits - # block offset bits
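As a sketch, the field widths and the field extraction can be written directly from these formulas (Python; the 8-set, 32-byte-block geometry of the upcoming direct-mapped example is assumed):

```python
from math import log2

NUM_SETS = 8      # assumed: the 256 B direct-mapped example below
BLOCKSIZE = 32    # bytes per block
ADDR_BITS = 32

offset_bits = int(log2(BLOCKSIZE))               # 5
index_bits = int(log2(NUM_SETS))                 # 3
tag_bits = ADDR_BITS - index_bits - offset_bits  # 24

def split_address(addr):
    """Split a 32-bit address into (tag, index, block offset)."""
    offset = addr & (BLOCKSIZE - 1)
    index = (addr >> offset_bits) & (NUM_SETS - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

tag, index, offset = split_address(0xFF0040E0)
print(hex(tag), index, offset)  # 0xff0040 7 0
```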

[Figure: direct-mapped cache lookup.] The address from the processor is split into tag, index, and block offset fields. The index selects one entry in the TAGS array and the corresponding block in the DATA array. The stored tag is compared with the address tag ("Match?") to produce the hit/miss signal, and the block offset performs byte-or-word select on the block, returning the requested data to the processor.

Example: Direct-mapped cache
A processor accesses a 256-byte direct-mapped cache, which has a block size of 32 bytes, with the following sequence of addresses. Show the contents of the cache after each access, and count the number of hits and the number of replacements.

Example address sequence (the Tag, Index, and Comment columns are filled in on the following slides):

Address (hex)   Tag (hex)   Index & offset bits (binary)   Index (decimal)   Comment
0xFF0040E0
0xBEEF005C
0xFF0040E2
0xFF0040E8
0x00101078
0x002183E0
0x00101064
0x0012255C
0x00122544

# index bits = log2(# sets) = log2(8) = 3
# block offset bits = log2(block size) = log2(32 bytes) = 5
# tag bits = 32 bits - 3 bits - 5 bits = 24 bits
So the top 6 nibbles (24 bits) of the address form the tag, and the low 2 nibbles (8 bits) form the index and block offset fields.

Address (hex)   Tag (hex)   Index & offset bits (binary)   Index (decimal)   Comment
0xFF0040E0      0xFF0040    1110 0000                      7
0xBEEF005C      0xBEEF00    0101 1100                      2
0xFF0040E2      0xFF0040    1110 0010                      7
0xFF0040E8      0xFF0040    1110 1000                      7
0x00101078      0x001010    0111 1000                      3
0x002183E0      0x002183    1110 0000                      7
0x00101064      0x001010    0110 0100                      3
0x0012255C      0x001225    0101 1100                      2
0x00122544      0x001225    0100 0100                      2

Access 1: 0xFF0040E0 (tag FF0040, index 7). The entry at index 7 is empty, so there is no tag match: MISS. Get the block from memory (slow) and install it at index 7 with tag FF0040.

Access 2: 0xBEEF005C (tag BEEF00, index 2). No tag match at index 2: MISS. Get the block from memory and install it at index 2 with tag BEEF00.

Access 3: 0xFF0040E2 (tag FF0040, index 7). The tag at index 7 matches: HIT.

Access 4: 0xFF0040E8 (tag FF0040, index 7). The tag at index 7 matches again: HIT.

Access 5: 0x00101078 (tag 001010, index 3). No tag match at index 3: MISS. Get the block from memory and install it at index 3 with tag 001010.

Access 6: 0x002183E0 (tag 002183, index 7). The tag at index 7 is FF0040, not 002183: MISS & REPLACE. Get the block from memory and replace the block at index 7; its tag becomes 002183.

Access 7: 0x00101064 (tag 001010, index 3). The tag at index 3 matches: HIT.

Access 8: 0x0012255C (tag 001225, index 2). The tag at index 2 is BEEF00: MISS & REPLACE. Get the block from memory and replace the block at index 2; its tag becomes 001225.

Access 9: 0x00122544 (tag 001225, index 2). The tag at index 2 matches: HIT.

Address (hex)   Tag (hex)   Index & offset bits (binary)   Index (decimal)   Comment
0xFF0040E0      0xFF0040    1110 0000                      7                 Miss
0xBEEF005C      0xBEEF00    0101 1100                      2                 Miss
0xFF0040E2      0xFF0040    1110 0010                      7                 Hit
0xFF0040E8      0xFF0040    1110 1000                      7                 Hit
0x00101078      0x001010    0111 1000                      3                 Miss
0x002183E0      0x002183    1110 0000                      7                 Miss/Repl
0x00101064      0x001010    0110 0100                      3                 Hit
0x0012255C      0x001225    0101 1100                      2                 Miss/Repl
0x00122544      0x001225    0100 0100                      2                 Hit
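The whole trace can be replayed with a minimal direct-mapped cache model that tracks tags only, as the slides do (a Python sketch):

```python
OFFSET_BITS = 5            # log2(32-byte block)
INDEX_BITS = 3             # log2(8 sets)
NUM_SETS = 1 << INDEX_BITS

tags = [None] * NUM_SETS   # one tag per set; None means invalid

def access(addr):
    """Return 'Hit', 'Miss', or 'Miss/Repl' for one address."""
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    if tags[index] == tag:
        return "Hit"
    result = "Miss" if tags[index] is None else "Miss/Repl"
    tags[index] = tag      # fetch block from memory and install it
    return result

trace = [0xFF0040E0, 0xBEEF005C, 0xFF0040E2, 0xFF0040E8, 0x00101078,
         0x002183E0, 0x00101064, 0x0012255C, 0x00122544]
results = [access(a) for a in trace]
print(results)
# ['Miss', 'Miss', 'Hit', 'Hit', 'Miss', 'Miss/Repl', 'Hit', 'Miss/Repl', 'Hit']
```

This reproduces the table above: 4 hits and 2 replacements.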

Example: N-way set-associative cache
A processor accesses a 256-byte 2-way set-associative cache, which has a block size of 32 bytes, with the following sequence of addresses. Show the contents of the cache after each access, and count the number of hits and the number of replacements.

[Figure: 2-way set-associative cache lookup.] The index field selects a set. The tags of both blocks in the set are compared with the address tag in parallel ("Match?"), and the two match signals are ORed to form the hit signal. The matching way selects one of the two 32-byte blocks, and the block offset selects the requested bytes within it.

# index bits = log2(# sets) = log2(4) = 2
# block offset bits = log2(block size) = log2(32 bytes) = 5
# tag bits = total # address bits - # index bits - # block offset bits = 32 bits - 2 bits - 5 bits = 25 bits
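The same field-splitting sketch applies here, just with 2 index bits (Python):

```python
OFFSET_BITS = 5   # log2(32-byte block)
INDEX_BITS = 2    # log2(4 sets)

def split_address(addr):
    """Split a 32-bit address into (25-bit tag, set index, block offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

tag, index, offset = split_address(0xFF0040E0)
print(hex(tag), index, offset)  # 0x1fe0081 3 0
```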

Address (hex)   Tag (hex)   Index & offset bits (binary)   Index (decimal)   Comment
0xFF0040E0      0x1FE0081   110 0000                       3
0xBEEF005C      0x17DDE00   101 1100                       2
0x00101078      0x0002020   111 1000                       3
0xFF0040E2      0x1FE0081   110 0010                       3
0x002183E0      0x0004307   110 0000                       3
0x00101064      0x0002020   110 0100                       3

Access 1: 0xFF0040E0 (tag 1FE0081, set 3). Neither way of set 3 matches: MISS. Install the block in a free way of set 3 with tag 1FE0081. (DATA not shown for convenience.)

Access 2: 0xBEEF005C (tag 17DDE00, set 2). No match in set 2: MISS. Install the block in set 2 with tag 17DDE00.

Access 3: 0x00101078 (tag 0002020, set 3). No match: MISS. Install the block in the other way of set 3; set 3 now holds tags 1FE0081 and 0002020.

Access 4: 0xFF0040E2 (tag 1FE0081, set 3). One of the two tag comparators matches: HIT. Block 1FE0081 becomes most recently used, so 0002020 is now the LRU block of set 3.

Access 5: 0x00101064 (tag 0002020, set 3). HIT. Block 0002020 becomes most recently used, so 1FE0081 is now the LRU block of set 3.

Access 6: 0x002183E0 (tag 0004307, set 3). Neither tag matches: MISS & REPLACE the LRU block. Get the block from memory, evict 1FE0081, and install 0004307; set 3 now holds tags 0004307 and 0002020.

Access 7: 0x00101064 (tag 0002020, set 3). The tag matches: HIT.

Address (hex)   Tag (hex)   Index & offset bits (binary)   Index (decimal)   Comment
0xFF0040E0      0x1FE0081   110 0000                       3                 Miss
0xBEEF005C      0x17DDE00   101 1100                       2                 Miss
0x00101078      0x0002020   111 1000                       3                 Miss
0xFF0040E2      0x1FE0081   110 0010                       3                 Hit
0x002183E0      0x0004307   110 0000                       3                 Miss/Repl
0x00101064      0x0002020   110 0100                       3                 Hit
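The 2-way lookup with LRU replacement can also be modeled in a few lines (a Python sketch). Note the trace below includes 0x00101064 both before and after 0x002183E0, the access order implied by the per-access slides above, so the LRU victim works out to 1FE0081 as shown:

```python
from collections import OrderedDict

OFFSET_BITS = 5            # log2(32-byte block)
INDEX_BITS = 2             # log2(4 sets)
NUM_SETS = 1 << INDEX_BITS
ASSOC = 2                  # 2-way set-associative

# One OrderedDict of tags per set, least recently used first.
sets = [OrderedDict() for _ in range(NUM_SETS)]

def access(addr):
    """Return the hit/miss outcome and update LRU state."""
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    s = sets[index]
    if tag in s:
        s.move_to_end(tag)     # hit: tag becomes most recently used
        return "Hit"
    if len(s) == ASSOC:
        s.popitem(last=False)  # set full: evict the LRU tag
        result = "Miss/Repl"
    else:
        result = "Miss"
    s[tag] = True              # install the new block's tag as MRU
    return result

trace = [0xFF0040E0, 0xBEEF005C, 0x00101078, 0xFF0040E2,
         0x00101064, 0x002183E0, 0x00101064]
results = [access(a) for a in trace]
print(results)
# ['Miss', 'Miss', 'Miss', 'Hit', 'Hit', 'Miss/Repl', 'Hit']
print([hex(t) for t in sets[3]])  # set 3 ends holding tags 0x4307, 0x2020
```

Keeping each set's tags in an OrderedDict makes the LRU policy one line per case: a hit moves the tag to the MRU end, and a replacement pops from the LRU end.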