CSCE 212 Chapter 7 Memory Hierarchy Instructor: Jason D. Bakos.
Memory Hierarchy
  Programmers want more memory and faster memory
  Problems:
    – Denser memories require longer access times
      Example: papers on your desk vs. papers in your filing cabinet
    – Fast memories are extremely expensive per unit capacity
      Examples:
        – SRAM: 0.5 – 5 ns access time, $1K/GB
        – DRAM: 50 – 70 ns access time, $100/GB
        – Magnetic disk: 5 – 20 ms access time, $0.10/GB
Locality
  Goal:
    – Achieve the access time of smaller memories but have the effective capacity of larger memories
  Solution: exploit locality in memory accesses
    – Temporal locality: memory locations are accessed more than once
    – Spatial locality: when a memory location is accessed, there's a good chance a nearby location will be accessed in the near future
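The two kinds of locality can be counted on an address trace. This is a minimal sketch, not part of the original slides: the function name `locality_counts`, the 32-byte block size, and the example trace are all assumptions chosen for illustration.

```python
def locality_counts(addresses, block_bytes=32):
    """Count accesses that repeat an earlier address (temporal reuse) and
    accesses to a new address inside an already-touched block (spatial reuse).
    The 32-byte block size is an assumed value."""
    seen_addresses = set()
    seen_blocks = set()
    temporal = 0
    spatial = 0
    for addr in addresses:
        block = addr // block_bytes
        if addr in seen_addresses:
            temporal += 1
        elif block in seen_blocks:
            spatial += 1
        seen_addresses.add(addr)
        seen_blocks.add(block)
    return temporal, spatial


# Hypothetical trace: read a 16-word array (4-byte words at address 0) twice.
trace = [4 * i for i in range(16)] * 2
print(locality_counts(trace))  # (16, 14)
```

In this trace the first pass benefits only from spatial locality (14 of 16 accesses fall in a block that was already touched), while every access in the second pass is a temporal reuse.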
Memory Hierarchy
  Each level of the hierarchy stores a subset of the level below it
  Each level can only communicate with the level below it
  For now, assume a 2-level hierarchy
    – CPU, cache, RAM
    – cache is usually on-chip
  Sometimes the data we need is not in the cache
    – hit rate: fraction of accesses found in the cache
  Block or line: the unit of data moved between levels
    – exploits spatial locality
  Miss penalty
    – time required to move a line to the top of the hierarchy (may vary)
  [Figure: CPU, cache, main memory]
Caches
  Questions:
    1. How do we know if the requested location is in the cache?
    2. How do we find it?
Cache Organization
  Fully associative
    – Too many tags to compare!
  [Figure: n words, each with a tag; tag = address(31 downto (log2 n + 2))]
Direct Mapped Cache
  Direct mapped: each memory location maps to only one location in the cache
  [Figure: 8 lines (index 000 to 111) of 8 words each, with tags; tag = addr(31:8), index = addr(7:5)]
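A direct-mapped lookup with the field positions from this slide (index = addr(7:5), tag = addr(31:8)) can be sketched as follows. The class name `DirectMappedCache` and the example addresses are hypothetical, and the model tracks only tags, not data.

```python
class DirectMappedCache:
    """Tag-only model of a direct-mapped cache with 8 lines of 8 words (32 bytes)."""

    def __init__(self):
        self.tags = [None] * 8  # one tag per line; None means the line is empty
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        index = (addr >> 5) & 0x7  # addr(7:5) selects the line
        tag = addr >> 8            # addr(31:8) identifies the block
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag     # the only candidate line is overwritten
        self.misses += 1
        return False


cache = DirectMappedCache()
# 0x000 and 0x100 share index 0 but have different tags, so they evict each other.
print([cache.access(a) for a in (0x000, 0x004, 0x100, 0x000)])
# [False, True, False, False]
```

Answering the two questions from the Caches slide: the index says where to look, and the stored tag says whether what is there is the requested block.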
Addresses
  The memory address can be partitioned
  Example: 128 lines, 16-word lines
    – tag: the remaining upper bits
    – index: log2(lines) bits (which line in the cache?)
    – word offset: log2(line_size) bits (which word in the line?)
    – byte offset: 2 bits (which byte in the word?)
  Field | tag   | index | word offset | byte offset
  Bits  | 31:13 | 12:6  | 5:2         | 1:0
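A minimal sketch of this partitioning, assuming 32-bit byte addresses, 4-byte words, and power-of-two sizes. The function name `split_address` and its parameter names are hypothetical; the defaults follow the slide's example of 128 lines with 16 words per line, which gives 7 index bits and 4 word-offset bits.

```python
def split_address(addr, lines=128, words_per_line=16):
    """Split a byte address into (tag, index, word_offset, byte_offset).
    Assumes lines and words_per_line are powers of two and words are 4 bytes."""
    index_bits = lines.bit_length() - 1          # log2(lines)
    word_bits = words_per_line.bit_length() - 1  # log2(words_per_line)
    byte_offset = addr & 0x3
    word_offset = (addr >> 2) & (words_per_line - 1)
    index = (addr >> (2 + word_bits)) & (lines - 1)
    tag = addr >> (2 + word_bits + index_bits)
    return tag, index, word_offset, byte_offset


# Build an address from known fields, then recover them.
addr = (5 << 13) | (100 << 6) | (9 << 2) | 3
print(split_address(addr))  # (5, 100, 9, 3)
```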
The Three C's
  Three different kinds of misses:
    – Compulsory (cold-start) misses
      First access to a block
    – Capacity misses
      A replaced block is needed again because the cache capacity isn't sufficient for the program
    – Conflict (collision) misses
      Multiple blocks compete for the same set
Associativity
  2-way set associative:
    – Two choices where to store a given line
    – Replacement policy (e.g., LRU)
  [Figure: two ways (0 and 1), each with 8 lines (index 000 to 111) of 8 words and its own tags; tag = addr(31:8), index = addr(7:5)]
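The effect of a second way can be sketched with a tag-only model that uses the same field positions as the figure (index = addr(7:5), tag = addr(31:8)) and LRU replacement. The class name `TwoWaySetAssociativeCache` and the address trace are hypothetical.

```python
class TwoWaySetAssociativeCache:
    """Tag-only model: 8 sets, 2 ways per set, LRU replacement."""

    def __init__(self):
        # Each set lists its resident tags from least to most recently used.
        self.sets = [[] for _ in range(8)]

    def access(self, addr):
        index = (addr >> 5) & 0x7  # addr(7:5) selects the set
        tag = addr >> 8            # addr(31:8) is compared against both ways
        ways = self.sets[index]
        if tag in ways:
            ways.remove(tag)       # hit: move the tag to the most-recent position
            ways.append(tag)
            return True
        if len(ways) == 2:
            ways.pop(0)            # miss in a full set: evict the LRU tag
        ways.append(tag)
        return False


cache = TwoWaySetAssociativeCache()
trace = (0x000, 0x100, 0x000, 0x100, 0x200, 0x100)
print([cache.access(a) for a in trace])
# [False, False, True, True, False, True]
```

Addresses 0x000 and 0x100 map to the same set but no longer evict each other, which is the conflict-miss reduction associativity buys. The third block (0x200) replaces the least recently used of the two.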
Cache Behavior
  Hits at the top-level cache can usually be performed in one (or a few) clock cycles
  Misses stall the processor
  Writes can be handled using
    – Write-through (write allocate or write no-allocate)
      When cache data is changed, the lower-level memory is updated immediately
      Use a write buffer
    – Write-back
      When cache data is changed, the lower-level memory isn't updated until the cache line containing the changes is replaced
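The difference between the two write policies shows up in how many writes reach the lower level. The sketch below is a deliberately tiny model under stated assumptions: a cache holding a single line, write allocate on a store miss, and no write buffer. The function name `count_memory_writes` and the store trace are hypothetical.

```python
def count_memory_writes(store_lines, policy):
    """Count writes to the lower level for a one-line cache.
    store_lines: line number targeted by each CPU store.
    policy: "write-through" or "write-back"."""
    writes = 0
    cached_line = None
    dirty = False
    for line in store_lines:
        if line != cached_line:
            # Replacing the cached line: write-back flushes it only if it is dirty.
            if policy == "write-back" and dirty:
                writes += 1
            cached_line = line  # write allocate (assumed)
            dirty = False
        if policy == "write-through":
            writes += 1         # every store also updates the lower level
        else:
            dirty = True        # defer the update until the line is replaced
    return writes


stores = [0, 0, 0, 1, 1, 0]
print(count_memory_writes(stores, "write-through"))  # 6
print(count_memory_writes(stores, "write-back"))     # 2
```

In the write-back case the final dirty line is still in the cache when the trace ends, so its update has not yet reached memory.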
Memory Systems
  Main memory is DRAM, designed for density (not access time)
  How to reduce miss penalty?
Average Memory Access Time
  AMAT = hit_time + miss_rate * miss_penalty
  Reduce miss rate:
    – Larger cache (capacity misses)
    – Increase associativity (conflict misses)
    – Replacement policy
    – Each of these may increase hit time and miss penalty
  Reduce miss penalty:
    – Wider or banked memory bus
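The AMAT formula is easy to evaluate directly. The numbers below (1-cycle hit, 100-cycle miss penalty, miss rates of 5% and 2%) are assumed values for illustration, not measurements from the slides.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same unit as hit_time and miss_penalty."""
    return hit_time + miss_rate * miss_penalty


# Assumed example: 1-cycle hit, 100-cycle miss penalty.
baseline = amat(hit_time=1, miss_rate=0.05, miss_penalty=100)
improved = amat(hit_time=1, miss_rate=0.02, miss_penalty=100)
print(round(baseline, 6), round(improved, 6))  # 6.0 3.0
```

With these assumed numbers, cutting the miss rate from 5% to 2% halves the average access time, which is why the miss-rate techniques listed above can be worthwhile even if they lengthen the hit time slightly.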
Virtual Memory
  Main memory acts as a cache for secondary storage
    – Allows memory to be shared
    – Makes memory appear to be larger than it physically is
  Each program has its own address space
  Enforces protection
  A virtual memory block is called a page; a miss is called a page fault
  Virtual addresses are translated into physical addresses
    – Address mapping / address translation
    – Combination of hardware and software
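Address translation can be sketched as a table lookup on the virtual page number. This is a simplified model under assumptions not stated on the slide: 4 KiB pages and a page table represented as a Python dict from virtual page number to physical page number. The names `PAGE_SIZE`, `PageFault`, and `translate` are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size (4 KiB)


class PageFault(Exception):
    """Raised when the virtual page has no mapping in main memory."""


def translate(vaddr, page_table):
    """Map a virtual address to a physical address.
    page_table: dict of virtual page number -> physical page number."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise PageFault(vpn)
    # The page offset is unchanged; only the page number is translated.
    return page_table[vpn] * PAGE_SIZE + offset


page_table = {0: 5, 1: 2}
print(hex(translate(0x1234, page_table)))  # 0x2234
```

Because each process has its own table, the same virtual address can map to different physical pages in different processes, which is how separate address spaces and protection are obtained.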
Page Faults
  Main memory is about 100,000 times faster than disk
    – Page faults are expensive
  Reduce page fault rate
    – Fully associative placement of pages in memory
  Each process has a page table that maps virtual addresses to physical addresses
  OS creates space on disk for all the process's pages
    – Swap space
  OS maintains another table that keeps track of each page in main memory
    – During a page fault, the OS must decide which page to replace
    – Least recently used (LRU)
    – Write-back used for writes
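LRU replacement with write-back can be sketched as a model of the OS bookkeeping. This is an illustrative assumption, not the actual data structure any OS uses: the class name `PhysicalMemory` is hypothetical, and exact LRU is modeled with an ordered dictionary, whereas real systems typically approximate it.

```python
from collections import OrderedDict


class PhysicalMemory:
    """Tracks which virtual pages are resident, with LRU replacement and write-back."""

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()  # vpn -> dirty flag, least recently used first
        self.page_faults = 0
        self.disk_writes = 0

    def access(self, vpn, write=False):
        if vpn in self.resident:
            self.resident.move_to_end(vpn)  # mark as most recently used
        else:
            self.page_faults += 1
            if len(self.resident) == self.frames:
                _, dirty = self.resident.popitem(last=False)  # evict the LRU page
                if dirty:
                    self.disk_writes += 1  # write-back only if modified
            self.resident[vpn] = False
        if write:
            self.resident[vpn] = True


mem = PhysicalMemory(frames=2)
mem.access(1, write=True)
mem.access(2)
mem.access(1)
mem.access(3)  # evicts page 2 (clean)
mem.access(2)  # evicts page 1 (dirty)
print(mem.page_faults, mem.disk_writes)  # 4 1
```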
TLB
  Page lookups must be performed in hardware
    – Page table is cached on-chip
    – Translation-lookaside buffer (TLB)
    – Either small and fully associative, or larger with limited associativity
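A TLB can be sketched as a small fully associative cache of page table entries. The sketch below makes several assumptions that are not on the slide: the class name `TLB` is hypothetical, the page table is a dict, every looked-up page is resident, and the oldest entry is replaced on a miss (MIPS, for instance, selects the entry to replace with its Random register instead).

```python
class TLB:
    """Small fully associative cache of vpn -> ppn translations."""

    def __init__(self, page_table, entries=4):
        self.page_table = page_table  # backing page table (dict vpn -> ppn)
        self.entries = entries
        self.cache = {}               # dicts keep insertion order in Python 3.7+
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.cache:
            self.hits += 1
            return self.cache[vpn]
        self.misses += 1
        ppn = self.page_table[vpn]    # assumes the page is resident (no page fault)
        if len(self.cache) == self.entries:
            oldest = next(iter(self.cache))
            del self.cache[oldest]    # assumed policy: replace the oldest entry
        self.cache[vpn] = ppn
        return ppn


tlb = TLB({0: 7, 1: 3, 2: 9}, entries=2)
print([tlb.lookup(v) for v in (0, 0, 1, 2, 0)])  # [7, 7, 3, 9, 7]
print(tlb.hits, tlb.misses)                      # 1 4
```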
Integrating Cache and VM
  Data cannot be in the cache unless it is present in main memory
  Cache can be
    – physically addressed (TLB in critical path)
    – virtually addressed (TLB out of critical path)
  Cache miss requires TLB access
  TLB miss means:
    – page is in memory but we need the TLB entry, or
    – page is not in memory (page fault)
    – (both handled by OS software)
TLB Misses and Page Faults
  When a virtual address causes a page fault…
    1. Look up the page table entry and find the page's location on disk
    2. Choose a physical page to replace; write it back if dirty
    3. Read the page from disk into the chosen physical page (allow another process to run)
  TLB miss in MIPS
    – BadVAddr set, special exception triggered (0x8000 0000), go to TLB miss handler
    – Context register:
        bits 31:20: base of the page table
        bits 19:2: virtual address of the missing page
    – Use Context register directly to load the missing entry
        If the page table entry is invalid, a page fault exception occurs at the normal handler (0x8000 0180)
    – Move the missing entry to the EntryLo register
    – Execute tlbwr to move EntryLo to the TLB at the address stored in the Random register (free-running counter)
    – Execute eret to return
  TLB miss exception doesn't save process state (fast) while page fault does (slow)
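The reason the handler can use the Context register directly is that, with the layout given on this slide, its value is already the address of the missing page table entry. The sketch below follows the slide's field layout only (bits 31:20 page table base, bits 19:2 page number, bits 1:0 zero because each entry is one word); it is not taken from the MIPS manuals. The function name `context_register`, the 16 KiB page size (chosen so the page number fits the 18-bit field), and the example values are assumptions.

```python
def context_register(page_table_base, bad_vaddr, page_offset_bits=14):
    """Compose a Context value using the slide's layout.
    page_offset_bits=14 (16 KiB pages) is an assumed value chosen so that the
    virtual page number of a 32-bit address fits the 18-bit field."""
    vpn = (bad_vaddr >> page_offset_bits) & 0x3FFFF     # 18-bit field, bits 19:2
    return (page_table_base & 0xFFF00000) | (vpn << 2)  # bits 1:0 stay zero


# Hypothetical values: page table at 0x80100000, faulting address 0x0000C004.
print(hex(context_register(0x80100000, 0x0000C004)))  # 0x8010000c
```

The result is the base plus 4 times the page number, so a single load from that address fetches the entry to place in EntryLo.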