1
CS 300 – Lecture 20: Intro to Computer Architecture / Assembly Language – Caches
2
Announcements
* Tests not graded yet – will have them Thursday.
* Next week – let's skip class.
* Next homework – will be in the wiki later today.
* I'll have my code (the assembler) ready later this week.
* Quicksort – I'm hoping for a final effort. Email me your code and I'll take a look tomorrow.
3
The Memory Hierarchy
This is an essential part of the course… How do we manage storage? Where is the storage?
* Network (the entire WWW!)
* Disk (may cache network)
* Memory (may cache disk)
* On-chip / off-chip caches
* Registers
Each is smaller / faster than the one above.
4
Prediction
Storage strategies often need to predict what will happen next:
* If a program opens a file, it usually needs to read it starting at the beginning.
* If a program reads a memory address, it will likely use adjacent addresses too (see the sketch below).
* If a program writes a memory address, it will usually read the value again soon.
Hardware / software often has to take a guess at what will happen next.
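To make the second prediction concrete, here is a minimal C sketch (the array name and sizes are illustrative, not from the lecture). Both loops read the same array, but the row-major loop touches adjacent addresses and benefits from exactly the locality the hardware is betting on:

#include <stddef.h>

#define N 1024

static int a[N][N];

long sum_row_major(void)     /* adjacent addresses: cache friendly */
{
    long sum = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}

long sum_column_major(void)  /* strides of N words: frequent misses */
{
    long sum = 0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            sum += a[i][j];
    return sum;
}

Both functions compute the same sum; on typical hardware the row-major version runs noticeably faster purely because of cache behavior.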
5
Locations
We address data using locations (pointers). These may refer to a URL, a disk block, a memory word, or a register number. Locations have to work "correctly" when a value is shared in any way: the storage device must ensure that readers see what the most recent writer has written.
6
Cache A cache replicates a portion of a larger memory. It makes use of the fact that most references are to a relatively small number of locations.
7
Cache Design Issues
* Replacement strategy (LRU, FIFO)
* Address set (fully associative, partially associative, direct-mapped)
* Write policy (write-through, write-back at replacement)
* Coherence
* Ports / sharing
These issues are mostly the same whether the cache is "soft" or "hard".
8
The "Shape" of a Cache A cache (obviously) has a size. But it also has a "shape" – a cache cannot generally hold an arbitrary set of data. This shape describes the set of addresses that could be in the cache.
9
Example: Fully Associative
A fully associative cache can hold any set of addresses up to the cache size. If we have a cache of 8 words, then any set of 8 addresses can be placed in the cache. Sounds good! What are the drawbacks?
10
Example: Direct Mapping
Suppose we cache 32-bit words by using the low bits of a memory address to determine the cache location. What sorts of address sets are allowed? What is the advantage of this design?
[Slide diagram: a 32-bit address split into fields; the low-order bits determine the cache location.]
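Here is a minimal C sketch of that address split (the field widths are illustrative – say a direct-mapped cache of 16 four-byte words – since the slide's exact figures aren't recoverable):

#include <stdint.h>

#define OFFSET_BITS 2   /* byte within a 4-byte word */
#define INDEX_BITS  4   /* 2^4 = 16 cache slots      */

uint32_t cache_index(uint32_t addr)   /* low bits pick the slot */
{
    return (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
}

uint32_t cache_tag(uint32_t addr)     /* remaining bits are checked on each access */
{
    return addr >> (OFFSET_BITS + INDEX_BITS);
}

Two addresses that agree in their index bits compete for the same slot – exactly the conflict problem the next slide addresses.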
11
Partially Associative Caching
In the real world, we need to choose a design point somewhere between direct mapping (simplest hardware, severe conflict issues) and fully associative (slow, complex). This is generally done by replicating a small associative cache at each slot of a directly mapped cache, as sketched below.
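A minimal C sketch of that organization (the sizes are illustrative, not from the lecture): the low address bits pick a set, exactly as in direct mapping, and the hardware then checks the few "ways" in that set associatively:

#include <stdint.h>
#include <stdbool.h>

#define SETS 16   /* chosen by the low address bits   */
#define WAYS 2    /* small associative group per slot */

typedef struct {
    uint32_t tag;
    bool     valid;
} way;

static way cache[SETS][WAYS];

bool lookup(uint32_t set, uint32_t tag)
{
    for (int w = 0; w < WAYS; w++)   /* hardware compares all ways in parallel */
        if (cache[set][w].valid && cache[set][w].tag == tag)
            return true;
    return false;
}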
12
Replacement Strategy
When a new word arrives in the cache, we (usually) have to throw an old one out. Operating systems have this same problem in managing memory – when something new pages in, what pages out? Strategies (sketched below):
* FIFO (easy to implement, poor prediction)
* LRU (harder to implement, much better prediction)
Random replacement is better than FIFO in many cases!
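Here is a minimal C sketch of LRU for a tiny fully associative cache (the 4-entry size and all names are illustrative, not from the lecture). FIFO would differ only in recording when an entry was loaded rather than when it was last used:

#include <stdint.h>
#include <stdbool.h>

#define ENTRIES 4

static uint32_t tags[ENTRIES];
static uint64_t last_used[ENTRIES];
static bool     valid[ENTRIES];
static uint64_t now;

bool cache_access(uint32_t tag)
{
    int victim = 0;
    now++;
    for (int i = 0; i < ENTRIES; i++) {
        if (valid[i] && tags[i] == tag) {  /* hit: refresh its age */
            last_used[i] = now;
            return true;
        }
        if (!valid[i] || last_used[i] < last_used[victim])
            victim = i;                    /* track oldest (or empty) slot */
    }
    tags[victim] = tag;                    /* miss: evict the LRU entry */
    last_used[victim] = now;
    valid[victim] = true;
    return false;
}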
13
Understanding Algorithm Performance
In data structures, we compare algorithms by looking at asymptotic complexity. In the real world, constants matter! But the constants often depend much more on memory access patterns than on the number of instructions executed. Why is Quicksort so much better than Heapsort?
14
Stacking Up Caches
We don't use just one cache between main memory and the CPU. Generally, there are two caches between the CPU and memory: on-chip (built into the CPU) and off-chip (between the CPU and the memory bus). This leads to even hairier equations for performance! The difference in access time between the primary and secondary cache is pretty huge!
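One informal way to see why the equations get hairier (this anticipates the one-level formula on the "Understanding Cache Performance" slide): the miss path of the first cache becomes the access time of the second, so the formula nests – overall time ≈ t(L1) + miss(L1) × [t(L2) + miss(L2) × t(memory)].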
15
Explicit vs Implicit Caching
The "RISC vs CISC" debate is essentially about caching. On MIPS and other RISC processors, the user has so many registers that they effectively become an explicit cache: any variable brought into register space is explicitly cached. On the Pentium, there are so few registers that the "real" cache has all of the responsibility for memory performance. Which design is better?
16
Understanding Cache Performance
We evaluate a cache by looking at its access time and its miss rate.
Overall access time = hit rate × cache speed + (1 − hit rate) × memory speed
Another issue is pipelining – whether the application has work to do during a cache miss. Overall performance measurements are hard!
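A worked example with illustrative numbers (not from the lecture): with a 95% hit rate, a 1 ns cache, and a 100 ns memory, overall access time = 0.95 × 1 ns + 0.05 × 100 ns = 5.95 ns. Even a 5% miss rate dominates the average, which is why miss rate gets so much design attention.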
17
Dual Caches
One possible cache design is to separate instruction caching from data caching. There are major differences in the access patterns for instructions and data:
* No writes to instructions (simplifies cache design).
* Instructions are more sequential – pre-loading is a big issue, and a less associative design is possible.
* A data cache has to worry about regular access patterns (much array code).
18
Cache Coherence
This is a problem when more than one party is using a cache. If two processors use a common memory, their on-chip caches can lose coherence. How to deal with this?
* Write-through (the cache is never out of sync with memory) instead of write-back (avoid writing dirty cache words until replacement) – both policies are sketched below.
* Invalidation signals: when processor A writes into memory, it must invalidate the corresponding word in processor B's cache.
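A minimal C sketch of the two write policies (all names are illustrative, not from the lecture):

#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t tag;
    uint32_t data;
    bool     valid;
    bool     dirty;   /* only meaningful for write-back */
} cache_line;

void store_write_through(cache_line *line, uint32_t *mem_word, uint32_t v)
{
    line->data = v;
    *mem_word  = v;            /* memory never goes stale */
}

void store_write_back(cache_line *line, uint32_t v)
{
    line->data  = v;
    line->dirty = true;        /* memory is updated later, at eviction */
}

Write-back is cheaper per store but lets the cache and memory disagree, which is exactly what makes coherence hard.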
19
Virtual Memory
Virtual memory is another level of caching that sits between the disk and main memory. While the cache is exclusively a hardware issue, VM is handled in both hardware and the operating system. That is, many of the things that are done in hardware in the cache can be addressed in software in the VM context.
20
Virtual Memory Issues
What problems is virtual memory hardware trying to solve?