1 Caches. J. Nelson Amaral, University of Alberta.

2 Processor-Memory Performance Gap Bauer p. 47

3 Memory Hierarchy Bauer p. 48

4 Principle of Locality
– Temporal locality: what was used in the past is likely to be reused in the near future.
– Spatial locality: what is close to what is being used now is likely to also be used in the near future.
Bauer p. 48
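
A small illustration (not part of the slides) of both kinds of locality in ordinary code; the data and variable names here are hypothetical:

    # Summing an array touches consecutive elements (spatial locality)
    # and reuses the accumulator on every iteration (temporal locality).
    data = list(range(1_000_000))

    total = 0
    for x in data:      # consecutive elements share cache lines: spatial locality
        total += x      # 'total' is reused every iteration: temporal locality

    print(total)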

5 Hits and Misses
– Cache hit: the requested location is in the cache.
– Cache miss: the requested location is not in the cache.
Bauer p. 48

6 Cache Organizations
– When to bring the content of a memory location into the cache? On demand.
– Where to put it? Depends on the cache organization.
– How do we know it is there? Tag entries.
– What happens if the cache is full and we need to bring the content of a location into the cache? Use a replacement algorithm.
Bauer p. 49

7 Cache Organization Bauer p. 50

8 Mapping Bauer p. 51

9 Content-Addressable Memories (CAMs)
– Indexed by matching (part of) the content of the entries.
– All entries are searched in parallel.
– Drawbacks: expensive hardware, higher power consumption, difficult to modify.
Bauer p. 50

10 Cache Geometry
– C: number of cache lines
– m: number of banks in the cache (associativity)
– L: line size
– S: cache size (or capacity), S = C × L
– d: number of bits needed for displacement
– (S, L, m) gives the geometry of a cache.
Bauer p. 52

11 Hit and Miss Detection
Cache geometry: (S, L, m) = (32KB, 16B, 1). Memory reference: (t, i, d) = (tag, index, displacement) = (?, ?, ?).
– d = log2 L = log2 16 = 4
– C = S/L = 32KB/16B = 2048
– i = log2(C/m) = log2 2048 = 11
– t = 32 – i – d = 32 – 11 – 4 = 17
Bauer p. 52
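
A minimal sketch (not from the slides) that reproduces this computation; the function name is my own, and it assumes a 32-bit address and power-of-two parameters:

    import math

    def tid_bits(S, L, m, addr_bits=32):
        """Return (tag, index, displacement) bit widths for a cache geometry (S, L, m)."""
        C = S // L                     # number of cache lines
        d = int(math.log2(L))          # displacement bits within a line
        i = int(math.log2(C // m))     # index bits (C/m sets)
        t = addr_bits - i - d          # the remaining address bits form the tag
        return t, i, d

    print(tid_bits(32 * 1024, 16, 1))  # (17, 11, 4) for the (32KB, 16B, 1) geometry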

12 Hit and Miss Detection
What happens to t if we double the line size, i.e. (S, L, m) = (32KB, 32B, 1)?
– d = log2 32 = 5
– C = 32KB/32B = 1024
– i = log2 1024 = 10
– t = 32 – 10 – 5 = 17 (unchanged: the extra displacement bit is taken from the index)
Bauer p. 52

13 Hit and Miss Detection
What happens to t if we change to a 2-way set-associative cache, i.e. (S, L, m) = (32KB, 16B, 2)?
– d = 4 (unchanged), C = 2048 (unchanged)
– i = log2(C/m) = log2 1024 = 10
– t = 32 – 10 – 4 = 18 (one more tag bit than the direct-mapped case)
– Needs one more comparator and a multiplexor.
Bauer p. 52
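
Continuing the sketch above (same assumptions), the two variations on slides 12 and 13 can be checked the same way:

    print(tid_bits(32 * 1024, 32, 1))  # doubled line size: (17, 10, 5) - t is unchanged
    print(tid_bits(32 * 1024, 16, 2))  # 2-way associative: (18, 10, 4) - one extra tag bit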

14 Replacement Algorithm
– Direct mapped: there is only one location for a block; if the location is occupied, the block that is there is evicted.
– m-way set associative: if all m entries are valid, a victim must be selected.
– Low associativity: evict the least-recently-used (LRU) entry.
– High associativity: do not evict the (two) most-recently-used (MRU) entries.
Bauer p. 53
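
A minimal sketch (my own, not the slides' design) of LRU victim selection within one set of an m-way cache, keeping the resident tags ordered from least to most recently used:

    class CacheSet:
        """One set of an m-way set-associative cache with LRU replacement (sketch)."""
        def __init__(self, m):
            self.m = m
            self.tags = []                # ordered from LRU (front) to MRU (back)

        def access(self, tag):
            if tag in self.tags:          # hit: move the tag to the MRU position
                self.tags.remove(tag)
                self.tags.append(tag)
                return "hit"
            if len(self.tags) == self.m:  # miss with a full set: evict the LRU entry
                self.tags.pop(0)
            self.tags.append(tag)
            return "miss"

    s = CacheSet(m=2)
    print([s.access(t) for t in [1, 2, 1, 3, 2]])  # ['miss', 'miss', 'hit', 'miss', 'miss']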

15 Write Strategies (on a hit)
Write back
– Write only to the cache (memory becomes stale).
– Add a dirty bit to each cache line.
– Must write back to memory when the entry is evicted.
Write through
– Write to both cache and memory.
– No need for a dirty bit.
– Memory is consistent at all times.
Bauer p. 54

16 Write Strategies (on a miss)
Write allocate
– Read the line from memory, then write to the line to modify it.
Write around
– Write to the next level only.
Combinations that make sense:
– write back with write allocate
– write through with write around
Bauer p. 54
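
A sketch (assumptions of mine, not code from the slides) of how the two sensible combinations handle a write; the cache and memory are plain dictionaries, and eviction and line granularity are ignored for brevity:

    def write_back_allocate(cache, memory, addr, value):
        """Write-back with write-allocate: a miss loads the line; writes stay in the cache."""
        if addr not in cache:                    # write miss: allocate the line from memory
            cache[addr] = {"data": memory.get(addr), "dirty": False}
        cache[addr]["data"] = value
        cache[addr]["dirty"] = True              # memory is updated only on eviction

    def write_through_around(cache, memory, addr, value):
        """Write-through with write-around: memory is always updated; a miss bypasses the cache."""
        memory[addr] = value
        if addr in cache:                        # keep a cached copy consistent on a hit
            cache[addr] = {"data": value, "dirty": False}

    cache, memory = {}, {}
    write_through_around(cache, memory, 0x80, 9)   # miss: memory updated, cache untouched
    write_back_allocate(cache, memory, 0x80, 11)   # miss: line allocated and marked dirty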

17 Write Buffer
Diagram: Processor, Cache, Write Buffer, and Memory, with the read and write paths between them.
Bauer p. 54

18 The three C's
– Compulsory (cold) misses: the first time a memory block is referenced.
– Conflict misses: more than m blocks compete for the same cache entries in an m-way cache.
– Capacity misses: more than C blocks compete for space in a cache with C lines.
– Coherence misses: needed blocks are invalidated because of I/O or multiprocessor operations.
Bauer p. 54

19 Caches and I/O (read)
What happens to the cache when data need to move from disk to memory?
1. Invalidate the cached data using the valid bit.
Bauer p. 55

20 Caches and I/O (read)
2. Update the cache with the new data.
Bauer p. 55

21 Caches and I/O (write)
What happens to the cache when data need to move from memory to disk?
– Purge the dirty lines.
– Alternative: a hardware snoopy protocol.
Bauer p. 55

22 Cache Performance
Hit ratio; the case of two levels of cache (the formulas appear as images in the original slide).
Bauer p. 56
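
The formulas themselves are not in the transcript; a standard formulation, assumed here (h is the hit ratio, T_cache and T_memory the access times, with h1, h2, T_L1, T_L2 for two levels):

    hit ratio h = hits / (hits + misses)
    AMAT = h × T_cache + (1 – h) × T_memory
    For two levels: AMAT = h1 × T_L1 + (1 – h1) × (h2 × T_L2 + (1 – h2) × T_memory)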

23 Cache Performance
Goal: reduce AMAT.
Strategies: 1. increase the hit ratio (h); 2. reduce T_cache.
Parameters: 1. cache capacity; 2. cache associativity; 3. cache line size.
Bauer p. 56

24 Influence of Capacity on Miss Rate
Cache is (S, 2, 64); application: 176.gcc.
Bauer p. 57

25 Associativity vs. Miss Rate
Cache is (32KB, m, 64); application: 176.gcc.

26 Line Size vs. Miss Rate
Cache is (16KB, 1, L).

27 Memory Access Time

28 AMAT Example
We will study two alternative configurations, C_A and C_B, for a single level of cache. What is the AMAT in each case?
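
The parameters of C_A and C_B appear as images in the original slide and are not in the transcript. Purely as a hypothetical illustration of the calculation, with made-up numbers:

    def amat(hit_ratio, t_cache, t_memory):
        """Average memory access time for a single level of cache (standard formulation)."""
        return hit_ratio * t_cache + (1 - hit_ratio) * t_memory

    # Hypothetical configurations (NOT the slide's C_A and C_B; the numbers are invented):
    print(amat(0.95, 1, 100))   # small, fast cache:    0.95*1 + 0.05*100 = 5.95 cycles
    print(amat(0.97, 2, 100))   # larger, slower cache: 0.97*2 + 0.03*100 = 4.94 cycles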

