1 Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers. Norman P. Jouppi. Presenter: Shrinivas Narayani

2 Contents
 Cache basics
 Types of cache misses
 Cost of cache misses
 How to reduce cache misses
– Larger block size
– Adding associativity (reducing conflict misses): the miss cache; the victim cache, an improvement over the miss cache
 Removing capacity and compulsory misses: prefetch techniques, stream buffers
 Conclusion

3 Mapping
(Block address) modulo (number of cache blocks in the cache)
The cache is indexed using the low-order bits of the address; e.g., memory address 00001 maps to cache location 001, and 11101 maps to location 101. Within a cache location, the stored data is identified using the tag (the high-order bits of the address).

4 Direct Mapped Cache
[Figure: an eight-block direct-mapped cache with indices 000–111; memory addresses 00001, 00101, 01001, 01101, and 10001 map to the cache index given by their low-order three bits (001, 101, 001, 101, 001).]
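To make the mapping concrete, here is a minimal sketch of the index/tag split for the eight-block cache above; the function name and the block-address model are illustrative, not from the slides.

```python
def split_address(block_addr, num_blocks=8):
    """Direct-mapped placement: (block address) modulo (number of cache blocks)."""
    index = block_addr % num_blocks    # low-order bits select the cache line
    tag = block_addr // num_blocks     # remaining high-order bits are stored as the tag
    return index, tag

for addr in (0b00001, 0b00101, 0b01001, 0b01101, 0b10001):
    index, tag = split_address(addr)
    print(f"address {addr:05b} -> index {index:03b}, tag {tag:02b}")
```

Addresses whose low-order bits agree (00001, 01001, 10001 all select index 001) collide on the same line, which is exactly how conflict misses arise.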

5 Cache Terminology
 Cache hit
 Cache miss
 Miss penalty: the time to replace a block in the upper level with the corresponding block from the lower level.

6 In a direct-mapped cache, there is only one place a newly requested item can go, and hence only one choice of what to replace.

7 Types of Misses
– Compulsory: the first access to a block cannot be in the cache, so the block must be brought into the cache. These are also called cold-start misses or first-reference misses. (Misses even in an infinite cache.)
– Capacity: if the cache cannot contain all the blocks needed during execution of a program, capacity misses occur as blocks are discarded and later retrieved. (Misses in a fully associative cache of the same size.)
– Conflict: if the block-placement strategy is set-associative or direct-mapped, conflict misses (in addition to compulsory and capacity misses) occur because a block can be discarded and later retrieved when too many blocks map to its set. These are also called collision misses or interference misses. (Misses beyond those of an N-way associative cache.)
– Coherence: misses resulting from invalidations to preserve multiprocessor cache consistency.

8 Conflict misses account for between 20% and 40% of all direct-mapped cache misses.

9 Cost of Cache Misses
Cycle time has been decreasing much faster than memory access time, and the average number of machine cycles per instruction has also been decreasing dramatically. Both effects multiply the relative cost of a cache miss. E.g., a cache miss on the VAX 11/780 cost only 60% of the average instruction execution time, so even if every instruction had a cache miss, machine performance would degrade by only 60%.
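As a sanity check on that figure, here is a back-of-envelope calculation; the cycle counts below are illustrative assumptions chosen to match the 60% ratio, not measurements from the paper.

```python
# Illustrative assumption: an average instruction takes 10 cycles and a miss
# costs 6 extra cycles, i.e. 60% of an average instruction (the VAX-era ratio).
avg_instr_cycles = 10.0
miss_penalty_cycles = 6.0

for misses_per_instr in (0.0, 0.1, 1.0):
    slowdown = 1 + misses_per_instr * miss_penalty_cycles / avg_instr_cycles
    print(f"{misses_per_instr:.1f} misses/instr -> execution time x{slowdown:.2f}")
```

With one miss per instruction the slowdown is 1.6x, matching the slide's 60% figure; as cycle times shrink relative to memory latency, the miss penalty in cycles grows and the same miss rate hurts far more.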

10 How to Reduce Cache Misses
 Increase block size
 Increase associativity
 Use a victim cache
 Use a pseudo-associative cache
 Hardware prefetching
 Compiler-controlled prefetching
 Compiler optimizations

11 Increasing Block Size
One way to reduce the miss rate is to increase the block size:
– Reduces compulsory misses. Why? Larger blocks take advantage of spatial locality.
However, larger blocks have disadvantages:
– May increase the miss penalty (more data to fetch on each miss)
– May increase the hit time (more data to read from the cache, and a larger mux)
– May increase conflict and capacity misses (fewer distinct blocks fit in the same capacity)

12 Adding Associativity: The Miss Cache
A miss cache is a small fully-associative cache placed between the direct-mapped cache and the next lower level of the memory hierarchy. Each entry consists of a tag (with its own comparator) and one cache line of data, ordered from the MRU entry to the LRU entry. On every access, both the upper (direct-mapped) cache and the miss cache are probed; when a miss occurs, the returned data is placed in both the direct-mapped cache and the miss cache.
[Figure: fully-associative miss cache sitting between the processor (from/to processor) and the next lower cache.]
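A minimal sketch of the miss-cache lookup path, using a tag-only model (no data, so we can just classify hit types); the class and parameter names are assumptions for illustration.

```python
from collections import OrderedDict

class MissCache:
    """Small fully-associative LRU cache probed in parallel with L1 (a sketch)."""
    def __init__(self, num_entries=4):
        self.num_entries = num_entries
        self.tags = OrderedDict()            # tag -> None, MRU entry last

    def probe(self, tag):
        if tag in self.tags:
            self.tags.move_to_end(tag)       # refresh MRU position
            return True
        return False

    def insert(self, tag):
        if tag not in self.tags and len(self.tags) >= self.num_entries:
            self.tags.popitem(last=False)    # evict the LRU entry
        self.tags[tag] = None
        self.tags.move_to_end(tag)

def access_with_miss_cache(dm_cache, mc, index, tag):
    """Tag-only model: both the DM cache and the miss cache are probed."""
    if dm_cache.get(index) == tag:
        return "hit"
    if mc.probe(tag):
        dm_cache[index] = tag                # short one-cycle on-chip refill
        return "miss-cache hit"
    dm_cache[index] = tag                    # long off-chip miss: fill both
    mc.insert(tag)
    return "miss"
```

Note that every line inserted into the miss cache is simultaneously filled into the direct-mapped cache, which is the duplication the next slide complains about.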

13 Performance of the Miss Cache
Replaces a long off-chip miss penalty with a short one-cycle on-chip miss. More effective at removing data conflict misses than instruction conflict misses.

14 Disadvantage of the Miss Cache
 Storage space is wasted in the miss cache due to duplication of data: every line held in the miss cache is also present in the direct-mapped cache.

15 Victim Cache
An improvement over the miss cache: it loads the victim line (the line just evicted from the direct-mapped cache) instead of the requested line, so no storage is duplicated. When a miss in the direct-mapped cache hits in the victim cache, the contents of the direct-mapped cache line and the matching victim-cache entry are swapped.
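A corresponding sketch for the victim cache, again tag-only and with illustrative names; the key differences are the swap on a victim-cache hit and that the evicted victim, not the requested line, is what gets saved.

```python
from collections import OrderedDict

class VictimCache:
    """Small fully-associative LRU buffer holding evicted (victim) tags."""
    def __init__(self, num_entries=4):
        self.num_entries = num_entries
        self.tags = OrderedDict()            # tag -> None, MRU entry last

    def hit_and_remove(self, tag):
        if tag in self.tags:
            del self.tags[tag]               # line moves back into the DM cache
            return True
        return False

    def insert(self, tag):
        if len(self.tags) >= self.num_entries:
            self.tags.popitem(last=False)    # evict the LRU victim
        self.tags[tag] = None

def access_with_victim_cache(dm_cache, vc, index, tag):
    """Tag-only model of a direct-mapped cache backed by a victim cache."""
    if dm_cache.get(index) == tag:
        return "hit"
    victim = dm_cache.get(index)             # occupant displaced by the fill
    dm_cache[index] = tag
    hit = vc.hit_and_remove(tag)             # swap on a victim-cache hit
    if victim is not None:
        vc.insert(victim)                    # save the victim, not the target
    return "victim-cache hit" if hit else "miss"

dm, vc = {}, VictimCache(num_entries=1)
for tag in ["A", "B"] * 3:                   # two lines ping-ponging on one set
    print(tag, access_with_victim_cache(dm, vc, index=3, tag=tag))
```

In the ping-pong pattern above, every conflict miss after the first two becomes a one-cycle victim-cache hit, even with a single-entry victim cache.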

16 The Effect of DM Cache Size on Victim Cache Performance
As the direct-mapped cache size increases, the likelihood that a given conflict miss can be removed by the victim cache decreases.

17 Reducing Capacity and Compulsory Misses
Use prefetch techniques:
1. Prefetch always
2. Prefetch on miss
3. Tagged prefetch

18 Prefetch always: prefetch the next line after every reference. Prefetch on miss: fetch the next line whenever a reference misses. Tagged prefetch: each block has a tag bit associated with it; when a block is prefetched its tag bit is set to zero, and when the block is first used the bit is set to one. On this zero-to-one transition, a prefetch of the next block is initiated.
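A toy model of tagged prefetch, to show how the zero-to-one tag transition keeps a sequential stream prefetched; the unbounded capacity and one-line lookahead are simplifying assumptions of this sketch.

```python
class TaggedPrefetchCache:
    """Toy tagged-prefetch model: each resident line carries a tag bit that
    is 0 when the line was prefetched and flips to 1 on first use."""
    def __init__(self):
        self.lines = {}                      # line address -> tag bit

    def access(self, line):
        if line not in self.lines:
            self.lines[line] = 1             # demand fetch; mark as used
            self._prefetch(line + 1)
            return "miss"
        if self.lines[line] == 0:            # first use of a prefetched line:
            self.lines[line] = 1             # the 0 -> 1 transition triggers
            self._prefetch(line + 1)         # a prefetch of the next block
        return "hit"

    def _prefetch(self, line):
        if line not in self.lines:
            self.lines[line] = 0             # prefetched but not yet used

c = TaggedPrefetchCache()
for line in range(4):                        # a sequential walk: one cold miss,
    print(line, c.access(line))              # then the prefetcher stays ahead
```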

19 Stream Buffers
Start the prefetch before a tag transition can take place, i.e., as soon as a miss occurs.

20 A stream buffer consists of a series of entries, each consisting of a tag, an available bit, and a data line. On a miss, it begins fetching successive lines starting at the miss target. Lines after the requested line are held in the buffer rather than the cache, which avoids polluting the cache with data that may never be needed.
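A sketch of a single stream buffer as a FIFO of prefetched line addresses; the depth and method names are assumptions, and the available bit is omitted by pretending prefetches complete instantly.

```python
from collections import deque

class StreamBuffer:
    """FIFO of sequential line addresses prefetched from a miss target (a sketch)."""
    def __init__(self, depth=4):
        self.depth = depth
        self.fifo = deque()                  # prefetched line addresses, head first
        self.next_line = None                # next address to prefetch

    def restart(self, miss_line):
        # On a miss that also misses the buffer: flush and refill,
        # starting at the line after the miss target.
        self.next_line = miss_line + 1
        self.fifo.clear()
        self._refill()

    def pop_if_head(self, line):
        # Only the head of the FIFO can hit; a head hit moves the line
        # into the cache and prefetches one more line behind it.
        if self.fifo and self.fifo[0] == line:
            self.fifo.popleft()
            self._refill()
            return True
        return False

    def _refill(self):
        while len(self.fifo) < self.depth:
            self.fifo.append(self.next_line)
            self.next_line += 1

sb = StreamBuffer(depth=4)
sb.restart(100)                              # miss at line 100 starts the stream
for line in range(101, 106):                 # sequential walk: all buffer hits
    print(line, "buffer hit" if sb.pop_if_head(line) else "buffer miss")
```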

21 Multi-Way Stream Buffers
▪ A single data stream buffer removes only about 25% of data-cache misses
▪ Data references interleave streams from several different sources
▪ Solution: four stream buffers in parallel
▪ The instruction stream buffer is unchanged
▪ Roughly twice the performance of a single data stream buffer
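A sketch of the multi-way arrangement, reusing the StreamBuffer class from the previous sketch: all buffers are probed in parallel, and a miss in all of them restarts the least-recently-hit buffer (the LRU-restart policy follows the paper's description; the class name is illustrative).

```python
class MultiWayStreamBuffers:
    """Several stream buffers probed in parallel; a miss in all of them
    clears and restarts the least-recently-hit buffer."""
    def __init__(self, ways=4, depth=4):
        self.buffers = [StreamBuffer(depth) for _ in range(ways)]
        self.lru = list(range(ways))         # buffer indices, LRU first

    def lookup(self, line):
        for i, buf in enumerate(self.buffers):
            if buf.pop_if_head(line):
                self.lru.remove(i)           # mark buffer i most recently hit
                self.lru.append(i)
                return True
        victim = self.lru.pop(0)             # restart the LRU buffer at the
        self.buffers[victim].restart(line)   # new miss target
        self.lru.append(victim)
        return False
```

Because each interleaved stream settles into its own buffer, alternating accesses to several sequential regions no longer thrash a single buffer.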

22 Stream Buffers vs. Tagged Prefetch
Feasible to implement, with lower latency. The extra hardware required by stream buffers is comparable to the additional tag storage required by tagged prefetch.

23 Stream Buffer Performance vs. Cache Size
Only the data stream buffer's performance improves as cache size increases, because a larger cache can contain data for reference patterns that access several sets of data.


25 Conclusion
The miss cache is beneficial in removing data-cache conflict misses. The victim cache is an improvement over the miss cache that saves the victim of the cache miss instead of the target. Stream buffers reduce capacity and compulsory misses. Multi-way stream buffers are a set of stream buffers that can prefetch down several streams concurrently.

26 References
Norman P. Jouppi, "Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers."
David A. Patterson and John L. Hennessy, Computer Organization and Design.

