Cache memory


Cache memory

Cache memory Overview [diagram: individual words are transferred between the CPU and the cache, while blocks of words are transferred between the cache and main memory]

Cache memory Organization [diagram: main memory is organized into blocks of K words each; the cache consists of C lines, numbered 0 to C−1, each holding a tag and a block of K words]

Elements of cache memory design Cache size: small enough that the overall average cost per bit is close to that of main memory; large enough that the overall average access time is close to that of the cache alone.

Elements of cache memory design Cache size – hit ratio. Average access time: T_s = H·T_1 + (1 − H)·(T_1 + T_2), where T_1 is the cache access time, T_2 is the main memory access time, and the hit ratio H = N_1 / (N_1 + N_2), with N_1 the number of references satisfied by the cache and N_2 the number that must go to main memory.
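As a worked illustration (the numbers here are assumed, not from the slides): with T_1 = 1 ns, T_2 = 100 ns, and H = 0.95, T_s = 0.95·1 + 0.05·(1 + 100) = 6.0 ns. Even a 5% miss rate dominates the average access time, which is why hit ratio matters so much.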

Elements of cache memory design Cache size – hit ratio [diagram: a two-level memory hierarchy, with the CPU accessing memory M1 of size S1 (the cache), backed by memory M2 of size S2 (main memory)]

Elements of cache memory design Mapping function: direct mapping, associative mapping, set-associative mapping.

Elements of cache memory design Mapping function – direct mapping [diagram: each main memory block address maps to exactly one cache line, so many blocks compete for the same line]

Elements of cache memory design Mapping function – direct mapping. The direct mapping technique is simple and inexpensive to implement. Its main disadvantage is that there is a fixed cache location for any given block. If a program happens to repeatedly reference words from two different blocks that map into the same line, the blocks will be continually swapped in the cache and the hit ratio will be low (thrashing).
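A minimal sketch of how a direct-mapped cache decomposes an address. The field widths and parameters here are illustrative assumptions, not taken from the slides:

#include <stdint.h>
#include <stdio.h>

/* Assumed parameters: 64-byte blocks, 1024 cache lines. */
#define BLOCK_BITS 6                  /* offset within a block          */
#define LINE_BITS  10                 /* 2^10 = 1024 lines              */
#define NUM_LINES  (1u << LINE_BITS)

/* Direct mapping: line = block number mod number of lines;
 * the remaining high-order bits form the tag.                */
static void decompose(uint32_t addr,
                      uint32_t *tag, uint32_t *line, uint32_t *offset)
{
    *offset = addr & ((1u << BLOCK_BITS) - 1);
    *line   = (addr >> BLOCK_BITS) & (NUM_LINES - 1);
    *tag    = addr >> (BLOCK_BITS + LINE_BITS);
}

int main(void)
{
    uint32_t tag, line, offset;
    decompose(0x12345678u, &tag, &line, &offset);
    printf("tag=0x%x line=%u offset=%u\n", tag, line, offset);
    return 0;
}

Two blocks whose addresses differ only in the tag field land on the same line, which is exactly the swapping scenario described above.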

Elements of cache memory design Mapping function – Associative mapping. Associative mapping overcomes the disadvantage of direct mapping by permitting each main memory block to be loaded into any line of the cache. Its main disadvantage is the complex circuitry required to examine the tags of all cache lines in parallel.

Elements of cache memory design Mapping function – Set-associative mapping A compromise that exhibits the strengths of both the direct and associative approaches without their disadvantages.

Elements of cache memory design Mapping function – Set-associative mapping
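A sketch of a set-associative lookup, assuming hypothetical parameters (4 ways, 256 sets) and a simple array-of-structs layout; this illustrates the idea in software, not any particular hardware design:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK_BITS 6
#define SET_BITS   8                 /* 2^8 = 256 sets          */
#define NUM_SETS   (1u << SET_BITS)
#define NUM_WAYS   4                 /* 4-way set-associative   */

struct line {
    bool     valid;
    uint32_t tag;
    /* block data omitted for brevity */
};

static struct line cache[NUM_SETS][NUM_WAYS];

/* The set index selects one set; only the NUM_WAYS tags in that
 * set are compared (in hardware, those comparisons run in parallel). */
static bool lookup(uint32_t addr)
{
    uint32_t set = (addr >> BLOCK_BITS) & (NUM_SETS - 1);
    uint32_t tag = addr >> (BLOCK_BITS + SET_BITS);
    for (int way = 0; way < NUM_WAYS; way++)
        if (cache[set][way].valid && cache[set][way].tag == tag)
            return true;             /* hit  */
    return false;                    /* miss */
}

int main(void)
{
    printf("hit=%d\n", lookup(0x12345678u));  /* miss: cache starts empty */
    return 0;
}

Note how this unifies the earlier schemes: NUM_WAYS = 1 gives direct mapping, and NUM_SETS = 1 gives a fully associative cache.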

Elements of cache memory design Write policy. Write through: all write operations are made to main memory as well as to the cache, ensuring that main memory is always valid. Write back: updates are made only in the cache, and a dirty (update) bit is associated with each slot. When a block is replaced, it is written back to main memory only if its dirty bit is set.
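A minimal sketch contrasting the two policies. The structures and the backing-store functions (memory_write, memory_write_block) are hypothetical, introduced only for illustration:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical backing-store interface, assumed for this sketch. */
void memory_write(uint32_t addr, uint8_t byte);
void memory_write_block(uint32_t block_addr, const uint8_t *data, int len);

struct cache_line {
    bool     valid;
    bool     dirty;                  /* used by write-back only */
    uint32_t tag;
    uint8_t  data[64];
};

/* Write through: update the cache line and main memory together,
 * so main memory is always consistent with the cache. */
void write_through(struct cache_line *ln, uint32_t addr, uint8_t byte)
{
    ln->data[addr & 63] = byte;
    memory_write(addr, byte);
}

/* Write back: update only the cache line and mark it dirty;
 * main memory is updated later, on eviction. */
void write_back(struct cache_line *ln, uint32_t addr, uint8_t byte)
{
    ln->data[addr & 63] = byte;
    ln->dirty = true;
}

/* On replacement, a dirty line must be flushed to memory first. */
void evict(struct cache_line *ln, uint32_t block_addr)
{
    if (ln->dirty)
        memory_write_block(block_addr, ln->data, 64);
    ln->valid = false;
    ln->dirty = false;
}

Write through keeps memory valid at the cost of extra memory traffic; write back reduces traffic but leaves memory stale until eviction, which matters for I/O and multiprocessor coherency.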

Elements of cache memory design Block size

As block size increases, the hit ratio rises at first because of the principle of locality of reference. Beyond some point the hit ratio falls, because the probability of using the newly fetched information becomes lower than the probability of reusing the information that had to be replaced to make room for it. A block size of 4 to 8 addressable units seems reasonably close to optimum.

Elements of cache memory design Number of caches. Several studies have shown that, in general, the use of a second level of cache does improve performance. Inclusion policy: inclusive multilevel cache (the contents of L1 are also held in L2) vs. exclusive multilevel cache (a block resides in at most one level).
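As an illustration (with assumed numbers, not from the slides), the single-level access-time formula above extends naturally to two cache levels: T_s = T_L1 + m_L1·(T_L2 + m_L2·T_mem), where m_L1 and m_L2 are the per-level miss rates. With T_L1 = 1 ns, T_L2 = 10 ns, T_mem = 100 ns, m_L1 = 0.05 and m_L2 = 0.2, T_s = 1 + 0.05·(10 + 0.2·100) = 2.5 ns, showing how an L2 cache absorbs most of the L1 miss penalty.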

Elements of cache memory design Unified vs Split Cache. A unified cache contains both instructions and data; a split cache has dedicated cache memories for storing instructions and data separately. Which one is better?

Elements of cache memory design Unified vs Split Cache. Unified cache: for a given cache size, a unified cache has a higher hit rate than a split cache because it balances the load between instruction and data fetches automatically, and only one cache needs to be designed and implemented.

Elements of cache memory design Unified vs Split Cache. Split caches are nevertheless used in modern general-purpose processors, mainly because of their parallel-processing mechanisms (pipelining, superscalar execution): a split cache eliminates, to some degree, contention for the cache between instruction fetches and data accesses.

Elements of cache memory design Advanced subjects: hardware prefetching; software prefetching and cacheability-control instructions; streaming load and store instructions; cache coherency in the context of parallel-processing mechanisms.
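As a hedged illustration of software prefetching, a sketch in C using the GCC/Clang __builtin_prefetch intrinsic. The loop and the prefetch distance are illustrative assumptions; effective distances must be tuned per machine:

#include <stddef.h>

/* Sum an array while prefetching ahead of the current element.
 * __builtin_prefetch is a GCC/Clang extension: the second argument
 * 0 means "prefetch for reading", and the third (3) requests high
 * temporal locality (keep the line in all cache levels). */
double sum_with_prefetch(const double *a, size_t n)
{
    const size_t dist = 16;          /* assumed distance, in elements */
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + dist < n)
            __builtin_prefetch(&a[i + dist], 0, 3);
        s += a[i];
    }
    return s;
}

The prefetch hides memory latency by requesting a[i + dist] while a[i] is being processed; hardware prefetchers do the same automatically for regular access patterns.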