1 Cache Memory By Tom Austin

2 What is cache memory? A cache is a collection of duplicate data, where the original data is expensive to fetch or compute. With modern CPU speeds, accessing main memory would become a major performance bottleneck without cache memory.

3 Analogy for Cache An analogy can be made between cache & main memory and a desk & a bookshelf. Like a desk, cache lets you access things quickly: it can hold only a few books at a time, but they are right at your fingertips. Like a bookshelf, main memory provides a large capacity: it takes longer to get a book from the shelf, but it can hold far more books.

4 Types of Cache Memory L1 cache, also known as internal cache, is part of the CPU. L2 cache, known as external cache, supplements the internal cache. It is usually not as fast as L1 cache, but has a larger capacity. L3 cache exists on some larger machines, usually on a separate chip between the L2 cache and main memory. Its benefits are seen mostly on extremely large data sets. It is very expensive, and is not used in most PCs today.

5 Cache Hierarchy Information is read from the L1 cache first. In “inclusive” caches, all of the information in the L1 cache is duplicated in the L2 cache. In “exclusive” caches, no block of memory will be in both caches.

6 Cache Read Operations Requested information is taken from the cache if it is available. This is known as a “hit”. If not available in the cache, the information will be taken from main memory instead. Also, the cache will be updated with this block from memory. This is a “miss”. Cache is organized into fixed length “lines”. Each block in main memory can be mapped to one or more lines in the cache.
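
A minimal sketch of the read path in Python, modeling the cache as a dictionary keyed by block number (the names and the dictionary model are illustrative, not any particular hardware design):

```python
def read(cache, memory, block_number, capacity):
    # Return the block's data and whether the access was a hit or a miss.
    if block_number in cache:          # hit: the block is already cached
        return cache[block_number], "hit"
    data = memory[block_number]        # miss: fetch the block from main memory
    if len(cache) >= capacity:         # cache full: one line must be replaced
        cache.pop(next(iter(cache)))   # (replacement policies are covered below)
    cache[block_number] = data         # update the cache with the new block
    return data, "miss"
```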

7 Evaluating Cache Performance The “hit ratio” is the ratio of hits to total checks. Hit Ratio = Hits / (Hits + Misses) A well designed cache can have a hit ratio of close to one.
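
For example, 950 hits out of 1000 accesses give a hit ratio of 0.95. A one-line helper (illustrative):

```python
def hit_ratio(hits, misses):
    return hits / (hits + misses)

print(hit_ratio(950, 50))  # 0.95
```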

8 Replacement Policies Whenever there is a “miss”, the information must be read from main memory. In addition, the cache is updated with this new information. One line will be replaced with the new block of information. Policies for doing this vary. The three most commonly used are FIFO, LRU, and Random.

9 FIFO Replacement Policy First in, first out – Replaces the oldest line in the cache, regardless of the last time that this line was accessed. The main benefit is that this is easy to implement. The principal drawback is that even a heavily used line is eventually evicted simply because it is old – you may find that you are constantly removing and re-adding the same block of memory.
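
A sketch of a FIFO cache in Python, assuming a queue that records the order in which blocks entered the cache (all names are illustrative; `fetch` stands in for a read from main memory):

```python
from collections import deque

class FIFOCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = {}           # block number -> data
        self.order = deque()      # insertion order, oldest block on the left

    def access(self, block, fetch):
        if block in self.lines:             # hit: FIFO does not update the order
            return self.lines[block]
        if len(self.lines) >= self.capacity:
            oldest = self.order.popleft()   # evict the oldest line
            del self.lines[oldest]
        self.lines[block] = fetch(block)    # miss: bring the block into the cache
        self.order.append(block)
        return self.lines[block]
```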

10 LRU Replacement Policy Least Recently Used – The line that was accessed least recently is replaced with the new block of data. The benefit is that this keeps the most recently accessed lines in the cache. The drawback is that this can be difficult and costly to implement, especially if there are lots of lines to consider.
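
A sketch of an LRU cache in Python, using OrderedDict to keep the least recently accessed block at the front of the ordering (illustrative; real hardware tracks recency with per-line bits or counters):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()          # block number -> data, LRU first

    def access(self, block, fetch):
        if block in self.lines:
            self.lines.move_to_end(block)   # hit: mark as most recently used
            return self.lines[block]
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[block] = fetch(block)    # miss: bring the block into the cache
        return self.lines[block]
```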

11 Random Replacement Policy With this policy, the line that is replaced is chosen randomly. Performance is close to that of LRU, and the implementation is much simpler.
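
A sketch of random replacement in Python (illustrative): on a miss with a full cache, the victim line is chosen uniformly at random.

```python
import random

def access_random(lines, capacity, block, fetch):
    if block in lines:                      # hit: nothing to track
        return lines[block]
    if len(lines) >= capacity:
        victim = random.choice(list(lines)) # evict a randomly chosen resident block
        del lines[victim]
    lines[block] = fetch(block)             # miss: bring the block into the cache
    return lines[block]
```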

12 Cache Write Operations If the block being written is already in the cache, this is known as a “write hit”. If it is not in the cache, this is known as a “write miss”. What happens next depends on the write policy.

13 Write-Hit Policies Write through – Cache and main memory are immediately updated. Write back – Cache is immediately updated. Main memory is not updated until the modified line is evicted from the cache. –Performance is better with this approach, particularly for an application that performs a lot of writes. –An extra bit (called a “dirty bit”) keeps track of which lines in the cache still need to update main memory.
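
A sketch of the two write-hit policies in Python, modeling the dirty bit as a per-block flag (illustrative names; whole blocks stand in for cache lines):

```python
def write_hit(cache, dirty, memory, block, data, policy="write-back"):
    cache[block] = data
    if policy == "write-through":
        memory[block] = data          # main memory is updated immediately
    else:
        dirty[block] = True           # write-back: memory is updated only on eviction

def evict(cache, dirty, memory, block):
    if dirty.get(block):
        memory[block] = cache[block]  # write the modified line back to memory
    cache.pop(block)
    dirty.pop(block, None)
```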

14 Write-Miss Policies Write-allocate – On a write-miss, the updated block of memory will be added to the cache. Write-no-allocate – On a write-miss, the updated block of memory is not added to the cache; only main memory is updated. This is also known as a “write around” policy.
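
A sketch of the two write-miss policies in Python, continuing the simplified whole-block model above (illustrative):

```python
def write_miss(cache, dirty, memory, block, data, allocate=True):
    if allocate:                # write-allocate: bring the block into the cache
        cache[block] = data
        dirty[block] = True     # the cached line now differs from main memory
    else:                       # write-no-allocate ("write around")
        memory[block] = data    # memory is updated; the cache is left untouched
```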

15 Multi-Processor Issues with Write Operations On multi-processor machines, each processor will usually have its own cache. When there is an update made to the data, information may become “stale” on the other caches, regardless of the write policy. One solution is to cache only non-shareable memory. Another solution is to update all caches on any write operation.

16 Memory to Cache Mapping Schemes There are a variety of ways to map blocks of memory to lines in the cache. We will discuss the following schemes: Associative mapping Direct mapping Set-associative mapping

17 Associative Mapping In associative mapping, also known as fully associative mapping, any block of memory can be mapped to any line in the cache. The block number is stored with the line as a tag, along with a bit that indicates whether the line is valid. This is difficult to implement: associative memory is required, which is very expensive.
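
A sketch of a fully associative lookup in Python (illustrative): the requested block number is compared against the tag of every valid line, something hardware does in parallel using associative memory.

```python
def lookup_fully_associative(lines, block_number):
    for line in lines:           # hardware compares all tags in parallel
        if line["valid"] and line["tag"] == block_number:
            return line["data"]  # hit
    return None                  # miss
```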

18 Direct Mapping Every block in main memory can be loaded to one and only one line in the cache. The lines in cache have a tag to indicate which block of main memory is in the cache. This is easy to implement, but inflexible. Care must be taken in how the blocks are divided.
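
A sketch of direct-mapped address decoding in Python, assuming illustrative parameters (64-byte blocks, 256 lines): each block maps to exactly one line, chosen by block number modulo the number of lines.

```python
def direct_map(address, block_size=64, num_lines=256):
    block_number = address // block_size
    offset = address % block_size     # byte within the block
    line = block_number % num_lines   # the single line this block may occupy
    tag = block_number // num_lines   # identifies which block is resident in that line
    return line, tag, offset

print(direct_map(0x1A2B3C))  # (172, 104, 60)
```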

19 Set-Associative Mapping Blocks can be mapped to a subset of the lines in cache. These mappings are usually either “two-way” or “four-way”, meaning that a block can be mapped to either 2 or 4 lines in the cache. LRU is often used as the replacement policy within each set; because only a few lines must be considered, it becomes feasible to implement. This is more flexible than direct mapping, but easier to implement than associative mapping.
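
A sketch of a two-way set-associative lookup in Python, with LRU applied within each set (illustrative; `sets` is a list of small lists of tags, e.g. `sets = [[] for _ in range(128)]`):

```python
def lookup_set_associative(sets, block_number, ways=2):
    index = block_number % len(sets)  # which set this block maps to
    tag = block_number // len(sets)
    cache_set = sets[index]           # holds at most `ways` tags, most recent last
    if tag in cache_set:
        cache_set.remove(tag)         # hit: move the tag to the most-recent position
        cache_set.append(tag)
        return index, True
    if len(cache_set) >= ways:
        cache_set.pop(0)              # miss: evict the LRU way within this set
    cache_set.append(tag)
    return index, False
```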

20 Another way to look at associativity… Direct mapping can be thought of as one end of a spectrum, where a block can only be written to one line in the cache. Set-associative mapping is the middle ground, where a block can be stored in a subset of the lines in the cache, but not in just any line. (Fully) associative mapping is the opposite extreme – a block can be written to any line.

21 Associativity Illustration

22 Associativity tradeoff Checking more lines requires more power and area. However, caches with more associativity suffer fewer misses. A rule of thumb is that doubling the associativity has about the same effect on hit rate as doubling the cache size. This rule breaks down beyond 4-way set-associative mappings.

23 References “Assembly Language and Computer Architecture Using C++ and Java” by Anthony J. Dos Reis, section 14.8 (pp. 637–643) http://en.wikipedia.org/wiki/CPU_cache http://www.karbosguide.com/hardware/module3b2.htm http://www-courses.cs.uiuc.edu/~cs232/lectures/21-Cache-Performance.pdf

