
1 Cache Memory Locality of reference: It is observed that when a program refers to memory, its accesses to both data and code are confined to certain localized areas of memory. This is particularly true for programs with frequent loop structures. Locality of reference property: over a short interval of time, the addresses generated by a typical program refer to a few localized areas of memory repeatedly, while the remainder of memory is accessed infrequently. A widely held rule of thumb is that a program spends about 90% of its execution time in only about 10% of its code (Hennessy and Patterson, 38).
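
To make this concrete, here is a minimal C sketch (the array size and stride are illustrative assumptions, not taken from the slides): a sequential pass over an array has strong spatial locality, while a large-stride pass over the same data touches a new cache block on almost every access and so runs noticeably slower.

    /* Illustrative sketch: contrast a cache-friendly sequential traversal with a
     * large-stride traversal of the same array. Sizes are assumptions. */
    #include <stddef.h>

    #define N (1 << 20)              /* 1M ints, larger than typical L1/L2 caches */
    static int data[N];

    /* Sequential pass: consecutive addresses, good spatial locality. */
    long sum_sequential(void) {
        long sum = 0;
        for (size_t i = 0; i < N; i++)
            sum += data[i];
        return sum;
    }

    /* Strided pass: jumps 4 KiB (1024 ints) between accesses, so nearly every
     * access lands in a different cache block and misses far more often. */
    long sum_strided(void) {
        long sum = 0;
        for (size_t stride = 0; stride < 1024; stride++)
            for (size_t i = stride; i < N; i += 1024)
                sum += data[i];
        return sum;
    }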

2 Evolution of Cache: If the active portions of the program and data are placed in a fast and small memory, the average memory access time can be reduced, thus reducing the total execution time of the program. Cache is the fastest component in the memory hierarchy; its speed approaches that of the CPU, and it is placed between the CPU and main memory. Cache stores the most frequently accessed instructions and data.
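
As a rough illustration (the formula is the standard average-memory-access-time expression; the numbers are assumed, not from the slides): average access time = hit time + miss rate × miss penalty, so with a 1-cycle cache hit, a 5% miss rate, and a 50-cycle main-memory penalty the average is 1 + 0.05 × 50 = 3.5 cycles, far closer to cache speed than to main-memory speed.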

3 Objective: make memory appear both inexpensive and fast by combining a small, fast memory with a large, cheap one. Memory Hierarchy: [diagram: registers – cache – main memory – disk, from smallest/fastest to largest/slowest]

4 Main memory – large, inexpensive, slow; stores the entire program and data. Cache – small, expensive, fast; stores a copy of the parts of the larger memory most likely to be accessed. There can be multiple levels of cache. The fastest cache is closest to the CPU (in modern processors, on the die), and each subsequent level is slower, farther from the processor, and (generally) larger. Each level keeps a copy of the information held in the smaller/faster level above it: the hard drive holds the information that is in RAM, which holds the information that is in the cache, and if there are multiple levels of cache, this inclusion continues up the hierarchy.

5 Cache is usually on the same chip as the processor – space is limited, so it is much smaller than off-chip main memory – faster access (1 cycle vs. several cycles for main memory) – Cache is usually built from SRAM – SRAM is faster but more expensive than DRAM: it uses more transistors for each bit of information, it draws more power because of this, and it takes up more space for the same reason. – What makes SRAM different? DRAM must be periodically "refreshed," because the electrical charge in its cells decays over time, losing the data. SRAM does not suffer this decay: its extra transistors per bit let each cell hold its value for as long as power is supplied, with no refresh needed. SRAM also has lower latency (the time it takes to deliver information to the processor after it is requested). – refer to DRAM/SRAM slides

6 Cache operation Latency refers to the time it takes for a task to be accomplished, expressed in clock cycles from the perspective of the device's clock. In the case of cache and memory, it refers to the amount of time it takes for the cache (or memory) to send data. Example: 100 MHz SDRAM with a latency of 9 cycles, paired with a 1 GHz CPU, means a latency of 90 cycles as seen by the CPU: each 100 MHz memory cycle lasts as long as ten 1 GHz CPU cycles, so 9 × 10 = 90.
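
The same conversion can be written as a small C helper (an assumed example, not part of the slides), using the slide's numbers:

    /* Convert a latency in memory-clock cycles to the equivalent CPU cycles. */
    #include <stdio.h>

    static long latency_in_cpu_cycles(long mem_cycles, long mem_hz, long cpu_hz) {
        /* Each memory cycle spans cpu_hz / mem_hz CPU cycles. */
        return mem_cycles * (cpu_hz / mem_hz);
    }

    int main(void) {
        /* The slide's example: 9 cycles at 100 MHz, seen from a 1 GHz CPU. */
        printf("%ld CPU cycles\n",
               latency_in_cpu_cycles(9, 100000000L, 1000000000L)); /* prints 90 */
        return 0;
    }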

7 A cache hit occurs when the CPU asks for information from the cache and gets it. A cache miss occurs when the CPU asks for information from the cache and does not get it from that level. The hit rate is the percentage of accesses for which the processor gets a cache hit. Cache operation: – Request for main memory access (read or write) – First, check the cache for a copy – If the copy is in the cache, access is quick – a cache hit – If the copy is not in the cache – a cache miss – the addressed location, and possibly its neighbors, are copied into the cache Several cache design choices are available, based on: – cache mapping, replacement policies, and write techniques
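
The lookup sequence above can be sketched in C. This is a simplified, word-addressed direct-mapped cache; the sizes, names, and the main_memory_read helper are assumptions for illustration, not the slides' design:

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_LINES 256

    struct line { bool valid; uint32_t tag; uint32_t data; };

    static struct line cache_mem[NUM_LINES];
    static unsigned long hits, misses;

    extern uint32_t main_memory_read(uint32_t addr);  /* assumed slow backing store */

    uint32_t cache_read(uint32_t addr) {
        uint32_t index = addr % NUM_LINES;   /* which line this address maps to */
        uint32_t tag   = addr / NUM_LINES;   /* identifies which block occupies it */

        if (cache_mem[index].valid && cache_mem[index].tag == tag) {
            hits++;                          /* cache hit: quick access */
            return cache_mem[index].data;
        }
        misses++;                            /* cache miss: go to main memory */
        cache_mem[index].data  = main_memory_read(addr);
        cache_mem[index].tag   = tag;
        cache_mem[index].valid = true;       /* the copy now lives in the cache */
        return cache_mem[index].data;
    }

The hit rate is then hits / (hits + misses); a real cache would also fetch the addressed word's neighbors (the rest of the block) on a miss.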

8 Cache-Replacement Policy Technique for choosing which block to replace – when a fully associative cache is full – when a set-associative cache's set is full A direct-mapped cache has no choice Random – replace a block chosen at random LRU: least-recently used – replace the block that has not been accessed for the longest time FIFO: first-in-first-out – push a block onto the queue when it is brought into the cache – choose the block to replace by popping the queue
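
A minimal sketch of the LRU choice in C (the 4-way set, field names, and timestamp scheme are illustrative assumptions, not from the slides):

    #include <stdint.h>

    #define WAYS 4

    struct block { int valid; uint32_t tag; unsigned long last_used; };

    /* Pick which block in one set to replace: a free way if there is one,
     * otherwise the block whose last access is the oldest. */
    int choose_victim_lru(struct block set[WAYS]) {
        int victim = 0;
        for (int i = 0; i < WAYS; i++) {
            if (!set[i].valid)
                return i;                    /* free way: no eviction needed */
            if (set[i].last_used < set[victim].last_used)
                victim = i;                  /* least recently used so far */
        }
        return victim;
    }

On every hit, the matching block's last_used field would be updated to the current access count; random replacement would instead just pick rand() % WAYS, and FIFO would track fill order rather than access order.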

9 Cache Write Techniques When data in the cache is written, main memory must also be updated. Write-through – write to main memory whenever the cache is written to – easiest to implement – the processor must wait for the slower main-memory write – potential for unnecessary writes Write-back – main memory is written only when a "flagged" block is replaced – an extra flag bit for each block is set when the cache block is written to – reduces the number of slow main-memory writes
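
The two techniques can be contrasted in a small C sketch (the structure and the main_memory_write helper are assumed for illustration, not the slides' design):

    #include <stdint.h>
    #include <stdbool.h>

    struct wb_line { bool valid, dirty; uint32_t tag, data; };

    extern void main_memory_write(uint32_t addr, uint32_t value);  /* assumed, slow */

    /* Write-through: every write goes to both the cache and main memory,
     * so the processor waits for the slow write each time. */
    void write_through(struct wb_line *line, uint32_t addr, uint32_t value) {
        line->data = value;
        main_memory_write(addr, value);
    }

    /* Write-back: the write touches only the cache and sets the flag (dirty) bit;
     * main memory stays stale until the block is evicted. */
    void write_back(struct wb_line *line, uint32_t value) {
        line->data  = value;
        line->dirty = true;
    }

    /* On eviction, a dirty block is copied back to main memory exactly once. */
    void evict(struct wb_line *line, uint32_t block_addr) {
        if (line->valid && line->dirty)
            main_memory_write(block_addr, line->data);
        line->valid = line->dirty = false;
    }

The trade-off is visible directly: write-through pays the slow memory write on every store, while write-back pays it at most once per replaced block.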

