Cache Memory Replacement Policy. Prof. Sin-Min Lee, Department of Computer Science

Where can a block be placed in the cache?
Direct mapped cache – each block has only one place where it can appear in the cache: (Block address) MOD (Number of blocks in cache)
Fully associative cache – a block can be placed anywhere in the cache
Set associative cache – a block can be placed in a restricted set of places in the cache. A set is a group of blocks in the cache: (Block address) MOD (Number of sets in cache). If there are n blocks in a set, the placement is said to be n-way set associative.
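A minimal sketch of the two index computations, assuming an 8-block cache organized as 4 sets of 2 ways (the sizes and function names are illustrative, not from the lecture):

```python
NUM_BLOCKS = 8   # total blocks in the cache
NUM_SETS = 4     # for the 2-way set-associative case (8 blocks / 2 ways)

def direct_mapped_index(block_addr: int) -> int:
    """Direct mapped: exactly one legal location."""
    return block_addr % NUM_BLOCKS

def set_associative_set(block_addr: int) -> int:
    """Set associative: the block may go anywhere within this set."""
    return block_addr % NUM_SETS

# Fully associative: any of range(NUM_BLOCKS) is a legal location,
# so no index computation is needed at all.

print(direct_mapped_index(12))   # memory block 12 -> cache block 4
print(set_associative_set(12))   # memory block 12 -> set 0 (either way of set 0)
```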

How is a block found in the cache? Caches have an address tag on each block frame that gives the block address. The tag is checked against the address coming from the CPU:
–All tags are searched in parallel, since speed is critical
–A valid bit is appended to every tag to say whether the entry contains a valid address or not
Address fields:
–Block address: the tag (compared against the stored tags for a hit) and the index (selects the set)
–Block offset – selects the desired data from the block
In a set associative cache, a larger index means more sets with fewer blocks per set; with a smaller index, the associativity increases. A fully associative cache has no index field at all.
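A sketch of splitting an address into these fields. The field widths are assumptions for a small example cache (16-byte blocks, so 4 offset bits, and 8 sets, so 3 index bits), not values from the lecture:

```python
OFFSET_BITS = 4   # assumed: 16-byte blocks
INDEX_BITS = 3    # assumed: 8 sets

def split_address(addr: int):
    offset = addr & ((1 << OFFSET_BITS) - 1)                  # data within the block
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)   # selects the set
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                  # compared against stored tags
    return tag, index, offset

tag, index, offset = split_address(0x1A3C)
print(f"tag={tag:#x} index={index} offset={offset}")
```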

Which block should be replaced on a cache miss? When a miss occurs, the cache controller must select a block to be replaced with the desired data.
–A benefit of direct mapping is that this hardware decision is much simplified: there is only one candidate.
Two primary strategies for fully and set associative caches:
–Random – candidate blocks are randomly selected. Some systems generate pseudo-random block numbers to get reproducible behavior, which is useful for debugging.
–LRU (Least Recently Used) – to reduce the chance of throwing out information that will be needed again soon, the block replaced is the least recently used one. Accesses to blocks must be recorded to implement LRU.

What happens on a write? Two basic options when writing to the cache:
–Write through – the information is written to both the block in the cache and the block in the lower-level memory.
–Write back – the information is written only to the block in the cache. The modified block is written back to the lower-level memory only when it is replaced.
To reduce the frequency of writing back blocks on replacement, an implementation feature called the dirty bit is commonly used. This bit indicates whether a block is dirty (modified since it was loaded) or clean (not modified). If the block is clean, no write back is needed.
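A toy sketch contrasting the two policies on a single cache block with a dirty bit (the class and function names are my own, not the lecture's):

```python
class Block:
    def __init__(self):
        self.data = None
        self.dirty = False   # set when the cached copy diverges from memory

def write_through(block, value, memory, addr):
    block.data = value
    memory[addr] = value     # every write also goes to lower-level memory

def write_back(block, value):
    block.data = value
    block.dirty = True       # defer the memory write until eviction

def evict(block, memory, addr):
    if block.dirty:          # only dirty blocks are written back
        memory[addr] = block.data
        block.dirty = False
```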

The connection between the CPU and cache is very fast; the connection between the CPU and memory is slower

There are three methods of block placement:
Direct mapped: if each block has only one place it can appear in the cache, the cache is said to be direct mapped. The mapping is usually (Block address) MOD (Number of blocks in cache).
Fully associative: if a block can be placed anywhere in the cache, the cache is said to be fully associative.
Set associative: if a block can be placed in a restricted set of places in the cache, the cache is said to be set associative. A set is a group of blocks in the cache. A block is first mapped onto a set, and then it can be placed anywhere within that set. The set is usually chosen by bit selection; that is, (Block address) MOD (Number of sets in cache).

A pictorial example for a cache with only 4 blocks and a memory with only 16 blocks.

Direct mapped cache: a block from main memory can go in exactly one place in the cache. This is called direct mapped because there is a direct mapping from any block address in memory to a single location in the cache.

Fully associative cache: a block from main memory can be placed in any location in the cache. This is called fully associative because a block in main memory may be associated with any entry in the cache.

Memory/Cache Related Terms. Set associative cache: the middle range of designs between direct mapped and fully associative is called set associative. In an n-way set-associative cache, a block from main memory can go into n (n at least 2) locations in the cache.

Replacing Data. Initially all valid bits are set to 0. As instructions and data are fetched from memory, the cache fills and eventually some data must be replaced. Which data? With direct mapping the answer is obvious: each block has only one possible location.

Operating Systems Page Replacement Algorithms

Replacement Policies for Associative Cache
1. FIFO – fills from top to bottom and then wraps back to the top. (Modified data may need to be written back to physical memory before being replaced.)
2. LRU – replaces the least recently used data. Requires a counter.
3. Random

Graph of Page Faults vs. the Number of Frames

The FIFO Policy. Treats the page frames allocated to a process as a circular buffer:
–When the buffer is full, the oldest page is replaced; hence first-in, first-out.
–A frequently used page is often among the oldest, so it may be repeatedly paged out by FIFO.
–Simple to implement: requires only a pointer that circles through the page frames of the process. A minimal sketch of this circular buffer is shown below.
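A minimal sketch of FIFO page replacement as a circular buffer, matching the description above (the names are my own):

```python
class FifoFrames:
    def __init__(self, num_frames):
        self.frames = [None] * num_frames
        self.next = 0                      # pointer circling through the frames

    def access(self, page) -> bool:
        """Return True on a hit, False on a page fault."""
        if page in self.frames:
            return True
        self.frames[self.next] = page      # replace the oldest page
        self.next = (self.next + 1) % len(self.frames)
        return False
```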

FIFO Page Replacement

Exercise

First-In-First-Out (FIFO) Algorithm. Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.
With 3 frames (3 pages can be in memory at a time per process): 9 page faults.
With 4 frames: 10 page faults.
FIFO replacement manifests Belady's Anomaly: more frames can mean more page faults. The simulation below reproduces these counts.
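A small FIFO fault counter (my own sketch, not the lecture's code) that reproduces the anomaly on this reference string:

```python
from collections import deque

def fifo_faults(refs, num_frames):
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == num_frames:
                frames.remove(queue.popleft())   # evict the oldest page
            frames.add(page)
            queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults -- more frames, more faults
```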

FIFO Illustrating Belady’s Anomaly

Optimal Page Replacement. The optimal policy selects for replacement the page that will not be used for the longest period of time. It is impossible to implement (it needs to know the future), but it serves as a standard against which to compare the other algorithms we shall study.

Optimal Page Replacement

Exercise

Optimal Algorithm. Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5. With 4 frames: 6 page faults. How do you know which page will not be needed? You don't! The optimal algorithm is used for measuring how well your algorithm performs.
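A sketch of the optimal (Belady) policy, evicting the resident page whose next use lies farthest in the future (again my own illustrative code):

```python
def opt_faults(refs, num_frames):
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)
        else:
            def next_use(p):
                # Distance to next use; pages never used again sort last.
                future = refs[i + 1:]
                return future.index(p) if p in future else len(refs)
            frames.remove(max(frames, key=next_use))   # evict farthest next use
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(opt_faults(refs, 4))   # 6 faults
```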

The LRU Policy. Replaces the page that has not been referenced for the longest time:
–By the principle of locality, this should be the page least likely to be referenced in the near future.
–It performs nearly as well as the optimal policy.

LRU Page Replacement

Least Recently Used (LRU) Algorithm. Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5. With 4 frames: 8 page faults.

Comparison of OPT with LRU. Example: a process of 5 pages with an OS that fixes the resident set size to 3.

Comparison of FIFO with LRU LRU recognizes that pages 2 and 5 are referenced more frequently than others but FIFO does not.

Implementation of the LRU Policy. Each page could be tagged (in the page table entry) with the time of its last reference, updated at each memory reference. The LRU page is then the one with the smallest time value, which must be searched for at each page fault. This would require expensive hardware and a great deal of overhead. Consequently, very few computer systems provide sufficient hardware support for a true LRU replacement policy; other algorithms are used instead.

LRU Implementations
Counter implementation:
–Every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter.
–When a page needs to be replaced, look at the counters to find the page with the smallest value.
Stack implementation – keep a stack of page numbers in doubly linked form:
–When a page is referenced, move it to the top; this requires 6 pointers to be changed.
–No search is needed at replacement time. A sketch of this idea follows.
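A sketch of the stack idea using Python's OrderedDict (a doubly linked list underneath), so no search is needed when a page must be replaced; the naming is my own:

```python
from collections import OrderedDict

class LruFrames:
    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.stack = OrderedDict()         # most recently used page at the end

    def access(self, page) -> bool:
        """Return True on a hit, False on a page fault."""
        if page in self.stack:
            self.stack.move_to_end(page)   # referenced: move to the top
            return True
        if len(self.stack) == self.num_frames:
            self.stack.popitem(last=False) # evict the least recently used page
        self.stack[page] = True
        return False

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
lru = LruFrames(4)
print(sum(not lru.access(p) for p in refs))   # 8 faults, as in the example above
```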

Use of a stack to implement LRU

Comparison of Clock with FIFO and LRU. An asterisk indicates that the corresponding use bit is set to 1. Clock protects frequently referenced pages by setting the use bit to 1 at each reference.

Exercise

Paging: a 2-level memory system with a large disk and k pages in RAM. Given a sequence of requests to pages: if the requested page is not in RAM (a miss), it must be fetched from disk, and another page may need to be evicted. Objective: minimize the number of misses.

Paging example: k = 5. (The initial RAM state and the request sequence were given in a figure.) Questions: Can we reduce the number of misses? Yes. Can we reduce it to zero? No.

Main question: what page to evict on a miss? Common replacement strategies: LRU (least recently used) and FIFO (first-in first-out).

How to determine the optimum? Belady's algorithm: at each step, evict the page whose next request is farthest in the future.

How bad is LRU? Try k = 2: there are request sequences on which every request is a miss for LRU, while only every second request is a miss for the optimum. So the competitive ratio of LRU is ≥ 2. The snippet below checks this with the simulators sketched earlier.
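A quick check of the bound, reusing the LruFrames and opt_faults sketches from above (my own illustrative code): cycling three pages through two frames makes LRU miss on every request, while the optimum misses only about every other time.

```python
refs = [1, 2, 3] * 6            # three pages cycled through k = 2 frames
lru = LruFrames(2)
lru_faults = sum(not lru.access(p) for p in refs)
print(lru_faults, opt_faults(refs, 2))   # 18 vs 10: close to a factor of 2
```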

Replacement in Set-Associative Cache. Which of the n ways within the selected set should be replaced? FIFO, Random, or LRU. (In the example figure, the accessed locations are D, E, A.)

Writing Data. If the location is in the cache, the cached value and possibly the value in physical memory must be updated. If the location is not in the cache, it may or may not be loaded into the cache (write-allocate vs. write-no-allocate). Two methodologies:
1. Write-through – physical memory always contains the correct value.
2. Write-back – the value is written to physical memory only when it is removed from the cache.

Cache Performance. Memory accesses are either cache hits or cache misses. The hit ratio h is the fraction of memory accesses that are served from the cache. Average memory access time: T_M = h·T_C + (1 − h)·T_P, where T_C is the cache access time and T_P is the physical memory access time. In the examples that follow, T_C = 10 ns and T_P = 60 ns.
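For instance, with an assumed hit ratio of h = 0.8 (an illustrative value, not one from the slides): T_M = 0.8 × 10 ns + 0.2 × 60 ns = 8 ns + 12 ns = 20 ns.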

Associative Cache. Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0. With T_C = 10 ns, T_P = 60 ns and FIFO replacement: h = 7/18 (see the table below), so T_M = (7/18)(10) + (11/18)(60) ≈ 40.6 ns.

Direct-Mapped Cache. Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0. With T_C = 10 ns and T_P = 60 ns: h = 3/18 (only the repeated C2 and D1 accesses hit, since A0 and B0 keep evicting each other), so T_M = (3/18)(10) + (15/18)(60) ≈ 51.7 ns.

2-Way Set Associative Cache. Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0. With T_C = 10 ns, T_P = 60 ns and LRU replacement: h = 7/18 (see the table below), so T_M = (7/18)(10) + (11/18)(60) ≈ 40.6 ns.

Associative Cache (FIFO Replacement Policy)
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0

Data:  A B C A D B E F A C D B G C H I A B
Slot1: A A A A A A A A A A A A A A A I I I
Slot2: . B B B B B B B B B B B B B B B A A
Slot3: . . C C C C C C C C C C C C C C C B
Slot4: . . . . D D D D D D D D D D D D D D
Slot5: . . . . . . E E E E E E E E E E E E
Slot6: . . . . . . . F F F F F F F F F F F
Slot7: . . . . . . . . . . . . G G G G G G
Slot8: . . . . . . . . . . . . . . H H H H
Hit?   . . . * . * . . * * * * . * . . . .

Hit ratio = 7/18

Two-way set associative cache (LRU Replacement Policy)
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
(set = address MOD 4; the -0/-1 suffixes in the original marked each line's LRU state within its set)

Data:      A B C A D B E F A C D B G C H I A B
Set0 way0: A A A A A A E E E E E B B B B B B B
Set0 way1: . B B B B B B B A A A A A A A A A A
Set1 way0: . . . . D D D D D D D D D D D D D D
Set1 way1: . . . . . . . F F F F F F F F F F F
Set2 way0: . . C C C C C C C C C C C C C C C C
Set2 way1: . . . . . . . . . . . . . . . I I I
Set3 way0: . . . . . . . . . . . . G G G G G G
Set3 way1: . . . . . . . . . . . . . . H H H H
Hit?       . . . * . * . . . * * . . * . . * *

Hit ratio = 7/18

Associative Cache with 2-byte Line Size (FIFO Replacement Policy)
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Line pairs: A and J; B and D; C and G; E and F; I and H

Data:        A B C A D B E F A C D B G C H I A B
Line1 byte0: A A A A A A A A A A A A A A I I I I
Line1 byte1: J J J J J J J J J J J J J J H H H H
Line2 byte0: . B B B B B B B B B B B B B B B A A
Line2 byte1: . D D D D D D D D D D D D D D D J J
Line3 byte0: . . C C C C C C C C C C C C C C C B
Line3 byte1: . . G G G G G G G G G G G G G G G D
Line4 byte0: . . . . . . E E E E E E E E E E E E
Line4 byte1: . . . . . . F F F F F F F F F F F F
Hit?         . . . * * * . * * * * * * * . * . .

Hit ratio = 11/18

Direct-mapped Cache with 2-byte Line Size
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Line pairs: A and J; B and D; C and G; E and F; I and H

Data:  A B C A D B E F A C D B G C H I A B
Slot0: A B B A B B B B A A B B B B B B A B
Slot1: J D D J D D D D J J D D D D D D J D
Slot2: . . C C C C C C C C C C C C C C C C
Slot3: . . G G G G G G G G G G G G G G G G
Slot4: . . . . . . E E E E E E E E E E E E
Slot5: . . . . . . F F F F F F F F F F F F
Slot6: . . . . . . . . . . . . . . I I I I
Slot7: . . . . . . . . . . . . . . H H H H
Hit?   . . . . . * . * . * . * * * . * . .

Hit ratio = 7/18

Two-way set Associative Cache with 2-byte Line Size (LRU Replacement Policy)
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Line pairs: A and J; B and D; C and G; E and F; I and H
(line = byte address DIV 2, set = line MOD 2; the -0/-1 suffixes in the original marked each line's LRU state within its set)

Data:            A B C A D B E F A C D B G C H I A B
Set0 way0 byte0: A A A A A A E E E E B B B B B B B B
Set0 way0 byte1: J J J J J J F F F F D D D D D D D D
Set0 way1 byte0: . B B B B B B B A A A A A A A A A A
Set0 way1 byte1: . D D D D D D D J J J J J J J J J J
Set1 way0 byte0: . . C C C C C C C C C C C C C C C C
Set1 way0 byte1: . . G G G G G G G G G G G G G G G G
Set1 way1 byte0: . . . . . . . . . . . . . . I I I I
Set1 way1 byte1: . . . . . . . . . . . . . . H H H H
Hit?             . . . * * * . * . * . * * * . * * *

Hit ratio = 11/18

Page Replacement - FIFO. FIFO is simple to implement:
–When a page is brought in, place its id at the end of a list.
–Evict the page at the head of the list.
Might it be good? The page to be evicted has been in memory the longest time. But maybe it is still being used; we just don't know. FIFO also suffers from Belady's Anomaly: the fault rate may increase when there is more physical memory!

Page Replacement Policy. Working set:
–The set of pages used actively and heavily.
–Kept in memory to reduce page faults.
–Found and maintained dynamically by the OS.
Replacement: the OS tries to predict which page would have the least impact on the running program. Common replacement schemes: Least Recently Used (LRU) and First-In-First-Out (FIFO).

Replacement Policy. Which page is replaced?
–The page removed should be the page least likely to be referenced in the near future.
–Most policies predict future behavior on the basis of past behavior.

Replacement Policy: Frame Locking
–If a frame is locked, it may not be replaced.
–Examples: the kernel of the operating system, control structures, I/O buffers.
–Associate a lock bit with each frame.

Basic Replacement Algorithms Optimal policy –Selects for replacement that page for which the time to the next reference is the longest –Impossible to have perfect knowledge of future events

Basic Replacement Algorithms Least Recently Used (LRU) –Replaces the page that has not been referenced for the longest time –By the principle of locality, this should be the page least likely to be referenced in the near future –Each page could be tagged with the time of last reference. This would require a great deal of overhead.

Basic Replacement Algorithms First-in, first-out (FIFO) –Treats page frames allocated to a process as a circular buffer –Pages are removed in round-robin style –Simplest replacement policy to implement –Page that has been in memory the longest is replaced –These pages may be needed again very soon

Basic Replacement Algorithms Clock Policy –Additional bit called a use bit –When a page is first loaded in memory, the use bit is set to 1 –When the page is referenced, the use bit is set to 1 –When it is time to replace a page, the first frame encountered with the use bit set to 0 is replaced. –During the search for replacement, each use bit set to 1 is changed to 0
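A minimal sketch of the clock policy just described (the naming is my own): a circular scan that clears use bits until it finds one already 0.

```python
class ClockFrames:
    def __init__(self, num_frames):
        self.pages = [None] * num_frames
        self.use = [0] * num_frames
        self.hand = 0

    def access(self, page) -> bool:
        """Return True on a hit, False on a page fault."""
        if page in self.pages:
            self.use[self.pages.index(page)] = 1   # referenced: set the use bit
            return True
        while self.use[self.hand] == 1:            # give used pages a second chance
            self.use[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.pages)
        self.pages[self.hand] = page               # use bit is 0: replace here
        self.use[self.hand] = 1                    # a newly loaded page gets use bit 1
        self.hand = (self.hand + 1) % len(self.pages)
        return False
```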

Page Replacement Policies: Upon Replacement
–Need to know whether to write the data back.
–Add a dirty bit:
Dirty bit = 0: page is clean; no writing needed.
Dirty bit = 1: page is dirty; write it back.