1 Cache Memory Replacement Policy Prof. Sin-Min Lee, Department of Computer Science

2 Where Can a Block Be Placed in the Cache? Direct mapped cache –Each block has only one place where it can appear in the cache –(Block address) MOD (Number of blocks in cache) Fully associative cache –A block can be placed anywhere in the cache Set associative cache –A block can be placed in a restricted set of places in the cache –A set is a group of blocks in the cache –(Block address) MOD (Number of sets in cache) If there are n blocks in a set, the placement is said to be n-way set associative
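
A minimal sketch of the mapping arithmetic above (the function names are hypothetical, chosen for illustration):

```python
def direct_mapped_slot(block_addr: int, num_blocks: int) -> int:
    # Direct mapped: each block has exactly one candidate slot.
    return block_addr % num_blocks

def set_index(block_addr: int, num_sets: int) -> int:
    # Set associative: the block may go in any way of this one set.
    return block_addr % num_sets

# Example: an 8-block cache organized as 2-way set associative has 4 sets.
print(direct_mapped_slot(13, 8))  # -> 5
print(set_index(13, 4))           # -> 1
```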

3 How Is a Block Found in the Cache? Caches have an address tag on each block frame that gives the block address. The tag is checked against the address coming from the CPU –All tags are searched in parallel since speed is critical –A valid bit is appended to every tag to say whether this entry contains a valid address or not Address fields: –Block address Tag – compared against the stored tags for a hit Index – selects the set –Block offset – selects the desired data from the block Set associative cache –A larger index means more sets with fewer blocks per set –With a smaller index, the associativity increases Fully associative cache – there is no index field
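
A sketch of how the three address fields can be extracted, assuming a simple flat address layout (the bit widths and names are illustrative, not from the slides):

```python
def split_address(addr: int, offset_bits: int, index_bits: int):
    """Split an address into (tag, index, block offset)."""
    offset = addr & ((1 << offset_bits) - 1)                 # selects data in block
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)  # selects the set
    tag = addr >> (offset_bits + index_bits)                 # compared for a hit
    return tag, index, offset

# Example: 16-byte blocks (4 offset bits), 64 sets (6 index bits).
# A fully associative cache would use index_bits = 0: no index field.
print(split_address(0x1A2B3C, offset_bits=4, index_bits=6))
```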

4 Which Block Should Be Replaced on a Cache Miss? When a miss occurs, the cache controller must select a block to be replaced with the desired data –A benefit of direct mapping is that this hardware decision is much simplified Two primary strategies for fully and set associative caches –Random – candidate blocks are randomly selected Some systems generate pseudo-random block numbers to get reproducible behavior, useful for debugging –LRU (Least Recently Used) – relying on locality (recently used blocks are likely to be used again), the block replaced is the least recently used one. Accesses to blocks are recorded to be able to implement LRU

5 What Happens on a Write? Two basic options when writing to the cache: –Write through – the information is written to both the block in the cache and the block in the lower-level memory –Write back – the information is written only to the block in the cache The modified block of cache is written back to the lower-level memory only when it is replaced To reduce the frequency of writing back blocks on replacement, an implementation feature called the dirty bit is commonly used. –This bit indicates whether a block is dirty (has been modified since loaded) or clean (not modified). If clean, no write back is needed
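
A toy sketch of the two write policies and the dirty bit, under the simplifying assumption that the cache and memory are plain dictionaries (all names are hypothetical):

```python
class WriteBackLine:
    """One cache line under write-back (illustrative only)."""
    def __init__(self, data):
        self.data = data
        self.dirty = False      # clean: unmodified since it was loaded

    def write(self, data):
        self.data = data
        self.dirty = True       # memory is now stale; the update is deferred

    def evict(self, memory, addr):
        if self.dirty:          # write back only if the block was modified
            memory[addr] = self.data

def write_through(cache, memory, addr, data):
    # Write-through: the cache and the lower-level memory stay in sync.
    cache[addr] = data
    memory[addr] = data
```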

6 The connection between the CPU and cache is very fast; the connection between the CPU and memory is slower

7

8 There are three methods in block placement: Direct mapped : if each block has only one place it can appear in the cache, the cache is said to be direct mapped. The mapping is usually (Block address) MOD (Number of blocks in cache) Fully Associative : if a block can be placed anywhere in the cache, the cache is said to be fully associative. Set associative : if a block can be placed in a restricted set of places in the cache, the cache is said to be set associative. A set is a group of blocks in the cache. A block is first mapped onto a set, and then the block can be placed anywhere within that set. The set is usually chosen by bit selection; that is, (Block address) MOD (Number of sets in cache)

9 A pictorial example for a cache with only 4 blocks and a memory with only 16 blocks.

10 Direct mapped cache: A block from main memory can go in exactly one place in the cache. This is called direct mapped because there is a direct mapping from any block address in memory to a single location in the cache.

11 Fully associative cache: A block from main memory can be placed in any location in the cache. This is called fully associative because a block in main memory may be associated with any entry in the cache.

12 Memory/Cache Related Terms Set associative cache: The middle range of designs between direct mapped cache and fully associative cache is called set-associative cache. In an n-way set-associative cache, a block from main memory can go into n (n at least 2) locations in the cache.

13 Replacing Data Initially all valid bits are set to 0 As instructions and data are fetched from memory, the cache fills and some data must be replaced. Which ones? Direct mapping – the choice is obvious: each block maps to exactly one location

14 Operating Systems Page Replacement Algorithms

15 Replacement Policies for Associative Cache 1. FIFO – fills from top to bottom and wraps back to the top (data may have to be written back to physical memory before being replaced) 2. LRU – replaces the least recently used data. Requires a counter. 3. Random

16 Graph of Page Faults vs. the Number of Frames

17 The FIFO Policy Treats page frames allocated to a process as a circular buffer: –When the buffer is full, the oldest page is replaced. Hence first-in, first-out. –A frequently used page is often among the oldest, so FIFO may repeatedly page out pages that are still in use. –Simple to implement: requires only a pointer that circles through the page frames of the process.
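
A minimal sketch of this circular-buffer formulation (the class name is hypothetical); the only state FIFO needs is the frame array and one pointer:

```python
class FIFOPager:
    """FIFO page replacement as a circular buffer over the frames."""
    def __init__(self, num_frames):
        self.frames = [None] * num_frames
        self.next_victim = 0          # the single pointer FIFO requires
        self.faults = 0

    def access(self, page):
        if page in self.frames:
            return                    # hit: FIFO keeps no recency information
        self.faults += 1
        self.frames[self.next_victim] = page                 # replace the oldest page
        self.next_victim = (self.next_victim + 1) % len(self.frames)
```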

18 FIFO Page Replacement

19 Exercise

20 First-In-First-Out (FIFO) Algorithm Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 With 3 frames (3 pages can be in memory at a time per process): 9 page faults With 4 frames: 10 page faults FIFO replacement manifests Belady's Anomaly: –more frames ⇒ more page faults
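
The anomaly can be checked with a short simulation of FIFO fault counts (a sketch, not the slide's original material):

```python
from collections import deque

def fifo_faults(refs, num_frames):
    frames, order, faults = set(), deque(), 0
    for page in refs:
        if page in frames:
            continue                             # hit
        faults += 1
        if len(frames) == num_frames:
            frames.discard(order.popleft())      # evict the oldest page
        frames.add(page)
        order.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9 faults
print(fifo_faults(refs, 4))  # 10 faults -- more frames, more faults
```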

21 FIFO Illustrating Belady’s Anomaly

22 Optimal Page Replacement The optimal policy selects for replacement the page that will not be used for the longest period of time. It is impossible to implement (it would require knowing the future) but serves as a standard against which to compare the other algorithms we shall study.

23 Optimal Page Replacement

24 Exercise

25 Optimal Algorithm Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 With 4 frames: 6 page faults How do you know which page will not be needed? You don't! The optimal algorithm is used for measuring how well your algorithm performs.
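
A sketch of the optimal policy as a simulation; it "knows the future" only because the whole reference string is given up front:

```python
def opt_faults(refs, num_frames):
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page in frames:
            continue                             # hit
        faults += 1
        if len(frames) == num_frames:
            # Evict the page whose next use is farthest away (or never).
            def next_use(p):
                rest = refs[i + 1:]
                return rest.index(p) if p in rest else float("inf")
            frames.discard(max(frames, key=next_use))
        frames.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(opt_faults(refs, 4))  # 6 page faults
```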

26 The LRU Policy Replaces the page that has not been referenced for the longest time: –By the principle of locality, this should be the page least likely to be referenced in the near future. –Performs nearly as well as the optimal policy.

27 LRU Page Replacement

28 Least Recently Used (LRU) Algorithm Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 With 4 frames: 8 page faults
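
A corresponding LRU fault counter (a sketch; the list is kept ordered from least to most recently used):

```python
def lru_faults(refs, num_frames):
    frames, faults = [], 0        # list ordered from LRU to MRU
    for page in refs:
        if page in frames:
            frames.remove(page)   # refresh recency on a hit
        else:
            faults += 1
            if len(frames) == num_frames:
                frames.pop(0)     # evict the least recently used page
        frames.append(page)       # the new/touched page is now MRU
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(refs, 4))  # 8 page faults
```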

29 Comparison of OPT with LRU Example: A process of 5 pages with an OS that fixes the resident set size to 3.

30 Comparison of FIFO with LRU LRU recognizes that pages 2 and 5 are referenced more frequently than others but FIFO does not.

31 Implementation of the LRU Policy Each page could be tagged (in the page table entry) with the time of its last reference, updated at each memory reference. The LRU page is the one with the smallest time value (which must be searched for at each page fault). This would require expensive hardware and a great deal of overhead. Consequently, very few computer systems provide sufficient hardware support for a true LRU replacement policy. Other algorithms are used instead.

32 LRU Implementations Counter implementation: –Every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter. –When a page needs to be replaced, look at the counters to find the smallest value, i.e., the LRU page. Stack implementation – keep a stack of page numbers in doubly linked form: –Page referenced: move it to the top (requires 6 pointers to be changed) –No search for replacement: the victim is always at the bottom.
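
A sketch of the stack scheme using Python's OrderedDict, which plays the role of the doubly linked list (the class name is hypothetical):

```python
from collections import OrderedDict

class LRUStack:
    """The 'stack' scheme above: most recent page on top, victim at
    the bottom, so no search is needed on replacement."""
    def __init__(self, num_frames):
        self.pages = OrderedDict()
        self.num_frames = num_frames

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)       # move to the top of the stack
            return True                        # hit
        if len(self.pages) == self.num_frames:
            self.pages.popitem(last=False)     # bottom of the stack = LRU victim
        self.pages[page] = True
        return False                           # fault
```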

33 Use of a stack to implement LRU

34 Comparison of Clock with FIFO and LRU (1) Asterisk indicates that the corresponding use bit is set to 1. Clock protects frequently referenced pages by setting the reference bit to 1 at each reference.

35 Exercise

36 Paging: a 2-level memory system with a large disk and k pages in RAM A sequence of requests to pages arrives If the requested page is not in RAM (a miss), it must be fetched from disk (and another page may need to be evicted) Objective: minimize the number of misses

37 Paging example: k = 5 Initial RAM state: 4 3 5 6 1 (The slide steps through a request sequence, showing the RAM state after each miss.) Questions: Can we reduce the number of misses? Yes Can we reduce it to zero? No

38 Main question: What page to evict on a miss? Common replacement strategies: LRU = least recently used FIFO = first-in first-out (The slide traces an LRU example over a request sequence, showing the RAM state after each step.)

39 How to determine the optimum? Belady's algorithm: at each step evict the page whose next request is farthest in the future. (The slide traces this rule on the same request sequence.)

40 How bad is LRU? Try k = 2 on the cyclic request sequence 1, 2, 3, 1, 2, 3, … With LRU, each request is a miss; with the optimum, only every second request is a miss. So the competitive ratio of LRU is ≥ 2.
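
This worst case can be reproduced by running simple LRU and optimal fault counters (as sketched on earlier slides) on the cyclic sequence:

```python
def lru_faults(refs, k):
    frames, faults = [], 0
    for p in refs:
        if p in frames:
            frames.remove(p)          # refresh recency
        else:
            faults += 1
            if len(frames) == k:
                frames.pop(0)         # evict the LRU page
        frames.append(p)
    return faults

def opt_faults(refs, k):
    frames, faults = set(), 0
    for i, p in enumerate(refs):
        if p in frames:
            continue
        faults += 1
        if len(frames) == k:
            rest = refs[i + 1:]
            far = lambda q: rest.index(q) if q in rest else len(rest)
            frames.discard(max(frames, key=far))   # farthest next use goes
        frames.add(p)
    return faults

refs = [1, 2, 3] * 6              # the adversarial cyclic sequence, k = 2
print(lru_faults(refs, 2))        # 18: every request misses
print(opt_faults(refs, 2))        # 10: roughly every second request misses
```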

41 Replacement in Set-Associative Cache Which of the n ways within the set should be replaced? FIFO Random LRU (In the slide's example, the accessed locations are D, E, A.)

42 Writing Data If the location is in the cache, the cached value and possibly the value in physical memory must be updated. If the location is not in the cache, it may be loaded into the cache or not (write-allocate and write-no-allocate) Two methodologies: 1. Write-through Physical memory always contains the correct value 2. Write-back The value is written to physical memory only when it is removed from the cache

43 Cache Performance Cache hits and cache misses. The hit ratio h is the fraction of memory accesses that are served from the cache Average memory access time: T_M = h·T_C + (1 − h)·T_P In the examples that follow, T_C = 10 ns and T_P = 60 ns.
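
The formula as a one-line helper, evaluated with the slide's T_C and T_P and the hit ratios worked out on the next slides:

```python
def avg_access_time(h, t_cache, t_penalty):
    """T_M = h * T_C + (1 - h) * T_P, as defined above."""
    return h * t_cache + (1 - h) * t_penalty

# With T_C = 10 ns and T_P = 60 ns:
print(avg_access_time(7 / 18, 10, 60))   # ~40.56 ns (associative, FIFO)
print(avg_access_time(3 / 18, 10, 60))   # ~51.67 ns (direct mapped)
```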

44 Associative Cache Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0 FIFO replacement, T_C = 10 ns, T_P = 60 ns: h = 7/18 ≈ 0.389, T_M ≈ 40.56 ns

45 Direct-Mapped Cache Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0 T_C = 10 ns, T_P = 60 ns: h = 3/18 ≈ 0.167, T_M ≈ 51.67 ns

46 2-Way Set Associative Cache Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0 LRU replacement, T_C = 10 ns, T_P = 60 ns: h = 7/18 ≈ 0.389, T_M ≈ 40.56 ns

47 Associative Cache (FIFO Replacement Policy) Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0 With 8 block frames, the first 8 distinct blocks (A–H) fill the cache; the repeated accesses to A, B, A, C, D, B and C (accesses 4, 6, 9, 10, 11, 12 and 14) are hits. Access I then evicts A, the oldest block, so the final accesses to A and B miss again. Hit ratio = 7/18
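
The 7/18 figure can be reproduced with a short trace replay (a sketch assuming 8 block frames, as the slide's table suggests):

```python
from collections import deque

def fifo_cache_trace(accesses, num_blocks):
    """Replay the slide's access trace on a fully associative FIFO cache."""
    cache, order, hits = set(), deque(), 0
    for block in accesses:
        if block in cache:
            hits += 1
        else:
            if len(cache) == num_blocks:
                cache.discard(order.popleft())   # FIFO victim: oldest block
            cache.add(block)
            order.append(block)
    return hits

trace = ["A", "B", "C", "A", "D", "B", "E", "F", "A",
         "C", "D", "B", "G", "C", "H", "I", "A", "B"]
hits = fifo_cache_trace(trace, 8)
print(hits, "/", len(trace))   # 7 / 18
```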

48 Two-Way Set Associative Cache (LRU Replacement Policy) Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0 Each address maps to one of four sets, with two ways per set and LRU replacement within a set. Hits occur on accesses 4 (A0), 6 (B0), 10 (C2), 11 (D1), 14 (C2), 17 (A0) and 18 (B0). Hit ratio = 7/18

49 Associative Cache with 2-Byte Line Size (FIFO Replacement Policy) Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0 Lines pair the blocks: A and J; B and D; C and G; E and F; I and H. Fetching one block also brings in its line partner, so a first access to a partner block can hit. Hit ratio = 11/18

50 Direct-Mapped Cache with 2-Byte Line Size Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0 Same line pairs: A and J; B and D; C and G; E and F; I and H. Blocks that share a cache slot (such as A and B) keep evicting each other, so the larger line size no longer helps. Hit ratio = 7/18

51 Two-Way Set Associative Cache with 2-Byte Line Size Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0 Same line pairs: A and J; B and D; C and G; E and F; I and H. With two ways per set, most of the direct-mapped conflicts disappear. Hit ratio = 12/18

52 Page Replacement – FIFO FIFO is simple to implement –When a page comes in, place its id at the end of the list –Evict the page at the head of the list Might be good? The page to be evicted has been in memory the longest time But? –Maybe it is being used –We just don't know FIFO suffers from Belady's Anomaly – the fault rate may increase when there is more physical memory!

53 Page Replacement Policy Working Set: –Set of pages used actively and heavily –Kept in memory to reduce page faults The set is found/maintained dynamically by the OS Replacement: the OS tries to predict which page would have the least impact on the running program Common replacement schemes: Least Recently Used (LRU) First-In-First-Out (FIFO)

54 Replacement Policy –Which page is replaced? –The page removed should be the page least likely to be referenced in the near future –Most policies predict future behavior on the basis of past behavior

55 Replacement Policy Frame Locking –If a frame is locked, it may not be replaced –Used for: the kernel of the operating system, control structures, I/O buffers –Associate a lock bit with each frame

56 Basic Replacement Algorithms Optimal policy –Selects for replacement that page for which the time to the next reference is the longest –Impossible to have perfect knowledge of future events

57 Basic Replacement Algorithms Least Recently Used (LRU) –Replaces the page that has not been referenced for the longest time –By the principle of locality, this should be the page least likely to be referenced in the near future –Each page could be tagged with the time of last reference. This would require a great deal of overhead.

58 Basic Replacement Algorithms First-in, first-out (FIFO) –Treats page frames allocated to a process as a circular buffer –Pages are removed in round-robin style –Simplest replacement policy to implement –Page that has been in memory the longest is replaced –These pages may be needed again very soon

59 Basic Replacement Algorithms Clock Policy –Additional bit called a use bit –When a page is first loaded in memory, the use bit is set to 1 –When the page is referenced, the use bit is set to 1 –When it is time to replace a page, the first frame encountered with the use bit set to 0 is replaced. –During the search for replacement, each use bit set to 1 is changed to 0
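
A sketch of the clock policy as just described (the class name is hypothetical): a use bit per frame and a single hand that clears bits as it searches:

```python
class ClockPager:
    """Clock replacement: a use bit per frame, one moving hand."""
    def __init__(self, num_frames):
        self.frames = [None] * num_frames
        self.use = [0] * num_frames
        self.hand = 0
        self.where = {}                       # page -> frame index

    def access(self, page):
        if page in self.where:
            self.use[self.where[page]] = 1    # referenced: set use bit to 1
            return True                       # hit
        while self.use[self.hand] == 1:       # clear use bits while searching
            self.use[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.frames)
        old = self.frames[self.hand]          # first frame with use bit 0
        if old is not None:
            del self.where[old]
        self.frames[self.hand] = page
        self.use[self.hand] = 1               # newly loaded: use bit set to 1
        self.where[page] = self.hand
        self.hand = (self.hand + 1) % len(self.frames)
        return False                          # fault
```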

60

61 Page Replacement Policies Upon Replacement –Need to know whether to write the data back –Add a dirty bit Dirty bit = 0: page is clean, no write back needed Dirty bit = 1: page is dirty, write it back

