Outline Objective for today: –Replacement policies

Policies for Paged Virtual Memory
The OS tries to minimize the page fault costs incurred by all processes, balancing fairness, system throughput, etc.
(1) Fetch policy: When are pages brought into memory?
–prepaging: reduce page faults by bringing pages in before they are needed
–on demand: in direct response to a page fault
(2) Replacement policy: How and when does the system select victim pages to be evicted/discarded from memory?
(3) Placement policy: Where are incoming pages placed? Which frame?
(4) Backing storage policy: Where does the system store evicted pages? When is the backing storage allocated? When does the system write modified pages to backing store?
–clustering: reduce seeks on backing storage

Page Replacement
When there are no free frames available, the OS must replace a page (the victim), removing it from memory to reside only on disk (backing store), writing its contents back if they have been modified since they were fetched (dirty).
Replacement algorithm: the goal is to choose the best victim, with the metric for "best" (usually) being a reduced fault rate.

Terminology
Each thread/process/job utters a stream of page references.
–Model execution as a page reference string: e.g., abcabcdabce...
The OS tries to minimize the number of faults incurred.
–The set of pages actively used by each job (its working set) changes relatively slowly.
–Try to arrange for the resident set of pages for each active job to closely approximate its working set.
Replacement policy is the key.
–It determines the resident subset of pages.

Page Reference String
[figure: an address trace mapped down to a page reference string]
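As a concrete illustration (not from the slides), here is a minimal C sketch of how an address trace becomes a page reference string, assuming 4 KB pages; the trace format (one hex virtual address per line on stdin) is an assumption.

    #include <stdio.h>

    #define PAGE_SHIFT 12                    /* assumed page size: 4 KB */

    int main(void) {
        unsigned long addr;
        long last = -1;
        while (scanf("%lx", &addr) == 1) {   /* one virtual address per line */
            long page = (long)(addr >> PAGE_SHIFT);  /* page = addr / page size */
            if (page != last) {              /* collapse consecutive repeats */
                printf("%ld ", page);
                last = page;
            }
        }
        printf("\n");
        return 0;
    }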

An Example (Artificially Small Page Size)
    #define dim
    int A[dim][dim], B[dim][dim];
    int main() {
      int i, j, k;
      for (i = 0; i < dim; i++)
        for (j = 0; j < dim; j++)
          B[i][j] = A[i][j];
      exit(0);
    }
Compiler output for the loops (MIPS, excerpt):
      sw $0,16($fp)
    $L10:
      lw $2,16($fp)
      slt $3,$2,dim
      bne $3,$0,$L13
      j $L11
    $L13:
      sw $0,20($fp)
      ...
[figure: layout of arrays A and B in the address space, indexed by i and j]

Example Page Reference String
[figure: the page reference string generated by the array-copy example; inner groups of references repeat m times and the enclosing pattern repeats n times as the loops walk A and B]

Assessing Replacement Algorithms
Model program execution as a reference string.
The metric of algorithm performance is the fault rate.
Compare against the baseline of the Optimal algorithm.
For a specific algorithm:
–What is the information needed?
–How is that information gathered?
–When is it acted upon?
  at each memory reference
  at fault time
  at periodic intervals
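To make the assessment concrete, here is a hedged C sketch of a fault-counting harness over a reference string; NFRAMES and the choose_victim() policy hook are illustrative names, not from the slides.

    #include <stdbool.h>
    #include <stddef.h>

    #define NFRAMES 3                        /* illustrative frame count */

    /* policy hook: pick a frame index to evict (FIFO, LRU, ...) */
    extern size_t choose_victim(const int frames[], size_t nframes);

    size_t count_faults(const int refs[], size_t nrefs) {
        int frames[NFRAMES];
        size_t used = 0, faults = 0;
        for (size_t i = 0; i < nrefs; i++) {
            bool hit = false;
            for (size_t f = 0; f < used; f++)
                if (frames[f] == refs[i]) { hit = true; break; }
            if (hit) continue;               /* resident: no fault */
            faults++;
            if (used < NFRAMES)
                frames[used++] = refs[i];    /* fill a free frame */
            else
                frames[choose_victim(frames, used)] = refs[i];  /* evict */
        }
        return faults;                       /* fault rate = faults / nrefs */
    }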

Managing the VM Page Cache
1. Pages are typically referenced by page table entries. The OS must invalidate the PTE before reusing the frame.
2. Reads and writes are implicit; the TLB hides them from the OS. How can we tell if a page is dirty? How can we tell if a page is referenced?
3. The page cache manager must run its policies periodically, sampling page state.
–Continuously push dirty pages to disk to launder them.
–Continuously check references to judge how hot each page is.
–Balance accuracy with sampling overhead.

Replacement Algorithms
Assume a fixed number of frames in memory assigned to this process:
–Optimal: the baseline for comparison; future references are known in advance; replace the page used furthest in the future.
–FIFO
–Least Recently Used (LRU): a stack algorithm, so it doesn't do worse with more memory.
–LRU approximations for implementation: Clock, aging register.

Example: Optimal
[figure: the optimal policy traced over the example reference string in the virtual address space]
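A hedged sketch of the optimal (Belady) choice in C: it needs the future of the reference string, which is exactly why it is only a baseline. The function name and signature are illustrative, a variant of the harness hook with the future passed in.

    #include <stddef.h>

    /* Evict the resident page whose next reference lies furthest in the
     * future (or that is never referenced again). */
    size_t opt_victim(const int frames[], size_t nframes,
                      const int refs[], size_t nrefs, size_t now) {
        size_t victim = 0, furthest = 0;
        for (size_t f = 0; f < nframes; f++) {
            size_t next = nrefs;             /* "never used again" */
            for (size_t i = now + 1; i < nrefs; i++)
                if (refs[i] == frames[f]) { next = i; break; }
            if (next > furthest) { furthest = next; victim = f; }
        }
        return victim;
    }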

FIFO
No extra hardware assistance needed; no per-reference overhead (we have no information about the actual access pattern).
At fault time: maintain a first-in, first-out queue of the pages resident in physical memory, and replace the oldest resident page.
Why it might make sense: straight-line code, sequential scans of data.
Belady's anomaly: the fault rate can increase with more memory.
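A minimal FIFO sketch, assuming frames are filled in index order as in the harness above, so a round-robin hand over the frame array is exactly first-in, first-out; the names are illustrative. It plugs into the choose_victim() hook directly.

    #include <stddef.h>

    static size_t fifo_hand = 0;      /* index of the oldest resident page */

    size_t fifo_victim(const int frames[], size_t nframes) {
        (void)frames;                 /* FIFO uses no access history at all */
        size_t v = fifo_hand;
        fifo_hand = (fifo_hand + 1) % nframes;
        return v;
    }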

Example Reference String
[figure: FIFO replacement traced over the example reference string]

LRU
At fault time: replace the resident page that was last used the longest time ago.
The idea is to track the program's temporal locality.
To implement exactly, we need to order the pages by time of most recent reference (per-reference information is needed, so hardware support is required, and it is expensive):
–timestamp pages at each reference, or perform stack operations at each reference.
Stack algorithm: doesn't suffer from Belady's anomaly; if i > j, then the set of pages held with j frames is a subset of the set held with i frames.
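A sketch of exact LRU with timestamps, to make the per-reference cost visible: lru_touch() must run on every single reference, which is precisely what real hardware will not do for us cheaply. NFRAMES and the names are illustrative.

    #include <stddef.h>

    #define NFRAMES 3

    static unsigned long clock_now = 0;      /* logical reference clock */
    static unsigned long stamp[NFRAMES];     /* last-use time per frame */

    void lru_touch(size_t f) { stamp[f] = ++clock_now; }  /* on EVERY reference */

    size_t lru_victim(size_t nframes) {
        size_t victim = 0;
        for (size_t f = 1; f < nframes; f++)
            if (stamp[f] < stamp[victim])    /* oldest stamp = least recently used */
                victim = f;
        return victim;
    }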

Example Reference String
[figure: LRU replacement traced over the example reference string]

LRU Approximations for Paging
Pure LRU and LFU are prohibitively expensive to implement.
–Most references are hidden by the TLB.
–The OS typically sees less than 10% of all references.
–Can't tweak your ordered page list on every reference.
Most systems rely on an approximation to LRU for paging.
–Implementable at reasonable cost.
–Periodically sample the reference bit on each page:
  visit the page and set its reference bit to zero;
  run the process for a while (the reference window);
  come back and check the bit again.
–Reorder the list of eviction candidates based on the sampling.

Approximations to LRU
HW support: a usage/reference bit in every PTE, set on each reference.
–Set in the TLB entry.
–Written back to the page table (in memory) on TLB replacement.
We now know whether or not a page has been used since the last time the bits were cleared.
–Algorithms differ on when and how that's done.
–Unless it's done on every reference, there will be ties.

A Page Table Entry (PTE)
[figure: PTE layout: PFN plus control bits]
–valid bit: OS sets this to tell the MMU that the translation is valid.
–write-enable bit: OS touches this to enable or disable write access for this mapping.
–reference bit: set when a reference is made through the mapping.
–dirty bit: set when a store is completed to the page (the page is modified).

Clock Algorithm
Maintain a circular queue of resident pages with a pointer to the next candidate (the clock hand).
At fault time: scan around the clock, looking for a page with a usage bit of zero (that's your victim), clearing usage bits as they are passed.
We now know whether or not a page has been used since the last time the bits were cleared.
[figure: circular queue of frames with the clock hand at the first candidate; the newest page sits just behind the hand]
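A hedged C sketch of the clock scan; usebit[] stands in for the hardware usage bits, the names are illustrative, and all frames are assumed occupied. The loop must terminate: after at most one full sweep every bit has been cleared.

    #include <stddef.h>

    #define NFRAMES 3

    static int usebit[NFRAMES];   /* stand-in for the PTE reference bits */
    static size_t hand = 0;       /* the clock hand */

    size_t clock_victim(size_t nframes) {
        for (;;) {
            if (usebit[hand] == 0) {         /* not used since last sweep */
                size_t v = hand;
                hand = (hand + 1) % nframes;
                return v;                    /* that's your victim */
            }
            usebit[hand] = 0;                /* clear as we pass: second chance */
            hand = (hand + 1) % nframes;
        }
    }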

Example Reference String
[figure: the clock algorithm traced over the example reference string; * marks frames whose usage bit is set]

Approximating a Timestamp
Maintain a supplemental data structure (a counter) for each page.
Periodically (on a regular timer interrupt), gather info from the usage bits and zero them:
    for each page i {
      if (used[i]) counter[i] = 0;
      else counter[i]++;
      used[i] = 0;
    }
At fault time, replace the page with the largest counter value (the number of intervals since its last use).

Practical Considerations
Dirty bit: modified pages require a writeback to secondary storage before the frame is free to use again.
A variation on Clock tries to maintain a healthy pool of clean, free frames:
–on a timer interrupt, scan for unused pages, move them to the free pool, and initiate writeback on dirty pages;
–at fault time, if the page is still in a frame in the pool, reclaim it; else take a free, clean frame.

The Paging Daemon
Most OSes have one or more system processes responsible for implementing the VM page cache replacement policy.
–A daemon is an autonomous system process that periodically performs some housekeeping task.
The paging daemon prepares for page eviction before the need arises.
–Wakes up when free memory becomes low.
–Cleans dirty pages by pushing them to backing store (prewrite or pageout).
–Maintains ordered lists of eviction candidates.
–Decides how much memory to allocate to the file cache, VM, etc.

FIFO with Second Chance (Mach)
Idea: do simple FIFO replacement, but give pages a second chance to prove their value before they are replaced.
–Every frame is on one of three FIFO lists: active, inactive, and free.
–The page fault handler installs new pages on the tail of the active list.
–Old pages are moved to the tail of the inactive list. The paging daemon moves pages from the head of the active list to the tail of the inactive list when demand for free frames is high. It clears the refbit and protects the inactive page in order to monitor it.
–Pages on the inactive list get a second chance: if referenced while inactive, they are reactivated to the tail of the active list.

Illustrating FIFO-2C
[figure: active list, inactive list, free list]
Consume frames from the head of the free list. If free shrinks below a threshold, kick the paging daemon to start a scan. The daemon scans a few times per second, even if not needed to restock the free list.
Inactive list scan (a sketch of the per-frame transitions follows the list):
1. Page on the inactive list has been referenced? Return it to the tail of the active list (reactivation).
2. Page at the head of the inactive list has not been referenced? page_protect it and place it on the tail of the free list.
3. Dirty page on the inactive list? Push it to disk and return it to the inactive list tail.
Restock the inactive list by pulling pages from the head of the active list: knock off the reference bit and inactivate.
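A hedged sketch of one inactive-list scan step as a per-frame state transition; start_writeback() is a hypothetical helper, and the real Mach code manipulates linked lists of frames rather than returning states.

    enum state { ACTIVE, INACTIVE, FREE };

    struct frame {
        enum state st;
        int refbit;      /* referenced while inactive? */
        int dirty;       /* modified since last writeback? */
    };

    enum state inactive_scan_step(struct frame *f) {
        if (f->refbit) {                  /* second chance: reactivate */
            f->refbit = 0;
            return f->st = ACTIVE;        /* to tail of active list */
        }
        if (f->dirty) {
            /* start_writeback(f);          hypothetical: push to disk */
            f->dirty = 0;
            return f->st = INACTIVE;      /* retry at inactive tail */
        }
        return f->st = FREE;              /* clean + unreferenced: reclaimable */
    }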

MMU Games
Vanilla demand paging:
–A valid bit that is off in the PTE means a non-resident page. The resulting page fault causes the OS to initiate a page transfer from disk.
–Protection bits in the PTE mean the page should not be accessed in that mode (usually meaning non-writable).
What else can you do with them?

A Page Table Entry (PTE)
[figure: PTE layout: PFN plus control bits]
–valid bit: OS sets this to tell the MMU that the translation is valid.
–write-enable bit: OS touches this to enable or disable write access for this mapping.
–reference bit: set when a reference is made through the mapping.
–dirty bit: set when a store is completed to the page (the page is modified).
These bits are useful in forcing an exception! They allow the OS to regain control.

Simulating Usage Bits
Turn off both the valid bit and the write-enable bit.
On the first reference, the resulting fault allows the OS to record the reference-bit information in an auxiliary data structure; then set the mapping valid so subsequent accesses go through the HW.
On the first write attempt, the protection fault allows the OS to record the dirty-bit information in the auxiliary data structure.
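The same trick can be demonstrated from user space on a typical Linux/POSIX system: map a page with no access and let the first touch fault into a handler that records a software "reference bit" and opens the page up. This is a sketch, not the in-kernel mechanism (which flips PTE bits directly); error checks are omitted, and calling mprotect() from a signal handler is not formally async-signal-safe, though it works in practice on common systems.

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static char  *region;
    static size_t region_len;
    static volatile int referenced;          /* our software "usage bit" */

    static void on_fault(int sig, siginfo_t *si, void *ctx) {
        (void)sig; (void)ctx;
        char *p = (char *)si->si_addr;
        if (p < region || p >= region + region_len)
            _exit(1);                        /* a real crash, not our page */
        referenced = 1;                      /* record the reference */
        mprotect(region, region_len, PROT_READ | PROT_WRITE); /* let it through */
    }

    int main(void) {
        region_len = (size_t)sysconf(_SC_PAGESIZE);
        region = mmap(NULL, region_len, PROT_NONE,        /* "valid bit" off */
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = on_fault;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);
        region[0] = 1;                       /* faults once, then proceeds */
        printf("referenced = %d\n", referenced);
        return 0;
    }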

Copy-on-Write
Operating systems spend a lot of their time copying data.
–Particularly Unix operating systems, e.g., fork().
–Cross-address-space copies are common and expensive.
Idea: defer big copy operations as long as possible, and hope they can be avoided completely.
–Create a new shadow object backed by an existing object.
–Shared pages are mapped read-only in the participating address spaces:
  read faults are satisfied from the original object (typically);
  write faults trap to the kernel, which makes a (real) copy of the faulted page and installs it in the shadow object with writes enabled.
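A small demonstration sketch of fork()-time copy-on-write on a modern Unix system: the megabyte is not copied at fork(); only the page the child writes is, and the parent's copy stays intact.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        size_t len = 1 << 20;               /* 1 MB */
        char *buf = malloc(len);
        memset(buf, 'a', len);

        pid_t pid = fork();                 /* no 1 MB copy happens here */
        if (pid == 0) {
            buf[0] = 'b';                   /* write fault: kernel copies one page */
            printf("child sees:  %c\n", buf[0]);
            _exit(0);
        }
        waitpid(pid, NULL, 0);
        printf("parent sees: %c\n", buf[0]);   /* still 'a' */
        free(buf);
        return 0;
    }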

Variable / Global Algorithms
Not requiring each process to live within a fixed number of frames, replacing only its own pages.
–Can apply the previously mentioned algorithms globally, victimizing any process's pages.
–Or use algorithms that make the number of frames explicit.
[figure: size of a process's locality set over time; stable phases separated by transitions]

Variable Space Algorithms
Working Set: tries to capture what the set of currently active pages is. The whole working set should be resident in memory for the process to be worth running. The WS is the set of pages referenced during a window w of time (now-w, now).
–Working Set Clock: a hybrid approximation.
Page Fault Frequency (PFF): monitor the fault rate; if pff > a high threshold, grow the number of frames allocated to this process; if pff < a low threshold, reduce it. The idea is to determine the right amount of memory to allocate. (A sketch of such a controller follows.)
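A minimal sketch of a PFF controller in C; the thresholds and names are illustrative assumptions, not from the slides.

    #include <stddef.h>

    #define PFF_HIGH 0.05   /* faults per reference above this: thrashing risk */
    #define PFF_LOW  0.01   /* below this: frames can be reclaimed */

    size_t adjust_frames(size_t nframes, size_t faults, size_t refs) {
        double pff = (double)faults / (double)refs;   /* observed fault rate */
        if (pff > PFF_HIGH) return nframes + 1;       /* grow the allocation */
        if (pff < PFF_LOW && nframes > 1) return nframes - 1;  /* shrink it */
        return nframes;
    }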

Working Set Model
The working set at time t is the set of pages referenced in the interval of time (t-w, t), where w is the working set window.
–Implies per-reference information is captured.
–How to choose w?
It identifies the active pages. Any page that is resident in memory but not in any process's working set is a candidate for replacement.
The size of the working set can vary with locality changes.
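A hedged sketch that computes the working set size directly from a reference string, measuring the window w in references rather than real or virtual time (a simplification); names are illustrative.

    #include <stdbool.h>
    #include <stddef.h>

    size_t working_set_size(const int refs[], size_t t, size_t w) {
        size_t start = (t >= w) ? t - w : 0;     /* window (t-w, t] */
        size_t size = 0;
        for (size_t i = start; i < t; i++) {
            bool seen = false;                   /* count distinct pages */
            for (size_t j = start; j < i; j++)
                if (refs[j] == refs[i]) { seen = true; break; }
            if (!seen) size++;
        }
        return size;
    }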

Example Reference String
[figure: the example reference string annotated with working-set contents as the loops walk A and B]

WSClock
The implementable approximation.
At fault time: scan the usage bits of the resident pages.
    for each page i {
      if (used[i]) {
        time_of_ref[i] = vt[owner[i]];   /* virtual time of the owning process */
        used[i] = 0;
      } else if (vt[owner[i]] - time_of_ref[i] >= w) {
        /* replaceable */
      }
      /* else still in the working set */
    }

WSClock
[figure: WSClock example; working sets {a,b,c}, {x,y}, {a,f}, {y,z}; per-process virtual times vt = 3, 7, 3, 6; real times rt = 3, 6, 10, 13, 0; window w = 2]

WSClock (the picture)
[figure: clock of resident pages for processes P0 (VT0 = 8) and P1 (VT1 = 6), each page stamped with its last reference time (ref = 2, 5, 8 for P0; ref = 3, 6 for P1); with w = 5, the P0 page with ref = 2 is no longer in the working set of P0]

Thrashing
Page faulting is dominating performance.
Causes:
–Memory is overcommitted: not enough to hold the locality sets of all processes at this level of multiprogramming.
–Lousy locality of the programs.
–Positive feedback loops reacting to paging I/O rates.
[figure: page fault frequency vs. number of frames (1 to N) for one process; too few frames gives a high fault rate, enough frames a low one, and the difference between replacement algorithms shows up in between]
Load control is important (how many slices in the frame pie?).
–Admission control strategy (limit the number of processes in the load phase).

Pros and Cons of VM
Demand paging gives the OS flexibility to manage memory:
–programs may run with pages missing; unused or cold pages do not consume real memory; this improves the degree of multiprogramming;
–program size is not limited by physical memory, and may grow (e.g., stack and heap).
...but VM takes control away from the application.
–With traditional interfaces, the application cannot tell how much memory it has or how much a given reference costs.
–Fetching pages on demand may force the application to incur I/O stalls for many of its references.

Motivation: How to exploit remote memory?
[figure: processor and cache above main memory; the I/O bottleneck sits between memory and disk; remote memory inserts a layer used for VM and file caching]
Low-latency networks make retrieving data across the network faster than from the local disk.

Distributed Shared Memory (DSM)
Allows use of a shared-memory programming model (a shared address space) in a distributed system (processors with only local memory).
[figure: processors with MMUs and local memories exchanging messages over a network]

DSM Issues
Can use the local memory management hardware to generate a fault when the desired page is not locally present, or when a write is attempted on a read-only copy.
Locate the page remotely: at the current owner of the page (the last writer) or at the page's home node.
The page is sent in a message to the requesting node (read access makes a copy; write access migrates the page).
Consistency protocol: invalidations, or broadcast of changes (update).
–A directory is kept of the caches holding copies.

DSM States
Forced faults are the key to consistency operations.
–Invalid local mapping, attempted read access: the data is copied from the most recent writer, and the write-protect bit is set for all copies.
–Invalid local mapping, attempted write access: migrate the data and invalidate all other copies.
–Local read-only copy, write fault: invalidate all other copies.
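These transitions can be summarized as a small per-page state machine; this is a hedged sketch in which fetch_copy, migrate, and invalidate_others are hypothetical helpers standing in for the messaging layer.

    enum dsm_state { DSM_INVALID, DSM_READ_ONLY, DSM_WRITABLE };

    enum dsm_state dsm_fault(enum dsm_state st, int is_write) {
        if (st == DSM_INVALID && !is_write) {
            /* fetch_copy(page): copy from the most recent writer,
               write-protect all copies */
            return DSM_READ_ONLY;
        }
        if (st == DSM_INVALID && is_write) {
            /* migrate(page); invalidate_others(page); */
            return DSM_WRITABLE;
        }
        if (st == DSM_READ_ONLY && is_write) {
            /* invalidate_others(page); upgrade the local mapping */
            return DSM_WRITABLE;
        }
        return st;   /* reads of a readable page do not fault */
    }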

Consistency Models
Sequential consistency:
–All memory operations appear to execute one at a time. A write is considered done only when invalidations or updates have propagated to all copies.
Weaker forms of consistency:
–Guarantees are associated with synchronization primitives; at other times, ordering doesn't matter.
–For example:
  acquire lock: make sure others' writes are done;
  release lock: make sure all my writes are seen by others.

Example - the Problem
P0: A = 0; ... A = 1; if (B ≠ 0) succ[0] = true;
P1: B = 0; ... B = 1; if (A ≠ 0) succ[1] = true;
[figure: the two processes on a common time axis; the anomalous outcome is that both tests observe stale zeros, i.e., A ≠ 1 and B ≠ 1 as seen by the readers]

Example - Sequential Consistency
P0: A = 0; ... A = 1; if (B ≠ 0) succ[0] = true;
P1: B = 0; ... B = 1; if (A ≠ 0) succ[1] = true;
[figure: the same two processes on a time axis, with the propagation of B = 1 delayed past P0's test and the propagation of A = 1 delayed past P1's test]

Example - Weaker Consistency
P0: A = 0; ... A = 1; acquire(mutex0); if (B ≠ 0) succ[0] = true; release(mutex0);
P1: B = 0; ... B = 1; acquire(mutex1); if (A ≠ 0) succ[1] = true; release(mutex1);

False Sharing
Page bouncing sets in when there are too-frequent coherency operations. One cause: false sharing.
[figure: a single page, half used by one processor and the other half by another, yet packaged and managed as one page]
Subpage management?