
1 Virtual Memory
Virtual Memory – Gordon College – Stephen Brinton
GOAL OF MEMORY MANAGEMENT: keep many processes in memory simultaneously to allow multiprogramming. CON: the entire process must be in memory to execute.
GOAL OF VIRTUAL MEMORY: allow a process to execute even if it is not entirely in memory. PRO: a program can be larger than physical memory. PRO: frees the programmer from memory-limitation concerns.

2 Virtual Memory
Background
Demand Paging
Process Creation
Page Replacement
Allocation of Frames
Thrashing
Demand Segmentation
Operating System Examples
(Notes: without virtual memory, the entire process must be in memory to execute, forcing extra precautions by the programmer; virtual memory lifts this restriction, letting programs be larger than physical memory.)

3 Background
Virtual memory – separation of user logical memory from physical memory.
Only part of the program needs to be in memory for execution: some parts of a program may never be needed, and rarely are all parts needed at the same time.
Logical address space can therefore be much larger than physical address space (easier for the programmer).
Allows address spaces to be shared by several processes.
Allows for more efficient process creation.
Less I/O is needed to swap processes in and out of memory.
Virtual memory can be implemented via: demand paging; demand segmentation.

4 Larger Than Physical Memory
The view of the process (and the programmer): one LARGE AND CONTIGUOUS address space. The stack grows down and the heap grows up – the hole between them is needed only if it is used.

5 Shared Library Using Virtual Memory
Shared libraries – DLLs (dynamic and compile-time shared libraries) – can be mapped into the address spaces of several processes. Memory can also be shared directly (recall Chapter 3). Pages can be shared during process fork or thread creation, which speeds up process creation.

6 Demand Paging Bring a page into memory only when it is needed
Benefits:
Less I/O needed
Less memory needed
Faster response
More users (processes) able to execute
When a page is needed (i.e., referenced):
invalid reference → abort
not in memory → bring it into memory
The pager guesses which pages will be used before the process finishes; THOSE pages are swapped in (like swapping with contiguous memory allocation, but at page granularity).
Valid-invalid hardware is needed:
valid: the page is legal and in memory
invalid: the page is either not legal or currently out of memory
Access to an invalid page → page-fault trap

7 Valid-Invalid Bit
With each page-table entry a valid-invalid bit is associated (1 → in memory, 0 → not in memory).
Initially the valid-invalid bit is set to 0 on all entries.
During address translation, if the valid-invalid bit in the page-table entry is 0 → page-fault trap.
(Figure: a page-table snapshot listing each entry's frame number and its valid-invalid bit.)
valid: the page is legal and in memory
invalid: the page is either not legal or currently out of memory
Access to an invalid page → page-fault trap
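(Illustration, not from the slides: a minimal C sketch of how the valid-invalid bit can gate address translation. The struct layout, page size, and function names are all assumptions for this sketch.)

    #include <stdbool.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096

    /* Hypothetical page-table entry: frame number plus valid-invalid bit. */
    struct pte {
        unsigned frame;  /* physical frame number */
        bool     valid;  /* 1 = legal and in memory, 0 = illegal or paged out */
    };

    /* Translate a virtual address; -1 stands in for the page-fault trap. */
    long translate(const struct pte *table, unsigned long vaddr) {
        unsigned long page   = vaddr / PAGE_SIZE;
        unsigned long offset = vaddr % PAGE_SIZE;
        if (!table[page].valid)
            return -1;  /* page fault: the OS aborts or brings the page in */
        return (long)table[page].frame * PAGE_SIZE + offset;
    }

    int main(void) {
        struct pte table[4] = { {9, true}, {0, false}, {3, true}, {0, false} };
        printf("%ld\n", translate(table, 2 * PAGE_SIZE + 17)); /* 3*4096+17 */
        printf("%ld\n", translate(table, 1 * PAGE_SIZE));      /* -1: fault */
        return 0;
    }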

8 Page Table: Some Pages Are Not in Main Memory
(Figure: the page table when some pages are not in main memory – only some pages are MEMORY RESIDENT.)

9 Page-Fault Trap
A reference to a page with its invalid bit set traps to the OS → page fault.
The OS must decide:
Invalid reference → abort.
Just not in memory → continue:
1. Get an empty frame.
2. Swap the page into the frame.
3. Reset the tables, setting the valid bit to 1.
4. Restart the instruction. (What happens if the fault occurs in the middle of an instruction?)
PURE DEMAND PAGING: never bring a page into memory until it is required.
LOCALITY OF REFERENCE – references tend to cluster near each other, which is why demand paging performs well.
HARDWARE needed: a page table, and secondary storage – swap space (to swap pages in and out of memory).

10 Steps in Handling a Page Fault

11 What happens if there is no free frame?
Page replacement – find some page in memory that is not really in use and swap it out.
Algorithm performance – we want an algorithm that results in the minimum number of page faults.
Note: the same page may be brought into memory several times.

12 Performance of Demand Paging
Page-fault rate: 0 ≤ p ≤ 1 (probability of a page fault)
if p = 0, no page faults
if p = 1, every reference is a fault
Effective Access Time (EAT):
EAT = (1 – p) × memory access time + p × (page-fault overhead + [swap page out] + swap page in + restart overhead)
Memory access time (ma) on most computers: 10 to 200 ns. If p = 0, then EAT = memory access time.
THREE MAJOR COMPONENTS OF PAGE-FAULT SERVICE TIME: service the page-fault interrupt, read in the page, restart the process.
FAULT SEQUENCE:
1. Trap to the OS.
2. Save the process state.
3. Determine that the interrupt was a page fault.
4. Check that the page reference was legal but the page is not present; determine its location on disk.
5. Issue a read from disk into a free frame.
6. I/O wait: while the device seeks and transfers the page, let another process have the CPU.
7. Receive the interrupt from the disk I/O.
8. Save the current state of the other process.
9. Determine that the interrupt was from the disk.
10. Correct the page table.
11. Wait for the CPU to be allocated to this process again.
12. Restore the state and restart the process.

13 Demand Paging Example
Memory access time = 200 ns; page-fault service time ≈ 8 ms.
(50% of the time the page being replaced has been modified and therefore must first be swapped out; this is folded into the service time.)
EAT = (1 – p) × 200 + p × 8,000,000
    = 200 + 7,999,800 × p   (in ns)
If we want performance degradation to be less than 10 percent:
220 > 200 + 7,999,800 × p  ⇒  p < 0.0000025
We can allow fewer than one memory access out of 399,990 to page-fault.
Swap space on disk is faster to access than the file system, so paging is done to swap space.
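(Illustration, not from the slides: the same EAT arithmetic in a few lines of C, using the numbers above.)

    #include <stdio.h>

    /* Effective access time for demand paging with the slide's numbers:
       200 ns memory access, 8 ms = 8,000,000 ns page-fault service time. */
    int main(void) {
        double ma    = 200.0;          /* memory access time, ns */
        double fault = 8000000.0;      /* page-fault service time, ns */
        double p     = 1.0 / 399990.0; /* one fault per 399,990 accesses */
        double eat   = (1 - p) * ma + p * fault;
        printf("EAT = %.1f ns\n", eat); /* 220.0 ns: 10 percent degradation */
        return 0;
    }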

14 Process Creation Virtual memory allows other benefits during process creation: - Copy-on-Write - Memory-Mapped Files

15 Copy-on-Write Copy-on-Write (COW) allows both parent and child processes to initially share the same pages in memory If either process modifies a shared page, only then is the page copied COW allows more efficient process creation as only modified pages are copied Free pages are allocated from a pool of zeroed-out pages Shared pages are marked as COW – if either process writes to a shared page, a copy of the shared page is created. WIN XP, Linux

16 Need: Page Replacement
Prevent over-allocation of memory by modifying the page-fault service routine to include page replacement.
Use the modify (dirty) bit to reduce the overhead of page transfers – only modified pages are written back to disk.
Over-allocating memory increases our degree of multiprogramming (like airlines overbooking flights).
Page replacement completes the separation between logical memory and physical memory – a large virtual memory can be provided on top of a smaller physical memory.

17 Basic Page Replacement
1. Find the location of the desired page on disk.
2. Find a free frame:
   - If there is a free frame, use it.
   - If there is no free frame, use a page-replacement algorithm to select a victim frame (writing the victim back to disk only if it is dirty).
3. Read the desired page into the (newly) free frame. Update the page and frame tables.
4. Restart the process.
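(Illustration, not from the slides: a toy C simulation of these steps, with FIFO victim selection and the dirty-bit optimization from the previous slide. Disk I/O is reduced to printfs; all names are invented for the sketch.)

    #include <stdio.h>

    #define NFRAMES 3

    /* Resident pages live in frames[]; a victim is written back only if
       its dirty bit is set (step 2), then the new page is read in (steps
       1 and 3) and the tables updated; step 4 (restart) is implicit. */
    struct frame { int page; int dirty; };
    struct frame frames[NFRAMES];
    int used = 0, next_victim = 0;

    void reference(int page, int is_write) {
        for (int i = 0; i < used; i++)
            if (frames[i].page == page) {      /* resident: no fault */
                frames[i].dirty |= is_write;
                return;
            }
        int f;
        if (used < NFRAMES) f = used++;        /* step 2: free frame */
        else {
            f = next_victim;                   /* step 2: FIFO victim */
            next_victim = (next_victim + 1) % NFRAMES;
            if (frames[f].dirty)
                printf("write page %d to disk\n", frames[f].page);
        }
        printf("read page %d from disk\n", page);   /* steps 1 and 3 */
        frames[f].page = page;
        frames[f].dirty = is_write;
    }

    int main(void) {
        reference(1, 1); reference(2, 0); reference(3, 0);
        reference(4, 0);   /* evicts page 1, which is dirty: write-back */
        return 0;
    }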

18 Page Replacement Algorithms
GOAL: lowest page-fault rate.
Evaluate an algorithm by running it on a particular string of memory references (a reference string) and computing the number of page faults on that string.
Example string: 1,4,1,6,1,6,1,6,1,6,1
with 3 or more frames we have only 3 faults (once pages 1, 4, and 6 are loaded, every later reference hits);
with 1 frame we would have 11 page faults (every reference differs from the previous one).

19 Graph of Page Faults Versus The Number of Frames
As the number of frames available increases, the number of page faults decreases.

20 First-In-First-Out (FIFO) Algorithm
Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
3 frames (3 pages can be in memory at a time per process): 9 page faults.
4 frames: 10 page faults.
FIFO replacement exhibits Belady's Anomaly: more frames → more page faults. Belady's Anomaly – increasing the number of allocated frames can increase the number of page faults.
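(Illustration, not from the slides: a short C program that counts FIFO faults for this reference string and reproduces Belady's anomaly – 9 faults with 3 frames, 10 with 4.)

    #include <stdio.h>

    /* Count FIFO page faults for a reference string with nframes frames. */
    int fifo_faults(const int *refs, int n, int nframes) {
        int frames[16], used = 0, next = 0, faults = 0;
        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int j = 0; j < used; j++)
                if (frames[j] == refs[i]) hit = 1;
            if (hit) continue;
            faults++;
            if (used < nframes) frames[used++] = refs[i];
            else { frames[next] = refs[i]; next = (next + 1) % nframes; }
        }
        return faults;
    }

    int main(void) {
        int refs[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
        printf("3 frames: %d faults\n", fifo_faults(refs, 12, 3)); /* 9  */
        printf("4 frames: %d faults\n", fifo_faults(refs, 12, 4)); /* 10 */
        return 0;
    }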

21 FIFO Page Replacement
Reference string (the full textbook string): 7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1 with 3 frames.
How many faults? 15 faults.

22 FIFO Illustrating Belady’s Anomaly

23 Optimal Algorithm
Goal: replace the page that will not be used for the longest period of time.
4-frame example, reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5: 6 page faults.
How do you know which page that is? You can't – an implementation does not exist!
The optimal algorithm never suffers from Belady's anomaly.
It is used for measuring how well your algorithm performs: "Well, is it at least 4% as good as the Optimal Algorithm?"

24 Optimal Page Replacement
Optimal replacement on the earlier reference string: 9 faults; FIFO: 15 faults – a 67% increase over optimal.
Requires FUTURE knowledge of the reference string – therefore (just like SJF scheduling) IMPOSSIBLE TO IMPLEMENT.
THEREFORE it is used for comparison studies: e.g., "an algorithm is good because it is within 12.3% of optimal at worst and within 4.7% on average."

25 Least Recently Used (LRU) Algorithm
Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 (8 faults with 4 frames)
Counter implementation:
Every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter (counter = time of last use).
When a page needs to be replaced, look at the counters to find the least recently used page.
Is an approximation of optimal possible? FIFO uses the time when a page was brought into memory; OPT uses the time when a page will next be used; LRU uses the time of last use – looking backward instead of forward.
How do you deal with overflow of the clock?
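(Illustration, not from the slides: the counter implementation in C – a logical clock stamps each page on use, and the smallest stamp picks the victim. On this reference string with 4 frames it reports 8 faults.)

    #include <stdio.h>

    #define NFRAMES 4

    int pages[NFRAMES], last_used[NFRAMES], used = 0, now = 0;

    int lru_access(int page) {               /* returns 1 on a page fault */
        now++;                               /* tick the logical clock */
        for (int i = 0; i < used; i++)
            if (pages[i] == page) { last_used[i] = now; return 0; }
        int f = 0;
        if (used < NFRAMES) f = used++;
        else                                 /* evict least recently used */
            for (int i = 1; i < NFRAMES; i++)
                if (last_used[i] < last_used[f]) f = i;
        pages[f] = page;
        last_used[f] = now;
        return 1;
    }

    int main(void) {
        int refs[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5}, faults = 0;
        for (int i = 0; i < 12; i++) faults += lru_access(refs[i]);
        printf("LRU faults: %d\n", faults);  /* 8 */
        return 0;
    }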

26 LRU Page Replacement Often used and considered to be good.
(Figure: shows the advancement of the counter on each reference.)

27 LRU Algorithm (Cont.)
Stack implementation – keep a stack of page numbers in doubly-linked form (a deque with head and tail pointers):
When a page is referenced, move it to the top (most recently used); the least recently used page is always at the bottom.
Worst case: 6 pointers must be changed (the moved entry's two links, its old neighbors' links – including B's previous link in the figure – and the head).
No search is needed to find the replacement victim.
Does not suffer from Belady's anomaly.
Must use special HARDWARE support, since every memory reference updates the stack.
(Figure: pages A, B, C linked between head and tail, before and after B is moved to the head; the numbered arrows mark the pointer updates.)

28 Use Of A Stack to Record The Most Recent Page References
Must have HARDWARE assistance to handle the overhead associated with these algorithms

29 LRU Approximation Algorithms
Reference bit:
With each page associate a bit, initially 0.
When the page is referenced, the bit is set to 1.
Replacement: choose a page whose bit is 0 (if one exists). We do not know the order of use, however.
Additional-reference-bits algorithm: keep one byte per page; at regular intervals (e.g., every 100 ms) the OS shifts the reference bit into the high-order bit of the byte, shifting the other bits right. A higher value means more recently used.
Second chance (clock replacement):
Needs the reference bit.
If the page to be replaced (in clock order) has reference bit = 1, then: set the reference bit to 0, leave the page in memory, and consider the next page (in clock order), subject to the same rules.
The HARDWARE needed for true LRU is hardly ever available – but a reference bit is often provided.

30 Second-Chance (clock) Page-Replacement Algorithm
A reference to the page sets the reference bit.
Basic idea: FIFO order, but if refbit = 1, give the page a second chance (clear the refbit and move to the next page); if refbit = 0, replace it.
The NEW page is inserted at the same position as the victim page.
Enhanced second-chance algorithm (uses the reference bit and dirty bit as an ordered pair):
(0,0) – neither recently used nor dirty: best page to replace (PREFERRED)
(0,1) – not recently used but dirty: must be written to disk before replacement
(1,0) – recently used but clean
(1,1) – recently used and dirty
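(Illustration, not from the slides: a C sketch of the clock algorithm – the hand sweeps the circular frame list, clearing set reference bits until it finds a clear one to evict.)

    #include <stdio.h>

    #define NFRAMES 3

    int pages[NFRAMES], refbit[NFRAMES], used = 0, hand = 0;

    int access_page(int page) {              /* returns 1 on a page fault */
        for (int i = 0; i < used; i++)
            if (pages[i] == page) { refbit[i] = 1; return 0; }
        if (used < NFRAMES) {                /* free frame available */
            pages[used] = page;
            refbit[used++] = 1;
            return 1;
        }
        while (refbit[hand]) {               /* second chance: clear & skip */
            refbit[hand] = 0;
            hand = (hand + 1) % NFRAMES;
        }
        pages[hand] = page;                  /* victim found: replace */
        refbit[hand] = 1;
        hand = (hand + 1) % NFRAMES;
        return 1;
    }

    int main(void) {
        int refs[] = {1, 2, 3, 2, 4, 1}, faults = 0;
        for (int i = 0; i < 6; i++) faults += access_page(refs[i]);
        printf("faults: %d\n", faults);      /* 5 */
        return 0;
    }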

31 Counting Algorithms
Keep a counter of the number of references that have been made to each page.
LFU (least frequently used) algorithm: replaces the page with the smallest count, on the argument that a large count indicates an actively used page. (Not commonly used, due to overhead.)
MFU (most frequently used) algorithm: based on the argument that the page with the smallest count was probably just brought in and has yet to be used.
Other schemes – page-buffering algorithms:
Keep a pool of free frames: the needed page is read into a free frame first and the victim written out later, so the program can restart quicker.
Keep a list of modified pages – whenever the paging device is idle, write out modified pages and clear their dirty bits.

32 Allocation of Frames
Each process needs a minimum number of frames, which depends on the computer architecture.
Example: the IBM 370 needs 6 pages to handle the SS MOVE instruction:
the instruction is 6 bytes and might span 2 pages;
2 pages to handle the from operand;
2 pages to handle the to operand.
Two major allocation schemes: fixed allocation; priority allocation.
How do we allocate the fixed amount of free memory among the various processes?
When a process is restarted, it must have enough frames for all the pages a single instruction can reference.
Minimum number of frames: depends on the computer architecture.
Maximum number of frames: depends on the amount of available physical memory.

33 Fixed Allocation
Equal allocation – for example, if there are 100 frames and 5 processes, give each process 20 frames.
Proportional allocation – allocate according to the size of the process: with s_i the size of process p_i, S = Σ s_i the total size of all processes, and m the total number of frames, process p_i gets a_i = (s_i / S) × m frames.
PRIORITY ALLOCATION: use a proportional allocation scheme based on priorities rather than size.
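(Worked example with assumed numbers: m = 62 free frames and two processes with s1 = 10 and s2 = 127, so S = 137. Then a1 = 10/137 × 62 ≈ 4 frames and a2 = 127/137 × 62 ≈ 57 frames.)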

34 Global vs. Local Allocation
Global replacement – a process selects a replacement frame from the set of all frames: "one process can take a frame from another."
Con: a process cannot control its own page-fault rate.
Pro: makes less-used pages of memory available to whoever needs them.
Generally results in greater throughput, and is therefore the more common method.
Local replacement – each process selects only from its own set of allocated frames.
With priority allocation, if process Pi generates a page fault, select for replacement one of its own frames or a frame from a process with a lower priority number.

35 Thrashing
If the number of frames a process has drops below the minimum required by the architecture, the process must be suspended – swapped in and out entirely, a level of intermediate CPU scheduling.
Thrashing ≡ a process is busy swapping pages in and out: more time is spent swapping than executing.
If a process has fewer frames than its active pages require, it faults, replaces, faults, replaces, and so on.

36 Thrashing (Cont.)
Consider this spiral: CPU utilization is low, so the system increases the number of processes.
A process needs more frames and takes them from other processes.
Those other processes must swap their pages back in, and therefore wait.
The ready queue shrinks, so the system thinks it needs even more processes.
In short: if a process does not have "enough" pages, its page-fault rate is very high. This leads to:
1. low CPU utilization;
2. the operating system thinking it needs to increase the degree of multiprogramming;
3. another process being added to the system.
TO INCREASE CPU UTILIZATION in this state, we must instead decrease the degree of multiprogramming.

37 Demand Paging and Thrashing
Why does demand paging work? The locality model: as a process executes, it moves from locality to locality; localities may overlap.
Why does thrashing occur? The sizes of all the localities added together exceed the total memory size.
(Demand paging is what we have been using.)

38 Locality In A Memory-Reference Pattern
If a process is allocated enough frames to accommodate its current locality, it will fault until the locality is in memory, and then not fault again until it changes localities.

39 Working-Set Model
Δ ≡ working-set window ≡ a fixed number of page references. Example: 10,000 instructions.
WSS_i (working-set size of process P_i) = total number of distinct pages referenced in the most recent Δ (varies in time):
if Δ is too small, it will not encompass the entire locality;
if Δ is too large, it will encompass several localities;
if Δ = ∞, it will encompass the entire program.
D = Σ WSS_i ≡ total demand for frames; m = memory size (available frames).
If D > m ⇒ thrashing.
Policy: if D > m, then suspend one of the processes.
Based on the assumption of locality; the working set is an approximation of the program's current locality.
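(Illustration, not from the slides: a C sketch that computes the working-set size over a sliding window. Δ = 10 is chosen here just for illustration, and the reference string echoes the {1,2,5,6,7} working set on the next slide.)

    #include <stdio.h>

    #define DELTA   10   /* working-set window, in references */
    #define MAXPAGE 64

    /* WSS at time t: number of distinct pages in the last DELTA references. */
    int wss(const int *refs, int t) {
        int seen[MAXPAGE] = {0}, size = 0;
        int start = (t - DELTA + 1 < 0) ? 0 : t - DELTA + 1;
        for (int i = start; i <= t; i++)
            if (!seen[refs[i]]) { seen[refs[i]] = 1; size++; }
        return size;
    }

    int main(void) {
        int refs[] = {1, 2, 5, 6, 7, 7, 7, 7, 5, 1,
                      6, 2, 3, 4, 1, 2, 3, 4, 4, 4};
        for (int t = DELTA - 1; t < 20; t++)   /* t=9: WSS=5 -> {1,2,5,6,7} */
            printf("t=%2d  WSS=%d\n", t, wss(refs, t));
        return 0;
    }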

40 Working-set model Working set at time t1 – {1,2,5,6,7}

41 Keeping Track of the Working Set
Approximate the working set with an interval timer + a reference bit.
Example: Δ = 10,000 references; the timer interrupts every 5,000 references.
Keep 2 history bits in memory for each page.
Whenever the timer interrupts: copy each page's reference bit into its history bits and reset all reference bits to 0.
If any of the bits in memory is 1 ⇒ the page is in the working set.
Why is this not completely accurate? Because a page could be referenced and fall out of use anywhere within a 5,000-reference interval – we cannot pinpoint when.
Improvement: 10 history bits and an interrupt every 1,000 references (more accurate, but more overhead).

42 Page-Fault Frequency Scheme
Establish an "acceptable" page-fault rate:
If the actual rate is too low, the process loses a frame.
If the actual rate is too high, the process gains a frame.
PAGE-FAULT FREQUENCY (PFF) is a more direct approach to controlling thrashing:
If a process's page-fault rate is high, it needs more frames (if none are available, suspend it).
If its page-fault rate is low, it can give up frames.

43 Memory-Mapped Files
Memory-mapped file I/O allows file I/O to be treated as routine memory access by mapping a disk block to a page in memory.
How? The file is initially read using demand paging: a page-sized portion of the file is read from the file system into a physical page. Subsequent reads and writes of the file are treated as ordinary memory accesses.
Simplifies file access by driving file I/O through memory rather than read()/write() system calls (less overhead).
Sharing: also allows several processes to map the same file, letting them share the pages in memory.
NOTE: writes to the file are not necessarily immediate – the system flushes modified pages periodically.
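(Illustration, not from the slides: a POSIX C sketch of memory-mapped file I/O using mmap(). It assumes a writable, non-empty file named "file.txt" exists; the Win32 equivalents appear on slide 45.)

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("file.txt", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }
        struct stat st;
        fstat(fd, &st);
        char *data = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 1; }
        data[0] = 'X';             /* ordinary store, no write() call   */
        putchar(data[1]);          /* ordinary load, paged in on demand */
        munmap(data, st.st_size);  /* dirty pages are flushed lazily    */
        close(fd);
        return 0;
    }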

44 Memory Mapped Files

45 WIN32 API
Steps:
1. Create a file mapping for the file.
2. Establish a view of the mapped file in the process's virtual address space.
A second process can then open and create a view of the mapped file in its own virtual address space.
See the example called "threadsharedfile".

46 Other Issues -- Prepaging
To reduce the large number of page faults that occur at process startup:
Prepage all or some of the pages a process will need, before they are referenced.
But if prepaged pages go unused, I/O and memory were wasted.
Assume s pages are prepaged and a fraction α of them is actually used:
Is the cost of the s × α saved page faults greater or less than the cost of prepaging the s × (1 – α) unnecessary pages?
If α is near zero, prepaging loses; roughly speaking, if prepaging a page costs about the same as servicing a fault, prepaging wins when α > 0.5.

47 Other Issues – Page Size
Page size selection must take into consideration:
fragmentation – smaller pages mean less internal fragmentation;
table size – larger pages mean a smaller page table;
I/O overhead – the time required to read or write a page favors larger pages;
locality – smaller pages give better resolution, matching a locality more precisely.

48 Other Issues – Program Structure
int data[128][128];
Each row is stored in one page (pages are 128 words; storage is row-major: data[0][0], data[0][1], … – so each row occupies exactly one page).
Program 1:
    for (j = 0; j < 128; j++)
        for (i = 0; i < 128; i++)
            data[i][j] = 0;
128 × 128 = 16,384 page faults (assuming the process has fewer than 128 frames: each inner-loop step touches a different row, hence a different page).
Program 2:
    for (i = 0; i < 128; i++)
        for (j = 0; j < 128; j++)
            data[i][j] = 0;
128 page faults (one per row).
A stack has a good locality factor; hashing has bad locality. The compiler and loader can have a significant effect on paging. Studies suggest OO programs have poorer locality of reference (pointers create opportunities for poor locality).

49 Other Issues – I/O interlock
I/O interlock – pages must sometimes be locked into memory.
Consider I/O: pages being used to copy a file from a device must be locked so the page-replacement algorithm cannot select them for eviction mid-transfer.

50 Reason Why Frames Used For I/O Must Be In Memory

51 Other Issues – TLB Reach
TLB reach – the amount of memory accessible from the TLB:
TLB reach = (TLB size) × (page size)
Ideally, the working set of each process is stored in the TLB; otherwise the process suffers a high degree of TLB misses, each resolved through the page table.
To increase TLB reach:
Increase the page size. This may lead to an increase in fragmentation, as not all applications require a large page size.
Provide multiple page sizes. This allows applications that require larger page sizes to use them without an increase in fragmentation for everyone else.
(TLB – translation look-aside buffer, a cache of page-table entries.)
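(Worked example with assumed sizes: a 64-entry TLB with 4 KB pages reaches only 64 × 4 KB = 256 KB; the same TLB with 2 MB pages reaches 64 × 2 MB = 128 MB.)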

