Presentation on theme: "Memory Management". Presentation transcript:
Background Memory consists of a large array of words or bytes, each with its own address. The CPU fetches instructions from memory according to the value of the program counter. These instructions may cause additional loading from and storing to specific memory addresses. Memory unit sees only a stream of memory addresses. It does not know how they are generated. Program must be brought into memory and placed within a process for it to be run. Input queue – collection of processes on the disk that are waiting to be brought into memory for execution. User programs go through several steps before being run.
Multistep Processing of a User Program
Binding of Instructions and Data to Memory Address binding of instructions and data to memory addresses can happen at three different stages. Compile time: If the memory location is known a priori, absolute code can be generated; the code must be recompiled if the starting location changes. Example: .COM-format programs in MS-DOS. Load time: Relocatable code must be generated if the memory location is not known at compile time. Execution time: Binding is delayed until run time if the process can be moved during its execution from one memory segment to another. This needs hardware support for address maps (e.g., relocation registers).
Logical vs. Physical Address Space The concept of a logical address space that is bound to a separate physical address space is central to proper memory management. –Logical address – address generated by the CPU; also referred to as virtual address. –Physical address – address seen by the memory unit. The set of all logical addresses generated by a program is a logical address space; the set of all physical addresses corresponding to these logical addresses is a physical address space. Logical and physical addresses are the same in compile-time and load-time address-binding schemes; logical (virtual) and physical addresses differ in execution-time address-binding scheme.
Memory-Management Unit ( MMU ) Hardware device that maps virtual address to physical address. In a simple MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory. The user program deals with logical addresses; it never sees the real physical addresses.
Dynamic relocation using a relocation register
Dynamic Loading Routine is not loaded until it is called Better memory-space utilization; unused routine is never loaded. Useful when large amounts of code are needed to handle infrequently occurring cases. No special support from the operating system is required. Implemented through program design.
Dynamic Linking Linking is postponed until execution time. A small piece of code, the stub, is used to locate the appropriate memory-resident library routine, or to load the library if the routine is not already present. The stub replaces itself with the address of the routine, and executes the routine. The operating system is needed to check whether the routine is in the process's memory address space. Dynamic linking is particularly useful for libraries.
Swapping A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution. Backing store – fast disk large enough to accommodate copies of all memory images for all users; must provide direct access to these memory images. Roll out, roll in – swapping variant used for priority-based scheduling algorithms; a lower-priority process is swapped out so that a higher-priority process can be loaded and executed. The major part of swap time is transfer time; total transfer time is directly proportional to the amount of memory swapped. Modified versions of swapping are found on many systems (e.g., UNIX, Linux, and Windows).
Schematic View of Swapping
Contiguous Allocation Main memory is usually divided into two partitions: –Resident operating system, usually held in low memory with the interrupt vector –User processes, held in high memory. Single-partition allocation –Relocation-register scheme used to protect user processes from each other, and from changing operating-system code and data –Relocation register contains the value of the smallest physical address; limit register contains the range of logical addresses – each logical address must be less than the limit register
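The relocation-and-limit check described on this slide can be sketched in a few lines. This is a minimal illustration, not a model of any particular MMU; the function name `relocate`, the exception `AddressingError`, and the register values used in the usage note are assumptions made for the example.

```python
class AddressingError(Exception):
    """Stands in for the hardware trap raised on an out-of-range address."""


def relocate(logical_address: int, relocation: int, limit: int) -> int:
    """Map a logical address to a physical one using two registers.

    relocation: smallest physical address of the process (assumed value).
    limit: size of the logical address range (assumed value).
    """
    # Each logical address must be less than the limit register.
    if not 0 <= logical_address < limit:
        raise AddressingError(f"address {logical_address} outside limit {limit}")
    # The relocation register is added to every legal address.
    return relocation + logical_address
```

With an assumed relocation register of 14000 and a limit of 1000, `relocate(346, 14000, 1000)` gives 14346, while logical address 1000 is rejected before it reaches memory.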
HW support for relocation and limit registers
Memory Allocation How to satisfy a request of size n from a list of free blocks: First-fit: Allocate the first block that is big enough. Best-fit: Allocate the smallest block that is big enough; must search the entire list, unless it is ordered by size. Produces the smallest leftover block. Worst-fit: Allocate the largest block; must also search the entire list. Produces the largest leftover block. First-fit and best-fit are better than worst-fit in terms of speed and storage utilization.
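The three placement strategies differ only in which sufficiently large block they pick. The sketch below assumes the free list is a plain Python list of block sizes and returns the index of the chosen block (or `None` when nothing fits); the function names and the sample sizes are illustrative assumptions.

```python
from typing import List, Optional


def first_fit(free_blocks: List[int], n: int) -> Optional[int]:
    """Index of the first block that is big enough, or None."""
    for index, size in enumerate(free_blocks):
        if size >= n:
            return index
    return None


def best_fit(free_blocks: List[int], n: int) -> Optional[int]:
    """Index of the smallest block that is big enough (searches the whole list)."""
    candidates = [(size, index) for index, size in enumerate(free_blocks) if size >= n]
    return min(candidates)[1] if candidates else None


def worst_fit(free_blocks: List[int], n: int) -> Optional[int]:
    """Index of the largest block, provided it is big enough (searches the whole list)."""
    candidates = [(size, index) for index, size in enumerate(free_blocks) if size >= n]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]
```

For an assumed free list `[100, 500, 200, 300, 600]` and a request of 212, first-fit picks the 500 block, best-fit the 300 block, and worst-fit the 600 block.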
Fragmentation External Fragmentation – total memory space exists to satisfy a request, but it is not contiguous. Internal Fragmentation – allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used. Reduce external fragmentation by compaction –Shuffle memory contents to place all free memory together in one large block. –Compaction is possible only if relocation is dynamic, and is done at execution time.
Paging Logical address space of a process can be noncontiguous; process is allocated physical memory whenever the latter is available. Divide physical memory into fixed-sized blocks called frames (size is power of 2, for example 512 bytes). Divide logical memory into blocks of same size called pages. Keep track of all free frames. To run a program of size n pages, need to find n free frames and load program. Set up a page table to translate logical to physical addresses. Internal fragmentation may occur.
Address Translation Scheme Address generated by CPU is divided into: –Page number (p) – used as an index into a page table which contains base address of each page in physical memory. –Page offset (d) – combined with base address to define the physical memory address that is sent to the memory unit.
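Because the page size is a power of 2, splitting an address into (p, d) is just a shift and a mask. The sketch below assumes a 4-byte page size and a small hypothetical page table that stores frame numbers (the frame's base address is the frame number times the page size); `page_translate` and the table contents are assumptions for illustration.

```python
PAGE_SIZE = 4                              # bytes; must be a power of 2
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 2 bits of offset for 4-byte pages


def page_translate(logical_address: int, page_table: list) -> int:
    """Translate a logical address via a page table of frame numbers."""
    p = logical_address >> OFFSET_BITS       # page number: index into the page table
    d = logical_address & (PAGE_SIZE - 1)    # page offset within the page
    frame = page_table[p]                    # frame number holding page p
    return (frame << OFFSET_BITS) | d        # frame base address combined with offset
```

With the hypothetical table `[5, 6, 1, 2]`, logical address 13 is page 3, offset 1, which lands in frame 2 at physical address 9.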
Address Translation Architecture
Paging Example page size: 4 bytes
Free Frames Before allocation After allocation
Hardware Support Most operating systems allocate a page table for each process. A pointer to the page table is stored with the other register values in the PCB. When the dispatcher starts a process, it must reload the user registers and define the correct hardware page-table values from the stored user page table. Hardware implementation can be done in these ways: –Set of dedicated registers, built with high-speed logic to make page-address translation efficient. –Page table kept in main memory, with a page-table base register (PTBR) pointing to the page table. (In this scheme every data or instruction byte access requires two memory accesses: one for the page-table entry and one for the byte.)
Hardware Support The two-memory-access problem can be solved by the use of a special fast-lookup hardware cache called associative registers, associative memory, or a translation look-aside buffer (TLB). Each TLB entry consists of two parts: a key and a value. An item to be searched for is compared with all keys simultaneously. If the item is found, the corresponding value is returned. The search is fast but the hardware is expensive. Typically, the number of entries in a TLB is between 64 and 1024.
Associative Memory Associative memory – parallel search over (page #, frame #) pairs. Address translation (P, F) –If P is in an associative register, get the frame # out. –Otherwise get the frame # from the page table in memory.
Paging Hardware With TLB
Some TLBs store address-space identifiers (ASIDs) in each TLB entry. An ASID uniquely identifies each process and is used to provide address-space protection for that process. When the TLB attempts to resolve virtual page numbers, it ensures that the ASID for the currently running process matches the ASID associated with the virtual page. If the ASIDs do not match, the attempt is treated as a TLB miss. ASIDs allow the TLB to contain entries for several processes simultaneously.
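A software model can make the ASID check concrete. The sketch below keys each entry by (ASID, page number), so an entry belonging to another process simply never matches and is treated as a miss. The class `TLB`, the helper `translate_with_tlb`, and the oldest-entry eviction rule are all assumptions for illustration; real TLBs search in parallel in hardware and use their own replacement policies.

```python
class TLB:
    """Toy TLB: maps (asid, page) keys to frame numbers."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = {}  # dicts keep insertion order, oldest entry first

    def lookup(self, asid: int, page: int):
        # A hit requires both the page number and the ASID to match.
        return self.entries.get((asid, page))

    def insert(self, asid: int, page: int, frame: int) -> None:
        key = (asid, page)
        if key not in self.entries and len(self.entries) >= self.capacity:
            # Assumed policy: evict the oldest entry when the TLB is full.
            del self.entries[next(iter(self.entries))]
        self.entries[key] = frame


def translate_with_tlb(tlb: TLB, asid: int, page: int, page_table: dict):
    """Return (frame, hit); on a miss, fall back to the in-memory page table."""
    frame = tlb.lookup(asid, page)
    if frame is not None:
        return frame, True
    frame = page_table[page]       # the extra memory access on a TLB miss
    tlb.insert(asid, page, frame)
    return frame, False
```

Two processes can use the same page number without flushing the TLB on a context switch, because their entries carry different ASIDs.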
Segmentation Memory-management scheme that supports the user view of memory. A program is a collection of segments. Each segment has a name and a length. Addresses specify both the segment name and the offset within the segment. A segment is a logical unit such as: main program, procedure, function, method, object, local variables, global variables, common block, stack, symbol table, arrays
User’s View of a Program
Logical View of Segmentation: user space and physical memory space
Segmentation Architecture Logical address consists of a two tuple: <segment-number, offset>. Segment table – maps two-dimensional logical addresses into one-dimensional physical addresses; each table entry has: –base – contains the starting physical address where the segment resides in memory. –limit – specifies the length of the segment. Segment-table base register (STBR) points to the segment table’s location in memory. Segment-table length register (STLR) indicates the number of segments used by a program; segment number s is legal if s < STLR.
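Segment translation adds two checks to a base-plus-offset sum: the segment number s must be below the STLR, and the offset d must be below the segment's limit. In the sketch below the segment table is a list of (base, limit) pairs, so its length plays the role of the STLR; the name `seg_translate`, the exception class, and the table values are illustrative assumptions.

```python
class SegmentationTrap(Exception):
    """Stands in for the hardware trap on an illegal segment or offset."""


def seg_translate(s: int, d: int, segment_table: list) -> int:
    """Translate the logical address (s, d) into a physical address."""
    # Segment number s is legal only if s < STLR (here: the table length).
    if not 0 <= s < len(segment_table):
        raise SegmentationTrap(f"illegal segment number {s}")
    base, limit = segment_table[s]
    # The offset must fall within the segment's length.
    if not 0 <= d < limit:
        raise SegmentationTrap(f"offset {d} beyond limit {limit} of segment {s}")
    return base + d
```

With an assumed table `[(1400, 1000), (6300, 400), (4300, 400)]`, the address (2, 53) maps to 4353, whereas (0, 1222) traps because segment 0 is only 1000 bytes long.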
Example of Segmentation
Sharing of Segments
Segmentation with Paging Both paging and segmentation have their advantages and disadvantages. Problems of external fragmentation and lengthy search times can be solved by paging the segments. Solution differs from pure segmentation in that the segment-table entry contains not the base address of the segment, but rather the base address of a page table for this segment.
Background Virtual memory – separation of user logical memory from physical memory. Allows an extremely large virtual memory to be provided for programmers when only a smaller physical memory is available. Only part of the program needs to be in memory for execution. Logical address space can therefore be much larger than physical address space. Allows address spaces to be shared by several processes. Allows for more efficient process creation. Virtual memory can be implemented via: Demand paging Demand segmentation
Virtual Memory That is Larger Than Physical Memory
Shared Library Using Virtual Memory
Demand Paging Demand paging, used in virtual memory systems, is the technique of bringing a page into memory only when it is needed. The pager brings only the required pages, rather than the whole process, into main memory. Benefits: less I/O needed, less memory needed, faster response, more users. When a page is needed, a reference is made to it: an invalid reference leads to an abort; a reference to a page that is not in memory brings that page into memory.
Transfer of a Paged Memory to Contiguous Disk Space
To distinguish between the pages that are in memory and the pages that are on the disk, a valid-invalid bit scheme is used. The bit is set to ‘valid’ if the page is both legal and in memory. The bit is set to ‘invalid’ if the page is either not valid (not in the logical address space of the process) or is valid but not in main memory. While the process executes and accesses pages that are memory resident, execution proceeds normally. If the process tries to access a page that is not in memory, the access to a page marked invalid causes a page-fault trap, as a result of the OS's failure to bring the desired page into memory.
Page Table When Some Pages Are Not in Main Memory
Procedure for handling page fault 1. Check the page table (in the PCB) for this process to determine whether the reference was a valid or an invalid memory access. 2. If the reference was invalid, the process is terminated. If it was valid but the page has not yet been brought in, it is paged in. 3. A free frame is located. 4. A disk operation is initiated to read the desired page into the newly allocated frame. 5. On completion of the disk read, the page table of the process is modified to indicate that the page is now in memory. 6. The instruction that was trapped is restarted. The process can now access the page as though it had always been there.
Steps in Handling a Page Fault
In the extreme case, a process starts executing with no pages in memory. When the OS sets the instruction pointer to the first instruction of the process, which is on a non-memory-resident page, the process immediately faults for the page. After this page is brought into memory, the process continues to execute, faulting as necessary until every page that it needs is in memory. When all the required pages are in memory, the process executes with no faults. This scheme is called pure demand paging: never bring a page into memory until it is needed. Hardware support: –Page table –Secondary memory, to hold swapped pages that are not in main memory
Performance of Demand Paging Page-fault rate p, with 0 ≤ p ≤ 1.0: if p = 0, there are no page faults; if p = 1, every reference is a fault. Effective Access Time (EAT): EAT = (1 – p) x memory access + p x page fault time, where page fault time = (page fault overhead + [swap page out] + swap page in + restart overhead)
Page fault causes the following sequence to occur: 1. Trap to the OS. 2. Save the user registers and process state. 3. Determine that the interrupt was a page fault. 4. Check that the page reference was legal and determine the location of the page on the disk. 5. Issue a read from the disk to a free frame: –Wait in a queue for this device until the read request is serviced. –Wait for the device seek and/or latency time. –Begin the transfer of the page to a free frame. 6. While waiting, allocate the CPU to another process. 7. Receive an interrupt from the disk I/O subsystem. 8. Save the registers and process state for the other user. 9. Determine that the interrupt was from the disk. 10. Correct the page table to show that the page is now in memory. 11. Wait for the CPU to be allocated to this process again. 12. Restore the user registers, process state and new page table, and then resume the interrupted instruction.
Example to calculate EAT Average page-fault service time = 8 milliseconds. Memory access time = 200 nanoseconds. Effective Access Time = (1 – p) x 200 + p x 8,000,000 = 200 + 7,999,800 x p (in nanoseconds). EAT is directly proportional to the page-fault rate. If p = 1 out of 1000, then EAT = 200 + 7,999,800 x 1/1000 = 8,199.8 nanoseconds ≈ 8.2 microseconds. If we want performance to be degraded by less than 10%: 220 > 200 + 7,999,800 x p, so 20 > 7,999,800 x p, so p < 0.0000025. It is important to keep the page-fault rate low in order to keep the effective access time low.
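The arithmetic on this slide is easy to check numerically. The sketch below uses the slide's figures (200 ns memory access, 8 ms page-fault service time) as default arguments; the function names and the `max_fault_rate` helper, which solves the EAT inequality for p, are assumptions added for illustration.

```python
def effective_access_time(p: float,
                          memory_access_ns: float = 200.0,
                          page_fault_ns: float = 8_000_000.0) -> float:
    """EAT = (1 - p) x memory access + p x page fault time, in nanoseconds."""
    return (1 - p) * memory_access_ns + p * page_fault_ns


def max_fault_rate(slowdown: float,
                   memory_access_ns: float = 200.0,
                   page_fault_ns: float = 8_000_000.0) -> float:
    """Largest p for which EAT stays below memory_access x (1 + slowdown).

    Obtained by rearranging (1 - p) x ma + p x pf < ma x (1 + slowdown).
    """
    return slowdown * memory_access_ns / (page_fault_ns - memory_access_ns)
```

Calling `effective_access_time(0.001)` gives roughly 8,199.8 ns, about 8.2 microseconds, and `max_fault_rate(0.10)` gives roughly 0.0000025, that is, fewer than one fault per 400,000 memory accesses.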
Page Replacement Prevent over-allocation of memory by modifying page-fault service routine to include page replacement Use modify (dirty) bit to reduce overhead of page transfers – only modified pages are written to disk Page replacement completes separation between logical memory and physical memory – large virtual memory can be provided on a smaller physical memory
Need For Page Replacement
Basic Page Replacement 1. Find the location of the desired page on disk. 2. Find a free frame: if there is a free frame, use it; if there is no free frame, use a page-replacement algorithm to select a victim frame, write the victim frame to the disk, and change the page and frame tables accordingly. 3. Read the desired page into the (newly) free frame. Update the page and frame tables. 4. Restart the process. To evaluate a page-replacement algorithm, a reference string is used, which is a string of memory references.
Graph of Page Faults Versus The Number of Frames
FIFO Page Replacement This algorithm associates with each page the time when that page was brought into memory. When a page has to be replaced, the oldest page is chosen. It can be implemented with a FIFO queue: the new page that is brought in is inserted at the end of the queue.
FIFO Page Replacement
Easy to understand and program, but performance is not always good. Even if the page selected for replacement is in active use, everything still works correctly: after an active page is replaced with a new one, a fault occurs almost immediately to retrieve the active page. A bad replacement choice increases the page-fault rate and slows down process execution.
FIFO Illustrating Belady’s Anomaly
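Belady's anomaly is the observation that, under FIFO, giving a process more frames can increase its number of page faults. A short simulation shows it; the function name `fifo_faults` is an assumption, and the reference string used in the usage note is a commonly cited one chosen for illustration rather than taken from the slide's figure.

```python
from collections import deque


def fifo_faults(reference_string, num_frames: int) -> int:
    """Count page faults under FIFO replacement."""
    queue = deque()      # pages in arrival order, oldest at the left
    resident = set()     # same pages, for fast membership tests
    faults = 0
    for page in reference_string:
        if page in resident:
            continue                      # hit: FIFO order is not updated
        faults += 1
        if len(queue) == num_frames:
            victim = queue.popleft()      # replace the oldest page
            resident.remove(victim)
        queue.append(page)                # new page joins the end of the queue
        resident.add(page)
    return faults
```

For the reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 this gives 9 faults with 3 frames but 10 faults with 4 frames.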
Optimal Page Replacement Replace the page that will not be used for the longest period of time. Guarantees lowest possible page-fault rate for a fixed number of frames. Better than the FIFO page replacement. Difficult to implement as it requires future knowledge of the reference string
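Although the optimal policy cannot be used online, it is straightforward to simulate once the whole reference string is known, which is how it serves as a yardstick for other algorithms. The sketch below is one simple (quadratic-time) way to do that; the function name `optimal_faults` and the sample reference string are assumptions for illustration.

```python
def optimal_faults(reference_string, num_frames: int) -> int:
    """Count page faults under the optimal (farthest-future-use) policy."""
    refs = list(reference_string)
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)           # a free frame is still available
            continue
        future = refs[i + 1:]

        def next_use(resident_page):
            # Distance to the next reference; pages never used again sort last.
            if resident_page in future:
                return future.index(resident_page)
            return len(future)

        victim = max(frames, key=next_use)  # page not used for the longest time
        frames[frames.index(victim)] = page
    return faults
```

On the sample string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 with 3 frames, this yields 9 faults.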
Optimal Page Replacement
Least Recently Used (LRU) Page Replacement This algorithm associates with each page the time of that page’s last use. When a page must be replaced, LRU chooses the page that has not been used for the longest period of time.
LRU Page Replacement
Good performance, but difficult to implement; requires substantial hardware assistance. Two types of implementation are feasible. Counters: each page-table entry has a time-of-use field, and the CPU has a logical clock or counter. Whenever a reference to a page is made, the contents of the clock register are copied to the time-of-use field in the page-table entry for that page.
Stack implementation (to record the most recent page references): keep a stack of page numbers as a doubly linked list. When a page is referenced, move it to the top; this requires up to 6 pointers to be changed. No search is needed for replacement, since the least recently used page is always at the bottom of the stack.
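The stack scheme can be mimicked in software with Python's `collections.OrderedDict`, which supports moving a key to one end and popping from the other without a search. This is only a simulation of the idea, not the hardware-assisted mechanism; the function name `lru_faults` and the sample reference string are assumptions for illustration.

```python
from collections import OrderedDict


def lru_faults(reference_string, num_frames: int) -> int:
    """Count page faults under LRU, using an ordered 'stack' of pages."""
    stack = OrderedDict()   # last key = top (most recent), first key = bottom (LRU)
    faults = 0
    for page in reference_string:
        if page in stack:
            stack.move_to_end(page)       # referenced page moves to the top
            continue
        faults += 1
        if len(stack) == num_frames:
            stack.popitem(last=False)     # victim is at the bottom; no search needed
        stack[page] = True
    return faults
```

On the sample string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 with 3 frames, this gives 12 faults.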
Use Of A Stack to Record The Most Recent Page References
LRU Approximation Algorithms Reference bit: with each page associate a bit, initially 0. When the page is referenced, the bit is set to 1. Replace a page whose bit is 0 (if one exists). We do not know the order of use, however. Second chance: needs the reference bit; also called clock replacement. If the page to be replaced (in clock order) has reference bit = 1, then: set the reference bit to 0, leave the page in memory, and replace the next page (in clock order), subject to the same rules.
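The second-chance rule can be simulated with a circular array of frames and a "hand" that sweeps over them. The sketch below makes one assumption the slide leaves open: a newly loaded page starts with its reference bit set to 1, since the load was triggered by a reference. The function name `clock_faults` and the sample string are also illustrative.

```python
def clock_faults(reference_string, num_frames: int) -> int:
    """Count page faults under second-chance (clock) replacement."""
    frames = [None] * num_frames    # page held in each frame (None = empty)
    ref_bit = [0] * num_frames      # reference bit per frame
    hand = 0                        # next frame to examine, in clock order
    faults = 0
    for page in reference_string:
        if page in frames:
            ref_bit[frames.index(page)] = 1   # a reference sets the bit
            continue
        faults += 1
        # Pages with bit = 1 get a second chance: clear the bit and move on.
        while ref_bit[hand] == 1:
            ref_bit[hand] = 0
            hand = (hand + 1) % num_frames
        frames[hand] = page          # replace the first page found with bit = 0
        ref_bit[hand] = 1            # assumption: a loaded page counts as referenced
        hand = (hand + 1) % num_frames
    return faults
```

When every resident page has its bit set, the hand clears them all in one sweep and the policy degenerates to FIFO for that replacement.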
Counting Algorithms Keep a counter of the number of references that have been made to each page. LFU Algorithm: replaces the page with the smallest count. MFU Algorithm: replaces the page with the largest count, based on the argument that the page with the smallest count was probably just brought in and has yet to be used.
Thrashing If a process does not have “enough” pages, the page-fault rate is very high. This leads to: low CPU utilization; the operating system thinks that it needs to increase the degree of multiprogramming; another process is added to the system. Thrashing: a process is busy swapping pages in and out.
Locality In A Memory- Reference Pattern
Working-Set Model Δ = working-set window = a fixed number of page references. Example: 10,000 instructions. WSS i (working set of process P i) = total number of pages referenced in the most recent Δ (varies in time): if Δ is too small, it will not encompass the entire locality; if Δ is too large, it will encompass several localities; if Δ = ∞, it will encompass the entire program. D = Σ WSS i = total demand for frames. If D > m (where m is the total number of available frames), thrashing occurs. Policy: if D > m, then suspend one of the processes.
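The working set at a given moment is just the set of distinct pages among the most recent references in the window. The sketch below computes it for a reference string and applies the D > m test; the window size is called `delta` here, and the function names, the sample string, and the frame counts are assumptions for illustration.

```python
def working_set(reference_string, t: int, delta: int) -> set:
    """Pages referenced in the `delta` most recent references ending at index t."""
    start = max(0, t - delta + 1)
    return set(reference_string[start:t + 1])


def is_thrashing(working_set_sizes, m: int) -> bool:
    """True when total demand D (sum of the WSS values) exceeds m available frames."""
    return sum(working_set_sizes) > m
```

For the sample string 2, 6, 1, 5, 7, 7, 7, 7, 5, 1 with a window of 10 references, the working set at the last reference is {1, 2, 5, 6, 7}, so that process needs 5 frames; shrinking the window to 4 leaves only {1, 5, 7}.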
Page-Fault Frequency Scheme Establish an “acceptable” page-fault rate. If the actual rate is too low, the process loses a frame. If the actual rate is too high, the process gains a frame.