Practical, transparent operating system support for superpages. Juan Navarro, Sitaram Iyer, Peter Druschel, Alan Cox. OSDI 2002.


Practical, transparent operating system support for superpages Juan Navarro, Sitaram Iyer, Peter Druschel, Alan Cox OSDI 2002

Introduction This paper addresses the issue of OS-level support for superpages. A superpage is a page that is larger than the hardware base page. Superpage capability must be present in the hardware; it cannot be provided purely in software.

Background Page-based virtual memory Page tables Translation lookaside buffers (TLB) –Purpose –The problem with TLBs

Background Summary Virtual memory automates the movement of a process’s address space (code and data) between disk and primary memory. Virtual addresses must be translated to physical addresses using information stored in the page table. Page tables are stored in primary memory.

Page Tables and Address Translation Extra memory references due to the page table degrade performance. The TLB (translation lookaside buffer) is faster memory that caches portions of the page table. If most memory references “hit” in the TLB, the overhead of address translation is acceptable. TLB coverage: the amount of memory that can be accessed strictly through TLB entries.

TLB Coverage – the Problem Computer memories have increased in size faster than TLBs. TLB coverage as a percentage of total memory has decreased over the years. –At the time this paper was written, most TLBs covered a megabyte or less of physical memory –Many applications have working sets that are not completely covered by the TLB Result: more TLB misses, poorer performance.

The Situation Problem: Reduced TLB coverage makes virtual memory systems less efficient. A solution: Increase page size –Each TLB entry represents one page –Increasing page size increases TLB coverage But …pages that are too large are inefficient: –Weaken locality and waste storage –Increase fragmentation

Superpages Compromise: hardware supports more than one page size –Ordinary pages = base pages –Superpage: a power-of-2 multiple of the base page size One superpage maps several base pages. –Let base page = 4KB and superpage = 64KB. Using superpages, TLB coverage is up to 16 times greater with no increase in TLB size
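The arithmetic behind the "16 times greater" claim can be sketched as follows (the 64-entry TLB is an illustrative assumption, not a figure from the paper):

```python
def tlb_coverage(entries: int, page_size: int) -> int:
    """Memory reachable through the TLB without a miss (bytes)."""
    return entries * page_size

base_kb = tlb_coverage(64, 4 * 1024) // 1024    # 4 KB base pages
super_kb = tlb_coverage(64, 64 * 1024) // 1024  # 64 KB superpages
print(base_kb, super_kb, super_kb // base_kb)   # 256 4096 16
```

Coverage scales linearly with page size, so the 16x ratio holds for any TLB entry count.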

Hardware Support for Superpages Superpage capability is common in modern computers but isn’t well supported by the OS. Most modern computers provide several different page sizes –e.g. 64KB, 512KB, 4MB for an Alpha processor whose base page size is 8KB (authors’ example & testbed) Although implemented for the Alpha chip, the design presented in this paper is general.

Why Several Page Sizes? Large page sizes reduce the size of the page table, increase TLB coverage, optimize I/O time. But … they can also greatly increase the memory requirements of a process –Some pages are only partially filled –Small localities = a kind of internal fragmentation (page only partially referenced) –If pages are not filled, paging traffic can actually increase instead of decrease.

Why Several Page Sizes? Small page sizes reduce internal fragmentation (amount of wasted space in an allocated block). But … they have all the problems that large pages solve, plus they also have the possibility of increasing page faults. Solution: Use multiple page sizes

[Figure: internal vs. external fragmentation. Internal fragmentation: space allocated to a process but unused inside its blocks. External fragmentation: free space scattered between the regions allocated to A, B, and C.]

Multiple Page Sizes Present Problems Memory management becomes more complex –A uniform page size is simple External fragmentation – reduces the opportunity to use superpages Consequently most general-purpose OSes don’t use superpages, at least for user space.

[Figure: external fragmentation over time. SP4 leaves and is replaced by SP5; later SP2 leaves, but the remaining free space is not contiguous, so there is no room for a new superpage.]

Hardware-Imposed Constraints Limited to page sizes provided by hardware Must have enough contiguous free memory to store each superpage Superpage addresses must be aligned on the superpage size: e.g., a 64KB SP must start at address 0, or 64KB, or 128KB, etc. TLB entry only has one set of bits (R, M, etc.) and thus can only provide coarse-grained info – not good for efficient page management.
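The alignment constraint is plain modular arithmetic; a minimal sketch:

```python
def is_aligned(addr: int, sp_size: int) -> bool:
    """A superpage must start at a multiple of its own size."""
    return addr % sp_size == 0

def align_down(addr: int, sp_size: int) -> int:
    """Start address of the superpage-sized region containing addr."""
    return addr - (addr % sp_size)

SP = 64 * 1024
print(is_aligned(0, SP), is_aligned(64 * 1024, SP), is_aligned(4096, SP))
# True True False
print(align_down(70 * 1024, SP) // 1024)  # 64
```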

Design Decisions Acquire base pages on demand and “promote” them to a superpage later, versus loading the entire superpage when one base page is faulted in. What size superpage should be created? If base pages are acquired on demand: –When to promote? –Reservation-based allocation: set aside space for a superpage when the first base page is loaded, versus relocation-based: wait until a superpage is formed and then move existing pages to contiguous locations.

Authors’ Assumptions The virtual address space of a process is a collection of virtual memory objects: code, data, stack, heap, memory-mapped files, etc. –Each object is mapped contiguously to virtual address space –The virtual address space may be sparse – there may be gaps between objects. OS will not automatically create a superpage when a new page is loaded – wait to see if it makes sense.

Issues: Allocation Allocation: when a page is loaded because of a page fault it must be mapped to a physical frame. –In non-superpage systems any frame will do –In a superpage system we may later decide to include this page in a superpage: now we have to find room for the other pages that are contiguous with the one already loaded

Allocation Approaches Relocation-based – incurs overhead of moving pages when superpages are created. Reservation-based – how much space should you reserve? –Find a contiguous range of page frames, aligned on the correct address, to match a SP size The authors developed a reservation-based system.

Reservation-based Allocation When a page is initially loaded choose a superpage size and reserve contiguous frames to hold the eventual superpage. –Consider size and alignment Don’t know if adjoining pages will ever be needed by the program Decide now what the final superpage size will be.

Reservation-based Allocation – choosing the page size Possibilities: –the largest superpage size available –a superpage size that most closely matches the VM object the page belongs to –a smaller size, based on memory availability. Tradeoff: performance gains from large page versus possible loss of contiguous memory space that is needed later
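The "closest match to the VM object" policy can be expressed as picking the largest hardware size that fits (the size list is the Alpha example from an earlier slide; the fallback rule for tiny objects is an assumption, and the real system also caps by memory availability):

```python
# Alpha page sizes from the slides: 8 KB base plus 64 KB, 512 KB, 4 MB.
HW_SIZES = [8 << 10, 64 << 10, 512 << 10, 4 << 20]

def preferred_size(object_bytes: int, sizes=HW_SIZES) -> int:
    """Largest hardware page size that does not exceed the object;
    falls back to the base page size for tiny objects (an assumption)."""
    fitting = [s for s in sizes if s <= object_bytes]
    return max(fitting) if fitting else sizes[0]

print(preferred_size(100 << 10) >> 10)  # 64 (KB)
```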

Figure 2: Reservation-based allocation. Mapped pages of an object in the virtual address space are backed by allocated frames inside a reserved extent of physical memory; the extent is aligned on superpage boundaries, and its remaining frames stay reserved but unused.

Relocation-based contiguity Relocation-based methods have to re-copy pages into other frames when they decide to create a superpage. –This is less likely to cause fragmentation than reservation based, but has a heavier processing overhead, similar to compaction schemes. –Find contiguous space, move existing pages

Review - 3/31/09 Superpage: a large page Purpose: improve TLB coverage Tradeoff: uniform (small) page size versus variable (large and small) page sizes –Simplicity versus complexity –No external fragmentation versus external fragmentation –Limited TLB coverage vs extended TLB coverage

Review - 3/31/09 Constraints –Maintain address alignment –Manage fragmentation (maintain contiguity) –Restrict overhead Relocation-based methods versus reservation-based methods –Copying overhead versus fragmentation

Outline Issues for a superpage management system: –Allocation and fragmentation control (already discussed) –Promotion –Demotion –Eviction –Storage management Details of Navarro, et al., system

Issues: Promotion Initially, base pages are placed in a reserved block of frames, but are treated separately. Promote when enough pages have been loaded to justify creating a superpage: –Combine TLB entries into one entry –Update page table to show new superpage size –Load remaining pages, if necessary Promotion may be incremental Tradeoff: early promotion (before all base pages have been faulted in) reduces TLB misses but wastes memory if all pages of the superpage are not needed

Issues: Demotion Reduce superpage size –To individual base pages –To a smaller superpage Required if memory is needed for new pages and unused base pages must be evicted (page replacement) Difficulty: reference and dirty bits cover the whole superpage, so they are less informative than per-base-page bits.

Issues: Eviction When memory is full, a superpage may be evicted All its base pages are released. If the dirty bit is set, the entire superpage must be written to disk, even if only part of it has changed. If one of the pages is faulted in later, the process starts over

Design of System Proposed by Navarro, et al. The system discussed in this paper is reservation-based. It supports multiple superpage sizes to reduce internal fragmentation It demotes infrequently referenced pages to reclaim memory frames It is able to maintain contiguous pages without using compaction

Design Issues Reservation-based allocation –Choosing a page size Fragmentation control Incremental promotions Speculative demotions Paging out dirty superpages How does this system address the issues which have been previously identified?

Allocation in this system A page fault triggers a decision: does the page have an existing reservation or not? If not, then –select a preferred superpage size, –locate a set of contiguous, aligned frames –load the page into the correct frame –enter the mapping in the page table –reserve the remaining frames Or, load the page into a previously reserved frame
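The fault-time decision above can be sketched as a small dispatcher; the data structures and the fixed 64 KB policy choice are simplified stand-ins, not the FreeBSD implementation:

```python
def handle_fault(vaddr, reservations, allocate_extent, page_size=8 << 10):
    """Sketch of reservation-based allocation on a page fault.

    `reservations` maps an aligned region start to its list of frames;
    `allocate_extent` finds contiguous aligned frames (both invented).
    """
    for start, frames in reservations.items():
        size = len(frames) * page_size
        if start <= vaddr < start + size:  # fault falls in a reservation
            return frames[(vaddr - start) // page_size]
    # No reservation: pick a size, reserve an aligned extent, map one frame.
    sp_size = 64 << 10                     # stand-in for a sizing policy
    start = vaddr - vaddr % sp_size
    frames = allocate_extent(sp_size // page_size)
    reservations[start] = frames
    return frames[(vaddr - start) // page_size]
```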

Choosing a Superpage Size in This System Since the decision is made early, it can’t be based on the process’s behavior. Base the decision on the memory object type; prefer too large to too small –If the choice is too large, it is easy to reclaim the unneeded space –If the choice is too small, relocation is needed

Guidelines for Choosing Superpage Size For fixed-size memory objects (e.g. code segments) reserve the largest superpage possible, considering alignment and existing reservations, that does not extend beyond the end of the object. For dynamically sized objects (stacks, heaps) that grow one page at a time: same guidelines, but allow the reservation to extend beyond the end, so the object can grow.

Preempting Reservations in This System After a page fault, if the guidelines call for a superpage that is too large for any available free block: –Reserve a smaller size superpage or –Preempt an existing reservation that has enough unallocated frames to satisfy the request This system uses preemption wherever possible.

Fragmentation Control When different sizes of superpages are used in the same system physical memory can become fragmented. Result: there are not enough large, properly aligned blocks of free memory. Navarro et al. propose several implementation techniques to address this problem

Fragmentation Control in This System The “buddy allocator” (free list manager) maintains multiple lists of free blocks, ordered by size When possible, coalesce adjacent blocks of free memory to form larger blocks. A page replacement daemon periodically selects pages to be swapped out. It is modified to include contiguity as one of the factors to be considered.
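The buddy allocator's coalescing step can be sketched as follows (the free-set representation and power-of-two addressing are simplifying assumptions):

```python
def buddy_of(addr: int, size: int) -> int:
    """In a buddy allocator, a block's buddy differs only in the size bit."""
    return addr ^ size

def free_block(free_set, addr, size, max_size):
    """Insert a freed (addr, size) block, merging with its free buddy
    repeatedly until the buddy is absent or max_size is reached."""
    while size < max_size and (buddy_of(addr, size), size) in free_set:
        free_set.remove((buddy_of(addr, size), size))
        addr = min(addr, buddy_of(addr, size))  # merged block starts lower
        size *= 2
    free_set.add((addr, size))
```

Coalescing is what rebuilds the large, aligned extents that future superpage reservations need.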

Promotion to Superpage Status in This System How does a set of base pages become a superpage? Suppose a superpage consists of 8 base pages – the system reserves space for 8 when the first is loaded. If other pages are referenced, they are loaded into the reserved frames. At some point, decide to treat them as a superpage instead of several base pages.

Promotion in This System If a sub-superpage is entirely populated, this system will promote it (incremental promotion); e.g. if 4 aligned pages of a 16 page superpage are faulted in, create a small superpage. This system promotes only regions that are fully populated. –In some systems, promotion occurs if some fraction of the superpage is loaded. Before promoting, must load other pages.

Key Observation Once a program accesses one page in a memory object, it is likely to access the rest of the pages shortly thereafter (or not at all): spatial locality –array references –mapped file Conclude: If a superpage is not created soon after the initial pages are loaded, it probably isn’t going to happen.

Demotion (preemption) Demotion is a side-effect of page replacement; when a base page is evicted, its superpage is demoted. Demotion in this system is also recursively incremental. Speculative demotion: demote active superpages to determine if the whole page is still in use or just parts. –When the paging daemon resets the R bit of a base page, demote accordingly if memory is scarce.

Paging Out Dirty Superpages If a dirty superpage is to be flushed to disk, there is no way to tell whether one base page is dirty or all of them are. Writing out the entire superpage is a huge performance hit. Navarro et al.’s solution: keep superpages clean. –If a process tries to write to a clean SP, demote it. –Repromote later if all base pages are dirty.
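The demote-on-first-write policy can be modeled in a few lines (the class and field names are invented for illustration):

```python
class SuperPage:
    """Toy model: a promoted superpage over n_base base pages."""
    def __init__(self, n_base: int):
        self.dirty = [False] * n_base
        self.promoted = True

def on_write(sp: SuperPage, i: int) -> None:
    # First write to a clean superpage: demote it, so dirtiness is
    # tracked per base page instead of for the whole superpage.
    if sp.promoted and not any(sp.dirty):
        sp.promoted = False
    sp.dirty[i] = True
    if all(sp.dirty):        # repromote once every base page is dirty
        sp.promoted = True

def pages_to_flush(sp: SuperPage):
    """Only dirty base pages need to reach the disk."""
    return [i for i, d in enumerate(sp.dirty) if d]
```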

Alternate Approach The authors experimented with another method to let their system deduce whether a base page had been modified. Compute a cryptographic hash digest of a page’s contents when it is loaded; do so again when the page is flushed. If there is no change, the page is clean –Conclusion: too time-consuming, but experiments with modifications were planned.
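The hash-based cleanness check can be sketched like this (SHA-1 is an arbitrary stand-in; the paper does not specify the digest function):

```python
import hashlib

def page_digest(page: bytes) -> bytes:
    # SHA-1 is chosen only for illustration.
    return hashlib.sha1(page).digest()

# On load, remember the digest; at flush time, recompute and compare.
stored = page_digest(bytes(4096))        # zero-filled 4 KB page at load

clean = page_digest(bytes(4096)) == stored              # unchanged page
must_write = page_digest(b"\x01" + bytes(4095)) != stored  # modified page
print(clean, must_write)  # True True
```

This trades a hash over the whole page at load and flush time for avoiding per-base-page dirty tracking, which is why the authors found it too time-consuming.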

Tracking Reservations Multi-list reservation scheme –One list for each hardware page size –A reserved block is placed on a list according to how large an extent could be preempted without affecting allocated pages: a reservation for 64KB may have only 8KB of contiguous, aligned, unallocated memory –Each list is sorted by how recently its reservations were made –Preempt from the head of a list (least recently allocated) –Fully populated reservations aren’t in the reservation lists

More Design Issues Population map –Tracks allocated base pages –When a page fault occurs, can be used to find out if the page has a reservation –Also useful for deciding when to promote –Helps to identify unallocated regions in existing reservations.
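A population map over one reservation can be sketched as a bitmap (the class shape is invented; the paper's actual structure is more elaborate, but the queries are the ones the slide lists):

```python
class PopulationMap:
    """Toy bitmap over one reservation's base pages."""
    def __init__(self, n_base: int):
        self.populated = [False] * n_base

    def fault_in(self, i: int) -> None:
        self.populated[i] = True        # base page i now allocated

    def has_reservation(self, i: int) -> bool:
        return 0 <= i < len(self.populated)

    def can_promote(self, start: int, count: int) -> bool:
        # Promote only an aligned, fully populated region
        # (this system's incremental-promotion policy).
        return start % count == 0 and all(self.populated[start:start + count])
```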

Goal of Superpage Management Systems Good TLB coverage with minimal internal fragmentation Conclusion: create the largest superpage possible that isn’t larger than the size of the memory object (except for stack/heap). If there isn’t enough memory, preempt existing reservations (these pages had their chance)

Current Usage Superpages are most often used today to store portions of the kernel and various buffers. Reason: the memory requirements for these objects are static and can be known in advance. Superpage size can be chosen to fit the object. Superpage use in application space is the harder issue.

Current Research This paper focuses on the use of superpages in application memory, as opposed to kernel memory. An ongoing research area: memory compaction – whenever there are idle CPU cycles, work to establish large contiguous blocks of free memory –Compare to disk management

Summary: Potential Advantages of Superpages Ideally, superpages can improve performance –Without increasing size of TLB (which would be expensive and reduce TLB access time) –Without increasing base page size (which can lead to internal fragmentation) Superpages allow use of small (base) and large (super) page sizes at the same time.

Summary - Tradeoff Large superpages increase TLB coverage Large superpages are more likely to fragment memory. (Why?) Benefits of large superpages must be weighed against “contiguity restoration techniques” –Pages loaded into reserved areas must be loaded at the proper offset. –Must be enough space for the entire superpage –More overhead for free space management

Authors’ Conclusions Can achieve 30%-60% improvement in performance, based on tests using an accepted set of benchmark programs as well as actual applications. Must employ contiguity restoration techniques: demotion, preemption, compaction Must be able to support a variety of page sizes

Conclusion Superpage management can be transparently integrated into an existing OS (FreeBSD, in this case). –“hooks” connect the OS to the superpage module at critical events: page faults, page allocation, page replacement, etc. Tests show this technique scales well

Follow-up “Supporting superpage allocation without additional hardware support”, Mel Gorman, Patrick Healy, Proceedings of the 7th International Symposium on Memory Management, 2008