Efficient Virtual Memory for Big Memory Servers. U Wisc and HP Labs, ISCA’13. Architecture Reading Club, Summer '13.

Presentation transcript:

Key points
- Big memory workloads: Memcached, databases, graph analysis
- Analysis shows:
  - TLB misses can account for up to 51% of execution time
  - The rich features of paged VM are not needed by most applications
- Proposal: Direct Segments
  - Paged VM as usual where needed
  - Segmentation where possible
- For big memory workloads, this eliminates 99% of data TLB misses

Main Memory Mgmt Trends
- The amount of physical memory has grown from a few MBs to a few GBs, and now to several TBs
- Meanwhile, the size of the DTLB has remained fairly unchanged
  - Pentium III: 72 entries; Pentium 4: 64; Nehalem: 96; Ivy Bridge: 100
- Workloads were also nicer in days gone by (higher locality)
- So: higher memory capacity + constant TLB size + misbehaving apps = more TLB misses
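
To put the entry counts above in perspective, here is a minimal sketch that computes TLB reach (entries times page size). The entry counts come from the slide; the 4 KB page size and the helper name `tlb_reach_bytes` are assumptions made for illustration.

```python
# Illustrative sketch: DTLB reach = number of entries * page size.
# Entry counts are from the slide; the 4 KB page size is an assumption.
PAGE_SIZE = 4 * 1024  # bytes, assumed base page size

DTLB_ENTRIES = {
    "Pentium III": 72,
    "Pentium 4": 64,
    "Nehalem": 96,
    "Ivy Bridge": 100,
}


def tlb_reach_bytes(entries: int, page_size: int = PAGE_SIZE) -> int:
    """Return how many bytes of memory the TLB can map at one time."""
    return entries * page_size


for cpu, entries in DTLB_ENTRIES.items():
    reach_kb = tlb_reach_bytes(entries) // 1024
    print(f"{cpu}: {entries} entries -> {reach_kb} KB reach")
```

Under these assumptions even 100 entries cover only 400 KB, which is tiny next to a working set measured in GBs or TBs.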

So how bad is it really?

Main Features of Paged VM
Feature               | Analysis                             | Verdict
Swapping              | No swapping                          | Not required
Per-page access perms | 99% of pages are read-write          | Overkill
Fragmentation mgmt.   | Very little OS-visible fragmentation | Per-page reallocation is not important

Main Memory Allocation

Paged VM: why is it needed?
- Shared memory regions for inter-process communication
- Code regions protected by per-page R/W permissions
- Copy-on-write uses per-page R/W permissions for a lazy implementation of fork
- Guard pages at the end of thread stacks
[Figure: virtual address (VA) space layout with regions labeled Code, Constants, Shared Memory, Mapped Files, Guard Pages, Stack, and the dynamically allocated heap region, annotated "Paging Valuable" and "Paging Not Needed".]

Direct Segments
- Hybrid paged + segmented memory (not one on top of the other)

Address Translation

OS Support: Handling Physical Memory
- Set up the direct segment registers
  - BASE = start VA of the direct segment
  - LIMIT = end VA of the direct segment
  - OFFSET = (start PA of the direct segment) minus BASE, so that PA = VA + OFFSET
  - Save and restore the register values as part of process metadata on context switch
- Create a contiguous physical memory region
  - Reserve at startup: big memory apps know their memory requirement at startup
  - Memory compaction: its latency is insignificant for long-running jobs
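
The register check described above can be modeled in software as a rough sketch. The class name, field names, exclusive treatment of LIMIT, and the example addresses are all assumptions for illustration. Rather than commit to a sign convention for OFFSET, the sketch stores the physical start address and adds the distance from BASE, which is equivalent to keeping a single precomputed offset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DirectSegment:
    """Hypothetical software model of the direct segment registers."""
    base: int        # start VA of the direct segment (BASE)
    limit: int       # end VA of the direct segment (LIMIT), exclusive here
    phys_start: int  # start PA of the contiguous backing memory

    def translate(self, va: int) -> Optional[int]:
        """Return the PA if va lies in [base, limit); otherwise None,
        meaning the address must be translated by normal paging."""
        if self.base <= va < self.limit:
            return self.phys_start + (va - self.base)
        return None


# Example addresses are made up for illustration.
seg = DirectSegment(base=0x1000_0000, limit=0x5000_0000, phys_start=0x8000_0000)
print(hex(seg.translate(0x1000_2000)))  # prints 0x80002000
print(seg.translate(0x0040_0000))       # prints None (falls back to paging)
```

Because the mapping is a bounds check plus one addition, no per-page state is needed for addresses inside the segment.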

OS Support: Handling Virtual Memory
- Primary regions
  - Abstraction presented to the application
  - Contiguous virtual address space backed by a direct segment
- What goes in the primary region
  - Dynamically allocated R/W memory
  - The application can indicate what it needs to put in the primary region
- The size of the primary region is set very high, to accommodate all of physical memory if need be
  - 64-bit VA supports 128 TB of VM, so VA space practically never runs out

Evaluation Methodology
- Implement the primary region in the kernel
- Find the number of TLB misses that would be served by the (non-existent) direct segment hardware
  - x86 uses a hardware page-table walker, so TLB misses are not normally visible to software
  - They trap all TLB misses by duping the system into believing that the PTE residing in memory is invalid
  - In the handler, they touch the page at the faulting address, then mark the PTE invalid again
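
Once the misses are trapped, the counting step reduces to classifying each faulting address against the segment bounds. The following is a minimal sketch of that classification, not the authors' kernel code; the function name, the bounds, and the miss trace are hypothetical.

```python
def fraction_eliminated(miss_addresses, base, limit):
    """Fraction of trapped TLB misses whose VA lies in [base, limit),
    i.e. misses a direct segment covering that range would have served."""
    if not miss_addresses:
        return 0.0
    in_segment = sum(1 for va in miss_addresses if base <= va < limit)
    return in_segment / len(miss_addresses)


# Hypothetical segment bounds and miss trace, for illustration only.
BASE, LIMIT = 0x1000_0000, 0x5000_0000
trace = [0x1000_0000, 0x2345_6000, 0x4FFF_F000, 0x0040_1000]
print(fraction_eliminated(trace, BASE, LIMIT))  # prints 0.75
```

Here three of the four made-up misses fall inside the segment, so they would not reach the page-table walker.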

Results

Results

Why not large pages?
- Huge pages do not automatically scale
  - They need a new page size and/or more TLB entries
- TLBs depend on access locality
- Fixed, ISA-defined, sparse page sizes
  - e.g., 4KB, 2MB, 1GB
  - Pages must be aligned at page-size boundaries
- Multiple page sizes introduce TLB tradeoffs
  - Fully associative vs. set-associative designs
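
The scaling and alignment points above can be made concrete with a little arithmetic. This sketch counts how many TLB entries would be needed to map a region fully at each page size and checks alignment; the 64 GB region, the example address, and the helper names are assumptions for illustration.

```python
# Page sizes named on the slide, in bytes.
PAGE_SIZES = {"4KB": 4 * 2**10, "2MB": 2 * 2**20, "1GB": 2**30}


def entries_needed(region_bytes: int, page_size: int) -> int:
    """Number of pages (hence TLB entries) needed to map the whole region."""
    return (region_bytes + page_size - 1) // page_size  # ceiling division


def is_aligned(addr: int, page_size: int) -> bool:
    """A page of this size may only start at a multiple of its size."""
    return addr % page_size == 0


REGION = 64 * 2**30  # hypothetical 64 GB working set

for name, size in PAGE_SIZES.items():
    print(f"{name}: {entries_needed(REGION, size)} entries")

addr = 0x4020_0000  # 1 GB + 2 MB, a made-up address
print(is_aligned(addr, PAGE_SIZES["2MB"]), is_aligned(addr, PAGE_SIZES["1GB"]))
```

Every doubling of the region doubles the entries required at any fixed page size, which is the sense in which huge pages do not scale on their own.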

Virtual Memory Basics
[Figure labels: Core, Cache, TLB (Translation Lookaside Buffer), Process 1, Process 2, Virtual Address Space, Physical Memory, Page Table]