Making Virtual Memory Real: The Linux-x86-64 way


Making Virtual Memory Real: The Linux-x86-64 way Arka Basu

Mechanics of Virtual Memory
- Address mapping and translation (mostly H/W)
  - Page tables, TLB, page table walkers
- Virtual address allocation
  - Representing/managing virtual address spaces
  - User interface to OS virtual address allocation
- Physical memory allocation
  - Linux's buddy memory allocation
  - Page fault handling (on-demand paging)
- Updating/invalidating address mappings and permissions
  - Interface to request update/invalidation
  - Mechanics of a TLB shootdown

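Typical virtual memory layout (highest address at the top)
  Kernel virtual address space
  0x00007fffffffffff (top of user space with 48-bit virtual addresses)
  Stack (grows down)
  mmap'ed memory (dynamic allocation)
  Heap (grows up)
  Static data
  Code
  0x0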
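To make the layout concrete, the sketch below records one address from each region of the calling process. The names `layout_sample` and `sample_layout` are made up for this illustration. It assumes Linux with glibc, and the exact addresses will differ from run to run because of address space layout randomization.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

static int static_datum = 42;   /* lives in the static data region */

/* Hypothetical helper type: one sample address per region. */
struct layout_sample {
    uintptr_t code, data, heap, mapped, stack;
};

/* Record one address from each region; returns 0 on success, -1 on failure. */
int sample_layout(struct layout_sample *out)
{
    int on_stack = 0;                        /* lives in the current stack frame */
    void *heap = malloc(16);                 /* small requests come from the heap */
    void *mapped = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (heap == NULL || mapped == MAP_FAILED) {
        free(heap);
        if (mapped != MAP_FAILED)
            munmap(mapped, 4096);
        return -1;
    }

    out->code   = (uintptr_t)&sample_layout; /* address of a function */
    out->data   = (uintptr_t)&static_datum;
    out->heap   = (uintptr_t)heap;
    out->mapped = (uintptr_t)mapped;
    out->stack  = (uintptr_t)&on_stack;

    munmap(mapped, 4096);
    free(heap);
    return 0;
}
```

Printing the five fields with `%#lx` on a typical Linux x86-64 system usually shows them in the order of the slide (code lowest, stack highest), although that ordering is a convention of the platform rather than a guarantee.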

Data structures representing the VA space
- struct task_struct: represents a process
  - pid, status, ptr to VA space, list of open files, list of signals
- struct mm_struct: represents a virtual address space
  - ptr to PT root, start/end of stack, start/end of code, start/end of mmap, vma_area ptr
- VMAs (VM areas): each represents a chunk of allocated virtual address range
  - Starting VA, Ending VA
  - Flags/Prot: VM_READ, VM_WRITE, VM_SHARED, ...
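The relationship between the three structures can be sketched with a toy model. Everything prefixed `toy_` or `TOY_` below is invented for illustration and is far simpler than the kernel's real `task_struct`, `mm_struct`, and `vm_area_struct`. The sketch assumes the VMAs are kept in a singly linked list sorted by address.

```c
#include <stddef.h>

/* Toy permission flags, standing in for VM_READ, VM_WRITE, VM_SHARED. */
#define TOY_VM_READ   0x1UL
#define TOY_VM_WRITE  0x2UL
#define TOY_VM_SHARED 0x8UL

/* One chunk of allocated virtual address range. */
struct toy_vma {
    unsigned long start;      /* starting VA (inclusive) */
    unsigned long end;        /* ending VA (exclusive) */
    unsigned long flags;      /* TOY_VM_* bits */
    struct toy_vma *next;     /* next VMA, sorted by address */
};

/* One virtual address space. */
struct toy_mm {
    void *pt_root;            /* pointer to the page table root */
    struct toy_vma *vmas;     /* head of the VMA list */
};

/* One process. */
struct toy_task {
    int pid;
    struct toy_mm *mm;        /* pointer to the VA space */
};

/* Return the VMA that contains addr, or NULL if addr is not allocated. */
struct toy_vma *toy_lookup_vma(const struct toy_task *t, unsigned long addr)
{
    for (struct toy_vma *v = t->mm->vmas; v != NULL; v = v->next) {
        if (addr < v->start)
            break;            /* list is sorted, so addr falls in a gap */
        if (addr < v->end)
            return v;
    }
    return NULL;
}
```

A lookup of this kind is what lets the OS decide, on a page fault, whether the faulting address belongs to the process at all.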

Allocating memory, a.k.a. virtual addresses
- A user application or library requests VA allocation via system calls.
- void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
  - length: rounded up by the kernel to a multiple of the page size (4KB); offset must be a multiple of the page size
  - prot: PROT_NONE, PROT_READ, PROT_WRITE, ...
  - flags: MAP_ANONYMOUS, MAP_SHARED, MAP_PRIVATE, MAP_FIXED, MAP_HUGE_2MB, MAP_HUGE_1GB (the last two are used together with MAP_HUGETLB)
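A minimal sketch of an anonymous allocation through `mmap` follows. The wrapper names `alloc_anon` and `free_anon` are assumptions made for this example; only `mmap` and `munmap` themselves come from the slide.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Map `length` bytes of zero-filled, private, anonymous memory.
 * Returns NULL on failure instead of MAP_FAILED. */
void *alloc_anon(size_t length)
{
    void *p = mmap(NULL,                        /* let the kernel pick the VA */
                   length,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS,
                   -1,                          /* no file behind the mapping */
                   0);
    return p == MAP_FAILED ? NULL : p;
}

/* Release a mapping obtained from alloc_anon; returns 0 on success. */
int free_anon(void *p, size_t length)
{
    return munmap(p, length);
}
```

Requesting 8096 bytes this way yields a mapping that spans two whole 4KB pages, and the memory reads as zero until it is written.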

mmap extends an existing VMA or adds a new one
- struct task_struct (represents a process): pid, status, ptr to VA space, list of open files, list of signals
- struct mm_struct (represents a virtual address space): ptr to PT root, start/end of stack, start/end of code, start/end of mmap, vma_area ptr
- VMA: Starting VA, Ending VA, Flags/Prot (VM_READ, VM_WRITE, VM_SHARED, ...)
- vma_cache: cache of recently used VMAs

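Allocating memory, a.k.a. virtual addresses
- Call to extend the heap: void *sbrk(intptr_t increment);
  - increment is given in bytes; on Linux, sbrk is a library wrapper around the brk system call
- Heap: a contiguous virtual address range for dynamically allocated memory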
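The effect of `sbrk` on the end of the heap (the program break) can be observed directly. The helper name `grow_heap` is made up for this sketch, which assumes Linux with glibc and a single-threaded caller that does not call `malloc` between the two `sbrk` calls.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <unistd.h>

/* Grow the heap by `bytes` and return how far the program break moved,
 * or -1 if the break could not be moved. */
long grow_heap(intptr_t bytes)
{
    void *old_brk = sbrk(bytes);     /* returns the break before the change */
    if (old_brk == (void *)-1)
        return -1;

    void *new_brk = sbrk(0);         /* an increment of 0 only reports the break */
    return (long)((char *)new_brk - (char *)old_brk);
}
```

In practice applications rarely call `sbrk` themselves; allocators such as `malloc` use it (or `mmap`) on their behalf.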

sbrk updates the VMA for the heap
- struct task_struct (represents a process): pid, status, ptr to VA space, list of open files, list of signals
- struct mm_struct (represents a virtual address space): ptr to PT root, start/end of stack, start/end of code, start/end of mmap, vma_area ptr
- VMA: Starting VA, Ending VA, Flags/Prot (VM_READ, VM_WRITE, VM_SHARED, ...)


Demand paging of physical memory
  Event: int *a = (int *) mmap((void *)0, 8096, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0)
  Processing:
  - OS creates/extends VMAs and returns the VA to the user (the value of a). The 8096 bytes are rounded up to two 4KB pages; no physical memory is allocated yet.
  Event: load from *a
  Processing:
  - H/W: TLB miss, then a page walk, then a page fault exception is raised.
  - OS checks whether the faulting VA is valid by checking the VMAs. If not, it raises a segmentation fault to the application.
  - If valid, the OS finds physical page frame(s) to map the faulting VA.

Representing physical memory
- Page descriptor (struct page)
  - One for each 4KB of physical memory
  - 32 bytes long according to the slide (<1% overhead); on current x86-64 kernels it is typically 64 bytes (about 1.6%)
  - All descriptors are maintained in an array
- Important information contained in it:
  - Number of virtual pages mapping to it
  - Pointer back to the virtual pages mapping it (reverse mapping)
  - Flags, e.g., whether the page frame is locked, free, etc.
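Because the descriptors sit in one array, finding the descriptor for a physical address is a shift and an index. The sketch below uses invented names (`toy_page`, `toy_mem_map`, `TOY_NFRAMES`) and a descriptor with only the three fields listed on the slide; the real `struct page` is considerably more involved.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12          /* 4KB pages */
#define TOY_NFRAMES    1024        /* toy machine with 4MB of physical memory */

/* Toy page descriptor, one per physical page frame. */
struct toy_page {
    unsigned int map_count;        /* number of virtual pages mapping this frame */
    unsigned int flags;            /* e.g. locked, free */
    void *rmap;                    /* back-pointer used for reverse mapping */
};

static struct toy_page toy_mem_map[TOY_NFRAMES];

/* Physical address -> page frame number. */
uint64_t toy_pfn(uint64_t paddr)
{
    return paddr >> TOY_PAGE_SHIFT;
}

/* Physical address -> descriptor, or NULL if outside physical memory. */
struct toy_page *toy_page_of(uint64_t paddr)
{
    uint64_t pfn = toy_pfn(paddr);
    return pfn < TOY_NFRAMES ? &toy_mem_map[pfn] : NULL;
}

/* Memory overhead of the descriptor array, as a percentage of RAM. */
double descriptor_overhead_percent(size_t descriptor_bytes)
{
    return 100.0 * (double)descriptor_bytes / (double)(1UL << TOY_PAGE_SHIFT);
}
```

With a 32-byte descriptor the overhead works out to 32 / 4096, or roughly 0.78%, which matches the "<1%" figure on the slide.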

Managing free physical page frames
- The OS keeps a pool of free pages
  - The minimum number of free pages is heuristic-based but tunable
  - Swapping is triggered when free pages run low
- Free pages are kept in the "buddy allocator"
  - Goal: keep physical page frames contiguous
  - Why do contiguous physical frames (addresses) matter?

The buddy allocator
- A list of free lists, each holding blocks of contiguous physical pages of one size (2^order x 4KB)
  Order=0:  4KB
  Order=1:  8KB
  Order=2:  16KB
  Order=3:  32KB
  Order=4:  64KB
  ...
  Order=10: 4MB
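The arithmetic behind the orders can be sketched in a few lines. The function names are invented for this example, and the "flip bit `order` of the page frame number" rule for locating a block's buddy is the standard buddy-system technique rather than something stated on the slide.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_MAX_ORDER 10       /* largest order shown on the slide */

/* Size in bytes of one block of the given order: 2^order x 4KB. */
size_t block_bytes(unsigned order)
{
    return TOY_PAGE_SIZE << order;
}

/* Smallest order whose block holds at least `pages` pages,
 * or -1 if the request exceeds the largest block. */
int order_for_pages(size_t pages)
{
    for (int order = 0; order <= TOY_MAX_ORDER; order++)
        if (((size_t)1 << order) >= pages)
            return order;
    return -1;
}

/* Page frame number of the buddy of the block that starts at `pfn`.
 * Two buddies differ only in bit `order` of their starting frame number,
 * so merging two free buddies gives an aligned block of order + 1. */
unsigned long buddy_pfn(unsigned long pfn, unsigned order)
{
    return pfn ^ (1UL << order);
}
```

For example, a request for 3 pages is served from the order-2 list (16KB), and the order-2 block starting at frame 8 has the block starting at frame 12 as its buddy.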

Demand paging of physical memory
  Event: int *a = (int *) mmap((void *)0, 8096, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0)
  Processing:
  - OS creates/extends VMAs and returns the VA to the user (the value of a).
  Event: load from *a
  Processing:
  - H/W: TLB miss, then a page walk, then a page fault exception is raised.
  - OS checks whether the faulting VA is valid by checking the VMAs. If not, it raises a segmentation fault to the application.
  - If valid, the OS finds free physical page frame(s) by asking the buddy allocator.
  - OS updates the page table entry to map the faulting VA to the free page frame and returns from the fault.
  Event: retry the load from *a
  Processing:
  - H/W: TLB miss; the page table walker loads the VA->PA translation into the TLB. Execution continues.
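One way to watch demand paging from user space is to count the minor page faults a process takes while it touches freshly mapped memory. The sketch below is an illustration under assumptions: the function name `faults_when_touching` is invented, and it relies on `getrusage` reporting `ru_minflt`, which Linux does but some sandboxed environments may not.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Map `pages` anonymous pages, touch each one once, and return the number
 * of minor page faults taken while touching them, or -1 on error. */
long faults_when_touching(size_t pages)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t length = pages * page_size;
    struct rusage before, after;

    /* Only VMAs are created here; no physical frames are assigned yet. */
    void *region = mmap(NULL, length, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    if (getrusage(RUSAGE_SELF, &before) != 0) {
        munmap(region, length);
        return -1;
    }

    /* The first access to each page misses in the TLB, faults, and makes
     * the OS allocate a frame and fill in the page table entry. */
    volatile char *a = region;
    for (size_t i = 0; i < pages; i++)
        a[i * page_size] = 1;

    if (getrusage(RUSAGE_SELF, &after) != 0) {
        munmap(region, length);
        return -1;
    }

    munmap(region, length);
    return after.ru_minflt - before.ru_minflt;
}
```

On an ordinary Linux system the returned count is usually close to the number of pages touched, though the kernel is free to batch or pre-populate mappings, so the exact figure is not guaranteed.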


Updating address mapping/permission
- Why update?
  - OS: swapping, copy-on-write, page migration
  - User: change page permissions, unmap
- System call to change page permissions: int mprotect(void *addr, size_t len, int prot);
- System call to free/unmap memory: int munmap(void *addr, size_t len);
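The two user-facing calls can be exercised together in a short sketch. The function name `protect_then_unmap` is made up for this example; it assumes a Linux system where anonymous mappings are available.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one page, write to it, make it read-only, read it back, unmap it.
 * Returns 0 if every step behaved as expected, -1 otherwise. */
int protect_then_unmap(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int *a = mmap(NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (a == MAP_FAILED)
        return -1;

    a[0] = 123;                          /* allowed: page is writable */

    /* Drop write permission. The OS updates the VMA flags and the page
     * table entry, and must invalidate any stale TLB entries. */
    if (mprotect(a, len, PROT_READ) != 0) {
        munmap(a, len);
        return -1;
    }

    int ok = (a[0] == 123);              /* reads are still allowed */
    /* A store such as a[0] = 1 at this point would raise SIGSEGV. */

    if (munmap(a, len) != 0)             /* VA range is no longer valid */
        return -1;
    return ok ? 0 : -1;
}
```

Both `mprotect` and `munmap` operate on whole pages, so `addr` should be page-aligned, as it always is for a pointer returned by `mmap`.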

Update VMA flags, delete/split VMAs
- struct task_struct: pid, status, ptr to VA space, list of open files, list of signals
- struct mm_struct: ptr to PT root, start/end of stack, start/end of code, start/end of mmap, vma_area ptr
- VMA: Starting VA, Ending VA, Flags/Prot (VM_READ, VM_WRITE, VM_SHARED, ...)
- vma_cache
- Then: update the page table entry and issue a TLB shootdown

Steps of a TLB shootdown
1. The OS on the initiator core updates the page table entry.
2. The OS on the initiator core finds the set of other cores that may hold a stale entry in their TLBs.
3. The OS on the initiator core sends an inter-processor interrupt (IPI) to the cores in that set and waits for their acks.
4. While waiting for the acks, the OS on the initiator core uses the invlpg instruction or a write to cr3 to invalidate its local TLB entries.
5. The other cores switch into the OS interrupt handler and invalidate the entries in their local TLBs via invlpg or a write to cr3.
6. The other cores send acks to the initiator core.
7. The TLB shootdown completes once the initiator has received all acks.
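The seven steps can be traced in a small, purely sequential toy model. Nothing here touches real hardware: the `toy_` types and functions are invented, the "IPI" is just a flag in an array, and the remote cores are handled one after another, whereas real cores handle the interrupt concurrently.

```c
#include <stdbool.h>

#define NCORES    4
#define TLB_SLOTS 4

struct toy_tlb_entry { bool valid; unsigned long vpn, pfn; };

struct toy_core {
    struct toy_tlb_entry tlb[TLB_SLOTS];
    bool may_cache;           /* true if this core has run the address space */
};

/* Models invlpg: drop any local entry for one virtual page. */
static void toy_invlpg(struct toy_core *c, unsigned long vpn)
{
    for (int i = 0; i < TLB_SLOTS; i++)
        if (c->tlb[i].valid && c->tlb[i].vpn == vpn)
            c->tlb[i].valid = false;
}

/* True if the core still holds a translation for vpn. */
bool toy_tlb_has(const struct toy_core *c, unsigned long vpn)
{
    for (int i = 0; i < TLB_SLOTS; i++)
        if (c->tlb[i].valid && c->tlb[i].vpn == vpn)
            return true;
    return false;
}

/* Run one shootdown for `vpn`; returns the number of acks received,
 * or -1 if an ack is missing. */
int toy_shootdown(struct toy_core cores[NCORES], int initiator,
                  unsigned long *pte, unsigned long new_pte, unsigned long vpn)
{
    bool ipi_sent[NCORES] = { false };
    int sent = 0, acks = 0;

    *pte = new_pte;                                  /* step 1 */

    for (int c = 0; c < NCORES; c++) {               /* step 2 */
        if (c != initiator && cores[c].may_cache) {
            ipi_sent[c] = true;                      /* step 3: "send IPI" */
            sent++;
        }
    }

    toy_invlpg(&cores[initiator], vpn);              /* step 4 */

    for (int c = 0; c < NCORES; c++) {
        if (ipi_sent[c]) {
            toy_invlpg(&cores[c], vpn);              /* step 5 */
            acks++;                                  /* step 6 */
        }
    }

    return acks == sent ? acks : -1;                 /* step 7 */
}
```

The model makes the cost visible: every core in the target set is interrupted, which is why kernels try to keep that set as small as possible.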

Special topic: Memory mapped files

Mapping parts of a file to virtual addresses
- A user application or library requests VA allocation via system calls.
- void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
  - length: rounded up by the kernel to a multiple of the page size (4KB); offset must be a multiple of the page size
  - prot: PROT_NONE, PROT_READ, PROT_WRITE, ...
  - flags: MAP_ANONYMOUS, MAP_SHARED, MAP_PRIVATE, MAP_FIXED, MAP_HUGE_2MB, MAP_HUGE_1GB

Mapping a file to virtual addresses
- Traditional way to access file content:
  - int fd = open(const char *path, int oflag, ...)
    - Flags: O_RDONLY, O_CREAT, O_RDWR
  - ssize_t read(int fd, void *buf, size_t count)
    - Reads file data into buf
- Mapping file content:
  - int *a = (int *) mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
  - Access the file content as if accessing an array starting at address "a"
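The two access paths can be compared side by side. The sketch below is a self-contained illustration: it creates its own temporary file under `/tmp` (an assumption about the environment), and the function name `compare_read_and_mmap` is invented for the example.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Write four ints to a temporary file, then fetch them once with read()
 * and once through a file mapping.
 * Returns 1 if both paths see the same data, 0 if not, -1 on error. */
int compare_read_and_mmap(void)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    const int data[4] = { 10, 20, 30, 40 };
    int via_read[4];
    int *a;
    int result = -1;

    int fd = mkstemp(path);                  /* creates and opens the file */
    if (fd < 0)
        return -1;
    if (write(fd, data, sizeof data) != (ssize_t)sizeof data)
        goto out;

    /* Traditional path: copy the file content into a user buffer. */
    if (lseek(fd, 0, SEEK_SET) != 0)
        goto out;
    if (read(fd, via_read, sizeof via_read) != (ssize_t)sizeof via_read)
        goto out;

    /* Mapped path: the file content appears as an array starting at a. */
    a = mmap(NULL, sizeof data, PROT_READ, MAP_PRIVATE, fd, 0);
    if (a == MAP_FAILED)
        goto out;

    result = 1;
    for (int i = 0; i < 4; i++)
        if (a[i] != via_read[i] || a[i] != data[i])
            result = 0;

    munmap(a, sizeof data);
out:
    close(fd);
    unlink(path);
    return result;
}
```

With `read`, the data is copied into `via_read` at the time of the call; with `mmap`, pages of the file are brought in on demand when `a[i]` is first accessed.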