Making Virtual Memory Real: The Linux-x86-64 way


1 Making Virtual Memory Real: The Linux-x86-64 way
Arka Basu

2 Mechanics of Virtual Memory
Address mapping and translation (mostly H/W)
  Page tables, TLB, page table walkers
Virtual address allocation (focus)
  Representing/managing virtual address spaces
  User interface to OS virtual address allocation
Physical memory allocation
  Linux’s buddy memory allocation
  Page fault handling (on-demand paging)
Updating/invalidating address mapping/permission
  Interface to request update/invalidation
  Mechanics of a TLB shootdown

3 Typical virtual memory layout
From high addresses to low addresses:
  Kernel virtual address space
  0x7ffffffff
  Stack
  Mmapped memory (dynamically allocated)
  Heap
  Static data
  Code
  0x00000

4 Data structures representing VA space
struct task_struct: represents a process
  pid
  status
  ptr to VA space (points to the mm_struct)
  list of open files
  list of signals
struct mm_struct: represents a virtual address space
  ptr to PT root
  start/end stack
  start/end code
  start/end mmap
  vma_area ptr (points to the VMAs)
VMAs or VM areas: represent chunks of allocated virtual address ranges
  Starting VA
  Ending VA
  Flags/Prot: VM_READ, VM_WRITE, VM_SHARED, ...
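The relationship between the three structures can be sketched in C as below. This is a minimal illustration, not the kernel's definitions: the type names `struct task`, `struct mm`, `struct vma` and the helper `lookup_vma` are hypothetical, the real structures carry many more fields, and the linear list walk is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified mirrors of the structures on the slide. */
enum { VM_READ = 0x1, VM_WRITE = 0x2, VM_SHARED = 0x8 };

struct vma {
    uintptr_t start;       /* starting VA (inclusive) */
    uintptr_t end;         /* ending VA (exclusive) */
    unsigned long flags;   /* VM_READ | VM_WRITE | ... */
    struct vma *next;      /* next VMA in the address space */
};

struct mm {                /* one virtual address space */
    void *pt_root;         /* pointer to the root of the page table */
    struct vma *vma_list;  /* the "vma_area ptr" on the slide */
};

struct task {              /* one process */
    int pid;
    struct mm *mm;         /* the "ptr to VA space" on the slide */
};

/* Return the VMA containing addr, or NULL if addr is not allocated. */
static struct vma *lookup_vma(const struct task *t, uintptr_t addr)
{
    assert(t != NULL && t->mm != NULL);
    for (struct vma *v = t->mm->vma_list; v != NULL; v = v->next) {
        if (addr >= v->start && addr < v->end)
            return v;
    }
    return NULL;
}
```

A lookup that returns NULL corresponds to an address the process never allocated, which is the case that later ends in a segmentation fault.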

5 Allocating memory a.k.a. virtual address
User application or library requests VA allocation via system calls.
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
  length: rounded up to a multiple of the page size (4KB); offset must be a multiple of the page size
  prot: PROT_NONE, PROT_READ, PROT_WRITE, ...
  flags: MAP_ANONYMOUS, MAP_SHARED, MAP_PRIVATE, MAP_FIXED, MAP_HUGE_2MB, MAP_HUGE_1GB
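As a concrete instance of the call above, the following sketch requests anonymous, private, zero-filled memory. The wrapper name `alloc_anon` is hypothetical; passing -1 as fd is the portable convention for MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map `length` bytes of zero-filled, private, anonymous memory.
 * Returns NULL on failure. The kernel picks the address (addr == NULL). */
static void *alloc_anon(size_t length)
{
    assert(length > 0);
    void *p = mmap(NULL, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The region is released with munmap(p, length) once it is no longer needed.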

6 mmap extends an existing VMA or adds a new one
struct task_struct (represents a process): pid, status, ptr to VA space, list of open files, list of signals
struct mm_struct (represents a virtual address space): ptr to PT root, start/end stack, start/end code, start/end mmap, vma_area ptr, vma_cache
VMA: Starting VA, Ending VA, Flags/Prot (VM_READ, VM_WRITE, VM_SHARED, ...)

7 Allocating memory a.k.a. virtual address
System call to extend the heap:
  void *sbrk(intptr_t increment);   (increment in bytes; returns the previous end of the heap)
Heap: a contiguous virtual address range for dynamically allocated memory
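A minimal sketch of growing the heap with sbrk follows. The wrapper name `grow_heap` is hypothetical, and the sketch assumes nothing else in the process moves the program break concurrently.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Grow the heap by `bytes`. Returns the old program break, which is the
 * start of the newly usable region, or NULL on failure. */
static void *grow_heap(intptr_t bytes)
{
    assert(bytes >= 0);
    void *old_break = sbrk(bytes);
    return old_break == (void *)-1 ? NULL : old_break;
}
```

Because the heap is one contiguous range, the new region starts exactly where the old heap ended.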

8 sbrk updates VMA for the heap
struct task_struct (represents a process): pid, status, ptr to VA space, list of open files, list of signals
struct mm_struct (represents a virtual address space): ptr to PT root, start/end stack, start/end code, start/end mmap, vma_area ptr
VMA: Starting VA, Ending VA, Flags/Prot (VM_READ, VM_WRITE, VM_SHARED, ...)

9 Mechanics of Virtual Memory
Address mapping and translation (mostly H/W)
  Page tables, TLB, page table walkers
Virtual address allocation
  Representing/managing virtual address spaces
  User interface to OS virtual address allocation
Physical memory allocation (focus)
  Linux’s buddy memory allocation
  Page fault handling (on-demand paging)
Updating/invalidating address mapping/permission
  Interface to request update/invalidation
  Mechanics of a TLB shootdown

10 Demand paging of physical memory
Event: int *a = (int *) mmap((void *)0, 8192, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
Processing:
  OS creates/extends VMAs.
  Returns the VA to the user (the value of a).

Event: load *a
Processing:
  H/W: TLB miss.
  H/W: page walk.
  H/W: raises a page fault.
  OS: checks whether the VA of the load is valid by checking the VMAs.
  If not valid, raise a segmentation fault to the app.
  If valid, find physical page frame(s) to map the faulting VA.

11 Representing physical memory
Page descriptor (struct page)
  One for each 4KB of physical memory
  32 bytes long (<1% overhead)
  All descriptors are maintained in an array
Important information contained in it:
  Number of virtual pages mapping to it
  Pointer back to the virtual pages mapping to it (reverse mapping)
  Flags, e.g., whether the page frame is locked, free, etc.
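The "<1% overhead" figure follows from the slide's numbers: 32 bytes of descriptor per 4096 bytes of memory. A small sketch of the arithmetic is below; the function names are hypothetical, and the descriptor size is left as a parameter because the actual size of struct page depends on kernel version and configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of page descriptors needed to cover phys_bytes of memory. */
static uint64_t num_page_descriptors(uint64_t phys_bytes, uint64_t page_size)
{
    assert(page_size > 0);
    return phys_bytes / page_size;
}

/* Descriptor overhead as a percentage of the memory it describes. */
static double descriptor_overhead_pct(uint64_t desc_size, uint64_t page_size)
{
    assert(page_size > 0);
    return 100.0 * (double)desc_size / (double)page_size;
}
```

With the slide's values, descriptor_overhead_pct(32, 4096) is 0.78125, and 1GB of physical memory needs 262144 descriptors.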

12 Managing free physical page frames
OS keeps a pool of free pages
  The minimum number of free pages is heuristic-based but tunable
  Swapping is triggered when free pages run low
Free pages are kept in the “buddy allocator”
  Goal: keep physical page frames contiguous
  Why do contiguous physical frames (addresses) matter?

13 The Buddy allocator
A list of free lists of contiguous physical pages of different sizes (2^order x 4KB)
  Order=0: 4KB
  Order=1: 8KB
  Order=2: 16KB
  Order=3: 32KB
  Order=4: 64KB
  ...
  Order=10: 4MB
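The size rule and the pairing of "buddies" can be sketched as follows. The names `block_bytes`, `buddy_pfn`, `order_for`, `PAGE_BYTES` and `TOP_ORDER` are hypothetical; the XOR trick assumes blocks are naturally aligned to their size, which is what lets two free buddies merge into one block of the next order.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES UINT64_C(4096)
#define TOP_ORDER  10u            /* largest order shown on the slide */

/* Size in bytes of a free block of the given order: 2^order x 4KB. */
static uint64_t block_bytes(unsigned order)
{
    assert(order <= TOP_ORDER);
    return PAGE_BYTES << order;
}

/* Page frame number of the buddy of the block that starts at `pfn`.
 * Buddies differ only in bit `order` of their starting frame number. */
static uint64_t buddy_pfn(uint64_t pfn, unsigned order)
{
    assert(order <= TOP_ORDER);
    return pfn ^ (UINT64_C(1) << order);
}

/* Smallest order whose block can hold nbytes, or -1 if none can. */
static int order_for(uint64_t nbytes)
{
    for (unsigned o = 0; o <= TOP_ORDER; o++) {
        if (block_bytes(o) >= nbytes)
            return (int)o;
    }
    return -1;
}
```

For example, a request for 4097 bytes is served from the order-1 list, and the order-2 block starting at frame 8 has the block starting at frame 12 as its buddy.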

14 Demand paging of physical memory
Event: int *a = (int *) mmap((void *)0, 8192, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
Processing:
  OS creates/extends VMAs.
  Returns the VA to the user (the value of a).

Event: load *a
Processing:
  H/W: TLB miss.
  H/W: page walk.
  H/W: raises a page fault.
  OS: checks whether the VA of the load is valid by checking the VMAs.
  If not valid, raise a segmentation fault to the app.
  If valid, find free physical page frame(s) (ask the buddy allocator).
  Update the page table entry to map the faulting VA to the free page frame and return from the fault.

Event: retry load *a
Processing:
  H/W: TLB miss.
  H/W: page walker loads the VA->PA mapping into the TLB.
  Execution continues.
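The fault path described on this slide can be modelled with a small user-space toy. Everything here is hypothetical (the names `toy_mm`, `handle_fault`, `toy_load`, the 16-page address space, the 4-frame memory, and the linear free-frame scan standing in for the buddy allocator); it only illustrates the order of the checks, not how the kernel implements them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NPAGES     16   /* toy address space: 16 virtual pages */
#define TOY_NFRAMES    4    /* toy physical memory: 4 page frames */

struct toy_range { uintptr_t start, end; };   /* a VMA: [start, end) */

struct toy_mm {
    struct toy_range vmas[4];
    size_t nvmas;
    int page_table[TOY_NPAGES];    /* vpn -> frame, -1 if unmapped */
    bool frame_used[TOY_NFRAMES];
};

static void toy_mm_init(struct toy_mm *mm)
{
    mm->nvmas = 0;
    for (int i = 0; i < TOY_NPAGES; i++) mm->page_table[i] = -1;
    for (int f = 0; f < TOY_NFRAMES; f++) mm->frame_used[f] = false;
}

/* Page fault handler: returns true if the fault was resolved,
 * false for an invalid address (segfault) or no free frame. */
static bool handle_fault(struct toy_mm *mm, uintptr_t va)
{
    assert(mm != NULL);
    uintptr_t vpn = va >> TOY_PAGE_SHIFT;
    bool valid = false;
    for (size_t i = 0; i < mm->nvmas; i++)          /* check the VMAs */
        if (va >= mm->vmas[i].start && va < mm->vmas[i].end)
            valid = true;
    if (!valid || vpn >= TOY_NPAGES)
        return false;                               /* seg fault */
    for (int f = 0; f < TOY_NFRAMES; f++) {         /* find a free frame */
        if (!mm->frame_used[f]) {
            mm->frame_used[f] = true;
            mm->page_table[vpn] = f;                /* update the PTE */
            return true;
        }
    }
    return false;                                   /* out of frames */
}

/* A load: returns the frame backing va, faulting it in on first touch,
 * or -1 if the access is invalid. */
static int toy_load(struct toy_mm *mm, uintptr_t va)
{
    uintptr_t vpn = va >> TOY_PAGE_SHIFT;
    if (vpn < TOY_NPAGES && mm->page_table[vpn] >= 0)
        return mm->page_table[vpn];                 /* already mapped */
    if (!handle_fault(mm, va))
        return -1;
    return mm->page_table[vpn];                     /* retried load */
}
```

Note that a page in a valid VMA gets a frame only when it is first touched, which is the "on demand" part of demand paging.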

15 Mechanics of Virtual Memory
Address mapping and translation (mostly H/W)
  Page tables, TLB, page table walkers
Virtual address allocation
  Representing/managing virtual address spaces
  User interface to OS virtual address allocation
Physical memory allocation
  Linux’s buddy memory allocation
  Page fault handling (on-demand paging)
Updating/invalidating address mapping/permission (focus)
  Interface to request update/invalidation
  Mechanics of a TLB shootdown

16 Updating address mapping/permission
Why update?
  OS: swapping, copy-on-write, page migration
  User: change page permissions, unmap
System call to change page permissions:
  int mprotect(void *addr, size_t len, int new_prot);
System call to free/unmap memory:
  int munmap(void *addr, size_t len);
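A short sketch of the user-side interface: map a page, write to it, then drop write permission with mprotect. The helper name `make_readonly_page` is hypothetical, and the caller is assumed to pass the system page size.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map one anonymous page, store `fill` in its first byte, then make the
 * page read-only. Returns the page, or NULL on failure. */
static char *make_readonly_page(size_t page_size, char fill)
{
    assert(page_size > 0);
    char *p = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    p[0] = fill;
    if (mprotect(p, page_size, PROT_READ) != 0) {   /* drop write permission */
        munmap(p, page_size);
        return NULL;
    }
    return p;
}
```

After the mprotect call, reads still succeed while a write would raise SIGSEGV; the kernel has to update the VMA flags and the page table entry, and make sure no core keeps the old writable translation in its TLB.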

17 Update VMA flags, delete/split VMAs
struct task_struct: pid, status, ptr to VA space, list of open files, list of signals
struct mm_struct: ptr to PT root, start/end stack, start/end code, start/end mmap, vma_area ptr, vma_cache
VMA: Starting VA, Ending VA, Flags/Prot (VM_READ, VM_WRITE, VM_SHARED, ...)
Update page table entry, issue TLB shootdown

18 Steps of TLB shootdown
1. OS on the initiator core updates the page table entry.
2. OS on the initiator core finds the set of other cores that may have a stale entry in their TLBs.
3. OS on the initiator core sends an inter-processor interrupt (IPI) to the other cores in the list and waits for acks.
4. While waiting for the acks, OS on the initiator core uses the invlpg instruction or writes to cr3 to invalidate its local TLB entries.
5. Each of the other cores context-switches to an OS thread and invalidates the entries in its local TLB via invlpg or a write to cr3.
6. The other cores send acks to the initiator core.
7. The TLB shootdown completes after the initiator receives all acks.
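The sequence can be mimicked in a single-threaded toy model. This is only a sketch under loud assumptions: the types `toy_tlb` and `toy_machine`, the functions `toy_invlpg` and `toy_shootdown`, and the array-based "IPI" are all hypothetical stand-ins, and real shootdowns run concurrently on separate cores inside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCORES      4
#define TLB_ENTRIES 8

struct toy_tlb {
    uintptr_t vpn[TLB_ENTRIES];
    bool valid[TLB_ENTRIES];
};

struct toy_machine {
    struct toy_tlb tlb[NCORES];
    bool may_cache[NCORES];   /* cores that may hold entries for this address space */
};

/* Stand-in for invlpg: drop any entry for vpn from one core's TLB. */
static void toy_invlpg(struct toy_tlb *t, uintptr_t vpn)
{
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (t->valid[i] && t->vpn[i] == vpn)
            t->valid[i] = false;
}

static bool toy_tlb_holds(const struct toy_tlb *t, uintptr_t vpn)
{
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (t->valid[i] && t->vpn[i] == vpn)
            return true;
    return false;
}

/* Returns the number of acks the initiator collected. */
static int toy_shootdown(struct toy_machine *m, int initiator,
                         uint64_t *pte, uint64_t new_pte, uintptr_t vpn)
{
    assert(initiator >= 0 && initiator < NCORES);
    bool ipi_sent[NCORES] = { false };
    int acks = 0;

    *pte = new_pte;                            /* update the page table entry */
    for (int c = 0; c < NCORES; c++)           /* pick targets, "send" IPIs */
        ipi_sent[c] = (c != initiator) && m->may_cache[c];
    toy_invlpg(&m->tlb[initiator], vpn);       /* invalidate the local TLB */
    for (int c = 0; c < NCORES; c++) {         /* remote cores invalidate, ack */
        if (ipi_sent[c]) {
            toy_invlpg(&m->tlb[c], vpn);
            acks++;
        }
    }
    return acks;                               /* done once all acks are in */
}
```

Cores that cannot hold a stale entry for the address space are never interrupted, which is why finding the target set comes before sending any IPI.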

19 Special topic: Memory mapped files

20 Mapping parts of file to virtual address
User application or library requests VA allocation via system calls.
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
  length: rounded up to a multiple of the page size (4KB); offset must be a multiple of the page size
  prot: PROT_NONE, PROT_READ, PROT_WRITE, ...
  flags: MAP_ANONYMOUS, MAP_SHARED, MAP_PRIVATE, MAP_FIXED, MAP_HUGE_2MB, MAP_HUGE_1GB

21 Mapping a file to virtual address
Traditional way to access file content:
  int fd = open(const char *path, int oflag, ...);
    Flags: O_RDONLY, O_CREAT, O_RDWR
  ssize_t read(int fd, void *buf, size_t count);
    Reads file data into buf
Mapping file content:
  int *a = (int *) mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
  Access file content as if accessing an array starting at address “a”
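A minimal sketch of the mapping approach follows, assuming the whole file is mapped read-only. The helper name `map_file_readonly` is hypothetical; it uses fstat to obtain the length, which the slide does not mention.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map an entire file read-only and store its size in *len_out.
 * Returns NULL on failure or if the file is empty. */
static const void *map_file_readonly(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return p;
}
```

The returned pointer can be cast to `const int *` and indexed like an array; pages of the file are brought in on first access through the same page fault path used for anonymous memory.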

