Agile Paging: Exceeding the Best of Nested and Shadow Paging

1 Agile Paging: Exceeding the Best of Nested and Shadow Paging
Jayneel Gandhi, Mark D. Hill, Michael M. Swift

2 Executive Summary
Problem: Virtualization is valuable but has high overheads with larger workloads (at most 70% slower than native).
Existing choices:
-- Nested Paging: slow page walk but fast page table updates
-- Shadow Paging: fast page walk but slow page table updates
Can we get the best of both for the same address space (or the same page walk)?
Yes, Agile Paging: use shadow paging and sometimes switch to nested paging within the same page walk (at most 4% slower than native).

3 Outline: Motivation, Agile Paging, Results, Summary

4 Virtualization Overview
Stack, top to bottom: applications, guest OS, VMM, hardware.
Benefits:
-- Foundation of our cloud infrastructure
-- Provides on-demand virtual instances
-- Helps server consolidation
Problem: The overhead of virtualizing memory is high (at most 70% slower than unvirtualized).

5 Virtualizing Memory
Applications use guest virtual addresses (gVA). The guest OS's guest page table maps gVA to guest physical addresses (gPA). The VMM's nested page table maps gPA to host physical addresses (hPA) used by the hardware.

6 Virtualizing Memory
The guest page table maps gVA to gPA; the nested page table maps gPA to hPA.
Two techniques to manage both page tables:
1. Nested Paging (hardware)
2. Shadow Paging (software)
Evaluated on two axes: page walk latency and page table updates.

7 Unvirtualized x86-64 Translation
The hardware walks the OS page table rooted at CR3 to translate a virtual address (VA) into a physical address (PA). At most 4 memory accesses per walk.

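8 1. Nested Paging (Hardware)
The walk starts at gCR3 and translates gVA to gPA through the guest page table, but every guest physical address reached along the way must itself be translated to hPA through the nested page table. Result: a longer page walk.
Each of the four guest levels costs 5 accesses (4 nested page table accesses to locate the guest table in host memory, plus 1 to read the guest entry), and translating the final gPA costs 4 more. At most 5 + 5 + 5 + 5 + 4 = 24 memory accesses.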
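The 24-access figure can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of the counting on this slide; the function names and the parameterization by level count are assumptions made for the example, not part of the presentation.

```python
def native_walk_accesses(levels: int = 4) -> int:
    """Unvirtualized walk: one memory access per page table level."""
    return levels


def nested_walk_accesses(guest_levels: int = 4, nested_levels: int = 4) -> int:
    """Worst-case memory accesses for a two-dimensional (nested) page walk.

    Each guest level needs a full nested walk to find the guest table in
    host memory, plus one access to read the guest entry. The final gPA
    then needs one more nested walk to reach the hPA.
    """
    per_guest_level = nested_levels + 1
    final_translation = nested_levels
    return guest_levels * per_guest_level + final_translation


print(native_walk_accesses())  # 4 accesses, as on the unvirtualized slide
print(nested_walk_accesses())  # 5 + 5 + 5 + 5 + 4 = 24 accesses
```

With four levels in each dimension this is the same as (4 + 1) * (4 + 1) - 1 = 24.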

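9 2. Shadow Paging (Software)
The VMM maintains a shadow page table that maps gVA directly to hPA by combining the guest page table (gVA to gPA) with the nested page table (gPA to hPA). The guest page table is marked read-only (RO) so that guest writes to it reach the VMM.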

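10 2. Shadow Paging (Software)
The hardware walks only the shadow page table, rooted at sCR3, to translate gVA to hPA. The guest page table (read-only) and the nested page table are not part of the walk. Result: a shorter page walk. At most 4 memory accesses.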
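As a rough illustration of why the shadow walk is short, the sketch below composes a guest mapping and a nested mapping into a single shadow mapping. It is a deliberately simplified model: flat dictionaries of page numbers stand in for multi-level page tables, and all names (`build_shadow_table`, `translate`, `PAGE_SIZE`) are hypothetical.

```python
from typing import Dict

PAGE_SIZE = 4096  # 4KB pages, assumed for the example


def build_shadow_table(guest_pt: Dict[int, int],
                       nested_pt: Dict[int, int]) -> Dict[int, int]:
    """Compose gVA->gPA and gPA->hPA page mappings into gVA->hPA."""
    shadow = {}
    for guest_virtual_page, guest_physical_page in guest_pt.items():
        # Only pages the VMM has backed with host memory get a shadow entry.
        if guest_physical_page in nested_pt:
            shadow[guest_virtual_page] = nested_pt[guest_physical_page]
    return shadow


def translate(table: Dict[int, int], address: int) -> int:
    """Translate an address with a flat page-number table."""
    page, offset = divmod(address, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


guest_pt = {0x10: 0x2, 0x11: 0x3}    # guest virtual page -> guest physical page
nested_pt = {0x2: 0x80, 0x3: 0x81}   # guest physical page -> host physical page
shadow_pt = build_shadow_table(guest_pt, nested_pt)

gva = 0x10 * PAGE_SIZE + 0x123
# One lookup in the shadow table gives the same answer as two chained lookups.
assert translate(shadow_pt, gva) == translate(nested_pt, translate(guest_pt, gva))
```

The cost of this shortcut is that the shadow table has to be rebuilt or patched whenever the guest changes its own table, which is what the page table update comparison on the next slide is about.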

11 Page Table Updates
1. Nested Paging: the guest writes its guest page table directly. In-place, fast update.
2. Shadow Paging: the guest page table is read-only, so each guest write causes a VMM trap and the VMM updates the shadow page table. Slow, mediated update.

12 Key Observation
Fully static address space: Shadow Paging preferred.
Fully dynamic address space: Nested Paging preferred.
Reality: only a small fraction of the guest virtual address space is dynamic.

13 Key Observation
[Figure: guest page table rooted at gCR3, with some parts labeled Nested and others labeled Shadow]

14 Outline: Motivation, Agile Paging, Results, Summary

15 Agile Paging
Start page walk in shadow mode
-- Achieving fast TLB misses
Optionally switch to nested mode
-- Allowing fast in-place updates
Two parts of design: 1. Mechanism 2. Policy

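16 1. Mechanism
[Figure: three page tables. The shadow page table (sCR3) translates gVA to hPA, the guest page table (gCR3) translates gVA to gPA, and the nested page table translates gPA to hPA. The part of the guest page table covered by the shadow page table is read-only.]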

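17 1. Mechanism: Example Page Walk
The walk of a gVA starts in shadow mode at sCR3 and switches to nested mode (guest page table rooted at gCR3, plus the nested page table) at level 4 of the guest page table, ending at the hPA.
Three levels are walked in shadow mode at 1 access each, and the one level walked in nested mode costs 5. At most 1 + 1 + 1 + 5 = 8 memory accesses.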
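The example count generalizes to other switch points. The sketch below assumes, as in the example walk, that each level walked in shadow mode costs 1 access and each level walked in nested mode costs 5; the function name and its parameter are hypothetical.

```python
LEVELS = 4                   # x86-64 page table levels
SHADOW_COST_PER_LEVEL = 1    # one access per level in shadow mode
NESTED_COST_PER_LEVEL = 5    # 4 nested accesses + 1 guest access per level


def agile_walk_accesses(levels_in_nested_mode: int) -> int:
    """Worst-case accesses when the last N levels of the walk run in nested mode."""
    if not 0 <= levels_in_nested_mode <= LEVELS:
        raise ValueError("levels_in_nested_mode must be between 0 and 4")
    levels_in_shadow_mode = LEVELS - levels_in_nested_mode
    return (levels_in_shadow_mode * SHADOW_COST_PER_LEVEL
            + levels_in_nested_mode * NESTED_COST_PER_LEVEL)


for n in range(LEVELS + 1):
    print(n, agile_walk_accesses(n))
```

Under these assumptions the costs are 4, 8, 12, 16, and 20 accesses for zero through four nested-mode levels, which agrees with the per-switch-level access counts listed on slide 32. The slide's example corresponds to one nested-mode level.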

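18 2. Policy: Shadow to Nested
Start in shadow mode. The first write to a page table causes a VMM trap and the page table stays in shadow mode (1 write recorded). A second write causes another VMM trap and moves it to nested mode. Subsequent writes cause no VMM traps.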

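19 2. Policy: Nested to Shadow
Shadow to nested, as before: start in shadow mode; the first write to a page table causes a VMM trap (shadow, 1 write); the second write causes a VMM trap and switches to nested mode, where subsequent writes cause no VMM traps.
Nested to shadow: on a timeout, move the non-dirty parts back to shadow mode. Dirty bits are used to track writes to the guest page table.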
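The two policy slides describe a small state machine, sketched below for a single guest page table page. This is one illustrative reading of the slides, not the authors' implementation: the class and method names are hypothetical, and clearing the dirty bit at every timeout is an assumption made so that the next interval can be observed.

```python
from enum import Enum


class Mode(Enum):
    SHADOW = "shadow"
    NESTED = "nested"


class PageTablePagePolicy:
    """Tracks the mode of one guest page table page (hypothetical model)."""

    WRITES_BEFORE_SWITCH = 2  # per the slides: the second trapped write switches modes

    def __init__(self) -> None:
        self.mode = Mode.SHADOW
        self.trapped_writes = 0
        self.dirty = False
        self.vmm_traps = 0

    def on_guest_write(self) -> None:
        if self.mode is Mode.SHADOW:
            # The page is read-only in shadow mode, so the write traps to the VMM.
            self.vmm_traps += 1
            self.trapped_writes += 1
            if self.trapped_writes >= self.WRITES_BEFORE_SWITCH:
                self.mode = Mode.NESTED
        else:
            # In nested mode the write is in place; only the dirty bit records it.
            self.dirty = True

    def on_timeout(self) -> None:
        if self.mode is Mode.NESTED:
            if not self.dirty:
                # No writes since the last timeout: move back to shadow mode.
                self.mode = Mode.SHADOW
                self.trapped_writes = 0
            self.dirty = False  # assumption: reset to observe the next interval


page = PageTablePagePolicy()
for _ in range(5):
    page.on_guest_write()
print(page.mode, page.vmm_traps)  # nested mode after two traps; later writes do not trap
```

A page that keeps being written stays in nested mode across timeouts, while a page that goes quiet for a full interval returns to shadow mode and regains the short page walk.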

20 Outline: Motivation, Agile Paging, Results, Summary

21 Methodology
Measure cost of page walks on real hardware
-- Intel 12-core Sandy Bridge with 96GB memory
-- 64-entry L1 TLB and L2 TLB, 4-way associative, for 4KB pages
-- 32-entry L1 TLB, 4-way associative, for 2MB pages
Prototype VMM and emulate hardware in Linux
-- BadgerTrap for online analysis of TLB misses and to emulate agile paging
-- Linear model to predict performance
Workloads
-- Big-memory workloads, SPEC 2006, BioBench, PARSEC

22 Performance Results
Modeled based on emulator: BadgerTrap. Measured using performance counters.
[Chart legend] B: Unvirtualized, N: Nested Paging, S: Shadow Paging, A: Agile Paging. Solid bottom bar: page walk overhead. Hashed top bar: VMM overheads.

23 Performance Results
Nested Paging has high overheads of TLB misses, an effect of the longer page walk.
Overheads labeled on the chart: 28%, 19%, 18%, 6%.
[Chart legend] B: Unvirtualized, N: Nested Paging, S: Shadow Paging, A: Agile Paging. Solid bottom bar: page walk overhead. Hashed top bar: VMM overheads.

24 Performance Results
Shadow Paging has high overheads of VMM interventions.
Overheads labeled on the chart: 28%, 70%, 11%, 19%, 30%, 18%, 6%, 6%.
[Chart legend] B: Unvirtualized, N: Nested Paging, S: Shadow Paging, A: Agile Paging. Solid bottom bar: page walk overhead. Hashed top bar: VMM overheads.

25 Performance Results
Agile paging consistently performs better than both techniques.
Overheads labeled on the chart: 28%, 70%, 11%, 2%, 19%, 30%, 18%, 6%, 2%, 4%, 6%, 3%.
[Chart legend] B: Unvirtualized, N: Nested Paging, S: Shadow Paging, A: Agile Paging. Solid bottom bar: page walk overhead. Hashed top bar: VMM overheads.

26 Summary
Problem: Virtualization is valuable but has high overheads with larger workloads (at most 70% slower than native).
Existing choices:
-- Nested Paging: slow page walk but fast page table updates
-- Shadow Paging: fast page walk but slow page table updates
Can we get the best of both for the same address space (or the same page walk)?
Yes, Agile Paging: use shadow paging and sometimes switch to nested paging within the same page walk (at most 4% slower than native).

27 Questions?

28 Can we get best of both worlds?
Dimensions: Nested Paging 2D; Shadow Paging 1D
Number of memory accesses: Nested Paging 24; Shadow Paging 4; Agile Paging ~4-5
Page table updates: Nested Paging fast, in-place; Shadow Paging slow, out of place

29 Short-Lived Processes
Issue: The cost of creating a shadow page table is high.
Solution:
-- Start shadow mode after 1 sec for agile paging
-- Give user mode access to run only in nested mode

30 Accessed/Dirty Bits
Issue: Shadow mode is slow for setting A/D bits. Keeping the shadow and guest page tables coherent causes VMM traps.
Solution: Hardware optimization
-- Intel sets accessed/dirty bits on both guest and nested page tables
-- Broadwell supports multiple page table walkers per core
-- We propose that hardware write A/D bits on all three page tables

31 Context-Switches
Issue: Intra-guest context switches with shadow mode are slower. The guest OS does not know the shadow page table exists, so each switch causes a VMM trap.
Solution: Hardware optimization
-- Add a small VMM-managed cache of guest CR3 to shadow CR3 mappings
-- Hardware looks up a matching entry on each context switch
-- On a hit, no VMM trap is required
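The proposed guest CR3 to shadow CR3 cache can be modeled in software to show when a trap is avoided. The sketch below is a hypothetical model only: the slide says the cache is small and VMM-managed but gives no size or replacement policy, so the capacity of 8 and the least-recently-used eviction are assumptions, as are all names.

```python
from collections import OrderedDict
from typing import Callable


class Cr3Cache:
    """Software model of a small gCR3 -> sCR3 cache (hypothetical)."""

    def __init__(self, capacity: int = 8) -> None:
        self.capacity = capacity      # assumed size; not given on the slide
        self.entries: "OrderedDict[int, int]" = OrderedDict()
        self.vmm_traps = 0

    def vmm_install(self, gcr3: int, scr3: int) -> None:
        """VMM side: add a mapping, evicting the least recently used entry."""
        self.entries[gcr3] = scr3
        self.entries.move_to_end(gcr3)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)

    def context_switch(self, gcr3: int, vmm_lookup: Callable[[int], int]) -> int:
        """Hardware side: return the sCR3 to load for a guest CR3 write."""
        scr3 = self.entries.get(gcr3)
        if scr3 is not None:
            self.entries.move_to_end(gcr3)
            return scr3               # hit: no VMM trap
        self.vmm_traps += 1           # miss: trap so the VMM can supply the sCR3
        scr3 = vmm_lookup(gcr3)
        self.vmm_install(gcr3, scr3)
        return scr3


def fake_vmm_lookup(gcr3: int) -> int:
    # Stand-in for the VMM's real bookkeeping; purely illustrative.
    return gcr3 + 0x1000


cache = Cr3Cache(capacity=2)
cache.context_switch(0xA000, fake_vmm_lookup)  # miss, traps once
cache.context_switch(0xA000, fake_vmm_lookup)  # hit, no trap
print(cache.vmm_traps)
```

In this model a guest that switches among a handful of processes traps only on the first switch to each one, as long as the working set of address spaces fits in the cache.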

32 Why does agile paging work?
Fraction of page walks by switch level, with the average memory accesses per walk:
Switch level   Shadow   L4     L3     L2   L1   Nested   Avg.
Mem. acc.      4        8      12     16   20   24
graph500       99.8%    0.2%   -      -    -    -        4.01
memcached      88.2%    4.5%   7.3%   -    -    -        4.76
canneal        94.7%    4.6%   0.7%   -    -    -        4.24
dedup          91.4%    2.2%   6.4%   -    -    -        4.60
Brings the average number of memory accesses down to ~4-5 from 24.
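The averages on this slide can be checked as weighted sums of the per-level access counts. The sketch below assumes that each workload's percentages belong, in order, to the Shadow, L4, and L3 columns; that column assignment is inferred from the flattened slide text, and the variable names are made up for the example.

```python
# Memory accesses per walk for each switch level, as listed on the slide.
ACCESSES = {"shadow": 4, "L4": 8, "L3": 12, "L2": 16, "L1": 20, "nested": 24}

# Fraction of page walks per switch level (assumed column assignment).
WALK_MIX = {
    "graph500":  {"shadow": 0.998, "L4": 0.002},
    "memcached": {"shadow": 0.882, "L4": 0.045, "L3": 0.073},
    "canneal":   {"shadow": 0.947, "L4": 0.046, "L3": 0.007},
    "dedup":     {"shadow": 0.914, "L4": 0.022, "L3": 0.064},
}


def average_accesses(mix: dict) -> float:
    """Weighted average of memory accesses per page walk."""
    return sum(fraction * ACCESSES[level] for level, fraction in mix.items())


for workload, mix in WALK_MIX.items():
    print(f"{workload}: {average_accesses(mix):.2f}")
```

Under that assumption the computed averages round to 4.01, 4.76, 4.24, and 4.60, matching the slide's Avg. values, which supports reading the percentages as Shadow, L4, and L3 fractions.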

33 Transparent Huge Page (2MB)
Overheads labeled on the chart: 68%, 13%, 14%, 4%, 2%, 14%, 5%, 2%, 10%, 6%, 3%, 2%.
[Chart legend] B: Unvirtualized, N: Nested Paging, S: Shadow Paging, A: Agile Paging. Solid bottom bar: page walk overhead. Hashed top bar: VMM overheads.

34 Design Components
Hardware
-- Three page table pointers: point to each of the page tables
-- Enhanced page table walker: interprets the switching bit and bridges the two state machines
VMM
-- Manage three page tables: incremental from shadow paging
-- Policies for changing modes: encapsulate policies in the VMM

