A Comparison of Software and Hardware Techniques for x86 Virtualization. Keith Adams, Ole Agesen. Presented by Chwa Hoon Sung and Kang Joon Young, 1st October 2009.


1 A Comparison of Software and Hardware Techniques for x86 Virtualization
Keith Adams, Ole Agesen
Presented by Chwa Hoon Sung, Kang Joon Young
1st October 2009

2 Virtualization

3 Outline
- Classic Virtualization
- Software Virtualization
- Hardware Virtualization
- Comparison and Results
- Discussion

4 Classic Virtualization (Trap-and-Emulate)
- De-privilege the OS
  - Executes guest operating systems directly, but at a lesser privilege level (user level)
[Diagram labels: OS, apps, kernel mode, user mode]

5 Classic Virtualization (Trap-and-Emulate)
- De-privilege the OS
  - Executes guest operating systems directly, but at a lesser privilege level (user level)
[Diagram labels: OS, apps, kernel mode, user mode, virtual machine monitor]

6 Trap-and-Emulate
- Runs the guest operating system deprivileged.
- All privileged instructions trap into the VMM.
- The VMM emulates instructions against virtual state.
- Resumes direct execution from the next guest instruction.
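
The four steps above can be sketched as a toy loop. This is only an illustration under stated assumptions: guest instructions are plain strings, and every name here (`VirtualCPU`, `PRIVILEGED`, `execute_directly`, `emulate`, `run_guest`) is hypothetical rather than taken from any real VMM.

```python
# Toy trap-and-emulate loop. All names are hypothetical; a real VMM works on
# machine code and relies on hardware traps, not Python exceptions.

PRIVILEGED = {"cli", "sti"}  # assumed set of privileged instructions


class PrivilegedTrap(Exception):
    """Raised when deprivileged guest code executes a privileged instruction."""

    def __init__(self, instr):
        super().__init__(instr)
        self.instr = instr


class VirtualCPU:
    def __init__(self):
        self.interrupts_enabled = True  # virtual interrupt flag
        self.traps = 0                  # number of traps taken so far


def execute_directly(instr):
    """Stand-in for the hardware: privileged instructions trap in user mode."""
    if instr in PRIVILEGED:
        raise PrivilegedTrap(instr)
    # Unprivileged instructions run without involving the VMM.


def emulate(vcpu, instr):
    """The VMM emulates the instruction against the virtual state."""
    if instr == "cli":
        vcpu.interrupts_enabled = False
    elif instr == "sti":
        vcpu.interrupts_enabled = True


def run_guest(vcpu, code):
    pc = 0
    while pc < len(code):
        try:
            execute_directly(code[pc])
        except PrivilegedTrap as trap:
            vcpu.traps += 1
            emulate(vcpu, trap.instr)
        pc += 1  # resume from the next guest instruction
    return vcpu
```

Running `run_guest(VirtualCPU(), ["mov", "cli", "and"])` takes one trap and leaves the virtual interrupt flag cleared, while the physical flag is never touched.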

7 Classic Virtualization (Cont'd)
- Architectural obstacles
  - Traps are expensive (~3000 cycles).
  - Many traps are unavoidable (e.g., page faults).
  - Not all architectures support trap-and-emulate (e.g., x86).

8 [Diagram: system virtualization taxonomy. Labels: Classic Virtualization (Popek & Goldberg), Trap-and-emulate, Enhancement, Software VMM, Full-virtualization (VMware), Para-virtualization (Xen), Hardware VMM, Hardware Support for Virtualization (Intel VT & AMD SVM)]

9 Outline
- Classic Virtualization
- Software Virtualization
- Hardware Virtualization
- Comparison and Results
- Discussion

10 Software Virtualization
- Until recently, the x86 architecture has not permitted classical trap-and-emulate virtualization.
  - Some privileged state is visible in user mode.
    - The guest OS can observe the current privilege level (CPL) in the code segment selector (%cs).
  - Not all privileged operations trap when run in user mode.
    - Dual-purpose instructions such as popf do not trap.
- Software VMMs for x86 have instead used binary translation of the guest code.
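
Both obstacles can be illustrated with a deliberately simplified model. It assumes only two well-known x86 facts: the low two bits of %cs hold the privilege level, and in protected mode popf leaves the interrupt flag (EFLAGS bit 9) unchanged, without faulting, when CPL is greater than IOPL. All other flag bits and modes are ignored, and the function names are hypothetical.

```python
IF_MASK = 1 << 9  # interrupt-enable flag (IF) in EFLAGS


def cpl_from_cs(cs_selector):
    """The low two bits of the %cs selector expose the current privilege level."""
    return cs_selector & 0x3


def popf(eflags, popped, cpl, iopl=0):
    """Simplified popf: return the new EFLAGS value.

    When CPL > IOPL the IF bit silently keeps its old value instead of
    trapping, so a deprivileged guest kernel gets no chance to be emulated.
    """
    if cpl <= iopl:
        return popped
    return (popped & ~IF_MASK) | (eflags & IF_MASK)
```

With interrupts enabled, `popf(IF_MASK, 0, cpl=0)` clears IF, whereas the same call with `cpl=3` returns a value with IF still set and raises nothing, which is why a trap-and-emulate VMM never sees the guest's attempt.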

11 Binary Translation
- Translates the kernel code to replace privileged instructions with new sequences of instructions that have the intended effect on the virtual hardware.
- The software VMM uses a translator with these properties:
  - Binary: input is machine-level code.
  - Dynamic: translation occurs at runtime.
  - On demand: code is translated only when needed for execution.
  - System level: makes no assumptions about the guest code.
  - Subsetting: translates from the full instruction set to a safe subset.
  - Adaptive: adjusts the translated code based on guest behavior to improve efficiency.

12 Binary Translation (Cont'd)
- The translator's input is the full x86 instruction set, including all the privileged instructions; its output is a safe subset of user-mode instructions.

13 Binary Translation
[Diagram labels: Guest Code, Translator, TC Index, Translation Cache, Callouts, CPU Emulation Routines]

14 Guest Code (basic block at vPC)
    mov ebx, eax
    cli
    and ebx, ~0xfff
    mov ebx, cr3
    sti
    ret
[Diagram labels: straight-line code, control flow, basic block]

15 Guest Code (at vPC)
    mov ebx, eax
    cli
    and ebx, ~0xfff
    mov ebx, cr3
    sti
    ret
Translation Cache (at start)
    mov ebx, eax
    call HANDLE_CLI
    and ebx, ~0xfff
    mov [CO_ARG], ebx
    call HANDLE_CR3
    call HANDLE_STI
    jmp HANDLE_RET

16 Guest Code (at vPC)
    mov ebx, eax
    cli
    and ebx, ~0xfff
    mov ebx, cr3
    sti
    ret
Translation Cache (at start)
    mov ebx, eax
    mov [CPU_IE], 0
    and ebx, ~0xfff
    mov [CO_ARG], ebx
    call HANDLE_CR3
    mov [CPU_IE], 1
    test [CPU_IRQ], 1
    jne
    call HANDLE_INTS
    jmp HANDLE_RET
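
The callout-based translation shown on the slides can be imitated with a small table-driven sketch. It is an assumption-laden toy: instructions are strings rather than machine code, `CALLOUTS`, `translation_cache` and `translate_block` are hypothetical names, and only the callout variant (not the in-TC variant with CPU_IE) is modeled.

```python
# Hypothetical callout table mirroring the slide's example translation.
CALLOUTS = {
    "cli": ["call HANDLE_CLI"],
    "mov ebx, cr3": ["mov [CO_ARG], ebx", "call HANDLE_CR3"],
    "sti": ["call HANDLE_STI"],
    "ret": ["jmp HANDLE_RET"],
}

translation_cache = {}  # vPC -> translated basic block


def translate_block(guest_code, vpc):
    """Translate the basic block starting at vpc, on demand, and cache it."""
    if vpc in translation_cache:
        return translation_cache[vpc]
    translated = []
    for instr in guest_code[vpc:]:
        # Innocuous instructions are copied unchanged; sensitive ones are
        # replaced by sequences that act on the virtual hardware.
        translated.extend(CALLOUTS.get(instr, [instr]))
        if instr == "ret":  # control flow terminates the basic block
            break
    translation_cache[vpc] = translated
    return translated
```

Feeding it the six-instruction guest block from the slide yields the seven-instruction sequence shown in the translation cache, and a second request for the same vPC is served from the cache without retranslation.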

17 Performance Advantages of BT
- Avoids privileged-instruction traps
- Example: rdtsc (read time-stamp counter), a privileged instruction
  - Trap-and-emulate: 2030 cycles
  - Callout-and-emulate: 1254 cycles (emulation outside the TC)
  - In-TC emulation: 216 cycles
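
To put the rdtsc numbers in proportion, the ratios can be worked out directly from the three cycle counts on the slide. The dictionary and the `speedup` helper are illustrative names only.

```python
# Cycle counts for emulating rdtsc, as quoted on the slide.
cycles = {
    "trap-and-emulate": 2030,
    "callout-and-emulate": 1254,
    "in-TC emulation": 216,
}


def speedup(baseline, technique):
    """How many times fewer cycles `technique` needs than `baseline`."""
    return cycles[baseline] / cycles[technique]
```

By these figures a callout is roughly 1.6 times cheaper than a trap, and in-TC emulation is roughly 9.4 times cheaper.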

18 Outline
- Classic Virtualization
- Software Virtualization
- Hardware Virtualization
- Comparison and Results
- Discussion

19 Hardware Virtualization
- Recent x86 extensions
  - 1998 to 2005: software-only VMMs using binary translation
  - 2005: Intel and AMD start extending x86 to support virtualization.
- First-generation hardware
  - Allows classical trap-and-emulate VMMs.
  - Intel VT (Virtualization Technology)
  - AMD SVM (Secure Virtual Machine)
- Performance
  - VT/SVM help avoid BT, but do not help with MMU operations (which are actually slower!).
  - The main problem is efficient virtualization of the MMU and I/O, not executing the virtual instruction stream.

20 New Hardware Features
- VMCB (Virtual Machine Control Block)
  - In-memory data structure
  - Contains the state of the guest virtual CPU.
- Modes
  - Non-root mode: the guest OS runs at its intended privilege level (ring 0), but is not fully privileged.
  - Root mode: the VMM runs at a new, even higher privilege level and is fully privileged.
- Instructions
  - vmrun: transfers from root to non-root mode.
  - exit: transfers from non-root to root mode.
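
The interplay of vmrun, exit and the VMCB can be sketched as a loop. This is a toy under stated assumptions: the `VMCB` fields, the `EXITING` set and the function names are hypothetical, instructions are strings, and which operations actually cause an exit depends on how a real VMM configures the hardware.

```python
from dataclasses import dataclass

# Assumed set of guest operations that force an exit to the VMM.
EXITING = {"write_cr3", "in"}


@dataclass
class VMCB:
    """Heavily reduced control block: guest state plus the last exit reason."""
    guest_pc: int = 0
    exit_reason: str = ""


def vmrun(vmcb, code):
    """Stand-in for the hardware: run the guest until something forces an exit."""
    while vmcb.guest_pc < len(code):
        instr = code[vmcb.guest_pc]
        if instr in EXITING:
            vmcb.exit_reason = instr  # hardware records why it exited
            return
        vmcb.guest_pc += 1            # runs natively, no VMM involvement
    vmcb.exit_reason = "halt"


def vmm_loop(vmcb, code):
    """Root-mode loop: resume the guest, handle each exit, count the exits."""
    exits = 0
    while True:
        vmrun(vmcb, code)
        if vmcb.exit_reason == "halt":
            return exits
        exits += 1
        vmcb.guest_pc += 1  # emulate the instruction (omitted) and skip past it
```

In this model a guest running `["syscall", "write_cr3", "add", "in"]` exits twice; the system call and the arithmetic never leave non-root mode.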

21 Intel VT-x Operations
[Diagram: the VMM runs in ring 0 of VMX root mode; VM 1, VM 2, ..., VM n each run rings 0 and 3 in VMX non-root mode, each with its own control block (VMCB 1 ... VMCB n); transitions are labeled VMLAUNCH, VM Run and VM Exit]

22 Benefits of Hardware Virtualization
- A hardware VMM reduces guest OS dependency.
  - Eliminates the need for binary translation
  - Facilitates support for legacy OSes
- A hardware VMM improves robustness.
  - Eliminates the need for complex software techniques
  - Simpler and smaller VMMs
- A hardware VMM improves performance.
  - Fewer unwanted transitions between guest and VMM

23 Outline
- Classic Virtualization
- Software Virtualization
- Hardware Virtualization
- Comparison and Results
- Discussion

24 Software VMM vs. Hardware VMM
- BT tends to win in these areas:
  - Trap elimination: BT can replace most traps with faster callouts.
  - Emulation speed: callouts jump to a predecoded emulation routine.
  - Callout avoidance: for frequent cases, BT may use in-TC emulation routines, avoiding even the callout cost.
- The hardware VMM wins in these areas:
  - Code density: there is no translation.
  - Precise exceptions: BT performs extra work to recover guest state for faults.
  - System calls: they run without VMM intervention.

25 Experiments
- Software VMM: VMware Player 1.0.1
- Hardware VMM: an experimental hardware-assisted VMM implemented by VMware
- Host: HP workstation, VT-enabled
  - 3.8 GHz Intel Pentium
- All experiments are run natively, on the software VMM, and on the hardware-assisted VMM.

26 Forkwait Test
- Stresses process creation and destruction
  - System calls, context switching, page table modifications, page faults, etc.
- Results: time to create and destroy 40,000 processes
  - Host: 6.0 seconds
  - Software VMM: 36.9 seconds
  - Hardware VMM: 106.4 seconds
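
A test of this shape is simple to write. The sketch below is not the benchmark's original source; it is a minimal POSIX-only Python equivalent, with `forkwait` as an assumed name, plus the ratio between the two VMM timings quoted on the slide.

```python
import os


def forkwait(n):
    """Create n child processes one at a time, waiting for each to exit."""
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(0)     # child: exit immediately
        os.waitpid(pid, 0)  # parent: wait for the child to terminate
    return n


# Ratio of the two virtualized timings quoted on the slide (seconds).
SOFTWARE_VMM_SECONDS = 36.9
HARDWARE_VMM_SECONDS = 106.4
hw_over_sw = HARDWARE_VMM_SECONDS / SOFTWARE_VMM_SECONDS
```

Calling `forkwait(40000)` reproduces the workload size from the slide. By the quoted timings, the hardware VMM takes about 2.9 times as long as the software VMM on this test.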

27 Nanobenchmarks
- Benchmark
  - Custom guest OS: FrobOS
  - Tests the performance of a single virtualization-sensitive operation
- Observations
  - syscall (Native == HW << SW)
    - Hardware: no VMM intervention, so near native
    - Software: traps
  - in (SW << Native << HW)
    - Native: accesses an off-CPU register
    - Software VMM: translates "in" into a short sequence of instructions that accesses a virtual model of that register
    - Hardware: requires VMM intervention

28 Nanobenchmarks (Cont'd)
- Observations (Cont'd)
  - ptemod (Native << SW << HW)
    - Both VMMs use the shadowing technique to implement guest paging, with traces for coherency.
    - PTE writes cause significant overhead compared to native.
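
Why a guest PTE write is costly under shadowing can be seen in a toy model. It assumes the usual arrangement in which guest page tables are write-protected (traced) so that every guest PTE write faults into the VMM, which then updates the shadow table that the hardware actually walks. The class and attribute names are hypothetical.

```python
class ShadowMMU:
    """Toy shadow page table kept coherent with the guest's table via traces."""

    def __init__(self, guest_to_machine):
        self.guest_pt = {}    # guest virtual page -> guest physical page
        self.shadow_pt = {}   # guest virtual page -> machine page (used by hardware)
        self.g2m = guest_to_machine  # VMM's guest-physical -> machine mapping
        self.trace_faults = 0

    def guest_write_pte(self, vpage, gpage):
        # The guest's page table is write-protected, so this store faults
        # into the VMM instead of completing directly.
        self.trace_faults += 1
        self.guest_pt[vpage] = gpage              # apply the guest's write
        self.shadow_pt[vpage] = self.g2m[gpage]   # keep the shadow coherent
```

Each call to `guest_write_pte` stands for one trace fault plus VMM work, whereas natively the same update is a single memory store.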

29 Outline
- Classic Virtualization
- Software Virtualization
- Hardware Virtualization
- Comparison and Results
- Discussion

30 Opportunities
- Microarchitecture
  - Hardware overheads will shrink over time as implementations mature.
  - Measurements on a desktop system using a pre-production version of Intel's Core microarchitecture.
- Hardware VMM algorithmic changes
  - Drop trace faults upon guest PTE modification, allowing temporary incoherency with the shadow page tables to reduce costs.
- Hybrid VMM
  - Dynamically selects the execution technique, combining:
    - the hardware VMM's superior system call performance
    - the software VMM's superior MMU performance
- Hardware MMU support
  - Trace faults, context switches and hidden page faults can be handled effectively with hardware assistance in MMU virtualization.

31 Conclusion
- Hardware extensions allow classical virtualization on the x86 architecture.
- The extensions remove the need for binary translation and simplify VMM design.
- The software VMM fares better than the hardware VMM in many cases, such as context switches, page faults, trace faults and I/O.
- New MMU algorithms might narrow the performance gap.

32 Server Workload
- Benchmark
  - Apache ab benchmarking tool, run against Apache HTTP Server installations on Linux and on Windows
  - Tests I/O efficiency
- Observations
  - Both VMMs perform poorly.
  - Performance on Windows and Linux differs widely.
  - Reason: Apache configuration
    - Windows: single address space (less paging), so the hardware VMM is better.
    - Linux: multiple address spaces (more paging), so the software VMM is better.

33 Desktop-Oriented Workload
- Benchmark
  - PassMark on Windows XP Professional
  - A suite of microbenchmarks that tests various aspects of workstation performance.
- Observations
  - Large RAM test
    - Exhausts memory; intended to test paging capability.
    - The software VMM is better.
  - 2D Graphics test
    - Involves system calls.
    - The hardware VMM is better.

34 Less Synthetic Workload
- Compilation times for the Linux kernel and Apache (on Cygwin)
- Observations
  - Big compilation jobs cause lots of page faults.
  - The software VMM is better at handling page faults.
