Virtual Machine Monitors
Bibliography
1. “Virtual Machine Monitors: Current Technology And Future Trends”, Mendel Rosenblum and Tal Garfinkel, IEEE Computer, May
2. “Xen and the Art of Virtualization”, P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, A. Warfield, SOSP ’03.
3. The Definitive Guide to the Xen Hypervisor, David Chisnall, Prentice Hall,
4. “Scale and Performance in the Denali Isolation Kernel”, Andrew Whitaker, Marianne Shaw, and Steven D. Gribble, in Operating Systems Design and Implementation (OSDI), Boston, MA, Dec
5. “Denali: Lightweight Virtual Machines for Distributed and Networked Applications”, Andrew Whitaker, Marianne Shaw, and Steven D. Gribble, Proc. USENIX Annual Technical Conference, June
6. Xen Homepage:
7. VMware:
Outline Overview –What is a virtual machine? –What is a virtual machine monitor (VMM)? –System or application (process) virtual machines History of Virtual Machines Benefits of Virtual Machines Issues and Implementation Examples
What is it? (1) What is virtualization? an abstraction or simulation of hardware resources –e.g., virtual memory A virtual machine is an isolated environment that appears to be a whole computer, but actually only has access to a portion of the computer’s resources. –Similar to, but much more than, the illusion provided by a multitasking operating system.
What is it? (2) A virtual machine monitor (VMM) is the software layer that supports one or more virtual machines –Each VM appears to run on bare hardware, giving the appearance of multiple instances of the same computer, but all run on a single machine. –VMM is also called a hypervisor Guest operating system: an operating system that runs in a VM, supported by the VMM, rather than directly on the hardware.
System & Process VMs (1) System (hardware) virtual machine - See previous slides –Provides a complete system –Each VM can run its own OS, which in turn can run multiple applications Process or application virtual machine; e.g., JVM –Runs inside (under the control of) a normal OS –Provides a platform-independent host for a single application at a time (each platform needs a different JVM, however)
System & Process VMs (2) System virtual machine –One machine appears to be multiple identical machines, each running its own operating system which in turn runs user jobs which are compiled to run on the underlying hardware Process or application virtual machine –Source code is compiled into a “machine” code that represents the instruction set of a virtual (not real) machine. –The same byte code can be “executed” by any computer that has the appropriate interpreter/virtual machine, independently of the actual underlying hardware –Examples: Java byte code + JVM, Microsoft Common Language Infrastructure +.NET framework
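The process-VM idea can be illustrated with a minimal sketch of a stack-based interpreter in Python. The opcodes (PUSH, ADD, MUL), the `run` function, and the sample program are invented for illustration; they are not real JVM or CLI byte code.

```python
# Toy process virtual machine: a stack-based interpreter for a made-up
# instruction set. The opcodes are hypothetical, not real JVM byte code.

def run(program):
    """Execute a list of (opcode, operand) pairs and return the top of the stack."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif opcode == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack[-1]


# "Byte code" for (2 + 3) * 4. The same list runs unchanged on any host
# that has this interpreter, whatever the underlying hardware.
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
```

Calling `run(PROGRAM)` evaluates the expression on the virtual stack machine; only the interpreter itself has to be ported to each real platform.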
System VMMs – Three Types Traditional: VMM is a thin software layer that runs directly on the host machine hardware –Main advantage: better performance than hosted –Examples: VMware vSphere, ESXi servers, Xen, VM/370, Denali –Also called a “bare metal” VMM Hosted: VMM runs on top of an existing OS. –Main advantage: easier to build; easier to install –Example: User-mode Linux Hosted/Hybrid: shares the hardware with existing OS –Example: VMware Workstation
Computer System Interfaces/Traditional Model Unprivileged machine instructions: available to any program Privileged instructions: hardware interface for the OS/other privileged software System calls: interface to the operating system for applications & library functions API: An OS interface through library function calls from applications.
Two Ways to Virtualize Process Virtual Machine: program is compiled to intermediate code, executed by a runtime system Virtual Machine Monitor: software layer mimics the instruction set; supports an OS and its applications
Hosted/Hybrid versus Non-hosted VMM Hosted has 3 advantages: –VMM is no harder to install than any other application –The VMM can use the host OS scheduler, pager, etc. and focus primarily on isolation (hybrid doesn’t use all host features). –I/O support is better: the VMM can use the device drivers that are designed to work with the host OS rather than having to provide its own. (Hybrid may be limited to using host I/O)
Hosted versus Non-hosted VMM Disadvantages: –I/O overhead is “greatly increased”: requests go from guest OS to VMM to host OS and down eventually to the device driver. –Too inefficient for servers –More difficult to guarantee complete isolation, so not appropriate for servers from a security perspective.
Hosted v Non-hosted VMM Conclusion: –Hosting is a good approach for individual work stations; reduces effort needed to get VMM up and running; performance isn’t a major issue. –Hosting is not advisable for servers. Security issues are the most important concern, followed by added overhead for I/O and any other host OS services that are used.
VM – How They Work (1) VMM runs in kernel mode (replacing traditional OS) Guest OS runs in user mode –Some modern hardware has a third mode for the guest OS For the most part, applications run normally and execute machine code directly (direct execution) What about system calls or other attempts by user processes to execute privileged instructions?
VM – How They Work (2) If the guest OS runs in user mode how can it execute privileged code? It can’t. When it tries to execute a privileged instruction, the CPU traps to the VMM, which carries out the operation on behalf of the guest OS –e.g., when a guest OS appears to execute a privileged I/O operation, the VMM is actually in charge of the actual I/O processing.
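A minimal sketch of this trap-and-emulate loop, written as a Python simulation. The instruction names (ADD, IO_WRITE, HALT), the `PrivilegeTrap` exception, and the `VMM` class are all hypothetical; on real hardware the trap is raised by the CPU's privilege mechanism, not by a Python exception.

```python
# Toy trap-and-emulate simulation. All names here are illustrative
# assumptions; real traps come from the CPU's privilege checks.

PRIVILEGED = {"IO_WRITE", "HALT"}


class PrivilegeTrap(Exception):
    """Raised by the simulated CPU when user-mode code issues a privileged instruction."""

    def __init__(self, instr):
        super().__init__(instr[0])
        self.instr = instr


def cpu_execute(instr, regs, user_mode=True):
    """Execute one instruction directly, or trap if it is privileged in user mode."""
    op = instr[0]
    if op in PRIVILEGED and user_mode:
        raise PrivilegeTrap(instr)
    if op == "ADD":
        _, dst, value = instr
        regs[dst] = regs.get(dst, 0) + value


class VMM:
    def __init__(self):
        self.device_log = []  # stands in for the real I/O device
        self.traps = 0

    def run_guest(self, code):
        regs = {}
        for instr in code:
            try:
                cpu_execute(instr, regs)  # direct execution of unprivileged code
            except PrivilegeTrap as trap:
                self.traps += 1
                if not self.emulate(trap.instr, regs):
                    break
        return regs

    def emulate(self, instr, regs):
        """Perform the privileged operation for the guest; return False to stop it."""
        if instr[0] == "IO_WRITE":
            self.device_log.append(regs.get(instr[1], 0))
            return True
        return False  # HALT: stop this guest only
```

Running a guest program such as `[("ADD", "r0", 5), ("IO_WRITE", "r0"), ("HALT",)]` through `VMM().run_guest(...)` shows the split: the ADD executes directly, while the two privileged instructions are handled by the VMM.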
Virtualization versus Emulation Virtualization presents multiple copies of the same hardware system. –Direct execution of code on the hardware Emulation presents a model of another hardware system –Instructions are “emulated” in software – much slower than virtualization –Example: Microsoft’s Virtual PC for Mac emulated x86 hardware on non-x86 (PowerPC) Macs; it was used on Mac hardware until Apple adopted Intel chips
Full Virtualization versus Paravirtualization Full virtualization: each virtual machine runs on an exact copy of the actual hardware. Paravirtualization: each virtual machine runs on a slightly modified copy of the actual hardware –Because some aspects of the hardware can’t be virtualized (see examples later) –To present a simpler interface; improve performance.
History - Why VMM’s? Early computers were large (mainframes) and expensive VMM approach allowed the machine to be safely multiplexed among many different applications An alternative to multiprogramming
Virtual Machines - History Early example: the IBM 370 –VM/370 is the virtual machine monitor –As each user logs on, a new “virtual machine” is created –CMS, a single-user, interactive OS was commonly run as the OS on each VM Separation of powers: –Virtual machine/guest OS interacts with user applications –Virtual machine monitor manages hardware resources – compare to exokernel concept
History – 1980s & 1990s As hardware got cheaper and operating systems became better equipped to handle multitasking, the original motivation went away. Hardware platforms gradually eliminated hardware support for virtualization. And then …
History – late 90s Massively parallel processors (MPPs) were developed during the 1990s; they were hard to program and did not support existing operating systems Researchers at Stanford used virtualization to make MPPs look more like traditional machines Other research groups explored different approaches to VMs Result: today, virtual machines are very common, although the MPPs of the 90s have been mostly replaced by clusters – and in some areas MPP is now used to refer to multicore chips. (Figure: Hitachi MPP)
Example Virtual Machine Systems VMware: commercial products, derived from research done at Stanford Xen: open source, Cambridge University, widely used in research and academia; xen.org Denali: University of Washington, focused on support for Internet services –Never commercialized
VMware VMware is a publicly held company founded by Stanford developers Product lines: –Desktop: a range of products; advertised as a way for corporations to migrate and upgrade operating systems from a centralized IT center –VMware vSphere hypervisor is a “bare-metal hypervisor” that supports server consolidation –VMware also virtualizes datacenters, networks and cloud applications (with VMware vSphere and vCloud suite)
Xen Open-source VM system for x86, Itanium, ARM & others Originated at Cambridge University Computer Lab Now supported as an open-source product that has desktop, server, and cloud capabilities (Amazon uses it for its cloud services.) Designed to support execution of Linux, other Unix-like systems (Solaris, BSD), Windows OS’s simultaneously on the same platform Objective of original project: efficient hosting of up to 100 virtual machines
Hyper-V Hyper-V is Microsoft’s server virtualization software: –Each virtual machine (user program + guest OS) is encapsulated in a partition supported by the VMM –Three execution modes: Ring 0, 1, 2 –Requires special hardware to support virtualization.
Denali Research project – U of Washington –Time frame ~ Problem addressed: hosting Internet services economically Goal: to allow new, untrusted, services to be hosted on third-party servers. –Protection provided by VM concept lets servers safely host multiple different services. –Encapsulation lets services be swapped in and out of memory easily so multiple services can share one machine
Reasons for Adopting VMMs Flexibility in choice of operating system Encapsulation: of an operating system, (virtual) computer system, and one or more applications into a single unit Isolation/Security: supported by encapsulation; systems compromised by internal failure or external attack are isolated and their failure doesn’t affect other VMs.
OS Flexibility Support several operating systems at the same time on a single hardware platform Ability to experiment with new operating systems, or modifications of existing systems, while maintaining backward compatibility with existing systems. Hardware can change faster than software – now you can run an existing application and the OS that supports it on a new computer, thanks to the VMM layer.
Encapsulation Conventionally, servers ran on dedicated machines. –Protects against another server/application crashing the OS –But … wasteful of hardware resources Encapsulation means that the complete state of a given VM can be saved to one or a few files – similar to checkpointing an application. Furthermore, the state of one VM is totally separate from the state of any other VM. This is enforced by the VMM’s resource allocation policies.
Isolation Virtual machines are as separate from each other (isolated) as if they actually were separate computers. Applications in a VM are protected from faults in other VMs, in part because of encapsulation, and because the VMM controls resource allocation and usage by the guest OS’s –Viruses, buggy applications, other problems that cause crashes or corrupt the OS they run on will not affect other VMs
Virtualization in Distributed Systems Rosenblum and Garfinkel point out that encapsulation supports the portability of virtual machines, which in turn means it is easy and safe to move (or replicate) servers –This supports load balancing and maintenance Or, multiple services can safely share a single computer thanks to encapsulation & isolation. –Since many services aren’t frequently used this can be a great cost saver.
Desirable Qualities A good VMM –Doesn’t require applications to be modified –Doesn’t severely affect performance –Is not complex/error prone
Implementation Issues Virtualize CPU –Guest OS runs as if it is executing directly on the hardware CPU, but it isn’t Virtualize memory –Guest OS thinks it is managing memory directly, but it isn’t Paravirtualization versus binary translation Hardware-assisted virtualization
CPU Virtualization Basic technique: direct execution –As long as it is executing unprivileged instructions the virtual machine (guest OS + applications) executes hardware instructions directly. Note that in emulation direct execution isn’t possible since applications & the OS think they are running on a different ISA. –If the guest OS tries to execute a privileged instruction the CPU traps to the VMM which executes the privileged operation. VMM runs in privileged (kernel) mode, guest OS runs in user mode.
Example: Disable Interrupts  If a guest OS tries to disable interrupts, the instruction is trapped by the VMM which makes a note that interrupts are disabled for that virtual machine only. If interrupts arrive for the VM that disabled them, they are buffered at the VMM layer until the guest OS enables interrupts. Other interrupts are directed to VMs that have not disabled them.
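The per-VM interrupt flag described above can be sketched as a small Python model. The class and method names (`VirtualCPU`, `InterruptVMM`, `route_interrupt`) are assumptions made for this illustration and do not come from any real VMM.

```python
from collections import deque


class VirtualCPU:
    """Per-VM interrupt state kept by the VMM (illustrative model only)."""

    def __init__(self, name):
        self.name = name
        self.interrupts_enabled = True
        self.pending = deque()  # interrupts buffered while the guest has them disabled
        self.delivered = []     # interrupts the guest OS has actually received


class InterruptVMM:
    def trap_disable_interrupts(self, vcpu):
        # Only the virtual flag changes; the physical CPU keeps interrupts on.
        vcpu.interrupts_enabled = False

    def trap_enable_interrupts(self, vcpu):
        vcpu.interrupts_enabled = True
        while vcpu.pending:  # flush everything buffered, in arrival order
            vcpu.delivered.append(vcpu.pending.popleft())

    def route_interrupt(self, vcpu, irq):
        if vcpu.interrupts_enabled:
            vcpu.delivered.append(irq)
        else:
            vcpu.pending.append(irq)
```

One VM disabling interrupts does not affect another: each `VirtualCPU` carries its own flag and its own buffer.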
Direct Execution Not Always Possible Many CPU architectures, esp. the original x86, were not designed for virtualization. Example: POPF (pop CPU flags from stack) –If executed in user mode, no trap: the attempt to change privileged flags such as the interrupt-enable flag is silently ignored by the hardware –In this case, direct execution fails: the guest OS assumes the flags have been updated, but they haven’t been, and the VMM isn’t notified.
Two Ways to Handle Non-virtualizable Instructions Paravirtualization –Xen, Denali Binary Translation –VMware Both use the same basic approach: catch non-virtualizable instructions and emulate them in software at the VMM level. –Difference: when they are detected
Paravirtualization Rewrite portions of the guest OS to replace non-virtualizable instructions with a trap to the VMM, which executes or emulates the instruction on behalf of the guest OS –e.g., remove POPFs; substitute a call to the VMM Paravirtualization affects the guest OS, but not applications that run on it – the API is unchanged Paravirtualization is also used sometimes to replace inefficient operations with more efficient ones.
Dynamic Binary Translation Dynamic binary translation looks at a short sequence of guest binary code, translates it, and caches the resulting sequence (see http://en.wikipedia.org/wiki/Binary_translation). –Similar to JIT compilers. During this process VMware’s DBT replaces non-virtualizable instructions with equivalent code that can be virtualized. Compare to static binary translation, which translates an entire executable ahead of time, before it runs.
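The translate-and-cache idea can be sketched in a few lines of Python. This is a toy model, not VMware's translator: instructions are plain strings, "POPF" stands for any non-virtualizable instruction, and `VMM_EMULATE_POPF` is a hypothetical safe replacement that would call into the VMM.

```python
# Toy dynamic binary translator. Blocks are lists of instruction mnemonics;
# the replacement table and class below are illustrative assumptions.

NON_VIRTUALIZABLE = {"POPF": "VMM_EMULATE_POPF"}


class Translator:
    def __init__(self):
        self.cache = {}        # block start address -> translated block
        self.translations = 0  # number of blocks actually translated

    def translate(self, block):
        """Rewrite one block, swapping non-virtualizable instructions for safe ones."""
        self.translations += 1
        return [NON_VIRTUALIZABLE.get(instr, instr) for instr in block]

    def fetch(self, address, guest_memory):
        """Return the translated block for `address`, translating only on first use."""
        if address not in self.cache:
            self.cache[address] = self.translate(guest_memory[address])
        return self.cache[address]
```

The cache is what keeps the overhead tolerable: a block pays the translation cost the first time it is fetched and runs from the cached copy afterwards.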
Comparison Paravirtualization changes the source code of a guest OS; dynamic binary translation generates modified binary code only if needed. Paravirtualization is more efficient, but requires modification to the guest OS –Paravirtualization also allows more efficient interfaces, in some cases Binary translation is backward-compatible but has some extra overhead of run-time translation the first time an instruction is encountered.
Hardware-assisted Virtualization AMD-V and Intel VT are architecture extensions to support virtualization on AMD and Intel hardware. –New execution modes Allows guest OS to run in a different “ring” than user programs, and VMM in yet a higher privileged mode –Flags show which mode the CPU is currently running in –Essentially, the trap and emulate mode used in paravirtualization or binary translation is now done in hardware. Does away with need to modify guest OS; is faster than binary translation.
Memory Virtualization VMM maintains a shadow page table for each virtual machine. When the guest OS makes an entry in its own page table, the VMM makes the same entry in the shadow table. Shadow page table points to actual page frame –The hardware MMU uses the shadow page table when it translates virtual addresses.
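A minimal sketch of how a shadow page table might be kept in step with the guest's own table. The `ShadowPager` class, its method names, and the 4096-byte page size are assumptions for illustration; the shadow entry holds the actual machine frame, while the guest's entry holds the frame number the guest believes it owns.

```python
class ShadowPager:
    """Shadow page table for one VM (illustrative model only)."""

    def __init__(self, guest_to_machine):
        self.guest_to_machine = guest_to_machine  # VMM map: guest-physical frame -> machine frame
        self.guest_page_table = {}                # guest's view: virtual page -> guest-physical frame
        self.shadow_page_table = {}               # MMU's view: virtual page -> machine frame

    def guest_map(self, virtual_page, guest_frame):
        """Called when the guest OS writes a page-table entry."""
        self.guest_page_table[virtual_page] = guest_frame
        # The VMM mirrors the entry, substituting the real machine frame.
        self.shadow_page_table[virtual_page] = self.guest_to_machine[guest_frame]

    def mmu_translate(self, virtual_address, page_size=4096):
        """What the hardware MMU does: it consults only the shadow table."""
        page, offset = divmod(virtual_address, page_size)
        return self.shadow_page_table[page] * page_size + offset
```

For example, if the VMM backs guest frame 1 with machine frame 3, a guest mapping of virtual page 5 to frame 1 makes the MMU resolve addresses in page 5 to machine frame 3.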
Challenges Let the guest OS decide which of its pages to swap out VMware’s ESX Server used the concept of a balloon process, running inside the guest OS. When the VMM wants to swap out pages from a given VM it notifies the balloon process to allocate more memory to itself. The guest OS must then “page out” unused portions of other processes to its virtual disk. The VMM now knows which pages the guest OS thinks it can do without.
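The balloon mechanism can be sketched with a toy guest model in Python. The `Guest` class, its page lists, and the oldest-first replacement policy are assumptions for illustration; a real guest OS applies its own paging policy.

```python
class Guest:
    """Toy guest OS memory state used to illustrate ballooning."""

    def __init__(self, num_pages):
        self.free = list(range(num_pages))  # free guest-physical pages
        self.resident = []                  # pages held by ordinary processes, oldest first
        self.paged_out = []                 # pages written to the guest's virtual disk
        self.balloon = []                   # pages pinned by the balloon process

    def touch(self, count):
        """Ordinary processes allocate `count` pages."""
        for _ in range(count):
            self.resident.append(self.free.pop())

    def inflate_balloon(self, count):
        """Balloon allocates `count` pages; the guest pages out others if it must."""
        for _ in range(count):
            if not self.free:
                victim = self.resident.pop(0)  # guest chooses the victim (oldest here)
                self.paged_out.append(victim)
                self.free.append(victim)
            self.balloon.append(self.free.pop())
        return list(self.balloon)  # pages the VMM may now reclaim
```

The point of the design is that the VMM never picks the victim pages itself: the guest, which knows its own working set, makes that choice, and the VMM only collects the pages the balloon ends up holding.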
Other Virtual Memory Challenges To share or not to share pages across VM boundaries: –VMware tracks duplicate pages in different virtual machines & stores only one copy of the actual page with pointers from the shadow page tables in sharing processes. –Copy-on-write policy Xen focuses on total isolation of each virtual machine, which means no sharing
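A sketch of duplicate-page sharing with copy-on-write. Matching pages by a SHA-256 content hash is an assumption made for this example (the slide only says duplicates are tracked), and the `SharedMemory` class and its fields are hypothetical.

```python
import hashlib


class SharedMemory:
    """Toy machine memory that stores identical pages once (illustrative only)."""

    def __init__(self):
        self.frames = {}    # frame id -> page contents (bytes)
        self.by_hash = {}   # content hash -> frame id
        self.refcount = {}  # frame id -> number of (vm, page) mappings
        self.mappings = {}  # (vm, page) -> frame id
        self.next_frame = 0

    def _alloc(self, data):
        frame = self.next_frame
        self.next_frame += 1
        self.frames[frame] = data
        self.refcount[frame] = 0
        return frame

    def map_page(self, vm, page, data):
        """Map a new page, reusing an existing frame if the contents match."""
        digest = hashlib.sha256(data).hexdigest()
        frame = self.by_hash.get(digest)
        # Compare contents too: a hash entry may be stale after an in-place write.
        if frame is None or self.frames[frame] != data:
            frame = self._alloc(data)
            self.by_hash[digest] = frame
        self.refcount[frame] += 1
        self.mappings[(vm, page)] = frame

    def write_page(self, vm, page, data):
        """Write to a page, copying first if the frame is shared."""
        frame = self.mappings[(vm, page)]
        if self.refcount[frame] > 1:
            self.refcount[frame] -= 1  # leave the shared copy to the other VMs
            frame = self._alloc(data)  # private copy for the writer
            self.refcount[frame] = 1
            self.mappings[(vm, page)] = frame
        else:
            self.frames[frame] = data  # sole owner: update in place
```

Two VMs that map identical pages end up pointing at one frame; the first write by either VM gives the writer a private copy and leaves the other VM's data untouched.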
Migrating Virtual Machines A virtual machine encapsulates an entire computing environment. If properly implemented, the VM provides strong mobility since local resources may be part of the migrated environment “Freeze” an environment (temporarily stop executing processes) & move entire state to another machine –e.g. In a server cluster, migrated environments support maintenance activities such as replacing a machine.
Migration of Virtual Machines Example: real-time (“live”) migration of a virtualized operating system with all its running services among machines in a server cluster on a local area network. Presented in the paper “Live Migration of Virtual Machines”, Christopher Clark, et al. Problems: –Migrating the memory image (page tables, in-memory pages, etc.) –Migrating bindings to local resources
Memory Migration in Virtual Machines Three possible approaches –Pre-copy: push memory pages to the new machine and resend the ones that are later modified during the migration process. –Stop-and-copy: pause the current virtual machine; migrate memory, and start the new virtual machine. –Let the new virtual machine pull in new pages as needed, using demand paging Clark et al. use a combination of pre-copy and stop-and-copy; claim downtimes of 200ms or less.
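The combination of pre-copy and stop-and-copy can be sketched as a small simulation. This is an illustrative model, not the algorithm from the Clark et al. paper: the function name, the `dirty_rounds` input (pages the guest writes while each round is in flight), and the `stop_threshold` parameter are all assumptions.

```python
def precopy_migrate(source_pages, dirty_rounds, stop_threshold=1):
    """Simulate iterative pre-copy followed by a final stop-and-copy.

    source_pages: dict page -> contents on the source host (updated as the guest runs)
    dirty_rounds: list of dicts; dirty_rounds[i] holds the pages the running
                  guest writes while round i is being sent
    Returns (destination_pages, rounds_sent, pages_sent_while_paused).
    """
    destination = {}
    to_send = set(source_pages)  # first round: every page
    rounds = 0
    while len(to_send) > stop_threshold and rounds < len(dirty_rounds):
        for page in to_send:     # push pages while the guest keeps running
            destination[page] = source_pages[page]
        writes = dirty_rounds[rounds]
        source_pages.update(writes)  # guest dirties some pages meanwhile
        to_send = set(writes)        # only those need resending
        rounds += 1
    # Stop-and-copy: the guest is paused, so the remaining pages cannot change.
    for page in to_send:
        destination[page] = source_pages[page]
    return destination, rounds, len(to_send)
```

Downtime depends only on the last step: the fewer pages still dirty when the guest is paused, the shorter the pause.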
Looking Ahead … How useful will virtual machine technology be for multicore processors and cloud computing???
Summary & Review (1) A virtual machine is a copy of a real machine –Applications don’t know if they are running on real or virtual hardware, other than having fewer resources. A virtual machine is isolated: if several VMs execute on the same hardware they do not interact with each other directly or indirectly. The performance of a virtual machine should be about the same as that of the actual hardware. –So most instructions should be directly executed by the hardware as opposed to being emulated.
Summary and Review (2) Process virtual machines (JVM) virtualize at a higher level, do not necessarily even correspond to real machines. System virtual machines virtualize at the level of the hardware-software interface Variations of classic system virtual machine: –Hosted (run on another operating system) –Emulation (provides virtual hardware and OS, as in Virtual PC) – not really a virtual machine
Summary & Review (3) Virtual Machine Monitor (hypervisor) runs on a bare machine, implements one or more virtual machines. The VMM allocates resources and controls resource sharing among all VMs Operation: –Each VM runs a guest OS –VMM runs in kernel mode –Guest OS and applications run in user mode –Privileged instructions trap to the VMM –Hypercalls (the VMM equivalent of system calls) may be used by a guest OS to request service from the VMM
Summary & Review (4) Benefits of VM technology for non-hosted (traditional, or native) VMs –Isolation and security Multiple servers on a single machine –Encapsulation of an entire environment: OS and application for the purpose of Migration Checkpointing Supporting system maintenance –Running several OS’s concurrently Older versions, experimental systems, Linux & Windows, … For hosted VMs, the major advantage is the ability to run two or more OS’s at once
Reading for Next Class Chapter 4 – Communication 9/25: First test Covers: Everything through virtual machines.
Xen – Intro Claim: virtualization is better than multitasking as a way to share hardware. –CPU requests, memory demand, disk accesses, other resource needs of one process impact the performance of other processes –Xen solution: multiplex resources at the OS level instead of the process level.
Xen (Figure: Xen implementation of VMM. Labels: Hardware layer; Xen; VM1, VM2, VM3; Domain 0 Guest with Application; Domain U Guest OS2 with Application; Domain U Guest OS3.) Domain 0 guest has privileged access to the Xen hypervisor and can be used by the system administrator to manage the system. Separation of powers: Xen only has to worry about multiplexing hardware to multiple guests
Xen Design Principles Virtualize all architecture features that are required by standard binary interfaces. –To support existing applications without modification Support multi-application guest operating systems Use paravirtualization to get improved performance and resource isolation
Xen HVM (Hardware Virtual Machine) Some versions of Xen are designed to run on Intel VT and AMD-V chips with special virtualizing hardware. Able to run unmodified (no paravirtualization) operating systems. This implementation is known as a hardware virtual machine. –Windows requires an HVM environment; Linux, Solaris, and BSD systems don’t.
Xen Memory Management Unlike VMware and Denali, Xen expects the guest OS’s to manage their own hardware page tables. To support this, each VM receives a fixed allocation of page frames which it can use as it wishes. New page tables must be registered with Xen and updates must be validated by Xen. –The page tables are made write-protected, so the guest cannot change them without going through Xen.
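The validation step can be sketched as follows. This is a simplified model, not Xen's actual interface: the `PageTableValidator` class and its methods are hypothetical, and the only rule checked is that a guest may map nothing but frames from its own allocation.

```python
class PageTableValidator:
    """Toy model of a hypervisor validating guest page-table updates."""

    def __init__(self):
        self.owned_frames = {}  # vm -> set of machine frames allocated to it
        self.page_tables = {}   # vm -> {virtual page: machine frame}

    def allocate(self, vm, frames):
        """Give a VM its fixed allocation of page frames."""
        self.owned_frames.setdefault(vm, set()).update(frames)
        self.page_tables.setdefault(vm, {})

    def update(self, vm, virtual_page, frame):
        """Guest request to change one entry; applied only if it passes validation."""
        if frame not in self.owned_frames.get(vm, set()):
            return False  # would map memory belonging to someone else
        self.page_tables[vm][virtual_page] = frame
        return True
```

Because the guest's page tables are write-protected, every change has to pass through a check of this kind, which is what keeps one VM from mapping another VM's memory.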
Xen CPU Management Xen is designed for the x86 architecture which supports 4 rings, or privilege levels. –Traditional OS’s execute in ring 0 (most privileged) and applications in ring 3 (least) –Xen executes in ring 0 (only level that can execute privileged instructions) –Guest OS runs in ring 1, which isolates it from applications. –Note: since this paper was written there have been some modifications to x86 to better support virtualization.
Xen CPU Management Privileged instructions must be validated (is it OK?) and executed by Xen Exceptions (page faults, system calls, other traps to OS) are handled as much as possible by the guest OS. –Exception handlers are registered & validated with Xen –System calls stop at the guest OS; Xen is involved only if the OS executes a privileged instruction.
Denali Isolation Kernel Authors define Denali as a small-kernel operating system with similarities to microkernels and exokernels –Once thought to be inefficient, modern hardware has improved performance of this kernel architecture They expected Denali to support multiple (up to 10,000) untrusted applications that are virtually independent.
Isolation Kernel Design Principles Expose low-level resources rather than high-level abstractions for greater security –Avoid “layer-below” attacks Prevent direct sharing by exposing only private, virtualized namespaces –Keeps one VM from “… even naming the resources of another VM, let alone modifying them”. 
Isolation Kernel Design Principles Design for scalability –Be able to support a work load that has a few popular services and many that are accessed infrequently. Modify the virtualized architecture for simplicity, scale and performance. –Paravirtualization for reasons other than necessity. –They do not believe isolation depends on providing an exact copy of hardware so they provide a hardware version that is modified to be more efficient and secure.
Zipf’s Law Given a table that ranks items by their frequency of occurrence, Zipf’s law states that an item’s frequency is roughly inversely proportional to its rank: the most frequent item occurs about twice as often as the second most frequent item, three times as often as the third, and so on. Zipf made this observation about words in a natural language. Here, we’re talking about accesses to various web services.
Statistically Multiplexing Services Studies showed that the popularity of most network services (server requests, document searches, etc) followed a Zipfian distribution. Implications: –Most requests go to a small number of services –Most services aren’t popular, but the total number of requests for unpopular services is non-trivial –With isolation it can be safe and efficient to run hundreds or even thousands of services concurrently on a single platform.
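The implications above can be checked numerically with a short sketch. It assumes an idealized Zipfian distribution in which the request frequency of the service at rank r is proportional to 1/r; the function names and the figure of 1000 services are chosen only for illustration.

```python
def zipf_shares(num_services):
    """Fraction of all requests going to each service, ranked 1..n.

    Assumes an idealized Zipf distribution: frequency proportional to 1/rank.
    """
    weights = [1.0 / rank for rank in range(1, num_services + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def top_share(num_services, top_k):
    """Combined share of requests received by the top_k most popular services."""
    return sum(zipf_shares(num_services)[:top_k])
```

Under this assumption, with 1000 services the 10 most popular receive about 39% of all requests, while the remaining 990 together still account for about 61%: a long tail of rarely used services whose total load is far from negligible.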
Proof-of-concept Denali is the virtualized architecture Yakima: a VMM which was designed to run in ring 0 on x86 hardware. Ilwaco: a simple prototype guest OS which provides a full set of abstractions to its applications while hiding the Denali architecture Reasonable performance in tests –1.4 μsec to 9 μsec context switch time, depending on number of VMs –End-to-end run times of network apps were “comparable” to those of a traditional operating system.
Conclusion The Denali research project terminated in the mid-2000’s. The Denali research group was right in supposing that virtual machine technology would be most useful today to enable efficient use of server hardware.