Operating System Support for Virtual Machines


1 Operating System Support for Virtual Machines
Samuel King, George Dunlap, Peter Chen

2 Content
Introduction; VM and VMM; Type II VMM; UMLinux and UML; Bottlenecks & Solutions; Performance; Conclusion; Evaluation

3 Virtual Machines
Developed in the 1960s
Multiple VMs on a single machine
Uses: test applications; program and debug an OS; simulate networks; isolate applications; monitor for intrusions; inject faults; resource sharing/hosting

4 Virtual Machine Monitors
Layer that emulates hardware for an Operating System The simulated hardware is the Virtual Machine

5 Types of VMMs
Type I VMM: efficient; low overhead
Type II VMM: VMM runs as a process on the host OS (HOS); simple VMM; the VMM mediates between the host OS and the guest machine

6 Examples
Type I: IBM VM/370, VMware ESX Server, Xen
Type II: SimOS, User-Mode Linux, UMLinux
Hybrid (physical hardware + host I/O): VMware Workstation, VirtualPC

7 Type II compared to Type I
Advantages
Designers: OS abstractions ~ VM; OS signals ~ VM interrupts; virtual timer -> timer interrupt; disable interrupts -> disable signals, using a flag to defer signals
Users: watch and debug the VM execution from the host
Disadvantages
Performance: 10x or more slower

8 Sample VMM Implementations

9 Memory Exception Example
Existing OS abstractions and signals can be used in the VM
A guest application attempts to access data that it doesn’t have access to
An invalid memory operation occurs and a SIGSEGV signal is raised
The SIGSEGV handler makes the data available
The data is brought in, transparent to the user

10 OS Abstraction Code
Shows how signals such as SIGSEGV can be used to implement a basic interrupt

#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096 /* assumed page size; the slide does not define it */

/* Fault handler: re-enable access to the faulting page so the access is retried.
   printf is not async-signal-safe; it is used here for demonstration only. */
void SIGSEGV_handler (int signo, siginfo_t *info, void *context)
{
    printf ("ACCESSOR: segfault at address: %p\n", info->si_addr);
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGSEGV);
    sigprocmask(SIG_UNBLOCK, &mask, 0);
    printf("FIXER: now fixing %p...\n", info->si_addr);
    /* Round the faulting address down to its page boundary. */
    char *p = (char*)((uintptr_t)info->si_addr & ~(uintptr_t)(PAGE_SIZE-1));
    int rc = mprotect(p, PAGE_SIZE, PROT_READ | PROT_WRITE);
    if (rc == -1) {
        perror("mprotect PROT_READ | PROT_WRITE");
        exit(EXIT_FAILURE);
    }
    printf("ACCESSOR: trying again...\n");
}

int main (int ac, char *av[])
{
    struct sigaction sa;
    int rc;
    char *p = calloc(16384, 4);
    /* Round up to the first page boundary inside the allocation. */
    int *buffer = (int*)(((uintptr_t)p + PAGE_SIZE - 1) & ~(uintptr_t)(PAGE_SIZE-1));
    rc = mprotect(buffer, PAGE_SIZE, PROT_NONE);
    if (rc == -1) {
        perror("mprotect PROT_NONE");
        exit(EXIT_FAILURE);
    }
    sa.sa_sigaction = SIGSEGV_handler;
    sigemptyset (&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    if (sigaction (SIGSEGV, &sa, NULL) == -1) {
        printf ("errno set to: %d\n", errno);
        printf ("Error registering SIGSEGV sigaction.\n");
        exit (EXIT_FAILURE);
    }
    printf("\nACCESSOR: trying to access %p\n", buffer);
    *(int*)buffer = 42; /* faults once; the handler unprotects the page */
    printf("ACCESSOR: wrote %d\n", *(int*)buffer);
    if (*buffer == 42)
        printf("MAIN: read %d: success!\n", *(int*)buffer);
    return EXIT_SUCCESS;
}

OUTPUT
ACCESSOR: trying to access 0x89ab000
ACCESSOR: segfault at address: 0x89ab000
FIXER: now fixing 0x89ab000...
ACCESSOR: trying again...
ACCESSOR: wrote 42
MAIN: read 42: success!

11 Second Classification
VMM interface identical to hardware: IBM VM/370, VMware Server & Workstation
VMM added OS modifications:
Signal handlers: SimOS, UML, UMLinux
Virtualization drivers: Disco, VAX VMM
Microkernels & JVM
Xen

12 UMLinux vs User-Mode Linux (UML)
UMLinux: single machine process for all guest apps; guest apps communicate via the guest OS; faster system calls, network transfers, web server
UML: separate machine process for each app; guest apps communicate via shared memory on the host; faster context switches, kernel building
UML was used in project 1; it is more popular and faster in general

13 UMLinux Performance - guest OS must simulate crossing the top red line. System call to a library – vertical move Switch applications – horizontal move

14 User-Mode Linux server… Notice the separate VM instances and separation of guest applications

15 Goal Make Type II VMs useable in production
Reduce the overhead of Type II to that of Type I
Done through extensions to the host OS
Performance within 2x of standalone

16 Three Switching Bottlenecks
Three bottlenecks in Type II VMMs:
1. A high number of context switches to move from guest app to guest OS through the VMM
2. Ensuring address protection when switching between guest user and guest kernel space
3. Numerous memory-mapping operations when switching guest applications
The VMM has to capture each switch and manage it through host OS and guest OS calls

17 1. Guest App. to Kernel Switching
VMM uses ptrace to catch system calls and signals from the guest-machine process.
Each interception makes the host OS context-switch between the VMM process and the guest-machine process, so the number of context switches is high.
ptrace gives the VMM full control of the system call
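To make the ptrace cost concrete, here is a minimal Linux sketch of a tracer that stops a child at every system-call boundary. It is an illustration only, not the UMLinux VMM: the function name `count_syscall_stops` is hypothetical, the child merely issues one getppid call, and the tracer only counts stops instead of rewriting the call. Each counted stop corresponds to a host context switch into the tracer and back.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of syscall stops the tracer observed, or -1 on error. */
int count_syscall_stops(void)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        /* Child plays the guest-machine process: request tracing, then
           stop so the parent can take control before any traced call. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        getppid(); /* a system call for the tracer to intercept */
        _exit(0);
    }

    int status;
    int stops = 0;

    /* Wait for the child's initial SIGSTOP. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    for (;;) {
        /* Resume the child until its next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) == -1) {
            kill(child, SIGKILL);
            waitpid(child, &status, 0);
            return -1;
        }
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP)
            stops++; /* one tracer wake-up per syscall entry or exit */
    }
    return stops;
}
```

Even this trivial child produces several stops, since every system call is reported once on entry and once on return.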

18 1. System Call Control Transfer
Guest application is transferring control to the guest OS (GOS)
1. guest application issues system call; intercepted by VMM process via ptrace
2. VMM process changes system call to no-op (getpid)
3. getpid returns; intercepted by VMM process
4. VMM process sends SIGUSR1 signal to guest SIGUSR1 handler
5. guest SIGUSR1 handler calls mmap to allow access to guest kernel data; intercepted by VMM process
6. VMM process allows mmap to pass through to make the guest kernel data available (2nd bottleneck)
7. mmap returns to VMM process
8. VMM process returns to guest SIGUSR1 handler, which handles the guest application’s system call
Ideally, steps 6 & 7 should have been filtered through automatically
Figure 4: Guest application system call. This picture shows the steps UMLinux takes to transfer control to the guest operating system when a guest application process issues a system call. The mmap call in the SIGUSR1 handler must reside in guest user space. For security, the rest of the SIGUSR1 handler should reside in guest kernel space. The current UMLinux implementation includes an extra section of trampoline code to issue the mmap; this trampoline code is started by manipulating the guest machine process’s context and finishes by causing a breakpoint to the VMM process; the VMM process then transfers control back to the guest-machine process by sending a SIGUSR1.

19 1. Optimization
Move the VMM process functionality into a VMM loadable kernel module
Modify the host OS to give the VMM control over the guest-machine process’s system calls and signals

20 1. Optimization Diagram
1. guest application issues system call; intercepted by VMM kernel module (no context switching; two mode switches, as the red line is crossed twice, entering and leaving)
2. VMM kernel module calls mmap to allow access to guest kernel data
3. mmap returns to VMM kernel module (steps 2 & 3 perform the mmap call)
4. VMM kernel module sends SIGUSR1 to guest SIGUSR1 handler
Figure 5: Guest application system call with VMM kernel module. This picture shows the steps taken by UMLinux with a VMM kernel module to transfer control to the guest operating system when a guest application issues a system call.

21 2. Address Protection
Guest-machine process switches between guest user and guest kernel mode
Has to protect access to kernel addresses when switching to user mode
Has to enable access to kernel addresses when switching to kernel mode
This creates a large number of mmap calls, reprogramming the page table to switch between R/W and inaccessible

22 x86 Segmentation and Paging

23 2. Protection using the Current Privilege Level
Ring 0 – used for Host Kernel
Ring 1 – … VM
Ring 2 – …
Ring 3 – user level
The supervisor-only bit in the page table prevents code running in CPU privilege ring 3 from accessing the host operating system’s data.
Linux implements protection using the concept of rings instead of segments

24 Standalone Address Protection
Linux incurs little overhead when trapping to the kernel
Segments allow access to all addresses (1-to-1 mapping, logical to linear address)
The supervisor bit in each page-table entry restricts ring-3 processes from accessing kernel code and data
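The supervisor-bit check can be modeled in a few lines of C. The bit positions below are the standard x86 page-table-entry flags (present, read/write, user/supervisor); the function name `page_access_allowed` is hypothetical, and the model assumes write protection also applies to supervisor code (CR0.WP set).

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT  (1u << 0) /* page is mapped */
#define PTE_WRITABLE (1u << 1) /* writes allowed */
#define PTE_USER     (1u << 2) /* clear means supervisor-only */

/* Would the paging unit allow this access?
   cpl is the current privilege level (ring 0..3). */
bool page_access_allowed(uint32_t pte, int cpl, bool is_write)
{
    if (!(pte & PTE_PRESENT))
        return false;
    /* Only ring 3 counts as "user" for paging; rings 0-2 are supervisor. */
    if (cpl == 3 && !(pte & PTE_USER))
        return false;
    if (is_write && !(pte & PTE_WRITABLE))
        return false;
    return true;
}
```

Because rings 0 through 2 all count as supervisor for paging, a guest kernel running in ring 1 can reach supervisor-only pages that ring-3 guest applications cannot.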

25 2. Segmentation Bounds for Address Protection Optimization
UMLinux calls mmap, munmap, and mprotect to simulate the switching on the guest OS
[Figure: address-space layouts for Linux and Solution 1]
Bound guest user mode to 0x segment
Allow guest kernel access to user range
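A segment bound check of this kind can be modeled as follows. This is a sketch of the general x86 mechanism only: `struct segment` and `segment_translate` are hypothetical names, and any limit passed in is illustrative, since the slide's actual boundary address is truncated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of an x86 segment descriptor. */
struct segment {
    uint32_t base;  /* added to every offset */
    uint32_t limit; /* highest valid offset within the segment */
};

/* Translate a logical offset to a linear address.
   Returns false where the hardware would raise a protection fault. */
bool segment_translate(const struct segment *seg, uint32_t offset, uint32_t *linear)
{
    if (offset > seg->limit)
        return false;
    *linear = seg->base + offset;
    return true;
}
```

With a bounded segment loaded in guest user mode and an unbounded one in guest kernel mode, switching modes only requires changing segment registers, not remapping pages.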

26 2. Alternate Optimization
Allow guest OS to occupy range from 0x to 0xc
Separate guest kernel and user modes by using the page table’s supervisor-only bit
Stops guest kernel pages from being accessed in ring 3
Runs the guest kernel in ring 1

27 2. Optimization Comparison
[Figure: address-space layouts for Linux, Solution 1, and Solution 2]
Guest kernel can now occupy arbitrary regions instead of only a contiguous block

28 UML tt/skas3/skas0 Modes
[Figure: guest process layout and guest process address space in tt, skas3, and skas0 modes, showing the host, UML kernel code and data, the tracing thread, and process code and data]
UML can separate kernel address spaces to keep processes away from the guest kernel
skas3 requires a kernel patch that adds a new file (/proc/mm); it allows separation of the guest kernel’s and guest process’s address spaces in the host
skas0 does the same thing without a kernel patch; it requires two pages for the SIGSEGV signal and UML code

29 3. Guest Application Switching
Switching guest process address space requires swapping the current mapping between virtual pages and the VM’s physical memory file.
munmap is called for the previous process’s virtual address space
mmap is called for each virtual page in the next process, as needed on page faults
Basically, switching between different regions of the physical memory file.
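The munmap/mmap pattern can be sketched with a temporary file standing in for the VM's physical memory. This is an illustration under stated assumptions, not UMLinux code: `struct guest_machine`, `guest_machine_init`, and `guest_switch_to` are hypothetical names, each guest process owns a single page, and the kernel chooses the mapping address (a real VMM would map at fixed guest virtual addresses).

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

struct guest_machine {
    int    memory_fd; /* file standing in for the VM's physical memory */
    size_t page_size;
    char  *window;    /* page currently mapped for the running guest process */
};

/* Create a memory file holding `pages` physical pages. Returns 0 on success. */
int guest_machine_init(struct guest_machine *vm, int pages)
{
    FILE *f = tmpfile(); /* kept open for the VM's lifetime */
    if (f == NULL)
        return -1;
    vm->memory_fd = fileno(f);
    vm->page_size = (size_t)sysconf(_SC_PAGESIZE);
    vm->window = NULL;
    return ftruncate(vm->memory_fd, (off_t)(vm->page_size * (size_t)pages));
}

/* Switch to the guest process whose data lives in physical page `page`:
   unmap the previous process's page, then map the next one. */
char *guest_switch_to(struct guest_machine *vm, int page)
{
    if (vm->window != NULL)
        munmap(vm->window, vm->page_size);
    void *p = mmap(NULL, vm->page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                   vm->memory_fd, (off_t)page * (off_t)vm->page_size);
    vm->window = (p == MAP_FAILED) ? NULL : (char *)p;
    return vm->window;
}
```

Every switch costs at least one munmap and one mmap on the host; with many pages per process, this is the per-switch overhead that the third optimization removes.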

30 Costs of Switching ( --- and | )
[Figure: Process 1 and Process 2 in user mode, above the kernel in kernel mode]
Context switching will cause a change in the available memory addresses and in the current privilege level.

31 Context Switching
intr_entry: (saves entire CPU state) (switches to kernel stack)
intr_exit: (restores entire CPU state) (switches back to user stack) iret
switch_threads: (in) (saves caller’s state)
switch_threads: (out) (restores caller’s state) (kernel stack switch)
[Figure: Process 1 and Process 2 in user mode, above the kernel in kernel mode]
CS 3204 Fall 2007

32 Costs of Switching ( --- and | )
Horizontal switching (between applications): expensive!
Invalidate the first process’s mapping (unmap)
Validate the second process’s mapping (map)
Vertical switching (to and from OS):
Save the CPU state of the application
Make the kernel’s address space available

33 Process 1 Active in user mode
[Figure: address space up to FFFFFFFF with Process 1 active in user mode. The top 1 GB holds the kernel (kheap, kbss, kdata, kcode); the lower 3 GB holds Process 1’s ustack, udata, and ucode, where access is possible in user mode. Physical frames are marked user (1), user (2), kernel, or free.]

34 Process 1 Active in kernel mode
[Figure: address space up to FFFFFFFF with Process 1 active in kernel mode. The top 1 GB holds the kernel (kheap, kbss, kdata, kcode), where access requires kernel mode; the lower 3 GB holds Process 1’s ustack, udata, and ucode, where access is possible in user mode. Physical frames are marked user (1), user (2), kernel, or free.]

35 Process 2 Active in kernel mode
[Figure: address space up to FFFFFFFF with Process 2 active in kernel mode, after unmapping Process 1 and mapping Process 2. The top 1 GB holds the kernel (kheap, kbss, kdata, kcode), where access requires kernel mode; the lower 3 GB holds Process 2’s ustack, udata, and ucode, where access is possible in user mode. Physical frames are marked user (1), user (2), kernel, or free.]

36 Process 2 Active in user mode
[Figure: address space up to FFFFFFFF with Process 2 active in user mode. The top 1 GB holds the kernel (kheap, kbss, kdata, kcode); the lower 3 GB holds Process 2’s ustack, udata, and ucode, where access is possible in user mode. Physical frames are marked user (1), user (2), kernel, or free.]

37 3. Guest App. Switching Solution
Allow a process to have 1024 different address spaces
Each address space is defined by a set of page tables
Host OS is modified to switch between address space definitions using switchguest
switchguest only has to change the pointer to the current first-level page table
The solution was essentially proposed by UML creator Jeff Dike in “Making Linux Safe for Virtual Machines”
UML uses a similar approach, the /proc/mm patch:
New mm_struct for each process
Entire set of VMAs swapped with each struct switch
The page directory of this mm_struct is loaded directly into the hardware page-table register (cr3)
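The idea that a switch only changes one pointer can be modeled in user space as follows. This is a toy model, not the real host-kernel change: `struct page_table`, `struct host_process`, and `switchguest_model` are hypothetical names, and the real switchguest system call's signature is not given in the slides.

```c
#include <stddef.h>

#define MAX_GUEST_ADDRESS_SPACES 1024

/* Stand-in for a first-level page table (1024 entries on 32-bit x86). */
struct page_table {
    unsigned long entries[1024];
};

/* One host process (the guest machine) with many address-space definitions. */
struct host_process {
    struct page_table *address_spaces[MAX_GUEST_ADDRESS_SPACES];
    struct page_table *current_root; /* models the hardware page-table pointer */
};

/* Select address space `index`. Returns 0 on success, -1 if the index is
   out of range or that slot has no page table. No pages are remapped. */
int switchguest_model(struct host_process *p, size_t index)
{
    if (index >= MAX_GUEST_ADDRESS_SPACES || p->address_spaces[index] == NULL)
        return -1;
    p->current_root = p->address_spaces[index];
    return 0;
}
```

Compared with unmapping and remapping pages on every guest context switch, the cost here is constant regardless of how many pages each guest process has mapped.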

38 switchguest Example
[Figure: the host operating system’s page table pointer moving from guest proc a to guest proc b, below the guest OS, on a switchguest syscall]
switchguest has to change the hardware’s page table pointer to the next guest process’s page table inside the host OS

39 Performance Testing
Do the three solutions bring the performance of a Type II VMM within 2x that of standalone systems?
Test benchmarks: null system calls; switching between guest applications; transferring data; CPU-intensive program; kernel building; web-server performance

40 Testing Setups
Standalone (Host OS)
VMware Workstation (Type I)
UMLinux
UMLinux with optimization 1 (kernel module)
UMLinux with optimizations 1 & 2 (bounded segment)
UMLinux with optimizations 1, 2, & 3 (address spaces)

41 Null System Call
Guest app has to switch to the guest kernel and then back
First optimization – fewer calls needed to switch to the kernel
Second optimization – switching address protections is faster
Standalone and Type I don’t have to go through the host OS and VMM, which makes them faster
Failed: not within 2x

42 Switching Apps (Context Switch)
First optimization – fewer calls needed to switch to the kernel
Second optimization – switching address protections is faster
Third optimization – additional address spaces make switching apps faster
All three optimizations make context switching faster
Notice that Type II can be faster than Type I

43 Network Transfer Appears to hit a limit in transferring data across an Ethernet switch using TCP

44 CPU-Intensive Program (POV-Ray)
Mainly compute-bound Little interaction with the guest kernel Little virtualization overhead

45 Kernel-build
Numerous guest kernel calls
Each call is trapped by the VMM and signaled to the guest kernel
Second optimization: no need to re-map and protect when switching to the kernel
Kernel compile benchmark: 22 million guest memory exceptions; 1.4 million guest system calls

46 Web Server (SPECweb99) Numerous guest kernel calls
Few application switches The overheads for SPECweb99 and kernel-build are higher because they issue more guest kernel calls, each of which must be trapped by the VMM kernel module and reflected back to the guest kernel by sending a signal. (OPT 1)

47 Results Five successful benchmarks brought the performance within 2x standalone. One failed benchmark (null system call)

48 Conclusion from Paper
A Type II VMM (UMLinux) can be optimized to perform similarly to Type I (VMware)
A Type II VMM can perform within 2x of standalone systems in production

49 Recent Work
UMLinux was renamed FAUmachine
Development on FAUmachine continued through 2004 in Germany at the Univ. Erlangen-Nürnberg
Virtually all research on UMLinux/FAUmachine was conducted by the CoVirt & ReVirt Project at Univ. Michigan (usage of VMs for security services)
The CoVirt project now uses various VMs
FAUmachine has fallen behind UML in performance and popularity

50 Evaluation
UMLinux with optimizations, or UML, could be very useful in various commercial and educational situations.
Because UMLinux is slower than standalone, Type I, and other Type II VMMs, it will not become a leading development or run-time platform in practice.
Type II VMMs may overtake Type I VMMs, because they offer similar performance and are easier to design.

