Presentation on theme: "Unmodified Device Driver Reuse and Improved System Dependability via Virtual Machines Joshua Le Vasseur, Volkmar Uhlig, Jan Stoess, Stefan Gotz – OSDI-2004."— Presentation transcript:
Unmodified Device Driver Reuse and Improved System Dependability via Virtual Machines. Joshua LeVasseur, Volkmar Uhlig, Jan Stoess, Stefan Götz (OSDI 2004). Raju Kumar, CS598C: Virtual Machines
Introduction. Device drivers: 70% of Linux 2.4.1 code for IA32. A new OS must either rewrite drivers or reuse drivers from another OS. Difficulties: unavailable code; undocumented features; extent of programming errors.
Contribution. Unmodified reuse of existing device drivers. Strong isolation among device drivers. Fault containment. Extent of collocation.
Related Work: Reuse. Binary driver reuse: cohosting in VMware Workstation; both the driver OS and the VM run with all privileges. Transplanting: uses glue code; raises conflicts; leads to compromises in the new OS; both the driver and the VM still run with all privileges.
Related Work: Semantic Resource Conflicts. Accidental denial of service. Sharing conflicts: the transplanted driver and the host OS are prone to each other's faults. Since the driver and the OS both have all privileges, cooperation is required, but cooperation is not possible with transplanting. Example: a device driver disables interrupts.
Related Work: Engineering Effort. Are reused drivers functioning correctly? Even with transplanting, 12% of OS-Kit code is glue. Glue provides ways to handle semantic differences and interface translation. Donor OS knowledge is required to write glue. What if there are multiple donor OSes? Writing glue code becomes even more difficult. What if the driver code in the donor OS gets updated?
Related Work: Dependability. User-level device drivers: used with some differences. Nooks: isolates drivers within protection domains; no privilege isolation; complete fault isolation not possible; detection of malicious drivers not possible; adds 22,000 lines of privileged code to Linux; uses interposition services to maintain the integrity of resources shared between drivers. This work: no sharing of resources between drivers; uses request messages.
Approach. Drivers are closely knit to the kernel; applications are not. Orthogonal drivers should be based on the following principles. Resource delegation: receive only bulk resources. Separation of name spaces: the driver has its own address space. Separation of privilege: execute the driver in unprivileged mode. Secure isolation: among drivers, and between drivers and applications. Common API.
Analysis of principles. Most are flouted for device drivers; none are flouted for an OS. Insight: transplant the OS, rather than just the driver.
Architecture. DD/OS: an OS running a device driver, hosted in a VM. The driver controls its device directly via a pass-through enhancement to the VM hosting the DD/OS. A driver cannot access other DD/OSes. Translation module: added to the DD/OS to interface with clients; one translation module can be used for multiple DD/OSes (hard disks, floppy disks, optical media, etc.). Drivers execute in separate VMs: drivers are isolated from each other; drivers from incompatible OSes can be used simultaneously.
Inter-VM. Low-overhead communication. Message notification: the source VM raises a communication interrupt in the destination VM. Request completion: the destination VM raises a completion interrupt in the source VM. Low-overhead memory sharing: register memory areas of one VM in another VM's physical memory space.
Requests and Responses. Client signals DD/OS: the VMM sends a virtual interrupt to the translation module. DD/OS signals client: the translation module raises a trap in the VMM.
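The signaling path on this slide can be sketched as a toy model. This is only an illustration under assumed names (`ToyVMM`, `Endpoint`, `translation_module_irq` are hypothetical, not from the paper or from L4); real signaling uses virtual interrupts and traps rather than Python method calls.

```python
from collections import deque


class Endpoint:
    """One side of the channel: a message queue plus a pending-interrupt flag."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()
        self.interrupt_pending = False


class ToyVMM:
    """Hypothetical stand-in for the VMM that relays signals between VMs."""

    def __init__(self):
        self.ddos = Endpoint("dd/os")
        self.client = Endpoint("client")

    def client_signal(self, request):
        # Client -> DD/OS: the VMM delivers a virtual interrupt
        # to the translation module.
        self.ddos.queue.append(request)
        self.ddos.interrupt_pending = True

    def ddos_trap(self, response):
        # DD/OS -> client: the translation module traps into the VMM,
        # which then notifies the client of completion.
        self.client.queue.append(response)
        self.client.interrupt_pending = True


def translation_module_irq(vmm, driver):
    """Interrupt handler in the DD/OS: drain requests, run the driver, reply."""
    vmm.ddos.interrupt_pending = False
    while vmm.ddos.queue:
        request = vmm.ddos.queue.popleft()
        vmm.ddos_trap(("done", driver(request)))


vmm = ToyVMM()
vmm.client_signal("read block 7")
translation_module_irq(vmm, driver=lambda req: req.upper())
print(vmm.client.queue[0])
```

The point of the sketch is the asymmetry: the client never calls the driver directly, and the DD/OS never writes to the client directly; both go through the VMM.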
Enhancing Dependability. Driver isolation: improves reliability by preventing fault propagation; improves availability through virtual machine reboot. Continuum of configurations: individual drivers vs. groups of drivers.
Driver Restart. Asynchronous (reset the driver): fault detection; malicious driver. Synchronous (negotiation and quiescing): live upgrades; proactive restart. Indirection captures accesses to a restarting driver: the driver is either transparently restarted or a fault is signaled.
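A minimal sketch of the indirection idea, assuming a hypothetical `DriverProxy` that sits between clients and the driver. While the driver restarts, the proxy either holds requests and replays them (transparent restart) or raises an error (fault signaled). The names and the replay policy are assumptions for illustration, not the paper's mechanism.

```python
class DriverRestarting(Exception):
    """Raised to the client when the fault is signaled instead of hidden."""


class DriverProxy:
    """Hypothetical indirection layer in front of a restartable driver."""

    def __init__(self, start_driver, transparent=True):
        self._start_driver = start_driver  # callable that boots a fresh driver
        self._driver = start_driver()
        self._restarting = False
        self._transparent = transparent
        self._held = []                    # requests captured during restart

    def begin_restart(self):
        self._restarting = True
        self._driver = None

    def finish_restart(self):
        self._driver = self._start_driver()
        self._restarting = False
        held, self._held = self._held, []
        # Replay the captured requests against the fresh driver.
        return [self._driver(req) for req in held]

    def submit(self, request):
        if not self._restarting:
            return self._driver(request)
        if self._transparent:
            self._held.append(request)     # hold until the driver is back
            return None
        raise DriverRestarting(request)


proxy = DriverProxy(lambda: (lambda req: f"ok:{req}"))
first = proxy.submit("a")
proxy.begin_restart()
held = proxy.submit("b")          # captured, not lost
replayed = proxy.finish_restart()
print(first, held, replayed)
```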
Virtualization Issues. A DD/OS consumes more resources than the drivers alone. DMA operations. Special timing needs of physical hardware are violated. The host OS has to collaborate with the DD/OS to control the driver.
DMA address translation. DMA addresses in the DD/OS reference guest physical addresses, which are not the same as host physical addresses. Translation: the VMM intercepts DMA accesses and translates them.
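The translation step can be sketched as follows, assuming 4 KiB pages and a hypothetical `guest_to_host` table that maps guest page frames to host page frames. Because guest-contiguous pages need not be host-contiguous, one guest DMA range may turn into several host ranges.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class DmaTranslationError(Exception):
    pass


def translate_dma(guest_addr, length, guest_to_host):
    """Split a guest-physical DMA range into host-physical (addr, len) chunks.

    guest_to_host maps guest page frame numbers to host page frame numbers.
    """
    chunks = []
    while length > 0:
        gfn, offset = divmod(guest_addr, PAGE_SIZE)
        if gfn not in guest_to_host:
            raise DmaTranslationError(f"guest frame {gfn} is not mapped")
        # Never cross a page boundary within one chunk.
        step = min(length, PAGE_SIZE - offset)
        chunks.append((guest_to_host[gfn] * PAGE_SIZE + offset, step))
        guest_addr += step
        length -= step
    return chunks


mapping = {0: 10, 1: 42}
print(translate_dma(4000, 200, mapping))  # [(44960, 96), (172032, 104)]
```

In the example, a 200-byte transfer starting 96 bytes before the end of guest page 0 is split into two host chunks, one in host frame 10 and one in host frame 42.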
DMA and Security. A DD/OS can perform DMA to physical memory that the memory protection system does not allow, e.g. use DMA to replace hypervisor code or data. In the absence of hardware support to restrict DMA access, device drivers are part of the TCB.
DMA and Trust. Untrusted by the hypervisor: (1) the client; (2) the client and the DD/OS; (3) the client and the DD/OS, which also do not trust each other. When the DD/OS is untrusted: the hypervisor enables DMA permissions to client memory and restricts the DD/OS's actions in client memory. When the DD/OS and the client do not trust each other: the client pins its own memory, and the DD/OS verifies the pinning of the client's memory via the hypervisor.
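The mutual-distrust case can be sketched as a check the DD/OS performs before programming a device. `ToyHypervisor` and `ddos_start_dma` are hypothetical names used only for illustration; the actual interface is not described on the slide.

```python
class ToyHypervisor:
    """Hypothetical hypervisor that tracks which client pages are pinned."""

    def __init__(self):
        self._pinned = {}  # client id -> set of pinned page numbers

    def pin(self, client, pages):
        self._pinned.setdefault(client, set()).update(pages)

    def unpin(self, client, pages):
        self._pinned.get(client, set()).difference_update(pages)

    def is_pinned(self, client, pages):
        return set(pages) <= self._pinned.get(client, set())


def ddos_start_dma(hypervisor, client, pages):
    """DD/OS side: refuse DMA into client pages that are not all pinned."""
    if not hypervisor.is_pinned(client, pages):
        return False
    # A real DD/OS would program the device here.
    return True


hv = ToyHypervisor()
hv.pin("client-1", [1, 2])
print(ddos_start_dma(hv, "client-1", [1, 2]))  # True
print(ddos_start_dma(hv, "client-1", [2, 3]))  # False: page 3 is not pinned
```

The DD/OS asks the hypervisor rather than the client, since it does not trust the client's own claim that the memory is pinned.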
DMA and Trust contd. A VM may fault and restart while a device is using DMA. All targeted memory cannot be reclaimed until all such DMA operations complete or abort. What is targeted memory? DD/OS memory? The client's pinned memory? No solution is provided to this problem. A client with memory pinned due to a DD/OS that faulted and is rebooting should not use the pinned memory until the restart has completed. And then what? Will the DD/OS signal completion? What if DMA completes before the VM restarts? What if the VM fails to start at all?
IO-MMU and IO Contexts. IO-MMU: designed to overcome the 32-bit address limitation for DMA in 64-bit systems; can be used to enforce access permissions for DMA operations and for address translation. Hence DD/OSes are hardware-isolated, and device drivers can be excluded from the TCB. More questions: does this work assume device drivers are in the TCB or not? If in the TCB, we cannot do anything. If not in the TCB, the driver cannot do anything malicious due to hardware isolation, so we do not need to do anything. So?
IO-MMU contd. The IO-MMU does not support multiple address contexts, so it is time-multiplexed between PCI devices. Timeouts may occur in several device drivers. Question: how many PCI devices are there generally in a system? Eventually the various device drivers will be the deciding granularity. So would it be a better idea to group all device drivers in one DD/OS and avoid all contention? If yes, we have a tradeoff between performance and fault isolation. The impact on a gigabit Ethernet NIC is proportional to bus access. Decrease the impact of multiplexing by using dynamic bus allocation based on device utilization: prefer active and asynchronous devices. The IO-MMU has to be used to ensure device driver isolation. No other options yet.
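One way to picture utilization-based bus allocation is a proportional-share split of time slots. This is a sketch under assumptions: the slide does not give the actual policy, and `allocate_bus_slices`, the slot counts, and the minimum share are all hypothetical.

```python
def allocate_bus_slices(utilization, total_slots, min_slots=1):
    """Divide total_slots among devices in proportion to recent utilization.

    utilization maps a device name to a non-negative load estimate.
    Every device keeps at least min_slots so that idle devices still
    get occasional bus access.
    """
    devices = list(utilization)
    if not devices:
        return {}
    remaining = total_slots - min_slots * len(devices)
    if remaining < 0:
        raise ValueError("not enough slots for the minimum share")
    slots = {dev: min_slots for dev in devices}
    total = sum(utilization.values())
    if total > 0:
        for dev in devices:
            slots[dev] += int(remaining * utilization[dev] / total)
    # Rounding down leaves a few slots over; give them to the busiest devices.
    leftover = total_slots - sum(slots.values())
    ranked = sorted(devices, key=utilization.get, reverse=True)
    for i in range(leftover):
        slots[ranked[i % len(ranked)]] += 1
    return slots


shares = allocate_bus_slices({"nic": 0.8, "disk": 0.2, "idle": 0.0}, total_slots=10)
print(shares)
```

The minimum share is a design choice in this sketch: without it, an idle device would never regain the bus and its driver could time out.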
Resource Consumption. OS size of driver modules. Periodic tasks in the DD/OS lead to cache and TLB footprints. Question: the paper claims periodic tasks in a DD/OS impose overhead on clients even when they are not using any device driver. How? Page sharing uses the schemes used in VMware ESX Server. The steady-state cache footprint of multiple DD/OSes is low due to high sharing. Swap out VM pages to disk. Do not swap out pages of the VM hosting the DD/OS for the swap device. Do not swap out pages of a VM hosting a DD/OS used by the swap device. More questions: when treating the DD/OS as a black box, we cannot swap unused parts of the swap DD/OS via working set analysis. All parts of the OS must always be in main memory to guarantee full functionality even for rare corner cases. Black box: we do not know which pages are used, and all parts of the OS must always be in main memory. Then what can be paged out? How do we find it?
Reducing Memory Footprint. In addition to memory sharing and swapping: memory ballooning inside the DD/OS (does it acquire pages and zero them out? Details not provided); handles zero pages specially; compresses non-working-set pages that cannot be swapped and uncompresses them upon access. Periodic tasks increase the DD/OS footprint. Do not meet strict requirements.
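The zero-page and compression ideas can be sketched with a small page store. This assumes 4 KiB pages and uses `zlib` purely as a stand-in compressor; `CompressedPageStore` and its methods are hypothetical names, and the slide does not say which compression scheme is used.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size for this sketch
ZERO_PAGE = bytes(PAGE_SIZE)


class CompressedPageStore:
    """Hypothetical store for DD/OS pages that fall outside the working set."""

    def __init__(self):
        self._resident = {}    # page number -> raw bytes
        self._compressed = {}  # page number -> compressed bytes
        self._zero = set()     # page numbers known to be all zeroes

    def write(self, page_no, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        self._zero.discard(page_no)
        self._compressed.pop(page_no, None)
        self._resident[page_no] = data

    def evict(self, page_no):
        """Move a resident page out of the working set."""
        data = self._resident.pop(page_no)
        if data == ZERO_PAGE:
            self._zero.add(page_no)  # zero pages need no backing storage
        else:
            self._compressed[page_no] = zlib.compress(data)

    def read(self, page_no):
        """Return the page contents, uncompressing on access if needed."""
        if page_no in self._zero:
            self._zero.discard(page_no)
            self._resident[page_no] = ZERO_PAGE
        elif page_no in self._compressed:
            self._resident[page_no] = zlib.decompress(self._compressed.pop(page_no))
        return self._resident[page_no]

    def stored_bytes(self):
        return (sum(len(p) for p in self._resident.values())
                + sum(len(c) for c in self._compressed.values()))


store = CompressedPageStore()
store.write(0, ZERO_PAGE)
store.write(1, b"ab" * (PAGE_SIZE // 2))
store.evict(0)
store.evict(1)
print(store.stored_bytes())  # far below two full pages for this compressible data
```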
Timing. Virtual time vs. real time: devices malfunction when their assumptions about time are violated. Soft preemption: if interrupts are disabled, the VMM does not preempt the VM until interrupts are enabled again. Hard preemption: preempt even if interrupts are disabled.
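The difference between the two preemption modes comes down to one check. The sketch below uses hypothetical names (`Preemption`, `may_preempt`) and models only the decision described on the slide, not a real scheduler.

```python
from enum import Enum


class Preemption(Enum):
    SOFT = "soft"
    HARD = "hard"


def may_preempt(mode, timeslice_expired, interrupts_enabled):
    """Decide whether the VMM may preempt the VM right now."""
    if not timeslice_expired:
        return False
    if mode is Preemption.HARD:
        return True            # preempt even if interrupts are disabled
    return interrupts_enabled  # soft: wait until interrupts are enabled


# A DD/OS inside an interrupts-disabled critical section:
print(may_preempt(Preemption.SOFT, True, False))  # False
print(may_preempt(Preemption.HARD, True, False))  # True
```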
Shared Hardware and Recursion. Time sharing of devices is needed. Time sharing PCI is difficult. Let a DD/OS control PCI: this DD/OS interposes on access to the PCI bus and applies a policy.
Results. Implemented a driver reuse system. Evaluated network, disk and PCI drivers. Hypervisor and VMM are paravirtualized systems.
Virtualization Environment. Hypervisor: L4. VMM: user-level L4 task. DD/OS: Linux kernel 2.4.22. Client OS: Linux kernel 2.6.8.1.
Translation Modules. Disk interface: added to the DD/OS as a kernel module; communicates with the block layer. Network interface: added to the DD/OS as a device driver; represents itself to the DD/OS as a network device attached to a virtual interconnect; asynchronous inbound packet delivery; outbound: transmitted from the client via DMA; inbound: L4 copies packets from the DD/OS to the client. PCI interface. More questions: when the PCI driver is isolated, it helps the other DD/OS instances discover their appropriate devices on the bus, and restricts device access to only the appropriate DD/OS instances. How? Executed at a lower priority than all other components. More questions: priority is not privilege. Would PCI performance not affect system performance drastically? The paper says the PCI interface is not performance critical. Why?