Differentiated I/O services in virtualized environments


1 Differentiated I/O services in virtualized environments
Tyler Harter, Salini SK & Anand Krishnamurthy. The project we have been working on is "Differentiated I/O services in virtualized environments".

2 Overview Provide differentiated I/O services for applications in guest operating systems running in virtual machines. Applications in virtual machines tag their I/O requests, and the hypervisor's I/O scheduler uses these tags to provide the requested quality of I/O service. The objective of this project is to provide differentiated I/O services for applications in guest operating systems running in virtual machines. We do this by tagging I/O requests from applications and then using those tags at the hypervisor to provide good quality of I/O service. Network I/O already has mechanisms for end-to-end QoS, such as DSCP (Differentiated Services Code Point) bits and MPLS. In virtualized settings, software switches like Open vSwitch from VMware/Nicira offer QoS, rate limiting, etc., and can operate at IP-address or transport-port granularity. Leaving the network part aside, this presentation focuses on storage I/O.

3 Motivation Varied applications with different I/O requirements are hosted in clouds. It is not optimal if I/O scheduling is agnostic of the semantics of the request. Many types of applications with different I/O requirements run in clouds, and it is not a good idea to do I/O scheduling while ignoring the semantics of the request. Consider the following example.

4 Motivation (diagram: VM 1, VM 2, and VM 3 running on the hypervisor)

5 Motivation (diagram: VM 2 and VM 3 running on the hypervisor)

6 Motivation We want high and low priority processes to correctly receive differentiated service, both within a VM and between VMs. Can my webserver's or DHT's log pusher's I/O be served differently from the webserver's or DHT's own I/O?

7 Existing work & Problems
VMware's ESX server offers Storage I/O Control (SIOC), which provides I/O prioritization for virtual machines that access a shared storage pool. But it supports prioritization only at host granularity! Let us now look at the current state-of-the-art technologies for I/O prioritization. VMware's ESX server provides I/O prioritization for a shared datastore across virtual machines via SIOC, but the problem is that it supports only host-level granularity.

8 Existing work & Problems
Xen's credit scheduler also works at the domain level. Linux's CFQ I/O scheduler supports I/O prioritization, and it is possible to use priorities at both the guest's and the hypervisor's I/O schedulers, but there are a couple of issues that we will see in the forthcoming slides.

9 Original Architecture
(Diagram: high- and low-priority applications in the guest VMs issue syscalls against a virtual SCSI disk through the guest I/O scheduler, e.g. CFQ; QEMU in the host forwards the I/O to the host's I/O scheduler.) This is the vanilla KVM/QEMU architecture, with CFQ I/O schedulers at both the guest and the hypervisor.

10 Original Architecture
Virtual machines do not expose per-guest-process priorities to the host, and hence I/O priorities have to be enforced at the guest's I/O scheduler level.

11 Problem 1: low and high may get same service
A high-priority process in one VM and a low-priority process in another VM may be treated equally, since the hypervisor has no global view of I/O priorities.

12 Problem 2: does not utilize host caches
In addition, if we block low-priority read requests in the guest in order to serve high-priority ones first, we may fail to take advantage of the file-system buffer cache in the hypervisor, leading to unintended increases in latency.

13 Existing work & Problems
Xen's credit scheduler also works at the domain level. Linux's CFQ I/O scheduler supports I/O prioritization, and it is possible to use priorities at both the guest's and the hypervisor's I/O schedulers. The takeaway is that the current state of the art does not provide differentiated services at guest-application-level granularity.

14 Solution Tag I/O and prioritize in the hypervisor

15 Outline KVM/Qemu, a brief intro… KVM/Qemu I/O stack
Multi-level I/O tagging. I/O scheduling algorithms. Evaluation. Summary. This will be the outline of the presentation. Before delving into the details of multi-level I/O tagging, I would like to take two minutes to walk through the QEMU I/O stack for people who have worked with other virtualization technologies like Xen or VirtualBox and are not familiar with QEMU-KVM. Then we will explain the modifications we made to provide differentiated I/O services: the multi-level I/O tagging and the scheduling algorithms. We will then analyze how our system behaves for different workloads and configurations. After that we will conclude with our learnings and take questions.

16 KVM/Qemu, a brief intro.. The KVM module has been part of the Linux kernel since version 2.6. Linux has all the mechanisms a VMM needs to operate several VMs, and KVM relies on a virtualization-capable CPU with either the Intel VT or the AMD SVM extensions. There are three modes: kernel, user, and guest. kernel-mode: switch into guest-mode and handle exits due to I/O operations. user-mode: perform I/O when the guest needs to access devices. guest-mode: execute guest code, which is the guest OS except for I/O. (Diagram: Linux standard kernel with KVM as the hypervisor, on top of the hardware.) In a normal Linux environment each process runs either in user-mode or in kernel-mode; KVM introduces a third mode, guest-mode, and therefore relies on a virtualization-capable CPU with either Intel VT or AMD SVM extensions.

17 KVM/Qemu, a brief intro..

18 Linux Standard Kernel with KVM - Hypervisor
KVM/Qemu, a brief intro.. Each virtual machine is a user-space process. (Diagram: Linux standard kernel with KVM as the hypervisor, on top of the hardware.)

19 Linux Standard Kernel with KVM - Hypervisor
KVM/Qemu, a brief intro.. (Diagram: libvirt and other user-space processes run alongside the VM processes on the Linux standard kernel with KVM as the hypervisor, on top of the hardware.)

20 KVM/Qemu I/O stack: application in guest OS
An application in the guest OS issues an I/O-related system call (e.g. read(), write(), stat()) within the user-space context of the virtual machine. This system call leads to an I/O request being submitted from within the kernel-space of the VM. (Diagram: system-call layer, VFS, file system, buffer cache, block layer, SCSI and ATA drivers.) The I/O request will reach a device driver, either an ATA-compliant (IDE) or a SCSI driver.

21 KVM/Qemu I/O stack: application in guest OS
The device driver then issues privileged instructions to read/write the memory regions exported over PCI by the corresponding device. (Diagram: system-call layer, VFS, file system, buffer cache, block layer, SCSI and ATA drivers.)

22 Linux Standard Kernel with KVM - Hypervisor
KVM/Qemu I/O stack: QEMU emulator. A VM-exit takes place for each of the privileged instructions resulting from the original I/O request in the VM. These VM-exits are handled by the core KVM module within the host's kernel-space context, and the privileged I/O-related instructions are then passed by the hypervisor to the QEMU machine emulator. (Diagram: Linux standard kernel with KVM as the hypervisor, on top of the hardware.)

23 Linux Standard Kernel with KVM - Hypervisor
KVM/Qemu I/O stack: QEMU emulator. These instructions are then emulated by device-controller emulation modules within QEMU (either as ATA or as SCSI commands). QEMU generates block-access I/O requests in a special block-device emulation module, so the original I/O request ends up generating I/O requests to the kernel-space of the host. Upon completion of those system calls, QEMU "injects" an interrupt into the VM that originally issued the I/O request. (Diagram: Linux standard kernel with KVM as the hypervisor, on top of the hardware.)

24 Multi-level I/O tagging modifications

25 Modification 1: pass priorities via syscalls
We support passing priorities via system calls at two levels: at the file-descriptor level via fcntl, and at the individual read/write level via a new pread_p call, as sketched below.
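A minimal usage sketch of the two tagging paths. This is illustrative only: the F_SET_IO_PRIO fcntl command, the priority values, and the pread_p wrapper with its syscall number are assumptions standing in for the project's actual interface.

/* Sketch: tagging I/O with priorities from a guest application.
 * F_SET_IO_PRIO, the priority constants, and the pread_p syscall
 * number are hypothetical placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#define F_SET_IO_PRIO 1025   /* assumed custom fcntl command */
#define IO_PRIO_HIGH  1
#define IO_PRIO_LOW   7

/* assumed wrapper for the project's new pread_p system call */
static ssize_t pread_p(int fd, void *buf, size_t count, off_t offset, int prio)
{
    return syscall(400 /* assumed __NR_pread_p */, fd, buf, count, offset, prio);
}

int main(void)
{
    char buf[4096];
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0)
        return 1;

    /* Option 1: tag all subsequent I/O on this descriptor as high priority. */
    fcntl(fd, F_SET_IO_PRIO, IO_PRIO_HIGH);
    pread(fd, buf, sizeof(buf), 0);

    /* Option 2: tag a single read explicitly as low priority. */
    pread_p(fd, buf, sizeof(buf), 4096, IO_PRIO_LOW);

    close(fd);
    return 0;
}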

26 Modification 2: NOOP+ at guest I/O scheduler
The next modification we made is to the I/O scheduler in the guest. We call it "NOOP+" because, apart from behaving like NOOP, it has extra logic to pick up the priorities from the block I/O layer and pass them down to the SCSI driver layer. Salini will explain in detail the rationale behind using NOOP+.

27 Modification 3: extend SCSI protocol with prio
Now that we have the tag in the SCSI layer of the guest, we need to pass it down to the hypervisor. According to the SCSI specification, there is an unused byte in the SCSI command descriptor block (CDB) for READ_10, and we leverage that. By editing the SCSI driver in the guest, we take the priority from the I/O request and set it in the SCSI command, as sketched below. After all the interactions in the QEMU I/O stack mentioned earlier, this tag is recovered in QEMU and passed down to the hypervisor's I/O scheduler via the pread_p system call. Salini will now take over for the scheduling algorithms and evaluation.
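A sketch of how the priority tag can ride in a READ(10) CDB. Assumptions: byte 6 (reserved/group-number byte in the SCSI spec) is taken to be the unused byte the slide refers to, and the real change patches the guest kernel's SCSI disk driver rather than building CDBs in user space as this standalone demo does.

/* Sketch: stashing a priority tag in the spare byte of a READ(10) CDB. */
#include <stdint.h>
#include <stdio.h>

#define READ_10 0x28

static void build_read10(uint8_t cdb[10], uint32_t lba, uint16_t blocks, uint8_t prio)
{
    cdb[0] = READ_10;
    cdb[1] = 0;                    /* flags (DPO/FUA/...) */
    cdb[2] = (lba >> 24) & 0xff;   /* logical block address, big-endian */
    cdb[3] = (lba >> 16) & 0xff;
    cdb[4] = (lba >> 8) & 0xff;
    cdb[5] = lba & 0xff;
    cdb[6] = prio;                 /* priority tag in the otherwise unused byte (assumption) */
    cdb[7] = (blocks >> 8) & 0xff; /* transfer length in blocks, big-endian */
    cdb[8] = blocks & 0xff;
    cdb[9] = 0;                    /* control */
}

int main(void)
{
    uint8_t cdb[10];
    build_read10(cdb, 2048, 8, 1 /* high priority */);
    for (int i = 0; i < 10; i++)
        printf("%02x ", cdb[i]);
    printf("\n");
    return 0;
}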

28 Modification 2: NOOP+ at guest I/O scheduler

29 Modification 4: share-based prio sched in host

30 Modification 5: use new calls in benchmarks

31 Scheduler algorithm-Stride
ID_i: ID of application A_i
Share_i: shares assigned to ID_i
VIO_i: virtual I/O counter for ID_i
Stride_i = Global_shares / Share_i

Dispatch_request() {
    Select the ID k which has the lowest virtual I/O counter
    Increase VIO_k by Stride_k
    if (VIO_k reaches the threshold)
        Reinitialize all VIO_i to 0
    Dispatch the request at the head of queue k
}
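A small C sketch of the same dispatch logic. The constants (MAX_IDS, GLOBAL_SHARES, VIO_THRESHOLD) and the per-queue bookkeeping are illustrative assumptions; the real implementation lives inside the host's I/O scheduler.

/* Sketch of the stride-based dispatch from the slide. */
#include <limits.h>
#include <stdbool.h>

#define MAX_IDS       64
#define GLOBAL_SHARES 1000
#define VIO_THRESHOLD 1000000L

struct app_queue {
    long shares;        /* Share_i: shares assigned to ID_i */
    long stride;        /* Stride_i = GLOBAL_SHARES / Share_i */
    long vio;           /* VIO_i: virtual I/O counter */
    bool has_requests;  /* queue i currently has pending requests */
};

static struct app_queue q[MAX_IDS];

/* Returns the queue index to dispatch from, or -1 if all queues are empty. */
int dispatch_request(void)
{
    int k = -1;
    long min_vio = LONG_MAX;

    /* Select the ID k with the lowest virtual I/O counter. */
    for (int i = 0; i < MAX_IDS; i++) {
        if (q[i].has_requests && q[i].vio < min_vio) {
            min_vio = q[i].vio;
            k = i;
        }
    }
    if (k < 0)
        return -1;

    /* Charge the winner by its stride; more shares means a smaller stride. */
    q[k].vio += q[k].stride;

    /* Once a counter reaches the threshold, reinitialize all counters. */
    if (q[k].vio >= VIO_THRESHOLD)
        for (int i = 0; i < MAX_IDS; i++)
            q[i].vio = 0;

    return k;   /* dispatch the request at the head of queue k */
}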

32 Scheduler algorithm cntd
Problem: a process that has been sleeping for a long time can monopolize the resource once it wakes up, because its virtual I/O counter has fallen far behind the others. Solution: if a sleeping process k wakes up, set VIO_k = max( min(all non-zero VIO_i), VIO_k ).
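Continuing the sketch above (same q[] array and MAX_IDS), the wake-up fix clamps the waking process's counter so it cannot monopolize the device.

/* Sketch: clamp a waking process's counter, per the rule on the slide. */
void on_wakeup(int k)
{
    long min_nonzero = LONG_MAX;

    for (int i = 0; i < MAX_IDS; i++)
        if (q[i].vio > 0 && q[i].vio < min_nonzero)
            min_nonzero = q[i].vio;

    /* VIO_k = max( min(all non-zero VIO_i), VIO_k ) */
    if (min_nonzero != LONG_MAX && q[k].vio < min_nonzero)
        q[k].vio = min_nonzero;
}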

33 Evaluation Tested on HDD and SSD. Configuration:
Guest RAM size: 1 GB
Host RAM size: 8 GB
Hard disk: 7200 RPM
SSD: 35000 IOPS Rd, IOPS Wr
Guest OS: Ubuntu Server, LK 3.2
Host OS: Kubuntu, LK 3.2
Filesystem (host/guest): ext4
Virtual disk image format: qcow2

34 Results Metrics: throughput, latency. Benchmarks: Filebench, Sysbench, Voldemort (distributed key-value store).

35 Shares vs Throughput for different workloads : HDD

36 Shares vs Latency for different workloads : HDD
Priorities are better respected if most of the read requests hit the disk.

37 Effective Throughput for various dispatch numbers : HDD
Priorities are respected only when the disk's dispatch number is lower than the number of read requests generated by the system at a time. Downside: the disk's dispatch number is directly proportional to the effective throughput.

38 Shares vs Throughput for different workloads : SSD

39 Shares vs Latency for different workloads : SSD
Priorities in SSDs are respected only under heavy load, since SSDs are faster

40 Comparison b/w different schedulers
Only Noop+LKMS respects priority! (Has to be, since we did it)

41 Results (table: webserver, mailserver, random-read, sequential-read, and Voldemort DHT read workloads on hard disk and flash)

42 Summary It works!!! Preferential service is possible only when the disk's dispatch number is lower than the number of read requests generated by the system at a time. But a lower dispatch number reduces the effective throughput of the storage. On SSDs, preferential service is only possible under heavy load. Scheduling at the lowermost layer yields better differentiated services.

43 Future work Get it working for writes
Evaluate VMware ESX SIOC and compare the results with ours
