Storage Management in Virtualized Cloud Environments Sankaran Sivathanu, Ling Liu, Mei Yiduo and Xing Pu Student Workshop on Frontiers of Cloud Computing, IBM 2010


1 Storage Management in Virtualized Cloud Environments Sankaran Sivathanu, Ling Liu, Mei Yiduo and Xing Pu Student Workshop on Frontiers of Cloud Computing, IBM 2010

2 Talk Outline Introduction Measurement results & Observations –Data Placement & Provisioning –Workload Interference –Impacts of Virtualization Summary

3 Cloud & Virtualization Cloud Environment – Goals –Flexibility in resource configuration –Maximum resource utilization –Pay-per-use Model Virtualization – Benefits –Resource consolidation –Re-structuring flexibility –Separate protection domains Virtualization serves as one of the basic foundations of Cloud infrastructures

4 Fundamental Issues Cloud Service Providers (CSPs) vs. Customers –Customers purchase computing resources –CSPs provide virtual resources (VMs) –Customers perceive their resources as physical machines! Multiple VMs reside in a single physical host –Resource Interference –End-user performance depends on other users End-users are unaware of where their data physically exists

5 Goals of our Measurement For cloud service providers –How to place data such that end-user performance is maximized? –How to co-locate workloads for least interference? For End-Users –How to purchase resources in tune with requirements? –How to tune applications for maximum performance? General insights on storage I/O in virtualized environments

6 Benchmarks Used Postmark –Mail Server Workload –Create/Delete, Read/Append files –Parameters: file size, # of files, read/write ratio Synthetic Workload –Sequential vs. random accesses –Zipf Distribution
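
The synthetic workload on this slide is named but not specified. As a rough sketch only, assuming a block-address space of `num_blocks` blocks and a Zipf exponent `s` (both hypothetical parameters that do not appear in the talk), sequential and Zipf-skewed random access streams might be generated like this:

```python
import random
from itertools import accumulate


def sequential_stream(start, count):
    """Block numbers for a purely sequential scan starting at `start`."""
    return list(range(start, start + count))


def zipf_stream(num_blocks, count, s=1.0, seed=0):
    """Random block numbers drawn from a finite Zipf distribution.

    The block with rank k (1-based) is chosen with probability
    proportional to 1 / k**s. For simplicity the block number equals
    rank - 1, so block 0 is the most popular. This mapping is an
    assumption made for illustration.
    """
    rng = random.Random(seed)
    weights = [1.0 / (k ** s) for k in range(1, num_blocks + 1)]
    cumulative = list(accumulate(weights))
    return rng.choices(range(num_blocks), cum_weights=cumulative, k=count)


if __name__ == "__main__":
    print(sequential_stream(0, 8))
    print(zipf_stream(num_blocks=1000, count=8))
```

A larger `s` concentrates accesses on fewer blocks, which is one way to vary the locality of a random workload.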

7 Data Provisioning & Placement

8 Disk Provisioning Consider a 100GB Disk Workload data footprint: ~150MB Case I: 4GB Partition, Throughput: 2.1 MB/s Case II: 40GB Partition, Throughput: 1.4 MB/s Performance Difference: 33%
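
The 33% figure is consistent with measuring the throughput drop relative to the faster of the two numbers on the slide. A minimal check, where the helper name `relative_drop` is purely illustrative:

```python
def relative_drop(baseline, other):
    """Fractional drop of `other` relative to `baseline`."""
    return (baseline - other) / baseline


# Throughputs in MB/s as given on the slide.
drop = relative_drop(2.1, 1.4)
print(f"{drop:.0%}")  # prints 33%
```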

9 Where to place VM disk? Postmark benchmark –Read operation Cases: –Read from physical partitions in different zones Based on LBNs LBNs start from the inner zone and proceed towards the outer zones. –Read from disk file (.vmdk)

10 Where to place multiple VM disks? Postmark benchmark –2 instances (1 for each VM) Random reads Compare physical partitions placed in different zones –O -> Outer –I -> Inner –M -> Mid

11 Observations Customers should purchase storage based on workload requirement, not price Thin provisioning may be practiced Throughput-intensive VMs can be placed in outer disk zones Virtual disks of multiple VMs that may be accessed simultaneously should be co-located on disk –CSPs can monitor access patterns and move virtual disks accordingly

12 Workload Interference

13 CPU-Disk Interference [Diagram: VM-1 and VM-2 on one physical host, with CPU and disk workloads] Disk throughput: 23.4 MB/s vs. 27.6 MB/s Performance Difference: 15.3%

14 CPU-Disk Interference CPU allocation ratios have no effect on disk throughput across VMs A disk-intensive job performs better when run alongside a CPU-intensive job

15 CPU-Disk Interference Reason? Dynamic Frequency Scaling

16 CPU-Disk Interference CPU DFS is enabled in Linux by default Three ‘governors’ control the DFS policy –On-demand (default) –Performance –Power-save When 1 core is idle, the entire CPU is down-scaled because overall CPU utilization falls
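
On Linux systems with a cpufreq driver loaded, the active governor for each CPU is exposed through sysfs as `cpufreq/scaling_governor`, where the three policies above appear as `ondemand`, `performance` and `powersave`. The sketch below is one way to inspect it, not something shown in the talk. The function name `read_governors` is hypothetical, and the `root` parameter exists only so the function can be pointed at a different directory.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location. The per-CPU cpufreq directories
# exist only when a cpufreq driver is loaded (often not the case in a guest VM).
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def read_governors(root=CPUFREQ_ROOT):
    """Return {cpu_name: governor} for every CPU that exposes cpufreq.

    Returns an empty dict when no scaling_governor files are found.
    """
    governors = {}
    for gov_file in sorted(root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        cpu_name = gov_file.parent.parent.name  # e.g. "cpu0"
        governors[cpu_name] = gov_file.read_text().strip()
    return governors
```

Calling `read_governors()` on the host and checking whether every entry is `ondemand` is a quick way to tell whether the frequency-scaling effect described here could apply to a given machine.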

17 Disk-Disk Interference [Diagram: VM-1 and VM-2 on one physical host, with V.Disk-1 and V.Disk-2 on the same physical disk] 1 instance of Postmark in each VM 65.3% more time taken when compared to running Postmark in a single VM Overhead mainly attributed to disk seeks: accesses are no longer sequential

18 Disk-Disk Interference [Diagram: VM-1 and VM-2 on one physical host, with V.Disk-1 on Disk-1 and V.Disk-2 on Disk-2] VMs using separate physical disks 17.52% more time taken when compared to running Postmark in a single VM Overhead attributed to contention in Dom-0’s queue structures

19 Disk-Disk Interference Postmark Benchmark (Reads) Cases: –Running in a single VM –1 instance in each of two VMs 2 VMs reading from virtual disks on the same physical disk 2 VMs reading from virtual disks on different physical disks

20 Disk-Disk Interference IO scheduling policy in Dom-0 has little effect ‘Ideal’ case is the time taken when running Postmark in a single VM Other cases run 1 instance of Postmark in each of 2 VMs (separate physical disks)

21 Disk-Disk Interference Interference with respect to workload type Synthetic read workload VMs use separate physical disks Cases: –Mix of sequential versus random reads Sequential requests from both VMs flood the Dom-0 queue, causing contention

22 Observations CPU-intensive and disk-intensive workloads can be co-located for optimal performance and power Virtual disks that may be accessed simultaneously must be placed on separate physical disks I/O scheduling in Dom-0 has little effect on disk workload interference Two sequential workloads, when co-located, suffer in performance due to queue contention With separate disks, workload contention is generally minimal, other than the case of two sequential workloads

23 Impacts of Virtualization

24 Sequentiality Postmark benchmark (reads) Not much overhead seen for random disk accesses VM overhead is masked by the larger disk seek overhead Overhead is felt more for sequential disk accesses

25 Block Size Postmark sequential reads Fixed overhead with every request As block size increases, the # of requests is reduced, hence overhead is reduced Efficient to read in larger blocks
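
The argument on this slide can be written as a simple cost model: total time = (number of requests × fixed per-request overhead) + (bytes transferred ÷ bandwidth). The sketch below uses that model with made-up numbers. The overhead and bandwidth values are assumptions chosen only for illustration, not measurements from the talk.

```python
import math


def read_time(total_bytes, block_size, per_request_overhead_s, bandwidth_bytes_per_s):
    """Estimated time to read `total_bytes` sequentially in `block_size` requests.

    Each request pays a fixed overhead (for example, the extra virtualization
    layers); the transfer time itself does not depend on the block size.
    """
    num_requests = math.ceil(total_bytes / block_size)
    transfer_s = total_bytes / bandwidth_bytes_per_s
    return num_requests * per_request_overhead_s + transfer_s


# Hypothetical values, for illustration only.
TOTAL = 64 * 1024 * 1024        # 64 MiB to read
OVERHEAD = 50e-6                # 50 microseconds of fixed cost per request
BANDWIDTH = 50 * 1024 * 1024    # 50 MiB/s sequential transfer rate

if __name__ == "__main__":
    for block in (4096, 65536, 1048576):
        print(block, round(read_time(TOTAL, block, OVERHEAD, BANDWIDTH), 3))
```

In this model the overhead term shrinks in proportion to the block size while the transfer term stays constant, so the gain from larger blocks levels off once the transfer time dominates.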

26 Block size w.r.t. Locality

27 Observations VM overhead is not felt in random workloads – it is amortized by disk seeks Extra layers of indirection are the reason for VM overhead – when block size is large, overhead is amortized Block size should be increased only if there is sufficient locality in access

28 Summary Storage purchased must depend on requirement, not price! It is better to place sequentially accessed streams in the outer disk zone Co-locate virtual disks that may be accessed simultaneously Co-locate a CPU-intensive task with a disk-intensive task for better power and performance Avoid co-locating two sequential workloads on a single physical machine – even when they use separate physical disks! Read in large blocks only when there is locality in the workload

29 Questions Contact: sankaran@gatech.edu

