© 2014 VMware Inc. All rights reserved. Characterizing Cloud Management Performance Adarsh Jagadeeshwaran CMG INDIA CONFERENCE, December 12, 2014.


1 © 2014 VMware Inc. All rights reserved. Characterizing Cloud Management Performance Adarsh Jagadeeshwaran CMG INDIA CONFERENCE, December 12, 2014

2 Agenda
– Building Blocks of VMware’s Cloud Infrastructure
– The Software Defined Datacenter
– Cloud Management Performance at VMware
– Performance Challenges
– Tools and Benchmarks
– Role of Simulation
– Performance Testing Methodology
– Conclusion

3 Building Blocks of VMware’s Cloud Infrastructure

4 It all started with x86 virtualization
[Figure panels: Traditional Architecture, Virtual Architecture]

5 And features like...
VM.migrate
– Move the compute state of a virtual machine (VM) from one physical box to another
– Typically used for resource load balancing
VM.snapshot
– Preserve the state and data of a VM at a specific point in time
– Snapshots help avoid damage to VMs when a patch or upgrade goes wrong
Distributed Resource Scheduling

6 Building the cloud
The New Role for IT: IT as a Service
Software-Defined Data Center: virtualize the entire data center
– Management and Automation
– Network and Security
– Compute
– Storage and Availability
Hybrid Cloud: seamlessly extend your data center to the public cloud
Virtual Workspace: manage access to services, applications and data for any device
[Diagram labels: 60%, Public Clouds, Private Clouds]

7 Cloud Infrastructure = Software Defined Data Center

8 Compute: CPU, memory resources
[Diagram: Compute layer hosting eight APP/OS virtual machines]


10 + Networking/Security
[Diagram: Compute, Storage and Network/Security layers hosting eight APP/OS virtual machines]

11 + Automation/Management – This is key
[Diagram: Compute, Storage and Network/Security layers hosting eight APP/OS virtual machines, with an Automation & Management layer]

12 = Virtual Datacenter
[Diagram: Software-defined Datacenter with Automation & Management, Compute, Storage and Network/Security layers hosting eight APP/OS virtual machines; labels VDC 1 and VDC 2]

13 Typical Deployment
[Diagram labels: R&D, Finance, Software-defined Datacenter, Services Grid, Software-defined Datacenter]

14 Cloud Management Performance at VMware

15 SDDC Management Suite
VMware vCloud® Suite
[Diagram labels: Operations Management; Software-Defined Storage and Availability; SDDC; Virtual; Virtual Networking and Security; Cloud Service Provisioning]

16 VMware Performance R&D
MEASURE: instrument, benchmark, analyze
PUBLISH: white papers, blogs, KB articles, flings
OPTIMIZE: design, fix code, tune settings
[Diagram center label: PERFORMANCE]

17 Performance Challenges

18 The Management Server
[Diagram labels: Server 1 (host_agent), Server 2 (host_agent), Relational Database, Single SignOn, UI Client, Stats Processing, UI Server, Inventory DB (xml), vm]

19 Components affecting performance
VM resources such as CPU and memory
– Shared with other VMs on the same physical server (host)
Virtual devices: storage, networking, VM devices
– Data stored in the management server database
Number of managed objects
– Data stored in the management server database
– ESXi hosts
– VMs
– Resource pools
– Clusters
Performance statistics about objects
– Stored and processed in the database
– Multiple levels of statistics, from less to more detailed
Incoming tasks and queries
– CPU/memory usage on the management server

20 Deployment Size
Overall size:
– Small: up to 150 servers, 3000 VMs
– Medium: up to 300 servers, 6000 VMs
– Large: up to 1000 servers, 10000 VMs
Single cluster size:
– Resource scheduling, availability and power management work at the cluster level
– Up to 32 servers or 4000 VMs in a single cluster
A setup with 50 servers and 2000 VMs, keeping the least detailed statistics, can result in a database size of approx. 16 GB
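The size tiers above can be sketched as a small lookup. This is only an illustration: the names `SizeClass`, `classify_deployment` and `cluster_within_limits` are hypothetical, and the sketch assumes that both the server limit and the VM limit must hold for a tier (or a cluster) to apply, which the slide does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeClass:
    name: str
    max_servers: int
    max_vms: int


# Limits copied from the slide; the tier names are lower-cased labels.
SIZE_CLASSES = (
    SizeClass("small", 150, 3000),
    SizeClass("medium", 300, 6000),
    SizeClass("large", 1000, 10000),
)

# Single-cluster limits from the slide.
CLUSTER_MAX_SERVERS = 32
CLUSTER_MAX_VMS = 4000


def classify_deployment(servers: int, vms: int) -> str:
    """Return the smallest tier whose server and VM limits both cover the setup."""
    for size_class in SIZE_CLASSES:
        if servers <= size_class.max_servers and vms <= size_class.max_vms:
            return size_class.name
    # Assumption: anything above the "large" limits gets its own label.
    return "beyond-large"


def cluster_within_limits(servers: int, vms: int) -> bool:
    """Assumption: a cluster must respect both the server and the VM limit."""
    return servers <= CLUSTER_MAX_SERVERS and vms <= CLUSTER_MAX_VMS
```

For example, `classify_deployment(50, 2000)` places the 50-server, 2000-VM setup mentioned on the slide in the "small" tier.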

21 Identify Common Use Cases
Cloud solutions, e.g. vCloud Director (spans multiple management servers)
Cloud Management Workflow 1: Instantiate vApp, Deploy vApp, Edit vApp, Undeploy vApp, Delete vApp
Cloud Management Workflow 2: Clone vApp, Delete vApp

22 Identify Common Use Cases (contd.): Customer Usage Patterns
Customer support data
– Software support bundle: logs, events, traces
Identify common operation patterns and their frequency
Group patterns by deployment size

23 Build Tools for Stats and Monitoring
Monitor resource usage
– Server level
– Management level
– Components of the management server
Build internal profiling counters
– Count of objects in memory
– Aggregated stats about tasks, events, etc.
– Locking information

24 Tools and Benchmarks

25 Microbenchmark
Simulates load on the server from a given operation
– Example: 256 VM.powerOn operations in sequence
Focuses on a specific operation (no background load)
Studies the scaling trend (latency) for a given operation
Studies the resource usage trend
Measures the performance of a specific server component
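A minimal sketch of such a microbenchmark loop follows. It is not the tool used in the talk: `power_on_stub` is a hypothetical stand-in for a real VM.powerOn call, and the harness only records per-operation latency for one operation issued in sequence with no background load.

```python
import statistics
import time


def power_on_stub(vm_index: int) -> None:
    """Hypothetical stand-in for a VM.powerOn call against a management server."""
    time.sleep(0.0001)


def run_microbenchmark(operation, count: int) -> dict:
    """Issue `count` calls of one operation in sequence and record each latency."""
    latencies = []
    for index in range(count):
        start = time.perf_counter()
        operation(index)
        latencies.append(time.perf_counter() - start)
    return {
        "count": count,
        "total_s": sum(latencies),
        "mean_s": statistics.mean(latencies),
        "max_s": max(latencies),
    }


def scaling_trend(operation, counts=(1, 32, 64, 128, 256)) -> dict:
    """Total time per batch size, to see how latency grows with operation count."""
    return {n: run_microbenchmark(operation, n)["total_s"] for n in counts}
```

Calling `run_microbenchmark(power_on_stub, 256)` mirrors the slide's example of 256 power-on operations in sequence; resource usage would have to be sampled separately.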

26 Macro-benchmark
In-house benchmark: VCBench
Simulate (admin) user tasks
– Issues management operations using public APIs
Simulate multiple users
– Multiple threads issuing a series of operations
(User) think time
– User can specify “think” time between operations
Realistic workload
– Operation mix and frequency extracted from customer data
Measure throughput
– Number of operations completed in a given time
Measure latency of operations in the presence of load, and the corresponding resource usage
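The structure described above (several user threads, think time between operations, throughput measured over a fixed window) can be sketched as below. This is an illustrative harness, not VCBench: the function names and the `(name, callable)` operation format are assumptions, and the callables stand in for real management API calls.

```python
import random
import threading
import time


def simulated_user(operations, think_time_s, deadline, results, lock, rng):
    """One simulated admin user: pick an operation, run it, record it, then 'think'."""
    while time.monotonic() < deadline:
        name, operation = rng.choice(operations)
        start = time.perf_counter()
        operation()
        latency = time.perf_counter() - start
        with lock:
            results.append((name, latency))
        time.sleep(think_time_s)


def run_macro_benchmark(operations, users, duration_s, think_time_s, seed=0):
    """Run `users` threads for roughly `duration_s` seconds and report throughput."""
    results = []
    lock = threading.Lock()
    deadline = time.monotonic() + duration_s
    threads = [
        threading.Thread(
            target=simulated_user,
            args=(operations, think_time_s, deadline, results, lock,
                  random.Random(seed + i)),
        )
        for i in range(users)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    completed = len(results)
    return {
        "completed": completed,
        # Throughput is normalized to the configured window, not the exact elapsed time.
        "throughput_per_min": completed * 60.0 / duration_s,
        "results": results,
    }
```

Per-operation latency under load can then be derived by grouping `results` by operation name.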

27 Benchmark Run Profile
Two primary modes
– “Light”: around 100 operations issued per minute
– “Heavy”: around 500 operations issued per minute
Light load is slightly above most customer workloads
– Lets us exercise the entire management stack
– And anticipate increased realistic demand in the short term
Heavy load is for saturating the management server
– Saturation is the point where giving the server more resources no longer increases throughput
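One way to make the saturation criterion concrete is to compare throughput across runs with increasing resources and flag the first step where the gain falls below a threshold. The sketch below assumes a 5% minimum relative gain, which is an arbitrary choice rather than a figure from the talk, and `find_saturation_point` is a hypothetical helper name.

```python
def find_saturation_point(measurements, min_gain=0.05):
    """Return the resource level after which extra resources stop helping.

    `measurements` is a list of (resources, throughput) pairs sorted by
    resources. The result is the first resource level whose successor improves
    throughput by less than `min_gain` (relative), or None if every step helps.
    """
    for (prev_res, prev_tp), (_, tp) in zip(measurements, measurements[1:]):
        if prev_tp > 0 and (tp - prev_tp) / prev_tp < min_gain:
            return prev_res
    return None


# Illustrative numbers only (vCPUs, operations per minute); not measured data.
example = [(2, 200.0), (4, 380.0), (8, 490.0), (16, 500.0)]
saturation = find_saturation_point(example)  # 8: doubling to 16 adds only about 2%
```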

28 Realistic Operation Mix

Operation          Operations/min (light)
Power On VM        40
Power Off VM       40
Clone VM           10
Migrate VM         40
Remove VM          10
Create Snapshot    5
Delete Snapshot    5
Reconfigure VM     10

The mix of operations is revised constantly, based on new features and changing datacenter use cases.
Mix and frequency can be varied simply by editing a run list.
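The run list can be pictured as a plain mapping from operation to rate. The sketch below copies the rates from the table; the dictionary format and helper names are assumptions, since the talk does not show the actual run list syntax. Scaling the mix proportionally to another total rate is likewise an assumption about how a heavier profile might be derived.

```python
# Rates (operations per minute) copied from the table above.
LIGHT_RUN_LIST = {
    "Power On VM": 40,
    "Power Off VM": 40,
    "Clone VM": 10,
    "Migrate VM": 40,
    "Remove VM": 10,
    "Create Snapshot": 5,
    "Delete Snapshot": 5,
    "Reconfigure VM": 10,
}


def total_rate(run_list):
    """Total operations per minute across the whole mix."""
    return sum(run_list.values())


def issue_intervals(run_list):
    """Seconds between consecutive issues of each operation at its listed rate."""
    return {op: 60.0 / rate for op, rate in run_list.items()}


def scale_run_list(run_list, target_total):
    """Assumption: keep the proportions of the mix and rescale to a new total."""
    factor = target_total / total_rate(run_list)
    return {op: rate * factor for op, rate in run_list.items()}
```

With these numbers, `total_rate(LIGHT_RUN_LIST)` is 160 operations per minute, and `issue_intervals(LIGHT_RUN_LIST)["Power On VM"]` is 1.5 seconds.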

29 Tools for monitoring performance
Resource usage tool
– Tool built into the hypervisor (esxtop) and the management server
– Monitoring at component level
Profiling tools (post-process)
– Use the management server’s internal profiling information from the log bundle
– Summarize performance metrics of internal objects

30 Role of Simulation

31 Why Simulation?
1024 physical servers running ESXi (hosts) are a management nightmare
Plus 15K VMs and the associated networking and storage components
Solution?
– Have a simulated version of the hypervisor
– Fake the existence of VMs and datastores
– Management server sees no difference

32 Simulating the hypervisor
The hypervisor agent is the management server’s agent running on the ESXi server
With the hardware and VMs simulated, the real hypervisor agents can run as separate threads in different containers
Agent-to-management-server communication is kept intact
The number of objects and properties managed by the server remains the same
Some challenges:
– Simulating performance statistics, events and alarms
– Simulating VM I/O
Advantages:
– Hypervisor layer is a black box with consistent performance
– No hypervisor or storage performance bottleneck
– Focus is purely on management server scaling and performance
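The idea of faking VMs behind an unchanged agent interface can be illustrated with a toy model. Everything here is hypothetical (`SimulatedVM`, `SimulatedHostAgent`, `build_inventory`, the returned dictionary shape); the real setup described in the talk runs actual hypervisor agents over simulated hardware, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass, field


@dataclass
class SimulatedVM:
    name: str
    powered_on: bool = False


@dataclass
class SimulatedHostAgent:
    """Toy stand-in for a host agent whose VMs exist only as in-memory records."""
    host_name: str
    vms: dict = field(default_factory=dict)

    def add_vm(self, name: str) -> None:
        self.vms[name] = SimulatedVM(name)

    def power_on(self, name: str) -> dict:
        # No hardware is touched: only the recorded state changes, so the
        # response time of the "hypervisor" stays consistent.
        vm = self.vms[name]
        vm.powered_on = True
        return {"host": self.host_name, "vm": vm.name, "state": "poweredOn"}

    def inventory(self) -> list:
        return sorted(self.vms)


def build_inventory(host_count: int, vms_per_host: int) -> list:
    """Create `host_count` simulated agents, each reporting `vms_per_host` VMs."""
    agents = []
    for h in range(host_count):
        agent = SimulatedHostAgent(f"host-{h}")
        for v in range(vms_per_host):
            agent.add_vm(f"vm-{h}-{v}")
        agents.append(agent)
    return agents
```

Because the agents are just objects, an inventory of thousands of hosts and VMs can be generated without the physical servers the previous slide calls a management nightmare.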

33 Performance Testing Methodology

34 Testing for Performance and Scale
Testing at supported scale
Hypervisor scaling (scale-up)
– Stacking more VMs on the same physical box
– Focus is on hypervisor performance
Management server scaling (scale-out)
– Managing more physical boxes and VMs
– Focus is on management server performance
– a) Single cluster at scale
– b) Overall large deployment

35 Test configurations
Scale-up
– 1 or 2 ESXi hosts
– 0.5-1K VMs per host
– Microbenchmark with focus on one operation at a time
– 1, 32, 64, 128, 256, 512 vm.powerOn, vm.reconfigure, etc.
– Metrics measured: end-to-end latency, CPU/memory usage
Scale-out
– 1024 ESXi hosts managed by a single management server
– 15K VMs total
– Benchmark with concurrently issued operations: datacenter.powerOn, vm.migrate, etc.
– Metrics measured: operation throughput, latency, CPU/memory usage

36 Regression Tracking
Performance automation handles setup and regression tracking
Tracking for inventories of different scale
Track benchmark data (throughput, latency) and resource usage of management server components for regressions
Analyze and fix performance regressions
Also useful for sizing guidelines
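A regression check of this kind can be sketched as a comparison of the current run against a stored baseline. The 10% tolerance, the metric names and the `find_regressions` helper are assumptions made for illustration; the talk does not describe how its automation decides that a change counts as a regression.

```python
def find_regressions(baseline, current, tolerance=0.10,
                     higher_is_better=("throughput",)):
    """Return {metric: relative_change} for metrics that got worse beyond tolerance.

    Metrics named in `higher_is_better` regress when they drop; all other
    metrics (latency, CPU, memory) regress when they rise.
    """
    regressions = {}
    for metric, base_value in baseline.items():
        if metric not in current or base_value == 0:
            continue  # nothing to compare against
        change = (current[metric] - base_value) / base_value
        worse = -change if metric in higher_is_better else change
        if worse > tolerance:
            regressions[metric] = change
    return regressions


# Illustrative values only; not measurements from the talk.
baseline_run = {"throughput": 100.0, "latency_s": 2.0, "cpu_pct": 50.0}
current_run = {"throughput": 85.0, "latency_s": 2.1, "cpu_pct": 60.0}
flagged = find_regressions(baseline_run, current_run)
```

In this example throughput (down 15%) and CPU usage (up 20%) are flagged, while the 5% latency increase stays within tolerance.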

37 Conclusion

38 Takeaways
Understand the factors affecting performance
Have a comprehensive stats/monitoring framework
Build a realistic benchmark that replicates customer behavior
An ideal benchmark run should
– Include common use cases and user behavior
– Remove variability in a multi-tiered setup
– Be able to focus on a single component
Simulation can help remove variability and help with scaling
Generate microbenchmarks that stress a single component or a small number of components

39 References

40 Thanks to
– VMware vCenter Server Performance Team
– “Benchmarking a Virtualized Platform”, Vijayaraghavan Soundararajan et al., IISWC 2014

41 Backup

42 Example SDDC Management Task: Distributed Resource Scheduling using VMotion
Balance VM load in a cluster of ESXi servers
Enforce policy-based rules
Power management
[Diagram labels: VMware ESX, VMware ESXi, Resource Pool]
