Managing the Capacity and Performance of a VMware Cluster environment Presented by: Pete Weilnau CTO PERFMAN
2 Agenda: Quick Architecture Overview; Cluster Level Performance; Sources of additional information
3 VMware Architecture From VMware VI3 brochure – © 2008 VMware, Inc.
4 VMware Clusters: VirtualCenter is required to create and manage a “Cluster”. Without VirtualCenter there are only ESX Servers (Hosts). When VirtualCenter is down, cluster benefits don’t exist. Cluster Benefits: host failover recovery via HA (High Availability); workload balancing via DRS (Distributed Resource Scheduler).
5 VMware Cluster From VMware VI3 brochure – © 2008 VMware, Inc. Cluster
6 Resource Pools: A logical abstraction for hierarchically managing CPU and memory resources. Used on a standalone host or on VMware DRS-enabled clusters. Provide resources for virtual machines and child pools.
7 VMware Resource Pools From VMware VI3 brochure – © 2008 VMware, Inc. Resource Pools
8 VMware DRS: Managed by VirtualCenter. Balances virtual machine load across hosts in a cluster. Enforces resource policies accurately (reservations, limits, shares). Respects placement constraints: affinity and anti-affinity rules, VMotion compatibility.
9 VMware HA (High Availability): Provides automatic restart of virtual machines in case of a physical server failure. A feature of VirtualCenter.
10 VirtualCenter Cluster - Hosts
11 VirtualCenter Cluster - VMs
12 VirtualCenter – Cluster Level Performance (CPU)
13 VirtualCenter – Cluster Level Performance (Memory)
14 About VI Performance Counters: VirtualCenter statistics levels; higher settings increase the amount of data collected. Level 1: collects resource use averages (excluding devices), uptime, heartbeat, and DRS data. Level 2: adds usage summation and rollup types to level 1. Level 3: adds device metrics. Level 4: adds maximum and minimum rollup types.
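Because each level adds to the one below it, a counter should be available whenever the configured level is at or above the level at which that counter is first collected. The sketch below illustrates that rule. It assumes that the digits shown against each measurement on the deck's CPU measurement slide are statistics levels, and the names `COUNTER_LEVEL` and `collected_counters` are hypothetical, not part of the VI SDK.

```python
# Level at which each CPU counter is first collected.
# Assumption: values are read from the CPU measurement slide of this deck,
# interpreting the digits there as VirtualCenter statistics levels.
COUNTER_LEVEL = {
    "usage": 1,
    "usagemhz": 1,
    "system": 3,
    "wait": 3,
    "ready": 3,
    "extra": 3,
    "used": 3,
}


def collected_counters(configured_level):
    """Return the counters collected at the given statistics level (1-4)."""
    if configured_level not in (1, 2, 3, 4):
        raise ValueError("statistics level must be between 1 and 4")
    # Levels are cumulative: a counter is collected once the configured
    # level reaches the counter's own level.
    return sorted(
        name for name, level in COUNTER_LEVEL.items()
        if level <= configured_level
    )


if __name__ == "__main__":
    print(collected_counters(1))
    print(collected_counters(3))
```

Under this reading, ready time would not be collected at level 1, which matters for the derived % Ready views later in the deck.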
15 Cluster Resource Considerations: CPU, Memory, Network IO, Disk IO, Disk Space, Cluster Services
16 CPU Measurement Examples – VI SDK (columns: Cluster, Resource Pool, Host, VM): Usage (%) 1 1; Usagemhz (MHz) 1 1 1 1; System (ms) 3; Wait (ms) 3; Ready (ms) 3; Extra (ms) 3; Used (ms) 3 3; effectiveCPU (MHz) (capacity rating) X. Note – this list is not comprehensive.
17 resCPU Measurements – VI SDK (columns: Cluster, Resource Pool, Host, VM): Actav1/5/15 (%) 3 3, the average active time for the CPU over the past minute/5 min/15 min; Actpk1/5/15 (%) 3 3, the peak active time for the CPU over the past minute/5 min/15 min; Runav1/5/15 (%) 3 3, the average runtime for the CPU over the past minute/5 min/15 min; Runpk1/5/15 (%) 3 3, the peak runtime for the CPU over the past minute/5 min/15 min. Note – this list is not comprehensive.
18 Cluster CPU statistics
19 Cluster CPU statistics – Derived Utilization
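One way such a derived utilization figure could be computed is to sum the hosts' Usagemhz values and divide by the cluster's effectiveCPU capacity rating. This is a minimal sketch under that assumption, not necessarily the calculation behind the chart; the function name and the sample numbers are hypothetical.

```python
def cluster_cpu_utilization(host_usage_mhz, effective_cpu_mhz):
    """Derive cluster CPU utilization (%) from per-host MHz usage.

    host_usage_mhz:    iterable of Usagemhz samples, one per host
    effective_cpu_mhz: cluster effectiveCPU capacity rating in MHz
    """
    if effective_cpu_mhz <= 0:
        raise ValueError("effective CPU capacity must be positive")
    return 100.0 * sum(host_usage_mhz) / effective_cpu_mhz


if __name__ == "__main__":
    # Hypothetical three-host cluster with 12,000 MHz of effective capacity.
    print(cluster_cpu_utilization([1200, 800, 1000], 12000))
```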
20 Cluster CPU statistics – Derived CPU % Ready
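The SDK reports Ready as milliseconds accumulated over a sampling interval, so a percentage has to be derived. A common approach, sketched here as an assumption rather than as the presenter's exact method, divides the ready time by the interval length and, optionally, by the number of virtual CPUs. The 20-second interval in the usage example is an assumed real-time sampling interval; the function name is hypothetical.

```python
def cpu_ready_percent(ready_ms, interval_seconds, num_vcpus=1):
    """Convert a Ready (ms) summation into a percentage of the interval.

    ready_ms:         ready time accumulated during the interval
    interval_seconds: length of the sampling interval
    num_vcpus:        normalize across this many virtual CPUs (assumption)
    """
    if interval_seconds <= 0 or num_vcpus <= 0:
        raise ValueError("interval and vCPU count must be positive")
    interval_ms = interval_seconds * 1000.0
    return 100.0 * ready_ms / (interval_ms * num_vcpus)


if __name__ == "__main__":
    # Hypothetical sample: 1,000 ms of ready time in a 20-second interval.
    print(cpu_ready_percent(1000, 20))
    print(cpu_ready_percent(1000, 20, num_vcpus=2))
```

Summing or averaging the per-VM percentages would then give a cluster-level view, depending on which question the chart is meant to answer.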
21 Cluster CPU statistics – Derived VM Usage Summary
22 Cluster CPU statistics – Derived MultiHost Comparisons
23 Cluster CPU statistics – Derived MultiHost Comparisons
24 Cluster CPU statistics – Derived MultiHost Comparisons
25 Resource Pool CPU MHz Usage
26 VM Level CPU Usage
27 VM Level CPU Usage
28 VM Level CPU Usage
29 VM Consumption of Virtual CPU
30 VM Level - % Ready Time
31 VM Level – 1 Min Peak CPU
32 VM - Number of Virtual CPUs
33 VM Impact on Cluster - Derived
34 Memory Measurement Examples – VI SDK (columns: Cluster, Resource Pool, Host, VM): Usage (%), portion of memory in use (configured / available memory): 1 1 1 1; Granted (KB), memory granted to VMs: 2 2 2 2; Active (KB), recently touched pages: 2 2 2 2; Consumed (KB), physical memory used by VMs, excluding shared and overhead: 2 2 2 2; Shared (KB), memory shared between VMs: 2 2 2 2; Swapused (KB), memory used by swap: 2 2; Sharedcommon (KB), memory shared in common between VMs: 2; Swapped (KB), amount of memory swapped: 2 2; Reservedcapacity (MB), memory reserved for VMs: 2 2 2; Totalmem (KB), DRS effective memory resources: 1. Note – this list is not comprehensive.
35 Cluster Memory statistics
36 Cluster Memory statistics - Derived
37 Cluster Memory statistics - Derived
38 Cluster Memory statistics - Derived
39 Cluster Memory Heap statistics - Derived
40 Cluster Memory – MultiHost Comparisons
41 Cluster Memory – MultiHost Comparisons
42 Cluster Memory – MultiHost Comparisons
43 Resource Pool Memory Stats
44 Network Measurement Examples – VI SDK (columns: Cluster, Resource Pool, Host, VM): Usage (KBps), sum of data transmitted: 1 1; packetRx, packets received (by NIC): 3 3; packetTx, packets transmitted (by NIC): 3 3; Received, rate at which data is received (by NIC): 3 3; Transmitted, rate at which data is transmitted (by NIC): 3 3. Note – this list is not comprehensive.
45 Cluster Network Activity - Derived
46 Cluster Network Activity – MultiHost Comparisons
47 Disk Measurement Examples – VI SDK (columns: Cluster, Resource Pool, Host, VM): Usage (KB), total data reads + writes: 1 1; Read (KBps), data read (by disk): 3 3; Write (KBps), data written (by disk): 3 3; totalReadLatency, average time for a read by a guest OS: 3 3; kernelReadLatency, time spent in the ESX Server VMkernel: 3 3; deviceReadLatency, average time taken to read from the physical device: 3 3; queueReadLatency, average time spent in the VMkernel queue per read: 3 3; etc.: 3 3. Note – this list is not comprehensive.
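The latency counters can be combined to show where read time is being spent. The sketch below assumes that the latency seen by the guest is roughly the sum of VMkernel and device latency, and that queue latency is a component of the VMkernel time. Both relations are assumptions for illustration, and the function name and sample values are hypothetical.

```python
def read_latency_breakdown(kernel_ms, device_ms, queue_ms):
    """Split read latency into kernel, device, and queue percentages.

    Assumption: total latency = kernel latency + device latency, with
    queue latency counted inside the kernel portion.
    """
    total_ms = kernel_ms + device_ms
    if total_ms == 0:
        return {"total_ms": 0.0, "kernel_pct": 0.0,
                "device_pct": 0.0, "queue_pct": 0.0}
    return {
        "total_ms": total_ms,
        "kernel_pct": 100.0 * kernel_ms / total_ms,
        "device_pct": 100.0 * device_ms / total_ms,
        "queue_pct": 100.0 * queue_ms / total_ms,
    }


if __name__ == "__main__":
    # Hypothetical sample: most of the read time is spent at the device.
    print(read_latency_breakdown(kernel_ms=2.0, device_ms=8.0, queue_ms=1.0))
```

A high device percentage points at the storage array or path, while a high kernel or queue percentage suggests contention inside the host.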
48 Cluster Disk Activity – Derived
49 Cluster Disk Activity – Derived
50 Cluster Disk Activity – Derived
51 Cluster Disk Activity – MultiHost Comparisons
52 Cluster Disk Activity – MultiHost Comparisons
53 Cluster Disk Activity – MultiHost Comparisons
54 Disk Space: VI reports disk space at two levels: Virtual Machine -> Disk, and Host -> Datastore.
55 Cluster Disk Space Information by VM – Derived
56 Cluster Disk Space by Datastore – Host Level
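Since space is reported per virtual machine disk and per host datastore, a cluster view has to be assembled from the two. A minimal sketch, assuming each VM disk record names the datastore it resides on; the record layout, function name, and sample data are hypothetical.

```python
from collections import defaultdict


def datastore_usage(vm_disks, capacity_gb):
    """Roll up per-VM disk sizes into percent used per datastore.

    vm_disks:    iterable of (vm_name, datastore_name, size_gb) tuples
    capacity_gb: dict mapping datastore_name to its capacity in GB
    """
    used_gb = defaultdict(float)
    for _vm_name, datastore, size_gb in vm_disks:
        used_gb[datastore] += size_gb
    # Datastores with no VM disks report 0.0 percent used.
    return {
        datastore: round(100.0 * used_gb[datastore] / capacity, 1)
        for datastore, capacity in capacity_gb.items()
    }


if __name__ == "__main__":
    disks = [("vm1", "ds1", 40.0), ("vm2", "ds1", 60.0), ("vm3", "ds2", 50.0)]
    print(datastore_usage(disks, {"ds1": 400.0, "ds2": 200.0}))
```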
57 Summary: VMware Clusters provide a powerful virtualization platform. Limited direct instrumentation is available from VirtualCenter and the SDK, but with careful thought it is possible to derive powerful views of cluster activity.
Thank You for your time. Presented by: Pete Weilnau Chief Architect PERFMAN