Host Failures a Cluster Tolerates
[Diagram: hosts ESX01, ESX02, and ESX03 attached to shared storage holding vm.vmdk]
Default Minimum Slot Size
If you have not specified a CPU reservation for a virtual machine, it is assigned a default value of 32 MHz. When the memory reservation is 0, the memory slot size equals the virtual machine overhead.
[Diagram: slot of 32 MHz and 69 MB; VM1, VM2, VM3, VM4, VM..n]
Slot Size Based on Reservation
vSphere HA calculates the CPU and memory slot size by obtaining the CPU and memory reservation of each powered-on virtual machine and selecting the largest value of each.
[Diagram: slot of 512 MHz and 1093 MB; VM1, VM2, VM3, VM4, VM…n]
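The slot size rule from the two slides above can be sketched in Python. This is a minimal illustration, not VMware code: the `VM` class, the `slot_size` function, and the sample VMs are hypothetical, and the 69 MB overhead is simply the value shown in the diagram. It assumes the memory component is the reservation plus the per-VM overhead, which matches the 1093 MB (1024 MB + 69 MB) figure above.

```python
from dataclasses import dataclass

# Default CPU value used when a VM has no CPU reservation (from the slide).
DEFAULT_CPU_MHZ = 32


@dataclass
class VM:
    """Hypothetical record of a powered-on VM, for illustration only."""
    name: str
    cpu_reservation_mhz: int  # 0 means no CPU reservation is set
    mem_reservation_mb: int   # 0 means no memory reservation is set
    mem_overhead_mb: int      # per-VM memory overhead (illustrative value)


def slot_size(powered_on_vms):
    """Return (cpu_mhz, mem_mb): the largest requirement among the VMs."""
    cpu = max(vm.cpu_reservation_mhz or DEFAULT_CPU_MHZ for vm in powered_on_vms)
    mem = max(vm.mem_reservation_mb + vm.mem_overhead_mb for vm in powered_on_vms)
    return cpu, mem


vms = [
    VM("VM1", 0, 0, 69),
    VM("VM2", 512, 1024, 69),  # the one VM with reservations sets the slot size
    VM("VM3", 0, 0, 69),
]
print(slot_size(vms))
```

A single VM with a large reservation raises the slot size for the whole cluster, which is why the slides advise care with reservations.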
HA Advanced Settings
[Diagram: slot composed of a CPU reservation and a memory reservation]
– CPU reservation: das.slotcpuinmhz, das.vmcpuminmhz
– Memory reservation: das.slotmeminmb, das.vmmemoryminmb
VMs Requiring Multiple Slots
You can also determine the risk of resource fragmentation in your cluster by viewing the number of virtual machines that require multiple slots. VMs might require multiple slots if you have specified a fixed slot size or a maximum slot size using advanced options.
[Diagram: slot size of 512 MHz and 512 MB compared with the reservations of VM1 through VM6]
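To make the multiple-slot case concrete, here is a small Python sketch. The function name and the rounding rule are assumptions for illustration: it assumes a VM needs enough slots to cover the larger of its CPU and memory reservations, rounded up, and at least one slot.

```python
import math


def slots_required(cpu_reservation_mhz, mem_reservation_mb,
                   slot_cpu_mhz, slot_mem_mb):
    """Estimate how many fixed-size slots one VM occupies (illustrative rule)."""
    cpu_slots = math.ceil(cpu_reservation_mhz / slot_cpu_mhz)
    mem_slots = math.ceil(mem_reservation_mb / slot_mem_mb)
    # Every powered-on VM takes at least one slot, even with no reservation.
    return max(1, cpu_slots, mem_slots)


# With a 512 MHz / 512 MB slot, a VM reserving 1024 MB of memory needs 2 slots.
print(slots_required(512, 1024, 512, 512))
```

If the free slots are spread across several hosts, a VM that needs two or more slots may not fit on any single host, which is the fragmentation risk the slide refers to.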
Fragmented Failover Capacity
[Diagram: hosts ESX1, ESX2, and ESX3 attached to shared storage holding vm.vmdk]
Admission Control Based on Reservations
vSphere HA uses the actual individual reservations of the virtual machines. The CPU component is calculated by summing the CPU reservations of the powered-on VMs.
Computing the Current Failover Capacity
If you have not specified a CPU reservation for a VM, it is assigned a default value of 32 MHz.
Resources Reserved Is Not Utilization
The Current CPU Failover Capacity is computed by subtracting the total CPU resource requirements from the total host CPU resources and dividing the result by the total host CPU resources.
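The calculation described above can be written out as a short Python sketch. The function name and the host and VM figures are hypothetical. The 32 MHz default for VMs without a CPU reservation comes from the earlier slide.

```python
# Default CPU value used when a VM has no CPU reservation (from the slides).
DEFAULT_CPU_MHZ = 32


def current_cpu_failover_capacity(host_cpu_mhz, vm_cpu_reservations_mhz):
    """Return the Current CPU Failover Capacity as a fraction between 0 and 1.

    host_cpu_mhz: CPU resources of each host in the cluster.
    vm_cpu_reservations_mhz: CPU reservation of each powered-on VM (0 = none).
    """
    total_host = sum(host_cpu_mhz)
    required = sum(r if r > 0 else DEFAULT_CPU_MHZ
                   for r in vm_cpu_reservations_mhz)
    return (total_host - required) / total_host


# Hypothetical cluster: three hosts, five powered-on VMs.
hosts = [9000, 9000, 6000]
reservations = [2000, 2000, 1000, 1000, 1000]
print(f"{current_cpu_failover_capacity(hosts, reservations):.1%}")
```

Because only reservations enter the formula, a cluster running at high CPU utilization can still report a large failover capacity when few reservations are set.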
Percentage Reserved Advanced Setting
– The default CPU reservation for a VM can be changed using the das.vmcpuminmhz advanced attribute.
– das.vmmemoryminmb defines the default memory resource value assigned to a VM.
Specify Failover Hosts Admission Control Policy
[Diagram: hosts ESX01, ESX02, and ESX03 attached to shared storage holding vm.vmdk]
Configure vSphere HA to designate specific hosts as the failover hosts.
The Failover Host
– To ensure that spare capacity is available on a failover host, you are prevented from powering on virtual machines or using vMotion to migrate VMs to a failover host. Also, DRS does not use a failover host for load balancing.
– If you use the Specify Failover Hosts admission control policy and designate multiple failover hosts, DRS does not attempt to enforce VM-VM affinity rules for virtual machines that are running on failover hosts.
Status of the Current Failover Hosts
– Red: The host is disconnected, in maintenance mode, or has vSphere HA errors.
– Green: The host is connected, not in maintenance mode, and has no vSphere HA errors. No powered-on VMs reside on the host.
– Yellow: The host is connected, not in maintenance mode, and has no vSphere HA errors. However, powered-on VMs reside on the host.
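The three status colors reduce to a simple decision rule, sketched here in Python. The function and its parameters are hypothetical names for illustration and do not correspond to any vSphere API.

```python
def failover_host_status(connected, in_maintenance_mode, has_ha_errors,
                         powered_on_vms):
    """Map a failover host's state to the status color described above."""
    # Red: the host cannot currently act as a failover target.
    if not connected or in_maintenance_mode or has_ha_errors:
        return "red"
    # Yellow: healthy, but VMs are already running on the failover host.
    if powered_on_vms > 0:
        return "yellow"
    # Green: healthy and empty.
    return "green"


print(failover_host_status(True, False, False, 0))
```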
Myth Busted
– VMware High Availability needs to be configured
– Be careful with reservations
– Always check run-time information
What Is a Snapshot?
– Preserves the state and data of a VM at a specific point in time
– Data includes virtual disks, settings, and (optionally) memory
– Allows you to revert to a previous state
– Typically used by VM admins when making changes and by backup software
– ESX3 and ESX(i)4 had issues with deleting snapshots
– ESXi5 improved snapshot consolidation
What Is a Snapshot?
File         Description
.vmdk        Original virtual disk
delta.vmdk   Snapshot delta disk
.vmsd        DB file with relations between snapshots
.vmsn        Memory file
Snapshot grows in 16 MB chunks
– Requires locking
Locks
– Locks are necessary when creating, deleting, or growing a snapshot, powering a VM on or off, and creating a VMDK
– ESX(i)4 used SCSI-2 reservations
  – Locks the entire LUN
Locks
– ESXi5 uses the Atomic Test & Set (ATS) VAAI primitive
  – Locks only the individual VM
  – Requires a VAAI-enabled array and VMFS-5
Performance
– Locking
  – ATS increases performance by up to 70% compared to SCSI-2 reservations
– Normal operations are affected by:
  – Snapshot age
  – Number of snapshots
  – Snapshot size
– Be careful with snapshots in production!
Myth Not Busted
– Improvements to snapshot management and locking
– Snapshots still have an impact on performance
Myth 3
Disk provisioning type doesn't affect performance
Block Allocation
[Diagram: VMDK blocks, VMDK file size, and written blocks for Thick Provision Lazy Zeroed, Thin Provision, and Thick Provision Eager Zeroed VMDKs]
The iSCSI Laboratory
– Iomega StorCenter px6-300d with 6 SATA 7200 rpm disks
– Windows 2008 R2, 4096 MB, 1 vCPU
– Hardware Version 9
– VMware vSphere 5.1
– Single Intel 1 Gb Ethernet
– Cisco 2960 switch
– MTU size 1500
Thick Provision Lazy Zeroed
Average write: 13.3 MB/s, access time: 44.8 ms
Thin Provision
Average write: 13.7 MB/s, access time: 46.8 ms
Thick Provision Eager Zeroed
Average write: 86.6 MB/s, access time: 9.85 ms
Comparison
Provisioning type              Average write   Access time
Thick Provision Lazy Zeroed    13.3 MB/s       44.8 ms
Thin Provision                 13.7 MB/s       46.8 ms
Thick Provision Eager Zeroed   86.6 MB/s       9.85 ms
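To put the measurements in proportion, the short Python sketch below computes each format's write throughput relative to Thick Provision Lazy Zeroed. The numbers are the lab results above. The variable names are illustrative, and the ratios apply only to this first-write test on this particular setup.

```python
# (average write in MB/s, access time in ms) from the lab measurements above
results = {
    "Thick Provision Lazy Zeroed": (13.3, 44.8),
    "Thin Provision": (13.7, 46.8),
    "Thick Provision Eager Zeroed": (86.6, 9.85),
}

baseline_write = results["Thick Provision Lazy Zeroed"][0]

# Write throughput of each format relative to the lazy-zeroed baseline.
write_ratio = {name: write / baseline_write
               for name, (write, _access) in results.items()}

for name, ratio in write_ratio.items():
    print(f"{name}: {ratio:.2f}x")
```

Thin and lazy-zeroed disks land within a few percent of each other, while the eager-zeroed disk writes roughly 6.5 times faster in this test.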
Migration
Storage vMotion can change the disk format of a virtual machine during migration.
Myth Busted
– Thin and Lazy Zeroed disks have the same speed
– Once blocks are allocated, these disks are as fast as Eager Zeroed disks
– Thick Provision Eager Zeroed offers the best performance from the first write on
Myth 4
Always use VMware Tools to sync the time in your VM
Time Sync Problems
– VMs do not have access to native physical hardware timers
– Scheduling can cause time to fall behind
– CPU and memory overcommit increases the risk
– People mix different time sync options
VMware Tools
– ESX(i) 4 and prior: not possible to adjust time backwards
– ESXi 5: time sync is more accurate and can also adjust time backwards
– Enable or disable periodic sync in the VMware Tools GUI, vCenter, or the VMX file
VMware Tools
– Default periodic sync interval is 60 sec
– Sync is forced even when periodic sync is disabled:
  – Resume, Revert Snapshot, Disk Shrink, and vMotion
– To disable sync completely (for example, in testing scenarios), configure the .vmx file:
    tools.syncTime = "FALSE"
    time.synchronize.continue = "FALSE"
    time.synchronize.restore = "FALSE"
    time.synchronize.resume.disk = "FALSE"
    time.synchronize.shrink = "FALSE"
    time.synchronize.tools.startup = "FALSE"
    time.synchronize.resume.host = "FALSE"
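As a rough aid for checking a VM's configuration, the Python sketch below scans the text of a .vmx file and reports which of the settings listed above are not set to FALSE. The function names are hypothetical, and the parser is a simplification that only handles plain `key = value` lines. It assumes a missing key means the corresponding sync is still enabled.

```python
# The seven time sync settings listed on the slide above.
TIME_SYNC_KEYS = [
    "tools.syncTime",
    "time.synchronize.continue",
    "time.synchronize.restore",
    "time.synchronize.resume.disk",
    "time.synchronize.shrink",
    "time.synchronize.tools.startup",
    "time.synchronize.resume.host",
]


def parse_vmx(text):
    """Parse simple 'key = value' lines into a dict with lowercase keys."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip whitespace and optional surrounding double quotes.
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings


def time_sync_still_enabled(vmx_text):
    """Return the listed keys that are missing or not set to FALSE."""
    settings = parse_vmx(vmx_text)
    return [key for key in TIME_SYNC_KEYS
            if settings.get(key.lower(), "").upper() != "FALSE"]


# Example: a configuration that sets every listed key except the last one.
sample = "\n".join(f'{key} = "FALSE"' for key in TIME_SYNC_KEYS[:-1])
print(time_sync_still_enabled(sample))
```

An empty result means every setting from the slide is explicitly disabled in the text that was checked.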
Guest OS Services
Windows (W32Time service)
– Windows 2000 uses SNTP
– Windows 2003+ uses NTP and provides better sync options and accuracy
– Domain-joined VMs sync from the DC
– Use Group Policy to control settings
Linux (NTP)
– Configure /etc/ntp.conf
– Start ntpd:
    chkconfig ntpd on
    /etc/init.d/ntpd start
Best Practices
ESX(i) hosts:
– Configure multiple NTP servers
– Start the NTP service
Virtual machines:
– Disable VMware Tools periodic sync
– DC: configure multiple NTP servers (same as the ESX(i) host)
– Domain-joined VMs will sync with the DC
– If not domain joined, configure W32Time or NTP manually
Do not use both VMware Tools periodic sync and guest OS time sync simultaneously!
Myth Busted
– Use W32Time or NTP
– Do not use VMware Tools periodic sync
Summary
– Myth 1: VMware High Availability needs to be configured; be careful with reservations and always check run-time information
– Myth 2: Snapshot management and locking have improved, but snapshots still have a performance impact
– Myth 3: Use Thick Provision Eager Zeroed disks for the best I/O performance
– Myth 4: Use W32Time or NTP to sync time instead of VMware Tools