© 2009 VMware Inc. All rights reserved vSphere Performance Best Practices Rob Moran Premier Services Engineer – VMware Global Support Services – Cork,

Presentation transcript:

© 2009 VMware Inc. All rights reserved vSphere Performance Best Practices Rob Moran Premier Services Engineer – VMware Global Support Services – Cork, Ireland

2 Global Support Services and Customer Advocacy
• Support offices: Bangalore, India; Tokyo, Japan; Cork, Ireland; Burlington, Canada; Palo Alto, CA; Broomfield, CO
• Local language support: Spanish, Portuguese, French, German, Japanese, Chinese
• Global coverage: 24x7, 365 days/year
• 6 Support Centers
• Support Engineers
• Follow-the-sun support for Severity 1 issues
• Support relationships with 100% of the Fortune 100; 99% of Fortune 500

3 Customer Support Day Events
• Coming to a location near you: sharing of VMware best practices!
• Support Days are a collaboration between VMware Support, Sales and customers – you learn directly from the experts
• Topics are driven by customer input, and typically include:
  - Best practices
  - Tips/tricks
  - Top issues
  - Product roadmaps/demos
  - Certification offerings

4 Overview
• What a performance problem sounds like:
  - “My VM is running slow and I don’t know what to do!”
  - “I tried adding more memory and CPUs but the problem got worse!”
  - “My VM is slow on one host but fast on another!”
• What to look for? Where to start?
• We will explore some of the most common performance-related issues that our support centers receive cases for

5 A word about performance…
• Troubleshooting methodology must define:
  - How to find root cause
  - How to fix the problem
• Must answer these questions:
  1. How do we know when we are done?
  2. Where do we start looking for problems?
  3. How do we know what to look for to identify a problem?
  4. How do we find the root cause of a problem we have identified?
  5. What do we change to fix the root cause?
  6. Where do we look next if no problem is found?

6 Agenda
• Benchmarking & Tools
• Best Practices and Troubleshooting
• The 4 “food groups”
  - Memory
  - CPU
  - Storage
  - Network

BENCHMARKING & TOOLS

8 Benchmarking
• Consistent and reproducible results
• Important to have a base level of acceptable performance
  - Expectation vs. acceptable
• Determine a baseline of performance prior to deployment
  - Benchmark on a physical system if applicable
• Avoid subjective metrics, stay quantitative
  - “The system seems slower”
  - “This worked better last year”
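
The advice to stay quantitative can be sketched as a comparison of new measurements against a recorded baseline. This is only an illustration: the function name, the sample response times, and the 10% tolerance are assumptions, not values from the presentation.

```python
from statistics import mean, stdev


def compare_to_baseline(baseline, current, tolerance=0.10):
    """Compare mean response times (lower is better) against a baseline.

    `tolerance` is an assumed acceptable slowdown (10% by default); choose a
    value that matches your own definition of acceptable performance.
    """
    base_mean = mean(baseline)
    cur_mean = mean(current)
    change = (cur_mean - base_mean) / base_mean
    return {
        "baseline_mean": base_mean,
        "baseline_stdev": stdev(baseline),
        "current_mean": cur_mean,
        "relative_change": change,
        "regressed": change > tolerance,
    }


# Hypothetical response times in milliseconds from repeated benchmark runs.
physical_baseline = [100.0, 102.0, 98.0, 101.0, 99.0]
virtualized_run = [118.0, 121.0, 119.0, 122.0, 120.0]

report = compare_to_baseline(physical_baseline, virtualized_run)
print(f"change: {report['relative_change']:+.1%}, regressed: {report['regressed']}")
```

Repeating each run several times and recording the spread is what makes the result reproducible rather than a one-off impression.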

9 Benchmarking
• Benchmarking should be done at the application layer
  - Use application-specific benchmarking tools and load generators
  - Check with the application vendor
• Isolate variables, benchmark the optimum situation before introducing load
• Understand dependencies
  - Human interaction
  - Other “food groups”
  - Compare apples-to-apples

10 Tools – vCenter Operations
• Aggregates thousands of metrics into Workload, Capacity, Health scores
• Self-learns “normal” conditions using patented analytics
• Smart alerts of impending performance and capacity degradation
• Identifies potential performance problems before they start

11 Tools – vCenter Operations

12 Tools – esxtop
• Valuable tool built into vSphere hosts
• View or capture real-time data
  - View or play back data later
  - Import data into 3rd-party tools
• vSphere Client performance graphs get their data from the kernel and VSI
  - Presentation/unit may be different (e.g. %RDY)
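
esxtop can run in batch mode (for example `esxtop -b -d 5 -n 12 > capture.csv`), which writes CSV that other tools can read. The sketch below averages every column whose header ends with a given counter name. The header layout and the sample values are assumptions for illustration; check the header row of a real capture before relying on the pattern.

```python
import csv
import io
from collections import defaultdict


def average_by_column(csv_text, suffix):
    """Average every column whose header ends with `suffix`."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if name.endswith(suffix)]
    totals = defaultdict(float)
    rows = 0
    for row in reader:
        rows += 1
        for i in wanted:
            totals[header[i]] += float(row[i])
    return {name: total / rows for name, total in totals.items()}


# Hypothetical capture with two VMs and two samples (header names assumed).
sample = "\n".join([
    r'"Time","\\esx01\Group Cpu(1:web01)\% Ready","\\esx01\Group Cpu(2:db01)\% Ready"',
    '"10:00:00","2.5","11.0"',
    '"10:00:05","3.5","13.0"',
])

for column, value in average_by_column(sample, r"\% Ready").items():
    print(f"{column}: {value:.1f}")
```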

MEMORY

14 Memory – Overhead
• A VM’s RAM is not necessarily machine RAM
  - vRAM + overhead = maximum machine RAM
Source: vSphere 5.1 Resource Management Guide
Note: These are estimated values

15 Memory – Transparent Page Sharing

16 Memory – Host Memory Management
Occurs when memory is under contention
• Ballooning
• Compression
• Swapping

17 Memory – Ballooning

18 Memory – Compression

19 Memory – Swapping

20 Memory – Swapping

21 Memory – VM Resource Allocation

22 Memory – Resource Pool Allocation

23 Memory – Ballooning vs. Swapping
• Ballooning is better than swapping
• Guest can surrender unused/free pages
• Guest chooses what to swap, can avoid swapping “hot” pages

24 Memory – Rightsizing
• Generally it is better to OVER-commit than UNDER-commit
• If the running VMs are consuming too much host/pool memory…
  - Some VMs may not get physical memory
  - Ballooning or host swapping
  - Higher disk I/O
  - All VMs slow down

25 Memory – Rightsizing
• If a VM has too little vRAM…
  - Applications suffer from lack of RAM
  - The guest OS swaps
  - Increased disk traffic, thrashing
  - SAN slowdown as a result of increased disk traffic
• If a VM has too much vRAM…
  - Higher overhead memory
  - Possible decreased failover capacity
  - Longer vMotion time
  - Larger VSWP file
  - Wasted resources

26 Memory – Troubleshooting
• Wrong resource allocation
• May not notice a limit, e.g. a VM or template with a limit gets cloned
• Custom share values
• Ballooning or swapping at the host level
  - Ballooning is a warning sign, not a problem
  - Swapping is a performance issue if seen over an extended period
• Swapping/paging at the guest level
  - Under-provisioned guest memory
• Missing balloon driver (Tools)

27 Memory – Best Practices
• Avoid high active host memory over-commitment
  - No host swapping occurs when total memory demand is less than the physical memory (assuming no limits)
• Right-size guest memory
  - Avoid guest OS swapping
• Ensure there is enough vRAM to cover demand peaks
• Use a fully automated DRS cluster
  - Use Resource Pools with High/Normal/Low shares
  - Avoid using custom shares
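
The rule that host swapping is avoided while total memory demand stays below physical memory can be sketched as a simple check. This is a simplification under stated assumptions: it ignores per-VM overhead memory, VMkernel usage, page sharing, and limits, and the host and VM sizes are hypothetical.

```python
def memory_commitment(host_physical_mb, vm_demands_mb):
    """Return total demand, its ratio to physical RAM, and whether it fits.

    Simplified model: overhead memory, page sharing, and limits are ignored.
    """
    total = sum(vm_demands_mb)
    return {
        "total_demand_mb": total,
        "ratio": total / host_physical_mb,
        "fits": total < host_physical_mb,
    }


# Hypothetical host with 64 GB of RAM and four VMs (active demand in MB).
result = memory_commitment(65536, [16384, 16384, 8192, 8192])
print(result)
```

A ratio approaching or exceeding 1.0 is the point at which ballooning, compression, and eventually host swapping become likely.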

CPU

29 CPU – Overview
• Raw processing power of a given host or VM
  - Hosts provide CPU resources
  - VMs and Resource Pools consume CPU resources
• CPU cores/threads need to be shared between VMs
• Fair scheduling
  - vCPU time
  - Hardware interrupts for a VM
  - Parallel processing for SMP VMs
  - I/O

30 CPU – esxtop

31 CPU – esxtop
• Interpret the esxtop columns correctly
• %RDY – The percentage of time a VM is ready to run but no physical processor is available to run it, which may result in decreased performance
• %USED – Physical CPU usage
• %SYS – Percentage of time in the VMkernel
• %RUN – Percentage of total scheduled time to run
• %WAIT – Percentage of time in blocked or busy wait states
• %IDLE – Percentage of time idle; %WAIT minus %IDLE can be used to estimate I/O wait time
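
The last point, estimating I/O wait from %WAIT and %IDLE, is plain subtraction. A minimal sketch follows; the function name and sample values are hypothetical, and the result is an estimate rather than a counter esxtop reports under this name.

```python
def estimate_io_wait(wait_pct, idle_pct):
    """Estimate I/O wait as %WAIT minus %IDLE, never going below zero."""
    return max(wait_pct - idle_pct, 0.0)


# Hypothetical esxtop sample for one VM.
print(estimate_io_wait(wait_pct=80.0, idle_pct=65.0))
```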

32 CPU – Performance Overhead & Utilization
• Different workloads have different overhead costs (%SYS) even for the same utilization (%USED)
• CPU virtualization adds varying amounts of system overhead
  - Direct execution vs. privileged execution
  - Paravirtual adapters vs. emulated adapters
  - Virtual hardware (interrupts!)
  - Network and storage I/O

33 CPU – vSMP
• Relaxed co-scheduling: vCPUs can run out-of-sync
• Idle vCPUs incur a scheduling penalty
  - Configure only as many vCPUs as needed
  - Extra vCPUs impose unnecessary scheduling constraints
• Use uniprocessor VMs for single-threaded applications

34 CPU – Scheduling
Overcommitting physical CPUs (diagram: VMkernel CPU Scheduler)

35 CPU – Scheduling
Overcommitting physical CPUs (diagram: VMkernel CPU Scheduler)

36 CPU – Scheduling
Overcommitting physical CPUs (diagram: VMkernel CPU Scheduler)

37 CPU – Ready Time
• The percentage of time that a vCPU is ready to execute, but waiting for physical CPU time
• Does not necessarily indicate a problem
  - Indicates possible CPU contention or limits
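
esxtop shows ready time as a percentage, while vSphere Client charts report it as a summation in milliseconds per sampling interval (20 seconds for real-time charts). The conversion can be sketched as below. The function name is hypothetical, and dividing by the vCPU count to get a per-vCPU average is an assumption about how the VM-level value is aggregated.

```python
def ready_ms_to_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU ready summation (ms) into a percentage of the interval.

    `interval_s` is the chart sampling interval in seconds; `vcpus` averages
    a VM-level summation across its vCPUs (an assumption, see the lead-in).
    """
    return ready_ms * 100 / (interval_s * 1000 * vcpus)


# 1000 ms of ready time in a 20-second real-time sample.
print(ready_ms_to_percent(1000))
# The same summation spread over a 4-vCPU VM.
print(ready_ms_to_percent(1000, vcpus=4))
```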

38 CPU – NUMA nodes
• Non-Uniform Memory Access system architecture
• Each node consists of CPU cores and memory
• A CPU core in one NUMA node can access memory in another node, but at a small performance cost
(Diagram: NUMA node 1, NUMA node 2)
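
One practical consequence is that a VM sized to fit inside a single NUMA node can keep its memory accesses local. The check below is a deliberately simplified sketch: the function name and the host dimensions are hypothetical, and the real scheduler weighs more factors than these two comparisons.

```python
def fits_in_numa_node(vm_vcpus, vm_memory_gb, cores_per_node, memory_per_node_gb):
    """Return True if the VM's vCPUs and memory both fit in one NUMA node."""
    return vm_vcpus <= cores_per_node and vm_memory_gb <= memory_per_node_gb


# Hypothetical host: 2 NUMA nodes, each with 8 cores and 64 GB of memory.
print(fits_in_numa_node(4, 32, cores_per_node=8, memory_per_node_gb=64))
print(fits_in_numa_node(12, 32, cores_per_node=8, memory_per_node_gb=64))
```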

39 CPU – Troubleshooting
• vCPU to pCPU over-allocation
  - Hyper-Threading does not double CPU capacity!
• Limits or too many reservations can create artificial limits
• Expecting the same consolidation ratios with different workloads
  - Virtualizing “easy” systems first, then expanding to heavier systems
• Compare apples to apples
  - Frequency, turbo, cache sizes, cache sharing, core count, instruction set…
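
The over-allocation point can be made concrete by computing the vCPU to pCPU ratio against physical cores only, since Hyper-Threading does not double capacity. This is a sketch with hypothetical host and VM sizes; the presentation does not give a target ratio, so none is assumed here.

```python
def vcpu_to_pcpu_ratio(vm_vcpus, sockets, cores_per_socket):
    """Ratio of allocated vCPUs to physical cores (hyper-threads not counted)."""
    physical_cores = sockets * cores_per_socket
    return sum(vm_vcpus) / physical_cores


# Hypothetical host: 2 sockets x 8 cores, running six VMs.
ratio = vcpu_to_pcpu_ratio([4, 4, 2, 2, 8, 4], sockets=2, cores_per_socket=8)
print(f"{ratio:.2f} vCPUs per physical core")
```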

40 CPU – Best Practices
• Right-size vSMP VMs
• Keep heavy-hitters separated
  - Fully automated DRS should do this for you
  - Use anti-affinity rules if necessary
• Use a fully automated DRS cluster
  - Test that vMotion works
  - Use Resource Pools with High/Normal/Low shares
  - Avoid using custom shares

STORAGE

42 Storage – esxtop Counters
• Different esxtop storage views
  - Adapter (d)
  - VM (v)
  - Disk Device (u)
• Key fields:
  - DAVG + KAVG = GAVG
  - QUED/USD – Command queue depth
  - CMDS/s – Commands per second
  - MBREAD/s
  - MBWRTN/s

43 Storage – Troubleshooting with esxtop
• High DAVG: issue beyond the adapter
  - Bad/overloaded zoning, over-utilized storage processors, too few platters in the RAID set, etc.
• High KAVG: issue in the kernel storage stack
  - Driver issue
  - Full queue
• Aborts: GAVG exceeding 5000 ms
  - Command will be repeated, storage delay for the VM
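
The reasoning on this slide (GAVG is the sum of DAVG and KAVG, and each component points to a different layer) can be sketched as a small classifier. Only the 5000 ms abort figure comes from the slides; the DAVG and KAVG limits below are assumed, adjustable placeholders, and the function name is hypothetical.

```python
def diagnose_latency(davg_ms, kavg_ms, davg_limit_ms=25.0, kavg_limit_ms=2.0,
                     abort_ms=5000.0):
    """Return (GAVG, findings) for one esxtop storage sample.

    The DAVG/KAVG limits are assumed defaults, not values from the slides.
    """
    gavg_ms = davg_ms + kavg_ms  # guest latency = device + kernel latency
    findings = []
    if davg_ms > davg_limit_ms:
        findings.append("high DAVG: look beyond the adapter (fabric, array)")
    if kavg_ms > kavg_limit_ms:
        findings.append("high KAVG: look at the kernel storage stack (driver, queue)")
    if gavg_ms > abort_ms:
        findings.append("GAVG above abort threshold: commands will be repeated")
    return gavg_ms, findings


print(diagnose_latency(davg_ms=40.0, kavg_ms=0.5))
print(diagnose_latency(davg_ms=3.0, kavg_ms=6.0))
```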

44 Storage – Benchmarking with iometer

45 Storage – Storage I/O Control
• Allows the use of shares per VMDK
• Throttling occurs when the datastore reaches its latency threshold
  - Higher-share VMDKs perform I/O first
• vCenter monitors latency across all hosts
  - Not effective if the datastore is shared with other vCenters
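
The effect of shares under contention can be illustrated with a proportional split. This is a simplified model for intuition only, not the actual Storage I/O Control algorithm; the function name, the VMDK names, the share values, and the IOPS budget are all hypothetical.

```python
def split_by_shares(total_iops, shares):
    """Divide a contended IOPS budget in proportion to each VMDK's shares."""
    total_shares = sum(shares.values())
    return {name: total_iops * s / total_shares for name, s in shares.items()}


# Hypothetical datastore under contention with three VMDKs.
allocation = split_by_shares(3500, {"db.vmdk": 2000, "web.vmdk": 1000, "test.vmdk": 500})
print(allocation)
```

Shares only matter while the datastore is above its latency threshold; without contention every VMDK gets what it asks for.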

46 Storage – Storage DRS
• Datastore clusters
  - Maintenance mode
  - Anti-affinity rules
• vCenter monitors for latency and disk space
  - Migrate VMDKs for better performance or utilization
• Not effective with automated tiering SANs
  - Check the HCL to confirm these features are compatible

47 Storage – Troubleshooting
• Snapshots
• Excessive traffic down one HBA / switch / SP can cause latency
  - Consider using Round Robin in conjunction with ALUA
  - Always be paranoid when it comes to monitoring storage I/O
• Consider your I/O patterns
  - Peak time for storage I/O?
  - Virus scans, database maintenance, user logins
• Always consult with the array vendor
  - They know the best practices for their array!

48 Storage – Best Practices
• Use different tiers of storage for different VM workloads
  - Slower storage for OS VMDKs
  - Faster storage for databases or other high-I/O applications
• Use the Paravirtual SCSI adapter
  - Reduced overhead, higher throughput
• Use path balancing where possible, either through 3rd-party plugins or Round Robin with ALUA, if supported
• Use Storage DRS with SIOC
  - Balance for both free space and latency
  - Simplified datastore management

NETWORK

50 Network – Load Balancing
• Load balancing defines which uplink is used
  - Route based on Port ID
  - Route based on IP hash
  - Route based on MAC hash
  - Route based on NIC load (Load Based Teaming)
• Probability of high-bandwidth VMs being on the same physical NIC
• Traffic will stay on the elected uplink until an event occurs
  - NIC link state change, adding/removing a NIC from a team, beacon probe timeout…
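
Why two high-bandwidth VMs can land on the same physical NIC is easier to see with a toy model of uplink selection. The functions below are simplified illustrations, not the exact vSwitch algorithms: the modulo-based port mapping, the XOR-based IP hash, and all names and addresses are assumptions.

```python
import ipaddress


def uplink_by_port_id(port_id, uplinks):
    """Toy model: map a virtual port ID to an uplink by modulo."""
    return uplinks[port_id % len(uplinks)]


def uplink_by_ip_hash(src, dst, uplinks):
    """Toy model: XOR source and destination IPs, then take the modulo."""
    mixed = int(ipaddress.ip_address(src)) ^ int(ipaddress.ip_address(dst))
    return uplinks[mixed % len(uplinks)]


uplinks = ["vmnic0", "vmnic1"]

# With port-based routing, a VM keeps one uplink for all of its traffic.
print(uplink_by_port_id(7, uplinks))
# With IP hash, the same VM may use different uplinks per destination.
print(uplink_by_ip_hash("10.0.0.5", "10.0.0.20", uplinks))
print(uplink_by_ip_hash("10.0.0.5", "10.0.0.21", uplinks))
```

In both models the choice is static with respect to load, which is the situation Load Based Teaming is meant to address.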

51 Network – Troubleshooting
• Check counters for NICs and VMs
  - Network load imbalance
  - 10 Gbps NICs can incur a significant CPU load when running at 100%
• Ensure hardware supports TSO
  - Use the latest drivers and firmware for your NIC on the host
• For multi-tier VM applications, use DRS affinity rules to keep VMs on the same host
  - Same vSwitch / VLAN, rules out the physical network
• If using Jumbo Frames, ensure they are enabled end-to-end
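
The end-to-end requirement for Jumbo Frames means every hop on the path must carry the larger MTU. A minimal sketch of that check follows; the hop names and MTU values are hypothetical, and 9000 bytes is assumed as the jumbo MTU in use.

```python
def hops_blocking_jumbo(path_mtus, required_mtu=9000):
    """Return the names of hops whose MTU is below the required value."""
    return [name for name, mtu in path_mtus if mtu < required_mtu]


# Hypothetical path from a guest to an iSCSI array.
path = [
    ("guest vNIC", 9000),
    ("vSwitch", 9000),
    ("physical switch port", 1500),
    ("storage array port", 9000),
]
print(hops_blocking_jumbo(path))
```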

52 Network – Best Practices
• Use the vmxnet3 virtual adapter
  - Less CPU overhead
  - 10 Gbps connection to the vSwitch
• Use the latest driver/firmware for the NICs on the host
• Use network shares
  - Requires Virtual Distributed Switch 4.1
• Isolate vMotion and iSCSI traffic from regular VM traffic
  - Separate vSwitches with dedicated NIC(s)
  - Most applicable with Gigabit NICs

53 How to measure the network?
• scp from/to an ESXi host is not a valid check!
• scp involves the underlying storage on the source and destination VM/host
• CPU can affect the test, because scp encrypts/decrypts the network flow
• Copying to an ESXi host can give a false result, as the management interface has very limited resources

54 How to check network performance?
• VM – VM on the same ESXi host: this excludes physical network problems
• VM – VM on different ESXi hosts: this involves the physical NICs and switch as well
• Physical – VM: this also tests physical devices, but we can focus on one VM
• Physical – Physical: this gives us a number for what to expect
• Use iperf/jperf/netperf: free tools for network testing

55 Iperf

56 Iperf
• Windows and Linux versions
• Does not use storage
• Different options can be used for the test (UDP/TCP)
• Automatically calculates bandwidth
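
The idea behind such a tool, sending data from memory over a socket and dividing bytes by elapsed time, can be sketched in a few lines. This is not iperf and does not reproduce its options or output; it only measures TCP over the local loopback interface, and the function names and transfer size are hypothetical.

```python
import socket
import threading
import time


def _sink(server):
    """Accept one connection and discard everything it sends."""
    conn, _ = server.accept()
    with conn:
        while conn.recv(65536):
            pass


def measure_tcp_throughput(total_bytes=16 * 1024 * 1024, host="127.0.0.1"):
    """Send `total_bytes` from memory over TCP and return Mbit/s."""
    server = socket.create_server((host, 0))  # port 0: let the OS choose
    port = server.getsockname()[1]
    receiver = threading.Thread(target=_sink, args=(server,))
    receiver.start()

    chunk = b"\0" * 65536
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            client.sendall(chunk)
            sent += len(chunk)
    receiver.join()  # wait until the receiver has drained the stream
    elapsed = time.perf_counter() - start
    server.close()
    return sent * 8 / elapsed / 1e6


print(f"{measure_tcp_throughput():.0f} Mbit/s over loopback")
```

Because no file is read or written and nothing is encrypted, storage and cipher overhead stay out of the measurement, which is the property that makes this style of test preferable to scp.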

57 In conclusion…

58 Key Takeaways – Performance Best Practices
• Understand your environment
  - Hardware, storage, networking
  - VMs & applications
• Advanced configuration values do not need to be tweaked or modified
  - In almost all situations
• Use fully automated DRS
• Use paravirtual hardware

59 Important Links

60 Important Links