3 Introducing Exchange 2010 MP
One of the most comprehensive management packs released by Microsoft.
- 1423 Rules
  - 658 Performance Collection
  - 707 Alert Generating
- 854 Unit Monitors
- 57 Discovery Rules
- 412 Relationships
- 229 Classes
- New version ( ) released on 8/31 addresses several key bugs
- Supported on OM 2007 SP1/R2 and 2012
Slide Notes:
To compare and highlight the comprehensive nature of the Exchange MP, the AD 2008 MP has:
- 508 Rules: 59 are performance collection, 337 alert rules, 12 misc.
- 41 Unit Monitors
- 33 Classes (4 are deprecated)
- 5 Discovery rules

4 What's Monitored
All core components in an Exchange 2010 deployment:
- Mailbox Server
- Client Access Server
- Edge Server
- Hub Transport
- Unified Messaging
Synthetic monitoring uses PowerShell scripts included with Exchange to proactively monitor (on-server, not remote), such as:
- Test-OWAConnectivity
- Test-MAPIConnectivity
- Test-OutlookConnectivity

5 Correlation Engine
- Fundamentally a custom connector written with the OM SDK.
- On OM 2007 R2, it is recommended that it be installed on the RMS.
- On OM 2012, it is recommended that it be installed on the MS with the RMS Emulator.
- Configured to auto-resolve alerts.
- Queries the RMS every 5 minutes to minimize performance impact.
- Disabling the CE prevents alert generation.
- Maintenance mode works as expected.
Slide Notes:
The CE maintains the health model in memory and processes state change events. It then determines when to raise an alert based on the state of the system.
If you want to prevent auto-resolution of alerts, modify the CE configuration XML file: "AutoResolveAlerts" value="false". The configuration file is found at %ProgramFiles%\Microsoft\Exchange Server\v14\Bin\Microsoft.Exchange.Monitoring.CorrelationEngine.Exe.Config.
The CE writes events to the Application Event Log by default (source is MSExchangeMonitoringCorrelation). It can also write verbose debugging information to a log file to assist with troubleshooting. Again, modify the CE configuration XML file: "LogVerbose" value="true". Current logging to the file is summarized data.
If the CE starts behaving badly (for example, a monitor/workflow with a logic bug or an incorrectly applied override), the MonitoringHost.exe process starts consuming memory. It grows over time, consumes all available memory within several hours, and brings the RMS to its knees. It is very important to (1) apply overrides properly, (2) review generated alerts promptly to ensure they are accurate and do not flip-flop constantly, which causes a performance impact on the management group and the SQL Server hosting the OperationsManager DB, and (3) take the time to properly tune this MP before importing it into production!

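The two configuration changes described in the notes (turning off auto-resolution and turning on verbose logging) could be scripted along these lines. This is only a sketch: it assumes the CE configuration file uses a standard .NET `<appSettings>` section with `<add key="..." value="..."/>` entries, which the slides do not confirm, and it works on a made-up sample string rather than the real file. The helper name `set_app_setting` is hypothetical.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: the CE config follows the usual .NET appSettings layout.
# This sample content is invented for illustration; the real file may differ.
SAMPLE_CONFIG = """<configuration>
  <appSettings>
    <add key="AutoResolveAlerts" value="true" />
    <add key="LogVerbose" value="false" />
  </appSettings>
</configuration>"""


def set_app_setting(config_xml: str, key: str, value: str) -> str:
    """Return a copy of config_xml with the given <add key=...> set to value."""
    root = ET.fromstring(config_xml)
    for node in root.iter("add"):
        if node.get("key") == key:
            node.set("value", value)
            return ET.tostring(root, encoding="unicode")
    # Fail loudly instead of silently adding a setting the CE may not read.
    raise KeyError(f"setting {key!r} not found")


# Prevent auto-resolution of alerts, then enable verbose logging.
updated = set_app_setting(SAMPLE_CONFIG, "AutoResolveAlerts", "false")
updated = set_app_setting(updated, "LogVerbose", "true")
print(updated)
```

Back up the original file before editing it by hand or by script. Whether the CE needs a restart to pick up the change is not stated in the slides.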
6 Correlation Factors
- KHI monitors watch for specific diagnostics from Exchange (event, performance, and script). They change state but do not generate an alert.
- There are three Alert Classification categories: Key Health Indicators (KHI), Non-Service Impacting (NSI), and Forensic.
- When an NSI monitor changes health state, a corresponding alert is generated.
- KHI monitors are evaluated, "chains" of critical-severity KHI monitors are isolated, and their dependencies/relationships are evaluated to raise an alert for the root cause monitor in the chain.
- Forensic monitors do not generate alerts when they change state.

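The three classification rules can be pictured with a toy model. This is not the Correlation Engine's actual algorithm, only a sketch under simplifying assumptions: each monitor is either critical or healthy, each KHI monitor lists the monitors it depends on, and the "root cause" of a chain is a critical KHI monitor with no critical KHI dependency. The `Monitor` class and all monitor names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    name: str
    category: str                 # "KHI", "NSI", or "Forensic"
    critical: bool = False        # simplified health state
    depends_on: list = field(default_factory=list)  # names of upstream monitors


def alerts_to_raise(monitors):
    """Return names of monitors that should produce an alert in this toy model."""
    by_name = {m.name: m for m in monitors}
    alerts = []
    for m in monitors:
        if not m.critical:
            continue
        if m.category == "NSI":
            # NSI: a state change maps directly to an alert.
            alerts.append(m.name)
        elif m.category == "KHI":
            # KHI: alert only for the root of a chain of critical KHI monitors.
            upstream_critical = any(
                by_name[d].critical and by_name[d].category == "KHI"
                for d in m.depends_on
                if d in by_name
            )
            if not upstream_critical:
                alerts.append(m.name)
        # Forensic: state changes never alert.
    return alerts


monitors = [
    Monitor("Database mounted", "KHI", critical=True),
    Monitor("OWA connectivity", "KHI", critical=True, depends_on=["Database mounted"]),
    Monitor("Log disk space", "NSI", critical=True),
    Monitor("Transient event", "Forensic", critical=True),
]
print(alerts_to_raise(monitors))
```

In this example only the database monitor and the NSI monitor produce alerts. The OWA monitor is suppressed because its upstream KHI monitor is already critical, and the forensic monitor never alerts.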
8 Monitoring Prerequisites
- All Exchange servers must have an agent installed.
- Agents must be healthy to avoid any false-positive alerts from the CE.
- All Exchange servers in an Exchange Site must be in the same management group.
- OM is an object-oriented monitoring tool, and understanding the number of managed objects/instances is important.
  - Maximum number of Managed Objects: 800,000
  - Maximum number of Relationships: ~1,000,000
  - CE hard limit is 600,000 Relationships and a million Group Object Members
Slide Notes:
Bullet Point 3: Having only part of a Site in a single SCOM management group will cause a lot of noise, because the Correlation Engine expects all servers in the site but does not see them in OM. Plan accordingly to ensure that all active Exchange servers in each site are monitored in the same OM management group. (The best example is the North America Site managed in one OM management group, while the South America Site is managed in another OM management group.)
Bullet Point 4: These numbers are based on extensive performance and scalability tests conducted by the OM PG. They are not hard numbers and OM can scale higher, but OM performance can seriously degrade and monitoring will be impacted if you greatly exceed them. Additional optimizations can help improve scale and performance, but again you will not be able to scale much higher. Depending on the size of the Exchange environment, this may mean introducing an additional management group dedicated to Exchange monitoring.

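As a planning aid, the limits quoted on this slide can be turned into a simple check. The sketch below uses only the figures from the slide. The function name and the idea of feeding it counts gathered elsewhere (for example from a query against the management group) are assumptions, and the relationship maximum is approximate per the slide.

```python
# Figures taken from the slide; the relationship maximum is approximate.
OM_MAX_MANAGED_OBJECTS = 800_000
OM_MAX_RELATIONSHIPS = 1_000_000
CE_MAX_RELATIONSHIPS = 600_000      # stated as a hard limit for the CE
CE_MAX_GROUP_MEMBERS = 1_000_000    # stated as a hard limit for the CE


def capacity_warnings(managed_objects, relationships, group_members):
    """Return human-readable warnings for any limit that is exceeded."""
    warnings = []
    if managed_objects > OM_MAX_MANAGED_OBJECTS:
        warnings.append(f"managed objects {managed_objects} exceed {OM_MAX_MANAGED_OBJECTS}")
    if relationships > CE_MAX_RELATIONSHIPS:
        warnings.append(f"relationships {relationships} exceed CE limit {CE_MAX_RELATIONSHIPS}")
    if relationships > OM_MAX_RELATIONSHIPS:
        warnings.append(f"relationships {relationships} exceed OM maximum {OM_MAX_RELATIONSHIPS}")
    if group_members > CE_MAX_GROUP_MEMBERS:
        warnings.append(f"group members {group_members} exceed CE limit {CE_MAX_GROUP_MEMBERS}")
    return warnings


# Hypothetical counts for a mid-sized management group.
for line in capacity_warnings(900_000, 700_000, 50_000):
    print(line)
```

If the check produces warnings, the slide notes suggest considering a dedicated management group for Exchange monitoring.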
9 Optimizing Operations Manager
Follow the guidance highlighted in KB Article –
Apply on:
- The 2007 RMS
- Each MS in the All MS Resource Pool

Registry Hive: HKLM\Software\Microsoft\Microsoft Operations Manager\3.0
Key: GroupCalcPollingIntervalMilliseconds
Type: DWord
Value: 000dbba0
Description: Changes the Group Calculation processing to 15 minutes.

Registry Hive: HKLM\Software\Microsoft\Microsoft Operations Manager\3.0\Config Service
Key: Polling Interval Seconds
Type: DWord
Value:
Description: Changes the Config Service polling to 2 minutes.

Slide Notes:
The recommendations highlighted in the KB article are applicable to 2007 R2 and 2012.

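The hexadecimal value for GroupCalcPollingIntervalMilliseconds can be sanity-checked with a short conversion. The helper names below are hypothetical. The only input taken from the slide is the value 000dbba0 and the statement that it corresponds to 15 minutes.

```python
def dword_hex_to_minutes(hex_value: str) -> float:
    """Interpret a hex DWORD holding milliseconds and return minutes."""
    milliseconds = int(hex_value, 16)
    return milliseconds / 60_000


def minutes_to_dword_hex(minutes: int) -> str:
    """Format a duration in minutes as an 8-digit hex DWORD of milliseconds."""
    return f"{minutes * 60_000:08x}"


print(dword_hex_to_minutes("000dbba0"))   # 900000 ms -> 15.0 minutes
print(minutes_to_dword_hex(15))           # -> 000dbba0
```

The same approach works for checking any other millisecond-based DWORD before writing it to the registry.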
10 Optimizing Operations Manager
Only applicable to 2007 SP1/R2 management servers:

Registry Key: HKLM\System\CurrentControlSet\Services\HealthService\Parameters\Persistence Cache Maximum
Type: DWord
Value:
Description: Allows more memory usage for the Health Service's data store on the local system.

Registry Key: HKLM\System\CurrentControlSet\Services\HealthService\Parameters\Persistence Version Store Maximum
Registry Key: HKLM\System\CurrentControlSet\Services\HealthService\Parameters\Persistence Checkpoint Depth Maximum
Registry Key: HKLM\System\CurrentControlSet\Services\HealthService\Parameters\State Queue Items
Description: Allows more data to be stored in the Health Service's data store on the local system.

Slide Notes:
These recommendations are only applicable to 2007 R2. If you pay attention, you will notice that the default values set for these registry keys are higher than what is stated here. The exception is the "State Queue Items" registry setting, which can be set on your 2012 management servers that have agents reporting to them (there is no need to apply it at this time, unless otherwise noted, on MSs that are dedicated to network device monitoring or cross-platform monitoring).

11 Optimizing Operations Manager
To update the Data Warehouse processing timeout from 5 minutes to 15, perform the following on all management servers (including RMS/RMSe):
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Data Warehouse]
"Command Timeout Seconds"=dword:

12 Tuning Recommendations
- KHI Unit Monitors do not generate alerts; the corresponding alert rules do.
  - Disable the monitor via override and then disable the alert rule.
  - If you only disable the rule, bad things can happen!
- Modification of alert Severity or Priority is done by overriding the corresponding alert rule.
  - KHI Alert rules target the RMS class!
Slide Notes:
On OM 2007 R2, the KHI Alert rules target the Root Management Server class, but on OM 2012 they target the Root Management Server Emulator class.
If you want to disable the alert, you cannot simply disable the corresponding alert rule. Doing so can cause the SDKPendingDatasource table to grow and consume most, if not all, of the available OperationsManager DB space. You must disable the monitor and the corresponding rule together!

13 Tuning Recommendations
- Enable only the performance collection rules you need.
- New-TestCASConnectivityUser.ps1 may fail if there are multiple "Users" OUs defined in AD.
  - When running the script, specify the OU as an argument.
- Disable event collection rules, as they consume a lot of unnecessary DB space.
- Review thresholds and revise them for monitors using a performance counter data source.
- Custom Alert fields 5, 6, 7, 8, & 10 are used by the CE.
Slide Notes:
In the latest version of the Exchange 2010 MP ( ), all performance collection rules, except those that feed the pre-canned reports that come with the MP, are disabled by default. So only enable the perf collection rules that you require to proactively report on performance for capacity or trend analysis.

14 Tuning Recommendations
You may see false-positive alerts from:
- KHI: HTTP Connectivity with Autodiscover - Unexpected Exception
- KHI: HTTP Connectivity Against Local Server - Address Book failure (ABREF)
- KHI: HTTP Connectivity Against Local Server - Address Book failure (NSPI)
- KHI: HTTP Connectivity Against Local Server - RPC Client Access failure (Connect)
- KHI: TCP Connectivity Against Local Server - Unexpected Exception - Outlook Connectivity (Local Server)
Slide Notes:
These monitors generate false-positive alerts in a number of customer environments. If you determine that you are experiencing the same behavior and have confirmed the alerts are false, disable the monitors.