EXCHANGE SERVER 2010 HIGH AVAILABILITY CONCEPTS Scott Schnoll Principal Technical Writer Microsoft Corporation SESSION CODE: EXL303 (c) 2011 Microsoft.

2 EXCHANGE SERVER 2010 HIGH AVAILABILITY CONCEPTS Scott Schnoll Principal Technical Writer Microsoft Corporation SESSION CODE: EXL303 (c) 2011 Microsoft. All rights reserved.

3 Agenda ► Exchange Server 2010 High Availability Concepts – Terminology – Quorum – Witness, Witness Server and Alternate Witness Server – Active Manager – AutoDatabaseMountDial – Activation Preference – Circular Logging and Continuous Replication – Replay Lag and Truncation Lag – Continuous Replication Modes

4 Exchange Server 2010 High Availability Concept: Terminology

5 Exchange Server High Availability Terminology ► High Availability must meet three criteria – Service Availability – Data Availability – Automatic recovery from most failures ► Site Resilience – Manual switchover process (datacenter switchover) used to activate service and data access in an alternate datacenter when the primary datacenter can no longer provide the required level of service

6 Exchange Server High Availability Terminology ► *overs (pronounced “star overs”) – Short for failovers and switchovers: a failover is automatic, performed by the system; a switchover is manual, performed by an administrator – Database *overs – Server *overs – Datacenter switchover

7 Exchange Server 2010 High Availability Concept: Quorum

8 Quorum ► A consensus of voters used to ensure that only one subset of members is functioning at one time ► A majority of members must be active and have communications with each other ► Represents a shared view of members – Voters and resources ► Dual Usage – Data shared between the voters representing configuration, etc. – Number of voters required for the solution to stay running (majority)

9 Quorum ► Quorum is necessary for cluster functions and for DAG functions ► Exchange 2010 uses two cluster quorum models – Node Majority (DAGs with an odd number of members) – Node and File Share Majority (DAGs with an even number of members; the witness supplies the extra vote) ► Quorum = (N/2) + 1, where N is the number of DAG members and N/2 is rounded down to a whole number – 6 members: (6/2) + 1 = 4 votes for quorum (can lose 3 of the 7 voters, counting the witness) – 9 members: (9/2) + 1 = 5 votes for quorum (can lose 4 voters) – 13 members: (13/2) + 1 = 7 votes for quorum (can lose 6 voters) – 15 members: (15/2) + 1 = 8 votes for quorum (can lose 7 voters)
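
The arithmetic on this slide can be checked with a short Python sketch. It is only a toy model of the vote counting, not Exchange or cluster code; the function name `dag_quorum` and the returned field names are made up for illustration, and it assumes that an even-sized DAG always counts the witness as one extra voter.

```python
def dag_quorum(members: int) -> dict:
    """Toy quorum arithmetic for a DAG with `members` Mailbox servers."""
    if members < 1:
        raise ValueError("a DAG needs at least one member")
    # Even-sized DAGs use Node and File Share Majority: the witness adds one vote.
    uses_witness = members % 2 == 0
    voters = members + 1 if uses_witness else members
    # Integer division mirrors the slide's "whole numbers only" rule.
    votes_needed = members // 2 + 1
    return {
        "uses_witness": uses_witness,
        "voters": voters,
        "votes_needed": votes_needed,
        "tolerable_losses": voters - votes_needed,
    }


if __name__ == "__main__":
    for n in (6, 9, 13, 15):
        print(n, dag_quorum(n))
```

For the four DAG sizes on the slide, the sketch reproduces the same vote counts and tolerated losses.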

11 Exchange Server 2010 High Availability Concept: Witness, Witness Server and Alternate Witness Server

12 Witness ► A witness is a share with a file on a server that is external to the DAG that participates in determining quorum by providing a weighted vote for the DAG member that has a lock on the witness.log file – Configured for all DAGs – Used only by DAGs that have an even number of members ► Witness server does not maintain a copy of quorum data, does not vote, and is not a member of the DAG or cluster

13 Witness ► Part of cluster core resource group (Cluster Group) – Represented by File Share Witness resource (UNC path): the file share witness cluster resource, directory, and share are automatically created and removed as needed – Uses Cluster IsAlive check for monitoring and availability: if the witness IsAlive check fails, the Cluster Group is failed and moved to another DAG member; if the other DAG member cannot bring the witness resource online, the resource remains in a Failed state, with restart attempts every 60 minutes. See http://support.microsoft.com/kb/978790 for details

14 Witness ► If Failed and needed for quorum, cluster will try to bring the File Share Witness resource online once – If witness is Failed and cannot be brought Online, quorum is lost ► If witness can be restarted or is already Online: – An SMB lock is placed on witness.log, typically by the node that owns the Cluster Group (the “locking node”) – The locking node increments PAXOS and writes the updated PAXOS tag to the witness.log file

15 Witness ► When locked, the Locking Node retains a weighted vote – Members in contact with locking node are in majority and maintain quorum – Members not in contact with locking node are in minority and lose quorum ► When witness is no longer needed to maintain quorum, SMB lock on witness.log is released
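
A minimal Python sketch of the weighted-vote idea follows. It is a simplification for illustration only: the function and parameter names are hypothetical, and it assumes the witness vote counts only for an even-sized DAG and only for the group of members that includes the locking node.

```python
def partition_has_quorum(total_members: int, reachable_members: int,
                         includes_locking_node: bool) -> bool:
    """Return True if a group of mutually reachable DAG members keeps quorum."""
    if not 0 <= reachable_members <= total_members:
        raise ValueError("reachable_members must be between 0 and total_members")
    # Assumption: the witness is only in play for DAGs with an even member count.
    witness_in_play = total_members % 2 == 0
    total_voters = total_members + (1 if witness_in_play else 0)
    votes = reachable_members
    # The locking node carries the witness's weighted vote for its group.
    if witness_in_play and includes_locking_node and reachable_members > 0:
        votes += 1
    return votes >= total_voters // 2 + 1


if __name__ == "__main__":
    # Four-member DAG split 2/2: only the side with the locking node keeps quorum.
    print(partition_has_quorum(4, 2, includes_locking_node=True))
    print(partition_has_quorum(4, 2, includes_locking_node=False))
```

In the 2/2 split, the half in contact with the locking node holds three of five votes and stays up, while the other half holds two and loses quorum.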

16 Witness ► Witness used only when needed for quorum

18 Witness Server ► No pre-configuration typically necessary – Exchange Trusted Subsystem must be member of local Administrators group on Witness Server if Witness Server is not running Exchange 2010 ► Cannot be a member of the DAG (present or future) ► Must be in the same Active Directory forest as DAG

19 Witness Server ► Can be Windows Server 2003 or later – File and Printer Sharing for Microsoft Networks must be enabled ► Replicating witness directory/share with DFS not supported ► Not necessary to cluster Witness Server – If you do cluster witness server, you must use Windows 2008 ► Single witness server can be used for multiple DAGs – Each DAG requires its own unique Witness Directory/Share

20 Witness Server ► Using a non-Exchange 2010 witness server: – You will receive a spurious warning when running New-DatabaseAvailabilityGroup, Set-DatabaseAvailabilityGroup or Restore-DatabaseAvailabilityGroup: "The Exchange Trusted Subsystem is not a member of the local Administrators group on specified witness server." – Code bug described at

21 Alternate Witness Server ► Witness server used by a DAG after a datacenter switchover ► DAG is configured to use alternate witness server when you run Restore-DatabaseAvailabilityGroup, or ahead of time by using Set-DatabaseAvailabilityGroup ► DAGs do not dynamically switch witness servers – Alternate witness server does not provide redundancy for witness server or FSW resource

22 Exchange Server 2010 High Availability Concept: Active Manager

23 Active Manager ► Exchange component that manages high availability platform – Runs inside the Microsoft Exchange Replication service on every Mailbox server – Is the definitive source of information on where a database is active: it stores this information in the cluster database and provides it to the Active Manager client running on other server roles (Client Access and Hub Transport)

24 Active Manager Roles ► Standalone Active Manager ► Primary Active Manager (PAM) ► Standby Active Manager (SAM) ► Active Manager Client – Runs in RPC Client Access service on CAS and Transport service on Hub

25 Active Manager ► Primary Active Manager (PAM) – Runs on the node that owns the cluster core resources (cluster group) – Gets topology change notifications – Reacts to server failures – Selects the best database copy on failovers and targetless switchovers – Detects failures of local Information Store and local databases

26 Active Manager ► Standby Active Manager (SAM) – Runs on every other node in the DAG – Detects failures of local Information Store and local databases, and reacts to failures by asking PAM to initiate a failover – Responds to queries from CAS/Hub about which server hosts the active copy ► Both roles are necessary for automatic recovery – If the Microsoft Exchange Replication service is stopped, automatic recovery will not happen

27 Exchange Server 2010 High Availability Concept: AutoDatabaseMountDial

28 AutoDatabaseMountDial ► When a replicated mailbox database is affected by a failure (e.g., disk, network, service, server), what two types of recovery are possible? ► Answer: – Manual – the Administrator performs recovery (switchover) – Automatic – Exchange 2010 performs recovery (failover) ► Exchange 2010 uses AutoDatabaseMountDial setting as part of its automatic recovery logic ► AutoDatabaseMountDial is configured using Set-MailboxServer

29 AutoDatabaseMountDial ► Configured on a per-Mailbox server basis ► Used to determine whether a passive copy being activated can be mounted, based on number of missing log files ► Can be overridden by an administrator performing a switchover

30 AutoDatabaseMountDial ► Best Availability – 12 or fewer missing log files ► Good Availability – 6 or fewer missing log files (Default) ► Lossless – 0 missing log files ► Best Effort – Boundless; available for switchovers only ► Custom value – Configured using ADSIEdit – msExchDataLossForAutoDatabaseMount attribute of Mailbox server object – If custom value >12 is used, increase transport dumpster size accordingly

31 AutoDatabaseMountDial ► Dial setting consulted as part of a failover or targetless switchover during Best Copy Selection (BCS) ► During a failover, the passive copy being activated attempts to copy any missing log files from previous active – If successful, then the database will mount with zero data loss – If unsuccessful (lossy failure), then the database will mount based on the AutoDatabaseMountDial setting ► If data loss is outside of AutoDatabaseMountDial setting, another copy (if available) will be tried – If another copy is not available, administrator must intervene
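
The dial thresholds and the "try another copy" behavior can be sketched in Python as below. This is an illustrative model only, not the Best Copy Selection algorithm: the names `DIAL_LIMITS`, `can_automount` and `first_mountable` are hypothetical, the server names are placeholders, and Best Effort is left out because the slides describe it as available for switchovers only.

```python
from typing import List, Optional, Tuple

# Maximum number of missing log files tolerated, as listed on the slides.
DIAL_LIMITS = {"BestAvailability": 12, "GoodAvailability": 6, "Lossless": 0}


def can_automount(dial: str, missing_logs: int,
                  custom_limit: Optional[int] = None) -> bool:
    """Return True if a copy missing `missing_logs` log files may mount automatically."""
    if missing_logs < 0:
        raise ValueError("missing_logs cannot be negative")
    # A custom value (set on the Mailbox server object) overrides the named dial.
    limit = custom_limit if custom_limit is not None else DIAL_LIMITS[dial]
    return missing_logs <= limit


def first_mountable(dial: str,
                    candidates: List[Tuple[str, int]]) -> Optional[str]:
    """Try candidates in order; None means an administrator must intervene."""
    for server, missing_logs in candidates:
        if can_automount(dial, missing_logs):
            return server
    return None


if __name__ == "__main__":
    print(first_mountable("GoodAvailability", [("MBX2", 9), ("MBX3", 2)]))
```

With the default Good Availability setting, the copy that is missing nine log files is skipped and the copy that is missing two is chosen.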

32 Exchange Server 2010 High Availability Concept: Activation Preference

33 Activation Preference ► A whole number value assigned to each copy of a given database (e.g., 1, 2, 3, 4, etc.), where 1 is at the top of the preference order ► Different use in Exchange 2010 RTM vs. SP1 and later – RTM uses it as tie-breaking mechanism when multiple activation targets have the same copy queue length – SP1 uses it for sorting initial list of potential activation targets when AutoDatabaseMountDial is set to Lossless on all servers that host a copy of the database

34 Activation Preference ► Simple example – Multiple copies of a database in multiple datacenters

35 Activation Preference ► Activation preference is not a guarantee ► Based on nature of the failure and health and status of passive copies ► Can be reconfigured on the fly – Use Set-MailboxDatabaseCopy – Does not require any restarts – The number cannot be larger than the number of copies of the mailbox database ► Also used as part of DAG re-balancing
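
As a rough illustration of the RTM tie-breaking role, the Python sketch below orders passive copies by copy queue length and uses activation preference only to break ties. It is deliberately simplified: it ignores the health and status checks mentioned above, and the function names, tuple layout and server names are assumptions made for the example.

```python
from typing import List, Tuple

# (server, copy_queue_length, activation_preference)
Copy = Tuple[str, int, int]


def validate_preference(preference: int, copy_count: int) -> None:
    """The preference number cannot exceed the number of copies of the database."""
    if not 1 <= preference <= copy_count:
        raise ValueError(
            f"activation preference must be between 1 and {copy_count}"
        )


def order_targets_rtm(passive_copies: List[Copy]) -> List[str]:
    """Sort by copy queue length; activation preference breaks ties."""
    ranked = sorted(passive_copies, key=lambda c: (c[1], c[2]))
    return [server for server, _, _ in ranked]


if __name__ == "__main__":
    copies = [("MBX3", 0, 3), ("MBX2", 0, 2), ("MBX4", 5, 1)]
    print(order_targets_rtm(copies))
```

Here MBX4 has the best preference number but the longest copy queue, so it ranks last, which matches the point that activation preference is not a guarantee.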

36 Exchange Server 2010 High Availability Concept: Circular Logging and Continuous Replication

37 Circular Logging and Continuous Replication ► Exchange 2010 includes two forms of circular logging – ESE (aka JET) circular logging (CL) – Continuous replication circular logging (CRCL) ► Over time, each Exchange database generates a set of log files known as the log stream ► When enabled, circular logging allows Exchange to overwrite transaction log files after the data contained in the log files is committed to the database

38 Circular Logging and Continuous Replication ► Exchange 2007 continuous replication also included CRCL, which is different from CL – CL is a function of JET and is performed by the Information Store service (store.exe) – CRCL is a function of continuous replication and is performed by the Exchange Replication service (msexchangerepl.exe)

39 Circular Logging and Continuous Replication ► For truncation to occur on highly available (non-lagged) mailbox database copies, the answer must be "Yes" to the following questions: – Has the log file been backed up, or is CRCL enabled? – Is the log file below the checkpoint? – Do the other non-lagged copies of the database agree with deletion? – Has the log file been inspected by all lagged copies of the database? ► For truncation to occur on lagged database copies, the answer must be "Yes" to the following questions: – Is the log file below the checkpoint? – Is the log file older than ReplayLagTime + TruncationLagTime? – Is the log file deleted on the active copy of the database?
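
The two checklists above translate directly into boolean checks. The Python sketch below is one way to write them down; the function and parameter names are invented for illustration, and the real decision is made by the Replication service and the Information Store, not by code like this.

```python
from datetime import timedelta


def can_truncate_non_lagged(backed_up: bool, crcl_enabled: bool,
                            below_checkpoint: bool, other_copies_agree: bool,
                            inspected_by_lagged_copies: bool) -> bool:
    """All four questions must be answered "Yes" for a non-lagged copy."""
    return ((backed_up or crcl_enabled)
            and below_checkpoint
            and other_copies_agree
            and inspected_by_lagged_copies)


def can_truncate_lagged(below_checkpoint: bool, log_age: timedelta,
                        replay_lag: timedelta, truncation_lag: timedelta,
                        deleted_on_active: bool) -> bool:
    """All three questions must be answered "Yes" for a lagged copy."""
    return (below_checkpoint
            and log_age > replay_lag + truncation_lag  # "older than" the sum
            and deleted_on_active)


if __name__ == "__main__":
    print(can_truncate_non_lagged(False, True, True, True, True))
    print(can_truncate_lagged(True, timedelta(days=9),
                              timedelta(days=7), timedelta(days=1), True))
```

The first call shows that CRCL can stand in for a backup; the second shows a nine-day-old log on a lagged copy with a combined eight-day lag becoming eligible for truncation.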

40 Circular Logging and Continuous Replication ► MSExchangeRepl manages CRCL so that log continuity is maintained and logs are not deleted if they are still needed for replication ► MSExchangeRepl and Store communicate via RPCs regarding which log files can be deleted

41 Circular Logging and Continuous Replication ► Both features enabled and disabled same way – By using the checkbox on the Maintenance tab of the database Properties dialog that says Enable circular logging – By using the Set-MailboxDatabase cmdlet with the -CircularLoggingEnabled parameter set to $true

42 Circular Logging and Continuous Replication ► Once enabled, the feature you get depends on whether or not database is replicated – If the mailbox database is not replicated, it will use JET circular logging. In this case, enabling or disabling JET circular logging will require a dismount and mount of the database – If the mailbox database is replicated, it will use CRCL. In this case, enabling or disabling CRCL takes effect dynamically; there is no need to dismount and re-mount the database

43 Circular Logging and Continuous Replication ► No way to transition a database from CL to CRCL or vice versa – If you have circular logging enabled for a non-replicated database and you add your first passive copy, Exchange will block that action to prevent switching from JET CL to CRCL; this prevents logs from being truncated while seeding of the new copy starts (the newly added passive copy would not seed correctly if required logs were removed) – Removing the last passive copy would result in a transition from CRCL to CL, and is blocked because that transition would require dismounting and re-mounting the active copy of the database

44 Circular Logging and Continuous Replication ► You must therefore disable CRCL/CL when – You add the second copy of a database (e.g., add the first passive copy) – You are trying to remove the last passive copy of a database ► Error message: "Database "{0}" has circular logging enabled. It is not possible to add or remove database copies while circular logging is enabled. Please disable circular logging before adding or removing mailbox database copies."
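
The rules from the last few slides (which form of circular logging applies, when a remount is needed, and when adding or removing a copy is blocked) can be summarized in a small Python model. This is a sketch under stated assumptions, not Exchange behavior verified against the product: the function names are hypothetical, and it models only the two blocked transitions the slides name (adding the first passive copy and removing the last one).

```python
def circular_logging_type(enabled: bool, copy_count: int) -> str:
    """Which form of circular logging a database gets, given its copy count."""
    if copy_count < 1:
        raise ValueError("a database has at least one (active) copy")
    if not enabled:
        return "none"
    # A replicated database (more than one copy) uses CRCL; otherwise JET CL.
    return "CRCL" if copy_count > 1 else "JET CL"


def toggle_requires_remount(copy_count: int) -> bool:
    """JET CL changes need a dismount and mount; CRCL changes apply dynamically."""
    return copy_count == 1


def copy_change_blocked(enabled: bool, current_copies: int,
                        new_copies: int) -> bool:
    """Blocked when the change would flip the database between JET CL and CRCL."""
    before = circular_logging_type(enabled, current_copies)
    after = circular_logging_type(enabled, new_copies)
    return before != after


if __name__ == "__main__":
    print(copy_change_blocked(True, 1, 2))   # adding the first passive copy
    print(copy_change_blocked(True, 2, 1))   # removing the last passive copy
    print(copy_change_blocked(False, 1, 2))  # circular logging disabled first
```

Disabling circular logging before the change makes both states "none", so the model no longer reports the change as blocked.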

45 Exchange Server 2010 High Availability Concept: Replay Lag and Truncation Lag

46 Replay Lag Time ► Enables you to take a database copy back to a specific point in time ► Configure a Replay Lag Time >0 – Configure using Set-MailboxDatabaseCopy – Any database copy configured with a replay lag time >0 is considered a “lagged copy” ► Lagged copies are only for point-in-time protection, but they are not a replacement for point-in-time backups – Logical corruption and/or mailbox deletion prevention scenarios – Provide a maximum of 14 days protection

47 Replay Lag Time ► When should you deploy a lagged copy? – Useful only to mitigate a risk – Might not be needed if deploying a backup solution ► Lagged copies are not HA database copies – Lagged copies should never be automatically activated! – Steps for manual activation documented at ► ESE single page restore feature does not support lagged copies – If a lagged copy has database page corruption, it will have to be reseeded (which will lose the lagged aspect of the copy)

48 Truncation Lag Time ► Enables you to use the logs on a passive database copy to recover from the loss of log files on the active database copy ► Configure a Truncation Lag Time >0 – Configure using Set-MailboxDatabaseCopy

49 Using Replay Lag or Truncation Lag ► Using Replay Lag or Truncation Lag affects your storage design ► By design, both cause logs to build up, even when continuous replication circular logging (CRCL) is enabled ► Plan your storage design accordingly ► See http://technet.microsoft.com/en-us/library/dd aspx for more info
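
To get a feel for the storage impact, the Python sketch below estimates how many log files a lagged copy holds when logs are kept for ReplayLagTime + TruncationLagTime. It is a back-of-the-envelope calculation, not a sizing tool: the `logs_per_day` input is a hypothetical figure you would measure yourself, the 1 MB log file size is background knowledge about Exchange 2010 rather than something stated in the slides, and the other truncation conditions are ignored.

```python
from datetime import timedelta
from typing import Tuple

LOG_FILE_MB = 1  # assumption: Exchange 2010 transaction log files are 1 MB each
MAX_REPLAY_LAG = timedelta(days=14)  # maximum protection stated in the slides


def retained_log_estimate(logs_per_day: int, replay_lag: timedelta,
                          truncation_lag: timedelta) -> Tuple[float, float]:
    """Return (log file count, size in GB) retained on a lagged copy."""
    if replay_lag > MAX_REPLAY_LAG:
        raise ValueError("replay lag cannot exceed 14 days")
    # Logs are kept until they are older than replay lag plus truncation lag.
    retention_days = (replay_lag + truncation_lag) / timedelta(days=1)
    log_count = logs_per_day * retention_days
    return log_count, log_count * LOG_FILE_MB / 1024


if __name__ == "__main__":
    count, size_gb = retained_log_estimate(
        20000, timedelta(days=7), timedelta(days=1)
    )
    print(f"{count:.0f} logs, about {size_gb:.2f} GB")
```

The example inputs (20,000 logs per day, a seven-day replay lag and a one-day truncation lag) are placeholders; substitute measured values for a real design.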

50 Exchange Server 2010 High Availability Concept: Continuous Replication Modes

51 Continuous Replication Modes ► Two Modes of Continuous Replication in SP1 – File Mode (traditional log shipping) – Block Mode (ESE log buffer data shipping) ► Operates on a per-database level ► Initial Mode is File Mode ► Block Mode triggered when copy queue length is 0 ► Reverts to File Mode if copy queue length grows
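
The mode switching described above behaves like a two-state machine driven by copy queue length. The Python sketch below traces it for a sequence of queue-length samples. The slide does not say how far the queue must grow before replication reverts to file mode, so the sketch assumes that any non-zero copy queue length causes the switch; the function name is hypothetical.

```python
from typing import Iterable, List


def trace_modes(queue_lengths: Iterable[int]) -> List[str]:
    """Return the replication mode after each copy queue length sample."""
    mode = "file"  # the initial mode is file mode
    history = [mode]
    for copy_queue_length in queue_lengths:
        if copy_queue_length < 0:
            raise ValueError("copy queue length cannot be negative")
        # Assumption: block mode only while the copy is fully caught up.
        mode = "block" if copy_queue_length == 0 else "file"
        history.append(mode)
    return history


if __name__ == "__main__":
    print(trace_modes([3, 1, 0, 0, 2, 0]))
```

The first entry in the returned list is the initial file mode, followed by one entry per sample.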

52 Continuous Replication Modes (diagram) ► File Mode: the passive copy reports the last log it has (“Send me the latest log files … I have log 2”) and complete log files are shipped until the database copy is up to date ► Block Mode: ESE log buffer data is shipped to the replication log buffer on the passive copy, where the log is built and inspected; a detected log fragment is converted to a complete log

53 Continuous Replication Modes ► Is database copy in block mode or file mode?
Get-Counter -ComputerName <server> -Counter "\MSExchange Replication(*)\Continuous replication - block mode Active"
Timestamp            CounterSamples
/21/2011 2:15:29 PM  \\ex1\msexchange replication(db9)\continuous replication - block mode active : 1
                     \\ex1\msexchange replication(db8)\continuous replication - block mode active : 1
                     \\ex1\msexchange replication(db6)\continuous replication - block mode active : 1
                     \\ex1\msexchange replication(db5)\continuous replication - block mode active : 1
                     \\ex1\msexchange replication(db3)\continuous replication - block mode active : 1
                     \\ex1\msexchange replication(db2)\continuous replication - block mode active : 1
                     \\ex1\msexchange replication(db7)\continuous replication - block mode active : 0
                     \\ex1\msexchange replication(db4)\continuous replication - block mode active : 0
                     \\ex1\msexchange replication(db1)\continuous replication - block mode active : 0
                     \\ex1\msexchange replication(_total)\continuous replication - block mode active : 6
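
If the counter output is captured as text, a few lines of Python can turn it into a per-database summary. The sketch below is an illustration only: it assumes each sample appears on one line in the "path : value" form shown on the slide, treats 1 as block mode and 0 as file mode, and skips the `_total` instance. The `SAMPLE` text and the function name are made up for the example.

```python
import re
from typing import Dict

# Matches "...replication(<db>)\continuous replication - block mode active : <n>".
LINE = re.compile(
    r"replication\((?P<db>[^)]+)\)\\+continuous replication - "
    r"block mode active\s*:\s*(?P<value>\d+)",
    re.IGNORECASE,
)

SAMPLE = r"""
\\ex1\msexchange replication(db9)\continuous replication - block mode active : 1
\\ex1\msexchange replication(db7)\continuous replication - block mode active : 0
\\ex1\msexchange replication(_total)\continuous replication - block mode active : 6
"""


def replication_modes(text: str) -> Dict[str, str]:
    """Map each database instance to "block" or "file" based on the counter value."""
    modes = {}
    for match in LINE.finditer(text):
        if match["db"].lower() == "_total":
            continue  # the _total instance is a sum, not a database
        modes[match["db"]] = "block" if match["value"] == "1" else "file"
    return modes


if __name__ == "__main__":
    print(replication_modes(SAMPLE))
```

The regular expression tolerates one or more backslashes before the counter name, so it should cope with minor differences in how the path is rendered.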
