Avanade: Tid for oppetid

Name: Avanade: Tid for oppetid
Uploaded: 2017-12-27T07:29:45+00:00
Duration: PTM18S2
Channel: Pedro Wickson
Description: Avanade: Tid for oppetid

Avanade: Tid for oppetid
Bernt Lervik Infrastructure Architect Avanade

Additional information can be found at www.avanade.com
4/6/ :36 AM Avanade is the leading technology integrator specialising in the Microsoft platform. Our people help customers around the world maximise their IT investment and create comprehensive solutions that dive business results. Additional information can be found at © 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

Agenda Failover Clustering Database Mirroring
Basic principles of failover clustering Best practices for failover clustering Best practices for geo clustering Database Mirroring How database mirroring works What influences performance Failover considerations Deployment considerations Best practices database mirroring

Basic Principle of Failover Clustering
Public Network SQL Server Virtual Server Microsoft Cluster Server Microsoft Cluster Server Heartbeat Node A Node B Shared Disk Array* * Disks in a shared-nothing configuration – each drive letter is owned by only one node (at a time) DAT203

Best Practices for Failover Clustering Increasing reliability for site/server/database failure
Understand the technology – multiple nodes access a shared disk array in a shared-nothing configuration Pros Provides redundancy at the server level Supports automatic detection and automatic failover Clients unaware of where SQL Server is running – SQL Server runs as a “Virtual Server” on the Windows Cluster Entire server is protected (there is ONLY one database/server, no external dependencies) Relatively easy once it’s set up and configured Cons Only *one* copy of the database Single point of failure, failover when shared disk unaffected No standby server (aka, no copy) for reporting Proprietary hardware can be expensive Only choose hardware from the Microsoft Windows Server Catalog, Cluster Solutions: =22&xslt=categoryProduct&pgn=8b b91c-4a7d e9cd3ae95b DAT203

Best Practices for Geo Clustering Increasing reliability for site/server/database failure
Understand the Technology – based on a common hardware strategy called remote mirroring, where disk activity is mirrored remotely to a second copy of the database. Can cause the secondary system to be corrupt if write order and block size are not preserved (be sure to choose a supported configuration) Pros Provides zero to minimal data loss through redundant storage area networks/arrays, providing redundancy at the system level Removes single point of failure when compared to failover clustering Cons Performance may be impacted in synchronous (no data loss) configurations but data loss is possible in high performance configurations (there’s your key trade-off in a mirroring solution) Proprietary hardware can be expensive Only choose hardware from the Windows Server Catalog, Geographically Dispersed Cluster Solution Category: DAT203

How Database Mirroring Works No Mirroring
Application 1 3 Commit Principal SQL Server 2 >2 Log Data

How Database Mirroring Works Asynchronous Mirroring
Application 1 3 Commit Principal Mirror >2 SQL Server SQL Server >>>2 2 >>2 >>>2 Log Data Log Data

How Database Mirroring Works Synchronous Mirroring
Witness Application 1 5 Commit Principal Mirror 2.1 SQL Server SQL Server 4 2 >2 3 >3 Log Data Log Data

Transaction Safety Synchronous Asynchronous SAFETY FULL (Default)
ALTER DATABASE <database name> SET SAFETY FULL Guaranteed protection of data High availability / High protection (with a witness) Allows automatic failover (with a witness) Asynchronous SAFETY OFF ALTER DATABASE <database name> SET SAFETY OFF Potential loss of data in the event of failure High Performance mode Force service for failover

What Influences Performance? Synchronous Mirroring
The most important factor is the log generation rate Witness Application 1 5 Commit Principal Mirror 2.1 SQL Server SQL Server 4 >2 2 3 >3 Log Data Log Data

What Influences Performance ?
Log generation rate Network latency and bandwidth Transaction safety level Number of concurrent user connections Transaction size and volume Hardware sizing (spindles spindles spindles)

Test Workloads Characteristic Workload1 Workload2 Database size (GB)
40 20 Number of concurrent user connections 1000 Maximum think time between transactions (sec) 4 Baseline (No Mirroring) %CPU Baseline (No Mirroring) Transactions / sec 241 215 Baseline (No Mirroring) log generation rate (KB / sec) 720 12000

Transaction Safety vs. Performance Workload1
Marginal impact when log generation rate is low

Transaction Safety vs. Performance Workload2
More impact when log generation rate is high

Impact of Network Latency Synchronous with Workload1

Impact of Network Bandwidth Synchronous with Workload1

Failover Considerations
Failover is at a database level No group / instance failover Data outside the database is not propagated Master: logins, user written stored procedure, etc. MSDB: Jobs, histories, etc.

Events During an Automatic Failover
Failure occurs Time to coordinate with witness. Database available Time to detect failure. Fixed overhead Failure detected Redo Complete Decide to failover Redo Phase Undo Phase Time The time from detecting the failure of the principal to the time the mirror assumes the role of the principal is the database failover time

Failure Detection for Automatic Failover
Two different types of failures SQL Server Ping each other once a second By default if 10 “pings” are missed, then declare a failure Outside SQL Server Operating system Network errors IO errors Process errors

Examples of Failures Fast
SQL Server instance crashes Endpoint closes port quickly Network retry from partner quickly fails OS says that the port is closed Fast failure! Failover begins in seconds

Examples of Failures Not as fast
Catastrophic server failure Power supply fails Network retry from partner waits for timeout SQL Server “ping” will most likely fail first Failover begins in 10 seconds

Examples of Failures Slower…
Someone pulls the log drive on principal Pending IOs to the log drive queue up SQL Server “pings” are working fine After 20 seconds, SQL Server issues IO warning After 40 seconds, SQL Server declares IO failure Failover begins 40 seconds after log drive is pulled

Examples of Failures Either No failover or Fast failover
Database page fails checksum Client connection is broken Transaction rolls back automatically No failover Transaction was in the middle of a rollback Now the database is inconsistent Database goes SUSPECT Fast failover!!!

Issues with Extended Disconnects
Long Disconnects Mirror unavailable → DISCONNECTED Mirroring session suspended → SUSPENDED Log records keep accumulating at the principal Transaction log can NOT be truncated, even if you backup transaction log May eventually fill up the transaction log space and the database comes to halt Look at LOG_REUSE_WAIT_DESC column in sys.databases RESUME the mirroring session, or break it (manually resynchronize via backup/copy/restore, resume mirroring – just as when you setup mirroring)

Deployment Considerations 2
Customer stories Mission critical applications deploying synchronous with witness For DR, customers deploy asynchronous with great success Some customers want synchronous, but prefer manual failover Multiple databases Corporate IT policies demand human involvement Start simple with asynchronous mirroring Increase complexity as needed – one at a time Turn on synchronous Add a witness

Summary Performance Considerations
Applications generating more transaction log experience higher performance impact with database mirroring Applications with fewer connections experience more impact on transaction throughput when synchronous mirroring is turned on Applications with smaller transaction size experience relatively larger performance impact with database mirroring Applications with low transaction log rate may sustain acceptable throughput with slight reduction in network bandwidth or slight increase in the network latency Applications with high transaction log rate may experience severe performance degradation with lower network bandwidth or higher network latency While using asynchronous mirroring, monitor send queue to determine the possible data loss in the event of failure of the principal

Best Practices for Database Mirroring Increasing reliability for site/server/database failure
Understand the technology – Communicates through a dedicated TCP endpoint and continuously sends transactional information to the mirror (copy) database Pros Provides zero to minimal data loss through a configured database mirroring partnership – which includes a copy of the database, database redundancy Removes single point of failure when compared to failover clustering No hardware dependencies Includes transparent client redirect for better client connection management Cons Performance may be impacted in synchronous (no data loss) configurations but data loss is possible in high performance configurations (again, the key trade-off in any mirroring solution) Understand the Configurations High Availability-Synchronous mirroring, Automatic detection/failover, no data loss High Protection-Synchronous mirroring, manual failover, no data loss High Performance-Asynchronous mirroring, manual failover, some data loss possible DAT203

Summary Best Practices Recommendations
Start simple (asynchronous) and then gradually increase complexity to synchronous without witness (therefore without automatic detection/automatic failover) and then add the witness If you are not interested in automatic failover, don’t setup a witness Understand performance and availability requirements of the application Synchronous database mirroring is “generally” not recommended for a remote mirror Keep the mirror prepared for a failover, but transferring the logins, jobs, etc. Test performance implications thoroughly before setting up in production Test performance over network before deploying mirroring between two geographically distant servers Test failover with different failure scenarios

Avanade: Tid for oppetid

Similar presentations

Presentation on theme: "Avanade: Tid for oppetid"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Avanade: Tid for oppetid

Similar presentations

Presentation on theme: "Avanade: Tid for oppetid"— Presentation transcript:

Similar presentations

About project

Feedback