Presentation on theme: "Database Mirroring in the Real World"— Presentation transcript:
1 Database Mirroring in the Real World Craig Purnell
2 About Me
- Database Administrator with Baker Hostetler
- Experience with SQL Server 7/2000 and up
- Previously consulted in the transportation industry, programming custom UNIX ERP solutions
- MCSE / MCSA / MCITP (DBA) etc.
3 Overview
- Mirroring is a high availability technology.
- A continuous stream of log records is sent to the mirror and "replayed".
- The mirror is unavailable for client connections. You can, however, create a snapshot on the mirror database.
- Protects a single database at a time. Mirroring always happens at the database level, not the instance level.
- It is a software solution that increases database availability by providing a hot standby database on another instance of SQL Server.
- Mirroring is log shipping on steroids! Log shipping is a series of backups and restores; mirroring is real time or near real time.
- You can have only 1 mirrored copy of each database.
- Mirroring should be just one of the tools in your H.A. toolkit.
4 Mirroring Demystified
- Synchronous with witness: no data loss; automatic detection of failure and automatic failover.
- Synchronous without witness: possibility of downtime.
- Asynchronous: manual failover; possibility of data loss.
- Synchronous operates in FULL transaction safety mode.
- The key difference is that sync writes to both sets of log drives. Async is best effort.
- Having a witness gives you automatic failover capability.
- The witness arbitrates between the 2 servers to determine who is the principal.
- We will be focusing mostly on async mirroring.
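A minimal T-SQL sketch of how these modes map to ALTER DATABASE settings once a mirroring session already exists. The database name Megadata is borrowed from the first demo; the witness address is a hypothetical placeholder, not something from the deck.

```sql
-- Run on the principal, after the mirroring session has been established.

-- Synchronous: FULL transaction safety, the principal waits for the mirror.
ALTER DATABASE [Megadata] SET PARTNER SAFETY FULL;

-- Synchronous with witness: adding a witness enables automatic failover.
-- The address below is a hypothetical placeholder.
ALTER DATABASE [Megadata] SET WITNESS = N'TCP://witness.example.com:5022';

-- Asynchronous: remove the witness, then turn transaction safety OFF.
ALTER DATABASE [Megadata] SET WITNESS OFF;
ALTER DATABASE [Megadata] SET PARTNER SAFETY OFF;
```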
5 Asynchronous Mirroring
- The key point here is that in asynchronous mirroring, SQL Server does not wait for writes to complete on the mirror.
- The mirror server could be 2000 miles away.
- This is only available in Enterprise edition.
- Works great for geographic DR.
- Think of it as a poor man's SAN replication: not as integrated, but a whole lot cheaper.
6 Synchronous Mirroring
- In synchronous mirroring, the principal waits for an acknowledgement from the mirror before signaling the client that the transaction is complete.
- Not recommended for WAN.
7 Synchronous Mirroring with Witness
- [Diagram: Microsoft's SAP system. Labels: Primary Data Center, Disaster Recovery Data Center, Principal, Mirror, Witness, Synchronous Database Mirroring, Log Shipping, Log Shipping Secondary.]
- Replaces the clustering single point of failure!
- Downside: you now need 2 SANs.
- Nice side effect: you now have to use log shipping for DR.
8 Mirroring with Certificates
- Different domains without a trust (mergers, acquisitions, DMZ)
- Internet (partners, affiliates)
- Not very common. Only devote 1 slide to this. In over 5 years of working with mirroring, I have never come across the need to use certificates.
- Common scenarios include domains without trust relationships, an Internet-facing colocated server facility, a business partner, or a stand-alone server.
9 Mirroring at Baker Hostetler
- 11 offices; all SQL is centralized
- Early adopter of the technology: we have been using it since we went live on SQL 2005 SP1.
- Was mirroring from Cleveland to Denver
- Currently in a state of transition...
- Future: Eagan, MN to Cleveland
- We mostly use asynchronous mirroring, coupled with failover clustering for H.A.
- Our SharePoint/SCCM instance uses synchronous mirroring.
10 Mirroring at Baker Hostetler
- We have 5 instances of SQL on a 4-node SQL 2008 cluster.
- This is our core LOB application: the document management system. It is just one small part of our environment. We will be mirroring it back to Cleveland.
- Other instances include SharePoint, General Purpose, Citrix family, and Legal specific.
- Hardware is Dell 710 with a NetApp SAN over iSCSI.
11 Endpoints (1)
- Required by mirroring
- SQL Server object that maps in a TCP socket
- Authentication
- Encryption (RC4 etc.)
- See BOL for options
- An endpoint is an object in SQL Server that maps in a TCP socket connection from the operating system.
- Database mirroring sessions use this as their communications pipe.
- Endpoints are at the instance level, for all databases.
- We will review the specific syntax in the demo.
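For orientation ahead of the demo, a hedged sketch of a mirroring endpoint definition. The endpoint name Mirroring is an assumption; port 5022 is the one that appears in the error message later in the deck; RC4 is used only because the slide mentions it. The demo's actual options may differ.

```sql
-- One database mirroring endpoint per instance; it serves every mirrored database.
-- The endpoint name is a hypothetical choice.
CREATE ENDPOINT [Mirroring]
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
    FOR DATABASE_MIRRORING (
        ROLE = PARTNER,                       -- WITNESS or ALL are the other roles
        AUTHENTICATION = WINDOWS NEGOTIATE,   -- Windows authentication between partners
        ENCRYPTION = REQUIRED ALGORITHM RC4   -- must be compatible on both partners
    );
```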
12 Limitations (32-bit)
- Best practice: 10 mirrored databases per server, as noted in the database mirroring white paper (32-bit).
- Effects of mirroring on the target: SQL runs out of VAS (32-bit).
- Band-aid: -g512 (increase MemToLeave). The -g startup option defaults to 256.
- On one server, we had around 40. That is the limit.
- We have seen VAS memory exhaustion as a result of a high number of databases, specifically on 32-bit.
- MemToLeave is outside of the buffer pool (for linked servers, extended stored procedures, CLR).
- Mirroring is very thread intensive, and each thread uses memory buffers.
13 Mirror Configuration
- Identical hardware not required
- Memory / disk requirements
- Drive letter layouts
- File growths / shrinks
- Unlike clustering, identical hardware is not required; memory and disk space should be close, however.
- It's best to lay out your mirror server disks exactly the same; this makes things easier for restores.
- File growths and DBCC SHRINKFILE operations also happen on the mirror, so watch your drive space!
14 Demo #1 (T-SQL)
- Megadata from I1 to I2
- Discuss how the demos are set up: DC / SERVERA / SERVERB
- We will be setting up Megadata using T-SQL.
- Complete setup (I1 to I2): security, endpoints, restore DB / restore T-log, mirror
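The demo script itself is not part of this transcript. The restore and mirror steps it lists could look roughly like the sketch below, which assumes endpoints and logins are already in place. The backup paths and partner addresses are hypothetical placeholders.

```sql
-- On the principal (I1): the database must use the FULL recovery model.
ALTER DATABASE [Megadata] SET RECOVERY FULL;
BACKUP DATABASE [Megadata] TO DISK = N'C:\Backup\Megadata.bak' WITH INIT;
BACKUP LOG [Megadata] TO DISK = N'C:\Backup\Megadata.trn' WITH INIT;

-- On the mirror (I2): restore both backups and leave the database restoring.
RESTORE DATABASE [Megadata] FROM DISK = N'C:\Backup\Megadata.bak' WITH NORECOVERY;
RESTORE LOG [Megadata] FROM DISK = N'C:\Backup\Megadata.trn' WITH NORECOVERY;

-- On the mirror first, pointing at the principal (hypothetical address):
ALTER DATABASE [Megadata] SET PARTNER = N'TCP://servera.example.com:5022';

-- Then on the principal, pointing at the mirror (hypothetical address):
ALTER DATABASE [Megadata] SET PARTNER = N'TCP://serverb.example.com:5022';
```

The order matters: the partner is set on the mirror before it is set on the principal.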
15 WAN acceleration (1)
- Riverbed Steelhead (hardware based)
- Turn off encryption and compression
- SQL 2008 effects (native compression)
- Trace flag -T1462
- We are using Riverbed Steelhead for WAN optimization.
- Operates at Layer 2 in the OSI model
- Think of the Steelhead as a ZIP file for the WAN (26 a's = 1 a + checksum).
- ~75% compression ratio on the log stream (as reported by the Steelhead)
- Comparison: SQL 2008 offers about 73% compression at a cost of higher CPU, so it's really nearly a wash.
- -T1462 is available in SQL 2008 to disable compression.
- Explain the hazards of disabling compression in SQL 2008 (no compressed backups).
16 WAN Acceleration (2)
- Index rebuilds: 70-75% compression
- We have had good luck with the Steelheads, with the exception of problems with their dual-WAN partitioning algorithm.
17 MPLS Network QOS
- QOS = Quality of Service. Basically, we tag the packets sent to AT&T and they are supposed to honor their priority.
- Voice: IP phones
- Interactive: Terminal Services / Citrix
- Business Critical: LOB applications
- Default: try to put mirroring here
- You really want to stay out of Scavenger. Can't stress this enough.
18 Mirroring Timeouts
- What we finally believe was the issue is how the Steelhead operates in a dual-WAN environment: how it partitions the bandwidth.
19 Mirroring Timeouts
- We were able to reproduce this in the lab with a 10 Mbps crossover link and a big database.
- Test: 2 servers / 10 Mbps crossover, dedicated endpoints
20 Demo #2 (GUI)
- AdventureWorks from I1 to I2
- GUI demo to mirror AdventureWorks
- Start at a fully restored version of AdventureWorks.
- Configure Security dialog: automatically creates the endpoints if they are not there.
- Make sure we show the bad dialog.
- Then: demonstrate failover with the GUI.
- Show how the roles are reversed.
- Try to query the mirror database.
- Then create a snapshot on the mirror.
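The demo uses the GUI, but the failover and snapshot steps have T-SQL equivalents. A sketch follows; the snapshot name, the file path, and the logical file name AdventureWorks_Data are assumptions to check against the actual database.

```sql
-- On the principal: manual failover. This needs SAFETY FULL and a
-- SYNCHRONIZED session, and must be run from the master database.
USE master;
ALTER DATABASE [AdventureWorks] SET PARTNER FAILOVER;

-- On the mirror: the mirror database cannot be queried directly,
-- but a database snapshot of it can.
-- NAME must match the logical data file name of the source database.
CREATE DATABASE [AdventureWorks_Snapshot]
ON (NAME = N'AdventureWorks_Data',
    FILENAME = N'C:\Snapshots\AdventureWorks_Data.ss')
AS SNAPSHOT OF [AdventureWorks];
```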
21 Mirror on a Dedicated IP
- Assign an IP to a 2nd NIC.
- Create an endpoint to listen only on that IP.
- Repeat for the mirror.
- Have the network ops team configure routing.
- This is something we were experimenting with. We have done this in test by using a crossover cable.
- Why do we care about this? If you are using a redundant WAN, this can allow net ops to fine-tune where your traffic goes.
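A sketch of an endpoint bound to a single address, as the steps above describe. The endpoint name and the IP 10.0.0.10 are hypothetical. An instance can have only one database mirroring endpoint, so an existing one would need to be dropped or altered first.

```sql
-- Listen for mirroring traffic only on the IP assigned to the 2nd NIC.
-- 10.0.0.10 is a placeholder address.
CREATE ENDPOINT [Mirroring_Dedicated]
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5022, LISTENER_IP = (10.0.0.10))
    FOR DATABASE_MIRRORING (ROLE = PARTNER);
```

The partner address given to ALTER DATABASE ... SET PARTNER then has to resolve to that dedicated IP on each side.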
22 Mirroring and Licensing (1)
- The witness can be SQL Express.
- The mirror server can be unlicensed, as long as nothing is running queries against the instance and there are no production databases on it.
- No snapshots allowed.
- See the SQL Licensing white paper for more details.
- Important disclaimer: I am not a licensing expert.
23 Mirroring and Licensing (2)
- Must use Standard or Enterprise.
- Synchronous vs. asynchronous mirroring: only Enterprise edition can perform async mirroring.
- Can't mix editions (Standard with Enterprise).
24 Demo #3 (T-SQL)
- Rolling upgrade of Northwind and Pubs
- In the next demo, we have Northwind and Pubs mirrored from a SQL 2005 instance to SQL 2008 R2; we then fail over and break the mirror.
- The net effect will be to upgrade the database quickly.
- Also works for quickly moving a database between datacenters (or servers) with low downtime.
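The demo script is not included in the transcript. The sequence described above could be sketched as follows for one database; this is an outline under the assumption that the mirroring session is already running, not the author's script.

```sql
-- On the SQL 2005 principal: switch to synchronous so the mirror catches up.
ALTER DATABASE [Northwind] SET PARTNER SAFETY FULL;

-- Wait until the session reports SYNCHRONIZED.
SELECT DB_NAME(database_id) AS database_name, mirroring_state_desc
FROM sys.database_mirroring
WHERE database_id = DB_ID(N'Northwind');

-- Fail over; the SQL 2008 R2 instance becomes the principal.
USE master;
ALTER DATABASE [Northwind] SET PARTNER FAILOVER;

-- On the new principal (SQL 2008 R2): break the mirror.
ALTER DATABASE [Northwind] SET PARTNER OFF;

-- Repeat the same steps for Pubs.
```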
25 Mirroring: Lessons Learned
- Not a substitute for T-log backups. You still have to do transaction log backups!
- Network latency definitely is a factor. The mirroring code inside the SQL engine is especially sensitive to network latency (over 100 ms is usually the breaking point).
- Timeouts due to a "design flaw" in the Riverbed dual-WAN configuration
- Log buildup on the primary: if a WAN link goes down, the log will fill up on your production log drive.
- QOS network rules
- Great way to shovel data from A to B: rapid upgrades and server moves
- Synchronous mirroring on the WAN: don't try it.
- Index rebuilds place a heavy load on the log drives as well.
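One way to spot the log buildup described above is to check why each log cannot be truncated and how full it is. This is a generic sketch, not a check taken from the deck.

```sql
-- Databases whose log truncation is currently held up by mirroring.
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE log_reuse_wait_desc = N'DATABASE_MIRRORING';

-- Log size and percentage used for every database on the instance.
DBCC SQLPERF(LOGSPACE);
```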
26 Index Rebuilds
- Carefully schedule production instances: you want to schedule production rebuilds so they don't interfere with each other on the WAN.
- A selective rebuild script really helps.
- Your mileage may vary.
27 Monitoring Mirroring (1)
- There are multiple ways to monitor:
- System Monitor counters
- T-SQL
- Mirroring Monitor GUI
- DMVs: sys.dm_db_mirroring_connections, sys.database_mirroring, sys.database_mirroring_witnesses, sys.database_mirroring_endpoints, sys.tcp_endpoints, sys.server_principals, sys.database_recovery_status
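As a small illustration of the T-SQL option, the query below reads role, state, and safety level for each mirrored database from sys.database_mirroring. The column selection is just one reasonable choice.

```sql
-- One row per mirrored database on this instance.
SELECT DB_NAME(database_id)        AS database_name,
       mirroring_role_desc,          -- PRINCIPAL or MIRROR
       mirroring_state_desc,         -- e.g. SYNCHRONIZED, SYNCHRONIZING, DISCONNECTED
       mirroring_safety_level_desc,  -- FULL (synchronous) or OFF (asynchronous)
       mirroring_partner_name,
       mirroring_witness_name,
       mirroring_witness_state_desc
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;    -- skip databases that are not mirrored
```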
28 Demo #4
- Index rebuilds and monitoring
- Effect on the send queue; usually leads to timeouts
- Show multiple ways to monitor: Perfmon, Mirroring Monitor, T-SQL
- Launch Perfmon and Mirroring Monitor before you kick off the rebuild.
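The send queue can also be watched from T-SQL while a rebuild runs, by reading the same counters Perfmon shows. A sketch, assuming the standard Database Mirroring counter object:

```sql
-- Log Send Queue KB: log waiting on the principal to be sent to the mirror.
-- Redo Queue KB: log received by the mirror but not yet replayed.
SELECT RTRIM(instance_name) AS database_name,
       RTRIM(counter_name)  AS counter_name,
       cntr_value           AS value_kb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Database Mirroring%'
  AND counter_name IN (N'Log Send Queue KB', N'Redo Queue KB');
```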
29 SQL Agent Alerting
- Configure SQL Agent to alert you of mirroring problems.
- See the technical article "Alerting on Database Mirroring Events" (its escape macros are wrong). Watch for the bug in the white paper also. This deals with how the escape macros are used in SQL Agent: the macros changed at SQL 2005 SP1, and the paper is outdated.
- However, there is really a better way: configure a monitor job to look at your mirrored servers and alert you every 2 hours if they are behind.
- The correct code (macros) is below, for the "DB Mirroring: Record State Changes" job. Correct syntax:
INSERT INTO dbo.[DB Mirroring State Changes] ([Event Time], [Event Description], [New State], [Database])
VALUES ('$(ESCAPE_SQUOTE(WMI(StartTime)))', '$(ESCAPE_SQUOTE(WMI(TextData)))', '$(ESCAPE_SQUOTE(WMI(State)))', '$(ESCAPE_SQUOTE(WMI(DatabaseName)))')
30 Error messages: T-SQL vs. GUI
Msg 1418, Level 16, State 1, Line 1
The server network address "TCP://MYMIRROREDSERVER.domainname.com:5022" can not be reached or does not exist. Check the network address name and that the ports for the local and remote endpoints are operational.
- 1) The GUI tends to give you better error messages.
- 2) You can't mix mirroring between Standard and Enterprise.
- 3) Error 1418 can also manifest itself as endpoint configuration problems.
- Same error condition (mixing SQL editions)
31 Troubleshooting
- netstat -ano
- tcping hostname 5022
- Verify endpoints are using the same encryption and authentication.
- Are you running SQL as LocalSystem? If so, you must use certificates.
- Use the GUI to configure the endpoints for you.
- Try to recreate the problem with the GUI: recreate it in T-SQL vs. the GUI.
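To compare endpoint settings between partners, a query like the one below can be run on each instance and the results checked side by side. This is a sketch using the catalog views listed on the monitoring slide.

```sql
-- Run on each partner (and the witness) and compare the output.
SELECT e.name,
       e.state_desc,                 -- should be STARTED
       e.role_desc,
       e.connection_auth_desc,       -- authentication method
       e.is_encryption_enabled,
       e.encryption_algorithm_desc,  -- must be compatible on both sides
       t.port,
       t.ip_address
FROM sys.database_mirroring_endpoints AS e
JOIN sys.tcp_endpoints AS t
    ON t.endpoint_id = e.endpoint_id;
```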
32 Troubleshooting
- Verify the service accounts are logins.
- Verify both service accounts have CONNECT rights on the endpoint(s).
- You must make a full backup and a T-log backup after the DB is in full recovery mode (watch for extraneous T-log backups).
- Look in the error log.
- Windows Firewall must be configured.
- Watch for IPv6! Check with netstat -ano.
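The first two checks could be sketched as follows. The service account EXAMPLE\sqlsvc_partner and the endpoint name Mirroring are hypothetical; substitute the partner instance's real service account and your endpoint name.

```sql
-- On each partner: make the other partner's service account a login
-- and grant it CONNECT on the mirroring endpoint.
CREATE LOGIN [EXAMPLE\sqlsvc_partner] FROM WINDOWS;
GRANT CONNECT ON ENDPOINT::[Mirroring] TO [EXAMPLE\sqlsvc_partner];

-- Verify who holds permissions on the mirroring endpoint.
SELECT e.name  AS endpoint_name,
       p.name  AS grantee,
       sp.permission_name,
       sp.state_desc
FROM sys.server_permissions AS sp
JOIN sys.endpoints AS e
    ON sp.class_desc = N'ENDPOINT' AND sp.major_id = e.endpoint_id
JOIN sys.server_principals AS p
    ON sp.grantee_principal_id = p.principal_id
WHERE e.type_desc = N'DATABASE_MIRRORING';
```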
33 Other Issues
- Logins are NOT transferred. Use sp_help_revlogin to get an exact SID / SQL login list.
- SQL Agent jobs are not transferred.
- Extended stored procedures
- Linked servers
- Basically, anything that lives outside the user databases needs to be synced on a periodic basis.
- Keep the logins synced; this could be a weekly Agent job.
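sp_help_revlogin generates the full script. The underlying idea is that a SQL login on the mirror must carry the same SID as on the principal, so database users still map after failover. A minimal sketch; the login name, password, and SID are hypothetical placeholders.

```sql
-- On the principal: list SQL logins with their SIDs and password hashes.
SELECT name, sid, password_hash
FROM sys.sql_logins
WHERE name NOT LIKE N'##%'   -- skip internal certificate-based logins
ORDER BY name;

-- On the mirror: recreate a login with the SID copied from the principal.
-- All values below are placeholders.
CREATE LOGIN [app_user]
    WITH PASSWORD = N'Replace-With-A-Strong-Password-1',
         SID = 0x1A2B3C4D5E6F708192A3B4C5D6E7F809;
```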
34 Key Takeaways
- Database mirroring can eliminate the single point of failure of clustering (aka the shared disk problem).
- Database mirroring, coupled with clustering, can raise your business database uptime by an order of magnitude.
- This technology can be implemented without the need for an expensive SAN... just a little creativity on YOUR part.
- All you need to do this is SQL Server (in the box).
- Mirroring is something you definitely want to consider.
- Mirroring, coupled with something else, can really give your HA a boost (4 9's or even 5 9's).
37 References (1)
- Tcping: http://www.elifulkerson.com/projects/tcping.php
- SQL Server Licensing White Paper
- Asynchronous Database Mirroring with Log Compression in SQL Server
- Database Mirroring White Paper
- Database Mirroring: Best Practices and Performance Considerations
- Mirroring with Certificates
- Transferring Logins and Passwords between Instances
38 References (2)
- Alerting on Database Mirroring Events
- How to: Minimize Downtime for Mirrored Databases When Upgrading Server Instances
- High Availability and Disaster Recovery for Microsoft's SAP Data Tier: A SQL Server 2008 Technical Case Study
39 Resources
- Shunra VE Desktop: simulates latency
- To simulate satellite latency in my test environment, I used a product from Shunra called VE Desktop. It is a simple-to-use application providing programmatic tweaks to the network adapter.
40 Review of H.A. technology
- We grabbed this picture from SQL PASS; it is very informative.
41 Monitoring Mirroring (3)
WHERE:
- @dbname: the mirrored database
- @interval: 0 = last row, 1 = last two hours, 2 = last four hours, 3 = last eight hours, 4 = last day, 5 = last two days, 6 = last 100 rows, 7 = last 500 rows, 8 = last 1000 rows, 9 = all rows
- sp_dbmmonitorupdate queries the underlying perfmon counters, where we feel there is an underlying bug.
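The interval codes above match the rows-to-return argument of sp_dbmmonitorresults in msdb. A sketch of a call follows; the arguments are passed positionally so it does not depend on the parameter names shown on the slide, and the database name is just the one used in the demos.

```sql
USE msdb;

-- Arguments: database name, rows to return (1 = last two hours),
-- update status (0 = report the history as it stands; 1 would first
-- refresh it by calling sp_dbmmonitorupdate).
EXEC sp_dbmmonitorresults N'AdventureWorks', 1, 0;
```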
42 Mirroring Options
Columns: Mode; Alternate name; Requires a witness?; Supports automatic failover?; Supports manual failover?; Waits for mirror to write log block before client transaction is committed?
- High Availability (alternate name: Synchronous with Witness): Yes; Yes - if mirror not connected then no delay
- High Protection (alternate name: Synchronous): No
- High Performance (alternate name: Asynchronous): No, must be changed to Synchronous
43 Network Overview
- Craig: We maintain a disaster recovery warm site at our office in Denver, CO.
- The latency to Denver is about 65 ms, so you see why we can't do synchronous.
44 VLDB Considerations
- Multiple log restores (not in LSN window)
- FEDEXnet (portable hard drive)
- Data safeguarding implications: these days, any database backup that goes out is encrypted.
- Disable log backups while attempting to resynchronize the mirror.
- Don't do re-indexing while the drive is in flight (copy massive amounts of data over the WAN).
45 Timeouts
- 0 KB sent -> SQL refuses to transmit
- Infinite to send -> can't see the mirror. Why?
- ping -l hostname (good) (better)
- Work closely with the network WAN team.
46 Common Errors
- Msg 1412, Level 16, State 0, Line 1: The remote copy of database "Adventureworks" has not been rolled forward to a point in time that is encompassed in the local copy of the database log.
- Msg 1408, Level 16, State 0, Line 1: The remote copy of database "Northwind" is not recovered far enough to enable database mirroring.
- Error 1416: DB is in standby or recovered mode. The last transaction log must be restored with the NORECOVERY option.
- Errors 1418 and 1486: these are communications-related errors.
48 Mirroring Demystified
Columns: Operating Mode; Transaction safety; Transfer mechanism; Quorum required; Witness server; Failover Type
- High Availability: FULL; Synchronous; witness Y; Automatic or Manual
- High Protection: FULL; Synchronous; witness N; Manual Only
- High Performance: OFF; Asynchronous; witness N/A; Forced Only
- Mirroring should be just one of the tools in your H.A. toolkit.
- The key difference is that sync writes to both sets of log drives. Async is best effort.
- Having a witness gives you automatic failover.
- The witness arbitrates between the 2 servers to determine who is the principal.
- We will be focusing mostly on async mirroring.