Building Highly Available Systems with SQL Server™ 2005 Keith Burns Data Architect Microsoft Ltd.

1 Building Highly Available Systems with SQL Server™ 2005 Keith Burns Data Architect Microsoft Ltd

2 Availability What does it mean to you?  Can your customers get done what they need to get done, when they need to do it? 24x7x365  Why not?  Site is unavailable  System is unavailable  Database is unavailable  Database is partially unavailable  Table is unavailable  Data is unavailable

3 Barriers to Availability  Isolated Failures  Concurrency Issues  Catastrophic Failures

4 Barriers to Availability Isolated Failures  Continuing to work with isolated failures  Limiting the scope of failure  Partial Database Availability  Online Piecemeal Restore  Supporting Technology  Instant File Initialization  How do they work?

5 What happens when…  Disks Fail  In SQL Server™ 2000  Database is marked suspect  Users are unable to access the database  In SQL Server™ 2005  Filegroup is marked offline  Users are able to access undamaged data

6 What happens when…  Recovery begins  In SQL Server™ 2000  Database is in a restoring state  Users are unable to access the database  File needs to be recreated and zero-initialized  File Restore can proceed – offline  In SQL Server™ 2005  Filegroup is in a restoring state  Users are able to access undamaged data  File can be recreated with Instant File Initialization  Piecemeal Restore can proceed – online

7 How is This Possible?  Fine grained operations are based on “functional partitioning”  Partitioning – in this sense – does not require Partitioned Tables  Partitioned Tables benefit significantly from fine grained operations  Partitioning for fine grained operations requires secondary, non-primary data files where data is strategically placed  Recovery of your damaged devices can be prioritized and then the database can be brought online in stages

8 Functional Partitioning Strategies to separate Objects/Data  Related Object-groupings  Separate tables – strategically placed on different filegroups  Time-based data placement/partitioning  Structures designed for sliding window scenario  List-based groupings/partitioning  Range-based partitioning based on complete lists  To fully leverage Partial Database Availability for partitioned objects – use Partitioned Tables  Partitioned Tables – new feature in SQL Server™ 2005 to further simplify the process of building large data warehouses

9 Benefits of Partitioning  Speed in managing sliding window  Partition manipulation outside of active table  Piecemeal Backup  Backup active components more frequently, inactive less frequently  Partial Database Availability  If a filegroup becomes unavailable the undamaged data remains available  Online Piecemeal Restore  During the restore, the undamaged data remains available

10 Partial Database Availability Improving Availability for Isolated Disaster  Undamaged data remains available while damaged data is inaccessible  File Status shown in sys.database_files catalog view  Page Errors written to suspect_pages table in msdb  Agent alerts:  Notification of the damaged file  Can take the database offline, if desired  Can automate the restore, for read-only data
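
The idea on this slide can be illustrated with a toy model. The sketch below is not how SQL Server implements filegroups; it only shows the behavior described: when one filegroup is offline, reads against the others still succeed, and the damaged filegroup can be restored while the rest stays readable. All names (`Database`, `Filegroup`, `FilegroupOffline`, the `Sales2003`/`Sales2004` filegroups) are hypothetical.

```python
from dataclasses import dataclass, field


class FilegroupOffline(Exception):
    """Raised when a read touches a damaged (offline) filegroup."""


@dataclass
class Filegroup:
    name: str
    rows: dict = field(default_factory=dict)
    online: bool = True


@dataclass
class Database:
    filegroups: dict = field(default_factory=dict)

    def add(self, fg):
        self.filegroups[fg.name] = fg

    def read(self, fg_name, key):
        fg = self.filegroups[fg_name]
        if not fg.online:
            # Only the damaged filegroup is inaccessible.
            raise FilegroupOffline(fg_name)
        return fg.rows[key]

    def restore(self, fg_name, backup_rows):
        # Restoring one filegroup does not touch the others.
        fg = self.filegroups[fg_name]
        fg.rows = dict(backup_rows)
        fg.online = True

    def status(self):
        return {name: "ONLINE" if fg.online else "OFFLINE"
                for name, fg in self.filegroups.items()}


db = Database()
db.add(Filegroup("Sales2004", {1: "order A"}))
db.add(Filegroup("Sales2003", {2: "order B"}))

# Simulate a disk failure that affects only one filegroup.
db.filegroups["Sales2003"].online = False
still_readable = db.read("Sales2004", 1)
```

In a real system the equivalent of `status()` is the file state reported by the `sys.database_files` catalog view mentioned on the slide.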

11 Database Components  Database consists of…  Filegroups consist of…  Files consist of…  Extents consist of…  Pages consist of data  [Diagram: database TicketSalesDB with a Primary filegroup (File1), a Readwrite filegroup, Readonly filegroups labeled 2001 to 2004, files File2 through File6, and a Log; one file is shown as a File Header followed by pages 0 to 31 grouped into extents 0 to 3.]

12 Online Piecemeal Restore Improving Availability during Recovery  Almost any component (page, file, filegroup) can be restored – ONLINE  If a page is damaged – restore only that page from a file, filegroup or database backup  If a file is damaged – restore only that file from a file, filegroup or database backup  If a filegroup is damaged – restore only that filegroup from a filegroup or database backup  Readonly filegroups can be restored without rolling forward log changes  Users can access the database during the restore

13 Summary: Isolated Failures
Technology | Improves | When
Partial Database Availability | Data Availability: undamaged data/partitions remain available. Recovery Time: recover only that which is damaged. | Upgrade Immediate
Instant File Initialization | Database Creation Time: files are not zero-initialized. File, Filegroup, and Database Restore: missing files are created quickly. Autogrow and Manual Growth time: additional space is quickly added. Recovery Time: less time to create files. | Upgrade Immediate
Online Piecemeal Restore | Data Availability: undamaged data/partitions remain available during recovery. Recovery Time: recover only that which is damaged, online. | Upgrade Immediate

14 Barriers to Availability  Isolated Failures  Concurrency Issues  Catastrophic Failures

15 What happens when…  Indexes need to be rebuilt  In SQL Server™ 2000  Index rebuilds require an exclusive table-level lock, resulting in offline rebuilds  Users are unable to access the table  In SQL Server™ 2005  Rebuilds of an index can be performed online if a few simple criteria are met  Users are able to access the table

16 Online Index Operations Improving Concurrency during Index Maintenance  SQL Server™ 2000  Offline Index Rebuilds; table data is unavailable during operation  Rebuild options: DBCC DBREINDEX and CREATE INDEX with DROP_EXISTING  SQL Server™ 2005  Includes all of the above offline operations, plus…  New ALTER INDEX…REBUILD:  ONLINE – allows concurrent user access (queries as well as modifications) to the index during rebuild  OFFLINE – works using locks (same as SQL Server™ 2000)  If online is not possible by default, consider design alternatives to fully leverage online index rebuilds

17 What happens when…  Readers and Writers desire the same data  In SQL Server™ 2000  Locking is used to guarantee the intended level of isolation  Users must wait to access locked data  Concurrency and performance compromised  Correctness is compromised when lower isolation levels are used to avoid locking  In SQL Server™ 2005  Locking OR Versioning can be used to guarantee a variety of isolation levels  With versioning, Readers won’t block writers and writers won’t block readers  Performance improved if contention was primary bottleneck  Correctness is not compromised due to use of lower isolation levels

18 Snapshot Isolation Improving Concurrency in Mixed Workloads  SQL Server™ 2000  Isolation implemented solely through locking  Mixed workloads may experience:  Concurrency problems due to blocking  The Inconsistent Analysis problem  SQL Server™ 2005  Isolation implemented using locking and versioning  Mixed workloads can improve read consistency and performance using:  Read committed with Statement-level snapshot to improve statement-level consistency  Snapshot Isolation to improve transaction-level consistency
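
The difference between the two versioning options on this slide can be sketched with a toy version store. This is an illustration under simplifying assumptions (a single logical clock, one committed version per write), not SQL Server's row-versioning implementation; `VersionStore` and the `balance` key are hypothetical names.

```python
class VersionStore:
    """Keeps every committed version so readers never wait on writers."""

    def __init__(self):
        self.clock = 0
        self.versions = {}  # key -> list of (commit_ts, value), oldest first

    def write(self, key, value):
        # Commit a new version; older versions stay readable.
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def snapshot(self):
        # The point in time a reader will see.
        return self.clock

    def read(self, key, as_of):
        # Return the newest version committed at or before `as_of`.
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= as_of:
                return value
        return None


store = VersionStore()
store.write("balance", 100)

txn_start = store.snapshot()   # a reading transaction begins here
store.write("balance", 250)    # a writer commits while the reader is active

# Transaction-level snapshot: every read uses the transaction's start point.
transaction_level = store.read("balance", txn_start)

# Statement-level snapshot: each statement takes a fresh snapshot when it starts.
statement_level = store.read("balance", store.snapshot())
```

Here `transaction_level` is the value as of the transaction start and `statement_level` is the value as of the later statement start, which is the distinction the slide draws between transaction-level and statement-level consistency.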

19 Summary: Concurrency Requirements
Technology | Improves | When
Online Index Operations | Table concurrency: tables being rebuilt remain available. Downtime due to Maintenance: no longer required for majority of indexes. | Minimal Work to Leverage; Design and Architect
Snapshot Isolation: Statement-level Snapshot | Row concurrency: the prior, consistent version of locked rows remains accessible. Accuracy: long running aggregates/statements use consistent version from statement start. Analysis/Query Time: queries do not wait! | Minimal Work to Leverage
Snapshot Isolation: Transaction-level Snapshot for read-consistency; Transaction-level Snapshot for Update Conflict Resolution | Row concurrency: the prior, consistent version of locked rows remains accessible. Accuracy: long running aggregates/statements use consistent version from transaction start. Analysis/Query Time: queries do not wait! | Minimal Work to Leverage; Design and Architect

20 Barriers to Availability  Isolated Failures  Concurrency Issues  Catastrophic Failures

21 Failover Clustering Server-level Redundancy  Established High Availability Technology  Hot Standby: Automatic Detection and Automatic Failover  No work loss exposure and no direct impact to workload  Protects against node failures  Geographically Dispersed Failover Clusters with approved hardware  Recovery on failover improved by Fast Recovery  [Diagram: Failover Cluster]

22 Failover Clustering New for SQL Server™ 2005  Faster Failover through Fast Recovery  Supports up to an 8-node Failover Cluster with Enterprise Edition  Supports up to a 2-node Failover Cluster with Standard Edition  Supports mounted volumes for better explicit disk usage – helps in server consolidation  Supports dynamic AWE for better memory utilization  Unattended setup  All SQL Server data services participate  Database Engine, SQL Server Agent, Full-Text Search  Analysis Services – Now has multiple instances

23 Fast Recovery Improving Availability by Reducing Downtime  Not only beneficial to Failover Clustering  On every server startup, Restart Recovery runs to guarantee consistency  Restart Recovery has two phases:  REDO: rolls forward committed transactions  UNDO: rolls back any incomplete transactions  In SQL Server™ 2005, users are allowed access after REDO  [Diagram: recovery timeline of Redo followed by Undo; SQL Server™ 2005 is ONLINE after Redo, SQL Server™ 2000 is ONLINE after Undo.]
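
The two recovery phases can be sketched over a toy transaction log. The log format and function names below are hypothetical, and the sketch makes a simplifying assumption: REDO replays every logged change in order and UNDO then reverses the changes of transactions that never committed. It illustrates the phases only and does not describe SQL Server internals.

```python
def redo(log):
    """REDO phase: replay logged changes in order to rebuild the data."""
    data = {}
    for rec in log:
        if rec[0] == "update":
            _, txn, key, old, new = rec
            data[key] = new
    return data


def undo(log, data):
    """UNDO phase: reverse changes made by transactions with no commit record."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, txn, key, old, new = rec
            if old is None:
                data.pop(key, None)   # the row did not exist before
            else:
                data[key] = old       # put the previous value back
    return data


# T1 committed before the crash; T2 was still in flight.
log = [
    ("update", "T1", "a", None, 1),
    ("commit", "T1"),
    ("update", "T2", "b", None, 2),
    ("update", "T2", "a", 1, 5),
]

after_redo = redo(log)
after_undo = undo(log, dict(after_redo))
```

In this model the point of Fast Recovery is where access is allowed: after `redo` returns rather than after `undo` returns.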

24 Database Mirroring Database-level Redundancy  Upcoming High Availability Technology  Released for testing and prototyping in SQL Server™ 2005 RTM  Certified for Production Use in the first half of 2006  Supports three configurations:  High Availability  High Protection  High Performance

25 Database Mirroring Technology Overview  Principal Database handles user activity  Mirror Database receives changes via secure, dedicated TCP channel  Server does NOT require a license if the server acts solely for redundancy  Optional Witness Server  Lightweight mechanism to help provide quorum  Can run on any SQL Server Edition  Supports three configurations:  High Availability  High Protection  High Performance

26 Database Mirroring Basic Principle of Synchronous Mirroring  [Diagram: Commit; Write to Local Log; Transmit to Mirror; Write to Remote Log; Acknowledge (Committed in Log); Acknowledge. The mirror holds its own Log and DB and is Constantly Redoing on Mirror.]
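
The synchronous commit sequence labeled on this slide can be sketched as follows. This is a single-process toy with hypothetical class names (`Principal`, `Mirror`); there is no network or failure handling, and the mirror's redo is done inline here, whereas the slide shows it running continuously in the background.

```python
class Mirror:
    def __init__(self):
        self.log = []        # remote transaction log
        self.database = {}   # mirror copy, kept current by redo

    def receive(self, record, trace):
        self.log.append(record)
        trace.append("write to remote log")
        key, value = record
        self.database[key] = value   # redo (inline for simplicity)
        return True                  # acknowledgement to the principal


class Principal:
    def __init__(self, mirror):
        self.log = []
        self.database = {}
        self.mirror = mirror

    def commit(self, key, value):
        trace = ["commit requested"]
        record = (key, value)
        self.log.append(record)
        trace.append("write to local log")
        trace.append("transmit to mirror")
        acked = self.mirror.receive(record, trace)
        if acked:
            trace.append("mirror acknowledges")
            self.database[key] = value
            # The client hears back only after the mirror has hardened the log.
            trace.append("acknowledge commit to client")
        return acked, trace


principal = Principal(Mirror())
acked, trace = principal.commit("order-1", "filled")
```

Because the client acknowledgement comes after the remote log write, the principal's commit latency in this model includes the round trip to the mirror, which is the reason synchronous configurations are sensitive to network speed and distance.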

27 Database Mirroring Configuration Summary
Mirror database is available within seconds of failover
Mirror database is available for read-only analysis through the use of Database Snapshots
Configuration | Detection | Failover | Form of mirroring | Witness | Principal performance affected by network speed and distance
High Availability | Automatic | Automatic | Synchronous | Required | Yes
High Protection | Not automatic | Manual | Synchronous | Not required | Yes
High Performance | Not automatic | Manual | Asynchronous | Not required | No
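
The configuration summary reduces to two choices: whether mirroring is synchronous and whether a witness is present. A minimal sketch of that mapping, using hypothetical function names that are not part of any SQL Server API:

```python
def mirroring_configuration(synchronous, witness):
    """Map the two design choices to the configuration names on the slide."""
    if synchronous and witness:
        return "High Availability"
    if synchronous:
        return "High Protection"
    return "High Performance"


def properties(configuration):
    """Summarize what each configuration implies, as listed on the slide."""
    synchronous = configuration != "High Performance"
    automatic = configuration == "High Availability"
    return {
        "automatic_detection": automatic,
        "failover": "automatic" if automatic else "manual",
        "mirroring": "synchronous" if synchronous else "asynchronous",
        # Synchronous commits wait on the mirror, so the network matters.
        "affected_by_network": synchronous,
    }


chosen = mirroring_configuration(synchronous=True, witness=True)
chosen_properties = properties(chosen)
```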

28 Database Scale Out Peer to Peer Replication  Identical databases continuously synchronize in near real time  Scale query workloads beyond what’s possible with a single database  [Diagram: Example: Distributed Trading System with peers in London, Chicago, and Tokyo]

29 Availability through Scalability Peer to Peer Replication  Enables load-balancing and improved availability through scalability  Database failures shouldn’t bring down the application system  Database upgrades should be done without outages  Individual databases can be taken online/offline and maintained without application downtime  Warm Standby  Small possibility of some data loss on failure

30 Peer-to-Peer Replication  Based on Established Transactional Replication Technology  Based on Bi-directional Transactional Replication  All participants are peers  Schema is identical on all sites  Publish the updates made on “their” data  Subscribe to others to pick up their changes  No hierarchy as in “normal” transactional replication  A given set of data can be updated at only one site at a time  Data “ownership” is purely logical; does not prevent conflicts  SQL Server prevents a change from round-tripping
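
One simple way to picture how a change is kept from round-tripping is to tag each change with the site where it originated. The toy below assumes a fully connected topology and hypothetical names (`Peer`, `local_update`, `receive`); it is a sketch of the idea, not SQL Server's actual mechanism.

```python
class Peer:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []        # other sites this peer publishes to
        self.apply_count = 0   # how many changes were applied at this site

    def connect(self, *others):
        self.peers.extend(others)

    def local_update(self, key, value):
        """A change made by a user at this site."""
        self._apply(key, value)
        for peer in self.peers:
            peer.receive(key, value, originator=self.name)

    def receive(self, key, value, originator):
        """A change delivered by replication from another site."""
        if originator == self.name:
            return  # our own change coming back: drop it
        self._apply(key, value)
        # Replicated changes are not republished, so they cannot loop.

    def _apply(self, key, value):
        self.data[key] = value
        self.apply_count += 1


london, chicago, tokyo = Peer("London"), Peer("Chicago"), Peer("Tokyo")
london.connect(chicago, tokyo)
chicago.connect(london, tokyo)
tokyo.connect(london, chicago)

london.local_update("order-1", "filled")
```

After the update every site holds the same row and each site has applied it exactly once. Note that nothing here stops two sites from updating the same key at the same time, which matches the slide's point that data ownership is purely logical.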

31 Peer to Peer Topology Peer to Peer Transactional Replication  [Diagram: London, Chicago, and Tokyo, each with a Logreader Agent, a Distribution Agent, and a Dist DB.]

32 Log Shipping Database-level Redundancy  Established High Availability Technology  Supports multiple secondary servers  Secondary for Failover  Secondary for Reporting  Secondary with delay for Human Error Recovery  Can be combined with other technologies such as Failover Clustering and Database Mirroring  New for SQL Server™ 2005  Integration in SQL Server Management Studio  Log Shipping is not delayed during Database or Differential Backups

33 Barriers to Availability  Microsoft SQL Server™ 2005 gives you greatly improved tools to overcome these barriers to availability:  Database Server Failure or Disaster  Isolated Disk Failure  Data Access Concurrency Limitations  Database Maintenance and Operations  Availability at Scale  User or Application Error  Many more barriers than discussed  Only some are addressable by database technology  Be sure to consider people, planning, procedures and training

34 Summary  SQL Server™ 2005 offers greater availability – immediately  Many technologies available just by upgrading!  Some architected/implemented over time  SQL Server™ 2005 is more Available  Partially damaged databases remain available  Databases being recovered remain available  Instant File Initialization, Fast Recovery  New and Improved Replication Alternatives  SQL Server™ 2005 is more Robust

35 Take Advantage When? How much work to leverage the technology?
Upgrade Immediate: Partial Database Availability; Online Piecemeal Restore; Instant File Initialization; Fast Recovery
Minimal Work to Leverage: Online Index Operations When Criteria Met; Snapshot Isolation Statement-level Snapshot; Snapshot Isolation Transaction-level Snapshot (RO); Failover Clustering; Database Mirroring; Log Shipping; Database Snapshots
Design and Architect: Online Index Operations When Criteria NOT Met (minority); Snapshot Isolation With Update Conflict Detection; Replication
Improving Availability from Installation to Design  Availability in Layers to minimize downtime and data loss

36 What we haven’t covered  http://www.microsoft.com/events/series/technetsqlserver2005.mspx  http://www.microsoft.com/events/series/msdnsqlserver2005.mspx  New Security  User/schema separation, encryption  http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032284145&EventCategory=4&culture=en-US&CountryCode=US  http://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?EventID=1032270040&EventCategory=3&culture=en-US&CountryCode=US  CLR Integration  http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032275334&EventCategory=5&culture=en-US&CountryCode=US  XML Datatypes & XQuery support  http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032263330&EventCategory=5&culture=en-US&CountryCode=US  http://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?EventID=1032271525&EventCategory=3&culture=en-US&CountryCode=US  T-SQL extensions  varchar(max), try/catch exception handling, CTEs  http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032284458&EventCategory=4&culture=en-US&CountryCode=US  http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032284324&EventCategory=4&culture=en-US&CountryCode=US  Service Broker  http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032284274&EventCategory=4&culture=en-US&CountryCode=US  http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032263311&EventCategory=5&culture=en-US&CountryCode=US  Query Notifications

37 © 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

