Big and Tall: When to Partition Kendra Little Founder, Brent Ozar PLF.


1 Big and Tall: When to Partition Kendra Little Founder, Brent Ozar PLF

2 About Kendra

3 1.You be the consultant: should this client partition? 2.What is partitioning? 3.Demo 4.Gotchas / Goodies 5.How to: your guide to implementing partitioning. 6.What’s your decision for our client? 7.Resources AGENDA

4 Should this client use partitioning? Client has a reporting system containing both fact and dimension tables. –Fact tables are up to 300GB in size (including all indexes) with up to 1 billion rows. –Dimension tables are up to 200GB in size (including all indexes) with up to 200 million rows. A middle tier application dynamically executes queries to run reports custom-designed by clients.

5 AGENDA 1.You be the consultant: should this client partition? 2.What is partitioning? 3.Demo 4.Gotchas / Goodies 5.How to: your guide to implementing partitioning. 6.What’s your decision for our client? 7.Resources

6 All tables have at least one partition “In SQL Server, all tables and indexes in a database are considered partitioned, even if they are made up of only one partition. Essentially, partitions form the basic unit of organization in the physical architecture of tables and indexes.” …Partitioned Table and Index Concepts (msdn)

7 “Partitioning” means “horizontal partitioning” Horizontal partitioning takes groups of rows in a single table and allocates them in semi-independent physical sections. SQL Server’s horizontal partitioning is RANGE based. –You can effectively do “list” partitioning, however.

8 Horizontal ranges are based on a partition key A single column in the table. –Just one! –You may use a computed column. Just make sure it performs well as a criterion and works for joins. –Look at sys.partition_parameters. What does this imply? Typically a date or integer value Consider: –A column you will join on –A column you can always use as a criterion Choose wisely.

9 Ranges of data are defined by a partition function which uses the key. The partition function defines your boundary points and can use either RANGE LEFT or RIGHT. LEFT: the first value is an UPPER boundary point in partition #1 RIGHT: the first value is a LOWER boundary point in partition #2 Keep to the right.

10 RIGHT based partition function. Boundary points: 1/1/2009, 1/1/2010, 1/1/2011. Resulting partitions: Partition 1, Partition 2, Partition 3, Partition 4.
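The LEFT/RIGHT rule is easiest to see with the three boundary points from this slide. What follows is a minimal Python sketch, not SQL Server code: the function name `partition_number` is made up for illustration, and a binary search stands in for what the partition function does when it assigns a key value to a partition.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Boundary points taken from the slide's example.
BOUNDARIES = [date(2009, 1, 1), date(2010, 1, 1), date(2011, 1, 1)]


def partition_number(value, boundaries, range_right=True):
    """Return the 1-based partition number that would hold `value`.

    RANGE RIGHT: a boundary value is the lowest value of the partition on its right.
    RANGE LEFT:  a boundary value is the highest value of the partition on its left.
    `boundaries` must be sorted ascending.
    """
    if range_right:
        return bisect_right(boundaries, value) + 1
    return bisect_left(boundaries, value) + 1


if __name__ == "__main__":
    for d in (date(2008, 12, 31), date(2009, 1, 1), date(2011, 1, 1)):
        print(d, "RIGHT ->", partition_number(d, BOUNDARIES, True),
              "LEFT ->", partition_number(d, BOUNDARIES, False))
```

With RANGE RIGHT, 1/1/2009 lands in partition 2, so each partition starts cleanly on a boundary date. With RANGE LEFT the same value lands in partition 1, which is why date-based designs usually "keep to the right".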

11 Filegroups are mapped to the partition function using a partition scheme. Boundary points: 1/1/2009, 1/1/2010, 1/1/2011. Partitions: Partition 1 (compressed), Partition 2 (compressed), Partition 3, Partition 4. Diagram labels: Slow, Read-only; FG_A; FG_B; FG_C.
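A partition scheme can be thought of as an ordered list with one filegroup per partition. The Python sketch below illustrates that idea only. The filegroup names come from the slide, but which partition sits on which filegroup is an assumption made for the example, as is the helper name `filegroup_for`.

```python
from bisect import bisect_right
from datetime import date

# Boundary points from the slide; RANGE RIGHT is assumed.
BOUNDARIES = [date(2009, 1, 1), date(2010, 1, 1), date(2011, 1, 1)]

# Hypothetical mapping, in partition order (partition 1 first).
# The slide names FG_A, FG_B and FG_C but this assignment is assumed.
SCHEME = ["FG_A", "FG_A", "FG_B", "FG_C"]


def filegroup_for(value, boundaries=BOUNDARIES, scheme=SCHEME):
    """Return the filegroup that would store a row with this partition key."""
    # N boundary points always produce N + 1 partitions.
    if len(scheme) != len(boundaries) + 1:
        raise ValueError("scheme needs exactly one filegroup per partition")
    partition_index = bisect_right(boundaries, value)  # 0-based partition
    return scheme[partition_index]


if __name__ == "__main__":
    print(filegroup_for(date(2008, 5, 1)))   # oldest data
    print(filegroup_for(date(2012, 1, 1)))   # newest data
```

Putting the two oldest partitions on the same filegroup mirrors the slide's idea of keeping old, compressed data on slower storage.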

12 Objects are created on a partition scheme

13 Non-Clustered Indexes are created on the partition scheme… or not. Aligned NCIs –Located on your partitioning scheme (or an identical partitioning scheme) –Must contain the partitioning key. –If the partitioning key is not specified, it will be added for you. Note: this affects your primary key for the table! –Indexes are aligned by default unless otherwise specified at creation time. –Perform better for aggregations and when partition elimination can be used. Non-aligned NCIs –Physically located elsewhere: either non-partitioned or on a non-identical partitioning scheme –Allow unique indexes (because they do not have to contain the partitioning key) –May perform better with single-record lookups

14 Switching ← this is cool This is an exceptionally fast way to load or remove a large amount of data from a table! Rules to know: Requires all indexes to be aligned. This is compatible with filtered indexes. (SWEET) Data may be switched in or out only within the same filegroup.

15 AGENDA 1.You be the consultant: should this client partition? 2.What is partitioning? 3.Demo 4.Gotchas / Goodies 5.How to: your guide to implementing partitioning. 6.What’s your decision for our client? 7.Resources

16 AGENDA 1.You be the consultant: should this client partition? 2.What is partitioning? 3.Demo 4.Gotchas / Goodies 5.How to: your guide to implementing partitioning. 6.What’s your decision for our client? 7.Resources

17 Editions with table partitioning Enterprise, Datacenter, Developer, Evaluation: supported. Standard: not supported.

18 Support for HOW MANY partitions? 15,000 partitions are available in SQL 2008 with SP2 applied. Otherwise, SQL Server 2005, 2008, and 2008 R2 (for now) are limited to 1,000 partitions. This is less than 3 years for daily partitioning. What problems could happen with lots of partitions?
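The "less than 3 years" figure is simple arithmetic: one partition per day divided into the partition limit. A quick Python check, with a helper name invented for the example:

```python
DAYS_PER_YEAR = 365.25  # average year length, including leap years


def years_of_daily_partitions(max_partitions):
    """How many years of data fit if every day gets its own partition."""
    return max_partitions / DAYS_PER_YEAR


if __name__ == "__main__":
    print(round(years_of_daily_partitions(1000), 2))    # 2.74
    print(round(years_of_daily_partitions(15000), 2))   # 41.07
```

At the 1,000-partition limit, daily partitioning runs out in about 2.7 years. At 15,000 it lasts roughly four decades.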

19 More better parallelism in SQL 2008 In 2005, a query touching more than one partition typically had only one thread per partition. In 2008, the Partitioned Table Parallelism improvement allows multiple threads to be used on each partition for parallel plans. Partition 1! Partition 2! Partition 1! Partition 2!

20 Lock escalation AUTO Lock escalation can be set to AUTO for a table. If the table is partitioned, locks will escalate to the partition level rather than the table level. What’s awesome: greater concurrency! The gotcha: partition level deadlocks. Test your workload.

21 Partition aware seeks In SQL 2008, the optimizer has been made more clever and has a greater chance at achieving partition elimination. This has been done by: –Changing the internal representation of a partitioned table to be more optimized for seeking on the PartitionID (even when the table’s CX is on another column) –A “skip scan” operation has been added to allow the optimizer greater flexibility. More optimized optimizin.

22 Coming soon: Columnstore! “In the Denali release, tables with columnstore indexes can’t be updated directly using INSERT, UPDATE, DELETE, and MERGE statements, or bulk load operations. To move data into a columnstore table you can switch in a partition, or disable the columnstore index, update the table, and rebuild the index. Columnstore indexes on partitioned tables must be partition-aligned.” Eric N. Hanson, “Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0”

23 Index rebuilds and compression Partitions can be rebuilt individually. Individual partitions cannot be rebuilt online. The entirety of a partitioned index can be rebuilt online. Individual partitions can be compressed. For fact tables with archive data, older partitions can be rebuilt once with compression. Their filegroups can then be made read-only. Is there a benefit to a read-only filegroup?

24 Be careful with your statistics Statistics are not maintained per partition. Filtered statistics can be used to help with this in 2008: you can create new filtered statistics for your new partition. Consider using a combination of tables (partitioned and non-partitioned) with partitioned views. Limit the size of partitioned tables where possible.

25 Locking Switching in and switching out require schema modification locks (SCH-M). This means you can be blocked by long readers, and you’ll be the only one partying with your partitioned table, if only for an instant.

26 Keep an eye on Connect bugs

27 Switching feature compatibility Works with replication in 2008 and later –Some subscribers can have the partitioning scheme, others don’t have to –This means you can have some subscribers on Standard. Works with Change Data Capture (with some special steps). Does not work with Change Tracking. @SQLFool replicates her partitioned tables; check out her blog.

28 Logshipping/Mirroring Logshipping and mirroring have no compatibility requirements/problems with partitioning specifically. Be aware that if you are going to start using filegroups distributed on different drives, this will impact the configuration of your logshipping secondaries or mirrors, and you must plan appropriately.

29 2008R2 improvements Improved performance on partition merges. Details here: http://blogs.msdn.com/b/sqlserverstorageengine/archive/2010/02/03/performance-improvement-by-orders-of-magnitude-when-merging-partitions-in-sql-server-2008r2.aspx

30 AGENDA 1.You be the consultant: should this client partition? 2.What is partitioning? 3.Demo 4.Gotchas / Goodies 5.How to: your guide to implementing partitioning. 6.What’s your decision for our client? 7.Resources

31 Question 1: Is data management a significant problem for availability? YES Partitioning may drastically improve availability by allowing you to fine-tune your maintenance. Keep checking other factors. NO You may have other reasons to partition. Management includes index rebuilds, backups, loading data, and deleting data.

32 Question 2: Are query patterns defined by regions? YES Finding regions of data which are queried together and have a good partitioning key is important to good query performance. This is the basis of partition elimination. NO You may not have a good partitioning key. Keep looking at the query patterns for your workload and evaluating different partitioning keys. Data regions may be dates, integers, or codes.
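Partition elimination can be pictured as working out which partitions a filter on the partitioning key could possibly touch. The Python sketch below is a simplification of what the optimizer does. It assumes a RANGE RIGHT function with three yearly boundary points chosen for illustration, and the function names are invented.

```python
from bisect import bisect_right
from datetime import date

# Illustrative yearly boundary points; RANGE RIGHT is assumed.
BOUNDARIES = [date(2009, 1, 1), date(2010, 1, 1), date(2011, 1, 1)]


def partitions_to_read(start, end, boundaries=BOUNDARIES):
    """1-based partitions that can hold rows with start <= key <= end."""
    if start > end:
        return []
    first = bisect_right(boundaries, start) + 1
    last = bisect_right(boundaries, end) + 1
    return list(range(first, last + 1))


def partitions_without_key_filter(boundaries=BOUNDARIES):
    """With no usable filter on the key, every partition must be read."""
    return list(range(1, len(boundaries) + 2))


if __name__ == "__main__":
    print(partitions_to_read(date(2010, 3, 1), date(2010, 9, 30)))   # [3]
    print(partitions_to_read(date(2009, 12, 1), date(2010, 1, 31)))  # [2, 3]
    print(partitions_without_key_filter())                           # [1, 2, 3, 4]
```

A report filtered to part of 2010 reads one partition out of four. A query that never mentions the partitioning key reads all of them, which is why the query patterns matter as much as the key.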

33 Question 3: Can applications and queries be optimized for partitioning? YES You will be able to rewrite some queries and procedures as needed to take advantage of partition elimination. NO If you do not have the ability to tune user and application queries, some will likely perform very poorly. Some assembly required.

34 1. Implementing Partitioning: Planning Identify poorly performing queries Identify critical workloads Identify ways to reduce load –Caching –Multiple read servers / load balancing Create your Strategery: –Number of partitioned tables Choose partitioning key Identify if a partitioned heap is appropriate Identify aligned/non-aligned NCIs –Number of partitioned views –Application level implications

35 2. Implementing Table Partitioning: Pre-Tuning Resolve major performance problems –When rewriting queries, think about the partitioning key you’re going to use. Baseline system at “good” performance Establish agreement on responsibilities and cooperative approach between development and DBAs. –Practice your system for detecting, triaging, and resolving performance problems.

36 3. Implementing Table Partitioning: Sizing Optimize disk configuration/ plan filegroups –Older data may go on slower filegroups –Some partitions may be compressed or read-only Automate and test procedures for switching in/switching out –Safety mechanisms (check to make sure you’re only merging the right boundary points, etc) –Use check constraints to exclude values that fall outside acceptable ranges at the outermost partitions on the table. –Monitoring and reporting Plan changes to maintenance –Changes to index maintenance (partition level/skipping read only partitions) –Changes to backup plan (if applicable)
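One possible shape for the "only merge the right boundary points" safety mechanism is a guard that runs before any merge. The Python sketch below encodes an assumed policy, not one stated in the deck: merge only the oldest boundary point, and only when both partitions next to it are already empty (for example, after their data has been switched out). The name `safe_to_merge` and the row counts are hypothetical.

```python
def safe_to_merge(boundary, boundaries, row_counts):
    """Decide whether merging `boundary` is allowed under the assumed policy.

    boundaries: sorted list of current boundary points.
    row_counts: rows per partition, in partition order; must have
                len(boundaries) + 1 entries.
    """
    if len(row_counts) != len(boundaries) + 1:
        raise ValueError("expected one row count per partition")
    # Policy (an assumption): only the oldest boundary may be merged.
    if not boundaries or boundary != boundaries[0]:
        return False
    # Both partitions adjacent to the oldest boundary must be empty,
    # so the merge does not have to move any rows.
    return row_counts[0] == 0 and row_counts[1] == 0


if __name__ == "__main__":
    boundaries = [2009, 2010, 2011]
    print(safe_to_merge(2009, boundaries, [0, 0, 500, 900]))    # True
    print(safe_to_merge(2009, boundaries, [0, 120, 500, 900]))  # False
    print(safe_to_merge(2010, boundaries, [0, 0, 500, 900]))    # False
```

A real automation job would read the boundary points and row counts from the server's metadata before deciding. The point of the guard is that a merge against the wrong boundary, or against a partition that still holds rows, gets refused.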

37 4. Implementing Table Partitioning: Release Ideal world: Create two full-powered SQL instances with full sets of data: one partitioned, one not. –Test workload prior to release –Swap partitioned dataset in/out of production –Compare performance for problematic queries Cheaper world: Create release plan to partition on live instance –Create a strong rollback plan (as for any change) –Try to reload the data into new tables on new filegroups and maintain the old data (particularly if you haven’t been able to test at scale). Use this for verifying “new world” vs “old world”. This means you have the space to roll back, even if you need to reload data into the older tables. If using this method, test every part of the release (including maintenance changes) prior to release, at minimum on a scaled system.

38 AGENDA 1.You be the consultant: should this client partition? 2.What is partitioning? 3.Demo 4.Gotchas / Goodies 5.How to: your guide to implementing partitioning. 6.What’s your decision for our client? 7.Resources

39 So, should this client use partitioning? Client has a reporting system containing both fact and dimension tables. –Fact tables are up to 300GB in size (including all indexes) with up to 1 billion rows. –Dimension tables are up to 200GB in size (including all indexes) with up to 200 million rows. A middle tier application dynamically executes queries to run reports custom-designed by clients.

40 AGENDA 1.You be the consultant: should this client partition? 2.What is partitioning? 3.Demo 4.Gotchas / Goodies 5.How to: your guide to implementing partitioning. 6.What’s your decision for our client? 7.Resources

41 THANK YOU! For attending this session and PASS SQLRally, Orlando, Florida. Presented by Dell

42 Please Complete the Evaluation Form Pick up your evaluation form: In each presentation room. Drop off your completed form: Near the exit of each presentation room, or at the registration area.

43 Links / Contact There is a huge amount of documentation for table partitioning. It’ll overwhelm you in a heartbeat. Get my comprehensive list and recommendations on where to start: http://littlekendra.com/resources/partition/ Twitter: @kendra_little kendra@BrentOzar.com http://LittleKendra.com http://BrentOzar.com Please fill out your evaluation form! © Kendra Little 2011

