René Balzano Technology Solution Professional Data Platform Microsoft Switzerland SQL Server Performance Programming.


1 René Balzano Technology Solution Professional Data Platform Microsoft Switzerland SQL Server Performance Programming

2 This session is about:
  How to design databases and T-SQL code in a way that helps achieve good performance
  How to monitor and analyze what might decrease the performance of your database and application
This session does not contain: C#, EF, ODBC, Visual Studio, German, French, etc.

3 Help your DBA
The DBA who runs the database you have programmed can compensate for many design flaws and improve performance without touching your code.
You don’t want to depend on a DBA’s skill and attention when it comes to defining your application’s performance.
Design your database and the interaction of your application with it in the best possible way, so that your database performs well even without a DBA’s intervention.
To design for optimal and less DBA-dependent performance, it helps to understand what goes on under the hood of SQL Server.
So let’s have a look…

4 Demo… Scenario #1

5 Page Splits and Fragmentation: Choosing the right keys

6 Why a Clustered Index?
Heap: unordered. Each record is identified by its RID (file#:page#:position#).
This example (simplified): 4000 records, 550 bytes/record, 14 records/page, 286 pages (#121 - #407), 8096 bytes of data per page.
[Diagram: heap pages file #1:page #121 (1st page of this table), #122 … #407 (286th page of this table). Sample records as RID | ID | Name | … | Date: 121:1 | 1 | Huber | … | 20070502; 121:2 | 5 | Meier | … | 20010219; 121:13 | 22 | Keller | … | 19811111; 122:13 | 18 | Djuric | … | 19811121.]
Nonclustered (secondary) index on a heap: ordered by the index key (Date). The record pointer is the RID (file#:page#:position#).
This example (simplified): 4000 index records, 16 bytes/record, 506 records/page, 9 index pages: 1 index b-tree page (#2131) + 8 index leaf pages (#2132 - #2139), 8096 bytes of data per page.
[Diagram: index page file #1:page #2131 (1st page of this index) points to the leaf pages, e.g. 19620522 -> 1:2132 and 20010319 -> 1:2139. Leaf pages #2132 (2nd page of this index) to #2139 (9th page of this index) map Date to RID, e.g. 19811111 -> 1:121:13, 19811121 -> 1:122:13, 20010319 -> 1:122:2.]
When a record in a heap moves to a different disk location, its entry has to be updated in ALL secondary indexes, resulting in increased disk activity and reduced performance for other tasks.
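The page counts in this simplified example can be re-derived with a little arithmetic. The query below is only a sketch of that calculation; it assumes that the record size on the slide is meant as 550 bytes and that a page holds 8096 bytes of data, as stated there.

```sql
-- Sketch: re-derive the slide's simplified page counts.
-- Assumes 550 bytes per data record and 8096 usable bytes per page.
SELECT
    8096 / 550            AS data_records_per_page,  -- integer division
    CEILING(4000 / 14.0)  AS data_pages,             -- 4000 records, 14 per page
    8096 / 16             AS index_records_per_page, -- 16 bytes per index record
    CEILING(4000 / 506.0) AS index_leaf_pages;       -- 4000 index records
```

The four columns come out as 14, 286, 506 and 8, which matches the 286 data pages and 8 index leaf pages (plus 1 b-tree page) quoted on the slide.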

7 Why a Clustered Index?
Clustered index: ordered by the clustering key (ID). Each record is identified by its clustering key.
This example (simplified): 4000 data records, 550 bytes/record, 14 records/page, 286 leaf (data) pages (#121 - #407); 4000 index records, 16 bytes/record, 506 records/page, 9 index pages: 1 index b-tree page (#112) + 8 index leaf pages (#113 - #120). Total of 295 pages for this clustered index (8096 bytes of data per page).
[Diagram: index page file #1:page #112 (1st page of this clustered table) points to the index leaf pages, e.g. 1 -> 1:113 and 22 -> 1:120. Index leaf pages #113 (2nd page of this clustered table) to #120 (9th page) map ID to data location, e.g. 1 -> 1:121:1, 2 -> 1:121:2, 9 -> 1:122:4, 22 -> 1:407:1, 49 -> 1:407:5; duplicate IDs are shown as 5.1 and 5.2, 15.1 and 15.2. Data pages #121 (10th page of this table) to #407 (295th page) hold the records in ID order, e.g. 121:1 | 1 | Huber | … | 20070502; 121:2 | 2 | Amsler | … | 20080502; 407:1 | 22 | Keller | … | 19811111.]

8 Why a Clustered Index?
Nonclustered (secondary) index on a clustered table (index): ordered by the index key (Date). The record pointer is the clustering key (ID). Same size as in the previous example.
[Diagram: index page file #1:page #2131 (1st page of this index) points to the leaf pages #2132 (2nd page) to #2139 (9th page). Leaf entries map Date to ID, e.g. 19620522 -> 15, 19811111 -> 22, 19811121 -> 18, 19880502 -> 6, 20020722 -> 49, 20070502 -> 1. The clustered index then maps ID to location, e.g. 1 -> 1:121:1, 22 -> 1:407:1.]
When a record in a clustered table moves to a different disk location, its entry only has to be updated in the ONE clustered index; no secondary index has to be touched, and no extensive disk activity results.
Clustered table (index): avoid secondary updates when changing data.

9 Why a short Clustering Key?
[Diagram: leaf pages of a secondary index, file #1:page #2132 (2nd page of this index) to #2230 (99th page of this index). Each Date entry carries a long clustering key, e.g. 19620522 -> 0813A496-817E-43DB-B01B-B7C5B0EDFA70 | Glauser | Peter; 19811111 -> AEE9226F-0796-4D02-9C62-44B0FBCFB15B | Keller | Klara; 19811121 -> 9E2EF8A9-C1F5-4824-96B8-3669CF8FC875 | Djuric | Vladimir.]
Large clustering keys result in every secondary index being larger, thus increasing disk activity (number of pages to read from disk). Eventually this leads to additional levels in the B-tree, adding one or more extra IOs to EVERY read or update operation in EVERY secondary index.
Avoid large secondary indexes and large numbers of disk IOs.

10 Why a monotonically growing Clustering Key?
[Diagram: a full index page sorted by Date (19620522, 19811111, 19811121, 19880502, …). Inserting 19850101 between 19811121 and 19880502 leads to a page split.]
Page splits occur when new data has to be inserted into full, ordered pages. A page split results in increased disk activity. An index (including the clustered table) in which many page splits have occurred is fragmented: pages with consecutive ordered data are spread over the disk, resulting in slower IO operations.
If a clustering key’s values don’t grow monotonically, page splits occur on the base table, with a large negative impact on IO performance during writes and reads (fragmentation).
Avoid page splits.

11 Indexes and Performance
To minimize disk activity when inserting and updating data and to reduce the number of disk IOs when reading data (= keep fragmentation low):
  Always have a clustered index
  Define clustering keys that are small, monotonically growing, and unchanging in value
  Be prepared to rebuild indexes when they show fragmentation

12 Clustering Keys
Great:
  int with IDENTITY clause (SEQUENCE in Denali)
  date or datetime, e.g. a timestamp value
To be avoided:
  GUID, UNIQUEIDENTIFIER (or at least use NEWSEQUENTIALID)
  varchar fields, at least those that aren’t tiny
  composite keys with multiple fields
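The recommendations above can be sketched in T-SQL roughly as follows. The table, column, and constraint names are hypothetical and only serve as an illustration; the second table shows the "at least use NEWSEQUENTIALID" fallback for cases where a GUID key cannot be avoided.

```sql
-- Sketch with hypothetical names: a small, ever-increasing, unchanging clustering key.
CREATE TABLE dbo.PersonDemo (
    PersonID  int IDENTITY(1,1) NOT NULL,
    LastName  varchar(50)       NOT NULL,
    BirthDate date              NOT NULL,
    CONSTRAINT PK_PersonDemo PRIMARY KEY CLUSTERED (PersonID)
);

-- A secondary index on this table carries the 4-byte PersonID as its row locator.
CREATE NONCLUSTERED INDEX IX_PersonDemo_BirthDate
    ON dbo.PersonDemo (BirthDate);

-- If a GUID key is required, NEWSEQUENTIALID() (allowed only in a DEFAULT
-- constraint) generates increasing values instead of random ones.
CREATE TABLE dbo.PersonGuidDemo (
    PersonGuid uniqueidentifier NOT NULL
        CONSTRAINT DF_PersonGuidDemo_PersonGuid DEFAULT NEWSEQUENTIALID(),
    LastName   varchar(50)      NOT NULL,
    CONSTRAINT PK_PersonGuidDemo PRIMARY KEY CLUSTERED (PersonGuid)
);
```

Even with sequential values, the 16-byte GUID is still repeated in every secondary index, so the int key remains the smaller choice.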

13 Monitoring Fragmentation
Check the fragmentation of your indexes (tables): select * from sys.dm_db_index_physical_stats (db_id(),null,null,null,null)
Goal: as low as possible; reorganize above 10%, rebuild above 30%.
Check page density: DBCC SHOWCONTIG
Goal: as high as possible (also depends on record size and fill factor).
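A slightly more targeted version of the fragmentation check might look like the sketch below. It joins the DMV to sys.indexes to show index names and applies the 10% threshold from the slide; the index and table names in the ALTER INDEX statements are hypothetical placeholders.

```sql
-- Sketch: list indexes in the current database above 10% fragmentation.
-- 'SAMPLED' mode also fills avg_page_space_used_in_percent (page density).
SELECT OBJECT_NAME(ips.object_id)          AS table_name,
       i.name                              AS index_name,
       ips.index_type_desc,
       ips.avg_fragmentation_in_percent,
       ips.avg_page_space_used_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id  = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 10
ORDER BY ips.avg_fragmentation_in_percent DESC;

-- Hypothetical index and table names; pick one statement per the thresholds above.
ALTER INDEX IX_PersonDemo_BirthDate ON dbo.PersonDemo REORGANIZE; -- above 10%
ALTER INDEX IX_PersonDemo_BirthDate ON dbo.PersonDemo REBUILD;    -- above 30%
```

Very small indexes often report high fragmentation that has no practical effect, so page_count is worth looking at before acting.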

14 Monitoring Page Splits
Relevant if your application is heavily inserting and you have many secondary indexes.
Performance Monitor counter: MSSQL$instancename:Access Methods, Page Splits/sec
Goal: a low and flat curve. If not, check your clustered index key and the number of secondary indexes.
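The same counter can also be read from T-SQL, which may be handy when Performance Monitor is not at hand. This is a minimal sketch; note that the value exposed by the DMV is cumulative since the instance started, so a rate has to be computed from two samples.

```sql
-- Sketch: read the Page Splits/sec counter from inside SQL Server.
-- cntr_value is cumulative; sample twice and divide the difference
-- by the elapsed seconds to get an actual per-second rate.
SELECT object_name,
       counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page Splits/sec'
  AND object_name LIKE '%Access Methods%';
```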

15 Demo… Scenario #2

16 Parallelism: Don’t be afraid of the CXPACKET

17 SQL Server tries to parallelize over all available cores (minus 1) by default.
Parallelism is generally great for querying, but not necessarily so in OLTP settings.
Be careful:
  Seeing CXPACKET wait stats often leads programmers to use MAXDOP 1 to avoid parallelization.
  CXPACKET waits are not necessarily bad; they occur in most ‘healthy’ parallelization settings too.
  If SQL Server parallelizes wrongly (so that you would see high numbers for CXPACKET and use MAXDOP 1), this could also be due to bad indexing or outdated statistics.
Still:
  Parallelizing generates overhead for splitting up the workload and later recombining the results.
  In certain settings (usually OLTP with many writes) MAXDOP 1 improves performance (e.g. it is the recommended server-wide setting for SharePoint’s SQL Server configuration).
Since as a developer you won’t know what the DBA sets with sp_configure, consider using the MAXDOP clause in your code when you know that parallelism isn’t useful.
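As an illustration of the last point, a query-level hint could look like the sketch below. The table dbo.PersonDemo is a hypothetical example; the first statement merely shows one way to look up the server-wide setting without changing it.

```sql
-- Sketch: see what the server-wide setting currently is (read-only lookup).
SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'max degree of parallelism';

-- Hypothetical table: force a serial plan for this one statement only,
-- regardless of the server-wide setting.
SELECT LastName, COUNT(*) AS person_count
FROM dbo.PersonDemo
GROUP BY LastName
OPTION (MAXDOP 1);
```

The hint applies to the single statement it is attached to, so other queries in the same batch keep whatever degree of parallelism the server allows.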

18 Monitoring Parallelism
If you think that parallelism might be the source of a performance bottleneck:
  See if your query plan is a parallel one
  Check CXPACKET values in sys.dm_os_wait_stats
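The second check can be done with a simple query such as the following sketch. The numbers are cumulative since the last restart (or since the wait statistics were last cleared), so they are best compared over time rather than read in isolation.

```sql
-- Sketch: cumulative CXPACKET waits for the whole instance.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'CXPACKET';
```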

19 Demo… Scenario #3

20 Don’t denormalize: From Cubes to Columnstore, life gets easier

21 Denormalization
Many techniques exist to denormalize a technical data model for improved performance:
  Cubes with pre-calculated aggregates
  Temporary tables with redundant copies of values from related tables and pre-calculated aggregates
  Denormalized technical models with redundant values
Technologies exist and evolve that make denormalization less necessary:
  Indexed views can replace temporary tables
  Compressed tables and indexes improve performance per se
  Light indexing based on columnar storage can further improve performance without touching the base tables or indexing them

22 Indexed Views
An index on a view persists the view’s data content to disk.
Using indexed views instead of data ‘manually’ copied to redundant temporary tables or replicated columns relieves you from maintaining the redundant objects.
Using temporary objects in stored procedures can lead to increased recompiles.
Since updates represent only a single-digit percentage of all operations for the majority of database applications, the update overhead for an additional index (on the view) is very often negligible.
Mind the schema-binding requirements for indexed views.
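A minimal sketch of an indexed view that replaces a manually maintained aggregate table is shown below. All object names are hypothetical. It assumes the session uses the SET options that indexed views require (the SSMS defaults normally satisfy them), and GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Sketch with hypothetical names: a base table and an aggregate view on it.
CREATE TABLE dbo.Sales (
    SaleID int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    Region varchar(20)       NOT NULL,
    Amount decimal(10,2)     NOT NULL   -- SUM() in an indexed view needs a non-nullable input
);
GO

-- SCHEMABINDING and two-part object names are mandatory for indexed views.
CREATE VIEW dbo.vSalesByRegion
WITH SCHEMABINDING
AS
SELECT s.Region,
       SUM(s.Amount) AS TotalAmount,
       COUNT_BIG(*)  AS RowCnt        -- required when the view uses GROUP BY
FROM dbo.Sales AS s
GROUP BY s.Region;
GO

-- The unique clustered index is what persists the view's result to disk.
CREATE UNIQUE CLUSTERED INDEX IX_vSalesByRegion
    ON dbo.vSalesByRegion (Region);
```

Once the index exists, SQL Server keeps the aggregate up to date on every change to dbo.Sales, which is the maintenance work the slide says you no longer do by hand.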

23 Compression
Database compression is a feature of SQL Server 2008’s Enterprise Edition and above.
Compressing indexes and tables improves performance substantially, since a smaller number of disk pages have to be accessed.
For medium and large databases even processor load goes down, since fewer pages have to be maintained, reducing management overhead.
Compression is transparent for any application; you don’t have to touch any code when you start using compression.
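Turning compression on is a storage-level change, as the following sketch illustrates. The table dbo.Sales is a hypothetical example, and the choice of PAGE compression is only an assumption for the illustration (ROW is the lighter alternative).

```sql
-- Sketch: estimate the savings first, for a hypothetical table dbo.Sales.
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'Sales',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

-- Compress the table (heap or clustered index) ...
ALTER TABLE dbo.Sales REBUILD WITH (DATA_COMPRESSION = PAGE);

-- ... and all of its indexes.
ALTER INDEX ALL ON dbo.Sales REBUILD WITH (DATA_COMPRESSION = PAGE);
```

No query against dbo.Sales needs to change afterwards, which is what "transparent for any application" means in practice.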

24 Columnstore Indexes
SQL Server 2012 introduces columnstore indexes, based on Vertipaq technology (PowerPivot).
Columnstore indexes speed up data access hugely through:
  A new storage architecture (columnar instead of row-wise)
  Much higher compression than previously
  Highly improved access algorithms
A single columnstore index on a table covers any query that is run against that table.
You no longer need to create and maintain a separate covering index for every important query, making you less dependent on your DBA to rebuild them etc.
The original technical data model performs much better, without denormalization and without the use of extensive indexing (-> light indexing, see PDW).
Caveat in the first release: the base table under a columnstore index will be read-only (workarounds exist).
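Creating such an index takes a single statement, sketched below for a hypothetical table dbo.Sales with the columns Region and Amount. The disable-and-rebuild sequence at the end is one commonly described way to load data despite the read-only restriction; it is shown as an assumption, not necessarily the workaround the speaker had in mind.

```sql
-- Sketch with hypothetical names: one columnstore index over the queried columns.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales
    ON dbo.Sales (Region, Amount);

-- Possible workaround for the read-only base table in the first release:
-- disable the index, modify the data, then rebuild the index.
ALTER INDEX NCCI_Sales ON dbo.Sales DISABLE;

INSERT INTO dbo.Sales (Region, Amount) VALUES ('WA', 100.00);

ALTER INDEX NCCI_Sales ON dbo.Sales REBUILD;
```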

25 How columnstore speeds up queries
Example table:
  ID  Name  City      State
  1   John  Seattle   WA
  2   Jane  Redmond   WA
  3   Jill  Redmond   OR
  4   Jane  Bellevue  WA
Row store (the values of each row are stored together):
  1 John Seattle WA | 2 Jane Redmond WA | 3 Jill Redmond OR | 4 Jane Bellevue WA
Column store (the values of each column are stored together):
  1 2 3 4 | John Jane Jill Jane | Seattle Redmond Redmond Bellevue | WA WA OR WA

26 How columnstore speeds up queries
Fetches only the needed columns from disk:
  Less IO
  Better buffer hit rates
[Diagram: columns C1 to C6 stored separately; SELECT region, sum (sales) … reads only the columns it needs.]

27 How columnstore speeds up queries
Advanced query processing technology:
  Batch mode execution of some operations
  Processes column data in batches
  Groups of batch operations in the query plan
  Compact data representation
  Highly efficient algorithms
  Better parallelism

28 How columnstore speeds up queries
Column segment:
  A segment contains values from one column for a set of rows
  Segments for the same set of rows comprise a row group
  Segments are compressed
  Each segment is stored in a separate LOB
  The segment is the unit of transfer between disk and memory
[Diagram: columns C1 to C6 divided into segments.]

29 By the way: Recompilation
Stored procedures are recompiled automatically if relevant information was not available to the optimizer when they were last compiled, or if such information has changed in the meantime (e.g. the structure of a referenced table).
Recompiles have a negative impact on performance, increasing processor load and blocking (objects are locked during compilation).
Make sure that you place DDL in a block at the beginning of a stored procedure, and use temporary objects and SET statements defensively.
Use Profiler (SP:Recompile event class) to analyze the reasons for recompilations.
SQL Server 2008 and 2012 reduce recompiles considerably too.

30 Demo… Scenario #4

31 Locking and Blocking: Manage your isolation levels actively

32 Transaction Isolation
The transaction isolation level defines:
  how you want to access data that is in use by others
  how you want others to be restricted when accessing the same data that you are using
The transaction isolation level is a property of your database connection. It is usually defined as a default of the client application or library (e.g. Tools-Options in SSMS).
The default setting generally is READ COMMITTED. With this setting:
  you prevent some operations for others while you’re in a transaction
  you often wait unnecessarily, even if you just read uncritical and unchanging data (e.g. for reporting and data warehousing)
Frequently waiting for locks to be lifted (being blocked) makes applications slow; they are just waiting all the time.
Consider:
  setting your isolation level to READ UNCOMMITTED for reads in uncritical situations
  using SQL Server 2008’s READ COMMITTED SNAPSHOT ISOLATION mode (with this you depend on your DBA; it also implies increased load for tempdb)
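Both options can be sketched in a few lines of T-SQL. The table dbo.Sales and the database name YourDatabase are hypothetical; the ALTER DATABASE statement is left commented out because it is a database-wide change that would normally be made by the DBA.

```sql
-- Sketch: show the isolation level of the current connection.
DBCC USEROPTIONS;

-- Read without waiting for other sessions' locks (dirty reads are possible),
-- here for an uncritical report on a hypothetical table.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT Region, SUM(Amount) AS TotalAmount
FROM dbo.Sales
GROUP BY Region;

-- Return to the usual default for the rest of the session.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Database-wide alternative (DBA task, uses row versions kept in tempdb):
-- ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;
```

The SET statement only affects the connection that issues it, which is why it is the option a developer can apply without involving the DBA.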

33 Monitoring for Blocking
Via SSMS:
  sp_lock
  sys.dm_tran_locks (allows WHERE)
  sys.dm_os_wait_stats (allows WHERE)
Via Performance Monitor:
  MSSQL$yourinstance:Locks, Lock Waits/sec etc.
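Because the two DMVs accept a WHERE clause, they can be narrowed down to actual blocking, for example as in this sketch:

```sql
-- Sketch: lock requests that are currently waiting, i.e. blocked sessions.
SELECT request_session_id,
       resource_type,
       resource_database_id,
       request_mode,
       request_status
FROM sys.dm_tran_locks
WHERE request_status = 'WAIT';

-- Cumulative lock waits by lock mode; [_] matches a literal underscore.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'LCK[_]M[_]%'
ORDER BY wait_time_ms DESC;
```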

34 Demo… Scenario #5

35 Varchar fields: Beware of growing their content

36 As long as their content is short, varchar fields are placed on the same disk page as the rest of their record.
If an existing varchar field value is updated to a longer value that no longer fits on the same page, it is offloaded to a separate disk area, with a link remaining on the original page.
This operation creates additional disk IOs that will impact the database’s performance (fragmentation could stay beneath the common 30% threshold for rebuilds).
If your application follows the habit of
  first creating a new record with empty or default values,
  then reading back the default values and doing some additional stuff on the client,
  and finally updating the new record to its final values,
you might be doing just that by default…
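The insert-then-update habit described here, and a simple alternative, could look like the following sketch. The table dbo.NoteDemo and its columns are hypothetical and only serve as an illustration.

```sql
-- Sketch with hypothetical names.
CREATE TABLE dbo.NoteDemo (
    NoteID int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    Title  varchar(200)      NOT NULL,
    Body   varchar(8000)     NOT NULL
);

-- Pattern to watch out for: the row is created tiny and grows afterwards,
-- so it may no longer fit where it was first placed.
INSERT INTO dbo.NoteDemo (Title, Body) VALUES ('', '');

UPDATE dbo.NoteDemo
SET Title = 'Final title',
    Body  = REPLICATE('x', 4000)
WHERE NoteID = SCOPE_IDENTITY();

-- Alternative: prepare the values on the client and insert the row
-- once, with its final size.
INSERT INTO dbo.NoteDemo (Title, Body)
VALUES ('Final title', REPLICATE('x', 4000));
```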

37 There is more to say
Watch out for additional performance topics at http://blogs.technet.com/b/swisssql

38 Review
Page Splits and Fragmentation: Choosing the right keys
Parallelism: Don’t fear the CXPACKET
Don’t denormalize: Indexed views, compression and columnstore
Locking and Blocking: Manage your isolation levels actively
Varchar fields: Beware of growing their content

39 All the rest (at least part of it)
Dynamic SQL: Do you cache your plans?
Plan caching: How to help your cache by design
The cursor: Your enemy?
WHERE are your SARGs?
Watch out for these topics at http://blogs.technet.com/b/swisssql

40 The Tools
DMV: select sys.dm_ (IntelliSense will help you further)
SSMS settings: SET STATISTICS IO ON
Profiler (watch out for Extended Events)
Performance Monitor
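Two of these tools can be tried directly from a query window, as in the sketch below. The table dbo.PersonDemo is a hypothetical example; any query of your own can take its place.

```sql
-- Sketch: list the available dynamic management views and functions
-- (an alternative to browsing them with IntelliSense).
SELECT name, type_desc
FROM sys.system_objects
WHERE name LIKE 'dm[_]%'
ORDER BY name;

-- Show logical and physical reads per table for the statements that follow.
SET STATISTICS IO ON;

SELECT COUNT(*) AS person_count
FROM dbo.PersonDemo;   -- hypothetical table

SET STATISTICS IO OFF;
```

The read counts appear on the Messages tab in SSMS, not in the result grid.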

41 Please help us make TechDays even better by evaluating this session. Thank you! Give us your feedback!

42 © 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

