SQL Server Statistics. SREENI JULAKANTI, MCTS, MCITP, MCP. SQL Server Database Administration.


1 SQL Server Statistics. SREENI JULAKANTI, MCTS, MCITP, MCP. SQL Server Database Administration.

2 Agenda. The SQL Server statistics object. Statistics creation. Statistics update and auto-update scenarios. Execution plans with data skew issues. Solutions for skewed data distribution.

3 What are SQL Server statistics? Statistics for query optimization are objects that contain statistical information about the distribution of values in one or more columns of a table. Statistics header: describes the properties of the statistics object. Density vector: the query optimizer uses densities to enhance cardinality estimates for queries that return multiple columns from the same table. Density is 1 / (number of distinct values), so a lower density means a more selective column.
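
To make the density vector concrete, here is a minimal Python sketch that computes one density per leading-column prefix, the way the density vector lists them, and turns each into an average rows-per-value estimate. The `density_vector` helper and the (state, city) sample rows are hypothetical illustrations, not SQL Server code or output.

```python
def density_vector(rows, n_cols):
    """Return 1 / (distinct count) for each leading-column prefix of the rows."""
    densities = []
    for width in range(1, n_cols + 1):
        distinct_prefixes = {row[:width] for row in rows}
        densities.append(1.0 / len(distinct_prefixes))
    return densities


# Hypothetical (state, city) rows standing in for two statistics columns.
rows = [
    ("TX", "Austin"), ("TX", "Austin"), ("TX", "Dallas"),
    ("CA", "Fresno"), ("CA", "Fresno"), ("CA", "Oakland"),
    ("NY", "Albany"), ("NY", "Albany"),
]

densities = density_vector(rows, 2)
# Average rows per distinct value = density * total row count.
estimates = [d * len(rows) for d in densities]

print(densities)  # density for (state), then for (state, city)
print(estimates)  # average rows per distinct prefix value
```

With 3 distinct states and 5 distinct (state, city) pairs in 8 rows, the densities are 1/3 and 1/5. An estimate built this way is only an average, which is why the histogram matters when values are unevenly distributed.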

4 What are SQL Server statistics? Histogram: a histogram measures the frequency of occurrence for each distinct value in a data set. The query optimizer computes a histogram on the column values in the first key column of the statistics object. A histogram is limited to 200 steps. Cardinality: the number of rows processed by each operator in a query plan. The cardinality estimator works from the query's predicates (filter and join conditions).

5 Histogram

6 Statistics: stats header, density vector, histogram

7 Histogram

8 The range of a histogram step includes all possible column values between boundary values, excluding the boundary values themselves. The lowest of the sorted column values is the upper boundary value for the first histogram step.
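
The step rules above can be sketched in Python as a lookup for an equality predicate. This is a simplified illustration under stated assumptions, not the optimizer's actual algorithm: the `Step` class, the `estimate_equal` function and the three sample steps are hypothetical, although the field names mirror the histogram columns shown by DBCC SHOW_STATISTICS (RANGE_HI_KEY, EQ_ROWS, RANGE_ROWS, DISTINCT_RANGE_ROWS, AVG_RANGE_ROWS). Returning 0 for values outside the histogram is also a simplification.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Step:
    range_hi_key: int         # upper boundary value of the step
    eq_rows: float            # rows equal to the boundary value
    range_rows: float         # rows strictly between the previous boundary and this one
    distinct_range_rows: int  # distinct values strictly inside the range

    @property
    def avg_range_rows(self) -> float:
        # Average rows per distinct value inside the range (1 when the range is empty).
        if self.distinct_range_rows == 0:
            return 1.0
        return self.range_rows / self.distinct_range_rows


def estimate_equal(steps, value):
    """Estimate rows for `column = value` from steps sorted by range_hi_key."""
    keys = [s.range_hi_key for s in steps]
    i = bisect_left(keys, value)
    if i == len(steps):
        return 0.0            # above the highest boundary (simplified)
    step = steps[i]
    if step.range_hi_key == value:
        return step.eq_rows   # exact hit on a boundary value
    if i == 0:
        return 0.0            # below the lowest value in the column
    return step.avg_range_rows  # inside a range: only an average is known


# Hypothetical histogram with three steps.
steps = [Step(10, 5, 0, 0), Step(20, 50, 90, 9), Step(40, 8, 38, 19)]

print(estimate_equal(steps, 20))  # boundary value: uses eq_rows
print(estimate_equal(steps, 15))  # inside (10, 20): uses avg_range_rows
```

The design point is that only boundary values carry an exact count; every value inside a range shares the same average.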

9 Query processing: statement -> syntax check (parsing) -> logical tree (binding) -> optimizer -> plan -> execution -> result. Optimizer inputs: estimated row count, hardware configuration, query hints, indexes, partitioning, filegroups/files.

10 Creating statistics. The different ways statistics are created: the query optimizer creates statistics on the key columns of an index when the index is created; the query optimizer creates statistics for single columns in query predicates; composite indexes create multi-column statistics; the sp_createstats stored procedure; the CREATE STATISTICS statement. Note: auto create will not work on read-only databases.

11 How do SQL Server statistics impact queries? Demo 1: a detailed look at how statistics are used, how SQL Server thinks about statistics, and how it comes to a conclusion.

12 Updating statistics. When statistics are updated: SQL Server updates statistics at the default threshold (500 + 20% of rows modified) when a query compiles for the first time, or when a query has an existing query plan but a statistic in the plan is out of date. Ways to update: the AUTO_UPDATE_STATISTICS option, the sp_updatestats stored procedure, the UPDATE STATISTICS command. Note: an index reorganize will not update the statistics linked to the index.

13 Updating statistics. An index rebuild also updates the statistics on that index. SQL Server updates statistics at the default threshold (500 + 20%) when a query compiles for the first time and a statistic used in the plan is out of date, or when a query has an existing query plan but a statistic in the plan is out of date.

14 When auto update kicks off. Demo 2

15 Skewed data distribution. Even distribution is easy (it is fairly consistent). Uneven distribution is harder: it is all over the place and varies over time as well. The histogram does a much better job, having steps and an average distribution per step, but what if there are well over 200 distinct values and millions of rows with heavy skew between steps? Simply put, the averages just aren't going to cut it anymore.
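
A small worked example shows why a per-step average breaks down under skew. The value counts below are hypothetical: five distinct values that all fall inside one histogram step, one of which holds nearly all the rows.

```python
from collections import Counter

# Hypothetical row counts for five values that share a single histogram step.
counts = Counter({101: 2, 102: 1, 103: 9_990, 104: 3, 105: 4})

range_rows = sum(counts.values())          # total rows inside the step
distinct_range_rows = len(counts)          # distinct values inside the step
avg_range_rows = range_rows / distinct_range_rows

# Ratio of the average-based estimate to the true count for each value:
# above 1 is an overestimate, below 1 is an underestimate.
estimate_ratio = {value: avg_range_rows / actual for value, actual in counts.items()}

for value, actual in sorted(counts.items()):
    print(value, "actual:", actual, "estimated:", avg_range_rows)
```

Every value in the step gets the same estimate of 2,000 rows. That is about a fifth of the real count for value 103 and 2,000 times too high for value 102, so a plan chosen for one is likely to be wrong for the other.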

16 Data distribution matters. Impact of skewed estimates: 1. Spills to disk (underestimates) 2. Index scan vs. index seek decisions 3. Inflated memory grants (overestimates) 4. Serial vs. parallel operations 5. Least selective tables joined first 6. Outer/inner input choice for joins 7. Inappropriate join algorithm selected

17 Statistics best practices. Demo 3

18 Skewed data distribution. Some solutions for skewed data distribution: OPTIMIZE FOR UNKNOWN; RECOMPILE; plan guides; filtered statistics; declaring local variables.

19 Statistics best practices. Keep AUTO_CREATE_STATISTICS ON by default. Keep AUTO_UPDATE_STATISTICS ON by default. Try AUTO_UPDATE_STATISTICS_ASYNC for large OLTP tables if auto updating is an issue. Eliminate any duplicate statistics. If you notice that estimated and actual row counts are off while troubleshooting, try updating the statistics. Make sure your data is not skewed (important).

20 Statistics maintenance. Schedule an UPDATE STATISTICS job, sampled or WITH FULLSCAN if feasible: nightly for highly modified OLTP tables, weekly for smaller, less frequently updated tables.

21 SQL Server 2012 algorithm. Auto update stats fires once 500 + 20% of a table's rows have changed. An improved threshold, SQRT(1000 * table rows), works much better for large tables; in SQL Server 2012 it is enabled with trace flag 2371, and it became the default in SQL Server 2016 at compatibility level 130. When auto update fires it uses the default sampling rate, which is calculated as follows. 1) If the table is smaller than 8 MB, the statistics are updated with a full scan. 2) If the table is larger than 8 MB, the sampling rate is reduced as the number of rows increases, so that not too much data is scanned. The rate is not a fixed value and the algorithm is not linear. Example: with 1,000,000 rows the sampling rate might be 30%, but when the row count grows to 8,000,000 it would drop to 10%. These automatic sampling rates are not under the DBA's control; the optimizer decides them.
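
The two thresholds on this slide are easy to compare with a few lines of Python. This is an arithmetic sketch only: the function names are hypothetical, and taking the smaller of the two values is an assumption based on how Microsoft documents the behavior for newer versions, not something stated on the slide.

```python
import math


def classic_threshold(table_rows):
    """Row modifications needed under the 500 + 20% rule."""
    return 500 + 0.20 * table_rows


def sqrt_threshold(table_rows):
    """Row modifications needed under the SQRT(1000 * table rows) rule."""
    return math.sqrt(1000 * table_rows)


def dynamic_threshold(table_rows):
    """Assumed combined rule: whichever threshold is reached first."""
    return min(classic_threshold(table_rows), sqrt_threshold(table_rows))


for n in (10_000, 1_000_000, 100_000_000):
    print(n, round(classic_threshold(n)), round(sqrt_threshold(n)))
```

For a 1,000,000-row table the classic rule waits for 200,500 modifications, while the square-root rule triggers after roughly 31,623. At 10,000 rows the classic rule (2,500) is still the smaller value, so the change mainly affects large tables.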

22 Q&A

