SQL Server Statistics and its relationship with Query Optimizer

Presentation transcript:

SQL Server Statistics and its relationship with Query Optimizer, presented by Musab Umair Malik

About me Working with SQL Server since 2007. Certifications: MCITP SQL Server 2005, MCSA 2012/2014, and MCSE Data Platform. With S&P Global Market Intelligence for the last 4 years, currently holding a Senior DBA position.

Agenda Definition; Purpose; Relationship with Query Optimizer (The Impact); Types; Some Internals; Q & A Session

Definition Statistics for query optimization are objects that contain statistical information about the distribution of values in one or more columns of a table or indexed view. The query optimizer uses these statistics to estimate the cardinality, or number of rows, in the query result. These cardinality estimates enable the query optimizer to create a high-quality query plan. For example, the query optimizer could use cardinality estimates to choose the index seek operator instead of the more resource-intensive index scan operator, and in doing so improve query performance. Each statistics object is created on a list of one or more table columns and includes a histogram displaying the distribution of values in the first column. Statistics objects on multiple columns also store statistical information about the correlation of values among the columns. These correlation statistics, or densities, are derived from the number of distinct rows of column values.
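To make the histogram idea concrete, here is a minimal Python sketch of how an equality predicate might be estimated from histogram steps. The field names mirror the histogram columns shown by DBCC SHOW_STATISTICS (RANGE_HI_KEY, EQ_ROWS, RANGE_ROWS, DISTINCT_RANGE_ROWS); the `Step` class and the `estimate_equal` function are hypothetical names for illustration, and the real optimizer applies more rules than this.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Step:
    range_hi_key: int         # upper bound value of the step
    eq_rows: float            # rows whose value equals range_hi_key
    range_rows: float         # rows strictly between the previous key and this key
    distinct_range_rows: int  # distinct values strictly inside the step


def estimate_equal(histogram, value):
    """Estimate rows for `column = value` (simplified, illustrative only)."""
    keys = [s.range_hi_key for s in histogram]
    i = bisect_left(keys, value)
    if i == len(keys):
        return 0.0  # simplification: value lies beyond the last step
    step = histogram[i]
    if step.range_hi_key == value:
        return step.eq_rows  # exact hit on a step boundary
    if step.distinct_range_rows == 0:
        return 0.0  # no values recorded inside this step
    # Inside a step: assume rows are spread evenly over the distinct values.
    return step.range_rows / step.distinct_range_rows


histogram = [Step(10, 5, 0, 0), Step(20, 8, 30, 6), Step(30, 2, 12, 4)]
print(estimate_equal(histogram, 20))  # boundary value: uses eq_rows
print(estimate_equal(histogram, 15))  # inside a step: average rows per distinct value
```

The gap between such an estimate and the real row count is what shows up as "estimated vs. actual number of rows" in an execution plan.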

Purpose The query optimizer uses these statistics to estimate the cardinality, or number of rows, in the query result. These cardinality estimates enable the query optimizer to create a high-quality query plan. For example, the query optimizer could use cardinality estimates to choose the index seek operator instead of the more resource-intensive index scan operator, and in doing so improve query performance. (Source: MSDN.) The query optimizer uses statistics to create query plans that improve query performance. For most queries, the query optimizer already generates the necessary statistics for a high-quality query plan; in a few cases, you need to create additional statistics or modify the query design for best results.

Relationship with Query Optimizer (The Impact) Estimated number of rows vs. actual number of rows

Relationship with Query Optimizer (The Impact) Execution plan design depends on statistics (among other factors). Statistics influence logical reads, CPU time, the execution plan, and join operator selection (HASH, MERGE, or LOOP).

Types Column Statistics; Index Statistics; User Created Statistics; Filtered Statistics; Incremental Statistics

Column Statistics Each statistics object is created on a list of one or more table columns and includes a histogram displaying the distribution of values in the first column. Statistics objects on multiple columns also store statistical information about the correlation of values among the columns. These correlation statistics, or densities, are derived from the number of distinct rows of column values.
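As a rough illustration of the densities mentioned above, the following Python sketch computes one density per left-based prefix of the statistics columns, using density = 1 / (number of distinct values). The function name `density_vector` and the sample rows are made up for this example; it assumes a non-empty list of row tuples.

```python
def density_vector(rows, n_columns):
    """Return 1 / distinct-count for each left-based column prefix.

    `rows` is a non-empty list of tuples holding the statistics columns in order.
    """
    densities = []
    for width in range(1, n_columns + 1):
        distinct = len({row[:width] for row in rows})
        densities.append(1.0 / distinct)
    return densities


# Hypothetical (country, state) rows.
rows = [("US", "NY"), ("US", "CA"), ("US", "NY"), ("DE", "BE")]
densities = density_vector(rows, 2)

# For an equality predicate on an unknown value, a common estimate is
# density * total rows, i.e. the average rows per distinct value.
estimated_rows = densities[0] * len(rows)
print(densities, estimated_rows)
```

A lower density means more distinct values, so an equality predicate on those columns is expected to return fewer rows.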

Index Statistics The query optimizer creates statistics for indexes on tables or views when the index is created. These statistics are created on the key columns of the index. If the index is a filtered index, the query optimizer creates filtered statistics on the same subset of rows specified for the filtered index.

User Created Statistics For most queries, the two automatic methods (column statistics and index statistics) ensure a high-quality query plan; in a few cases, you can improve query plans by creating additional statistics with the CREATE STATISTICS statement. These additional statistics can capture statistical correlations that the query optimizer does not account for when it creates statistics for indexes or single columns. Your application might have additional statistical correlations in the table data that, if calculated into a statistics object, could enable the query optimizer to improve query plans. For example, filtered statistics on a subset of data rows or multicolumn statistics on query predicate columns might improve the query plan.

Filtered Statistics Filtered statistics can improve query performance for queries that select from well-defined subsets of data. Filtered statistics use a filter predicate to select the subset of data that is included in the statistics. Well-designed filtered statistics can improve the query execution plan compared with full-table statistics.

Incremental Statistics A major problem with updating statistics on large tables in SQL Server is that the entire table always has to be scanned, for example when using the WITH FULLSCAN option, even if only recent data has changed. This is also true with partitioning: even if only the newest partition has changed since the last statistics update, updating statistics again requires scanning the entire table, including all the partitions that did not change. Incremental statistics, introduced in SQL Server 2014, can help with this problem. With incremental statistics you can update only the partition or partitions that need it, and the information from those partitions is merged with the existing information to create the final statistics object. Another advantage is that the percentage of data changes required to trigger an automatic statistics update now applies at the partition level: only 20% of the rows in a partition need to change (changes on the leading statistics column) to trigger the update.
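The merge idea behind incremental statistics can be sketched in a few lines of Python. This is only a conceptual model: it keeps plain value counts per partition, whereas SQL Server merges per-partition statistics into a histogram. The names `build_partition_stats` and `merge_stats` and the sample data are hypothetical.

```python
from collections import Counter


def build_partition_stats(values):
    """Scan one partition and summarize it (here: simple value counts)."""
    return Counter(values)


def merge_stats(per_partition):
    """Combine the per-partition summaries into one table-level summary."""
    total = Counter()
    for stats in per_partition.values():
        total.update(stats)
    return total


# Initial build: every partition is scanned once.
per_partition = {
    1: build_partition_stats([2019, 2019, 2020]),
    2: build_partition_stats([2021, 2021]),
}

# Later, only partition 2 receives new rows, so only partition 2 is rescanned;
# the summary for partition 1 is reused as-is.
per_partition[2] = build_partition_stats([2021, 2021, 2022, 2022, 2022])
table_stats = merge_stats(per_partition)
print(table_stats)
```

The saving comes from the rescan step: unchanged partitions contribute their stored summary to the merge without being read again.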

Some Internals What works automatically (default update threshold: 500 rows + 20% of the table's rows)? AUTO_CREATE_STATISTICS, AUTO_UPDATE_STATISTICS, AUTO_UPDATE_STATISTICS_ASYNC. What does not work by default? Trace flag 2371. What to do when automatic statistics do not work? Temp tables: subject to automatic statistics and recompilation. Table variables: no statistics; the optimizer assumes 1 row.

Some Internals (continued) AUTO_CREATE_STATISTICS Option When the automatic create statistics option, AUTO_CREATE_STATISTICS, is on, the query optimizer creates statistics on individual columns in the query predicate, as necessary, to improve cardinality estimates for the query plan. These single-column statistics are created on columns that do not already have a histogram in an existing statistics object. The AUTO_CREATE_STATISTICS option does not determine whether statistics get created for indexes. This option also does not generate filtered statistics. It applies strictly to single-column statistics for the full table. When the query optimizer creates statistics as a result of using the AUTO_CREATE_STATISTICS option, the statistics name starts with _WA. You can query the sys.stats and sys.stats_columns catalog views to determine whether the query optimizer has created statistics for a query predicate column.

Some Internals (continued) AUTO_UPDATE_STATISTICS Option When the automatic update statistics option, AUTO_UPDATE_STATISTICS, is on, the query optimizer determines when statistics might be out-of-date and then updates them when they are used by a query. Statistics become out-of-date after insert, update, delete, or merge operations change the data distribution in the table or indexed view. The query optimizer determines when statistics might be out-of-date by counting the number of data modifications since the last statistics update and comparing the number of modifications to a threshold. The threshold is based on the number of rows in the table or indexed view.
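The threshold comparison described above can be sketched as follows. `legacy_threshold` encodes the classic rule (500 modifications for tables of up to 500 rows, otherwise 500 + 20% of the rows). `dynamic_threshold` uses the formula documented for database compatibility level 130 and later, MIN(500 + 0.20n, SQRT(1000n)); treating trace flag 2371 on older versions as equivalent to it is an assumption of this sketch. All function names are illustrative.

```python
import math


def legacy_threshold(row_count):
    """Classic rule: 500 changes for small tables, else 500 + 20% of rows."""
    if row_count <= 500:
        return 500.0
    return 500 + row_count * 20 / 100


def dynamic_threshold(row_count):
    """Decreasing threshold: the smaller of the legacy rule and sqrt(1000 * n)."""
    return min(legacy_threshold(row_count), math.sqrt(1000 * row_count))


def is_stale(modifications, row_count, dynamic=False):
    """True when the modification counter has reached the chosen threshold."""
    threshold = dynamic_threshold(row_count) if dynamic else legacy_threshold(row_count)
    return modifications >= threshold


# A table with 1,000,000 rows and 40,000 modified rows.
print(legacy_threshold(1_000_000))               # 200500.0 changes needed
print(round(dynamic_threshold(1_000_000)))       # roughly 31623 changes needed
print(is_stale(40_000, 1_000_000))               # not yet stale under the legacy rule
print(is_stale(40_000, 1_000_000, dynamic=True)) # stale under the dynamic rule
```

On large tables the 20% rule lets statistics drift for a long time, which is why the lower, size-dependent threshold matters there.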

Some Internals (continued) AUTO_UPDATE_STATISTICS_ASYNC The asynchronous statistics update option, AUTO_UPDATE_STATISTICS_ASYNC, determines whether the query optimizer uses synchronous or asynchronous statistics updates. By default, the asynchronous statistics update option is off, and the query optimizer updates statistics synchronously. The AUTO_UPDATE_STATISTICS_ASYNC option applies to statistics objects created for indexes, single columns in query predicates, and statistics created with the CREATE STATISTICS statement.

Conclusion Maintaining statistics is a complicated task with a huge impact on server and query performance. Understanding their importance is the core. Statistics offer a great way to increase performance, with a variety of options.

THANK YOU ! Questions ?