Microsoft SQL Server 2005 Advanced SQL Programming and Optimization

Microsoft SQL Server 2005 Advanced SQL Programming and Optimization
Single-Table Optimization
Indexing for Performance

Acknowledgements
  Microsoft SQL Server and Microsoft SQL Server Management Studio are trademarks of Microsoft Corporation.
  This presentation is copyrighted.
  This presentation is not for resale.
  This presentation shall not be used or modified without the express written consent of Soaring Eagle Consulting, Inc.
  © Soaring Eagle Consulting, Inc

Topics
  Examine detailed topics in query optimization
    Indexes with SARGs (search arguments)
    Improvised SARGs
    Clustered vs. nonclustered indexes
    Queries with OR
    Index covering
    Forcing index selection

SQL Server 2005 Search Techniques
  SQL Server 2005 uses three basic search techniques for query resolution:
    Table scans
    Index searches
    Covered index searches

Table Scans
  If SQL Server 2005 can’t resolve a query any other way, it does a table scan
  Scans are expensive
  Table scans may nevertheless be the best way to resolve a query
  If there is a clustered index on the table, SQL Server will try to use it instead of performing a table scan
  Table scan search:
    select * from pt_tx where id = 1

Index Selection
  Topics
    Optimizer selection criteria
    When indexes slow access
    When indexes cause deadlocks
    Index statistics and usage

Optimizer Selection Criteria
  During the index selection phase of optimization, the optimizer decides which indexes (if any) best resolve the query
    Identify which indexes match the clauses
    Estimate rows to be returned
    Estimate page reads

SARG Matching
  Indexes must correspond with SARGs
  Useful indexes will specify a row or rows, or set bounds for the result set
  An index may be used if any column of the index matches the SARG
    where dob between '3/3/1941' and '4/4/65'
    create unique index nci on authors (au_lname, au_fname)

SARG Matching (Cont’d)
    create unique index nci on authors (au_lname, au_fname)
  Which of the following queries (if any) could be helped by the index?
    select * from authors where au_lname = 'Smith' or au_fname = 'Jim'
    select * from authors where au_fname = 'Jim'
    select * from authors where au_fname = 'Jim' and au_lname = 'Smith'
  If there are not enough rows in the table, indexes that look useful may never be used

Clustered Index Mechanism
  With a clustered index, there will be one entry on the last intermediate index level page for each data page
  The data page is the leaf or bottom level of the index
  (Assume a clustered index on last name)

Nonclustered Index Mechanism
  The nonclustered index has an extra leaf level for page / row pointers
  Data placement is not affected by nonclustered indexes
  (Assume an NCI on first name)

Using Indexes
  Clustered index indications
    Columns searched by range of values
    Columns by which the data is frequently sorted (order by or group by)
    Sequentially accessed columns
    Static columns
    Join columns (if other than the primary key)
  Nonclustered index indications
    NCI selection tends to be much more effective if less than about 20% of the data is to be accessed
    NCIs help sorts, joins, group by clauses, etc., if other column(s) must be used for the CI
    Index covering

Index Selection Examples
  1. What index will optimize this query?
       select title from titles where title = 'Alleviating VDT Eye Strain'
  2. What indexes optimize these queries?
       select title from titles where price between $5. and $10.
  3. In the second query, what would the net effect be of changing the range to this?
       between $500 and $600

CI vs. NCI
    select title from titles where price between $5. and $10.
  Table facts:
    2,000,000 titles (= 14492 pages)
    138 rows / page
    1 million rows in the range

CI vs. NCI
  It is possible, and sometimes likely, that a table scan is faster than using a nonclustered index for specific queries
  The server evaluates all options at optimization time and selects the least expensive plan
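
A rough worked example of the comparison above, using the table facts from the previous slide and the logical page I/O estimates given later in this presentation. The index depth (3 levels) and the number of entries per nonclustered leaf page (400) are assumptions made for illustration; they do not come from the slides, and real optimizer costing considers more than page counts.

```python
import math

# Table facts taken from the "CI vs. NCI" slide.
TOTAL_ROWS = 2_000_000
ROWS_PER_PAGE = 138
QUALIFYING_ROWS = 1_000_000

# ASSUMPTIONS (not in the slides): index depth and leaf density.
INDEX_LEVELS = 3
NCI_ENTRIES_PER_LEAF = 400


def table_scan_reads() -> int:
    """Every data page is read once (rounded up, so one more than the slide's 14492)."""
    return math.ceil(TOTAL_ROWS / ROWS_PER_PAGE)


def clustered_range_reads() -> int:
    """Descend the index once, then read only the contiguous data pages in the range."""
    return INDEX_LEVELS + math.ceil(QUALIFYING_ROWS / ROWS_PER_PAGE)


def nonclustered_range_reads() -> int:
    """Descend the index, scan the qualifying leaf pages, then read one data page per row."""
    leaf_pages = math.ceil(QUALIFYING_ROWS / NCI_ENTRIES_PER_LEAF)
    return INDEX_LEVELS + leaf_pages + QUALIFYING_ROWS


if __name__ == "__main__":
    print("table scan        :", table_scan_reads())
    print("clustered index   :", clustered_range_reads())
    print("nonclustered index:", nonclustered_range_reads())
```

Under these assumptions the nonclustered index costs far more page reads than scanning the whole table, because half the rows qualify and each one may need its own data page read.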

Or Indexing
    select title from titles where price between $5. and $10. or type = 'computing'
  Questions
    What indexes should (could) be used?
    Will a compound index help?
    Which column(s) should be indexed?

Or Indexing (Cont’d)
  How is the following query different (from a processing standpoint)?
    select title from titles where price between $5. and $10. and type = 'computing'
  What is a useful index for the following query?
    select * from authors where au_fname in ('Fred', 'Sally')

Or Clauses
  Format: SARG or SARG
    select * from authors where au_lname = 'Smith' or au_fname = 'Fred'
  (How many indexes may be useful?)
    select * from authors where au_lname in ('Smith', 'Jones', 'N/A')

Or Strategy
  An or clause may be resolved via a table scan, a multiple match index, or the or strategy
  Table scan
    Each row is read, and the criteria are applied
    Matching rows are returned in the result set
    Chosen when the cost of all the index accesses is greater than the cost of a table scan
    Also chosen when at least one of the clauses names a column that is not indexed, so the only way to resolve the clause is to perform a table scan

Or Strategy (Cont’d)
  Multiple match index
    Using each part of the or clause, select an index and retrieve the row
    Only used if the result sets cannot return duplicate rows
    Rows are returned to the user as they are processed
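
A minimal sketch of why duplicate rows matter when each side of an or clause is resolved through its own index. It models an index as a plain dictionary from key value to row ids and merges the two lookups through a set. The sample rows and the helper names (build_index, or_lookup) are hypothetical, and this is an illustration of the idea only, not a description of SQL Server's internal implementation.

```python
from collections import defaultdict

# Hypothetical sample rows: (row_id, au_lname, au_fname).
authors = [
    (1, "Smith", "Fred"),
    (2, "Smith", "Jim"),
    (3, "Jones", "Fred"),
    (4, "Green", "Sally"),
]


def build_index(rows, column):
    """Map each value in the given column position to the row ids that hold it."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row[0])
    return index


lname_index = build_index(authors, 1)
fname_index = build_index(authors, 2)


def or_lookup(lname, fname):
    """Resolve: au_lname = lname or au_fname = fname, returning each row id once."""
    from_lname = lname_index.get(lname, [])
    from_fname = fname_index.get(fname, [])
    # Row 1 ('Smith', 'Fred') is found by both lookups; the set removes the duplicate.
    return sorted(set(from_lname) | set(from_fname))


if __name__ == "__main__":
    print(or_lookup("Smith", "Fred"))
```

Without the de-duplication step, a row that satisfies both clauses would be returned twice, which is why streaming rows straight from each index is only safe when the clauses cannot match the same row.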

Or: Query Plan
    select company, street2 from pt_sample where id = 2017 or id = 2163
  [Figure: query execution plan]

Index Selection and the Select List
    select * from publishers where pub_id = 'BB1111'
  Questions
    What is the best index?
    Do the columns being selected have a bearing on the index?

Index Selection and the Select List
  Question
    Should there be a difference between the utilization of the following two indexes?
    select royalty from titles where price between $10 and $20
    create index idx1 on titles (price)
    /* or */
    create index idx2 on titles (price, royalty)

Index Covering
  The server can use the leaf level of a nonclustered index the way it usually reads the data pages of a table: this is index covering
    The server can skip reading data pages
    The server can walk leaf page pointers
  A nonclustered index will be faster than a clustered index if the index covers the query for a range of data (why?)
  Adding columns to nonclustered indexes is a common method of reducing query time
  This has particular benefits with aggregates

Index Covering (Cont’d)
  Beware of making the index too wide
  As index width approaches row width, the benefit of covering is reduced
    The number of levels in the index increases
    Index scan time approaches table scan time
  Remember that changes to data will cascade into indexes
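
A small arithmetic sketch of the covering trade-off described above. It compares the page reads for a covered range query against a clustered index range scan as the covering index gets wider (fewer entries fit on each leaf page). The 138 rows per data page comes from the earlier titles example; the 100,000 qualifying rows, the 3 index levels, and the entries-per-leaf values are assumptions chosen for illustration.

```python
import math

DATA_ROWS_PER_PAGE = 138   # from the earlier "CI vs. NCI" table facts
QUALIFYING_ROWS = 100_000  # ASSUMPTION: hypothetical size of the range
INDEX_LEVELS = 3           # ASSUMPTION: same depth for both indexes


def covered_reads(entries_per_leaf: int) -> int:
    """Covered query: descend the index, then read only nonclustered leaf pages."""
    return INDEX_LEVELS + math.ceil(QUALIFYING_ROWS / entries_per_leaf)


def clustered_reads() -> int:
    """Clustered range scan: descend the index, then read the data pages in range."""
    return INDEX_LEVELS + math.ceil(QUALIFYING_ROWS / DATA_ROWS_PER_PAGE)


if __name__ == "__main__":
    # Fewer entries per leaf page means a wider index key.
    for entries_per_leaf in (400, 250, 138):
        print(entries_per_leaf, covered_reads(entries_per_leaf), clustered_reads())
```

With narrow entries the covering index touches far fewer pages than the clustered scan, which is one answer to the "(why?)" on the previous slide; once an index entry is as wide as a data row, the advantage disappears.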

Composite Indexes
  Composite (compound) indexes may be selected by the server if the first column of the index is specified in a where clause, or if it is a clustered index
    create index idx1 on employee (minit, job_id, job_lvl)

Composite Indexes (Cont’d)
    create index idx1 on employee (minit, job_id, job_lvl)
  Which queries may use the index?
    select * from employee where minit = 'A' and job_id != 4 and job_lvl = 135
    select * from employee where job_id != 4
    select * from employee where minit = 'A'
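
One way to reason about the question above is the usual B-tree rule of thumb: an index seek can be bounded only by a leading run of index columns, where equality predicates extend the run, a range predicate ends it, and any other operator (such as !=) cannot bound a seek at all. The sketch below applies that simplified rule to the where clauses on this slide. The function name seek_prefix and the dictionary encoding of predicates are hypothetical, and the real optimizer weighs statistics and cost as well.

```python
INDEX_COLUMNS = ("minit", "job_id", "job_lvl")  # idx1 from the slide

RANGE_OPERATORS = ("<", "<=", ">", ">=", "between")


def seek_prefix(index_columns, predicates):
    """Return the leading index columns that can bound an index seek.

    predicates maps a column name to the comparison operator used on it.
    """
    prefix = []
    for column in index_columns:
        operator = predicates.get(column)
        if operator == "=":
            prefix.append(column)      # equality: keep extending the prefix
            continue
        if operator in RANGE_OPERATORS:
            prefix.append(column)      # a range bounds the seek, then the prefix ends
        break                          # missing column or non-seekable operator
    return prefix


queries = {
    "minit = 'A' and job_id != 4 and job_lvl = 135": {"minit": "=", "job_id": "!=", "job_lvl": "="},
    "job_id != 4": {"job_id": "!="},
    "minit = 'A'": {"minit": "="},
}

if __name__ == "__main__":
    for text, predicates in queries.items():
        print(text, "->", seek_prefix(INDEX_COLUMNS, predicates))
```

An empty prefix means the index cannot be used for a seek under this rule, although the server could still choose to scan it.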

Composite vs. Many Indexes
  Each additional index impacts update performance
  In order to select appropriate indexes, we need to know how many indexes the optimizer will use, and how many rows are represented by the where clause
    select pub_id, title, notes from titles where type = 'Computer' and price > $15.

Which are the best options in which circumstances?
    select pub_id, title, notes from titles where type = 'Computer' and price > $15.
  CI or NCI on type
  CI or NCI on price
  One index on each of type & price
  Composite on type, price
  Composite on price, type
  CI or NCI on type, price, pub_id, title, notes

Index Usefulness
  It is imperative to be able to estimate rows returned for an index; therefore, the server will estimate rows returned before index selection
  If statistics are available (when would they not be?), the server estimates the number of rows using distribution steps or index density
  The SQL Server 2005 query processor automatically generates statistics about index key distributions using efficient sampling algorithms
  If you have an equality join on a unique index, the server knows only one row will match and doesn't need to use statistics
  The Query Analyzer index analyzer can analyze a query and recommend indexes
  The more selective an index is, the more useful the index

Data Distribution
  You have a 1,000,000 row table. The unique key has a range (and random distribution) of 0 to 10,000,000
  Questions
    How many rows will be returned by the following query?
    How does the optimizer know whether to use an index or a table scan?
    select * from table where key between 1000000 and 2005000
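
A back-of-the-envelope answer to the first question, assuming the keys are spread evenly across the stated range. The function name estimate_rows is hypothetical; the row count, key range, and query bounds are the ones on the slide. A real optimizer would read its estimate from the distribution statistics rather than assume uniformity.

```python
TABLE_ROWS = 1_000_000
KEY_MIN, KEY_MAX = 0, 10_000_000


def estimate_rows(low: int, high: int) -> float:
    """Uniform-distribution estimate of rows with low <= key <= high."""
    low = max(low, KEY_MIN)
    high = min(high, KEY_MAX)
    if high < low:
        return 0.0
    # Fraction of the key range covered, scaled by the number of rows.
    return TABLE_ROWS * (high - low) / (KEY_MAX - KEY_MIN)


if __name__ == "__main__":
    estimated = estimate_rows(1_000_000, 2_005_000)
    print(round(estimated), "rows,", round(100 * estimated / TABLE_ROWS, 2), "% of the table")
```

The query covers roughly a tenth of the key range, so about a tenth of the rows should qualify; the optimizer compares an estimate of this kind against the cost of each access path.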

Index Statistics
  SQL Server keeps distribution information about indexes in a separate page pointed to by sysindexes
  There is a distribution page for every index
  The optimizer uses this information to estimate the number of rows returned for a query
  The distribution page(s) are built at index creation time and maintained by the server

Viewing Index Statistics
  Viewed with the dbcc show_statistics command:
    dbcc show_statistics (table_name, index_name)

Viewing Index Statistics (Cont’d)
    dbcc show_statistics (pt_sample_CICompany, CICompany)
    go
    Statistics for INDEX 'CICompany'.
    Updated              Rows   Rows Sampled  Steps  Density       Average key length
    -------------------- ------ ------------- ------ ------------- ------------------
    Dec 3 1998 3:11PM    5000   5000          295    2.4390244E-4  25.999405
    (1 row(s) affected)

Viewing Index Statistics (Cont’d)
    All density    Columns
    -------------  --------
    2.4390244E-4   company
    (1 row(s) affected)
    Steps
    -------------
    aabMy Company
    abhMy Company
    adqMy Company
    afkMy Company
    . . .

Explaining DBCC Show Statistics
  Updated date and time: When the statistics were last updated
  Rows: Number of rows in the table
  Rows Sampled: Number of rows sampled for statistics information
  Density: Selectivity of the index
  Average key length: Average length of an index row
  All density: Selectivity of the specified column prefix in the index
  Columns: Name of the index column prefix for which the all density is displayed
  Steps: Number of histogram values in the current distribution statistics for the specified target on the specified table
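
A short sketch of how the density figure in the sample output can be read, assuming the common interpretation that all density is 1 divided by the number of distinct key values. Under that assumption, multiplying density by the row count gives the average number of rows expected for an equality match on the key. The numbers are the ones shown in the sample output; the variable names are illustrative.

```python
ROWS = 5000                 # "Rows" column in the sample output
ALL_DENSITY = 2.4390244e-4  # "All density" for the company column

# ASSUMPTION: all density = 1 / (number of distinct key values).
distinct_values = round(1 / ALL_DENSITY)

# Average rows expected for a predicate such as: where company = <some value>
rows_per_value = ALL_DENSITY * ROWS

if __name__ == "__main__":
    print("distinct values :", distinct_values)
    print("rows per value  :", round(rows_per_value, 2))
```

A small rows-per-value figure like this one indicates a highly selective index, which ties back to the earlier point that the more selective an index is, the more useful it is.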

Estimating Logical Page I/O
  If there is no index, there will be a table scan, and the estimate will be the number of pages in the table
  If there is a clustered index, the estimate will be the number of index levels plus the number of pages to scan
  For a nonclustered index, the estimate will be index levels + number of leaf pages + number of qualifying rows (which will correspond to the number of physical pages to read)
  For a unique index and an equality join, the estimate will be 1 plus the number of index levels

When to Force Index Selection
  Don't do it
    With every release of the server, the optimizer gets better at selecting optimal query paths
    Forcing the optimizer to behave in a specific manner does not allow it the freedom to change selection as data skews
    It also does not permit the optimizer to take advantage of new strategies as advances are made in the server software

When to Force Index Selection (Cont’d)
  Exceptions
    When you (the developer) have information about a table that SQL Server 2005 will not have at the time the query is processed (e.g., using a temp table in a nested stored procedure)
    Occasions when you've proven the optimizer wrong

How to Force Index Selection
  Use the following syntax to force indexes:
    select * from titles (index(titleind)), publishers (index(UPKCL_pubind))
    where titles.pub_id = publishers.pub_id
  Instead of forcing an index, identify why the optimizer picked incorrectly

Summary
  The optimizer uses indexes to improve query performance when possible
  Watch out for improvised SARGs
  Queries with OR may require a table scan
  Try to take advantage of covered queries
  Be careful when forcing an index