Presentation on theme: "SQL Server Indexing for Developers"— Presentation transcript:
1. SQL Server Indexing for Developers
Greg Linwood, Solid Quality Learning
2. About Me
Live in Melbourne, Australia
Director of SQL Servants
Director of Solid Quality Learning
Microsoft SQL Server MVP since 2003
Australian SQL Server User Group
Using SQL Server since 1993
3. Agenda
The Dilemma of SQL
SQL set based logic vs serial execution
SQL Server data caching infrastructure 101
SQL Server indexing tools
Table storage formats
    Clustered Indexes (index organised)
    Heaps (non-index organised)
Non-Clustered Indexes
    When implemented on Heaps
    When implemented on Clustered Indexes
    Included Columns
Matching indexes to queries
Designing Covering Indexes
Limits of covering indexes
Analysing index usage via Execution Plans
Discussion
4. The Dilemma of SQL
SQL is a simple & easy language to learn
Developer concentrates only on WHAT data is being accessed & manipulated
Totally disconnected from HOW the DBMS executes the commands
DBMS hides Cost Based Optimisation process from developers
Optimisation process is largely undocumented
Developers have to “second guess” how it works
Developers have enough to learn already!
5. The Good News!
A little knowledge of index mechanisms…
A few easy to follow rules…
Can help you solve the majority of query tuning problems with indexes
6. SQL Server data caching infrastructure 101…
[Diagram. Labels: Le Table, 8kb Buffer Page, Physical Memory (RAM), Data Cache, Proc Cache, MTL, Write ahead log (TLOG), Data volume (HDD)]
Statements shown in the diagram:
Select * from authors where au_lname = 'White'
update authors set au_fname = 'Johnston' where au_lname = 'White'
update authors set au_fname = 'Marj' where au_lname = 'Green'
Table columns shown: au_id, au_lname, au_fname, phone, address, city, state
Row shown: au_lname White, au_fname Johnson, address Bigge Rd., city Oakland, state CA
7. Indexing Tools – Clustered Index
Table rows stored in physical order of clustered index key column/s – CustID in this case.
Physical ordering of table row storage enforced
“Physical” meaning physical database model, not “on disk”
CIX also provides b-tree lookup pages, similar to a regular index
Table rows stored in leaf level of clustered index, in order of index column/s (key/s)
Default table storage format for tables WITH a primary key
B-Tree index nodes also created
Each level contains entries based on the first row in pages from lower level
Can only have one CIX per table (as table storage can only be sorted one way)
Query execution example:
Select FName, Lname, PhNo from Customers where CustID = 23
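A minimal T-SQL sketch of the clustered index slide above. The Customers table and its CustID, FName, Lname and PhNo columns come from the slide's query; the data types and the constraint name pk_customers are assumptions made for illustration.

```sql
-- Assumed schema: column names come from the slide, data types are guesses.
CREATE TABLE Customers (
    CustID int         NOT NULL,
    FName  varchar(50) NOT NULL,
    Lname  varchar(50) NOT NULL,
    PhNo   varchar(20) NULL,
    -- A PRIMARY KEY is created as a clustered index by default;
    -- CLUSTERED is spelled out here only to make that visible.
    CONSTRAINT pk_customers PRIMARY KEY CLUSTERED (CustID)
);

-- The filter is on the clustered key, so the optimiser can walk the
-- b-tree down to the leaf page that holds the whole row.
SELECT FName, Lname, PhNo
FROM Customers
WHERE CustID = 23;
```

With an execution plan displayed, this query would be expected to show a Clustered Index Seek rather than a scan.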
8. Indexing Tools – Heap
No physical ordering of table rows
Scan cannot complete just because a row is located. Because data is not ordered, scan must continue through to end of table (heap)
New rows simply added to last page
NO B-Tree index nodes (not really an “index”)
No b-tree with HEAPs, so no lookup method available unless other indexes are present. Only option is to scan heap
Query execution example:
Select FName, Lname, PhNo from Customers where Lname = 'Smith'
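A heap can be sketched in T-SQL by creating the table with no primary key and no clustered index. The data types are assumptions, and the sys.indexes catalog view used for the check is available from SQL Server 2005 onwards.

```sql
-- No primary key and no clustered index, so the rows are stored as a heap.
CREATE TABLE Customers (
    CustID int         NOT NULL,
    FName  varchar(50) NOT NULL,
    Lname  varchar(50) NOT NULL,
    PhNo   varchar(20) NULL
);

-- A heap appears in sys.indexes as index_id = 0 with type_desc = 'HEAP'.
SELECT index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID('Customers');

-- There is no index on Lname, so the only access path is a full scan.
SELECT FName, Lname, PhNo
FROM Customers
WHERE Lname = 'Smith';
```

The execution plan for the last query would be expected to show a Table Scan operator.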
9. Indexing Tools – Non-Clustered Indexes
NCIXs are “real” indexes, rather than table storage structures
Implemented differently, depending on whether the base table is stored on a heap or a clustered index.
Nearly always more efficient for queries than CIXs
Both for “seeks” and “range scans”
Read further about this topic on my blog:
Can only be 900 bytes & up to 16 columns “wide”
SQL 2005 allows “wider” NCIXs via new “Included Columns” feature
On SQL 2000, any queries that require wider indexes need a good CIX
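The “Included Columns” feature mentioned above might look like this on SQL Server 2005 or later. This is a sketch that assumes the Customers table from the earlier query examples already exists; the index name ncix_lname_incl is made up for illustration.

```sql
-- Lname is the only key column, so only Lname counts towards the
-- 900 byte / 16 column key limit. FName and PhNo are carried in the
-- leaf level only and do not count towards that limit.
CREATE NONCLUSTERED INDEX ncix_lname_incl
ON Customers (Lname)
INCLUDE (FName, PhNo);

-- Every column the query names is available from the index leaf level.
SELECT FName, Lname, PhNo
FROM Customers
WHERE Lname = 'Smith';
```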
10. Indexing Tools – NCIXs on Heaps (1st of 3)
create nonclustered index ncix_lname on customers (lname)
B-tree structure contains one leaf row for every row in base table, containing index columns, sorted by index column values.
Each row contains a “RowID”, which is an 8 byte “pointer” to the heap storage page
(RowID actually contains File, Page & Slot data)
Leaf pages “chained” via doubly linked list for intra index scan
Query execution example:
Select Lname from Customers where Lname = 'Smith'
11. Indexing Tools – NCIXs on Heaps (2nd of 3)
create nonclustered index ncix_lname on customers (lname)
Previous example “covered” the query.
Where index does NOT cover query, RowID lookups performed to obtain values for non-indexed columns
Query execution example:
Select Lname, Fname from Customers where Lname = 'Smith'
Very important to “cover” queries where performance is critical
Impact of lookups is far worse with clustered index “Bookmark Lookups” than with RowID lookups (covered next)
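The difference between the covered and the non-covered query on a heap can be observed with SET STATISTICS IO. This sketch assumes Customers is stored as a heap and already holds data; the actual read counts depend on the data and are not shown here.

```sql
CREATE NONCLUSTERED INDEX ncix_lname ON Customers (Lname);

-- Report logical reads per statement in the Messages output.
SET STATISTICS IO ON;

-- Covered: Lname is in the index, so no RowID lookups are needed.
SELECT Lname
FROM Customers
WHERE Lname = 'Smith';

-- Not covered: FName is not in the index, so each qualifying row
-- needs a RowID lookup into the heap to fetch it.
SELECT Lname, FName
FROM Customers
WHERE Lname = 'Smith';

SET STATISTICS IO OFF;
```

When many rows match, the second query would be expected to report more logical reads than the first, or the optimiser may decide a full table scan is cheaper than the lookups.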
12. Indexing Tools – NCIXs on Clustered Indexes (1st of 3)
create nonclustered index ncix_lname on customers (lname)
NCIX contains CIX keys in leaf pages for Bookmark lookups
B-tree structure contains one leaf row for every row in base table, containing index columns, sorted by index column values. (same as when NCIX is on a heap)
Instead of a RowID, each row’s clustered index “key” value is stored in the index leaf level.
This means RowID lookups cannot be performed (as RowID is not available). Instead, bookmark lookups are performed, which are considerably more expensive
Leaf pages “chained” via doubly linked list for intra index scan
[Diagram label: Bookmark Lookup]
Query execution example:
Select Lname, Fname from Customers where Lname = 'Plumb'
Bookmark lookups seriously degrade performance where many rows qualify for results
13. Indexing Tools – NCIXs on Clustered Indexes (2nd of 3)
create nonclustered index ncix_lname on customers (lname, fname)
NCIX now “covers” query because all columns named in query are present in NCIX
B-tree structure contains one leaf row for every row in base table, containing index columns, sorted by index column values. (same as when NCIX is on a heap)
Instead of a RowID, each row’s clustered index “key” value is stored in the index leaf level.
Leaf pages “chained” via doubly linked list for intra index scan
Query execution example:
Select Lname, Fname from Customers where Lname = 'Saunders'
Bookmark lookups seriously degrade performance where many rows qualify for results
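The move from the single-column index to the covering index can be sketched as below. It assumes Customers has a clustered index on CustID, and it uses the WITH (DROP_EXISTING = ON) option syntax from SQL Server 2005 and later to replace the index in one step.

```sql
-- Index from the previous slide: Lname only.
CREATE NONCLUSTERED INDEX ncix_lname ON Customers (Lname);

-- FName is not in the index, so every qualifying row needs a bookmark
-- lookup through the clustered index to fetch it.
SELECT Lname, FName
FROM Customers
WHERE Lname = 'Plumb';

-- Rebuild the same index with FName added as a second key column.
CREATE NONCLUSTERED INDEX ncix_lname ON Customers (Lname, FName)
WITH (DROP_EXISTING = ON);

-- Both columns now come from the index leaf level, with no lookups.
SELECT Lname, FName
FROM Customers
WHERE Lname = 'Saunders';
```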
14. Designing indexes to match queries
Cover filter predicates by indexing columns in where & join clauses
Provides SQL Server with efficient access path to identify rows that qualify filters
Inner Joins are filters – equivalent to WHERE
Column order is critical – most selective columns first
Ensure filter predicates are not accessed via Bookmark or RowID lookups
If many rows are being accessed, ensure entire query is “covered”
Include columns referenced by filter predicates first (WHERE, JOIN) then include columns referenced in SELECT list last
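The rules above can be sketched against a hypothetical query. The Orders table, its columns and the index name are invented for illustration, and the ordering of the key columns rests on the assumption that CustID is more selective than Status in the data.

```sql
-- Hypothetical table, not part of the original slides.
CREATE TABLE Orders (
    OrderID   int      NOT NULL CONSTRAINT pk_orders PRIMARY KEY CLUSTERED,
    CustID    int      NOT NULL,
    Status    char(1)  NOT NULL,
    OrderDate datetime NOT NULL,
    Total     money    NOT NULL
);

-- Query to tune: two filter columns, two SELECT list columns.
SELECT OrderDate, Total
FROM Orders
WHERE CustID = 23
  AND Status = 'O';

-- Filter columns first (assumed most selective first),
-- SELECT list columns last, so the whole query is covered.
CREATE NONCLUSTERED INDEX ncix_orders_cust_status
ON Orders (CustID, Status, OrderDate, Total);
```

On SQL Server 2005 and later, OrderDate and Total could instead be placed in an INCLUDE clause, which keeps them out of the index key.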
15. How many indexes should I add? (1st of 2)
Index update overhead is often over-stated
OLTP systems usually ‘Read’ FAR more than they write.
eg, customers usually browse many website pages before actually placing an order
eg, even ‘pure’ system update / insert activity usually generates more read activity than write activity
PKs / FKs at least have to be ‘Read’ during inserts & updates
Usually far more important to tune reads than writes in an OLTP.
16. How many indexes should I add? (2nd of 2)
What happens when I have too many indexes?
When databases are over-indexed, the performance bottleneck is usually CPU or memory related rather than disk related. Why?
During insert, update & delete operations, SQL Server has to first “find” the pages that contain the rows being manipulated. Finding these pages usually involves multiple reads for every update.
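One way to check the read versus write balance per index, on SQL Server 2005 or later, is the sys.dm_db_index_usage_stats view. This is a sketch rather than a complete tuning query: the counters are reset when the instance restarts, so the figures only mean something after a representative period of activity.

```sql
-- Reads (seeks + scans + lookups) versus writes for each index
-- in the current database, heaviest-written indexes first.
SELECT OBJECT_NAME(s.object_id)                     AS table_name,
       i.name                                       AS index_name,
       s.user_seeks + s.user_scans + s.user_lookups AS total_reads,
       s.user_updates                               AS total_writes
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id
WHERE s.database_id = DB_ID()
ORDER BY total_writes DESC;
```

An index with many writes and few or no reads is a candidate for review, since it is being maintained on every update without helping queries.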
17. Reference material
A few books with excellent performance tuning content
“SQL Server Query Performance Tuning Distilled”, Sajal Dam
“Inside SQL Server 2000”, Kalen Delaney
“Guru’s Guide to SQL Server Architecture & Internals”, Ken Henderson
“SQL Server 2000 Performance Tuning”, Ed Whalen et al.