Presentation on theme: "SQL Server performance monitoring and tuning"— Presentation transcript:

1 SQL Server performance monitoring and tuning
Bala Subra

2 Make SQL Server faster: Methodology
Look at SQL Server from a holistic standpoint
How to baseline a SQL Server system
How to track it from a "landscape perspective"
Evaluate the system now and going forward

3 Performance Tuning Challenges
Difficult to create a baseline and compare it over time to current status
Exorbitant amount of data to manage and analyze = "analysis paralysis"
Multiple tools to manage the tiers, and they are not compatible with each other
Difficult to isolate an issue to a specific object or tier = "finger pointing"

4 Challenges - Continued
Cannot further impact the performance of the production system
Need to understand production, not a simulation
Need for real-time and historical views of performance data in a single system
Need a record of changes over time and their subsequent impact once implemented
Temptation to throw hardware at the problem as a "quick fix"

5 Phases of performance tuning
Define components
Evaluate objects
Interpret findings
Create an action plan

6 Performance tracking
Use the tracking tool of your choice: Word, Excel, a database. The methodology works on any platform.

7 Define components
A holistic view of the landscape
Path determination
Systems
Software
Hardware

8 The landscape
"Literally everything":
The server itself
Clustering components, if clustered
Networking cards and driver levels
Routers and switches
Client workstations
Etc.
An entire representation of your environment

9 Define components
A holistic view of the landscape
Path determination
Systems
Software
Hardware

10 The path
Determine how data gets from a representative client machine to the server. Diagram the path using Paint, PowerPoint, Visio, or network tools. Determine areas of slowdown.

11 Define components
A holistic view of the landscape
Path determination
Systems
Software
Hardware

12 The system
Document the architecture:
Two tier – client and a server
Three tier – client, middle layer and a server
N tier – multiple systems
SOA – lots of moving parts

13 Define components
A holistic view of the landscape
Path determination
Systems
Software
Hardware

14 The software
Document software drivers, interfaces and code. Only concern yourself with representative systems. Avoid making immediate changes; if you change the test, you can't determine the exact issue. Do take care of security issues. Keep a graphical representation of your system.

15 Define components
A holistic view of the landscape
Path determination
Systems
Software
Hardware

16 The hardware
Document the hardware: networking, memory, input/output.
Networking – routers, switches, network cards, network-attached storage (NAS) devices; network monitoring; VMware (virtualization). Track health, activity, performance impact and communication channels.
Memory – SQL Server is designed for SMP; the buffer cache exists to avoid physical IO (cache reads are tens of thousands of times faster).
Disk – hard drives and storage area networks (SANs). Design for throughput and fault tolerance (reliability); use many small drives, not a few big ones.
RAID 0 (striping) – faster, but not fault tolerant.
RAID 1 (mirroring) – fault tolerant, but not faster for writes (it is for reads, since either copy can be read). Good for transaction logs, since log writes are sequential.
Multiple RAID 1 arrays – faster, but if you have a hot spot (all writes to one table), all writes go to one disk, so there is no speed increase. (Filegroups help across a database, and Yukon partitioning will address table hot spots.)
RAID 5 – striping plus parity (if any one drive fails, its data can be rebuilt from the remaining drives); carries a write performance penalty and a read enhancement.
RAID 10 – RAID 1+0 (e.g. configure mirrors and then stripe across them using W2K); RAID 0+1 configures stripes and then mirrors them.

17 Evaluate objects
Tools
Working with a baseline
Working without a baseline
Don't fix anything yet!

18 Tools
Enterprise Manager – the primary tool for managing all SQL Server functionality across the enterprise. Features: the ability to manage database processes, locking, indexes, configurations, etc.
Performance Monitor – captures a macro view of the servers, with the ability to configure counters with specific sample rates, saved to a log file or watched in real time. Counters cover memory, processors, SQL Server network activity, disk drives, and system statistics (threads, context switching, queuing, etc.).
SQL Server Profiler – a micro view of all SQL Server transactions, saved to a log file or database table. Filters: the ability to capture a subset of the transactions based on transaction type, user, application, etc. Concern: high system overhead.
Query Analyzer – Query Plan: the ability to graphically review the query execution plan.
Useful Performance Monitor counters, by object:
Memory – Page Reads/sec, Page Writes/sec, Pages Input/sec, Pages Output/sec
Network Interface – Bytes Received/sec, Bytes Sent/sec, Bytes Total/sec, Current Bandwidth, Output Queue Length
Paging File – all counters
Physical Disk (run 'diskperf -y' at a command prompt and reboot) – % Disk Read Time, % Disk Write Time, % Idle Time, Avg. Disk Bytes/Read, Avg. Disk Bytes/Transfer, Avg. Disk Bytes/Write, Avg. Disk Queue Length, Current Disk Queue Length
Process – % Privileged Time, % Processor Time, % User Time
Processor
Server Work Queues – Active Threads, Available Threads, Queue Length, Total Bytes/sec, Total Operations/sec
SQLServer:Access Methods – Full Scans/sec, Page Splits/sec, Table Lock Escalations/sec
SQLServer:Cache Manager – Cache Hit Ratio (_Total), Cache Pages (_Total)
SQLServer:Databases – Transactions/sec
SQLServer:General Statistics – Logins/sec, Logouts/sec, User Connections
SQLServer:Locks – Number of Deadlocks/sec
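Many of these SQL Server counters can also be read from inside the engine via sys.dm_os_performance_counters (available from SQL Server 2005 onward). A minimal sketch, using an arbitrary sample of counter names:

SELECT [object_name], counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('User Connections', 'Transactions/sec', 'Page life expectancy')
ORDER BY [object_name], counter_name;

Note that ratio counters (such as Cache Hit Ratio) are only meaningful when divided by their matching base counter.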

19 SQL Server Query Processing basics
SQL Server stores table rows and columns on disk in 8 KB units called "pages" (the authors table in the pubs db has ~26 rows and is approx. 6 KB total size). When loaded into memory, pages are referred to as "buffers." Pages read from disk – slow; pages read from cache – very fast!
The original slide diagrams the flow for statements such as:
SELECT * FROM authors WHERE au_lname = 'White'
UPDATE authors SET au_fname = 'Johnston' WHERE au_lname = 'White'
Look up the execution plan in the procedure cache: plan found? Execute. Not found? Compile, then execute. The buffer manager then looks up the needed pages in the data cache, reading them from the data volume (HDD) only if they are not already cached.
All DB changes are first hardened in the write-ahead transaction log (TLog); then the changes are written to the data cache. Pages can be dirtied multiple times in cache, and dirty pages are later flushed to the .mdf file.

20 SQL Profiler Event Data
Cursors – CursorOpen, CursorExecute, CursorClose
Errors and Warnings – Hash Warning, Missing Column Statistics
Locks – Lock:Deadlock, Lock:Timeout
TSQL – Unprepare SQL
Parallelism – Degree of Parallelism (all counters)
Execution Plan – Showplan All, Showplan Statistics, Showplan Text
Stored Procedures – SP:Starting, SP:Completed, SP:Recompile, SP:StmtStarting, SP:StmtCompleted

21 Identifying Query Bottlenecks
SQL Server Profiler
Collect RPC:Completed / SQL:BatchCompleted events
Filter with Reads > 10,000 at first; reduce to 1,000 later
# of reads = # of pages "read" from cache (from disk if not cached)
CPU, Duration, Writes and RowCount are also interesting, but Reads is the best representation of the source workload
Relies on queries completing: on a swamped server, queries might be piling up without completing, and therefore not showing up in Profiler as completed events as fast as they are starting
SQL Trace
Same as Profiler, but runs in the background
Far lower performance impact than the Profiler GUI
Requires post-analysis of the collected .trc log files
Can be scripted from the Profiler GUI, as sketched below
3rd-party tools – SQLBenchmarkPro (continuous) / ClearTrace (ad hoc)
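Such a server-side trace can be set up with the sp_trace_* procedures instead of the Profiler GUI. A minimal sketch, assuming SQL Server 2005+: event IDs 10 and 12 are RPC:Completed and SQL:BatchCompleted, column 16 is Reads, and the output path is a placeholder:

DECLARE @traceid int, @maxsize bigint, @reads bigint;
SET @maxsize = 100;  -- MB per rollover file
EXEC sp_trace_create @traceid OUTPUT, 2, N'C:\traces\reads', @maxsize;  -- 2 = TRACE_FILE_ROLLOVER
EXEC sp_trace_setevent @traceid, 10, 1, 1;   -- RPC:Completed, TextData
EXEC sp_trace_setevent @traceid, 10, 16, 1;  -- RPC:Completed, Reads
EXEC sp_trace_setevent @traceid, 12, 1, 1;   -- SQL:BatchCompleted, TextData
EXEC sp_trace_setevent @traceid, 12, 16, 1;  -- SQL:BatchCompleted, Reads
SET @reads = 10000;
EXEC sp_trace_setfilter @traceid, 16, 0, 2, @reads;  -- keep only events with Reads > 10,000
EXEC sp_trace_setstatus @traceid, 1;  -- start the trace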

22 Identifying Query Bottlenecks (cont.)
DMVs
Give only a current snapshot of the query / procedure cache
All data is lost between restarts
Similar to SQL Trace / Profiler in that updates occur only AFTER query completion – so not quite up-to-the-second information
Important DMVs:
sys.dm_exec_query_stats – reads / time by sql_handle
sys.dm_exec_query_plan() – execution plan by plan_handle
sys.dm_exec_sql_text() – query text by sql_handle
Identify slow queries by joining the three DMVs together, as sketched below
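A minimal sketch of that join, assuming SQL Server 2005+ (CROSS APPLY feeds each cached statement's handles into the two functions):

SELECT TOP 20
    qs.execution_count,
    qs.total_logical_reads,
    qs.total_logical_reads / qs.execution_count AS avg_reads,
    st.text AS query_text,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_logical_reads DESC;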

23 Identifying Query Bottlenecks (cont.)
What about up-to-the-second perf info?
sys.sysprocesses (sysprocesses in SQL2K) provides up-to-the-second data on CPU and IO PRIOR to query completion
It can be joined to the DMVs via sql_handle to obtain the executing query's text, as sketched below
SQL2K options: DBCC INPUTBUFFER(spid), ::fn_get_sql()
The sys.dm_os_workers DMV provides further info from a thread perspective
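A sketch of the currently-executing view, assuming SQL Server 2005+ (spids 50 and below are reserved for system sessions):

SELECT p.spid, p.status, p.cpu, p.physical_io, p.blocked, t.text
FROM sys.sysprocesses AS p
CROSS APPLY sys.dm_exec_sql_text(p.sql_handle) AS t
WHERE p.spid > 50
ORDER BY p.cpu DESC;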

24 What about query blocking?
Use the Profiler / SQL Trace "Blocked Process Report" event
You must set the "blocked process threshold" configuration option, expressed in seconds (the # of seconds a process must be blocked)
Trace events are then raised continually, every x seconds, for as long as the block persists
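For example, to raise the event for anything blocked longer than 5 seconds (an arbitrary threshold; it is an advanced option):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'blocked process threshold', 5;  -- re-raised every 5 seconds while still blocked
RECONFIGURE;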

25 What about query blocking? (cont.)
Blocked queries are usually caused by inefficient queries taking more locks than necessary; in other words, blocking is usually a consequence of other poorly performing queries
Still worth monitoring with the Blocked Process Report trace to identify (other) inefficient queries for tuning
Snapshot isolation provides an alternative to readers being blocked by writers: readers see the previously committed value and read past writers rather than being blocked by them
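Row versioning is enabled per database; a sketch (the database name is a placeholder):

-- Statement-level versioning: plain READ COMMITTED readers no longer block on writers
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;
-- Transaction-level versioning, opted into with SET TRANSACTION ISOLATION LEVEL SNAPSHOT
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;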

26 Infrastructure bottlenecks
Logical page reads/sec shows the TOTAL number of query reads per second. Increases represent either:
New features released, possibly not well tuned (this case)
Query optimization problems
Increased utilization
You can deal with capacity issues by tuning the query workload or by adding hardware, but tuning the workload is most effective and cheaper.
Memory is the most significant infrastructure component to size correctly. Unless utilization genuinely increases significantly or memory is actually reduced, memory problems are typically consequences of other problems: if query workload efficiency has degraded (increased reads), it is usually better to tune the queries (the source of the problem) than to simply add more memory. That requires problem-query identification (Profiler, Trace, DMVs), and some workloads might not be "tunable" (e.g. vendor applications).

27 Infrastructure bottlenecks
Page Life Expectancy shows the average time (in seconds) that page buffers survive in the data cache before being forced out by pressure from other queries.
A high number (> 1000 seconds for OLTP systems) is good (low cache cycling). Decreases represent either:
An inefficient query workload (new changes / optimization issues)
Increased utilization
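The counter can be polled from T-SQL; a minimal sketch (the object name carries the instance name, hence the LIKE):

SELECT [object_name], cntr_value AS page_life_expectancy_secs
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND [object_name] LIKE '%Buffer Manager%';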

28 Special case - tempdb
Temp tables AND table variables are created on disk in tempdb
The version store is materialized in tempdb: under snapshot isolation, row versions are written to disk in tempdb, allowing other queries to read previously committed results
Large result-set sorting (ORDER BY) spills to disk in tempdb, turning SELECT queries from pure disk reads (in the user db) into read + write + read
All the cases listed above occur on a per-session basis, so many users can be generating each of these disk IO workloads concurrently
All the cases listed above are highly disk-WRITE oriented: temp table and cursor population, result-set sorting and versioning all WRITE to disk
This often causes significantly higher random, concurrent disk activity than user databases: hard drive disk heads can only be physically in one place at any point in time, so tempdb's random, concurrent, highly write-intensive disk activity can generate enormous queued disk workloads
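One way to see which of these workloads is consuming tempdb is the space-usage DMV (SQL Server 2005+); a minimal sketch:

SELECT SUM(user_object_reserved_page_count)     * 8 AS user_objects_kb,     -- temp tables, table variables
       SUM(internal_object_reserved_page_count) * 8 AS internal_objects_kb, -- sorts, hashes, spools
       SUM(version_store_reserved_page_count)   * 8 AS version_store_kb
FROM tempdb.sys.dm_db_file_space_usage;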

29 Solid State Drives (SSDs)
SSDs are similar in nature to RAM: no physically moving parts, concurrent access, extremely high speed
SSDs are ideal for tempdb, given tempdb's disk-oriented workload
SSDs have a higher mean time between failures than HDDs: they have no moving parts to wear down, whereas HDDs involve physically moving metal at high speed

30 Solid State Drives (SSDs)
Even if an SSD fails, having tempdb on it creates no risk:
tempdb persists no transactional data
tempdb is totally rebuilt upon every restart of SQL Server
even if the device totally fails, tempdb can be relocated to an HDD during a restart of SQL Server
Testing / live results: customer testing and live deployment of SSDs for tempdb alone confirm a significant improvement in system performance – one large-scale financial services online system saw a 19,000% reduction in IO stalls in batch processing

31 Common SQL Server Performance Problems
High CPU Utilization
Identification:
Guess – Task Manager figures
Hunt – Perfmon counters
24x7 – CPU usage by time and statement
Resolution:
Add additional CPUs
Identify statement(s) with high CPU
Move processes to another server or to off-peak times

32 High Disk I/O
Identification:
Guess – Disk drive lights or drive churning
Hunt – Avg. Disk Queue Length, % Disk Time
24x7 – Review IO wait types and consumption
Resolution:
Add additional physical drives
Separate tables, indexes, file groups
Separate databases on physical disks
Appropriate RAID (database 1, 10, 5 – log 1)
Add additional indexes and/or re-index tables

33 Poor Performing Statements
Identification:
Guess – User perception and input
Hunt – Profiler statement analysis
24x7 – Statements by resource, time, user
Resolution:
Review database design and query plans
Review table access order for JOINs
Recommend indexes based on data access
Short transactions with regular commits

34 The Index Impact
Identification:
Guess – User perception and input
Hunt – Review query plans for the entire application
24x7 – Index recommendations
Resolution:
Use the Index Tuning Wizard
Build a CRUD chart to determine needed indexes
Review code to determine the columns in JOIN, WHERE, ORDER BY, GROUP BY, etc. clauses
Leverage the correct index based on needs
Maintain indexes and statistics per object
It's doing "include" fields for us – how many know what that is? What does this NOT show? The impact of inserts/updates/deletes from adding these indexes. If we added all of these indexes right now, would that help? How can we tell? Are there existing indexes we could tweak instead? We could test it by capturing a load test with Profiler, then making our changes on a dev box and replaying those traces with and without our changes. But that's getting into an ACTIVE method of tuning – more on that later.

35 Clustered Indexes
Clustered indexes are the actual physically written records; a SELECT statement with no ORDER BY clause will usually return data in clustered index order. One clustered index per table; 249 non-clustered indexes per table. Highly recommended for every table! Very useful for columns sorted on GROUP BY and ORDER BY clauses, as well as those filtered by WHERE clauses. Putting the clustered index on the primary key of OLTP tables reduces page splits. Don't do this on earlier versions of SQL Server where row-level locking is unavailable, or on SQL2K servers where row-level locking is disabled.
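A sketch of the recommended pattern (table and column names are hypothetical):

-- A monotonically increasing clustered key: new rows append at the end, minimizing page splits
CREATE TABLE dbo.Orders (
    OrderID    int IDENTITY(1,1) NOT NULL,
    CustomerID int NOT NULL,
    OrderDate  datetime NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX PK_Orders ON dbo.Orders (OrderID);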

36 Non-Clustered Indexes
Useful for retrieving a single record or a range of records. Kept in a separate structure that is maintained as changes are made to the base table. Tend to be much narrower than the base table, so they can locate the exact record(s) with much less I/O. Have at least one more intermediate level than the clustered index, and are much less valuable if the table doesn't have a clustered index. Any time you rebuild the clustered index, you also automatically rebuild all non-clustered indexes on the table.

37 Fill Factor
When SQL Server creates indexes, every page is nearly 100% full: no room on the leaf or intermediate pages for INSERTs, UPDATEs, or DELETEs. The default can cause costly page splits on certain tables and promotes table fragmentation. SQL Server lets you specify the amount of free space in leaf pages with FILLFACTOR, an option of the CREATE INDEX statement. You can set a fill factor at the server level or specify it with each index. A good rule of thumb is 75-80%, but it should not be set at the server level, since some tables perform worse with a fill factor of less than 100%. Naturally, this option strongly affects the amount of space that a table and its indexes will consume. However, disk is cheap!
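For example (a sketch; the table and index names are hypothetical):

-- Leave 20% free space on leaf pages to absorb INSERTs between index rebuilds
CREATE NONCLUSTERED INDEX ndx_customer_lname
    ON dbo.Customer (lname)
    WITH FILLFACTOR = 80;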

38 Stored Procedure Optimization
SET NOCOUNT ON improves performance when coding stored procedures, triggers, and functions: it turns off the "N rows affected" verbiage and eliminates messages from the server to the client for each step in a stored procedure.
CREATE PROC xyz
AS
SET NOCOUNT ON
< stored procedure code >
SET NOCOUNT OFF
GO
Mixing DDL and DML operations causes a recompile, and certain operations on temporary tables cause a recompile:
Refer only to temp tables created locally
Don't declare cursors that reference a temp table
Don't create temp tables while in a loop

39 Querying against Composite Keys
Composite keys are only useful from the leftmost column to the rightmost column, in the order they appeared in the CREATE INDEX statement.
Example: CREATE NONCLUSTERED INDEX ndx_foo ON foo(a, b, c, d)
The following WHERE clauses will use ndx_foo fully: WHERE a; WHERE a AND b
The following WHERE clauses will use only part of ndx_foo: WHERE a AND d; WHERE a AND c AND b
The following WHERE clauses invalidate ndx_foo: WHERE b AND c; WHERE b AND a

40 Queries with LIKE
Queries on production systems should NOT use SELECT * FROM… The main reason is that any time the underlying table is changed, all query plans stored in the cache must be rebuilt. The SQL tools allow very quick scripting – so no excuses!
Queries that use the LIKE clause have two simple rules:
LIKE can use indexes if the pattern starts with a character string, such as WHERE lname LIKE 'w%'
LIKE cannot use an index if the pattern starts with a leading wildcard, such as WHERE lname LIKE '%alton'
Avoiding wildcards in the SELECT list (as in SELECT * FROM foo) offers several advantages:
1. reduces network activity, since unneeded columns are not sent to the client
2. improves self-documentation of the code, since unacquainted coders can more easily discern the important data in the query
3. alleviates the need for SQL Server to rebuild the query plan on procedures, views, triggers, functions, or even frequently run queries any time the underlying table structure is changed
4. narrows the query engine to the most pertinent index choices

41 Queries with Functions & Calculations in the WHERE clause
Avoid using functions or calculations on the column in a WHERE clause, because doing so causes SQL Server to ignore any index on the column:
WHERE qty * 12 > 10000
WHERE ISNULL(ord_date, 'Jan 01, 2001') > 'Jan 01, 2001'
Instead, move the function or calculation off the column so the predicate stays a search argument (SARG):
WHERE qty > 10000 / 12
WHERE ord_date IS NOT NULL AND ord_date > 'Jan 01, 2001'

42 Query Tuning
Use SHOWPLAN_TEXT or the graphical query plan to analyze queries, as sketched below. Joins generally perform better than subqueries. Beware of queries that show SCAN but not SEEK operations, and of join queries that show HASH but not NESTED LOOP operations. Remember that constraints put lots of overhead on INSERT and UPDATE statements.
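A usage sketch against the pubs sample database (the plan is returned instead of the result set; the query itself is not executed):

SET SHOWPLAN_TEXT ON;
GO
SELECT a.au_lname, t.title
FROM authors AS a
JOIN titleauthor AS ta ON ta.au_id = a.au_id
JOIN titles AS t ON t.title_id = ta.title_id;
GO
SET SHOWPLAN_TEXT OFF;
GO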

43 Execution Plan Notation
The Good, the Bad and the Ugly:
Table Scans and Index Scans – bad and ugly
Sorts – generally bad and ugly
Hash Joins – bad and ugly
Thick lines coming into an operation with thin lines coming out – bad and ugly: it means SQL Server reads a lot of data up front (high disk I/O) but throws most of it away because of the query clauses, similar to reading a 1 GB file into memory and using only 100 KB of it
Merge Joins – good, when no big sort is required
Index Seeks and Clustered Index Seeks – good
Nested Loop Joins – good
Bookmark Lookups – "it depends": high-percentage bookmark lookups are bad if your query is already running slow, so consider covering indexes
Join conditions:
Nested loops are used when one of the inputs is smaller than the other – extremely effective
Merge joins are used when both inputs are roughly the same size; they require presorted data and therefore can be dangerous
Hash joins are used to process unsorted data using in-memory hashing – generally the slowest way
The execution plan is the strategy determined by the SQL query optimizer, and it can be influenced by developers. Key decisions made in the plan: which indexes to use, how to perform JOIN operations, how to order and group data, in what order tables should be processed, and whether cached data and previously compiled plans can be reused.
The database developer's role in the query optimization process is to: apply an iterative process of changing queries and database objects so the query optimizer can make better decisions, and make them faster; identify plan deficiencies and make changes to force SQL Server to correct them; correct one problem at a time; capture performance statistics "before" and "after", using a database load-testing tool to test applications with production-like data loads before rolling them out; and, as the amount of data increases, proactively monitor application performance to catch problems early and correct them before the customer sees them – working with the DBA team.
SQL Server performs sort, intersect, union, and difference operations using in-memory sorting and hash join technology. Using this type of query plan, SQL Server supports vertical table partitioning, sometimes called columnar storage. SQL Server employs three types of join operations: nested loops joins, merge joins, and hash joins. If one join input is quite small (such as fewer than 10 rows) and the other is fairly large and indexed on its join columns, index nested loops are the fastest join operation, because they require the least I/O and the fewest comparisons (see Understanding Nested Loops Joins). If the two join inputs are not small but are sorted on their join column (for example, if they were obtained by scanning sorted indexes), merge join is the fastest join operation. If both join inputs are large and of similar sizes, merge join with prior sorting and hash join offer similar performance; however, hash joins are often much faster if the two input sizes differ significantly from each other (see Understanding Merge Joins).
Hash joins can process large, unsorted, nonindexed inputs efficiently. They are useful for intermediate results in complex queries, because intermediate results are not indexed (unless explicitly saved to disk and then indexed) and often are not produced suitably sorted for the next operation in the plan, and because query optimizers estimate only intermediate result sizes. Since such estimates can be an order of magnitude wrong in complex queries, algorithms that process intermediate results must not only be efficient but must also degrade gracefully if an intermediate result turns out to be much larger than anticipated. Hash joins also reduce the need to denormalize (denormalization is typically used to achieve better performance by reducing join operations, despite the dangers of redundancy, such as inconsistent updates), and they make vertical partitioning (representing groups of columns from a single table in separate files or indexes) a viable option for physical database design (see Understanding Hash Joins).

44 Optimization Process Spiral
Poorly written queries and a flawed database schema:
Missing indexes
Unnecessary joins
Too much data returned (SELECT *)
Unnecessary ORDER BY clauses
Inadequate disk performance:
Disk I/O can't keep up with the needs of the relational engine
Disk fragmentation of db files (sometimes called external fragmentation)
Index- and data-page fragmentation within db files (sometimes called internal fragmentation)
Memory pressure and low page life expectancy:
Frequently accessed data pages cannot fit in SQL Server memory, which causes more disk I/O
Low cache-hit ratio and page life expectancy
Long-running queries (blocking and locking):
Reports or massive batch inserts or updates
Long-running transactions
Parsing is the step during which the syntax of the statement is validated and clauses are converted into internal compiler structures, producing an execution tree: a structure that describes the logical steps needed to transform the source data into the format required by the result set. Normalization is the step during which objects are verified and bound, views are replaced with their definitions, and implicit type conversions are performed (when column/variable types allow them). Optimization is the most important step, during which the execution plan (the optimized, final version of the execution tree) is formed; the execution plan is a detailed strategy of query execution – see the next slides for details.
Execution plans are cached in a specially allocated buffer called the procedure cache and are reused: if the query engine finds a suitable execution plan already cached, it uses it, and by the same token, "aged" execution plans are ejected from the cache. Note that the percentage of memory allocated to the procedure cache fluctuates with the number of plans that must be kept in memory, so having too many execution plans (a common scenario when raw SQL statements are used) may cause SQL Server to start ejecting data and index pages from cache, which is not good.
After that, the relational engine begins executing the execution plan. As steps that need data from the base tables are processed, the relational engine uses OLE DB to request that the storage engine pass up the data from the requested rowsets.

45 Inside SQL Server Query Optimization
Source: the Inside SQL Server book.
Phase 1: Trivial Plan Optimization. Cost-based optimization is expensive to do when there really is only one viable plan for the SQL statement. Example 1: a query consisting of an INSERT statement with a VALUES clause – there is only one possible plan. Example 2: a SELECT statement where all the columns are in a unique covering index and no other index has that set of columns. The trivial plan optimizer finds the really obvious plans, which are typically very inexpensive, so the optimizer doesn't spend a lot of time searching for a good plan.
Phase 2: Syntactical Transformations. Looking for commutative properties and operations that can be rearranged, constant folding, and other operations that don't require looking at the cost or analyzing indexes.
Phase 3: Full Optimization. SQL Server loads the statistics information on indexes and columns, then enters the final major part of optimization: the cost-based optimizer.

46 Optimization Techniques
Join conditions:
Nested loops are used when one of the inputs is smaller than the other – extremely effective
Merge joins are used when both inputs are roughly the same size; they require presorted data and therefore can be dangerous
Hash joins are used to process unsorted data using in-memory hashing – generally the slowest way
Execution plan: Hash Match vs. Stream Aggregate
A Hash Match aggregate is similar to a hash join
A Stream Aggregate requires its input to be sorted by the columns forming the GROUP BY
Stream Aggregate is faster
(The detailed join-selection guidance on slide 43 applies here as well.)

47 Reduce Contention
Keep transactions short:
Don't get user input in the middle of a transaction
Process all rows at once
Good indexes:
Reduce the time needed to identify rows to update
Allow more granular locks
Monitor locks and deadlocks: Enterprise Manager, syslockinfo, sysprocesses, trace
Manage locks and deadlocks: balance deadlocks against performance

48 Deadlocks
Choose appropriate isolation levels
Cyclic deadlocks: ensure consistent update sequences
Conversion deadlocks: serialize access (UPDLOCK hint), as sketched below
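A sketch of the UPDLOCK pattern (table and column names are hypothetical). Taking the update lock on the initial read keeps two sessions from both acquiring shared locks and then deadlocking when each tries to convert to an exclusive lock:

DECLARE @balance money;
BEGIN TRAN;
    SELECT @balance = Balance
    FROM dbo.Account WITH (UPDLOCK)  -- serializes access: a second session waits here
    WHERE AccountID = 42;
    UPDATE dbo.Account
    SET Balance = @balance - 100
    WHERE AccountID = 42;
COMMIT TRAN;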

49 Missing Index Query
How many indexes are there already? How many rows are in the table? Is it write-intensive? Do we have fast enough storage for our writes?
Query to identify missing indexes:
SELECT mid.statement AS [database.scheme.table],
    mic.column_id, mic.column_name, mic.column_usage,
    migs.user_seeks, migs.user_scans,
    migs.last_user_seek, migs.avg_total_user_cost, migs.avg_user_impact
FROM sys.dm_db_missing_index_details AS mid
CROSS APPLY sys.dm_db_missing_index_columns(mid.index_handle) AS mic
INNER JOIN sys.dm_db_missing_index_groups AS mig
    ON mig.index_handle = mid.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats AS migs
    ON mig.index_group_handle = migs.group_handle
ORDER BY mig.index_group_handle, mig.index_handle, mic.column_id
GO

50 Database Tuning Advisor
The Index Tuning Wizard is invoked in SQL Enterprise Manager via Tools >> Wizards >> Management >> Index Tuning Wizard. We feed it a load – it could be a single query, or it could be a trace file. It actively runs these queries against our server, and it changes schema objects while it works, to figure out which indexes and statistics will be the fastest.
The Index Tuning Wizard can consume significant CPU resources on the server where it is run, so you might want to: A) avoid running it on production servers, B) run it on a separate computer, C) run it on small subsets of the tables in the database, and/or D) disable the "Perform thorough analysis" option.
Rename each index: if we follow its recommendations and just hit Apply, this is what we get – a bunch of new indexes we can't identify:
CREATE NONCLUSTERED INDEX [_dta_index_Activity_11_ __K1_K4_K7_K5_K3]
ON [dbo].[Activity] (
    [ServerName] ASC,
    [ActivityTypeID] ASC,
    [StatusTypeID] ASC,
    [StartTime] ASC,
    [DatabaseID] ASC
)

51 Index Tuning
Passive tuning with DMVs vs. active tuning:
Don't just click Apply
Use smart names
Look for overlaps
Go passive first
DMFs & DMVs, new with SQL Server 2005, gather information continuously (the data does disappear with restarts), so you can walk in and start tuning immediately with little preparation. When we get a list of index recommendations from the wizard, we can compare it against our schema to see what we've got and what we need to add.
-- Possibly bad indexes (writes > reads); the variable name @dbID is assumed
DECLARE @dbID int;
SET @dbID = db_id();
SELECT 'Table Name' = object_name(s.object_id),
    'Index Name' = i.name,
    i.index_id,
    'Total Writes' = user_updates,
    'Total Reads' = user_seeks + user_scans + user_lookups,
    'Difference' = user_updates - (user_seeks + user_scans + user_lookups)
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
    ON s.object_id = i.object_id
    AND i.index_id = s.index_id
WHERE objectproperty(s.object_id, 'IsUserTable') = 1
    AND s.database_id = @dbID
    AND user_updates > (user_seeks + user_scans + user_lookups)
ORDER BY 'Difference' DESC, 'Total Writes' DESC, 'Total Reads' ASC

52 SQL 2008 Data Compression
Estimating compression: sp_estimate_data_compression_savings takes @schema_name, @object_name, @index_id, @partition_number and @data_compression.
Index compression drawbacks: Enterprise Edition only, no inheritance, no automation.
You can pick which individual indexes you want to compress. The smaller they are, the faster they're read off disk. How much faster? SQL gives us a tool to find that out.
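A usage sketch (the table name is hypothetical; NULLs cover all indexes and partitions):

EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'Activity',
    @index_id         = NULL,    -- all indexes
    @partition_number = NULL,    -- all partitions
    @data_compression = 'PAGE';  -- or 'ROW' / 'NONE'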

53 Index Defragmentation Best Practices
Table fragmentation is similar to hard disk fragmentation caused by frequent file creation, deletion and modification. Database tables and indexes need occasional defragmentation to stay efficient. The most efficient allocation for read-heavy tables is when all pages occupy a contiguous area in the database, but after weeks of use a table may become scattered across the disk drive; the more pieces it is broken into, the less efficient the table becomes.
As Microsoft SQL Server 2000 maintains indexes to reflect updates to their underlying tables, these indexes can become fragmented. Depending on workload characteristics, this fragmentation can adversely affect workload performance. Microsoft's white paper on the topic helps you determine whether you should defragment table indexes to benefit workload performance, and compares two of the statements SQL Server provides for the purpose: DBCC DBREINDEX and DBCC INDEXDEFRAG.
T-SQL code (the original script was damaged in transcription; the @-variable names and the dynamic-SQL rebuild step below are reconstructed and assumed):

/* Determine which indexes to defrag using our user-defined parameters */
INSERT INTO #indexDefragList
SELECT database_id AS databaseID
    , QUOTENAME(DB_NAME(database_id)) AS 'databaseName'
    , [OBJECT_ID] AS objectID
    , index_id AS indexID
    , partition_number AS partitionNumber
    , avg_fragmentation_in_percent AS fragmentation
    , page_count
    , 0 AS 'defragStatus' /* 0 = unprocessed, 1 = processed */
    , Null AS 'schemaName'
    , Null AS 'objectName'
    , Null AS 'indexName'
FROM sys.dm_db_index_physical_stats (@databaseID, Null, Null, Null, @scanMode)
WHERE avg_fragmentation_in_percent >= @minFragmentation
    And index_id > 0   -- ignore heaps
    And page_count > 8 -- ignore objects with less than 1 extent
OPTION (MaxDop 1);

/* Grab the most fragmented index first to defrag */
SELECT TOP 1 @objectID = objectID
    , @indexID = indexID
    , @databaseID = databaseID
    , @databaseName = databaseName
    , @fragmentation = fragmentation
    , @partitionNumber = partitionNumber
    , @pageCount = page_count
FROM #indexDefragList
WHERE defragStatus = 0
ORDER BY fragmentation DESC;

/* If the index is heavily fragmented and doesn't contain any partitions or LOBs, rebuild it.
   Online rebuilds and MaxDop restrictions require Enterprise Edition. */
SET @sqlCommand = N'Alter Index ' + @indexName + N' On ' + @databaseName
    + N'.' + @schemaName + N'.' + @objectName
    + N' Rebuild With (Online = On, MaxDop = ' + CAST(@maxDopRestriction AS VARCHAR(2)) + N')';
EXECUTE sp_executesql @sqlCommand;

Reported result: 13% to 460% faster.

54 DBCC SHOWCONTIG
Use either the table name and index name, or the table ID and index ID numbers:
DBCC SHOWCONTIG ( [Order Details], OrderID )
GO
Results:
DBCC SHOWCONTIG scanning 'Order Details' table...
Table: 'Order Details' ( ); index ID: 2, database ID: 6
LEAF level scan performed.
- Pages Scanned : 5
- Extents Scanned : 2
- Extent Switches : 1
- Avg. Pages per Extent : 2.5
- Scan Density [Best Count:Actual Count] : 50.00% [1:2]
- Logical Scan Fragmentation : 0.00%
- Extent Scan Fragmentation : 50.00%
- Avg. Bytes Free per Page : -
- Avg. Page Density (full) : 74.52%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Object names can include a table name, table ID, view name or view ID (where an index exists on the view), and/or an optional index name or index ID. The WITH option controls how much data comes back:
- FAST skips the leaf (data) level read and returns minimal information: Pages Scanned, Extent Switches, Scan Density [Best Count:Actual Count], Logical Scan Fragmentation.
- TABLERESULTS returns the data in table format (you could then store it in a temp table). It also returns a few additional columns: ExtentSwitches, AverageFreeBytes, AveragePageDensity, ScanDensity, BestCount, ActualCount, LogicalFragmentation, ExtentFragmentation.
- ALL_INDEXES returns data for all indexes on a table, even when an individual index is identified.
- ALL_LEVELS (only usable with TABLERESULTS, not with FAST) produces results for each level of the index processed; otherwise only the index leaf level or table data level is processed.
There are several especially important points to check:
Pages Scanned – the number of database pages used by the table (when you specify an indid of 1 or 0) or a non-clustered index (when you specify an indid > 1).
Extent Switches – all pages of a table or index are linked into a chain; access to the table or index is more efficient when all pages of each extent are linked together into a segment of this chain. The DBCC command scans the chain of pages and counts the number of times it has to switch between extents; if the number of extent switches exceeds the number of pages divided by 8, there is room for optimization. Extents switched compared with extents scanned gives the scan density value.
Avg. Pages per Extent – space for each table is reserved in extents of 8 pages. Some pages are unused, because the table has never grown to use them or because rows have been deleted from a page. The closer this number is to 8, the better; a lower number indicates many unused pages, which decrease the performance of table access.
Scan Density [Best Count:Actual Count] – shows how contiguous the table is. The closer the number is to 100%, the better; anything less than 100% indicates fragmentation. Best Count shows the ideal number of extent switches that could be achieved on this table; Actual Count shows the actual number of extent switches.
Logical Scan Fragmentation – the percentage of out-of-order pages returned from scanning the leaf pages of an index. This reading is not relevant for heaps (tables without indexes of any kind) and text indexes. A page is considered out of order when the next page in the Index Allocation Map (IAM) is different from the page indicated by the next-page pointer in the leaf page.
Extent Scan Fragmentation – the percentage of out-of-order extents in scanning the leaf pages of an index, excluding heaps. An extent is considered out of order when the extent containing the current index page is not physically next after the extent holding the previous index page. Logical and extent scan fragmentation should be as low as possible; extent scan fragmentation will usually be higher, and a logical scan fragmentation of 0% to 10% is usually acceptable.
Avg. Bytes Free per Page – the average number of free bytes per page used by the table or index. The lower the number, the better; high numbers indicate inefficient space usage. The highest possible amount of free space (the size of a database page minus overhead; 2014 on SQL 7.0) or a close number will be displayed for empty tables. For tables with large rows this number may be relatively high even after optimization: for example, if the row size is 1005 bytes, only one row fits per page, and DBCC will report average free space of about 1005 bytes, but don't expect another row to fit into the same page – a 1005-byte row would also need additional room for row system overhead.
Avg. Page Density (full) – how full an average page is; numbers close to 100% are better. This number is tied to the previous one and depends on the row size as well as on the clustered index fill factor. Transactions performed on table rows change this number, because they delete, insert or move rows around by updating keys.
SQL BOL, under the topic DBCC SHOWCONTIG, shows a good way to defragment all indexes in a database that are fragmented above a declared threshold.

55 DBCC INDEXDEFRAG
DBCC INDEXDEFRAG is a great way to rebuild the leaf level of an index in one step:
Performs online index reconstruction
Can be interrupted without losing the work already completed
Fully logged
Can take longer than rebuilding the index, and is not quite as effective
Syntax:
DBCC INDEXDEFRAG ( { database | 0 }, { table | 'view' }, { index } ) [ WITH NO_INFOMSGS ]
When specifying which database, table, view, or index to defragment, you may use either the name of the object or its object ID. (A zero instead of the database name or database ID means the current database.) For example:
DBCC INDEXDEFRAG (Pubs, Authors, Aunmind)
GO
Results: Pages Scanned, Pages Moved, Pages Removed

56 DBCC DBREINDEX
DBCC DBREINDEX was introduced in version 7.0 to enable DBAs to rebuild indexes without having to drop and recreate PRIMARY KEY and UNIQUE constraints:
Locks the table for the duration of the operation
Can offer additional optimizations over a series of individual DROP INDEX and CREATE INDEX statements on a single table
Syntax:
DBCC DBREINDEX ( ['database.owner.table_name' [, index_name [, fillfactor]]] ) [ WITH NO_INFOMSGS ]
If either index_name or fillfactor is specified, all preceding parameters must also be specified.
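For example, to rebuild all indexes on pubs..authors with a fill factor of 80 (an empty string for the index name means all indexes):

DBCC DBREINDEX ('pubs.dbo.authors', '', 80)
GO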

57 Implement Best Practices
Be sure to set "Maximize throughput for network applications." Make sure PAGEFILE.SYS is adequately sized; add additional pagefiles on separate physical drives and/or segregate them from SQL Server files. Tempdb is too small by default, and its automatic file growth increment is much too small by default; in high-OLTP systems it should be on a physically separate, fast I/O system.
Create additional pagefile.sys files, one on each separate physical drive (except the drive containing the Windows NT system directory). Spreading paging files across multiple disk drives and controllers improves performance on most disk systems, because multiple disks can process input/output requests concurrently.
If you have a lot of RAM, you can configure your Windows NT server to never page out drivers and system code that are in the pageable memory area: run regedit, open HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management, set DisablePagingExecutive to 1, and reboot the server box.
Set the "Maximize Throughput for Network Applications" option. This can increase SQL Server performance, because Windows NT will allocate more RAM to SQL Server than to its file cache. To set this option: 1. Double-click the Network icon in Control Panel. 2. Click the Services tab. 3. Click Server to select it, and then click the Properties button. 4. Click Maximize Throughput for Network Applications, and then click OK. 5. Restart the computer.
Allow the tempdb database to automatically expand as needed; this ensures that queries generating larger-than-expected intermediate result sets stored in tempdb are not terminated before execution is complete. Set the original size of the tempdb files to a reasonable value, to avoid the files automatically expanding as more space is needed; if tempdb expands too frequently, performance can be affected. Set the file-growth increment percentage to a reasonable size, to prevent the tempdb files from growing by too small a value: if the growth increment is too small compared to the amount of data being written to tempdb, tempdb may need to expand constantly, affecting performance. Place tempdb on a fast I/O subsystem to ensure good performance, stripe it across multiple disks, and place it on disks different from those used by user databases.

58 Automate DBA Maintenance Tasks
A daily task to perform DBCC checks and dump each of the major system databases
A weekly task to reinitialize the indexes and restore fill factors on major databases
A nightly task to update index statistics
A nightly task to update the sysindexes table of each database
These tasks are superior to the Database Maintenance wizard – but at least run the DB Maintenance wizard if nothing else.

59 Unknown SQL Server Changes
Identification:
Guess – Broken application
Hunt – Query sysobjects
24x7 – Schema change report
Resolution:
Appropriate security based on duties
Solidified change management process
Open communication among the team

60 SQL Server Trending
Identification:
Guess – Change in user complaints
Hunt – Perfmon and Profiler changes
24x7 – Performance metrics over time
Benefits:
Proactive approach for future planning
Justification for hardware and software
Capacity planning

61 Evaluate objects
Tools
Working with a baseline
Working without a baseline
Don't fix anything yet!

62 Gather a baseline
Working with a baseline: collect data when the problem doesn't exist; gather a lot of detail.
Working without a baseline: start broad and zero in on problems; look at broader counters (e.g. CPU performance).

63 Evaluate objects
Tools
Working with a baseline
Working without a baseline
Don't fix anything yet!

64 Interpret findings
Gather subject matter experts – you can't do it all, so don't try
Gather their thoughts: make everyone come up with what they think; agree on common interpretations; don't sweat the small stuff; table differences
Don't fix anything yet!

65 Create an action plan
Decide on the fixes
Decide who should implement them
Weigh risks and rewards
Detail timelines
Create a backup plan
Implement
Monitor for change, and report

66 SQL Server Performance Tuning Process Automation
Three approaches: Educated Guess (manual), Hunt and Peck (manual), and a 24x7 tool set (automated).
Educated Guess
Users notify the Help Desk of system issues
The Help Desk scrambles IT to find the problem
IT frantically searches for the problem: network, Windows, SQL Server, front-end application, logs, etc.
Unable to find the issue, IT reports back to the Help Desk; users escalate to management
IT monitors for symptoms and makes changes to benefit the users, but cannot validate them
Problem = lack of information
Hunt and Peck
Ask users where problems exist
Monitor SQL Server to capture data
Review data to determine issues
Change SQL Server based on the data: re-design, code changes, indexing, etc.
Monitor to determine improvement
Problem = information overload
24x7 Performance Monitoring
Install, configure and baseline
Review data from integrated tools
Current and historical view of the system
Proactively and intuitively review
Focus the IT team on the objects requiring the most resources
Correct and validate improvement
People / Process / Technology: the manual approaches involve the entire company, a reactive approach, and no tools (or disjointed tools); the automated approach involves the entire company, a proactive approach, and integrated tools.

67 Performance Tuning Redefined with SQL 2008
Data collection: collection sets, system collection sets, reports
Performance and diagnostics monitoring
Management Data Warehouse: historical and baseline comparisons
Policy-based management, troubleshooting and tuning
This covers the Performance Studio concepts (data collection, management data warehouse), how to monitor and troubleshoot performance issues using Performance Studio, and the new performance monitoring features in SQL Server 2008.
Performance Studio is a framework that ties together collection, analysis, troubleshooting, and persistence of SQL Server diagnostics information. It consists of a suite of tools for low-overhead data collection; performance monitoring, troubleshooting and tuning; persistence of diagnostics data; and reporting. Short-term goal: provide enhanced data collection and reports out of the box.
In many cases, when a problem occurs, you get a call later that same day or even the next day saying, "There is a problem, we don't know what's happening, but could you please look into it?" To correctly fix the issue, you need the ability to centrally store performance data so you can go back in time, see exactly what happened during that period, and, hopefully, figure out what the problem was. With Performance Studio you can also use the performance data to analyze and write policies to prevent future issues – for example, a policy that says: if CPU utilization goes over 85 percent for more than 15 minutes, take this action or enable a specific type of data collection.
Server side – Data Collector: an extensible data collection infrastructure. It includes out-of-the-box data collections required to identify and troubleshoot the most common problems for the relational engine; it supports the SQL Server relational engine only, but other SQL Server services can be added in the future.
Server side – Management Data Warehouse: a data repository for baseline and historical comparisons, with aggregated reporting for multiple SQL Server instances.
Client side – data collection configuration UI: management data warehouse properties, general data collection properties, collection set configuration; a SQL Server dashboard based on the system collection set reports; performance monitoring and troubleshooting; historical data analysis based on warehouse information.
Data Collector concepts: data collection should be always on and have low overhead. Overhead is a tricky question – for some people anything above zero is overhead, while for others 5 percent is OK; the overhead level is up to you. Out of the box, a lot of testing has been done (running against TPC-C and other benchmarks) to ensure that the basic overhead of the system collection sets stays below 5 percent. Collection is disabled by default, so you have to enable it and run the collection sets if you want data collection on.
Data provider: a source of information (for example, SQL Trace, Perfmon counters, DMVs, T-SQL queries, logs).

68 Performance Tuning Best Practices
Focus on performance needs from project scoping through maintenance
Design and develop for high performance: hardware, Windows, SQL Server and application
System baseline with ongoing comparisons
Monitor, analyze, alert and report
Solidified change management process
Properly designate permissions based on duties
Work towards continuous improvements

69 Methodology review
Gather component list
Evaluate objects
Interpret findings
Create an action plan


71 Thank You!
SearchSQLServer.com – Performance and Tuning
InformIT.com (click on Reference Guides, then SQL Server)
SQL-Server-Performance.com
Books with excellent performance tuning content:
"SQL Server Query Performance Tuning Distilled", Sajal Dam
"SQL Server 2005 Performance Tuning", various authors
"Guru's Guide to SQL Server Architecture & Internals", Ken Henderson
"SQL Server 2005 Practical Troubleshooting", Ken Henderson

