1
Tuna Helper Proven Process for SQL Tuning Dean Richards Senior DBA, Confio Software
2
Tuna Helper – Proven Process for SQL Tuning
Question: How many people came to this presentation because of the title? If they don't like it or seem not to care, make a joke about it coming from the marketing department. Many people who undertake SQL tuning projects look for a magic bullet in the form of tools. While those tools can help tune simple SQL statements, the complex ones that contain unions, subqueries, outer joins, etc. give tools problems. I have used many of the tools out there and have been less than impressed. Question: Does anyone here use a tool for SQL tuning? Which ones? How many would say those tools work 1) very well; 2) okay; 3) not well at all? The goal of this presentation is to give you a process that will help you tune SQL statements. Instead of relying on tools that do not work well anyway, I suggest you learn how to perform SQL tuning yourself. Many people fail when it comes to tuning, so if you can become good at it, your standing with the people around you will undoubtedly rise. "Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime." – Chinese Proverb
3
Who Am I? Senior DBA for Confio Software
Current – 20+ Years in Oracle, SQL Server Former – 15+ Years in Oracle Consulting Specialize in Performance Tuning Reviewed Performance of 100s of Databases for Customers and Prospects Common Thread – Paralyzed by Tuning
4
Agenda Introduction Challenges Identify - Which SQL and Why
Gather – Details about SQL Tune – Case Study Monitor – Make sure it stays tuned SQL Tuning Myths
5
Introduction SQL Tuning is Hard This Presentation is an Introduction
3-5 day detailed classes are typical Providing a Framework Helps develop your own processes There is no magic tool Tools cannot reliably tune SQL statements Tuning requires the involvement of you and other technical and functional members of team The reason most tools do not work, and the reason people fail at SQL tuning projects, is quite simple: it's hard! Many people make their living training others to do SQL tuning well, and those classes are usually 3-5 days in length. If it were easy, there would be no performance issues. We are by no means going to make you an expert in the one hour we have today, but hopefully I will present a framework you can use to become a better SQL tuner. Based on this framework you can start to build your own process. As I mentioned before, there is no magic bullet or tool that you can point at your database and, voila, the database is tuned. Becoming an expert SQL tuner takes time and practice, so if you're scared by the prospect of undertaking a tuning project, just go for it. You will never learn by sitting on the sidelines and doing the same old things. When you start, make sure you are not standing on an island by yourself. Most likely you did not write the application or the SQL statements that are killing the database. One of the biggest mistakes I see is the DBA who works in isolation when tuning. It's their nature – "I'm smart and I can do this by myself", or more likely "I'm a recluse and I don't want to include anyone else". However, it's always better to include other technical people who are familiar with the application. If the app was custom built by your company, work with the developer who initially wrote it. Work with the people who designed the app. Work with the business people to understand what they do in the application when the performance problem occurs. All of these people are typically more than willing to help, because they will benefit from a faster application.
6
Challenges Requires Expertise in Many Areas Tuning Takes Time
Technical – Plan, Data Access, SQL Design Business – What is the Purpose of SQL? Tuning Takes Time Large Number of SQL Statements Each Statement is Different Low Priority in Some Companies Vendor Applications Focus on Hardware or System Issues Never Ending There are many challenges as well. I already said this, but it's worth saying again – SQL tuning is difficult. To do it correctly, you or the people you get on your team need to be very familiar with many different areas. From a technical standpoint, you need to understand execution plans: how is Oracle executing the statement, how is the data being accessed. You also need to be familiar with SQL design concepts, because sometimes tuning the SQL means rewriting it. When you work with the end users, understand how the application is used and why they do things the way they do. Why do you fill in those fields? Why do you use this screen? Understanding the purpose of the SQL and the app will help you make better decisions down the road. Many people are impatient, but SQL tuning takes time. That's why working with other people is also about putting a face on the project. Instead of the users saying something like "I've been complaining about this problem for weeks and the DBA team says they're doing something, but no one knows what", they might say "I've been working with Bob from the DBA group and he definitely asks a lot of questions and seems to care about my experience." Another huge challenge is the sheer number of SQL statements: how do you know for sure you are working on the right one? I'll talk about this in more detail later, but this is also where the end users come in and can help you focus on the correct things. Instead of worrying about 100s or 1000s of SQLs, worry about the ones that affect this user, this screen, in this application. Tuning is also difficult because every SQL statement is different.
Just because you solved the last problem in 30 minutes by tuning a certain way does not mean the next project is that easy or can be tuned the same way. Plus, some companies simply do not care about performance: as long as the app gives the correct results, they don't care as much how fast it is. Also, some people get used to the way things work – "I always press this button first thing in the morning and then go get coffee, because I know it takes an hour." Bad performance becomes a way of life, and sometimes people even get mad when the app comes back in 20 seconds and they can't walk around interrupting everyone else with what their kids did last night. SQL tuning can be never ending. There always seems to be a next problem. Once you tune something, other people start coming out of the woodwork wanting their app tuned as well. But that's a good thing for you.
7
Identify – Which SQL Tracing a Session / Process
User / Batch Job Complaints Highest I/O (LIO, PIO) SQL Performing Full Table Scans Known Poorly Performing SQL Highest Wait Times (Ignite, AWR, etc) Once you get back to the office and want to tune something, where do you start? Don't just pick a statement that looks interesting; have some method for choosing the SQL. That could come from talking with users and understanding their complaints. Maybe it stems from a batch job that keeps running longer and longer. It could also come from a tracing session that you set up because of user complaints. Have some reason for tuning, even if it's because the statement is the number one SQL in the database from a wait time perspective, or it does the most LIOs on a daily basis. Or: I see a lot of full table scans occurring and here are the top ones. Maybe it's a known poor performer that comes from a developer asking for your help. Whatever it is, have a method for picking the SQL statements. At Confio we believe the best measure is end user wait time, which is what our tool Ignite is all about, but the other methods are also valid.
8
Identify – End-to-End Business Aspects Technical Information
Who registered yesterday for SQL Tuning Why does the business need to know this How often is the information needed Who uses this information Technical Information Review ERD Understand tables and the data (at a high level) End-to-End Process Understand application architecture What portion of the total time is database Where is it called from in the application When I mention end-to-end, most people think of the performance of the application from the web browser or client application, through the application server, and down to the database. That is technically correct, but from a business perspective you should also know the SQL end-to-end. What I mean by that is: understand what the SQL is used for, why the business needs to know this information, how often this data is needed, and who consumes the information. Also understand other technical aspects of the statement. Review the ERD to understand the relationships of the tables being used in the query. It's helpful (I think it should be a requirement) to get the big picture when jumping into a tuning project. When that user complains, how do you know the problem is with the database? The problem could be in the Java application tier. Even if you tuned the worst performing SQL statement for that process, you may not make much of a difference, because only 10% of the total time was spent in the database. Vice versa, if you know that 90% of the time is spent in the database, and that most of that time is spent executing a specific SQL statement that's performing a full table scan (wait events will be discussed soon), you can start to make predictions about the performance gains. To get this type of information you will probably need a tool of some sort, but there are also ways to get this information in other ways, via debugging, logging times to files, and code instrumentation, among many others.
Neither I nor Confio has a complete answer for this data, but I feel it is very important to ensure the correct things are being tuned.
9
Identify – End-to-End Time
10
Wait Event Information
V$SESSION: SID, USERNAME, SQL_ID, PROGRAM, MODULE, ACTION, PLAN_HASH_VALUE
V$SESSION_WAIT: SID, EVENT, P1, P1RAW, P2, P2RAW, P3, P3RAW, STATE (WAITING, WAITED…) – Oracle 10g added this info to V$SESSION
I already mentioned that I feel wait event information is critical when you jump into SQL tuning. When SQL executes, Oracle has instrumented its code to report what a SQL statement is waiting on. If it runs for 3 minutes, where did it spend its time? This is where V$SESSION and V$SESSION_WAIT tell you what's happening. You can also get this information from tracing, which is a great way to get detailed information about a problem that happens at a known time, that is, when you can predict it will happen.
Related views: V$SQL (SQL_ID, SQL_FULLTEXT), V$SQLAREA (SQL_ID, EXECUTIONS, PARSE_CALLS, BUFFER_GETS, DISK_READS), V$SQL_PLAN (SQL_ID, PLAN_HASH_VALUE), DBA_OBJECTS (OBJECT_ID, OBJECT_NAME, OBJECT_TYPE)
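As a sketch of how these views fit together (the column names are from the slide; on 10g and later the wait columns also appear directly in V$SESSION, so one query suffices):

```sql
-- Active sessions and what they are waiting on (Oracle 10g+,
-- where the V$SESSION_WAIT columns were merged into V$SESSION).
SELECT sid, username, sql_id, program, module,
       event, state, p1, p2, p3, seconds_in_wait
FROM   v$session
WHERE  username IS NOT NULL
  AND  status = 'ACTIVE';
```

On 9i, join V$SESSION to V$SESSION_WAIT on SID to get the same picture.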
11
Wait Time Scenario Which scenario is worse? SQL Statement 1
Executed 1000 times Caused 10 minutes of wait time for end user Waited 99% of time on "db file sequential read" SQL Statement 2 Executed 1 time Caused 10 minutes of wait time for end user Waited 99% on "enq: TX – row lock contention" Speaking of wait time and wait events, I have a little test for you. Which of these scenarios is worse? SQL statement 1, which executed 1000 times, made the end users wait for 10 minutes, and waited 99% of the time on "db file sequential read"? Or SQL statement 2, which executed 1 time, made the end user wait for 10 minutes, and spent 99% of its time waiting on a locking problem? The answer is that both are equally bad: they both made the end user wait for 10 minutes. End users don't care what they waited for, only that it took 10 minutes. SQL statement 2 may be harder to tune, because locking problems are typically application design issues, but both are equal in the eyes of your customer, the user.
12
Identify – Simplification
Break Down SQL Into Simplest Forms Complex SQL becomes multiple SQL Sub-Queries Should be Tuned Separately UNION'ed SQL Tuned Separately Get the definition of views Are synonyms being used So far we have identified: 1) the users or business impact of the performance issue; 2) an end-to-end view of the process that specifies the layer of the app where performance is suffering; 3) the SQL statements being executed in the database. So we now have some things to work on. These statements are not typically "select * from table1 where col1 = X"; they are usually much more difficult. They may be pages long and include subqueries, unions, NOT EXISTS, and all sorts of things. The first step I take is to simplify. Break the statement down into manageable pieces that you can comprehend and get your arms around. A complex SQL statement may become 5 smaller SQL components that are simpler. Tune each of these separately. If you have 5 statements all UNIONed together, tune each on its own. If views are used, get the definitions and tune the views separately as well. If synonyms are used, know where they point. I'll talk about this more later when we dive into execution plans.
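To sketch the view and synonym part of this step, the data dictionary can supply both definitions (the view name here is the one that comes up later in this deck; adjust names to your schema):

```sql
-- Definition of a view referenced by the statement
SELECT text
FROM   user_views
WHERE  view_name = 'ACTIVE_REGISTRATIONS';

-- Where a synonym points
SELECT table_owner, table_name, db_link
FROM   user_synonyms
WHERE  synonym_name = 'REGISTRATION';
```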
13
Identify – Summary Determine the SQL Understand End-to-End
Measure Wait Time Simplify Statement
14
Gather - Metrics Get baseline metrics
How long does it take now What is acceptable (10 sec, 2 min, 1 hour) Collect Wait Time Metrics – How Long Locking / Blocking I/O problem Latch contention Network slowdown May be multiple issues All have different resolutions Document everything in simple language The next phase is the gathering section, where we collect metrics about the SQL statement. This should include the following items: How long does the statement take now? What is acceptable to the end users? If they want the query to return in 1 second and it's reading tons of data, you may have to stop the tuning project immediately, because you will not satisfy this person's expectations. This also helps you know when to stop tuning. Collect wait time information, because all performance problems are not created equal. If you find a query waiting on locking/blocking wait events, you know that reducing the end user wait time is not about tuning the SQL statement. Fixing locking issues is usually an application design matter, and in this case you should become very good friends with the developers and the people who designed the application. If you find a statement waiting on I/O wait events, e.g. db file sequential read or db file scattered read, this also helps you get to the next step. Waits on db file scattered read indicate full table scans, while waits on db file sequential read typically indicate inefficient indexes being used for the query. If you have latch contention, you have to figure out the exact latch and understand why it is so popular. If you see network waits, understand whether the statement is returning a lot of data, either in the form of many rows or large columns like LOBs. There could also be real network latency between the database and the client, and in that case, make friends with your network admin. Quite likely there will be multiple problems.
There could be a full table scan on table1 and an inefficient index being used to access table2 which is also causing latching problems. If a query executes for 3 minutes, understand what makes up those 3 minutes from a wait event view, i.e. it waits 2:30 on db file scattered read, 15 seconds on latching, and 15 seconds on CPU (service time). When you have gathered the necessary metrics, document them in simple language. Examples of this are: Say "This query spends 85% of its time locked because of other processes accessing the same data", vs. "This query spends 85% of its time waiting on 'enq: TM contention'". No one knows what "enq: TM contention" means, but they will likely understand "locking problem". Say "This query takes a long time because it is inefficient and reads every row of a 1 million row table" vs. "This query spends 85% of its time waiting on db file scattered read". Say "The query takes 20 ms to execute, which sounds efficient, but every time the user executes this process, the query executes 100,000 times and causes the user to wait 2000 seconds, or over 30 minutes". Don't say "The query executes in 20 ms and there is nothing I can do" – the user will not be happy about that at all.
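One hedged way to put numbers on that wait-event breakdown, assuming Oracle 10g+ and a Diagnostics Pack license (the sql_id is the example one used later in this deck):

```sql
-- Approximate wait profile for one statement from ASH samples;
-- a NULL event means the session was on CPU at sample time.
SELECT NVL(event, 'CPU') AS event, COUNT(*) AS samples
FROM   v$active_session_history
WHERE  sql_id = '15uughacxfh13'
GROUP  BY NVL(event, 'CPU')
ORDER  BY samples DESC;
```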
15
Gather – Execution Plan
EXPLAIN PLAN Estimated execution plan - can be wrong for many reasons V$SQL_PLAN (Oracle 9i+) Real execution plan Use DBMS_XPLAN for display Tracing (all versions) Get all sorts of good information Works when you know a problem will occur Historical – AWR, Confio Ignite Once you have broken the query down, you need to understand how each component behaves. That's where execution plans help. They supply costing information, data access paths, join operations, and many other things. Question: How many people here use EXPLAIN PLANS from the explain plan command (or from a tool that gives you this)? Do you know that it could be wrong and not match how Oracle is really executing the statement? How could the explain plan be wrong? Because it is typically executed in a completely different environment, i.e. via SQL*Plus or Toad and not from the Java code. There could be different session settings, bind variable data types, etc., along with many other things that are outside the scope of this doc. The best places to get execution plan information are: V$SQL_PLAN – contains the raw data for execution plans of executed SQL statements. It's what Oracle actually used, so go straight to the source. You can use DBMS_XPLAN to get the execution plan in a readable format. Tracing – gives all sorts of great information (wait events, bind values, row-level execution statistics) as well as execution plans. Historical tools – collect execution plan information so you can go back a week and understand why this SQL statement performed poorly. Question: How many people use AWR data without going through OEM? Once you have an execution plan, look at how each of the SQL components is being executed. Based on wait time data you should already have a feel for whether full table scans are significant or whether more time is spent reading indexes. Execution plans help determine where those problems are - look for expensive steps.
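A sketch of the DBMS_XPLAN route against V$SQL_PLAN data (substitute whatever sql_id you identified earlier; a NULL child number means the first child cursor):

```sql
-- Real plan for a cursor still in the library cache,
-- formatted by DBMS_XPLAN (Oracle 10g+ DISPLAY_CURSOR).
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR('15uughacxfh13', NULL, 'TYPICAL'));
```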
16
Plans Not Created Equal
SELECT company, attribute FROM data_out WHERE segment = :B1 Wait Time – 100% on "db file scattered read" Plan from EXPLAIN PLAN Plan from V$SQL_PLAN using DBMS_XPLAN I mentioned that explain plan data can be wrong. Here is an example where, based on wait time data, we were pretty sure the query was doing a full table scan. However, when we reviewed the explain plan, the query looked very efficient and seemed to be using a unique index to retrieve one row of data. How could this be? It is a very simple query with one criterion in the WHERE clause, so if it's executing very efficiently, why are my end users waiting 20 seconds for it? Having multiple sources of performance data – wait time data, execution plans, end-user-documented performance, end-to-end views, etc. – is helpful. If we were only reviewing explain plan data in this case, we might have dismissed the issue and pointed our fingers back at the development group, claiming the problem was in the application tier: it appears to be using a unique index to get one row, so it cannot get any faster. They would probably instrument their code to better understand where the performance problem is and would soon realize that this query is indeed the problem. You now look bad, because they have evidence the statement is performing poorly. However, if you had reviewed the real execution plan, you would very quickly have understood what's happening. The data from V$SQL_PLAN points to a full table scan on the DATA_OUT table. Furthermore, the predicate information under the plan tells you why: the :B1 bind variable is being given a BINARY_DOUBLE number from Java. The SEGMENT column is defined as a standard NUMBER datatype, so Oracle is forced to implicitly convert the data in the table to BINARY_DOUBLE to apply the criterion. Anytime a function is applied around a column, standard indexes on that column cannot be used.
You now have a plan of action: 1) go back to the developers and tell them to modify the code to send in a standard NUMBER; 2) or redefine the column in DATA_OUT as BINARY_DOUBLE if it needs to be; 3) or create a function-based index to immediately help performance while the development team figures out how to change the code. You're a hero for telling them what they need to do to help performance – and, oh by the way, I'll create a second index to help performance for now. Everyone wins when the correct information is applied to the problem.
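A minimal sketch of option 3 – the index name is made up, and it assumes the implicit conversion is TO_BINARY_DOUBLE on the SEGMENT column, as the plan's predicate section suggested:

```sql
-- Hypothetical function-based index matching the implicit conversion,
-- so the unmodified Java code can use an index immediately.
CREATE INDEX data_out_seg_fbi
  ON data_out (TO_BINARY_DOUBLE(segment));
```

Gather statistics on the table afterwards so the CBO knows the index exists.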
17
Gather – Bind Values V$SQL_BIND_CAPTURE
STATISTICS_LEVEL = TYPICAL or ALL Collected at 15 minute intervals SELECT name, position, datatype_string, value_string FROM v$sql_bind_capture WHERE sql_id = '15uughacxfh13'; NAME POSITION DATATYPE_STRING VALUE_STRING :B BINARY_DOUBLE Bind Values also provided by tracing Level 4 – bind values Level 8 – wait information Level 12 – bind values and wait information The third step of gathering information is to understand the data being passed into any bind variables. You can get this information from the V$SQL_BIND_CAPTURE view. One caveat is that I could not get this to work on my test database even though STATISTICS_LEVEL=ALL. Plus, the values are only collected every 15 minutes, so the data you get from here is not exact. However, thinking back to our previous example, we would at least have seen the datatype as BINARY_DOUBLE and started asking questions. Tracing the process and collecting bind values gives the best data. It is accurate and you will see every value passed in while the trace was running. Use level 4 or 12 to get binds. Level 12 gives both wait events and binds, so use it if possible.
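For completeness, a common way to turn on the level-12 trace described above for your own session (for other sessions, DBMS_MONITOR.SESSION_TRACE_ENABLE with waits and binds set to TRUE is the supported 10g+ route):

```sql
-- Event 10046 level 12 = wait events + bind values
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';

-- ... run the statement being investigated ...

ALTER SESSION SET EVENTS '10046 trace name context off';
```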
18
Gather – Table / Index Stats
Use TuningStats.sql Provides data on objects in execution plans. Table sizes Existing indexes Cardinality of columns Segment sizes Histograms and Data Skew Many things the CBO uses Run it for expensive data access targets The fourth step is to gather information about each table being accessed inefficiently. This means the most expensive steps from the execution plan – there is no use gathering data for objects that are already being accessed efficiently. Confio has a script on the support site that will help with this process. It gathers information such as: Table sizes – a full table scan on a 20-row table is better than using an index even if one existed, so understand whether you have small, medium, large, or very large tables involved. Also understand segment sizes. We had an experience with a 20-row table where Oracle was doing a full table scan to access it. We kept overlooking it because it only had 20 rows; however, the underlying segment for the table was 80MB. At some point in the life of the table it had over 500,000 rows, so it grew but never shrank. Each time the full table scan was done, Oracle did not read just 20 rows, it read 80MB of data, most of it empty. Existing indexes – get a list of all indexes that already exist on these tables. Understand the selectivity or cardinality of columns from your limiting-factors discovery. If a query includes criteria such as STATUS='D', how many values does the STATUS column have, i.e. how selective is it? Also understand data skew at this point. If there are only 5 values for the STATUS column, but 1% of the rows have STATUS='D', then an index with histograms may help this statement.
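I don't have the TuningStats.sql script to show, but a hand-rolled sketch of the same checks against the data dictionary (table name from the upcoming case study) might look like:

```sql
-- Table size as the optimizer sees it vs. actual segment size
-- (catches the "20 rows in an 80MB segment" situation)
SELECT num_rows, blocks
FROM   user_tables
WHERE  table_name = 'REGISTRATION';

SELECT bytes/1024/1024 AS size_mb
FROM   user_segments
WHERE  segment_name = 'REGISTRATION';

-- Existing indexes and their column order
SELECT index_name, column_name, column_position
FROM   user_ind_columns
WHERE  table_name = 'REGISTRATION'
ORDER  BY index_name, column_position;
```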
19
Example SQL Statement Who registered yesterday for SQL Tuning
SELECT s.fname, s.lname, r.signup_date FROM student s, registration r, class c WHERE s.student_id = r.student_id AND r.class_id = c.class_id AND UPPER(c.name) = 'SQL TUNING' AND c.class_level = 101 AND r.signup_date BETWEEN TRUNC(SYSDATE-1) AND TRUNC(SYSDATE) AND r.cancelled = 'N' Execution Time – 12:38 Wait Time – 95% on "db file scattered read" We are now at the point where I want to jump into the case study. I feel that actually doing some SQL tuning while we discuss the rest of the process will be the most beneficial. The example query answers the business question: "Who registered yesterday for the class SQL Tuning 101?" This is a fairly straightforward class registration example, and the following simple ERD helps you understand the relationship of the tables. Remember that we arrived at this SQL statement by executing the previous steps. It could also come from a developer as a known poor performer.
20
Relationship CLASS class_id name class_level REGISTRATION class_id
student_id signup_date cancelled STUDENT student_id fname lname Here is the ERD for this query. It's very simple – your ERD will not be this simple, but restrict it to the objects the query is accessing. If the statement only accesses 3 tables, only review the ERD for those 3 tables, not all of them. Simplify at this stage as well.
21
Gather – Summary Execution Plan Bind Values Table and Index Statistics
V$SQL_PLAN Do not use EXPLAIN PLAN DBMS_XPLAN Bind Values V$SQL_BIND_CAPTURE Tracing Table and Index Statistics ERD
22
Tune – Create SQL Diagram
SQL Tuning – Dan Tow Great book that teaches SQL Diagramming [SQL diagram: registration at the root (filter ratio .04), joined to student and class; join ratios 5, 30, 1, 1; class filter ratio .002] Now we're ready to tune. Start by reviewing all data collected so far and start to plan your attack.
select count(1) from registration where cancelled = 'N' and signup_date between trunc(sysdate-1) and trunc(sysdate) → 3562 / (total rows) = .0445
select count(1) from class where name = 'SQL TUNING' → 2 / 1000 = .002
23
Execution Plan
| Id | Operation | Name | Rows | Bytes | Cost |
| 0 | SELECT STATEMENT | | | | |
| 1 | NESTED LOOPS | | | | |
| 2 | NESTED LOOPS | | | | |
| 3 | NESTED LOOPS | | | | |
| 4 | VIEW | VW_SQ_ | | | |
|* 5 | FILTER | | | | |
| 6 | HASH GROUP BY | | | | |
|* 7 | FILTER | | | | |
|* 8 | TABLE ACCESS FULL | REGISTRATION | | 1328K| |
|* 9 | INDEX UNIQUE SCAN | SYS_C | | | |
|* 10 | TABLE ACCESS BY INDEX ROWID| CLASS | | | |
|* 11 | INDEX UNIQUE SCAN | SYS_C | | | |
| 12 | TABLE ACCESS BY INDEX ROWID | STUDENT | | | |
|* 13 | INDEX UNIQUE SCAN | SYS_C | | | |
Predicate Information (identified by operation id):
5 - AND 7 - 8 - filter("CANCELLED"='N')
9 - access("R1"."STUDENT_ID"="STUDENT_ID" AND "R1"."CLASS_ID"="CLASS_ID" AND "SIGNUP_DATE"="VW_COL_1") AND
10 - filter(("C"."CLASS_LEVEL"=101 AND UPPER("C"."NAME")='SQL TUNING'))
11 - access("CLASS_ID"="C"."CLASS_ID")
13 - access("S"."STUDENT_ID"="STUDENT_ID")
Let's dive in and get the execution plan. When reviewing it, look at a high level for costly data access steps, i.e. "TABLE ACCESS FULL", "TABLE ACCESS BY INDEX ROWID", and the indexes and tables involved. Also, start from the inside and work your way back out. In this execution plan, the innermost and most indented step is "TABLE ACCESS FULL" on the REGISTRATION table. Notice that the REGISTRATION table is not used directly in the query, but something named ACTIVE_REGISTRATIONS is. This typically indicates a view or synonym, so go grab those definitions as well. Also review the predicate information for any strange things, like our previous example. When you match up step 8 in the execution plan with the predicate information, you can see the REGISTRATION table is being filtered by a CANCELLED='N' criterion as well as MAX(SIGNUP_DATE) >= now and <= a day ago. The STUDENT and CLASS tables are being accessed by unique ids.
Based on this information, Oracle first goes to the REGISTRATION table and looks for rows that are not cancelled, and then filters on some date logic. We don't know what that is yet, because we have not seen all the data, but let's catalog it for the future. It then joins to the CLASS and STUDENT tables on unique indexes, which probably represent the foreign keys back to the main tables.
24
New Plan create index cl_uname on class(upper(name))
| Id | Operation | Name | Rows | Bytes | Cost |
| 0 | SELECT STATEMENT | | | | 10|
|* 1 | FILTER | | | | |
| 2 | NESTED LOOPS | | | | 7|
| 3 | NESTED LOOPS | | | | 6|
|* 4 | TABLE ACCESS BY INDEX ROWID | CLASS | | | 5|
|* 5 | INDEX RANGE SCAN | CL_UNAME | | | 1|
|* 6 | INDEX RANGE SCAN | REG_ALT | | | 1|
| 7 | SORT AGGREGATE | | | | |
|* 8 | TABLE ACCESS BY INDEX ROWID| REGISTRATION | | | 3|
|* 9 | INDEX RANGE SCAN | REG_ALT | | | 2|
| 10 | TABLE ACCESS BY INDEX ROWID | STUDENT | | | 1|
|* 11 | INDEX UNIQUE SCAN | SYS_C | | | 0|
We created the index, gathered stats, executed the query, and the new execution plan is much different. A cost of 10, and our new index CL_UNAME is being utilized (along with the REG_ALT index on REGISTRATION). We have been successful because the cost is much lower. However, ensure that you test the query and the new index in a good test environment before claiming success. A lower cost does not always equate to better execution times (usually, but not always).
25
Query 2 Who cancelled classes within the week
SELECT s.lname, c.name, r.signup_date cancel_date FROM registration r, student s, class c WHERE r.signup_date BETWEEN sysdate-7 AND sysdate AND r.cancelled = 'Y' AND r.student_id = s.student_id AND r.class_id = c.class_id 30% of rows are dated within last week No index on CANCELLED column = FTS Will an index on CANCELLED column help? Why or why not? Here is another example on the same set of tables that returns the students who cancelled a class within the last week. The first step is to review the execution plan and waits for the query. It waits exclusively on "db file scattered read" and the execution plan clearly shows a full table scan. We also know from our data that almost 30% of the rows in the registration table have a signup_date value from the past week, so that will not make for efficient index use. However, remember our data skew query for the CANCELLED column. Only 638 rows contain a value of 'Y' – the selectivity of this column becomes 638 / (total rows) = .007, or very good selectivity. So, let's just create an index on the CANCELLED column. Will this help? It will not, because Oracle assumes an even data distribution, calculates this column at .5 selectivity, and will not use the index. We will also need to collect histograms so Oracle knows about this uneven distribution of data.
26
Tune – Create SQL Diagram
[SQL diagram: registration at the root (filter ratios .008 and .15), joined to student and class; join ratios 5, 30, 1, 1]
select count(1) from registration where cancelled = 'Y' and signup_date between trunc(sysdate-1) and trunc(sysdate) → 622 / (total rows) = .0077
select count(1) from registration where cancelled = 'Y' → 638 / (total rows) = .0079
select count(1) from registration where signup_date between trunc(sysdate-1) and trunc(sysdate) → 11598 / (total rows) = .1449
Now we're ready to tune. Start by reviewing all data collected so far and start to plan your attack.
27
Query 2 Column Stats Oracle will not use an index on this column
create index reg_can on registration(cancelled); select cancelled, count(1) from registration group by cancelled; C COUNT(1) Y N Oracle will not use an index on this column Unless it has more information CBO assumes an even data distribution Histograms give more information to Oracle Based on skewed data, CBO realizes an index would be beneficial Works best with literal values Bind Variables – Oracle peeks first time only Collecting histograms gives Oracle the same data as returned by the GROUP BY query above. It sees good selectivity for CANCELLED='Y' and decides to use the index. Histograms work best with literal values. If a bind variable is used, then the first time the query is parsed Oracle performs bind variable peeking to determine the execution plan. But what if the first time the query is executed the bind value is 'N'? Oracle will choose a full table scan, even though all subsequent executions use a bind value of 'Y'.
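To check whether the CBO actually has the skew information, the dictionary records both the column density and whether a histogram exists (a sketch; the HISTOGRAM values reported vary by Oracle version):

```sql
-- After gathering stats, HISTOGRAM should no longer be NONE
-- for the skewed CANCELLED column.
SELECT column_name, num_distinct, density, histogram
FROM   user_tab_col_statistics
WHERE  table_name  = 'REGISTRATION'
  AND  column_name = 'CANCELLED';
```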
28
Query 2 - Histogram dbms_stats.gather_table_stats(
ownname => 'STDMGMT', tabname => 'REGISTRATION', method_opt => 'FOR COLUMNS cancelled SIZE AUTO')
| Id | Operation | Name | Rows | Bytes | Cost |
| 0 | SELECT STATEMENT | | | | 7|
|* 1 | FILTER | | | | |
|* 2 | TABLE ACCESS BY INDEX ROWID| REGISTRATION | | | 7|
|* 3 | INDEX RANGE SCAN | REG_CAN | | | 2|
Collecting histograms is very easy, and in this case we may use something like the DBMS_STATS call shown above. Based on this new data, the execution plan changes to an INDEX RANGE SCAN on the new REG_CAN index, at a cost of 7.
29
Monitor
Monitor the improvement
Monitor for the next tuning opportunity
Be able to prove that tuning made a difference
Take new metrics measurements
Compare them to the initial readings
Brag about the improvements – no one else will
Monitor for the next tuning opportunity
Tuning is iterative
There is always room for improvement
Make sure you tune things that make a difference
Shameless Product Pitch – Ignite

The final step is to monitor the results. Even with very good testing, sometimes you will find that what worked there does not work in production. Be sure to monitor the results when your end users begin using the newly tuned statement. Take all the metrics again and understand how fast it executes, what it waits on, etc. Then brag about the improvements, i.e., document the results and share them with end users, managers, etc. Monitoring is also important because it is really the start of the next tuning opportunity. Tuning is iterative, so now go after the next worst query.
30
Ignite for Oracle 40% Improvement
31
Summary
Identify – What is the bottleneck? End-to-end view of performance; simplify
Gather – Metrics on current performance: wait time, execution plan, object definitions and statistics
Tune – SQL diagrams
Monitor – New metrics, wait time profile, execution plan
32
Confio Software - Monitor
Developer of Wait-Based Performance Tools
Igniter Suite – Ignite for SQL Server, Oracle, DB2, Sybase
Provides help with: Identify, Gather, Monitor
Based in Colorado, worldwide customers
Free trial at
33
Myth 1 – Option 1
Use outer joins vs. NOT IN / NOT EXISTS
Which class is currently empty?

select class_id, name
from class
where class_id not in (
  select class_id from registration)

| Id | Operation          | Name         | Rows | Bytes | Cost |
|  0 | SELECT STATEMENT   |              |      |       |   88 |
|* 1 |  HASH JOIN ANTI    |              |      |       |   88 |
|  2 |  TABLE ACCESS FULL | CLASS        |      |       |   14 |
|  3 |  TABLE ACCESS FULL | REGISTRATION |      |  312K |   72 |
34
Myth 1 – Option 2
Try NOT EXISTS vs. NOT IN

select class_id, name
from class c
where not exists (
  select 1 from registration r
  where c.class_id = r.class_id)

| Id | Operation          | Name         | Rows | Bytes | Cost |
|  0 | SELECT STATEMENT   |              |      |       |   88 |
|* 1 |  HASH JOIN ANTI    |              |      |       |   88 |
|  2 |  TABLE ACCESS FULL | CLASS        |      |       |   14 |
|  3 |  TABLE ACCESS FULL | REGISTRATION |      |  312K |   72 |
35
Myth 1 – Option 3
Try OUTER JOIN
No difference among the 3 options

select c.class_id, c.name
from class c, registration r
where c.class_id = r.class_id (+)
and r.class_id is null

| Id | Operation          | Name         | Rows | Bytes | Cost |
|  0 | SELECT STATEMENT   |              |      |       |   88 |
|* 1 |  HASH JOIN ANTI    |              |      |       |   88 |
|  2 |  TABLE ACCESS FULL | CLASS        |      |       |   14 |
|  3 |  TABLE ACCESS FULL | REGISTRATION |      |  312K |   72 |
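The equivalence of the three formulations can be checked outside Oracle too. This sketch uses Python's built-in sqlite3 (not Oracle) with a tiny made-up dataset, and rewrites Oracle's `(+)` syntax as an ANSI LEFT JOIN, which is logically the same anti-join; it assumes `registration.class_id` has no NULLs, since NOT IN behaves differently when the subquery returns NULLs:

```python
import sqlite3

# Stand-in schema mirroring the slide's CLASS / REGISTRATION example.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE class (class_id INTEGER, name TEXT);
    CREATE TABLE registration (class_id INTEGER);
    INSERT INTO class VALUES (1, 'SQL Tuning'), (2, 'PL/SQL'), (3, 'RAC');
    INSERT INTO registration VALUES (1), (1), (3);  -- class 2 is empty
""")

q_not_in = """SELECT class_id, name FROM class
              WHERE class_id NOT IN (SELECT class_id FROM registration)"""
q_not_exists = """SELECT class_id, name FROM class c
                  WHERE NOT EXISTS (SELECT 1 FROM registration r
                                    WHERE r.class_id = c.class_id)"""
q_outer = """SELECT c.class_id, c.name FROM class c
             LEFT JOIN registration r ON c.class_id = r.class_id
             WHERE r.class_id IS NULL"""

results = [con.execute(q).fetchall() for q in (q_not_in, q_not_exists, q_outer)]
print(results[0])  # only the empty class: [(2, 'PL/SQL')]
```

All three queries return exactly the same row, which is why the Oracle CBO produces the identical HASH JOIN ANTI plan for all three: rewriting between them is not a tuning technique.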
36
Myth 2 – Option 1
Use MINUS vs. NOT IN
Which students live in the DC area but not in zip 20002 or 20003?
Cost = 15, LIO = 20

select student_id
from student
where state in ('VA', 'DC', 'MD')
and zip not in (20002, 20003)

| Id | Operation                    | Name    | Rows | Bytes | Cost |
|  0 | SELECT STATEMENT             |         |      |       |   15 |
|  1 | INLIST ITERATOR              |         |      |       |      |
|* 2 |  TABLE ACCESS BY INDEX ROWID | STUDENT |      |       |   15 |
|* 3 |   INDEX RANGE SCAN           | ST_ST   |      |       |    3 |
37
Myth 2 – Option 2
Try MINUS vs. NOT IN

Cost = 20, LIO = 23 – Worse Performance

select student_id
from student
where state in ('VA', 'DC', 'MD')
minus
select student_id
from student
where zip in (20002, 20003)

| Id | Operation                      | Name    | Rows | Bytes | Cost |
|  0 | SELECT STATEMENT               |         |      |       |   20 |
|  1 | MINUS                          |         |      |       |      |
|  2 |  SORT UNIQUE                   |         |      |       |   16 |
|  3 |   INLIST ITERATOR              |         |      |       |      |
|  4 |    TABLE ACCESS BY INDEX ROWID | STUDENT |      |       |   15 |
|* 5 |     INDEX RANGE SCAN           | ST_ST   |      |       |    3 |
|  6 |  SORT UNIQUE                   |         |      |       |    4 |
|  7 |   INLIST ITERATOR              |         |      |       |      |
|  8 |    TABLE ACCESS BY INDEX ROWID | STUDENT |      |       |    3 |
|* 9 |     INDEX RANGE SCAN           | ST_ZIP  |      |       |    2 |
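The two forms return the same students; the MINUS version simply pays for two SORT UNIQUE steps and a second pass over STUDENT, which is where its higher cost comes from. A sketch in Python's built-in sqlite3 (where EXCEPT plays the role of Oracle's MINUS; the four students are made-up data):

```python
import sqlite3

# Stand-in STUDENT table mirroring the slide's example.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE student (student_id INTEGER, state TEXT, zip INTEGER);
    INSERT INTO student VALUES
        (1, 'VA', 22201), (2, 'DC', 20002), (3, 'MD', 20850), (4, 'CA', 94105);
""")

q_not_in = """SELECT student_id FROM student
              WHERE state IN ('VA','DC','MD') AND zip NOT IN (20002, 20003)"""
# EXCEPT is SQLite's equivalent of Oracle's MINUS.
q_minus = """SELECT student_id FROM student WHERE state IN ('VA','DC','MD')
             EXCEPT
             SELECT student_id FROM student WHERE zip IN (20002, 20003)"""

not_in_rows = sorted(r[0] for r in con.execute(q_not_in))
minus_rows = sorted(r[0] for r in con.execute(q_minus))
print(not_in_rows, minus_rows)  # same answer both ways: [1, 3] [1, 3]
```

Same result set, but the set-operation form forces deduplication work the simple predicate never needed, matching the slide's Cost 15 vs. 20 comparison.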