Download presentation
Presentation is loading. Please wait.
Published byScarlett Barnett Modified over 9 years ago
1
Oracle Database Administration Lecture 6 Indexes, Optimizer, Hints
2
Indexes in Oracle Indexes are used to: provide faster access to data help enforce primary key and unique constraints help enforce foreign key constraints Index types: B-Tree indexes (default) Bitmap indexes
3
Indexes in Oracle B-Tree Index can be: unique – each index value except NULL must be unique non unique single column – one column is indexed multiple column (compound index) – multiple columns are indexed NULL values in B-Tree index are ignored (are not indexed)
4
Compound indexes Column values are combined together in the order they appear in CREATE INDEX statement Statements: CREATE INDEX IND1 ON EMP(ID, SALARY); CREATE INDEX IND2 ON EMP(SALARY,ID); create two different indexes Compound index value is NULL if all the columns are NULL
5
Index usage Index can be used by a SELECT statement to: limit number of rows to be processed by the WHERE clause order the results of the SELECT statement join two tables Index can be accessed in the following ways: unique index scan (0 or 1 rows) non unique index scan (any number of rows) index range scan (any number of rows)
6
Oracle optimizer Many statements can be executed in a different way Each statement executed is analyzed by the Oracle optimizer Oracle optimizer creates execution plan for the statement
7
Oracle optimizer Execution plan defines: what tables to access and how to access them (access path) the order of operations (join order) join method / methods the optimizer can rewrite statement to different one, as long as the results of the statement are the same
8
Access paths Access path is a way of accessing table data. Example access paths: FULL – full table scan (all rows are searched) ROWID – table access by rowid (the fastest) INDEX_ASC, INDEX_DES – access table using an index INDEX_FFS – index fast full scan INDEX_SS – index skip scan
9
Full table scan Full table scan is relatively expensive It is used when: entire table needs to be scanned (for example in SELECT * FROM table) table is small (for example: 100 blocks) there is no index that can be used there are indexes, but expected number of rows is large and total cost of execution would be larger when using indexes
10
Full table scan example LIKE '%...' operator cannot use index Table test1 contains 2000 rows Optimizer estimates (guesses) that 100 rows will be returned from the query
11
Full table scan example SELECT * FROM emp WHERE gender = 'F' approximately 50% of the rows have gender = 'F' using an index is inefficient when 50% of the rows are to be searched – Oracle performs full table scan this index has low selectivity – this column is not a good candidate for B-Tree index
12
Index range scan Index range scan uses index to search for multiple rows Index contains: ROWID of the actual row indexed values Index can be used to: locate row using ROWID get indexed value directly from the index
13
Index range scan example 1 Index is used to get ROWID of the table row ROWID retrieved from index is used to get data from table data block
14
Index range scan example 2 Since SALARY column exists in the index IDX1, value is retrieved directly from the index, without accessing the table
15
Index fast full scan Index fast full scan is sometimes used instead of full table scan Index FFS can be used when data to be retrieved in the query is in the index For FFS of compound indexes, the order of columns is not important
16
Join operations When two tables are joined, the optimiser decides: join order – which table will be accessed first, which second join method – how the tables will be joined. Available join methods include: nested loops merge join (backward compatibility only) hash join anti join, semi join
17
Nested loops join For each row in the first table: find all rows in the second table that match the where clause
18
Nested loops join Nested loops join is most effective when: Number of nested executions is low – when number of rows returned from the first statement is low Nested statement can be executed quickly, for example using a unique index scan
19
Merge join Sort rows in the first table by the join key Sort rows in the second table by the join key Merge sorted rows (in a single pass) Merge sort is most effective when: –Rows can be sorted effectively, for example when using index on the join column – that way rows are already sorted without additional effort
20
Hash join Similar to merge join Instead of sorting – hash table of all rows indexed by the join key is used
21
Hash join Hash join is most effective: –When joining large sets of data that is not sorted by the join key (otherwise merge join is useful)
22
Semi join Statement with EXISTS clause SELECT D.NAME FROM DEPT D WHERE EXISTS (SELECT * FROM EMP E WHERE D.ID = E.DEPT_ID) SELECT D.NAME FROM DEPT D WHERE D.ID IN (SELECT E.DEPT_ID FROM EMP E WHERE E.SALARY > 2000)
23
Anti join Statement with NOT EXISTS or NOT IN clause SELECT D.NAME FROM DEPT D WHERE NOT EXISTS (SELECT * FROM EMP E WHERE D.ID = E.DEPT_ID) SELECT D.NAME FROM DEPT D WHERE D.ID NOT IN (SELECT E.DEPT_ID FROM EMP E WHERE E.SALARY > 2000)
24
Execution plan sample EXPLAIN PLAN command output: Rows Execution Plan -------- ---------------------------------------------------- 12 SORT AGGREGATE 2 SORT GROUP BY 76563 NESTED LOOPS 76575 NESTED LOOPS 19 TABLE ACCESS FULL CN_PAYRUNS_ALL 76570 TABLE ACCESS BY INDEX ROWID CN_POSTING_DETAILS_ALL 76570 INDEX RANGE SCAN (object id 178321) 76563 TABLE ACCESS BY INDEX ROWID CN_PAYMENT_WORKSHEETS_ALL 11432983 INDEX RANGE SCAN (object id 186024)
25
Oracle optimizer initialization parameter OPTIMIZER_MODE can take values: first_rows first_rows_1, first_rows_10, first_rows_100, first_rows_1000, all_rows, choose
26
Cost based optimizer Cost based optimizer tries to estimate real cost of executing statement (cpu, memory, io operations) CBO uses statistics to estimate execution cost The following types of statistics exist: table statistics column statistics index statistics column histograms
27
Table statistics Table statistics include: number of rows in a table total number of blocks number of free blocks average row length number of chained blocks Table statistics can be viewed in DBA_TABLES view
28
Column statistics Column statistics include: minimum value maximum value number of distinct values number of NULL values Table statistics can be viewed in DBA_TAB_COLS view
29
Index statistics Index statistics include: BTree level number of leaf blocks number of distinct keys average leaf blocks per key average data blocks per key number of rows Index statistics can be viewed in DBA_INDEXES view
30
Column histograms Column histograms can be computed for columns with non-uniform distribution Example: select * from emp where salary between 0 and 1000 Without histogram Oracle estimates number of rows matching the where condition using: total number of rows (table statistics) minimum value of salary (column statistics) maximum value of salary (column statistics)
31
Column histograms With histograms Oracle can better estimate number of rows that match WHERE condition Histograms are not useful if: column distribution is uniform (normal statistics are enough) column is accessed using bind variable: select * from emp where salary between ? and ?
32
Gathering statistics There are many ways to gather statistics: manualy for a table: ANALYZE TABLE employees COMPUTE STATISTICS; ANALYZE TABLE employees ESTIMATE STATISTICS SAMPLE 10 PERCENT ANALYZE TABLE employees COMPUTE STATISTICS FOR ALL INDEXED COLUMNS manually for a user: EXECUTE dbms_stats.gather_schema_stats( ownname => 'Username', method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO', cascade=>TRUE);
33
Automating statistics gathering Oracle 9i and before: create a job for computing statistics every day, week or month Oracle 10g and 11g: Oracle 10g and 11g automatically gather statistics
34
Optimizer modes RULE – force rule based optimizer FIRST_ROWS – force cost based optimizer, optimize statement based on time of returning the first row FIRST_ROWS_1, FIRST_ROWS_10, etc. – optimize statement based on time of returning first X rows ALL_ROWS - force cost based optimizer, optimize statement based on execution of entire statement CHOOSE – if statistics are present, use CBO, otherwise use RBO
35
Hints
36
SQL statements can have hints hints help optimizer choose best access path optimizer can ignore hint(s) if: hint does not make sense in the query example: using index unrelated to sort order or join condition hints are in conflict with each other hints have incorrect syntax
37
Hints Hints are included in SQL as comments: SELECT /*+ INDEX(employees emp_idx) */ * FROM employees WHERE id > 100; SELECT /*+ INDEX(emp emp_idx) USE_NL(emp, dept)*/ * FROM employees emp, departments dept, WHERE emp.id > 100 AND emp.dept_id = dept.id Hints start with /*+ Hints must immediately follow SELECT, UPDATE, etc. Hints can be included in sub-selects and in views
38
Optimizer mode hints Hint can change global optimizer mode: ALL_ROWS FIRST_ROWS FIRST_ROWS_10, FIRST_ROWS_100,... – minimum time to return first 10, 100 rows RULE CHOOSE Example SELECT /*+ FIRST_ROWS */ * FROM EMPLOYEES ORDER BY ID ASC;
39
Access path hints Access path is a way of accessing table data. The following hints can be used: FULL – full table scan (all rows are searched) ROWID – table access by rowid (the fastest) CLUSTER – cluster scan (only for clustered objects) HASH – hash scan for cluster objects INDEX – access table using an index INDEX_FFS – index fast full scan
40
Access path hints INDEX_ASC, INDEX_DESC INDEX_COMBINE – used for bitmap indexes NO_INDEX – disables specified index for a query AND_EQUAL – merge several indexes
41
Other hints Query transformation hints USE_CONCAT – rewrite OR query to UNION ALL MERGE – merge view into a query NO_MERGE – disable view merge Join order hints ORDERED – forces join of tables in the order in which they appear in the FROM clause (very useful hint!)
42
Join operation hints USE_NL – Nested loops USE_MERGE – join two tables using sort-merge join USE_HASH – join two tables using hash join LEADING – select table that is a first table in a join order HASH_AJ, MERGE_AJ, NL_AJ – hash, merge or nested loops for anti join query HASH_SJ, MERGE_SJ, NL_SJ – hash, merge or nested loops for semi join query
43
Samples SELECT /*+ use_hash(employees departments)*/ * FROM employees, departments WHERE employees.department_id = departments.department_id; SELECT * FROM departments WHERE exists (SELECT /*+ HASH_SJ*/ * FROM employees WHERE employees.department_id = departments.department_id AND salary > 200000);
44
Optimizing SQL statements 1.Check execution plan ( EXPLAIN PLAN command). If possible: update statistics create indexes provide optimizer hints 2.Check AUTOTRACE output 3.Optimize entire database
45
SQLPlus SET TIMING ON: shows execution time for each SQL statement SET AUTOTRACE ON: shows detailed statistics for each statement: Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 1749 consistent gets 0 physical reads 0 redo size 395 bytes sent via SQL*Net to client 512 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
46
AUTOTRACE output recursive calls – how many additional SQL statements were executed db block gets – how many blocks were processed (read) in memory consistent gets – how many blocks were processed (read) in memory physical reads – number of blocks read from disk redo size – how much redo was generated from this statement
47
AUTOTRACE output bytes sent and received via SQL*NET – amount of network traffic generated by the statement sorts in memory and on disk number of rows processed (returned) from the statement
48
Resource intensive statements long running SELECT statements can have: large number of recursive calls (complex subselect) large number of consistent gets large number of physical reads large amount of data received over the network many sorts in memory and on disk
49
Resource intensive statements long running UPDATE and DELETE statements can have: large number of db block gets large number of physical reads lot of redo log generated
50
Solving typical problems large number of physical reads can mean: database memory cache is too small index should be created on one or more tables large amount of data received over the network: query results are processed on the client instead on the server many sorts (especially disk): order by, group by is used on a very large table without an index. Create index on some columns
51
Solving typical problems long running UPDATE or DELETE: try splitting the statement into several transactions execute the statement as batch statement at night, when database traffic is low consider rebuilding the application check if the WHERE condition is not a complex long running query, if so – optimize the query
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.