1 Understanding Common Oracle Wait Events Kirtikumar Deshpande Dallas Oracle User Group September 15, 2005.


1 1 Understanding Common Oracle Wait Events Kirtikumar Deshpande Dallas Oracle User Group September 15, 2005

2 2 About Me Senior Oracle DBA Verizon Information Services Phone Directories Publication

3 3 About OWI Book: “…Where this book excels is bridging the gap between the perfect measurement and implementable solutions by explaining why's and how's of the problem. I was amazed how the authors put together rational explanations of common wait events like latch free, bolstered by the elaboration of internals like hash buckets, cache buffer chains and how to rectify those - it all seems so simple when it comes out in the book. Whether you are a veteran DBA who have seen all the battles since the Civil War or a rookie just starting out, this book is for you, a vital weapon in your arsenal, especially the scripts for identifying trouble spots. If I'm allowed to keep only one book on Oracle - this will be it.” - Arup Nanda

4 4 About OWI Book: “I received this book on Tuesday and I literally could not put it down. I consumed it like a great thriller. It contains a great deal of information that can be found no where else in print. Performance monitoring and tuning with the Oracle Wait Interface is still new to many Oracle DBAs, even seasoned ones, precisely because the details of how to gather and interpret the information have been difficult to come by until now...” – John Smiley

5 5 About OWI Book: “The book is simply spectacular, both for the quality of its writing as well as the depth of the material. Practical? Indeed, indispensible! The three authors, Richmond Shee, Kirti Deshpande, and K Gopalakrishnan, have done a wonderful job in organizing a enormous subject area into manageable chunks. They have also managed to render potentially bone-dry source material into very readable text, interspersed heavily with code examples and output, sidebars, and analogies. This is a good read as well as an authoritative reference. …. The combined efforts of the three authors and the four technical editors blows my mind. …. All I can say is - get it! ” - Tim Gorman

6 6 Acknowledgement Special thanks to Richmond Shee for allowing me to use the contents of his presentation at IOUG Live! 2005

7 7 Agenda
– OWI Monitoring and Data Capture
– Handling Common Oracle Wait Events – Going Beyond P1, P2, and P3
– New Convert: Paradigm Shift
– Beginner level: Event Attributes (P1, P2, P3), Event Classification, OWI Views
– Novice level: Separating Symptoms from Problems
– Intermediate level: OWI Monitoring and Data Collection

8 8 OBJECTIVE and SCOPE
Take Home:
– Information you can use to discover the root cause of performance problems and answer the 64-thousand-dollar questions:
  – Why did the job run so slowly?
  – Why did the job run so quickly?
Scope:
– OWI Monitoring: Oracle7 to Oracle9i Database
– Handling OWI: Oracle7 to Oracle Database 10g

9 9 First agenda item… OWI Monitoring and Data Capture

10 10 OWI Monitoring and Data Capture
Q1) Why is historical performance data important?
Q2) What is the best source of performance data?
  – V$SYSTEM_EVENT?
  – V$SESSION_EVENT?
  – V$SESSION_WAIT?
Q3) What is a good data capture method and sampling frequency?
  – Trace event 10046?
  – Statspack?

11 11 OWI Monitoring and Data Capture
The importance of historical performance data…
– Users expect their DBAs to be omniscient.
– DBAs are expected to be aware of performance issues 24x7.
– You need a history of all foreground processes that ran in the instance.

12 12 OWI Monitoring and Data Capture
Determine the best source of data…
V$SYSTEM_EVENT
  – Pros:
  – Cons: system-level data
V$SESSION_EVENT
  – Pros: session-level granularity
  – Cons: session-level granularity
V$SESSION_WAIT (X$KSUSECST)
  – Pros: fine-grain data
  – Cons: changes quickly, high volume of data; data requires translation

13 13 OWI Monitoring and Data Capture
Determine the best data capture method and sampling frequency…
Requirement: A performance data collector that is capable of monitoring all foreground processes on a 24x7 basis.
Desired features:
– Wait-based philosophy
– Low overhead
– Always-on
– Repositories (wait events, runtime statistics, SQL statements, and SQL plans)

14 14 OWI Monitoring and Data Capture
Consider the trace event 10046…
– Oracle’s most comprehensive trace facility. It captures wait events, SQL statements, and bind variables.
– Fine-grain data is best for troubleshooting, but requires a lot of disk space.
– Disk space and overhead limitations prevent instance-wide monitoring.
– The trace file…
  – …is not user friendly: WAIT #12: nam='db file scattered read' ela= 0 p1=106 p2=60227 p3=8
  – …does not have cross referencing: WAIT #1: nam='enqueue' ela= p1= p2= p3=149
  – …can have bugs: WAIT #0: nam='db file parallel write' ela= 2 p1=-144 p2=1 p3=0
– It may add significant overhead to the RDBMS and further degrade the performance of an already slow-running process.
– Documentation for interpreting the trace file is seldom available.
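Because the raw trace is hard to read, a small parser can make the WAIT lines easier to work with. The following is a minimal sketch, not part of the original presentation: it assumes the field layout of the WAIT lines quoted on this slide, and the names `WAIT_RE`, `parse_wait_line`, and `total_wait_by_event` are hypothetical. Lines with missing values, such as the 'enqueue' example, are skipped rather than guessed at. Keep in mind that the unit of `ela` depends on the release (centiseconds before Oracle9i, microseconds from Oracle9i on).

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Optional

# Assumed layout, taken from the sample lines on the slide:
#   WAIT #12: nam='db file scattered read' ela= 0 p1=106 p2=60227 p3=8
WAIT_RE = re.compile(
    r"WAIT #(?P<cursor>\d+): nam='(?P<event>[^']+)'"
    r" ela=\s*(?P<ela>-?\d+)"
    r" p1=(?P<p1>-?\d+) p2=(?P<p2>-?\d+) p3=(?P<p3>-?\d+)"
)


def parse_wait_line(line: str) -> Optional[dict]:
    """Return the fields of one WAIT line, or None if the line is incomplete."""
    match = WAIT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    event = fields.pop("event")
    # Everything except the event name is numeric.
    record = {name: int(value) for name, value in fields.items()}
    record["event"] = event
    return record


def total_wait_by_event(lines: Iterable[str]) -> Dict[str, int]:
    """Sum the elapsed wait time (ela) per event name, skipping unparsable lines."""
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        record = parse_wait_line(line)
        if record is not None:
            totals[record["event"]] += record["ela"]
    return dict(totals)
```

Feeding every line of a trace file to `total_wait_by_event` gives a per-event summary that is closer to what the wait interface views report.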

15 15 OWI Monitoring and Data Capture
Summary – Trace event 10046
Criteria                            Trace
Wait-based Methodology              Yes
Low overhead                        No
Monitor every process 24x7          No
Repository: Wait Event              No
Repository: Process Runtime         No
Repository: SQL Statement           No
Repository: SQL Plan                No
Granularity of Performance data     Fine-Grain

16 16 OWI Monitoring and Data Capture
Consider the database logoff trigger…
– Excellent for session-level summary.
– Great for benchmarking.
– Instance-wide monitoring capability.
– Trigger overhead depends mainly on the code.
– Some PL/SQL coding is necessary.
– Disk space requirement is generally low; it depends on the logoff rate.
– Only available in Oracle8i and later versions.
– Not suitable for root cause analysis, which requires fine-grain data.

17 17 OWI Monitoring and Data Capture
An application of the database logoff trigger…
CATEGORY   SUBCATEGORY   WAIT_EVENT                       VALUE   PERCENT
CPU        OTHER         Fetch, Execute, Lookups, etc
           PARSE         parse time cpu
           RECURSIVE     recursive cpu usage
DISK I/O   DIRECT I/O    direct path read                 0       0
                         direct path write                0       0
           FULL SCANS    db file scattered read
           NORMAL I/O    db file sequential read
LATENCY    COMMITS       log file sync
           FILE OPS      file open                        1       0
           LATCH         latch free
           LOG FILE      log file switch completion       6.01
           NETWORK       SQL*Net message to client
                         SQL*Net more data from client    2       0
                         SQL*Net more data to client
           OTHER         buffer busy waits
MISC       MISC          library cache pin                2       0
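A report laid out by category, subcategory, and wait event, like the one above, can be produced from per-event totals with very little code. The sketch below is only an illustration: the `EVENT_CATEGORY` mapping is one reading of the slide's grouping, `build_report` is a hypothetical name, and the input totals would come from whatever the logoff trigger stored, which the slide does not show.

```python
from typing import Dict, List, Tuple

# Grouping as read from the slide; treat it as an assumption, not a standard.
EVENT_CATEGORY: Dict[str, Tuple[str, str]] = {
    "parse time cpu": ("CPU", "PARSE"),
    "recursive cpu usage": ("CPU", "RECURSIVE"),
    "direct path read": ("DISK I/O", "DIRECT I/O"),
    "direct path write": ("DISK I/O", "DIRECT I/O"),
    "db file scattered read": ("DISK I/O", "FULL SCANS"),
    "db file sequential read": ("DISK I/O", "NORMAL I/O"),
    "log file sync": ("LATENCY", "COMMITS"),
    "file open": ("LATENCY", "FILE OPS"),
    "latch free": ("LATENCY", "LATCH"),
    "log file switch completion": ("LATENCY", "LOG FILE"),
    "SQL*Net message to client": ("LATENCY", "NETWORK"),
    "SQL*Net more data from client": ("LATENCY", "NETWORK"),
    "SQL*Net more data to client": ("LATENCY", "NETWORK"),
    "buffer busy waits": ("LATENCY", "OTHER"),
    "library cache pin": ("MISC", "MISC"),
}

Row = Tuple[str, str, str, float, float]


def build_report(times: Dict[str, float]) -> List[Row]:
    """Turn {event: time} into sorted (category, subcategory, event, value, percent) rows."""
    total = sum(times.values())
    rows: List[Row] = []
    for event, value in times.items():
        # Events that are not in the mapping fall back to MISC/MISC.
        category, subcategory = EVENT_CATEGORY.get(event, ("MISC", "MISC"))
        percent = round(100.0 * value / total, 2) if total else 0.0
        rows.append((category, subcategory, event, value, percent))
    return sorted(rows)
```

The PERCENT column is each event's share of the session's total, which is what makes a session-level summary useful for benchmarking.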

18 18 OWI Monitoring and Data Capture
Summary – Database logoff trigger
Criteria                            Trace        Database Logoff Trigger
Wait-based Methodology              Yes          Yes
Low overhead                        No           Yes **
Monitor every process 24x7          No           Yes
Repository: Wait Event              No           Yes
Repository: Process Runtime         No           Yes
Repository: SQL Statement           No           No
Repository: SQL Plan                No           No
Granularity of Performance data     Fine-Grain   Coarse

19 19 OWI Monitoring and Data Capture
Consider Statspack…
– The report has a lot of information that allows you to examine performance from several perspectives.
– Instance-level snapshots offer coarse-grain information that roughly indicates there is a problem, but not specifically where the problem is.
  – No different than querying v$system_event, v$sysstat, v$latch, etc.
– Session-level snapshots? How are you going to automate them?
– Even if session-level snapshot automation is not an issue, the data is still too coarse.
  – No different than querying v$session_event and v$sesstat.
– Difficulty in determining the best sampling frequency.

20 20 OWI Monitoring and Data Capture
Summary – Statspack
Criteria                            Trace        Database Logoff Trigger   Statspack
Wait-based Methodology              Yes          Yes                       Yes
Low overhead                        No           Yes **                    Yes
Monitor every process 24x7          No           Yes                       No
Repository: Wait Event              No           Yes                       Yes
Repository: Process Runtime         No           Yes                       Yes
Repository: SQL Statement           No           No                        Yes
Repository: SQL Plan                No           No                        Yes
Granularity of Performance data     Fine-Grain   Coarse                    Coarse

21 21 OWI Monitoring and Data Capture
Problem: There is no suitable free tool available. Prior to Oracle Database 10g, you have to develop your own tool or purchase very expensive 3rd-party tools.
Criteria                            Trace           Database Logoff Trigger   Statspack
Wait-based Methodology              Yes             Yes                       Yes
Low overhead                        No              Yes **                    Yes
Monitor every process 24x7          No              Yes                       No
Repository: Wait Event              No              Yes                       Yes
Repository: Process Runtime         No              Yes                       Yes
Repository: SQL Statement           No              No                        Yes
Repository: SQL Plan                No              No                        Yes
Granularity of Performance data     Fine-Grain      Coarse                    Coarse
Verdict                             Too Expensive   Too Coarse                Too Coarse

22 22 OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
Three major areas to consider:
– Sampling frequency
– Repository
– Events to monitor

23 23 OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
Data Source: V$SESSION_WAIT (X$KSUSECST)
Sampling frequency:
– Affects the quantity and granularity of data
– Depends on data capture method
  – Unix Shell script
  – PL/SQL procedure
  – Unix Cron
  – SNP background process

24 24 OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
Repositories:
– Minimum two repositories (wait events & SQL code)
– SQL statements help set the context and get you closer to the problem.
  – Event: Buffer busy waits; P1 & P2 = FOOBAR table; P3 = 220
– Also helps developers to locate the right module.

25 25 OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
Events to monitor:
– db file sequential read
– db file scattered read
– latch free
– direct path read
– direct path write
– enqueue
– library cache pin
– buffer busy waits
– free buffer waits
Events to ignore (where two names are given, the older name comes first):
– KXFX: Execution Message Dequeue – Slave / PX Deq: Execution Msg
– KXFQ: kxfqdeq - normal deqeue / PX Deq: Table Q Normal
– Wait for credit - send blocked / PX Deq Credit: send blkd
– Wait for credit - need buffer to send / PX Deq Credit: need buffer
– Wait for credit - free buffer / PX Deq Credit: free buffer
– parallel query dequeue wait / PX Deque wait
– Parallel Query Idle Wait – Slaves / PX Idle Wait
– dispatcher timer
– virtual circuit status
– slave wait
– pipe get
– rdbms ipc message
– rdbms ipc reply
– pmon timer
– smon timer
– WMON goes to sleep
– client message
– SQL*Net message from client (* debatable)
– Null event (* debatable)
– PL/SQL lock timer
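The ignore list above amounts to a filter applied to each sample of V$SESSION_WAIT before it is stored. The slide proposes PL/SQL for the collector; the sketch below uses Python only to illustrate the filtering step. It assumes each sampled row is a dictionary with an `event` key, mirroring the EVENT column of the view. `IGNORED_EVENTS`, `DEBATABLE_EVENTS`, and `keep_samples` are hypothetical names, only the newer event names from the slide are included, and fetching rows from the database is left out.

```python
from typing import Iterable, List, Mapping

# Idle events from the slide (newer names only); extend for older releases.
IGNORED_EVENTS = frozenset({
    "PX Deq: Execution Msg",
    "PX Deq: Table Q Normal",
    "PX Deq Credit: send blkd",
    "PX Deq Credit: need buffer",
    "PX Deq Credit: free buffer",
    "PX Deque wait",
    "PX Idle Wait",
    "dispatcher timer",
    "virtual circuit status",
    "slave wait",
    "pipe get",
    "rdbms ipc message",
    "rdbms ipc reply",
    "pmon timer",
    "smon timer",
    "WMON goes to sleep",
    "client message",
    "PL/SQL lock timer",
})

# The slide marks these two as debatable, so they can be switched on or off.
DEBATABLE_EVENTS = frozenset({"SQL*Net message from client", "Null event"})


def keep_samples(rows: Iterable[Mapping], ignore_debatable: bool = True) -> List[Mapping]:
    """Drop sampled rows whose event is on the ignore list."""
    ignored = IGNORED_EVENTS | DEBATABLE_EVENTS if ignore_debatable else IGNORED_EVENTS
    return [row for row in rows if row["event"] not in ignored]
```

Filtering at capture time keeps the repository small, which supports the low-overhead, always-on goal stated earlier.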

26 26 OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
– 24x7 monitoring.
– Wait event history.
  – Immediate answer to why a certain process runs like molasses.
  – Proactive performance management.
– SQL statement and plan repositories.
– Job elapsed time can be determined from the sampling intervals.
– Low disk space requirement.
– Extensive PL/SQL coding.
– Overhead depends on the quality of code.
– Not suitable for short-running jobs.
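To see how elapsed time falls out of the sampling intervals, and why short-running jobs are a poor fit, consider the times at which a job's session appeared in the samples. The sketch below is an illustration under stated assumptions: samples are taken at a fixed interval, none are missed, and the job can be identified consistently across samples. `elapsed_bounds` is a hypothetical name.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def elapsed_bounds(sample_times: Iterable[datetime],
                   interval: timedelta) -> Tuple[timedelta, timedelta]:
    """Return (lower, upper) bounds on a job's elapsed time.

    The job ran at least from its first to its last sighting. It cannot have
    started more than one interval before the first sighting or ended more
    than one interval after the last, or another sample would have seen it.
    """
    times = sorted(set(sample_times))
    if not times:
        raise ValueError("the job was never sampled")
    observed = times[-1] - times[0]
    return observed, observed + 2 * interval
```

A job seen in three consecutive one-minute samples ran for between 2 and 4 minutes, whereas a job seen only once ran for anywhere between 0 and 2 minutes, and a job that fits entirely between two samples is never seen at all.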

27 27 OWI Monitoring and Data Capture
Summary – PL/SQL procedure
Criteria                            Trace        Database Logoff Trigger   Statspack   PL/SQL Procedure
Wait-based Methodology              Yes          Yes                       Yes         Yes
Low overhead                        No           Yes **                    Yes         Yes **
Monitor every process 24x7          No           Yes                       No          Yes
Repository: Wait Event              No           Yes                       Yes         Yes
Repository: Process Runtime         No           Yes                       Yes         No
Repository: SQL Statement           No           No                        Yes         Yes
Repository: SQL Plan                No           No                        Yes         Yes
Granularity of Performance data     Fine-Grain   Coarse                    Coarse      Near fine-grain

28 28 OWI Monitoring and Data Capture Chapter 4 contains a detailed discussion of OWI monitoring and data capture

29 29 Second agenda item… Handling Common Oracle Wait Events Going beyond P1, P2, and P3

30 30 Handling Wait Events
db file sequential read, db file scattered read
At what point do these wait events become a problem?
What are they a symptom of?
a) Low cache hit ratio
b) Slow I/O subsystem
c) Physical I/O calls
d) Small block size
e) Small buffer cache
Handling these wait events requires you to know:
1. The amount of time the events are costing the process.
2. The SQL statement that is associated with the events.
Solution: SQL tuning

31 31 Handling Wait Events
Latch Free
Latch free contention is a symptom of?
a) Low SPIN_COUNT.
b) Inefficient SQL statements.
c) Concurrency coupled with high demands for resources.
d) Insufficient number of latches.
e) Insufficient or slow CPU.
Handling latch free contention requires you to know:
1. The type of latch the sessions are competing for (28 individual latch wait events in Oracle Database 10g Release 1).
2. The amount of time a session spent waiting on latches.
3. The SQL statement that is associated with the event.

32 32 Handling Wait Events
Latch Free: Shared Pool & Library Cache
Contention for the Shared Pool & Library Cache latches is a symptom of?
a) Hard parses – literal SQL statements.
b) Soft parses.
c) Oversized shared pool.
d) High version count.
e) Bad application.
Solution: If not (c), the real solution is correcting the application.
Workarounds:
– Set CURSOR_SHARING = FORCE
– Set SESSION_CACHED_CURSORS

33 33 Handling Wait Events – Latch Free: Cache Buffers Chains
[Diagram: a working set with its LRU and LRUW lists; hash latches, hash buckets, and hash chains of buffer headers; the buffers in memory]

34 34 Handling Wait Events
Latch Free: Cache Buffers Chains
Contention for the CBC latch is symptomatic of?
a) Inefficient SQL statement.
b) Hot blocks.
c) Long hash chains.
d) Insufficient number of latches.
Handling the CBC latch contention requires you to know:
1. If the contention is widespread or localized to a particular latch.
2. The SQL statements that participate in the competition.
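Whether CBC latch contention is widespread or localized can be judged by counting how often each latch address shows up in the sampled latch free waits, where P1 is the address of the latch being waited for. The sketch below is a rough illustration, not the presenter's method: `latch_address_shares` and `is_localized` are hypothetical names, and the 50% threshold is an arbitrary assumption to be tuned.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def latch_address_shares(p1_values: Iterable[int]) -> List[Tuple[int, float]]:
    """Return (latch address, share of samples) pairs, most frequent first."""
    counts = Counter(p1_values)
    total = sum(counts.values())
    return [(address, n / total) for address, n in counts.most_common()]


def is_localized(p1_values: Iterable[int], threshold: float = 0.5) -> bool:
    """Assumed rule of thumb: contention is localized if one latch
    accounts for at least `threshold` of the sampled waits."""
    shares = latch_address_shares(p1_values)
    return bool(shares) and shares[0][1] >= threshold
```

Localized contention points toward hot blocks under one latch, while contention spread evenly over many addresses points toward inefficient SQL statements touching many buffers.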

35 35 Handling Wait Events
Latch Free: Cache Buffers Chains
Solutions:
– Tune the application and SQL statements.
– Reduce the level of concurrency.
Workarounds:
– Spread the hot blocks across multiple CBC latches.
– Consider increasing _SPIN_COUNT (Oracle9i and above, use _LATCH_CLASS and _LATCH_CLASSES).
– Consider increasing _DB_BLOCK_HASH_BUCKETS.
– Consider increasing _DB_BLOCK_HASH_LATCHES.

36 36 Handling Wait Events
Buffer Busy Waits
BBW contention is a symptom of?
a) Read/read, read/write, or write/write contention.
b) Corrupted buffer pin.
c) Insufficient INITRANS.
d) Large block size.
Handling the BBW contention requires you to know:
1. The amount of time a session spent waiting on the event.
2. The reason code that represents why a process fails to get a buffer pin.
3. The class of block that the buffer busy waits event is for.
4. The SQL statements that are associated with the event.
5. The segment that the buffer belongs to.

37 37 Handling Wait Events
BBW: Solutions depend on the class of block and reason code:
BBW contention for data block class (class #1), reason code 130
– Reduce the level of concurrency or change the way the work is partitioned between the parallel threads.
– Optimize the SQL statement to reduce the number of physical and logical reads.
– Increase the number of FREELISTS and FREELIST GROUPS.
BBW contention for data block class (class #1), reason code 220
– Reduce the level of concurrency or change the partitioning method.
– Reduce the number of rows in the block.
– Rebuild the object in another tablespace with a smaller block size (Oracle9i and above).

38 38 Handling Wait Events
BBW: Solutions depend on the class of block and reason code:
BBW contention for data segment header (class #4)
– Increase the number of FREELISTS and FREELIST GROUPS of the identified object.
– Ensure the gap between PCTFREE and PCTUSED is not too small.
– Ensure the next extent size is not too small.
BBW contention for undo segment header (class #17)
** Applies to rollback segments, not system-managed undo.
– Create additional rollback segments.
– Ensure the next extent size is not too small.
BBW contention for undo blocks (class #18)
– Application tuning.
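Since the remedy for buffer busy waits depends on the block class and, for data blocks, on the reason code, the guidance from these slides can be captured as a lookup table. The sketch below simply restates the slides' recommendations in code; the names `BBW_ADVICE` and `bbw_advice` are hypothetical, and combinations the slides do not cover return an empty list rather than a guess.

```python
from typing import Dict, List, Optional, Tuple

# Keys are (block class number, reason code); None means "any reason code".
BBW_ADVICE: Dict[Tuple[int, Optional[int]], List[str]] = {
    (1, 130): [
        "Reduce concurrency or change how work is partitioned between parallel threads",
        "Optimize the SQL statement to reduce physical and logical reads",
        "Increase the number of FREELISTS and FREELIST GROUPS",
    ],
    (1, 220): [
        "Reduce concurrency or change the partitioning method",
        "Reduce the number of rows in the block",
        "Rebuild the object in a tablespace with a smaller block size (Oracle9i and above)",
    ],
    (4, None): [
        "Increase the number of FREELISTS and FREELIST GROUPS of the identified object",
        "Ensure the gap between PCTFREE and PCTUSED is not too small",
        "Ensure the next extent size is not too small",
    ],
    (17, None): [
        "Create additional rollback segments (not applicable to system-managed undo)",
        "Ensure the next extent size is not too small",
    ],
    (18, None): ["Application tuning"],
}


def bbw_advice(block_class: int, reason_code: Optional[int] = None) -> List[str]:
    """Look up the slides' suggestions for a block class and optional reason code."""
    exact = BBW_ADVICE.get((block_class, reason_code))
    if exact is not None:
        return list(exact)
    # Fall back to class-level advice that applies to any reason code.
    return list(BBW_ADVICE.get((block_class, None), []))
```

For example, `bbw_advice(1, 220)` returns the three data-block suggestions for reason code 220, and `bbw_advice(4, 130)` falls back to the segment header advice because the slides do not split that class by reason code.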

39 39 Handling Wait Events
Free Buffer Waits
The free buffer waits event is symptomatic of?
a) Small buffer cache.
b) Insufficient number of DBWR processes.
c) Inefficient SQL statement.
d) Slow I/O subsystem.
e) Delayed block cleanout.
Handling the free buffer waits event requires you to know:
1. The amount of time a session spent waiting on the event.
2. The SQL statements that are associated with the event.
3. The number of DBWR processes.
4. The I/O operation and database storage system.

40 40 Handling Wait Events
Free Buffer Waits
Solutions:
– Optimize the SQL statements.
– Increase the number of DBWR processes.
– Use appropriate I/O operation (async or sync).
– Lower the FAST_START_MTTR_TARGET value.
– Reduce the buffer cache size.
– Increase the buffer cache size.
– Pre-scan the table after each load.

41 41 Handling Wait Events
Log File Sync
The log file sync wait is symptomatic of?
a) Oversized log buffer.
b) High commit frequency.
c) Bad application.
d) Slow LGWR process.
Handling the log file sync event requires you to know:
1. The amount of time a session spent waiting on the event.
2. The type of job (batch or OLTP) that is associated with the event.
Solution: Reduce the commit frequency.
Workarounds:
– Reduce the log buffer size or lower the _LOG_IO_SIZE.
– Increase LGWR I/O throughput.

42 42 Handling Wait Events
Enqueue
Enqueue contention is symptomatic of?
a) Concurrent access to the DBMS_AQ package.
b) Concurrent transactions with incompatible lock requests for a database resource.
c) Concurrent transactions with incompatible lock requests for a latch.
d) Poor application design.
Handling the enqueue contention requires you to know:
1. The type and mode of enqueue the sessions are competing for (all enqueues have independent wait event names in Oracle Database 10g).
2. The amount of time a session spent waiting on enqueues.
3. The SQL statement that is associated with the event.
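For the generic enqueue wait event, the type and mode are packed into the P1 parameter. The sketch below assumes the commonly documented encoding, in which the two high-order bytes hold the ASCII characters of the enqueue type and the low 16 bits hold the requested mode. `decode_enqueue_p1` and `MODE_NAMES` are hypothetical names introduced for illustration.

```python
from typing import Tuple

# Lock mode numbers and their usual names.
MODE_NAMES = {
    1: "Null",
    2: "Row-S",
    3: "Row-X",
    4: "Share",
    5: "Share Row-X",
    6: "Exclusive",
}


def decode_enqueue_p1(p1: int) -> Tuple[str, int]:
    """Split P1 of an enqueue wait into (enqueue type, requested mode).

    Assumes the two high-order bytes are the ASCII type characters and
    the low 16 bits are the mode.
    """
    enqueue_type = chr((p1 >> 24) & 0xFF) + chr((p1 >> 16) & 0xFF)
    mode = p1 & 0xFFFF
    return enqueue_type, mode
```

With this encoding, a P1 of 1415053318 (hexadecimal 54580006) decodes to a TX enqueue requested in mode 6, and 1414332419 (hexadecimal 544D0003) decodes to a TM enqueue in mode 3.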

43 43 Handling Wait Events
TX enqueue in mode 6 (Exclusive)
Contention for the TX enqueue in mode 6 is for row-level locks. In Oracle Database 10g, this is “enq: TX – row lock contention”.
Solutions:
– Commit or roll back the transaction holding the lock.
– Fix the application so that sessions don’t go after the same rows.
Workaround: None

44 44 Handling Wait Events
TX enqueue in mode 4 (Share)
Contention for the TX enqueue in mode 4 can be due to:
– ITL shortage (in Oracle Database 10g: “enq: TX – allocate ITL entry”)
– Unique key enforcement
– Bitmap index entry
Solution depends on the object of contention:
– Increase INITRANS.
– Prevent multiple sessions from inserting the same key value into a table.
– Don’t use bitmap indexes.

45 45 Handling Wait Events
TM enqueue in mode 3, 4, 5 (Row-X, Share, Share Row-X)
Contention for the TM enqueue in mode 3, 4, or 5 is normally due to non-indexed foreign key columns.
Solution: Index the foreign key columns of the object identified by the TM enqueue.

46 46 Handling Wait Events Chapters 5, 6, and 7 contain a detailed discussion of how to handle common Oracle wait events.

47 47 Handling Wait Events Do you think you can use the information presented in this session to identify performance bottlenecks?

48 48 Understanding Common Oracle Wait Events Q & A

