Jonathan Lewis jonathanlewis.wordpress.com


Presentation on theme: "Jonathan Lewis jonathanlewis.wordpress.com"— Presentation transcript:

1 Reading an AWR Report
Jonathan Lewis
jonathanlewis.wordpress.com
www.jlcomp.demon.co.uk

2 Who am I ?
Independent Consultant
31+ years in IT
26+ using Oracle
Strategy, Design, Review, Briefings, Educational, Trouble-shooting
Oracle author of the year 2006
Select Editor’s choice 2007
UKOUG Inspiring Presenter 2011
ODTUG 2012 Best Presenter (d/b)
UKOUG Inspiring Presenter 2012
UKOUG Lifetime Award (IPA) 2013
Member of the Oak Table Network
Oracle ACE Director
O1 visa for USA
Jonathan Lewis © 2014

3 Why ?
When you could be reading ADDM
Maybe you're using Statspack
Not licensed / Using SE
But it's not tracking a user problem
Don't you have application instrumentation
Can't you enable extended trace
What about sampling through ASH

4 What's slowing me down?

Good run
NAME                          VALUE
session logical reads         ,302
CPU used by this session      ,744
DB time                       , sec
physical reads

NAME                          WAITS   WAIT_SEC   AVG_CS   MAX_CS
db file sequential read

Bad run
NAME                          VALUE
session logical reads         ,302
CPU used by this session      ,038
DB time                       ,924 3:19
physical reads                ,184

NAME                          WAITS   WAIT_SEC   AVG_CS   MAX_CS
db file sequential read       ,

Even very good instrumentation of key processes may not tell you WHY the process varies in performance beyond the fact that "the load on the machine was different".

Sometimes the only way to find why MY process is slow is to scan "the system" for other processes that are denying me the resources I need.

5 What's slowing me down?

Argh!
NAME                          VALUE
session logical reads         ,302
CPU used by this session      ,401
DB time                       , :50
physical reads                ,623

NAME                          WAITS   WAIT_SEC   AVG_CS   MAX_CS
db file sequential read       ,

Variation in performance:
0:30         Everything is in the Oracle cache
3:20         A lot of data is read by Oracle, but in the SAN cache
14:50        Most blocks come from the SAN, from disk reads
(not shown)  It’s all on disc, and the discs are overloaded.

6 What's the source?
Time gives us a fourth dimension. Trouble-shooting means picking and aggregating the right slice.

System:   v$sysstat, v$system_event, v$sysmetric_history, v$sys_time_model
Service:  v$service_stats, v$service_event, v$servicemetric_history, v$serv_mod_act_stats
          (From 10g, and important for consolidation, esp. 12c)
Session:  v$sesstat, v$session_event, v$sess_time_model
Cursor:   Cursor 1, Cursor 2, Cursor 3, Cursor 4
Also:     Latches, Segments, Locks, Etc …
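Picking a slice of time usually means subtracting one reading of a cumulative counter from a later one, since views such as v$sysstat accumulate from instance startup. A minimal Python sketch of that idea, assuming each snapshot has already been fetched into a dict of statistic name to value; the function name and the sample figures are hypothetical and not taken from the presentation.

```python
def snapshot_delta(begin, end, elapsed_seconds):
    """Return {name: (delta, per_second)} for counters present in both snapshots."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    result = {}
    for name, end_value in end.items():
        if name not in begin:
            continue  # no baseline for this statistic, so no delta can be computed
        delta = end_value - begin[name]
        result[name] = (delta, delta / elapsed_seconds)
    return result


# Hypothetical cumulative values, as if read from v$sysstat one hour apart.
begin_snap = {"physical reads": 1_000_000, "execute count": 5_000_000}
end_snap = {"physical reads": 1_360_000, "execute count": 5_900_000}

deltas = snapshot_delta(begin_snap, end_snap, 3600)
for name, (delta, rate) in deltas.items():
    print(f"{name:<20} {delta:>12,} {rate:>10.1f}/s")
```

The same subtraction applies at session or service level; only the source view changes.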

7 AWR 12c
Load Profile; Instance Efficiency Percentages; Top 10 Foreground Events by Total Wait Time; Wait Classes by Total Wait Time; Host CPU; Instance CPU; IO Profile; Memory Statistics; Cache Sizes; Shared Pool Statistics
Time Model Statistics; Operating System Statistics; Operating System Statistics - Detail; Foreground Wait Class; Foreground Wait Events; Background Wait Events; Wait Event Histogram (x4); Service Statistics; Service Wait Class Stats; SQL ordered by ( x10)
Key Instance Activity Stats; Instance Activity Stats; Instance Activity Stats - Absolute Values; Instance Activity Stats - Thread Activity
IOStat by Function summary; IOStat by Filetype summary; IOStat by Function/Filetype summary; Tablespace IO Stats; File IO Stats; Buffer Pool Statistics; Checkpoint Activity; Instance Recovery Stats
MTTR Advisory; Buffer Pool Advisory; PGA Aggr Summary; PGA Aggr Target Stats; PGA Aggr Target Histogram; PGA Memory Advisory; Shared Pool Advisory; SGA Target Advisory; Streams Pool Advisory; Java Pool Advisory
Buffer Wait Statistics; Enqueue Activity; Undo Segment Summary; Undo Segment Stats; Latch Activity; Latch Sleep Breakdown; Latch Miss Sources; Mutex Sleep Summary; Parent Latch Statistics; Child Latch Statistics
Segments by Logical Reads (14); In-Memory Segments by (x 4); Dictionary Cache Stats; Library Cache Activity
Memory Dynamic Components; Memory Resize Operations Summary; Memory Resize Ops; Process Memory Summary; SGA Memory Summary; SGA breakdown difference
Replication System Resource Usage; Replication SGA Usage
GoldenGate Capture; GoldenGate Capture Rate; GoldenGate Apply Reader; GoldenGate Apply Coordinator; GoldenGate Apply Server; GoldenGate Apply Coordinator Rate; GoldenGate Apply Reader and Server Rate
XStream Capture; XStream Capture Rate; XStream Apply Reader; XStream Apply Coordinator; XStream Apply Server; XStream Apply Coordinator Rate; XStream Apply Reader and Server Rate
Table Statistics by DML Operations; Table Statistics by Conflict Resolutions; Replication Large Transaction Statistics; Replication Long Running Transaction Statistics
Streams CPU/IO Usage; Streams Capture; Streams Capture Rate; Streams Apply; Streams Apply Rate
Buffered Queues; Buffered Queue Subscribers; Rule Set; Persistent Queues; Persistent Queues Rate; Persistent Queue Subscribers
Resource Limit Stats; Shared Servers Activity; Shared Servers Rates; Shared Servers Utilization; Shared Servers Common Queue; Shared Servers Dispatchers
init.ora Parameters; init.ora Multi-Valued Parameters
ASH report; ADDM Report

8 Strategy
State your intention
Know the environment
Know the application
Check:
  Load profile
  Top N waits
  Time model
  OSStats
Follow the clues

9 Meaning (11.2.0.3)

Load Profile          Per Second   Per Transaction   Per Exec   Per Call
~~~~~~~~~~~~
DB Time(s):
DB CPU(s):
Redo size:            ,337, ,989.6
Logical reads:        , ,242.4
Block changes:        ,
Physical reads:       ,
Physical writes:
User calls:
Parses:
Hard parses:
W/A MB processed:
Logons:
Executes:             ,
Rollbacks:
Transactions:

This is a little busy (2M of redo per second is not trivial).

3,000 executes per second suggests code that might be doing lots of little bits of activity rather than using efficient bulk-processing statements.

Parses > User calls tells us that users are (probably) calling PL/SQL procedures to get the work done, and the PL/SQL does more parsing and executing internally.

Key point: if you don't know what the numbers mean, how do you interpret them as "high", "acceptable" or "low"? For example, what does "Physical reads:" actually cover?

We’ll be coming back to this set of figures later.

10 Meaning ( )

Load Profile              Per Second   Per Transaction   Per Exec   Per Call
~~~~~~~~~~~~~~~
DB Time(s):
DB CPU(s):
Redo size (bytes):        , ,071.1
Logical read (blocks):    , ,170.5
Block changes:
Physical read (blocks):
Physical write (blocks):
Read IO requests:
Write IO requests:
Read IO (MB):
Write IO (MB):
User calls:
Parses (SQL):
Hard parses (SQL):
SQL Work Area (MB):
Logons:
Executes (SQL):
Rollbacks:
Transactions:

This release tries to be a little more informative in the Load Profile. "Physical reads" is counting blocks, not read requests.

We still have to check whether the Read IO figures cover JUST the data blocks, or include control file, redo log, etc.

11 Meaning (detail 1)

Instance Activity Stats DB/Inst: xxxxxxxx/xxxxxxxxx Snaps: nnnnn-nnnnn
-> Ordered by statistic name

Statistic                           Total          per Second   per Trans
physical read IO requests           ,616,
physical read bytes                 ,581,369, ,853, ,571.5
physical read total IO requests     ,709,
physical read total bytes           ,207,811, ,358, ,729.0
physical read total multi block     ,970,
physical reads                      32,907,882     3,644.2      90.4
physical reads cache                ,850, ,
physical reads cache prefetch       ,247, ,
physical reads direct               ,
physical reads direct (lob)
physical reads direct temporary     ,
physical reads prefetch warmup

Lots of different types of reading go into the final "Physical reads" figure (and some of the reading mechanisms don't).

Parameter Name                   Begin value   End value (if different)
db_file_multiblock_read_count    16
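The three columns of an Instance Activity Stats row are tied together by the snapshot interval and the transaction count, which gives a quick sanity check on any row. A small Python sketch using the one intact row on this slide (physical reads: 32,907,882 total, 3,644.2 per second, 90.4 per transaction); the function names are made up for illustration, and the results are only as precise as the rounded report columns.

```python
def implied_interval_seconds(total, per_second):
    """Snapshot interval implied by a row's Total and per Second columns."""
    return total / per_second


def implied_transactions(total, per_trans):
    """Transaction count implied by a row's Total and per Trans columns."""
    return total / per_trans


# Figures from the highlighted "physical reads" row.
total, per_second, per_trans = 32_907_882, 3_644.2, 90.4

interval = implied_interval_seconds(total, per_second)
transactions = implied_transactions(total, per_trans)

print(f"implied interval: {interval:,.0f} seconds")
print(f"implied transactions: roughly {transactions:,.0f}")
```

The implied interval comes out at about 9,030 seconds, which agrees with the elapsed time used in the "Available CPU time" calculation on slide 15.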

12 Meaning (detail 2)

Instance Activity Stats DB/Inst: xxxxxxxx/xxxxxxxxx Snaps: nnnnn-nnnnn
-> Ordered by statistic name

Statistic                       Total       per Second   per Trans
table fetch by rowid            ,345, ,
table fetch continued row       ,419, ,
table scan blocks gotten        ,994, ,
table scan rows gotten          ,946,407, , ,082.6
table scans (direct read)
table scans (long tables)       89,451      9.9          0.3
table scans (rowid ranges)
table scans (short tables)      ,552,
index fast full scans (full)    ,

The db_file_multiblock_read_count = 16, so for a tablescan prefetch = 15/16.
Almost all the reads are tablescans.

Cache Sizes           Begin      End
~~~~~~~~~~~
Buffer Cache:         10,240M    10,240M    Std Block Size:   K
Shared Pool Size:     14,336M    14,336M    Log Buffer:       71,232K
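The 15/16 argument can be written down as a comparison between an expected and an observed ratio. The sketch below follows the slide's reasoning that a full 16-block read counts one block as requested and the remaining 15 as prefetched. The two counter values are hypothetical placeholders, because the real figures for "physical reads cache" and "physical reads cache prefetch" did not survive in the transcript.

```python
def multiblock_prefetch_fraction(mbrc):
    """Fraction of blocks counted as prefetch when every read is a full multiblock read."""
    if mbrc < 1:
        raise ValueError("mbrc must be at least 1")
    return (mbrc - 1) / mbrc


def observed_prefetch_fraction(cache_reads, cache_prefetch):
    """Observed ratio of 'physical reads cache prefetch' to 'physical reads cache'."""
    return cache_prefetch / cache_reads


expected = multiblock_prefetch_fraction(16)  # db_file_multiblock_read_count = 16

# Hypothetical counter values, for illustration only.
observed = observed_prefetch_fraction(cache_reads=1_600_000, cache_prefetch=1_480_000)

print(f"expected if every read were a full multiblock read: {expected:.4f}")
print(f"observed: {observed:.4f}")
```

An observed ratio close to the expected one suggests that most cache reads are multiblock reads such as tablescans; a much lower ratio points to single-block reads dominating.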

13 Meaning (detail 3)

Buffer Pool Statistics DB/Inst: xxxxxxxx/xxxxxxxx Snaps: nnnnn-nnnnn

P   Number of Buffers   Pool Hit%   Buffer Gets   Physical Reads   Physical Writes   Free Buff Wait   Writ Comp Wait   Buffer Busy Waits
D   1,209, ,789, ,849, ,715, E+06

Segments by Table Scans DB/Inst: xxxxxxxx/xxxxxxxxx Snaps: nnnnn-nnnnn
-> Total Table Scans: ,716
-> Captured Segments account for % of Total

Owner       Tablespace Name   Object Name       Subobject Name   Obj Type   Table Scans   %Total
xxxxxx_AA   xxxxxx_CC         XXXXXXXXXXXXXXX   P                TABLE      ,
SYS         SYSTEM            I_USER                             INDEX
SYS         SYSTEM            I_OBJ                              INDEX
xxxxxx_BB   xxxxxx_DAT        YYYYYYYYYYYYYYY                    TABLE
xxxxxx_CC   xxxxxx_CC         PK_HOLIDAYS                        INDEX

Adding cache doesn't (often) stop tablescans.

This set of segment stats is actually limited to long tablescans and index fast full scans.

14 Meaning (detail 4)

SQL ordered by Executions DB/Inst: xxxxxxxx/xxxxxxxxx Snaps: nnnnn-nnnnn
-> %CPU - CPU Time as a percentage of Elapsed Time
-> %IO - User I/O Time as a percentage of Elapsed Time
-> Total Executions: ,880,039
-> Captured SQL account for % of Total

Executions   Rows Processed   Rows per Exec   Elapsed Time (s)   %CPU   %IO   SQL Id
9,585, ,585, m8jv10f2nmn9
Module: ABCDEF
begin :con := "MY_VPD_POLICY"."GET_PREDICATE_X"(:sn, :on); end;

7,604, ,603, r4tsa6t7c9z1
begin :con := "MY_VPD_POLICY"."GET_PREDICATE_Y"(:sn, :on); end;

1,527, ,527, d3xqdu13ufh4r
SELECT SOME_COLUMN FROM SOME_TABLE WHERE X_CODE=:B2 AND Y_CODE=:B1

While checking for a possible source of 3,500,000 short tablescans we can see:
a) the SQL that is inside pl/sql which might be 1.5M of them, possibly due to poor indexing.
b) 17M executions of calls used by Row Level Security (Virtual Private Database) which might be reduced by making security functions context sensitive.
c) A small fraction of the total execution count is due to the SQL we've captured - have we lost a lot of SQL executions in the interval ?

15 Begin at the Beginning

DB Name    DB Id       Instance   Inst Num   Startup Time   Release   RAC
xxxxxxxx   xxxxxxxxx                         Mar-14 05:               YES

Host Name      Platform           CPUs   Cores   Sockets   Memory(GB)
xxxxxxxxxxxx   Linux x86 64-bit

              Snap Id   Snap Time          Sessions   Curs/Sess
Begin Snap:   nnnnn     19-Mar-14 12:30:
End Snap:     nnnnn     19-Mar-14 15:00:
Elapsed:      (mins)
DB Time:      , (mins)

Available CPU time = CPU Count * elapsed time in seconds = 24 * 9,030 = 216,720 (roughly 216,700)
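The available-CPU arithmetic is worth scripting so that DB CPU can be read as a fraction of what the host could have delivered. A short Python sketch using the slide's 24 CPUs and 9,030 elapsed seconds; the DB CPU figure is a hypothetical placeholder, because the real value did not survive in the transcript, and the function names are made up for illustration.

```python
def available_cpu_seconds(cpu_count, elapsed_seconds):
    """CPU seconds the host could supply over the snapshot interval."""
    return cpu_count * elapsed_seconds


def cpu_utilisation(db_cpu_seconds, cpu_count, elapsed_seconds):
    """DB CPU as a fraction of the CPU time available in the interval."""
    return db_cpu_seconds / available_cpu_seconds(cpu_count, elapsed_seconds)


capacity = available_cpu_seconds(24, 9_030)  # figures from the slide

# Hypothetical DB CPU value, for illustration only.
utilisation = cpu_utilisation(30_000, 24, 9_030)

print(f"available CPU seconds: {capacity:,}")
print(f"DB CPU share of capacity: {utilisation:.1%}")
```

A low share alongside CPU-related waits is the kind of mismatch that should prompt a closer look at what is limiting CPU access.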

16 "Top 5"

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Event                           Waits   Time(s)   Avg wait (ms)   % DB time   Wait Class
resmgr:cpu quantum              , ,                                           Scheduler
DB CPU                          ,
db file scattered read          ,160, ,                                       User I/O
read by other session           ,935, ,                                       User I/O
PX Nsq: PQ load info query      , ,                                           Other

resmgr:cpu quantum says: "you're out of CPU", but we know we're not using much CPU.

17 Service Stats

Service Statistics DB/Inst: xxxxxxxx/xxxxxxxxx Snaps: nnnnn-nnnnn
-> ordered by DB Time

Service Name   DB Time (s)   DB CPU (s)   Physical Reads (K)   Logical Reads (K)
xxxxxx_AA_P    , , ,262
xxxxxx_BB_P    , , , ,667
SYS$USERS      , ,247
HH_P           , , , ,643
xxxxxx_CC_P    ,
xxxxxx_DD_P    ,
xxxxxx_EE_P    ,
xxxxxx_FF_P    ,
xxxxxx_GG_P    ,
HH_P           ,079

The answer is that resource manager is being used to limit services: one service is trying to go far beyond its CPU allocation while the machine is not otherwise CPU loaded.

18 Service Wait Class Stats

Service Wait Class Stats DB/Inst: xxxxxxxx/xxxxxxxxx Snaps: nnnnn-nnnnn
-> Wait Class info for services in the Service Statistics section.
-> Total Waits and Time Waited displayed for the following wait classes: User I/O, Concurrency, Administrative, Network
-> Time Waited (Wt Time) in seconds

Service Name   User I/O Total Wts   User I/O Wt Time   Concurcy Total Wts   Concurcy Wt Time   Admin Total Wts   Admin Wt Time   Network Total Wts   Network Wt Time
xxxxxx_AA_P1
xxxxxx_BB_P1
SYS$USERS
HH_P2
...

The stats reported for services are very limited - we can't add the various times reported for the xxxxxx_AA_P1 service to get its total time reported.

This could be a problem for users of 12c multi-tenant, where each PDB gets its own service: there's not a lot of information in the AWR report about how a service is handling its load.

19 Time Model

Time Model Statistics DB/Inst: xxxxxxxx/xxxxxxxx3 Snaps: nnnnn-nnnnn
-> Total time in database user-calls (DB Time): s

Statistic Name                                Time (s)   % of DB Time
sql execute elapsed time                      ,
DB CPU                                        ,
PL/SQL execution elapsed time                 ,
parse time elapsed                            ,
hard parse elapsed time                       ,
failed parse elapsed time
hard parse (sharing criteria) elapsed time
inbound PL/SQL rpc elapsed time
connection management call elapsed time
PL/SQL compilation elapsed time
repeated bind elapsed time
sequence load elapsed time
hard parse (bind mismatch) elapsed time
DB time                                       ,064.9
background elapsed time                       ,397.3
background cpu time

Time model stats don't often highlight anything interesting - but the parse times shown here are unusually high.

20 Instance Efficiency

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %:                 Redo NoWait %:
Buffer Hit %:                    In-memory Sort %:
Library Hit %:                   Soft Parse %:
Execute to Parse %:              Latch Hit %:
Parse CPU to Parse Elapsd %:     % Non-Parse CPU:

Statistic                  Total       per Second   per Trans
parse count (describe)
parse count (failures)     ,
parse count (hard)         ,
parse count (total)        ,483,
parse time cpu             ,037,
parse time elapsed         ,982,
execute count              ,880, ,

Percentages are rarely useful because they hide scale. But if you know the scale (and the derivation) you may spot a surprising result and check further details.
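Knowing the derivation makes it easier to see how a percentage hides scale. The Python sketch below uses the derivations commonly described for three of these ratios, built from the parse and execute statistics listed on the slide; treat the formulas as an assumption about this particular report, and note that all input values are hypothetical because the real figures did not survive in the transcript.

```python
def soft_parse_pct(total_parses, hard_parses):
    """Share of parse calls that did not need a hard parse."""
    return 100 * (total_parses - hard_parses) / total_parses


def execute_to_parse_pct(executes, total_parses):
    """Share of executions that did not need a parse call; negative if parses exceed executes."""
    return 100 * (executes - total_parses) / executes


def parse_cpu_to_elapsed_pct(parse_time_cpu, parse_time_elapsed):
    """Share of parse elapsed time that was spent on CPU rather than waiting."""
    return 100 * parse_time_cpu / parse_time_elapsed


# Hypothetical inputs, for illustration only.
soft = soft_parse_pct(total_parses=2_000_000, hard_parses=100_000)
exec_to_parse = execute_to_parse_pct(executes=10_000_000, total_parses=2_000_000)
parse_cpu = parse_cpu_to_elapsed_pct(parse_time_cpu=50_000, parse_time_elapsed=200_000)

print(f"Soft Parse %: {soft:.1f}")
print(f"Execute to Parse %: {exec_to_parse:.1f}")
print(f"Parse CPU to Parse Elapsd %: {parse_cpu:.1f}")
```

With these made-up inputs a soft parse figure of 95% still leaves 100,000 hard parses in the interval, which illustrates the point about scale.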

21 Puzzle

DB Name   DB Id    Instance   Inst Num   Startup Time   Release   RAC
xxxxx     xxxxxx                         Aug-14 04:               NO

Host Name   Platform                     CPUs   Cores   Sockets   Memory(GB)
xxxxxxxx    AIX-Based Systems (64-bit)

Cache Sizes           Begin       End
~~~~~~~~~~~
Buffer Cache:         258,048M    258,048M    Std Block Size:   K
Shared Pool Size:     20,480M     20,480M     Log Buffer:       550,172K

SGA breakdown difference DB/Inst: xxxxxx/xxxxxx Snaps: nnnnnn-nnnnnn
-> ordered by Pool, Name

Pool     Name              Begin MB   End MB   % Diff
java     free memory       , ,
large    free memory       , ,
shared   private strands

Big SGAs allocate memory in 512MB granules, which can lead to a lot of "lost" memory.

22 Anomaly

Instance Activity Stats DB/Inst: xxxxxx/xxxxxx Snaps: nnnnnn-nnnnnn
-> Ordered by statistic name

Statistic                            Total   per Second   per Trans
HSC Compressed Segment Block Cha
HSC Heap Segment Block Changes       ,426, ,
Heap Segment Array Inserts           ,
LOB table id lookup cache misses
Number of read IOs issued            ,

Statistic                            Total   per Second   per Trans
IMU CR rollbacks                     ,
IMU Flushes                          ,
IMU Redo allocation size             ,798, ,
IMU commits                          ,425, ,
IMU contention                       ,
IMU pool not allocated               ,
IMU recursive-transaction flush      ,
IMU undo allocation size             ,757,286, ,319, ,175.8
IMU- failed to get a private str     ,

Why are there no IMU stats? (Why isn't private redo being used?)

splits Instance Activity Stats into
  Key Instance Activity Stats
  Other Instance Activity Stats

