Buffer Cache Waits
#.2 In This Section 1. latch: cache buffers chains 2. latch: cache buffers lru chain 3. latch: cache buffer handles 4. Free Buffer Wait 5. Buffer Busy Wait 6. Write Complete Wait 7. Buffer Exterminate
#.3 Buffer Cache Redo Lib Cache Buffer Cache IO Locks Network
#.4 Oracle Memory Structures [Diagram: SGA containing Library Cache, Buffer Cache and Log Buffer; background processes DBWR and LGWR; User1, User2, User3; Data Files and REDO Log Files]
#.5 Buffer Cache Access Buffer Cache Management: locating free blocks, finding data blocks, managing LRU lists, cleaning dirty blocks. Buffer Cache management can cause contention. This is different from IO (reading blocks off disk).
#.6 Query 0. Parse statement 1.Find object information in data dictionary 2.Calculate execution plan 3.If full table scan Look at all blocks of table 4.If index find root of index and follow to key 5.Data Dictionary will have info about table or index block File # Block # 6.Once you know the block DBA (file# + block#) … Select ename from emp where empno = 12;
#.7 Is Block in cache? Now you have a file# and block#. How do you know if a block is cached? ShadowProcess ? Do you search all the blocks? There could be 1000s of blocks to search. Buffer caches are in the multi-gigabyte range.
#.8 Buffer Cache Find a block by: 1) Hash of data file # and block # 2) Result = bucket # 3) Search linked list for that bucket #. What is a hash value? What are buckets? What is the linked list?
#.9 Concepts To understand contention on the buffer cache, need to understand : 1.Linked Lists 2.Hashing 3.Buckets
#.10 Doubly Linked Lists [Diagram: list entries, each with its own Address, a Next pointer and a Previous pointer]
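To make the Address / Next / Previous picture concrete, here is a minimal Python sketch of a doubly linked list. The names `Node`, `link`, `unlink` and `walk` are hypothetical and only illustrate the concept; this is not how Oracle lays out buffer headers.

```python
class Node:
    """One list entry: a payload plus next and previous pointers."""
    def __init__(self, name):
        self.name = name
        self.next = None
        self.prev = None


def link(nodes):
    """Chain the nodes in order, setting pointers in both directions."""
    for left, right in zip(nodes, nodes[1:]):
        left.next = right
        right.prev = left
    return nodes[0]


def unlink(node):
    """Remove a node without scanning the list: only its neighbours change."""
    if node.prev is not None:
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev
    node.next = node.prev = None


def walk(head):
    """Collect the names reachable from head by following next pointers."""
    names = []
    while head is not None:
        names.append(head.name)
        head = head.next
    return names


a, b, c = Node("a"), Node("b"), Node("c")
head = link([a, b, c])
unlink(b)  # a and c are now neighbours
```

Because every entry knows both neighbours, an entry can be taken out of the middle of a chain without walking the chain first.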
#.11 Hashing Function Simple hash could be a Mod function 1 mod 4 = 1 2 mod 4 = 2 3 mod 4 = 3 4 mod 4 = 0 5 mod 4 = 1 6 mod 4 = 2 7 mod 4 = 3 8 mod 4 = 0 Using “mod 4” as a hash function creates 4 “buckets” to store things
#.12 Hash Bucket Fill For each data block: hash the block's file# and block#, the result is a bucket#, put the block in that bucket. Example: file# 1, block# 437 gives (1+437) mod 4 = 2. After a while the buckets become populated with blocks.
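The bucket-fill example can be sketched in a few lines of Python. It uses the slide's toy hash, (file# + block#) mod 4; the function name `bucket_for` and the sample block addresses other than file# 1 / block# 437 are made up for illustration, and Oracle's real hash function is different.

```python
NUM_BUCKETS = 4  # "mod 4" gives 4 buckets, as on the slide


def bucket_for(file_no, block_no, num_buckets=NUM_BUCKETS):
    """Toy hash from the slide: (file# + block#) mod number of buckets."""
    return (file_no + block_no) % num_buckets


# Start with empty buckets and drop a few block addresses into them.
buckets = {n: [] for n in range(NUM_BUCKETS)}
for file_no, block_no in [(1, 437), (1, 438), (1, 439), (2, 437)]:
    buckets[bucket_for(file_no, block_no)].append((file_no, block_no))
```

Two different blocks can land in the same bucket, which is why each bucket holds a list rather than a single block.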
#.13 Latches Protect Bucket Contents Buffer Headers Data Blocks Hash bucket latches Buffer Headers contents described by X$BH
#.14 X$bh Describes Contents of Buffer Headers SQL> desc x$bh Name Type ADDR RAW(4) DBARFIL NUMBER DBABLK NUMBER OBJ NUMBER HLADDR RAW(4) NXT_HASH RAW(4) PRV_HASH RAW(4) … much more. Each buffer header contains information about the data block it points to, plus pointers to the previous and next buffer headers in a linked list.
#.15 Cache [Diagram: buffer headers chained together through ADDR, NXT_HASH and PRV_HASH]
#.16 X$BH describes Headers [Diagram: hash bucket latches, buffer headers, data blocks; each x$bh row exposes ADDR, DBARFIL, DBABLK, OBJ, HLADDR, NXT_HASH, PRV_HASH]
#.17 To Find a Block 1.Hash the block address 2.Get Bucket latch 3.Look for header 4.Found, read block in cache 5.Not Found Read block off disk ShadowProcess Buffer Headers Data Blocks Hash bucket latches
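The five lookup steps can be sketched as a toy cache in Python. Everything here is an assumption made for illustration: the class name `ToyBufferCache`, the mod-based hash, and the use of `threading.Lock` as a stand-in for a hash bucket latch. The sketch also holds the lock across the simulated disk read for simplicity, which should not be read as a claim about Oracle's behaviour.

```python
import threading


class ToyBufferCache:
    """Toy model: one chain of headers and one lock ("latch") per bucket."""

    def __init__(self, num_buckets=4):
        self.num_buckets = num_buckets
        self.chains = [[] for _ in range(num_buckets)]
        self.latches = [threading.Lock() for _ in range(num_buckets)]
        self.disk_reads = 0

    def _bucket(self, file_no, block_no):
        return (file_no + block_no) % self.num_buckets

    def _read_from_disk(self, file_no, block_no):
        # Stand-in for a physical read; only counts and fabricates a payload.
        self.disk_reads += 1
        return f"block {file_no}/{block_no}"

    def get(self, file_no, block_no):
        n = self._bucket(file_no, block_no)        # 1. hash the block address
        with self.latches[n]:                      # 2. get the bucket latch
            for header in self.chains[n]:          # 3. look for the header
                if header["dba"] == (file_no, block_no):
                    return header["data"]          # 4. found: read it in cache
            data = self._read_from_disk(file_no, block_no)  # 5. not found
            self.chains[n].append({"dba": (file_no, block_no), "data": data})
            return data


cache = ToyBufferCache()
first = cache.get(1, 437)   # miss: goes to "disk"
second = cache.get(1, 437)  # hit: served from the chain in bucket 2
```

Every session that looks up a block in the same bucket has to take the same lock, which is the mechanism behind contention on a hot bucket.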
#.18 Cache Buffers Chains Hash Buckets s5 s4 s3 s2 s1 Sessions. Contention occurs if there are too many accesses on one bucket latch. [Diagram labels: latches, Block Headers, Cache Buffer Chain, Data Blocks]
#.19 Examples S1 S2 S3 S4 1.Look up Table 2.Nested Loops Select t1.val, t2.val from t1, t2 where t1.c1 = {value} and t2.id = t1.id; t1 Index_t2 t2
#.20 CBC Solutions Find SQL ( Why is application hitting the block so hard? ) Nested loops, possibly Hash Partition Uses Hash Join Hash clusters Look up tables (“ select language from lang_table where...”) Change application Use plsql function Spread data out to reduce contention Select from dual Possibly use x$dual How do you find the SQL?
#.21 CBC: Statspack 9i Top 5 Timed Events ~~~~~~~~~~~~~~~~~~ % Total Event Waits Time (s) Ela Time latch free 21,428 1, CPU time PL/SQL lock timer SQL*Net message from dblink 4, db file sequential read 1, Latch Sleep breakdown for DB: CDB Instance: cdb Snaps: > ordered by misses desc Latch Name Requests Misses Sleeps Sleeps 1-> cache buffers chains 12,123, ,415 15,759 0/0/0/0/0 library cache pin 12,027, ,446 2, /743/8/1/0 library cache 12,072,503 98,065 2, /279/47/0/0 simulator lru latch /426/4/0/0 Fails to find SQL
#.22 CBC: Statspack 10g Top 5 Timed Events Avg %Total ~~~~~~~~~~~~~~~~~~ wait Call Event Waits Time (s) (ms) Time CPU time latch: cache buffers chains latch: library cache pin latch: library cache log file sequential read Fails to find SQL
#.23 CBC: ASH select count(*), sql_id, nvl(o.object_name,ash.current_obj#) objn, substr(o.object_type,0,10) otype, CURRENT_FILE# fn, CURRENT_BLOCK# blockn from v$active_session_history ash, all_objects o where event like 'latch: cache buffers chains' and o.object_id (+)= ash.CURRENT_OBJ# group by sql_id, current_obj#, current_file#, current_block#, o.object_name,o.object_type order by count(*) / CNT SQL_ID OBJN OTYPE FN BLOCKN a09r4dwjpv01q MYDUAL TABLE SQL Statement: Success Extra: Hot block
#.24 CBC: OEM
#.25 CBC: ADDM Problem SQL Statement Solution?
#.26 CBC – Further Investigation select * from v$event_name where name = 'latch: cache buffers chains' EVENT# NAME latch: cache buffers chains PARAMETER1 PARAMETER2 PARAMETER3 address number tries NOTE: _db_block_hash_buckets = # of hash buckets _db_block_hash_latches = # of hash latches
#.27 CBC: what's the hot block? Can get it from ASH: Current_file#, Current_block# where event = 'latch: cache buffers chains'. Sometimes file and block = 0; this seems to happen for Nested Loops. To get the hot block in real time, use the hash latch address: ash.p1 = x$bh.hladdr
#.28 Hot Block: X$BH.TCH Updated when block read Updated by no more than 1 every 3 seconds Can be used to find “hot” blocks Note: set back to zero when block cycles through the buffer cache
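A small Python sketch of the "no more than once every 3 seconds" rule shows why TCH undercounts very busy blocks. The names `Header`, `touch` and `TOUCH_INTERVAL` are hypothetical, and timestamps are passed in explicitly so the example is deterministic; it only models the rule stated on the slide.

```python
TOUCH_INTERVAL = 3.0  # seconds between counted touches, per the slide


class Header:
    """Toy buffer header holding a touch count and the time it last changed."""
    def __init__(self):
        self.tch = 0
        self.last_touch = None


def touch(header, now):
    """Count this access only if the previous counted access is old enough."""
    if header.last_touch is None or now - header.last_touch >= TOUCH_INTERVAL:
        header.tch += 1
        header.last_touch = now


h = Header()
for t in [0.0, 0.5, 1.0, 2.9, 3.0, 3.1, 6.5]:  # seven accesses in 6.5 seconds
    touch(h, t)
```

Seven accesses raise the count by only three here, so a block hit thousands of times per second and a block hit once every few seconds can show similar TCH values.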
#.29 CBC – Real Time select count(*), lpad(replace(to_char(p1,'XXXXXXXXX'),' ','0'),16,0) laddr from v$active_session_history where event= 'latch: cache buffers chains' group by p1; select o.name, bh.dbarfil, bh.dbablk, bh.tch from x$bh bh, obj$ o where tch > 100 and hladdr=' D ' and o.obj#=bh.obj order by tch COUNT(*) LADDR D NAME DBARFIL DBABLK TCH EMP_CLUSTER
#.30 Putting into one Query select name, file#, dbablk, obj, tch, hladdr from x$bh bh, obj$ o where o.obj#(+)=bh.obj and hladdr in ( select ltrim(to_char(p1,'XXXXXXXXXX') ) from v$active_session_history where event like 'latch: cache%' group by p1 having count(*) > 5 ) and tch > 5 order by tch NAME FILE# DBABLK OBJ TCH HLADDR BBW_INDEX BD91180 IDL_UB1$ BDB8A80 VIEW$ BD91180 VIEW$ BDB8A80 DUAL BDB8A80 DUAL BD91180 MGMT_EMD_PING BDB8A80 This can be misleading, as TCH is reset to 0 every time the block wraps around the LRU and it is only updated once every 3 seconds, so in this case DUAL was my problem table, not MGMT_EMD_PING
#.31 Consistent Read Blocks Current Block (XCUR) s1 s2 Update Select Consistent Read (CR) Clone & Undo Both have same file# and block# and hash to same bucket
#.32 latches CBC: Consistent Read Blocks Cache Buffer Chain Contention: Too Many Buffers in Bucket s5 s4 s3 s2 s1 Hash Buckets Block Headers Max length : _db_block_max_cr_dba 10g = 6
#.33 Consistent Read Copies select count(*), name, file#, dbablk, hladdr from x$bh bh, obj$ o where o.obj#(+)=bh.obj and hladdr in ( select ltrim(to_char(p1,'XXXXXXXXXX') ) from v$active_session_history where event like 'latch: cache%' group by p1 ) group by name,file#, dbablk, hladdr having count(*) > 1 order by count(*); CNT NAME FILE# DBABLK HLADDR MYDUAL C9F4B20
#.34 CBC : Solution Find the SQL causing the problem Change Application Logic Eliminate hot spots Look up tables Uses pl/sql functions Minimize data per block Possibly using x$dual instead of dual Index Nested loops Hash join Hash partition index Hash Cluster Updates, inserts, select for update on blocks while reading those blocks Cause multiple copies select ash.sql_id, count(*), sql_text from v$active_session_history ash, v$sqlstats sql where event='latch: cache buffers chains' and sql.sql_id(+)=ash.sql_id group by ash.sql_id, sql_text;
#.35 Latch: cache buffer handles Buffers can be pinned Possibly increase _db_handles_cached 5 Unsupported Used when pinning block headers for expected reuse
#.36 Free Buffer Wait The data block cache lacks free buffers. Tune by: increasing data blocks, trying to tune DBWR, improving inefficient SQL requesting a large # of blocks
#.37 Free Buffer Wait Finding a Free Block If the data block isn’t in cache Get a free block and header in the buffer cache Read it off disk Update the free header Read the block into the buffer cache Need Free Block to Read in New Data Block
#.38 Finding a Free Block ShadowProcess When a session reads a block into the buffer cache, how does it find a FREE spot?
#.39 Finding a Free Block Buffer Headers Data Blocks Hash bucket latches 1.Arrange the Buffer Headers into an LRU List 2.Scan LRU for a free block
#.40 Cache Buffers LRU = entry in x$bh
#.41 X$bh Describes Buffer Headers SQL> desc x$bh Name Type ADDR RAW(4) DBARFIL NUMBER DBABLK NUMBER OBJ NUMBER HLADDR RAW(4) NXT_HASH RAW(4) PRV_HASH RAW(4) NXT_REPL RAW(4) PRV_REPL RAW(4) Cache buffers chains: HLADDR, NXT_HASH, PRV_HASH. LRU: NXT_REPL, PRV_REPL.
#.42 LRU Chain [Diagram: the same buffer headers linked two ways, by NXT_HASH / PRV_HASH for the hash chain and by NXT_REPL / PRV_REPL for the LRU chain]
#.43 Cache Buffers LRU list
#.44 Cache Buffers LRU list LRU Chain of Buffer Headers Buffer Cache
#.45 Cache Buffers LRU Latch MRU LRU Buffer Headers “Cold” LRU = Least Recently Used MRU = Most Recently Used One LRU Latch protects the linked list during changes to the list “Hot” LRU latch
#.46 Session Searching for Free Blocks MRU LRU Buffer Headers Session Shadow 1. Go to the LRU end of data blocks 2. Look for first non-dirty block 3. If too many blocks are searched, post DBWR to make free buffers 4. Free Buffer wait
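The search steps above can be sketched as a short Python function. The function name `find_free_buffer`, the `max_scan` threshold and the dictionary layout are assumptions made for illustration; the slide does not say how many buffers a session inspects before it posts DBWR.

```python
def find_free_buffer(buffers, max_scan):
    """Scan from the LRU end for the first non-dirty buffer.

    `buffers` is ordered with the LRU end first. Returns ("free", position)
    when a clean buffer is found within `max_scan` buffers, otherwise
    ("post_dbwr", None), the point where the session would post DBWR and
    start waiting on "free buffer waits".
    """
    for position, buf in enumerate(buffers[:max_scan]):
        if not buf["dirty"]:
            return ("free", position)
    return ("post_dbwr", None)


# LRU end first: two dirty buffers, then a clean one.
lru_list = [{"dirty": True}, {"dirty": True}, {"dirty": False}]
found = find_free_buffer(lru_list, max_scan=5)
gave_up = find_free_buffer(lru_list, max_scan=2)
```

The more dirty buffers pile up at the LRU end, the more often the scan gives up, which is why tuning DBWR or enlarging the cache are the suggested remedies.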
#.47 Free Buffer Wait Solutions Tune by Increase data blocks Try to tune DBWR ASYNC If no ASYNC use I/O Slaves (dbwr_io_slaves) Multiple DBWR (db_writer_processes) Direct I/O Tune Inefficient SQL requesting large # of blocks
#.48 Session Finding a Free Block MRU LRU Hot End Mid-Point Insertion Get LRU Latch Find Free Block Insert Header Release LRU Latch session LRU Latch
#.49 DBWR taking Dirty Blocks off MRU LRU Buffer Headers LRU DBWR Dirty List of Buffer Headers LRUW latch LRU latch also covers DBWR list of dirty blocks
#.50 Cache Buffers LRU Latch MRU LRU Mid-Point Insertion Oracle tracks the touch count of blocks. As a block is pushed to the LRU end, if its touch count is 3 or more, it is promoted to the MRU end
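The promotion rule can be sketched with a `collections.deque`, where the left side is the LRU end and the right side is the MRU end. The function name `reclaim_one` and the dictionary layout are hypothetical, and resetting the touch count to 0 on promotion is an assumption that follows the earlier note that TCH goes back to zero when a block cycles through the cache.

```python
from collections import deque

PROMOTE_AT = 3  # touch count threshold from the slide


def reclaim_one(lru):
    """Pop buffers off the LRU end until a cold one is found.

    Buffers with tch >= PROMOTE_AT are moved to the MRU end instead of being
    reused. Returns the victim buffer, or None if every buffer was hot.
    """
    for _ in range(len(lru)):
        buf = lru.popleft()        # take the buffer at the LRU end
        if buf["tch"] >= PROMOTE_AT:
            buf["tch"] = 0         # assumption: count restarts after promotion
            lru.append(buf)        # promote to the MRU end
        else:
            return buf             # cold buffer: reuse it
    return None


lru = deque([
    {"name": "A", "tch": 5},  # at the LRU end, but frequently touched
    {"name": "B", "tch": 1},
    {"name": "C", "tch": 0},
])
victim = reclaim_one(lru)
```

Here block A survives because of its touch count, B is reused, and the list order afterwards is C then A.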
#.51 Multiple Sets Solution: Multiple Sets _db_block_lru_latches = 8 10gR2 with cpu_count = 2 X$KCBWDS – set descriptor Set 1 Set 2 LRU Latch 1 LRU Latch 2
#.52 Working Sets select ds.set_id, ds.blk_size, bp.BUFFERS, nvl(bp.name,'unused') from x$kcbwds ds, v$buffer_pool bp where ds.set_id >= bp.lo_setid (+) and ds.set_id <= bp.hi_setid (+) / SET_ID BLK_SIZE BUFFERS NAME DEFAULT DEFAULT
#.53 Test Case 8 sessions reading separate tables. Tables were too big to hold in cache. Cache option set on each table. Result: lots of buffer cache churn. Expected to get “latch: cache buffers lru chain”
#.54 simulator lru latch
#.55 CBC – Further Investigation select p2, count(*) from v$active_session_history where event= 'latch free' group by p2 select * from v$latchname where latch#=127 P2 COUNT(*) LATCH# NAME simulator lru latch select * from v$event_name where name = 'latch free' PARAMETER1 PARAMETER2 PARAMETER3 address number tries
#.56 db_cache_advice Alter system set db_cache_advice=off; Group “other” is very small compared to I/O wait time – not a problem
#.57 Cache Buffers LRU Latch : Solution Other Increase Size of Buffer Cache Using multiple cache buffers Keep, recycle Possibly increase _db_block_lru_latches Not supported
#.58 Buffer Busy Waits User 1 tries to change a buffer header User 2 has buffer header “locked” (pinned) User1 User2
#.59 BBW Solution Paths 1. Find block type; resolve if possible 2. Tune SQL: find the SQL, how often it is called, by how many users 3. Eliminate hot block: find object, find block type. Block Types: Undo header: use AUM (or add more RBS). Undo block: hot spot in UNDO. Data block: index (hot spot, partition) or table (free lists, ASSM, partition). Segment header: free lists. Table data block: free lists. Freelist blocks: free list groups. File header block: look at extent allocation. There is a hot block, eliminate the hot block
#.60 BBW: Statspack Top 5 Timed Events Avg %Total ~~~~~~~~~~~~~~~~~~ wait Call Event Waits Time(s) (ms) Time buffer busy waits 5, log file parallel write read by other session db file parallel write 2, db file sequential read Class Waits Wait Time (s) Avg Time (ms) file header block data block 6, undo header segment header Fails to find object
#.61 BBW: ASH Finds Object Block Type SQL Statement CNT OBJ OTYPE SQL_ID BLOCK_TYPE TBS BBW_INDEX_VAL_I INDEX 635xhydd6fzgg segment header SYSTEM xhydd6fzgg usn 5 header UNDOTBS hsb81ypyrfs5 file header block UNDOTBS1 32 BBW_INDEX_VAL_I INDEX 1hsb81ypyrfs5 data block SYSTEM 33 BBW_INDEX_VAL_I INDEX 6avm49ys4k7t6 data block SYSTEM 34 BBW_INDEX_VAL_I INDEX 5wqps1quuxqr4 data block SYSTEM
#.62 BBW: OEM
#.63 Solutions
#.64 BBW Block Types select * from v$event_name where name = 'buffer busy waits' NAME P1 P2 P3 buffer busy waits file# block# class# select rownum n, class from v$waitstat; N CLASS 1 data block 2 sort block 3 save undo block 4 segment header 5 save undo header 6 free list 7 extent map 8 1st level bmb 9 2nd level bmb 10 3rd level bmb 11 bitmap block 12 bitmap index block 13 file header block 14 unused 15 system undo header 16 system undo block 17 undo header 18 undo block Note: Before 10g, P3 was the BBW type. If P3 in 100,110,120,130 then read (now “read by other session”), else write: P3 in 200,210,220,230,231
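The two ways of reading P3 can be sketched as lookup helpers in Python. The class list is copied from the v$waitstat ordering shown on the slide, and the read and write code sets come from the slide's note; the function names `class_name` and `classify_pre10g_p3` are hypothetical.

```python
# Block classes in v$waitstat row order (rownum 1..18), as listed on the slide.
CLASSES = [
    "data block", "sort block", "save undo block", "segment header",
    "save undo header", "free list", "extent map", "1st level bmb",
    "2nd level bmb", "3rd level bmb", "bitmap block", "bitmap index block",
    "file header block", "unused", "system undo header", "system undo block",
    "undo header", "undo block",
]

# Pre-10g reason codes from the slide's note.
READ_CODES = {100, 110, 120, 130}
WRITE_CODES = {200, 210, 220, 230, 231}


def class_name(p3):
    """10g and later: P3 is a class#, mapped to its v$waitstat class name."""
    if 1 <= p3 <= len(CLASSES):
        return CLASSES[p3 - 1]
    return f"class# {p3}"  # outside the 18 listed classes


def classify_pre10g_p3(p3):
    """Before 10g: P3 is a reason code that separates reads from writes."""
    if p3 in READ_CODES:
        return "read"
    if p3 in WRITE_CODES:
        return "write"
    return "unknown"
```

For example, `class_name(4)` gives "segment header", while a pre-10g P3 of 130 is a read, the case that later releases report as "read by other session".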
#.65 Joining ASH with v$waitstat select o.object_name obj, o.object_type otype, ash.SQL_ID, w.class from v$active_session_history ash, ( select rownum class#, class from v$waitstat ) w, all_objects o where event='buffer busy waits' and w.class#(+)=ash.p3 and o.object_id (+)= ash.CURRENT_OBJ# Order by sample_time; OBJ OTYPE SQL_ID CLASS TOTO1 TABLE 8gz51m9hg5yuf data block TOTO1 TABLE 8gz51m9hg5yuf segment header TOTO1 TABLE 8gz51m9hg5yuf data block
#.66 Alternative to ASH: AWR select to_char(BEGIN_INTERVAL_TIME,'DD-MON HH:MI'), o.name, s.BUFFER_BUSY_WAITS_DELTA from dba_hist_seg_stat s, dba_hist_snapshot sn, obj$ o where BUFFER_BUSY_WAITS_DELTA > 100 and sn.snap_id = s.snap_id and o.obj# = s.obj#; TO_CHAR(BEGI NAME BUFFER_BUSY_WAITS_DELTA JAN 10:21 TOTO
#.67 Example: BBW with Insert Concurrent inserts will insert into the same block. Each session has to wait for the previous session to finish its write. Usually pretty fast. Contention builds on highly concurrent applications. Causes: lack of Free Lists, not using ASSM (Automatic Segment Space Management)
#.68 Example: Lack of Free List S1 S2 S3 S4 4 Sessions running Insert into toto values (null, 'a'); Commit; OBJN OTYPE FILEN BLOCKN SQL_ID BLOCK_TYPE TOTO1 TABLE gz51m9hg5yuf data block TOTO1 TABLE gz51m9hg5yuf segment header
#.69 Solution1: Free Lists S1 S2 S3 S4 4 Sessions running Insert into toto values (null, 'a'); Commit;
#.70 Solution 2: ASSM Multiple Bitmap Blocks Track Free Space Unformatted Up to 25% Free Up to 50% Free Up to 75% Free Full Free block chosen by Process ID Possibly instance # for RAC
#.71 Solution 2: ASSM Header Level 2 Level 1 DataBlocks BitmapBlocks
#.72 Tablespace Types : ASSM
select tablespace_name, extent_management LOCAL, allocation_type EXTENTS, segment_space_management ASSM, initial_extent from dba_tablespaces
TABLESPACE_NAME  LOCAL  EXTENTS  ASSM
SYSTEM           LOCAL  SYSTEM   MANUAL
UNDOTBS1         LOCAL  SYSTEM   MANUAL
SYSAUX           LOCAL  SYSTEM   AUTO
TEMP             LOCAL  UNIFORM  MANUAL
USERS            LOCAL  SYSTEM   AUTO
EXAMPLE          LOCAL  SYSTEM   AUTO
DATA             LOCAL  SYSTEM   MANUAL
create tablespace data2 datafile '/d3/kyle/data2_01.dbf' size 200M segment space management auto;
#.73 BBW: ASSM If the waits are on ASSM bitmap blocks (1st level bmb, 2nd level bmb, 3rd level bmb), consider using Freelists instead of ASSM. Normally waits on ASSM blocks should be too small to warrant using Freelists. ASSM is easier and automatically managed.
#.74 BBW on Index Index Session 1 Session 2 Session 3 Increasing index key creates a hot spot on the leading index leaf OBJN OTYPE FILEN BLOCKN SQL_ID BLOCK_TYPE BBW_INDEX_INDEX dgthz60u28d data block 1 Use Reverse Key indexes Breaks Index scans Hash Partition Index More IOs per index access
#.75 BBW on Index : ADDM Recs Also consider “reversing” the key
#.76 Example: BBW on RBS If BBW happen on old-style RBS (Class# > 18), switch to UNDO. With old-style RBS, the DBA had to figure out the # of RBS segments. With UNDO, it is automatically managed. alter system set undo_management=auto scope=spfile;
#.77 BBW and RBS Segs OBJN OTYPE FILEN BLOCKN SQL_ID BLOCK_TYPE TOTO1 TABLE gz51m9hg5yuf data block TOTO1 TABLE gz51m9hg5yuf segment header gz51m9hg5yuf 87 Select CURRENT_OBJ#||' '||o.object_name objn, o.object_type otype, CURRENT_FILE# filen, CURRENT_BLOCK# blockn, ash.SQL_ID, w.class ||' '||to_char(ash.p3) block_type from v$active_session_history ash, (select rownum class#, class from v$waitstat ) w, all_objects o where event='buffer busy waits' and w.class#(+)=ash.p3 and o.object_id (+)= ash.CURRENT_OBJ# Order by sample_time;
#.78 Further Investigation RBS Old Style RBS if Class# > 18 P1 P2 P3 SQL_ID COUNT(*) CLASS wa5hjpzr0by gkmtvxzu6p2m zx1krfcgn88t 8 data block s29zyzr55z2t 1 select segment_name, segment_type from dba_extents where file_id = P1 and P2 between block_id and block_id + blocks - 1; SEGMENT_NAME SEGMENT_TYPE R2 ROLLBACK
#.79 ADDM finds old style RBS
#.80 BBW: File Header Querying ASH, make sure P1=current_file# P2=current_block# If not, use p1, p2 and not current_object# Time P1 P2 OBJN OTYPE FN BLOCKN BLOCK_TYPE : file header block 11: TOTO TABLE file header block SELECT A.OBJECT_ID FROM ALL_OBJECTS A, ( SELECT * FROM ALL_OBJECTS WHERE ROWNUM < 1000) B ORDER BY A.OBJECT_NAME
#.81 BBW : File Header Time P1 P2 OBJN OTYPE FN BLOCKN BLOCK_TYPE : TOTO TABLE file header block Solution is to make the initial and next extents larger in the temp tablespace. ADDM doesn’t say much
#.82 Write Complete Waits Usually happens in tandem with free buffer waits. Tune by increasing the data block cache. Happens because a shadow process wants to access blocks that are currently being written to disk by DBWR. Also seen when there is a lot of writing for sorts; the waits are then on block 2 of the temp tablespace file.
#.83 Write Complete Waits LRU DBWR Dirty List of Buffer Headers LRUW Session
#.84 Buffer Exterminate Buffer cache dynamically resized V$SGA_DYNAMIC_COMPONENTS displays information about the dynamic SGA components. This view summarizes information based on all completed SGA resize operations since instance startup. V$SGA_CURRENT_RESIZE_OPS displays information about SGA resize operations which are currently in progress. An operation can be a grow or a shrink of a dynamic SGA component. V$SGA_DYNAMIC_FREE_MEMORY displays information about the amount of SGA memory available for future dynamic SGA resize operations. Alter system set db_cache_size=50M;
#.85 Summary Buffer Cache Waits 1. latch: cache buffers chains - find SQL, eliminate hot spots 2. latch: cache buffers lru chain - increase sets 3. Free Buffer Wait - increase cache size 4. Buffer Busy Wait - Index: alleviate hot spots, partition. Data DML: add free lists or use ASSM. File / Segment Header: look at high extent allocations 5. Write Complete Waits - increase cache size