Saturday, 7 May 2011

Identifying hot objects/blocks in Oracle database

Thanks to my colleague Jim for this!

We recently experienced a problem with one of our applications whereby a particular element was very sluggish - everything else appeared normal.

We identified the particular query responsible, via statspack, however the explain plan on it was optimal, i.e. good use of indexes as expected.

So where do we go from here?

Let's look at the waits:

SELECT v$active_session_history.event, SUM(vash.wait_time + vash.time_waited) ttl_wait_time
FROM v$active_session_history vash WHERE vash.sample_time BETWEEN sysdate - 60/2880 AND sysdate
GROUP BY vash.event
ORDER BY 2;

EVENTTTL_WAIT_TIME
os thread startup42384
log file parallel write296628
log file sync315965
control file parallel write516435
latch free521779
db file scattered read10627019
db file sequential read19487981
read by other session21554323
latch: cache buffers chains48295239
 140879051


Here we can see the figures for "read by other session" and "latch: cache buffer chains" are particularly high.

Both waits are related to blocks in the buffer cache so we look to see if there are any obvious hot objects, by looking at the child latches of cache buffer chains where there are high sleep counts (>1500) and number of block hits is greater than 5:

SELECT /*+ RULE */e.owner ||'.'|| e.segment_name segment_name, e.extent_id extent#, x.dbablk - e.block_id + 1 block#, x.tch, l.child#
FROM sys.v$latch_children l, sys.x$bh x, sys.dba_extents e
WHERE x.hladdr IN (SELECT addr
FROM v$latch_children
WHERE name='cache buffers chains'
AND sleeps>1500)
AND e.file_id = x.file
AND x.hladdr = l.addr
AND x.dbablk between e.block_id
AND e.block_id + e.blocks -1
AND x.tch > 5
ORDER BY x.tch desc;

SEGMENT_NAMEEXTENT#BLOCK#TCHCHILD#
AAA.BBB6079397577
AAA.BBB8198386861
AAA.BBB62104383205
AAA.BBB81130375033
........       
........       
AAA.BBB7968965947



As you can see 1 object is pretty much the cause of these wait events, this is also consistent with the object in the query identified by statspack.

So what this effectively means is that multiple sessions are simultaneously trying to access rows of the table contained within the same block. We are getting waits whilst a second session tries to access the blocks as session one loads them.

So what can we do to try and reduce the problem? Well the best we can do is rebuild the object with adjusted PCTUSED (smaller) and PCTFREE (higher) so that fewer rows will occupy each block.

Hopefully you will then find reduced contention as few rows in a block mean less chance of two queries going for the same blocks at the same time.

Should this not resolve the situation an application review should be conducted to reduce the contention placed on this object.

For further reading look at oracle support 163424.1 732891.1
Published with Blogger-droid v1.6.7