16K CR gets for UQ index scan? Easy. Kind of.

I planned to write on this for quite some time, but failed to do so. Sorry about that. Today I finally got time and desire to describe a situation from the title. It was observed on an Oracle 9.2.0.8 running Solaris SPARC; manifested itself as a severe CPU burning at 100% utilization with ‘latch free’ on the first place of the Top 5 wait events in Statspack report.

Here is part of a 1 hour report:

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:            -72,280.51            -18,601.83
              Logical reads:            411,881.06            106,000.09
              Block changes:              2,515.53                647.39
             Physical reads:                665.17                171.18
            Physical writes:                218.18                 56.15
                 User calls:                394.87                101.62
                     Parses:                 79.46                 20.45
                Hard parses:                  0.94                  0.24
                      Sorts:                885.81                227.97
                     Logons:                  0.05                  0.01
                   Executes:              1,160.92                298.77
               Transactions:                  3.89

  % Blocks changed per Read:    0.61    Recursive Call %:     75.88
 Rollback per transaction %:    5.17       Rows per Sort:      5.95

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   99.98       Redo NoWait %:    100.00
            Buffer  Hit   %:   99.84    In-memory Sort %:    100.00
            Library Hit   %:   99.93        Soft Parse %:     98.82
         Execute to Parse %:   93.16         Latch Hit %:     89.32
Parse CPU to Parse Elapsd %:   84.20     % Non-Parse CPU:     99.57

 Shared Pool Statistics        Begin   End
                               ------  ------
             Memory Usage %:   90.41   91.42
    % SQL with executions>1:   31.20   35.28
  % Memory for SQL w/exec>1:   30.16   30.05

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
latch free                                      2,529,903      49,710    48.22
CPU time                                                       31,694    30.75
db file sequential read                         1,228,775      13,164    12.77
buffer busy waits                                 269,229       3,076     2.98
enqueue                                             2,276       2,567     2.49 

And this is Latch sleep breakdown:

                                      Get                            Spin &
Latch Name                       Requests      Misses      Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
cache buffers chains        4,190,999,464 ###########   2,524,798 467719813/20
                                                                  75503/172064
                                                                  /31324/0
library cache                  27,510,901      44,028         931 43116/895/15
                                                                  /2/0
row cache enqueue latch        10,692,790      27,738          16 27722/16/0/0
                                                                  /0
cache buffer handles            6,478,263       7,240          66 7175/64/1/0/
                                                                  0
shared pool                     2,657,115       5,706         412 5300/400/6/0
                                                                  /0
...

As you can see, ‘latch free’ waits are almost entirely accounted to the infamous ‘cache buffers chains’ latch gets. You can easily find information on what it is and why it could happen, and how to diagnose such contention. Here I’ll just show you what I’ve found by accident (just staring at the report). I noticed that two queries – #2 and #3 in the SQL Ordered by Gets section – which are nothing but a unique index access to a table, had ~16K gets per execution:

                                                     CPU      Elapsd
  Buffer Gets    Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
    333,759,439       21,538       15,496.3   22.4  7028.63  24546.46 4091048759
Module: JDBC Thin Client
SELECT 1 FROM T WHERE ID = :B1 FOR UPDATE

    259,164,764       16,137       16,060.3   17.4  5391.01  18934.74  307775861
Module: JDBC Thin Client
select ID from T where ID=:1

How that could possibly happen? At first I thought the index used by queries was missing, but that was not the case. Second thought – the index has exploded due to modifications (I found one query that probably had inserted lots of rows to the table, but rows_processed for the statement was zero) and its height was increased, which caused other sessions to do much more work when accessing both index and table blocks. That was just an idea which I didn’t know how to confirm or dismiss so I asked wise guys and they were kind enough to point out to me that index height increase is not a necessity to have a spike in CR gets for concurrent SQL access. All that is needed is just an uncommitted transaction that inserts rows to the table somewhere nearby the normally accessed data. Yep, it sounded reasonable to me and I was OK with this explanation. I did confirm that the long running INSERT statement was indeed adding millions of rows to the table due to an application bug (missing condition), but failed to do so because of transaction timeout on the WebLogic set to half an hour. So the statement actually was rolled back and that’s why rows_processed for the query was zero.
But when I tried to reproduce this situation to prepare test case for this blog entry, I was not able to do so. A simple access by index to the table with an in-flight transaction inserting huge amount of data is definitely not enough to reproduce such situation. So it must have been one of those scenarios when a (table or index) block has been updated many times, and subsequent transactions had to apply undo to reverse changes to a block made by other transactions. It’s very easy to setup so I won’t do that here. Also I think it’s impossible to identify which block caused the issue having just a standard Statspack report in hands. Anyway, here is statistics comparison that shows undo access was the cause for increased LIO and ‘cache buffers chain’ latch contention:

Statname Normal R1 R2 Comments
branch node splits

29

329

421

consistent changes

44,192,320

779,500,412

1,269,485,409

number of times undo was applied to construct a consistent block version
consistent gets

209,640,205

1,035,154,852

1,466,065,189

consistent gets – examination

142,031,460

889,574,780

1,366,583,799

number of times a block (such as undo or index root) was accessed under protection of ‘cache buffers chain’ latch
data blocks consistent reads – undo records applied

43,973,030

779,361,353

1,269,187,599

number of undo records applied while constructing CR block copy
leaf node splits

6,200

90,367

113,873

transaction tables consistent read – undo records applied

1,824

80,503

4,863

user calls

6,526,934

2,277,792

1,429,810

6 Comments

  1. Posted November 21, 2011 at 11:47 | Permalink | Reply

    Nice example with an explanation from Jonathan Lewis of a similar issue (but due to transaction table rollbacks due to block cleanout).

  2. Posted November 21, 2011 at 11:54 | Permalink | Reply

    Good post, Timur! Thanks for update your blog 🙂

  3. Posted November 24, 2011 at 01:39 | Permalink | Reply

    Timur,

    The very large number of leaf node splits is a clue.
    I did a little test as follows and got a result on just 10,000 rows inserted that seems to be heading in the right direction:

    rem	=========
    rem	Session 1
    rem	=========
    
    create table t1 (id number , v1 varchar2(40), v2 varchar2(10));
    
    insert into t1 values (1,rpad('x',40),'x');
    insert into t1 values (1000000,rpad('x',40),'x');
    commit;
    
    alter table t1 add constraint t1_pk primary key (id,v1);
    execute dbms_stats.gather_table_stats(user,'t1');
    
    rem	=========
    rem	Session 2
    rem	=========
    
    set transaction read only;
    
    rem	=========
    rem	Session 1
    rem	=========
    
    begin
    	for i in reverse 2..10000  loop
    		insert into t1 values(i,rpad('x',40),'x');
    	end loop;
    end;
    /
    
    rem	=========
    rem	Session 2
    rem	=========
    
    select id from t1 where id = 1;
    
    
    -- key session stats for session 2.
    session logical reads                                                       24
    consistent gets                                                             24
    consistent gets from cache                                                  24
    consistent gets - examination                                               22
    consistent changes                                                          67
    CR blocks created                                                            1
    data blocks consistent reads - undo records applied                         67
    IMU CR rollbacks                                                            47
    
    
    • Posted November 24, 2011 at 10:26 | Permalink | Reply

      Jonathan,

      thank you for test case. Indeed, I was able to repeat something very similar to your setup even with a single insert statement, like this (I use specific value for start with of the sequence to simulate real life index):

      drop table t1 cascade constraints purge;
      create table t1 (id number, pad char(200) default 'x');
      
      drop sequence t1_seq;
      create sequence t1_seq start with 12345678912345678910;
      
      insert into t1 values (1, 'x');
      insert into t1 values (t_seq.nextval, 'x');
      commit;
      
      alter table t1 add constraint t1_pk primary key (id);
      execute dbms_stats.gather_table_stats(user, 't1');
      
      --
      
      insert into t1 select t1_seq.nextval, 'x' from dual connect by level <= 100000;
      
      SQL> @sesstat consistent
      
      NAME                                                                              VALUE
      ---------------------------------------------------------------------- ----------------
      consistent gets                                                                  19,738
      consistent gets from cache                                                       19,738
      consistent gets from cache (fastpath)                                             9,089
      consistent gets - examination                                                    10,529
      consistent gets direct                                                                0
      consistent changes                                                                9,339
      transaction tables consistent reads - undo records applied                            0
      transaction tables consistent read rollbacks                                          0
      data blocks consistent reads - undo records applied                               9,226
      no work - consistent read gets                                                    7,716
      cleanouts only - consistent read gets                                                 1
      rollbacks only - consistent read gets                                                 0
      cleanouts and rollbacks - consistent read gets                                       23
      
      13 rows selected.
      
      SQL> select * from t1 where id = 12345678912345678910;
      
      no rows selected
      
      SQL> @sesstat consistent
      
      NAME                                                                              VALUE
      ---------------------------------------------------------------------- ----------------
      consistent gets                                                                  20,200
      consistent gets from cache                                                       20,200
      consistent gets from cache (fastpath)                                             9,089
      consistent gets - examination                                                    10,990
      consistent gets direct                                                                0
      consistent changes                                                                9,800
      transaction tables consistent reads - undo records applied                            0
      transaction tables consistent read rollbacks                                          0
      data blocks consistent reads - undo records applied                               9,686
      no work - consistent read gets                                                    7,716
      cleanouts only - consistent read gets                                                 1
      rollbacks only - consistent read gets                                                 0
      cleanouts and rollbacks - consistent read gets                                       24
      
      13 rows selected.

      So it’s almost 500 gets…

    • Posted November 24, 2011 at 11:07 | Permalink | Reply

      Timur,

      Here’s a little detail that I haven’t followed up.

      I put the “set transaction read only” into the code because I thought this would make it easier to run and see the excessive logical I/O. In fact the number of consistent gets is much higher if I don’t have that call in place.

  4. Posted November 24, 2011 at 01:50 | Permalink | Reply

    Timur,

    Correction – the stats I’ve quoted came from a predicate “id = 9999”.
    For “id = 1” I got 245 session logical reads.

    I should have said I was running 10.2.0.3 at the time.

One Trackback

  1. […] Timur Akhmadeev wonders about 16K CR gets for UQ index scan? […]

Leave a comment