FBI trouble

In our application we extensively use a function-based index on an important table. Couple of days ago I’ve seen an interesting issue associated with this FBI, view and a GROUP BY query. I have to say I don’t have an explanation what exactly it is and how I should call it properly, hence just “trouble” in the subject line.

This is a synthetic setup to demonstrate the case:

drop table t cascade constraints purge
;
create table t (
    id    number not null,
    x     varchar2(100),
    y     number,
    pad   varchar2(50)
);
insert into t
with g as (select /*+ materialize */ null from all_source where rownum <= 1000)
select mod(rownum, 1000) id,
       lpad('x', rownum/1e5, 'x') x,
       1 y,
       lpad('x', 50, 'x') pad
  from g g1, g g2
 where rownum <= 1e6
;

create index t_indx on t(id, coalesce(upper(x), to_char(y)))
;
exec dbms_stats.gather_table_stats(user, 't', method_opt=>'for all columns size 1', cascade=>true)
exec dbms_stats.gather_table_stats(user, 't', method_opt=>'for all hidden columns size 1', no_invalidate=>false)

create or replace view v1
as
select t.*, coalesce(upper(x), to_char(y)) func from t
;
create or replace view v2
as
select v1.id, nvl(coalesce(upper(x), to_char(y)), 1) y, func from v1
;

So there’s a table and a function-based index defined. There’re also two views selecting data from table: one that adds FBI column, the other tries to restrict selected data to that stored in the index. So far so good:

SQL> explain plan for select * from v2 where id = :1;

Explained.

SQL> select * from table(dbms_xplan.display(null,null,'+outline +projection'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------
Plan hash value: 4058602070

---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |  1000 | 10000 |     5   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| T_INDX |  1000 | 10000 |     5   (0)| 00:00:01 |
---------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      INDEX(@"SEL$5C160134" "T"@"SEL$3" "T_INDX")
      OUTLINE(@"SEL$3")
      OUTLINE(@"SEL$2")
      MERGE(@"SEL$3")
      OUTLINE(@"SEL$335DD26A")
      OUTLINE(@"SEL$1")
      MERGE(@"SEL$335DD26A")
      OUTLINE_LEAF(@"SEL$5C160134")
      ALL_ROWS
      DB_VERSION('11.2.0.3')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T"."ID"=TO_NUMBER(:1))

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "T"."ID"[NUMBER,22], COALESCE(UPPER("X"),TO_CHAR("Y"))[VARCHAR2,1
       00]

39 rows selected.

A query gets pure index access and does not touch the table at all, as wanted. Now kind of dumb query is fired against this view:

SQL> explain plan for
  2  select count(t2.y)
  3    from (select 1 id from dual) t1
  4        ,(select id, y from v2 where id = :1) t2
  5  where t1.id = t2.id(+)
  6  group by t1.id
  7  ;

Explained.

SQL> select * from table(dbms_xplan.display(null,null,'+outline +projection'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
Plan hash value: 2568130530

----------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |     1 |    27 |     7   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT         |        |     1 |    27 |     7   (0)| 00:00:01 |
|   2 |   NESTED LOOPS OUTER          |        |     1 |    27 |     7   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL          | DUAL   |     1 |     2 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS BY INDEX ROWID| T      |     1 |    25 |     5   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN          | T_INDX |     1 |       |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_NL(@"SEL$70D5F4CE" "T"@"SEL$5")
      LEADING(@"SEL$70D5F4CE" "DUAL"@"SEL$2" "T"@"SEL$5")
      INDEX_RS_ASC(@"SEL$70D5F4CE" "T"@"SEL$5" "T_INDX")
      FULL(@"SEL$70D5F4CE" "DUAL"@"SEL$2")
      OUTLINE(@"SEL$5")
      OUTLINE(@"SEL$4")
      MERGE(@"SEL$5")
      OUTLINE(@"SEL$7286615E")
      OUTLINE(@"SEL$3")
      MERGE(@"SEL$7286615E")
      OUTLINE(@"SEL$C8360722")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      MERGE(@"SEL$C8360722")
      MERGE(@"SEL$2")
      OUTLINE_LEAF(@"SEL$70D5F4CE")
      ALL_ROWS
      DB_VERSION('11.2.0.3')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("T"."ID"(+)=TO_NUMBER(:1))
       filter("T"."ID"(+)=CASE  WHEN (ROWID IS NOT NULL) THEN 1 ELSE 1 END )

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=1) 1[2], COUNT(CASE  WHEN ROWID IS NOT NULL THEN
       NVL(COALESCE(UPPER("T"."X"),TO_CHAR("T"."Y")),'1') ELSE NULL END )[22]
   2 - (#keys=0) ROWID[ROWID,10], "T"."X"[VARCHAR2,100], "T"."Y"[NUMBER,22]
   3 - ROWID[ROWID,10]
   4 - ROWID[ROWID,10], "T"."X"[VARCHAR2,100], "T"."Y"[NUMBER,22]
   5 - ROWID[ROWID,10]

56 rows selected.

See how Oracle reacted. It’s now accessing the table, and it’s not wanted at all. Optimizer trace shows that the query undergoes several view merging and is transformed to

select count(case
                 when "T".ROWID is not null then
                  nvl(coalesce(upper("T"."X"), to_char("T"."Y")), 1)
                 else
                  null
             end) "COUNT(T2.Y)"
  from "SYS"."DUAL" "DUAL", TIM."T" "T"
 where case
           when "DUAL".ROWID is not null then
            1
           else
            1
       end = "T"."ID"(+)
   and "T"."ID"(+) = :b1
 group by 1

I assume this transformation resulted in additional CASE expression in the select list and this is what made Oracle to access table. Well, now I know why it happens and how to get rid of it, but I really curious to know how to call and control it. There’s really no information in 10053 trace about this feature, except that the predicate first appears under “Outer Join Elimination” transformation – but turning OJE off does not change anything. If anyone knows more on this topic – welcome to comments.

Update Dec 19th I’ve uploaded 10053 trace generated for a modified query (see comments): link to download. Please note that the file is plain *.txt with .doc extension.

About these ads

16 Comments

  1. Posted December 14, 2011 at 23:50 | Permalink | Reply

    Hi,

    It makes sense to me.

    In your first query, table T can be accessed by ID, which is the first column of the index. So the index range scan has a nice selectivity.
    In your second query, you first have an outer join. This means you need all the ID values from t2. Then you count the data on t2.y. So the optimizer needs to decide the value of t2.y which is not a part of the index. So the table has to be read to check the predicate.

    Since you are using an nvl in your view, it would make more sense to query count(t2.id). The result will be the same. Even though an outer join is out of place here anyway.

    Regards,

    Arian

    • Posted December 15, 2011 at 16:18 | Permalink | Reply

      Hi Arian

      thanks for comment. If you try to change a subquery by adding rownum like this:

      explain plan for
      select count(t2.y)
        from (select 1 id from dual) t1
           , (select id, y from v2 where id = :1 and rownum > 0) t2
       where t1.id = t2.id(+)
       group by t1.id;
      

      You’ll see that Oracle is able to avoid table access. It’s still same query, same outer join – and yet no table access.

      NVL is there just to apply a function, in reality there’s something different.

      • Posted December 16, 2011 at 02:06 | Permalink | Reply

        Hi,

        thanks. That’s an interesting development. But now you are materializing the view first. I don’t have a database at the moment. But I wonder what happens when you put t2 in a with clause.
        Would Oracle still transform it to the same view? And if so, what happens when you add a materialize hint?

        Regards,

        Arian

        • Posted December 16, 2011 at 11:07 | Permalink | Reply

          Arian

          thanks for your desire to help :) to overcome the issue, but please read my reply about the purpose of this post.

          But I wonder what happens when you put t2 in a with clause.

          Originally it was there; it doesn’t change the way how original query behaves. However, the presence of the subquery factoring (which I don’t like; I will write a post on this some day) makes the fix dependent on the Oracle version in use. But, again – this isn’t my point here.

          • Posted December 16, 2011 at 23:22 | Permalink | Reply

            Hi Timur,

            I’m afraid we have a bit of a miscommunication here. I did read your original post, and I understand it very well. But I would like to know the full details of this case. You didn’t post the entire 10053 trace, but claim that it is caused by case in the select statement. Being a longer time reader of your blog, I believe your diagnosis. But it is not sufficient enough for me to understand when and where this would happen. That is why I am asking for alternatives, to see what exactly goes on.

            In your blog post you ask if somebody has seen this issue before. No, I don’t know the issue. If that was the only reason for your blog post, we’re done. But I’d prefer your blog to be a bit more substantial, and to make your readers understand what’s going on.

            Just a thought.

            Regards,

            Arian

            • Posted December 19, 2011 at 11:42 | Permalink | Reply

              Hi Arian

              thank you for feedback. I’ll post the complete 10053 trace and update the post probably today.

            • Posted December 19, 2011 at 17:24 | Permalink | Reply

              I’ve updated the post with a complete 10053 trace file.

  2. A.
    Posted December 15, 2011 at 17:53 | Permalink | Reply

    Timur, Hi dude.

    You can make your example more simple, “group by” is redundant here. And I suggest you will not see CASE there.

    • Posted December 15, 2011 at 18:00 | Permalink | Reply

      Hi

      I know I can do different things to get rid of this – but the point is to find out why CASE expression appears here, how this transformation is called and how to control it without query modification on Oracle parameters level.

      • A.
        Posted December 15, 2011 at 18:37 | Permalink | Reply

        I believe that the original problem is not in CASE, but in NVL + outer join. Did you check your query without groupby/count? And without FBI and with just usual index?

        • Posted December 16, 2011 at 10:58 | Permalink | Reply

          I can only repeat: I do know how to get rid of this issue in different ways, so the “fix” is not what is really interesting to me. I’d like to know how it’s called and if there’s a way to control it without any change to the test case.

          • A.
            Posted December 16, 2011 at 13:11 | Permalink | Reply

            I just said that if you delete groupby & count from your statement the execution plan will still be the same, but I expect that you will not see CASE in the transformed query. I did not propose to fix the issue. My simple point was – if you make your example more simple (without redundant “group by”) you maybe will have more ideas about the issue.

            • Posted December 16, 2011 at 16:34 | Permalink | Reply

              Thanks for that. Yes, the minimal query lacks CASE filter in the plan, although 10053 trace still shows this expression in the SELECT list. I also tried to understand at which OFE level the query starts working index only, and it looks like it does so until 8.1.7:

              SQL> explain plan for
                2  select /*+ optimizer_features_enable('8.1.7') */ t2.y
                3    from (select 1 id from dual) t1
                4       , (select id, y from v2 where id = :1) t2
                5   where t1.id = t2.id(+);
              
              Explained.
              
              SQL> @xp
              
              PLAN_TABLE_OUTPUT
              -----------------------------------------------------------------------
              Plan hash value: 250159355
              
              --------------------------------------------------------------
              | Id  | Operation           | Name   | Rows  | Bytes | Cost  |
              --------------------------------------------------------------
              |   0 | SELECT STATEMENT    |        |  1000 | 65000 |     2 |
              |   1 |  MERGE JOIN OUTER   |        |  1000 | 65000 |     2 |
              |   2 |   FAST DUAL         |        |     1 |       |     1 |
              |   3 |   VIEW              |        |  1000 | 65000 |     1 |
              |*  4 |    FILTER           |        |       |       |       |
              |*  5 |     INDEX RANGE SCAN| T_INDX |  1000 | 13000 |     1 |
              --------------------------------------------------------------
              
              Predicate Information (identified by operation id):
              ---------------------------------------------------
              
                 4 - filter(TO_NUMBER(:1)=1)
                 5 - access("T"."ID"=TO_NUMBER(:1))
              
              Note
              -----
                 - cpu costing is off (consider enabling it)
              
              22 rows selected.
              
              SQL> explain plan for
                2  select /*+ optimizer_features_enable('9.0.0') */ t2.y
                3    from (select 1 id from dual) t1
                4       , (select id, y from v2 where id = :1) t2
                5   where t1.id = t2.id(+);
              
              Explained.
              
              SQL> @xp
              
              PLAN_TABLE_OUTPUT
              -----------------------------------------------------------------------
              Plan hash value: 3556404565
              
              -----------------------------------------------------------------------
              | Id  | Operation                    | Name   | Rows  | Bytes | Cost  |
              -----------------------------------------------------------------------
              |   0 | SELECT STATEMENT             |        |     1 |    20 |     2 |
              |   1 |  MERGE JOIN OUTER            |        |     1 |    20 |     2 |
              |   2 |   FAST DUAL                  |        |     1 |       |     1 |
              |   3 |   TABLE ACCESS BY INDEX ROWID| T      |     1 |    20 |     1 |
              |*  4 |    INDEX RANGE SCAN          | T_INDX |     1 |       |     1 |
              -----------------------------------------------------------------------
              
              Predicate Information (identified by operation id):
              ---------------------------------------------------
              
                 4 - access("T"."ID"(+)=TO_NUMBER(:1))
                     filter("T"."ID"(+)=1)
              
              Note
              -----
                 - cpu costing is off (consider enabling it)
              
              21 rows selected.
              
  3. Posted December 25, 2011 at 22:00 | Permalink | Reply

    The test case can be simpler:

    drop table t purge;
    create table t (id number not null, x number);
    create index i on t(id, nvl(x,1));
    
    explain plan for
    select a
    from (select 1 id from dual) t1
       , (select id, nvl(x,1) a from (select * from t)) t2
    where t1.id = t2.id(+);
    select * from table(dbms_xplan.display);

    And in 11.2 I got NL2 instead of CASE in 10053.

    SELECT NVL2("T".ROWID,NVL("T"."X",1),NULL) "A" FROM "SYS"."DUAL" "DUAL","TEST_USER"."T" "T" WHERE "T"."ID"(+)=1
    • Posted December 27, 2011 at 09:11 | Permalink | Reply

      There are a lot of bugs with outer join and FBI based on the functions that may return null for not null arguments on oracle support. So it seems to be special case code in the optimizer which covers the function with CASE or NVL2 + rowid and probably eliminates access path: index (IndexOnly). My previous example works fine (with no access table) on 10.2 and goes wrong on 11.1 and 11.2.

      • Posted December 27, 2011 at 14:37 | Permalink | Reply

        Hi Valentin

        thanks for even more simpler test case.

        So it seems to be special case code in the optimizer which covers the function with CASE or NVL2 + rowid and probably eliminates access path: index (IndexOnly). My previous example works fine (with no access table) on 10.2 and goes wrong on 11.1 and 11.2.

        It might be so. Looks like the same “fix” was “ported” to 10.2.0.5 which also reports table access, while 10.2.0.4 does not.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 290 other followers