About a year ago I’ve discovered nice feature of Oracle 10gR2 CBO: to overcome an issue with calculated selectivity for predicates on multiple columns, it can use DISTINCT_KEYS of the available index. Couple of weeks ago the bug fix mentioned in the OTN thread actually helped to solve a performance issue of a query. And about a week ago I found that this fix doesn’t work with function-based indexes. So, this post is about it.
A test case:
drop table t cascade constraints purge;
create table t as
with g as (select /*+ materialize */ null from all_objects where rownum <= 1000)
select rownum id,
'x' || lpad(mod(rownum, 100), 9, '0') x,
'y' || lpad(mod(rownum, 100), 9, '0') y
from g g1, g g2
where rownum <= 1e5;
exec dbms_stats.gather_table_stats(user, 't', method_opt=>'for all columns size 1', estimate_percent=>null)
alter session set "_fix_control"='5765456:0';
create index t_indx on t(x, y);
explain plan for select /*+ index(t t_indx) */ * from t where x = :1 and y = :2;
@xp
alter session set "_fix_control"='5765456:7';
explain plan for select /*+ full(t) */ * from t where x = :1 and y = :2;
@xp
explain plan for select /*+ index(t t_indx) */ * from t where x = :1 and y = :2;
@xp
And excerpts of output when running it on 10.2.0.4P35 with comments
-- defaults: good cardinality for IRS, but bad for table access
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 260 | 449 (0)| 00:00:06 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 10 | 260 | 449 (0)| 00:00:06 |
|* 2 | INDEX RANGE SCAN | T_INDX | 1000 | | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("X"=:1 AND "Y"=:2)
-- enabled fix for bug 5765456 and FTS: all OK
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 26000 | 103 (1)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 1000 | 26000 | 103 (1)| 00:00:02 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X"=:1 AND "Y"=:2)
-- enabled fix for bug 5765456 and IRS: all OK too
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 26000 | 449 (0)| 00:00:06 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1000 | 26000 | 449 (0)| 00:00:06 |
|* 2 | INDEX RANGE SCAN | T_INDX | 1000 | | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("X"=:1 AND "Y"=:2)
Now let’s add a function-based index and try similar queries:
create index t_indx2 on t(x, upper(y)); exec dbms_stats.gather_table_stats(user, 't', method_opt=>'for all hidden columns size 1', estimate_percent=>null) explain plan for select /*+ full(t) */ * from t where x = :1 and upper(y) = :2; @xp explain plan for select /*+ index(t t_indx2) */ * from t where x = :1 and upper(y) = :2; @xp
Oops:
-- enabled fix for bug 5765456 and FTS: wrong estimates
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 270 | 103 (1)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 10 | 270 | 103 (1)| 00:00:02 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X"=:1 AND UPPER("Y")=:2)
-- -- enabled fix for bug 5765456 and IRS: wrong estimates for table access
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 270 | 449 (0)| 00:00:06 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 10 | 270 | 449 (0)| 00:00:06 |
|* 2 | INDEX RANGE SCAN | T_INDX2 | 1000 | | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("X"=:1 AND UPPER("Y")=:2)
So, 10.2.0.4 is not good in this scenario. What about 11.2.0.1 (Win32)?
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 27000 | 449 (0)| 00:00:06 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1000 | 27000 | 449 (0)| 00:00:06 |
|* 2 | INDEX RANGE SCAN | T_INDX | 1000 | | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("X"=:1 AND "Y"=:2)
-- fix 5765456 can't be disabled in 11.2.0.1
ERROR:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-00940: invalid ALTER command
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 27000 | 127 (1)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 1000 | 27000 | 127 (1)| 00:00:02 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X"=:1 AND "Y"=:2)
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 27000 | 449 (0)| 00:00:06 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1000 | 27000 | 449 (0)| 00:00:06 |
|* 2 | INDEX RANGE SCAN | T_INDX | 1000 | | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("X"=:1 AND "Y"=:2)
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 38000 | 127 (1)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 1000 | 38000 | 127 (1)| 00:00:02 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X"=:1 AND UPPER("Y")=:2)
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 38000 | 449 (0)| 00:00:06 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1000 | 38000 | 449 (0)| 00:00:06 |
|* 2 | INDEX RANGE SCAN | T_INDX2 | 1000 | | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("X"=:1 AND UPPER("Y")=:2)
So in 11.2.0.1:
- bug fix 5765456 is enabled by default and can’t be disabled
- in all cases Oracle calculates cardinality estimate properly
- it’s great, but requires an index (though you can make it unusable and save associated overheads). When I tried to use pure extended statistics in 11.2.0.1, I’ve encountered several issues. I’m copy-pasting output with comments
-- adding virtual column with the same function as in existing function-based index fails
SQL> alter table t add (upper_y as (upper(y)));
alter table t add (upper_y as (upper(y)))
*
ERROR at line 1:
ORA-54015: Duplicate column expression was specified
SQL> drop table t cascade constraints purge;
Table dropped.
SQL> create table t as
2 with g as (select /*+ materialize */ null from all_objects where rownum <= 1000)
3 select rownum id,
4 'x' || lpad(mod(rownum, 100), 9, '0') x,
5 'y' || lpad(mod(rownum, 100), 9, '0') y
6 from g g1, g g2
7 where rownum <= 1e5;
Table created.
SQL> alter table t add (upper_y as (upper(y)));
Table altered.
SQL> select column_name, hidden_column, virtual_column, avg_col_len from user_tab_cols where table_name = 'T';
COLUMN_NAME HID VIR AVG_COL_LEN
------------------------------ --- --- --------------------
UPPER_Y NO YES NULL
Y NO NO NULL
X NO NO NULL
ID NO NO NULL
4 rows selected.
-- can't gather statistics on virtual column explicitly
SQL> exec dbms_stats.gather_table_stats(user, 't', method_opt=>'for all virtual columns size 1', estimate_percent=>null)
BEGIN dbms_stats.gather_table_stats(user, 't', method_opt=>'for all virtual columns size 1', estimate_percent=>null); END;
*
ERROR at line 1:
ORA-20000: Cannot parse for clause: for all virtual columns size 1
ORA-06512: at "SYS.DBMS_STATS", line 20337
ORA-06512: at "SYS.DBMS_STATS", line 20360
ORA-06512: at line 1
SQL> exec dbms_stats.gather_table_stats(user, 't', method_opt=>'for virtual columns size 1', estimate_percent=>null)
BEGIN dbms_stats.gather_table_stats(user, 't', method_opt=>'for virtual columns size 1', estimate_percent=>null); END;
*
ERROR at line 1:
ORA-20000: Cannot parse for clause: for virtual columns size 1
ORA-06512: at "SYS.DBMS_STATS", line 20337
ORA-06512: at "SYS.DBMS_STATS", line 20360
ORA-06512: at line 1
-- and can't add extended statistics:
SQL> select dbms_stats.create_extended_stats(user, 't', '(upper_y)') from dual;
select dbms_stats.create_extended_stats(user, 't', '(upper_y)') from dual
*
ERROR at line 1:
ORA-20001: Invalid Extension: Not an expression
SQL> select dbms_stats.create_extended_stats(user, 't', '(upper(y))') from dual;
select dbms_stats.create_extended_stats(user, 't', '(upper(y))') from dual
*
ERROR at line 1:
ORA-20007: extension (upper(y)) already exists in the table
ORA-06512: at "SYS.DBMS_STATS", line 6674
ORA-06512: at "SYS.DBMS_STATS", line 28195
-- because virtual column added it behind the scene already:
SQL> select extension_name, extension
2 from user_stat_extensions
3 /
EXTENSION_NAME
------------------------------
EXTENSION
-------------------------------------
UPPER_Y
(UPPER("Y"))
-- OK then, now back to the main topic - FBI and statistics:
SQL> select dbms_stats.create_extended_stats(user, 't', '(x, upper_y)') from dual;
select dbms_stats.create_extended_stats(user, 't', '(x, upper_y)') from dual
*
ERROR at line 1:
ORA-20001: Error when processing extension - virtual column is referenced in a column expression
ORA-06512: at "SYS.DBMS_STATS", line 6815
ORA-06512: at "SYS.DBMS_STATS", line 28195
SQL> select dbms_stats.create_extended_stats(user, 't', '(x, upper(y))') from dual
2 /
select dbms_stats.create_extended_stats(user, 't', '(x, upper(y))') from dual
*
ERROR at line 1:
ORA-20001: Invalid Extension: Column group can contain only columns seperated by comma
Yes, I wasn’t able to make extended statistics work in case of an expression. And yes, last exception has a typo and this is not my mistake.
If you know how to make it work in 11g – don’t hesitate to leave a comment.


3 Comments
Hello
Have you tried ?
Exec dbms_stats.gather_table_stats(null,’t',method_opt => ‘for columns upper_y size 1′);
or
Exec dbms_stats.gather_table_stats(null,’t',method_opt => ‘for columns (upper(y)) size 1′);
or
Exec dbms_stats.gather_table_stats(null,’t',method_opt => ‘for all columns size 1′);
Each case seems to work.
I have only recently learned about the 1% selectivity assumption behind function driven indexes. Quite a surprise!
Using Oracle 11.1 I had similar issues to you with dbms_stats.create_extended_stats when trying to graft a proper histogram response to a function driven index.
My example was based on using upper(last_name) on a copy of hr.employeesC where 99% of names were ‘SMITH’.
My workaround is to use
DBMS_STATS.gather_table_stats (‘HR’, ‘EMPLOYEESC’, method_opt => ‘for all indexed columns size auto for columns (upper(last_name)) size auto)’);
Prompted by recent Jonathan Lewis post I’ve re-run tests on a 11.2.0.3. All the same, except the final exception: in this version Oracle allows to create extended statistics with expressions:
SQL> select dbms_stats.create_extended_stats(user, 't', '(x, upper(y))') from dual 2 ; DBMS_STATS.CREATE_EXTENDED_STATS(USER,'T','(X,UPPER(Y))') ---------------------------------------------------------------------------------- SYS_STUP01RWF0EEKLXC2TYUD_3IG3 SQL> select extension_name, extension from user_stat_extensions where table_name = 'T'; EXTENSION_NAME EXTENSION ------------------------------ ---------------------------------------- UPPER_Y (UPPER("Y")) SYS_STUP01RWF0EEKLXC2TYUD_3IG3 ("X",UPPER("Y"))