Distinct placement

As a follow-up to a recent post on different names Oracle can use for the intermediate views, here is a quick example of the technique called distinct placement.

drop table t1 cascade constraints purge;
drop table t2 cascade constraints purge;

create table t1 as
with g as (select /*+ materialize */ null from all_source where rownum <= 2000)
select rownum id
     , cast('x' as char(200)) pad
  from g g1, g g2
 where rownum <= 1e6;

create table t2 as
select trunc(rownum/1e4) id
     , cast('x' as char(200)) pad
  from t1;

create index t1_indx on t1(id);

exec dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1', estimate_percent=>null)
exec dbms_stats.gather_table_stats(user, 't2', method_opt=>'for all columns size 1', estimate_percent=>null)

explain plan for
select distinct t1.pad, t2.pad
  from t1, t2
 where t1.id = t2.id;

@xp

The plan from the Oracle 11.2.0.2:

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |     1 |   410 |  8219   (1)| 00:01:39 |
|   1 |  HASH UNIQUE                  |                 |     1 |   410 |  8219   (1)| 00:01:39 |
|   2 |   NESTED LOOPS                |                 |       |       |            |          |
|   3 |    NESTED LOOPS               |                 |    72 | 29520 |  8218   (1)| 00:01:39 |
|   4 |     VIEW                      | VW_DTP_AE9E49E8 |    72 | 14688 |  8073   (1)| 00:01:37 |
|   5 |      HASH UNIQUE              |                 |    72 | 14688 |  8073   (1)| 00:01:37 |
|   6 |       TABLE ACCESS FULL       | T2              |  1000K|   194M|  8040   (1)| 00:01:37 |
|*  7 |     INDEX RANGE SCAN          | T1_INDX         |     1 |       |     2   (0)| 00:00:01 |
|   8 |    TABLE ACCESS BY INDEX ROWID| T1              |     1 |   206 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   7 - access("T1"."ID"="ITEM_1")

So it’s VW_DTP_%08X (not VW_DIS_%08X as I thought originally) that is used for a view constructed using distinct placement query transformation. Here are excerpts of 10053 trace relevant to the subject:

****************************************
 Cost-Based Group-By/Distinct Placement
****************************************
GBP/DP: Checking validity of GBP/DP for query block SEL$1 (#1)
GBP: Checking validity of group-by placement for query block SEL$1 (#1)
GBP: Bypassed: Query has invalid constructs.
DP: Checking validity of distinct placement for query block SEL$1 (#1)
  Column (#1): ID(
    AvgLen: 5 NDV: 1000000 Nulls: 0 Density: 0.000001 Min: 1 Max: 1000000
  Column (#1): ID(
    AvgLen: 3 NDV: 101 Nulls: 0 Density: 0.009901 Min: 0 Max: 100
  Column (#2): PAD(
    AvgLen: 201 NDV: 1 Nulls: 0 Density: 1.000000

DP: Using search type: linear
DP: Considering distinct placement on query block SEL$1 (#1)
DP: Starting iteration 1, state space = (1) : (0)
DP: Original query
...
DP: Costing query block.
...
DP: Updated best state, Cost = 36643.22
DP: Starting iteration 2, state space = (1) : (1)
DP: Using DP transformation in this iteration.
...
DP: Transformed query
******* UNPARSED QUERY IS *******
SELECT DISTINCT "T1"."PAD" "PAD","VW_DTP_AE9E49E8"."ITEM_2" "PAD" FROM  (SELECT DISTINCT "T2"."ID" "ITEM_1","T2"."PAD" "ITEM_2" FROM "TIM"."T2" "T2") "VW_DTP_AE9E49E8","TIM"."T1" "T1" WHERE "T1"."ID"="VW_DTP_AE9E49E8"."ITEM_1"
FPD: Considering simple filter push in query block SEL$08FE944A (#1)
"T1"."ID"="VW_DTP_AE9E49E8"."ITEM_1"
try to generate transitive predicate from check constraints for query block SEL$08FE944A (#1)
finally: "T1"."ID"="VW_DTP_AE9E49E8"."ITEM_1"
...
DP: Updated best state, Cost = 8219.26
DP: Doing DP on the preserved QB.

Cost-based transformer re-writes the original query to the following

select distinct t1.pad, VW_DTP_AE9E49E8.ITEM_2
  from (select distinct t2.id item_1, t2.pad item_2
          from t2
       ) VW_DTP_AE9E49E8
     , t1
 where t1.id = VW_DTP_AE9E49E8.ITEM_1

and, after calculating its cost, accepts it. You can force Oracle to discard such transformation with a hint NO_PLACE_DISTINCT:

  1  explain plan for
  2  select /*+ qb_name(main) no_place_distinct(@main t2) */
  3         distinct t1.pad, t2.pad
  4    from t1, t2
  5*  where t1.id = t2.id
SQL> /

Explained.

SQL> @xp

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 975306333

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |   410 |       | 36643   (1)| 00:07:20 |
|   1 |  HASH UNIQUE        |      |     1 |   410 |       | 36643   (1)| 00:07:20 |
|*  2 |   HASH JOIN         |      |  1000K|   391M|   206M| 36610   (1)| 00:07:20 |
|   3 |    TABLE ACCESS FULL| T2   |  1000K|   194M|       |  8040   (1)| 00:01:37 |
|   4 |    TABLE ACCESS FULL| T1   |  1000K|   196M|       |  8040   (1)| 00:01:37 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."ID"="T2"."ID")

Interestingly, cardinality estimate for a HASH UNIQUE step in the plan with transformation is 72 – although the PAD column has NDV=1. I don’t know how CBO came to this number. Extended statistics on the column group for some reason doesn’t help in this situation:

SQL> select DBMS_STATS.CREATE_EXTENDED_STATS(user, 't2', '(id, pad)') from dual;

DBMS_STATS.CREATE_EXTENDED_STATS(USER,'T2','(ID,PAD)')
--------------------------------------------------------------------------------
SYS_STU1T#WGBLRARKHVCYH_JM8FIE

SQL> exec dbms_stats.gather_table_stats(user, 't2', method_opt=>'for all columns size 1', estimate_percent=>null, no_invalidate=>false)

SQL> select t.num_distinct from user_tab_col_statistics t
  2  where t.column_name = 'SYS_STU1T#WGBLRARKHVCYH_JM8FIE';

        NUM_DISTINCT
--------------------
                 101

SQL> explain plan for
  2  select /*+ */
  3         distinct t1.pad, t2.pad
  4    from t1, t2
  5   where t1.id = t2.id
  6  /

Explained.

SQL> @xp

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3429193153

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |     1 |   410 |  8219   (1)| 00:01:39 |
|   1 |  HASH UNIQUE                  |                 |     1 |   410 |  8219   (1)| 00:01:39 |
|   2 |   NESTED LOOPS                |                 |       |       |            |          |
|   3 |    NESTED LOOPS               |                 |    72 | 29520 |  8218   (1)| 00:01:39 |
|   4 |     VIEW                      | VW_DTP_AE9E49E8 |    72 | 14688 |  8073   (1)| 00:01:37 |
|   5 |      HASH UNIQUE              |                 |    72 | 14688 |  8073   (1)| 00:01:37 |
|   6 |       TABLE ACCESS FULL       | T2              |  1000K|   194M|  8040   (1)| 00:01:37 |
|*  7 |     INDEX RANGE SCAN          | T1_INDX         |     1 |       |     2   (0)| 00:00:01 |
|   8 |    TABLE ACCESS BY INDEX ROWID| T1              |     1 |   206 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   7 - access("T1"."ID"="ITEM_1")

2 Comments

Timur Akhmadeev

Posted June 1, 2011 at 13:35 | Permalink | Reply

Here is a “wrong results” bug associated with the functionality: Bug 10094732 Wrong results from SQL with DISTINCT function and an outer join in 11.2

Mikhail Velikikh

Posted May 19, 2014 at 12:38 | Permalink | Reply

Hi Timur,

Recently our application developers changed one big query and noticed that distinct placement transformation stopped to work.
We are using ORACLE 11.2.0.3.7.
I’m DBA at that company and they asked for assistance in query optimization.
I investigated this issue further and have found that distinct placement transformation taken place with no more than 5 tables in the optimized query block.

10053 trace shows next information when distinct placement is bypassed:

DP: Bypassed: Query has too many tables.

In my case, I working around this issue by adding non-mergeable view.
That’s effectively decreased number of tables in query block to 5 and the DP transformation taken place.
Unfortunately, I did not found yet how can we increase number of tables for DP transformation.

drop table t1 cascade constraints purge;
drop table t2 cascade constraints purge;
 
create table t1 as
with g as (select /*+ materialize */ null from all_source where rownum <= 2000)
select rownum id
     , cast('x' as char(10)) pad
  from g g1, g g2
 where rownum <= 1e6;
 
create table t2 as
select trunc(rownum/1e4) id
     , cast('x' as char(10)) pad
  from t1;
 
create index t1_indx on t1(id);
 
exec dbms_stats.gather_table_stats( '', 't1', method_opt=>'for all columns size 1', estimate_percent=>null)
exec dbms_stats.gather_table_stats( '', 't2', method_opt=>'for all columns size 1', estimate_percent=>null)

alter session set events 'trace [sql_optimizer.*]';
 
explain plan for
select distinct t1.pad, t2.pad
  from t1, t2
 where t1.id = t2.id;

@?/rdbms/admin/utlxpls

doc
  distinct placement working with 5 tables
#

explain plan for
select distinct t1.pad, t21.pad, t22.pad, t23.pad, t24.pad
  from t1, t2 t21, t2 t22, t2 t23, t2 t24
 where t1.id = t21.id
   and t22.id = t21.id
   and t23.id = t21.id
   and t24.id = t21.id;

@?/rdbms/admin/utlxpls

doc
  distinct placement not working with 6 tables
#

explain plan for
select distinct t1.pad, t21.pad, t22.pad, t23.pad, t24.pad, t25.pad
  from t1, t2 t21, t2 t22, t2 t23, t2 t24, t2 t25
 where t1.id = t21.id
   and t22.id = t21.id
   and t23.id = t21.id
   and t24.id = t21.id
   and t25.id = t21.id;

@?/rdbms/admin/utlxpls

doc
  distinct placement works again. Non-mergeable view effectively decreased number of tables in problem QB.
#

explain plan for
select distinct t1.pad, t21.pad, t22.pad, t23.pad, t24.pad24, t24.pad25
  from t1, t2 t21, t2 t22, t2 t23, 
       (select /*+ no_merge*/t24.id, t24.pad pad24, t25.pad pad25 from t2 t24, t2 t25 where t25.id = t24.id) t24
 where t1.id = t21.id
   and t22.id = t21.id
   and t23.id = t21.id
   and t24.id = t21.id;

@?/rdbms/admin/utlxpls

select value from v$diag_info where name='Default Trace File';

Timur Akhmadeev's blog