Distinct placement

As a follow-up to a recent post on different names Oracle can use for the intermediate views, here is a quick example of the technique called distinct placement.

drop table t1 cascade constraints purge;
drop table t2 cascade constraints purge;

create table t1 as
with g as (select /*+ materialize */ null from all_source where rownum <= 2000)
select rownum id
     , cast('x' as char(200)) pad
  from g g1, g g2
 where rownum <= 1e6;

create table t2 as
select trunc(rownum/1e4) id
     , cast('x' as char(200)) pad
  from t1;

create index t1_indx on t1(id);

exec dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1', estimate_percent=>null)
exec dbms_stats.gather_table_stats(user, 't2', method_opt=>'for all columns size 1', estimate_percent=>null)

explain plan for
select distinct t1.pad, t2.pad
  from t1, t2
 where t1.id = t2.id;

@xp

The plan from the Oracle 11.2.0.2:

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |     1 |   410 |  8219   (1)| 00:01:39 |
|   1 |  HASH UNIQUE                  |                 |     1 |   410 |  8219   (1)| 00:01:39 |
|   2 |   NESTED LOOPS                |                 |       |       |            |          |
|   3 |    NESTED LOOPS               |                 |    72 | 29520 |  8218   (1)| 00:01:39 |
|   4 |     VIEW                      | VW_DTP_AE9E49E8 |    72 | 14688 |  8073   (1)| 00:01:37 |
|   5 |      HASH UNIQUE              |                 |    72 | 14688 |  8073   (1)| 00:01:37 |
|   6 |       TABLE ACCESS FULL       | T2              |  1000K|   194M|  8040   (1)| 00:01:37 |
|*  7 |     INDEX RANGE SCAN          | T1_INDX         |     1 |       |     2   (0)| 00:00:01 |
|   8 |    TABLE ACCESS BY INDEX ROWID| T1              |     1 |   206 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   7 - access("T1"."ID"="ITEM_1")

So it’s VW_DTP_%08X (not VW_DIS_%08X as I thought originally) that is used for a view constructed using distinct placement query transformation. Here are excerpts of 10053 trace relevant to the subject:

****************************************
 Cost-Based Group-By/Distinct Placement
****************************************
GBP/DP: Checking validity of GBP/DP for query block SEL$1 (#1)
GBP: Checking validity of group-by placement for query block SEL$1 (#1)
GBP: Bypassed: Query has invalid constructs.
DP: Checking validity of distinct placement for query block SEL$1 (#1)
  Column (#1): ID(
    AvgLen: 5 NDV: 1000000 Nulls: 0 Density: 0.000001 Min: 1 Max: 1000000
  Column (#1): ID(
    AvgLen: 3 NDV: 101 Nulls: 0 Density: 0.009901 Min: 0 Max: 100
  Column (#2): PAD(
    AvgLen: 201 NDV: 1 Nulls: 0 Density: 1.000000

DP: Using search type: linear
DP: Considering distinct placement on query block SEL$1 (#1)
DP: Starting iteration 1, state space = (1) : (0)
DP: Original query
...
DP: Costing query block.
...
DP: Updated best state, Cost = 36643.22
DP: Starting iteration 2, state space = (1) : (1)
DP: Using DP transformation in this iteration.
...
DP: Transformed query
******* UNPARSED QUERY IS *******
SELECT DISTINCT "T1"."PAD" "PAD","VW_DTP_AE9E49E8"."ITEM_2" "PAD" FROM  (SELECT DISTINCT "T2"."ID" "ITEM_1","T2"."PAD" "ITEM_2" FROM "TIM"."T2" "T2") "VW_DTP_AE9E49E8","TIM"."T1" "T1" WHERE "T1"."ID"="VW_DTP_AE9E49E8"."ITEM_1"
FPD: Considering simple filter push in query block SEL$08FE944A (#1)
"T1"."ID"="VW_DTP_AE9E49E8"."ITEM_1"
try to generate transitive predicate from check constraints for query block SEL$08FE944A (#1)
finally: "T1"."ID"="VW_DTP_AE9E49E8"."ITEM_1"
...
DP: Updated best state, Cost = 8219.26
DP: Doing DP on the preserved QB.

Cost-based transformer re-writes the original query to the following

select distinct t1.pad, VW_DTP_AE9E49E8.ITEM_2
  from (select distinct t2.id item_1, t2.pad item_2
          from t2
       ) VW_DTP_AE9E49E8
     , t1
 where t1.id = VW_DTP_AE9E49E8.ITEM_1

and, after calculating its cost, accepts it. You can force Oracle to discard such transformation with a hint NO_PLACE_DISTINCT:

  1  explain plan for
  2  select /*+ qb_name(main) no_place_distinct(@main t2) */
  3         distinct t1.pad, t2.pad
  4    from t1, t2
  5*  where t1.id = t2.id
SQL> /

Explained.

SQL> @xp

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 975306333

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |   410 |       | 36643   (1)| 00:07:20 |
|   1 |  HASH UNIQUE        |      |     1 |   410 |       | 36643   (1)| 00:07:20 |
|*  2 |   HASH JOIN         |      |  1000K|   391M|   206M| 36610   (1)| 00:07:20 |
|   3 |    TABLE ACCESS FULL| T2   |  1000K|   194M|       |  8040   (1)| 00:01:37 |
|   4 |    TABLE ACCESS FULL| T1   |  1000K|   196M|       |  8040   (1)| 00:01:37 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."ID"="T2"."ID")

Interestingly, cardinality estimate for a HASH UNIQUE step in the plan with transformation is 72 – although the PAD column has NDV=1. I don’t know how CBO came to this number. Extended statistics on the column group for some reason doesn’t help in this situation:

SQL> select DBMS_STATS.CREATE_EXTENDED_STATS(user, 't2', '(id, pad)') from dual;

DBMS_STATS.CREATE_EXTENDED_STATS(USER,'T2','(ID,PAD)')
--------------------------------------------------------------------------------
SYS_STU1T#WGBLRARKHVCYH_JM8FIE

SQL> exec dbms_stats.gather_table_stats(user, 't2', method_opt=>'for all columns size 1', estimate_percent=>null, no_invalidate=>false)

SQL> select t.num_distinct from user_tab_col_statistics t
  2  where t.column_name = 'SYS_STU1T#WGBLRARKHVCYH_JM8FIE';

        NUM_DISTINCT
--------------------
                 101

SQL> explain plan for
  2  select /*+ */
  3         distinct t1.pad, t2.pad
  4    from t1, t2
  5   where t1.id = t2.id
  6  /

Explained.

SQL> @xp

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3429193153

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |     1 |   410 |  8219   (1)| 00:01:39 |
|   1 |  HASH UNIQUE                  |                 |     1 |   410 |  8219   (1)| 00:01:39 |
|   2 |   NESTED LOOPS                |                 |       |       |            |          |
|   3 |    NESTED LOOPS               |                 |    72 | 29520 |  8218   (1)| 00:01:39 |
|   4 |     VIEW                      | VW_DTP_AE9E49E8 |    72 | 14688 |  8073   (1)| 00:01:37 |
|   5 |      HASH UNIQUE              |                 |    72 | 14688 |  8073   (1)| 00:01:37 |
|   6 |       TABLE ACCESS FULL       | T2              |  1000K|   194M|  8040   (1)| 00:01:37 |
|*  7 |     INDEX RANGE SCAN          | T1_INDX         |     1 |       |     2   (0)| 00:00:01 |
|   8 |    TABLE ACCESS BY INDEX ROWID| T1              |     1 |   206 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   7 - access("T1"."ID"="ITEM_1")
About these ads

One Comment

  1. Posted June 1, 2011 at 13:35 | Permalink | Reply

    Here is a “wrong results” bug associated with the functionality: Bug 10094732 Wrong results from SQL with DISTINCT function and an outer join in 11.2

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 287 other followers