As a follow-up to a recent post on different names Oracle can use for the intermediate views, here is a quick example of the technique called distinct placement.
drop table t1 cascade constraints purge;
drop table t2 cascade constraints purge;
create table t1 as
with g as (select /*+ materialize */ null from all_source where rownum <= 2000)
select rownum id
, cast('x' as char(200)) pad
from g g1, g g2
where rownum <= 1e6;
create table t2 as
select trunc(rownum/1e4) id
, cast('x' as char(200)) pad
from t1;
create index t1_indx on t1(id);
exec dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1', estimate_percent=>null)
exec dbms_stats.gather_table_stats(user, 't2', method_opt=>'for all columns size 1', estimate_percent=>null)
explain plan for
select distinct t1.pad, t2.pad
from t1, t2
where t1.id = t2.id;
@xp
The plan from the Oracle 11.2.0.2:
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 410 | 8219 (1)| 00:01:39 |
| 1 | HASH UNIQUE | | 1 | 410 | 8219 (1)| 00:01:39 |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 72 | 29520 | 8218 (1)| 00:01:39 |
| 4 | VIEW | VW_DTP_AE9E49E8 | 72 | 14688 | 8073 (1)| 00:01:37 |
| 5 | HASH UNIQUE | | 72 | 14688 | 8073 (1)| 00:01:37 |
| 6 | TABLE ACCESS FULL | T2 | 1000K| 194M| 8040 (1)| 00:01:37 |
|* 7 | INDEX RANGE SCAN | T1_INDX | 1 | | 2 (0)| 00:00:01 |
| 8 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 206 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - access("T1"."ID"="ITEM_1")
So it’s VW_DTP_%08X (not VW_DIS_%08X as I thought originally) that is used for a view constructed using distinct placement query transformation. Here are excerpts of 10053 trace relevant to the subject:
****************************************
Cost-Based Group-By/Distinct Placement
****************************************
GBP/DP: Checking validity of GBP/DP for query block SEL$1 (#1)
GBP: Checking validity of group-by placement for query block SEL$1 (#1)
GBP: Bypassed: Query has invalid constructs.
DP: Checking validity of distinct placement for query block SEL$1 (#1)
Column (#1): ID(
AvgLen: 5 NDV: 1000000 Nulls: 0 Density: 0.000001 Min: 1 Max: 1000000
Column (#1): ID(
AvgLen: 3 NDV: 101 Nulls: 0 Density: 0.009901 Min: 0 Max: 100
Column (#2): PAD(
AvgLen: 201 NDV: 1 Nulls: 0 Density: 1.000000
DP: Using search type: linear
DP: Considering distinct placement on query block SEL$1 (#1)
DP: Starting iteration 1, state space = (1) : (0)
DP: Original query
...
DP: Costing query block.
...
DP: Updated best state, Cost = 36643.22
DP: Starting iteration 2, state space = (1) : (1)
DP: Using DP transformation in this iteration.
...
DP: Transformed query
******* UNPARSED QUERY IS *******
SELECT DISTINCT "T1"."PAD" "PAD","VW_DTP_AE9E49E8"."ITEM_2" "PAD" FROM (SELECT DISTINCT "T2"."ID" "ITEM_1","T2"."PAD" "ITEM_2" FROM "TIM"."T2" "T2") "VW_DTP_AE9E49E8","TIM"."T1" "T1" WHERE "T1"."ID"="VW_DTP_AE9E49E8"."ITEM_1"
FPD: Considering simple filter push in query block SEL$08FE944A (#1)
"T1"."ID"="VW_DTP_AE9E49E8"."ITEM_1"
try to generate transitive predicate from check constraints for query block SEL$08FE944A (#1)
finally: "T1"."ID"="VW_DTP_AE9E49E8"."ITEM_1"
...
DP: Updated best state, Cost = 8219.26
DP: Doing DP on the preserved QB.
Cost-based transformer re-writes the original query to the following
select distinct t1.pad, VW_DTP_AE9E49E8.ITEM_2
from (select distinct t2.id item_1, t2.pad item_2
from t2
) VW_DTP_AE9E49E8
, t1
where t1.id = VW_DTP_AE9E49E8.ITEM_1
and, after calculating its cost, accepts it. You can force Oracle to discard such transformation with a hint NO_PLACE_DISTINCT:
1 explain plan for
2 select /*+ qb_name(main) no_place_distinct(@main t2) */
3 distinct t1.pad, t2.pad
4 from t1, t2
5* where t1.id = t2.id
SQL> /
Explained.
SQL> @xp
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 975306333
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 410 | | 36643 (1)| 00:07:20 |
| 1 | HASH UNIQUE | | 1 | 410 | | 36643 (1)| 00:07:20 |
|* 2 | HASH JOIN | | 1000K| 391M| 206M| 36610 (1)| 00:07:20 |
| 3 | TABLE ACCESS FULL| T2 | 1000K| 194M| | 8040 (1)| 00:01:37 |
| 4 | TABLE ACCESS FULL| T1 | 1000K| 196M| | 8040 (1)| 00:01:37 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."ID"="T2"."ID")
Interestingly, cardinality estimate for a HASH UNIQUE step in the plan with transformation is 72 – although the PAD column has NDV=1. I don’t know how CBO came to this number. Extended statistics on the column group for some reason doesn’t help in this situation:
SQL> select DBMS_STATS.CREATE_EXTENDED_STATS(user, 't2', '(id, pad)') from dual;
DBMS_STATS.CREATE_EXTENDED_STATS(USER,'T2','(ID,PAD)')
--------------------------------------------------------------------------------
SYS_STU1T#WGBLRARKHVCYH_JM8FIE
SQL> exec dbms_stats.gather_table_stats(user, 't2', method_opt=>'for all columns size 1', estimate_percent=>null, no_invalidate=>false)
SQL> select t.num_distinct from user_tab_col_statistics t
2 where t.column_name = 'SYS_STU1T#WGBLRARKHVCYH_JM8FIE';
NUM_DISTINCT
--------------------
101
SQL> explain plan for
2 select /*+ */
3 distinct t1.pad, t2.pad
4 from t1, t2
5 where t1.id = t2.id
6 /
Explained.
SQL> @xp
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3429193153
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 410 | 8219 (1)| 00:01:39 |
| 1 | HASH UNIQUE | | 1 | 410 | 8219 (1)| 00:01:39 |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 72 | 29520 | 8218 (1)| 00:01:39 |
| 4 | VIEW | VW_DTP_AE9E49E8 | 72 | 14688 | 8073 (1)| 00:01:37 |
| 5 | HASH UNIQUE | | 72 | 14688 | 8073 (1)| 00:01:37 |
| 6 | TABLE ACCESS FULL | T2 | 1000K| 194M| 8040 (1)| 00:01:37 |
|* 7 | INDEX RANGE SCAN | T1_INDX | 1 | | 2 (0)| 00:00:01 |
| 8 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 206 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - access("T1"."ID"="ITEM_1")


One Comment
Here is a “wrong results” bug associated with the functionality: Bug 10094732 Wrong results from SQL with DISTINCT function and an outer join in 11.2