Enabling constraint in parallel

Recently I did some tuning of data generation scripts, which purpose is to build large amount of representative data for application testing. Direct-path inserts are in use and as a prerequisite all constraints and indexes on target tables are disabled before the load and are enabled after it. Since I wanted to utilize available resources on the machine for that task, almost each step uses parallel execution. Well, kind of almost, because enabling constraints didn’t run in parallel, although I’ve politely asked Oracle to do so. I’ll explain here why it didn’t work.


This is a test setup:

drop table t1 cascade constraints purge;
drop table t2 cascade constraints purge;

create table t1 as
with g as (select /*+ materialize */ null from all_source where rownum <= 2000)
select rownum id, lpad('x', 200, 'x') pad
  from g g1, g g2
 where rownum <= 4000000;

create unique index t1_uq on t1(id);

create table t2 as
select rownum id, trunc(rownum/4)+1 t1_id, lpad('x', 200, 'x') pad
  from t1 t11, t1 t12
 where rownum < 16000000;

create index t2_indx on t2(t1_id);

exec dbms_stats.gather_table_stats(user, 't1', estimate_percent=>null, method_opt => 'for all columns size 1', cascade => true, no_invalidate => false);
exec dbms_stats.gather_table_stats(user, 't2', estimate_percent=>null, method_opt => 'for all columns size 1', cascade => true, no_invalidate => false);

alter table t1 add constraint t1_pk primary key (id);
alter table t2 add constraint t2_fk foreign key (t1_id) references t1 disable;

Parent-child related tables T1 and T2 are approximately 1G and 4G in size, which is about 10 times smaller than real tables. Now I’m going to enable T2_FK as advertised and see what happens (the scripts used here are available for download):

alter session force parallel query;

alter table t2 parallel 2;
alter table t2 enable novalidate constraint t2_fk;

@sesstat parallel
alter system flush shared_pool;
alter system flush shared_pool;
@46
@53

alter table t2 enable constraint t2_fk;

@53off
@46off
@sesstat parallel

And here is output of running it on Oracle 11.2.0.2 (32-bit WinXP):

SQL> alter table t2 enable constraint t2_fk;

Table altered.

SQL>
SQL> @53off

Session altered.


Default trace
-------------------------------------------------------------------------------------
D:\ORACLE\diag\rdbms\test11g\test11g\trace\test11g_ora_240.trc

1 row selected.

SQL> @46off

PL/SQL procedure successfully completed.

SQL> @sesstat parallel

NAME                                                                            VALUE
---------------------------------------------------------------- --------------------
DBWR parallel query checkpoint buffers written                                      0
queries parallelized                                                                0
DML statements parallelized                                                         0
DDL statements parallelized                                                         0
DFO trees parallelized                                                              0
Parallel operations not downgraded                                                  0
Parallel operations downgraded to serial                                            0
Parallel operations downgraded 75 to 99 pct                                         0
Parallel operations downgraded 50 to 75 pct                                         0
Parallel operations downgraded 25 to 50 pct                                         0
Parallel operations downgraded 1 to 25 pct                                          0

11 rows selected.

Session statistics tells that parallel execution never occurred in the current session, and none of SQL was executed serially due to resource limitation. Let’s have a look on the statement Oracle uses to get data that doesn’t satisfy the constraint:

select /*+ all_rows ordered */ A.rowid, :1, :2, :3 
from
 "TIM"."T2" A , "TIM"."T1" B where( "A"."T1_ID" is not null) and( "B"."ID" (+)
  = "A"."T1_ID") and( "B"."ID" is null)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1     10.84      82.33     103731      46242         21           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3     10.84      82.33     103731      46242         21           0

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS   (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  HASH JOIN ANTI (cr=46242 pr=103731 pw=57536 time=82331490 us cost=38805 size=3840000 card=160000)
  15999999   15999999   15999999   INDEX FAST FULL SCAN T2_INDX (cr=37442 pr=37414 pw=0 time=9295004 us cost=10202 size=287999982 card=15999999)(object id 70220)
   4000000    4000000    4000000   INDEX FAST FULL SCAN T1_UQ (cr=8800 pr=8781 pw=0 time=2905208 us cost=2395 size=24000000 card=4000000)(object id 70218)

Apparently it is a simple outer join, transformed to anti-join by CBO. However, run-time plan is serial, even though the session has PQ_STATUS set to FORCED. And trace file is missing CBO trace data for the statement. This is caused by the fact Oracle uses a a recursive session for constraint validation. The recursive session explains why I didn’t get CBO trace data for that statement as well as why the statement has parallel plan if I explain it in the current session:

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------
Plan hash value: 729437326

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |          |   160K|  3750K|       | 10052   (1)| 00:02:01 |        |      |            |
|   1 |  PX COORDINATOR            |          |       |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)      | :TQ10002 |   160K|  3750K|       | 10052   (1)| 00:02:01 |  Q1,02 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN ANTI BUFFERED |          |   160K|  3750K|   114M| 10052   (1)| 00:02:01 |  Q1,02 | PCWP |            |
|   4 |     PX RECEIVE             |          |    15M|   274M|       |  2834   (1)| 00:00:35 |  Q1,02 | PCWP |            |
|   5 |      PX SEND HASH          | :TQ10000 |    15M|   274M|       |  2834   (1)| 00:00:35 |  Q1,00 | P->P | HASH       |
|   6 |       PX BLOCK ITERATOR    |          |    15M|   274M|       |  2834   (1)| 00:00:35 |  Q1,00 | PCWC |            |
|*  7 |        INDEX FAST FULL SCAN| T2_INDX  |    15M|   274M|       |  2834   (1)| 00:00:35 |  Q1,00 | PCWP |            |
|   8 |     PX RECEIVE             |          |  4000K|    22M|       |   665   (1)| 00:00:08 |  Q1,02 | PCWP |            |
|   9 |      PX SEND HASH          | :TQ10001 |  4000K|    22M|       |   665   (1)| 00:00:08 |  Q1,01 | P->P | HASH       |
|  10 |       PX BLOCK ITERATOR    |          |  4000K|    22M|       |   665   (1)| 00:00:08 |  Q1,01 | PCWC |            |
|  11 |        INDEX FAST FULL SCAN| T1_UQ    |  4000K|    22M|       |   665   (1)| 00:00:08 |  Q1,01 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("B"."ID"="A"."T1_ID")
   7 - filter("A"."T1_ID" IS NOT NULL)

But still, the table’s default parallel degree is set to 2 and CBO doesn’t consider running the SQL in parallel. Or does it? I’ve reverted PQ_STATUS of the session to ENABLED (which is the default setting and is therefore used by a recursive SYS session executing the statement) and traced optimizer decisions for the SQL:

OPTIMIZER STATISTICS AND COMPUTATIONS
Join order[1]:  T2[A]#0  T1[B]#1
...
Best:: JoinMethod: HashAnti
       Cost: 38805.11  Degree: 1  Resp: 38805.11  Card: 159999.99 Bytes: 24
...
****** Recost for parallel table scan  *******
Access path analysis for T2
***************************************
SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for T2[A] 

*** 2011-02-13 23:09:12.237
** Performing dynamic sampling initial checks. **
** Dynamic sampling initial checks returning FALSE.
  Table: T2  Alias: A
    Card: Original: 15999999.000000  Rounded: 15999999  Computed: 15999999.00  Non Adjusted: 15999999.00
  Access Path: TableScan
    Cost:  131682.45  Resp: 73156.91  Degree: 0
      Cost_io: 131575.00  Cost_cpu: 3039999810
      Resp_io: 73097.22  Resp_cpu: 1688888783
  Best:: AccessPath: TableScan
         Cost: 73156.91  Degree: 2  Resp: 73156.91  Card: 15999999.00  Bytes: 18

Join order[1]:  T2[A]#0  T1[B]#1
Join order aborted: cost > best plan cost

The highlighted line says it all. CBO started to re-cost plan for parallel table access, but stopped right after understanding T2 parallel access cost is higher than the current best cost. It is the same issue as with PARALLEL hint explained by Jonathan Lewis in his Hints on Hints paper: single PARALLEL hint is not enough to make the plan go parallel.
So how to make sure the plan for the query issued by Oracle to validate a constraint will be parallel? If you want 100% guarantee, you can use baselines. I’ve tested this approach and it worked as expected. Here is a run-time plan which used a baseline with hints

leading(a b) full(a) parallel(a) use_nl(b) no_parallel(b)
SQL_ID  8mpvqt5srnmpc, child number 2
-------------------------------------
 select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "TIM"."T2" A ,
"TIM"."T1" B where( "A"."T1_ID" is not null) and( "B"."ID" (+)=
"A"."T1_ID") and( "B"."ID" is null)

Plan hash value: 3500321617

---------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |       |       |  8964K(100)|          |        |      |            |
|   1 |  PX COORDINATOR       |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM) | :TQ10000 |   160K|  3750K|  8964K  (1)| 29:52:59 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    NESTED LOOPS ANTI  |          |   160K|  3750K|  8964K  (1)| 29:52:59 |  Q1,00 | PCWP |            |
|   4 |     PX BLOCK ITERATOR |          |       |       |            |          |  Q1,00 | PCWC |            |
|*  5 |      TABLE ACCESS FULL| T2       |    15M|   274M| 73157   (1)| 00:14:38 |  Q1,00 | PCWP |            |
|*  6 |     INDEX UNIQUE SCAN | T1_UQ    |  3999K|    22M|     1   (0)| 00:00:01 |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access(:Z>=:Z AND :Z<=:Z)
       filter("A"."T1_ID" IS NOT NULL)
   6 - access("B"."ID"="A"."T1_ID")

Note
-----
   - SQL plan baseline SQL_PLAN_8kycw3na5zp5f2046d83f used for this statement

If you just want let Oracle to decide which option is “the best”, then adequate optimizer statistics on the tables are needed. I don’t think it should be very accurate though – my opinion is in most cases it can be block_sample‘d on a small percent of data and be enough for the CBO to make a balanced decision. Whether the resulting plan will be “good” or “bad” in run-time is another question to consider.

About these ads

2 Comments

  1. Posted February 17, 2011 at 13:00 | Permalink | Reply

    Btw, you can use ORADEBUG EVENT syntax to set a trace event at the process level, (SESSION_EVENT would set it only for current session), that way also recursive sessions under this process would be traced…

    • Posted February 17, 2011 at 22:01 | Permalink | Reply

      Hi Tanel,

      I didn’t know about such ORADEBUG option (because I use it very rarely), thank you. It works as expected.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 287 other followers