Cartesian Merge Join

Have you ever had an execution plan which gave you a Cartesian join that you knew just couldn’t be happening ?

Surely you put a join condition in for every table ! Well maybe you did, and maybe the optimizer took some of your join conditions away. When the (not always) terrible Cartesian join appears, make sure you check the predicates section of your execution plan.  Here’s a demonstration of why:

create table t1 as
select
        trunc((rownum-1)/15)	n1,
        trunc((rownum-1)/15)	n2,
        rpad(rownum,215)	v1
from    all_objects
where   rownum <= 3000
;          

create index t_i1 on t1(n1);

After gathering stats on the table and index, look at what 9i gives you for the execution plan of the following query (using dbms_xplan.display):

select
        tA.n2,
        tB.n2
from
        t1	tA,
        t1	tB
where
        tA.n1 = 15
and     tB.n1 = tA.n1
;           

------------------------------------------------------------------------------
| Id  | Operation                     |  Name       | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |   225 |  3600 |    32 |
|   1 |  MERGE JOIN CARTESIAN         |             |   225 |  3600 |    32 |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1          |    15 |   120 |     2 |
|*  3 |    INDEX RANGE SCAN           | T_I1        |    15 |       |     1 |
|   4 |   BUFFER SORT                 |             |    15 |   120 |    30 |
|   5 |    TABLE ACCESS BY INDEX ROWID| T1          |    15 |   120 |     2 |
|*  6 |     INDEX RANGE SCAN          | T_I1        |    15 |       |     1 |
-----------------------------------------------------------------------------           

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("TA"."N1"=15)
   6 - access("TB"."N1"=15)

Note how the join predicate has disappeared, and we are doing a (perfectly reasonable) Cartesian merge join. The problem is transtive closure.

If ta.n1 = 15, and tb.n1 = ta.n1, then the optimizer can infer that tb.n1 = 15. And in 9i it throws away the middle (join) step, leaving you with the predicates listed above. (Note – there are cases where the join cannot be thrown away, so the behaviour is not entirely consistent).

 WHen you get to 10g, things are different. There is a (hidden) parameter _optimizer_transitivity_retain with the description: “retain equi-join pred upon transitive equality pred generation”, and this is set to true. As a consequence, this is the execution plan I got from 10g (autotrace in this case).

---------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost  |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |   225 |  3600 |     6 |
|*  1 |  HASH JOIN                   |      |   225 |  3600 |     6 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    15 |   120 |     2 |
|*  3 |    INDEX RANGE SCAN          | T_I1 |    15 |       |     1 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T1   |    15 |   120 |     2 |
|*  5 |    INDEX RANGE SCAN          | T_I1 |    15 |       |     1 |
---------------------------------------------------------------------        

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("TB"."N1"="TA"."N1")
   3 - access("TA"."N1"=15)
   5 - access("TB"."N1"=15)

In this case, we have an improvement – the join condition has been kept so Oracle is able to do a much more efficient hash join. Bear in mind, though, you win some, you lose some. When Oracle generates a new predicate and keeps the join predicate it may, in effect, double-count the impact of the generated predicate – leaving you with a join cardinality that is much too low and leading to an inappropriate execution plan from that point onwards.

Footnote:  You’ll notice the “buffer sort” that appeared in the 9i Cartesian join. Since Oracle is joining every row to every row, a sort isn’t really necessary; but I think Oracle is taking advantage of the buffer sort mechanism to copy the second row source out of its data buffers so that the run-time engine doesn’t have to keep hitting latches and re-visiting data buffers for the second row source as it processes each row from the first row source.

Despite this optimisation, the arithmetic for the cost still seems to be the traditional nested loop calculation: cost of 1st rowsource + (cardinality of 1st rowsource * cost of 2nd rowsource).

Source : http://jonathanlewis.wordpress.com/2006/12/13/cartesian-merge-join/

One thought on “Cartesian Merge Join

  1. I’m really loving the theme/design of your website. Do you
    ever run into any browser compatibility problems? A couple of
    my blog readers have complained about my website not working correctly
    in Explorer but looks great in Firefox. Do you have any recommendations to help fix this problem?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s