Upload data from a flat file to Oracle tables

This article provides 10 practical examples of how to upload data from a flat file into Oracle tables.

Input data file for SQL*Loader

This is the input text file that contains the data that needs to be loaded into an Oracle table. Each record needs to be on a separate line, and the column values should be separated by a common delimiter character. For some of the examples mentioned below, we'll use the following employee.txt file to upload the data to the employee table.

$ cat employee.txt
100,Thomas,Sales,5000
200,Jason,Technology,5500
300,Mayla,Technology,7000
400,Nisha,Marketing,9500
500,Randy,Technology,6000

SQL*Loader Control File

This contains the instructions to the sqlldr utility. It tells sqlldr the location of the input file, the format of the input file, and other optional metadata required by sqlldr to load the data into Oracle tables.

$ cat example1.ctl
load data
 infile '/home/ramesh/employee.txt'
 into table employee
 fields terminated by ","
 ( id, name, dept, salary )

The above control file indicates the following:

  • infile – Indicates the location of the input data file
  • into table – Indicates the table name where this data should be inserted
  • fields terminated by – Indicates the delimiter that is used in the input file to separate the fields
  • ( id, name, dept, salary ) – Lists the columns in the table into which the data should be loaded

1. Basic Upload Example Using SQL*Loader

First, create the employee table as shown below.

SQL> create table employee
(
  id integer,
  name varchar2(10),
  dept varchar2(15),
  salary integer,
  hiredon date
);

Next, create the control file that explains what needs to be uploaded and where.

$ cat sqlldr-add-new.ctl
load data
 infile '/home/ramesh/employee.txt'
 into table employee
 fields terminated by ","
 ( id, name, dept, salary )

Note: If the values inside the data file are enclosed in double quotes, use this in your control file: fields terminated by "," optionally enclosed by '"'
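For example, a control file for a quoted CSV version of employee.txt might look like this (a sketch; the file name is illustrative):

```
load data
 infile '/home/ramesh/employee-quoted.txt'
 into table employee
 fields terminated by "," optionally enclosed by '"'
 ( id, name, dept, salary )
```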

Note: If you don’t have the table created, you’ll get the following error message:

SQL*Loader-941: Error during describe of table EMPLOYEE
ORA-04043: object EMPLOYEE does not exist

You can pass the userid and password to the sqlldr command in either of the following formats. As you see below, both of these will prompt you for the control file location, as it was not given on the command line.

$ sqlldr scott/tiger

(or)

$ sqlldr userid=scott/tiger
control =
SQL*Loader-287: No control file name specified.

Execute the sqlldr command to load these new records into the empty table by specifying both the credentials and the control file location, as shown below.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-add-new.ctl
Commit point reached - logical record count 5

Verify that the records are created in the database.

SQL> select * from employee;

        ID NAME       DEPT                SALARY HIREDON
---------- ---------- --------------- ---------- -------
       100 Thomas     Sales                 5000
       200 Jason      Technology            5500
       300 Mayla      Technology            7000
       400 Nisha      Marketing             9500
       500 Randy      Technology            6000

This will create an output log file with the same name as the control file, but with a .log extension (instead of .ctl). Partial output is shown below.

$ cat sqlldr-add-new.log

Control File:   /home/ramesh/sqlldr-add-new.ctl
Data File:      /home/ramesh/employee.txt

Table EMPLOYEE:
  5 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Elapsed time was:     00:00:00.04
CPU time was:         00:00:00.00


2. Inserting Additional Records

Let us say you want to add two new employees to the employee table from the following newemployee.txt file.

$ vi newemployee.txt
600,Ritu,Accounting,5400
700,Jessica,Marketing,7800

If you create a control file like the one in the previous example, you might get the following error message.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-add-more.ctl
SQL*Loader-601: For INSERT option, table must be empty.  Error on table EMPLOYEE

The above error indicates that the table must be empty before you can load data using SQL*Loader's default INSERT option.

If you would like to insert more data into the table without deleting the existing rows, use the "append" keyword as shown in the following control file.

$ vi sqlldr-append-more.ctl
load data
 infile '/home/ramesh/newemployee.txt'
 append
 into table employee
 fields terminated by ","
 ( id, name, dept, salary )

Now running sqlldr with this control file will append the data.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-append-more.ctl
Commit point reached - logical record count 2

Verify that the records are appended successfully

SQL> select * from employee;

        ID NAME       DEPT                SALARY HIREDON
---------- ---------- --------------- ---------- -------
       100 Thomas     Sales                 5000
       200 Jason      Technology            5500
       300 Mayla      Technology            7000
       400 Nisha      Marketing             9500
       500 Randy      Technology            6000
       600 Ritu       Accounting            5400
       700 Jessica    Marketing             7800

3. Data inside the Control File using BEGINDATA

You can also specify the data directly inside the control file itself using the BEGINDATA keyword; anything that comes after BEGINDATA is treated as data to be loaded into the table, as shown below.

$ cat sqlldr-add-new-with-data.ctl
load data
 infile *
 into table employee
 fields terminated by ","
 ( id, name, dept, salary )
begindata
100,Thomas,Sales,5000
200,Jason,Technology,5500
300,Mayla,Technology,7000
400,Nisha,Marketing,9500
500,Randy,Technology,6000

Note: infile is set to '*' in this case, as there is no separate input data file for this example.

Execute sqlldr to upload the data from the control file.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-add-new-with-data.ctl

4. Date format and Different Delimiter

This example shows how to specify a date format in the control file and how to handle different delimiters in a data file.

The following example has different delimiters ($ after name, ^ after department).

$ cat employee-date.txt
100,Thomas$Sales^5000,31-JAN-2008
200,Jason$Technology^5500,01-Feb-2005
300,Mayla$Technology^7000,10-Aug-2000
400,Nisha$Marketing^9500,12-Dec-2011
500,Randy$Technology^6000,01-JAN-2007

Create the following control file and indicate the delimiter for individual fields using "terminated by" as shown below.

$ cat sqlldr-date.ctl
load data
 infile '/home/ramesh/employee-date.txt'
 into table employee
 fields terminated by ","
 ( id, name terminated by "$", dept terminated by "^", salary, hiredon DATE "dd-mon-yyyy" )

Load the data using sqlldr as shown below.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-date.ctl

Verify that the data got loaded properly as shown below.

SQL> select * from employee;

        ID NAME       DEPT                SALARY HIREDON
---------- ---------- --------------- ---------- ---------
       100 Thomas     Sales                 5000 31-JAN-08
       200 Jason      Technology            5500 01-FEB-05
       300 Mayla      Technology            7000 10-AUG-00
       400 Nisha      Marketing             9500 12-DEC-11
       500 Randy      Technology            6000 01-JAN-07

5. Fixed Length Data Upload

If you have a data file with fixed-length records (i.e. without any delimiter), you can use this example to load the data.

For this example, let us use the following file, which has fixed-length records. For example, the first three characters are always the employee number, the next five characters are always the employee name, and so on.

$ cat employee-fixed.txt
200JasonTechnology5500
300MaylaTechnology7000
400NishaTechnology9500
500RandyTechnology6000

Create the following control file, where you specify the position of each field using the "position(start:end)" syntax, as shown below.

$ cat sqlldr-fixed.ctl
load data
 infile '/home/ramesh/employee-fixed.txt'
 into table employee
 ( id position(1:3), name position(4:8), dept position(9:18), salary position(19:22) )

Load this fixed length data using the sqlldr as shown below.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-fixed.ctl

Verify that the data got loaded.

SQL> select * from employee;

        ID NAME       DEPT                SALARY HIREDON
---------- ---------- --------------- ---------- ---------
       200 Jason      Technology            5500
       300 Mayla      Technology            7000
       400 Nisha      Technology            9500
       500 Randy      Technology            6000

6. Change the data during upload

You can also massage the data and change it during upload based on certain rules.

In the following control file:

  • id is incremented by 999 before loading; i.e. if the employee id is 100 in the data file, it will be loaded as 1099.
  • name is converted to upper case, using the upper function.
  • If dept contains the value "Technology", it is changed to "Techies", using the decode function.

$ cat sqlldr-change-data.ctl
load data
 infile '/home/ramesh/employee.txt'
 into table employee
 fields terminated by ","
 ( id ":id+999",
   name "upper(:name)",
   dept  "decode(:dept,'Technology','Techies', :dept)",
   salary
  )

Load the data using this control file, which massages the data before loading it.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-change-data.ctl

Verify that the data got changed while loading as per our rules.

SQL> select * from employee;

        ID NAME       DEPT                SALARY HIREDON
---------- ---------- --------------- ---------- ---------
      1099 THOMAS     Sales                 5000
      1199 JASON      Techies               5500
      1299 MAYLA      Techies               7000
      1399 NISHA      Marketing             9500
      1499 RANDY      Techies               6000

7. Load data from multiple files

To load data from multiple files, you just have to specify multiple infile entries in the control file.

The following control file loads data from two different data files (employee.txt and newemployee.txt) to the employee table.

$ cat sqlldr-add-multiple.ctl
load data
 infile '/home/ramesh/employee.txt'
 infile '/home/ramesh/newemployee.txt'
 into table employee
 fields terminated by ","
 ( id, name, dept, salary )

Load the data using this control file which will upload data from multiple data files as shown below.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-add-multiple.ctl
Commit point reached - logical record count 5
Commit point reached - logical record count 7

8. Load data to Multiple Tables

Create another table called bonus which will have employee id and bonus columns.

create table bonus
  ( id integer,
    bonus integer
  );

Create the employee-bonus.txt data file that contains the following fields: id, name, department, salary, and bonus.

$ cat employee-bonus.txt
100 Thomas Sales      5000 1000
200 Jason  Technology 5500 2000
300 Mayla  Technology 7000 2000
400 Nisha  Marketing  9500 1000
500 Randy  Technology 6000 3000

Create the control file as shown below, which loads the data from the above file into two different tables. As shown below, you need two "into table" clauses, and for each column you specify the position of the data in the record that should be loaded into it.

$ cat sqlldr-multiple-tables.ctl
load data
 infile '/home/ramesh/employee-bonus.txt'
 into table employee
 ( id position(1:3),
   name position(5:10),
   dept position(12:21),
   salary position(23:26))
 into table bonus
 ( id position(1:3),
   bonus position(28:31))

Load the data to multiple tables using this control file as shown below.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-multiple-tables.ctl

Verify that the data got loaded to multiple tables successfully.

SQL> select * from employee;

        ID NAME       DEPT                SALARY HIREDON
---------- ---------- --------------- ---------- ---------
       100 Thomas     Sales                 5000
       200 Jason      Technology            5500
       300 Mayla      Technology            7000
       400 Nisha      Marketing             9500
       500 Randy      Technology            6000

SQL> select * from bonus;

        ID      BONUS
---------- ----------
       100       1000
       200       2000
       300       2000
       400       1000
       500       3000

9. Handling Bad (Rejected) Records

In the following example, we have two bad records. Employees 300 and 500 have a salary value that is not numeric.

$ cat employee-bad.txt
100,Thomas,Sales,5000
200,Jason,Technology,5500
300,Mayla,Technology,7K
400,Nisha,Marketing,9500
500,Randy,Technology,6K

Use the following control file for this example.

$ cat sqlldr-bad.ctl
load data
 infile '/home/ramesh/employee-bad.txt'
 into table employee
 fields terminated by ","
 ( id, name, dept, salary )

Load the data (including the invalid records) using this control file as shown below.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-bad.ctl
Commit point reached - logical record count 5

As you can see from the above output, it still says "logical record count 5", but you should check the log file to see whether any records were rejected.

The log file indicates that 2 records are rejected as shown below:

Control File:   /home/ramesh/sqlldr-bad.ctl
Data File:      /home/ramesh/employee-bad.txt
  Bad File:     /home/ramesh/employee-bad.bad
  Discard File:  none specified

Table EMPLOYEE:
  3 Rows successfully loaded.
  2 Rows not loaded due to data errors.

By default, the rejected records are stored in a file that has the same name as the data file (but with a .bad extension).

$ cat employee-bad.bad
300,Mayla,Technology,7K
500,Randy,Technology,6K
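Rather than relying on the default name, you can also name the bad file explicitly with the BADFILE clause; a sketch (the .bad path is illustrative):

```
load data
 infile '/home/ramesh/employee-bad.txt'
 badfile '/home/ramesh/employee-rejects.bad'
 into table employee
 fields terminated by ","
 ( id, name, dept, salary )
```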

As you see below, the employee table has only 3 records (as 2 of them were rejected).

SQL> select * from employee;

        ID NAME       DEPT                SALARY HIREDON
---------- ---------- --------------- ---------- ---------
       100 Thomas     Sales                 5000
       200 Jason      Technology            5500
       400 Nisha      Marketing             9500

10. Load Specific Rows from a datafile

If you want to load only specific records from a data file, use the WHEN clause in the control file.

Add the WHEN clause after the "into table" line. In the following control file, the WHEN clause indicates that only records with dept equal to "Technology" will be loaded.

$ cat sqlldr-when.ctl
load data
 infile '/home/ramesh/employee.txt'
 into table employee
 when dept = 'Technology'
 fields terminated by ","
 ( id, name, dept, salary )

Load the selective data (only the “Technology” records) using this control file as shown below.

$ sqlldr scott/tiger control=/home/ramesh/sqlldr-when.ctl
Commit point reached - logical record count 5

As you can see from the above output, it still says "logical record count 5", but you should check the log file to see how many records were loaded and how many were discarded because they didn't match the WHEN condition.

The following lines from the log file show that 5 records were read, and 2 of them were discarded as they didn't match the WHEN condition.

Discard File:  none specified
Total logical records read:             5
Total logical records discarded:        2
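Records that fail the WHEN clause can also be captured in a file of their own by adding a DISCARDFILE clause; a sketch (the .dsc path is illustrative):

```
load data
 infile '/home/ramesh/employee.txt'
 discardfile '/home/ramesh/employee-skipped.dsc'
 into table employee
 when dept = 'Technology'
 fields terminated by ","
 ( id, name, dept, salary )
```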

Verify that only the selective records were loaded into the table.

SQL> select * from employee;

        ID NAME       DEPT                SALARY HIREDON
---------- ---------- --------------- ---------- ---------
       200 Jason      Technology            5500
       300 Mayla      Technology            7000
       500 Randy      Technology            6000

Use Teradata Parallel Transporter to Move Data Without Landing the Data to Disk

Teradata Parallel Transporter (Teradata PT) supports moving data from a table on a Teradata Database system, or from an ODBC-compliant RDBMS, to a Teradata table on a different Teradata Database system without landing the data to disk.

There are performance benefits, cost savings, and ease of script maintenance associated with using Teradata PT to move data without landing it to disk.
Performance benefits

Unlike the traditional Teradata utilities, Teradata PT can start data exporting and data loading at the same time without having to wait for data exporting to complete. A Teradata PT producer operator, such as the Export operator, can export data from a table and send it to a Teradata PT data stream. A Teradata PT consumer operator, such as the Load operator, can consume the data from the data stream and load it into a Teradata Database table. Data streams consist of in-memory buffers where data is passed from a producer to a consumer operator.

A user of the traditional Teradata utilities can run the Teradata FastExport utility to export data to a disk file and then the Teradata FastLoad utility to load data from the disk file to an empty Teradata Database table. If a user FastExports data into a named pipe, the user can then use FastLoad to read the data from the named pipe.

We should add that when two scripts are needed to move data, there is always the possibility of inadvertently running FastLoad, for example, before FastExport.
Cost savings

When a very large amount of data (greater than 50 gigabytes) is moved through a Teradata PT data stream without landing the data to disk, a Teradata PT user saves the cost of purchasing additional disk drives, and the disk space can instead be used for some other purpose.
Single-script maintenance

Unlike the traditional Teradata utilities, a single Teradata PT job script can be coded to move data without landing it to disk. If a user needs to make a change to the Teradata PT job, the user only needs to change the Teradata PT job script.

Traditional Teradata utilities require two scripts: one FastExport script to export the data and one script to load the data.
Sample scripts

Here are two sample Teradata PT job scripts.

In each script, the Export operator extracts the data from the SOURCE_EMP_TABLE table and the Load operator loads the data into an empty table called TARGET_EMP_TABLE.

Both scripts use job variables, but the second script uses more of them than the first, in addition to concatenating quoted strings and job variables.
Sample Script 1
DEFINE JOB MOVE_DATA_WITHOUT_LANDING_TO_DISK
DESCRIPTION 'MOVE DATA WITHOUT LANDING THE DATA TO DISK'
(
  /*** Schema Definition ***/
  DEFINE SCHEMA EMPLOYEE_SCHEMA
  DESCRIPTION 'SAMPLE EMPLOYEE SCHEMA'
  (
    EMP_ID INTEGER,
    EMP_NAME CHAR(10)
  );

  /*** Export Operator Definition ***/
  DEFINE OPERATOR EXPORT_OPERATOR
  DESCRIPTION 'TERADATA PARALLEL TRANSPORTER EXPORT OPERATOR'
  TYPE EXPORT
  SCHEMA EMPLOYEE_SCHEMA
  ATTRIBUTES
  (
    VARCHAR PrivateLogName = 'exportoper_privatelog',
    INTEGER MaxSessions = 8,
    INTEGER MinSessions,
    VARCHAR TdpId = @SourceTdpId,
    VARCHAR UserName = @SourceUserName,
    VARCHAR UserPassword = @SourceUserPassword,
    VARCHAR SelectStmt = 'SELECT * FROM SOURCE_EMP_TABLE;'
  );

  /*** Load Operator Definition ***/
  DEFINE OPERATOR LOAD_OPERATOR
  DESCRIPTION 'TERADATA PARALLEL TRANSPORTER LOAD OPERATOR'
  TYPE LOAD
  SCHEMA EMPLOYEE_SCHEMA
  ATTRIBUTES
  (
    VARCHAR PrivateLogName = 'loadoper_privatelog',
    INTEGER MaxSessions = 16,
    INTEGER MinSessions,
    VARCHAR TargetTable = 'TARGET_EMP_TABLE',
    VARCHAR TdpId = @TargetTdpId,
    VARCHAR UserName = @TargetUserName,
    VARCHAR UserPassword = @TargetUserPassword,
    VARCHAR ErrorTable1 = 'LOADOPER_ERRTABLE1',
    VARCHAR ErrorTable2 = 'LOADOPER_ERRTABLE2',
    VARCHAR LogTable = 'LOADOPER_LOGTABLE'
  );

  /*** Apply Statement ***/
  APPLY
  ('INSERT INTO TARGET_EMP_TABLE (:EMP_ID, :EMP_NAME);')
  TO OPERATOR (LOAD_OPERATOR [1])
  SELECT * FROM OPERATOR (EXPORT_OPERATOR [2]);
);
Sample job variable file for Sample Script 1
SourceTdpId = 'MySourceTdpid'
,SourceUserName = 'MySourceUserName'
,SourceUserPassword = 'MySourceUserPassword'
,TargetTdpId = 'MyTargetTdpId'
,TargetUserName = 'MyTargetUserName'
,TargetUserPassword = 'MyTargetUserPassword'
Sample Script 2
DEFINE JOB MOVE_DATA_WITHOUT_LANDING_TO_DISK
DESCRIPTION 'MOVE DATA WITHOUT LANDING THE DATA TO DISK'
(
  /*** Schema Definition ***/
  DEFINE SCHEMA EMPLOYEE_SCHEMA
  DESCRIPTION 'SAMPLE EMPLOYEE SCHEMA'
  (
    EMP_ID INTEGER,
    EMP_NAME CHAR(10)
  );

  /*** Export Operator Definition ***/
  DEFINE OPERATOR EXPORT_OPERATOR
  DESCRIPTION 'TERADATA PARALLEL TRANSPORTER EXPORT OPERATOR'
  TYPE EXPORT
  SCHEMA EMPLOYEE_SCHEMA
  ATTRIBUTES
  (
    VARCHAR PrivateLogName = 'exportoper_privatelog',
    INTEGER MaxSessions = 8,
    INTEGER MinSessions,
    VARCHAR TdpId = @SourceTdpId,
    VARCHAR UserName = @SourceUserName,
    VARCHAR UserPassword = @SourceUserPassword,
    VARCHAR SelectStmt = 'SELECT * FROM ' || @SourceTable || ';'
  );

  /*** Load Operator Definition ***/
  DEFINE OPERATOR LOAD_OPERATOR
  DESCRIPTION 'TERADATA PARALLEL TRANSPORTER LOAD OPERATOR'
  TYPE LOAD
  SCHEMA EMPLOYEE_SCHEMA
  ATTRIBUTES
  (
    VARCHAR PrivateLogName = 'loadoper_privatelog',
    INTEGER MaxSessions = 16,
    INTEGER MinSessions,
    VARCHAR TargetTable = @TargetTable,
    VARCHAR TdpId = @TargetTdpId,
    VARCHAR UserName = @TargetUserName,
    VARCHAR UserPassword = @TargetUserPassword,
    VARCHAR ErrorTable1 = @ErrorTable1,
    VARCHAR ErrorTable2 = @ErrorTable2,
    VARCHAR LogTable = @LogTable
  );

  /*** Apply Statement ***/
  APPLY
  ('INSERT INTO ' || @TargetTable || ' (:EMP_ID, :EMP_NAME);')
  TO OPERATOR (LOAD_OPERATOR [1])
  SELECT * FROM OPERATOR (EXPORT_OPERATOR [2]);
);
Sample job variable file for Sample Script 2
SourceTdpId = 'MySourceTdpid'
,SourceUserName = 'MySourceUserName'
,SourceUserPassword = 'MySourceUserPassword'
,TargetTdpId = 'MyTargetTdpId'
,TargetUserName = 'MyTargetUserName'
,TargetUserPassword = 'MyTargetUserPassword'
,SourceTable = 'SOURCE_EMP_TABLE'
,TargetTable = 'TARGET_EMP_TABLE'
,ErrorTable1 = 'LOADOPER_ERRTABLE1'
,ErrorTable2 = 'LOADOPER_ERRTABLE2'
,LogTable = 'LOADOPER_LOGTABLE'

For information on the concatenation of quoted strings and job variables, see Teradata Parallel Transporter User Guide.

Source : http://developer.teradata.com/tools/articles/use-teradata-parallel-transporter-to-move-data-without-landing-the-data-to-disk

Efficient Data Upload Method Via External Table

If you have ever tried to upload data to Netezza using conventional query-tool methods, which generate individual inserts, you will know this can be EXTREMELY SLOW. The efficient way to upload data is to create an on-the-fly external table on your local data file and use it in a CTAS (create table as) statement or an insert into a table. This is similar to the data download method in the previous post. In my example I create a temp table, but it could also be a permanent table. This example is for comma-delimited data with backslash as the escape character, but you can change this to match your source file. If you cut and paste these examples, be careful that your quotes don't turn into curly open and close quotes.

Via ODBC Like Aginity Workbench

CREATE TEMP TABLE BUBBA5 AS
SELECT * FROM EXTERNAL 'C:\\Users\\ValuedCustomer\\customer dim dump.csv'
( CUSTOMER_FIRST_NAME VARCHAR(50) NOT NULL,
  CUSTOMER_LAST_NAME VARCHAR(50) NOT NULL,
  CUSTOMER_ADDRESS VARCHAR(200) NOT NULL,
  CUSTOMER_CITY VARCHAR(50) NOT NULL,
  CUSTOMER_STATE VARCHAR(20) NOT NULL )
USING ( QUOTEDVALUE 'DOUBLE' NULLVALUE '' CRINSTRING TRUE CTRLCHARS TRUE
  LOGDIR 'C:\\Users\\ValuedCustomer\\AppData\\Local\\Temp\\mylog.tmp\\'
  DELIMITER ',' Y2BASE 2000 ENCODING 'internal' REMOTESOURCE 'ODBC' ESCAPECHAR '\' )

Via JDBC Like with Squirrel SQL

CREATE TEMP TABLE BUBBA5 AS
SELECT * FROM EXTERNAL 'C:\\Users\\ValuedCustomer\\customer dim dump.csv'
( CUSTOMER_FIRST_NAME VARCHAR(50) NOT NULL,
  CUSTOMER_LAST_NAME VARCHAR(50) NOT NULL,
  CUSTOMER_ADDRESS VARCHAR(200) NOT NULL,
  CUSTOMER_CITY VARCHAR(50) NOT NULL,
  CUSTOMER_STATE VARCHAR(20) NOT NULL )
USING ( QUOTEDVALUE 'DOUBLE' NULLVALUE '' CRINSTRING TRUE CTRLCHARS TRUE
  LOGDIR 'C:\\Users\\ValuedCustomer\\AppData\\Local\\Temp\\mylog.tmp\\'
  DELIMITER ',' Y2BASE 2000 ENCODING 'internal' REMOTESOURCE 'JDBC' ESCAPECHAR '\' )

Source : http://nztips.com/

Unable to start opmn “RCV: Permission denied”

Cause :

You have previously started the opmn process as an incorrect user (such as root) but are now logged in as the UNIX user that owns the Application Server installation. The AS UNIX owner then experiences permission-related errors communicating with the root-initiated OPMN process.

It can be confirmed that this has taken place by listing the ownership of the running “opmn -d” processes:
$ ps -ef|grep 'opmn -d'|grep -v 'grep'
root 31011 1 0 09:08 ? 00:00:00
/as1013/app/oracle/product/mid101300a/opmn/bin/opmn -d
root 31012 31011 0 09:08 ? 00:00:00
/as1013/app/oracle/product/mid101300a/opmn/bin/opmn -d
Solution

To implement the solution, please execute the following steps:

1. Capture output from "ps -ef|grep 'opmn -d'|grep -v 'grep'" to identify the current opmn daemon processes to be terminated.
2. Capture output from "opmnctl status -l" for future reference; it may be necessary to manually kill processes that the current "opmn -d" processes have started.
3. Log in to the UNIX account that owns the current "opmn -d" processes (this may require root privileges).
4. Use "opmnctl stopall" to stop the opmn daemon and all of its managed processes.
5. Use "ps -ef|grep 'opmn -d'|grep -v 'grep'" to confirm the two "opmn -d" processes have been terminated; use "kill -9" on each remaining PID to force this if necessary (again, this may need root privileges).
6. Review the PID column from the "opmnctl status -l" output and confirm that each process has been stopped; again, use "kill -9" on each PID to force this if necessary.
7. For the directory of each OC4J instance that exists below $ORACLE_HOME/j2ee, rename the "persistence" directory to "persistence.old".
8. Confirm whether any files under $ORACLE_HOME have been created under the incorrect UNIX account:

$ find $ORACLE_HOME -user root -print
With the exception of files below the $ORACLE_HOME/Apache/Apache/bin directory, use "chown" to reset the owner and group of each file that has changed.

For example:

$ cd $ORACLE_HOME/opmn/conf
$ ls -la
drwx------ 3 mid101300a oinstall 4096 Sep  7 09:08 .
drwx------ 8 mid101300a oinstall 4096 May 30 09:00 ..
-r-------- 1 root       root       21 Sep  7 09:08 .formfactor
Use the ownership of the directory to guide the "chown" command needed:

$ chown mid101300a:oinstall .formfactor
9. Log out from the incorrect UNIX account and log back in as the UNIX user owning the AS installation.
10. Start opmn and desired managed processes:

$ opmnctl start

$ opmnctl startproc process-type=HTTP_Server
opmnctl: starting opmn managed processes…

$ opmnctl startproc process-type=home
opmnctl: starting opmn managed processes…
11. Confirm via "ps -ef|grep 'opmn -d'|grep -v 'grep'" that the opmn daemon processes are owned by the expected user.
12. Use "opmnctl status -l" to confirm all processes have started correctly.
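The process cleanup in steps 4 through 6 can be sketched as a small shell snippet; this assumes the daemons show up as "opmn -d" in the ps output, and it only prints the kill commands for review:

```shell
# Find PIDs of any surviving 'opmn -d' daemons.
# The [o] in the pattern keeps grep from matching its own command line.
pids=$(ps -ef | grep '[o]pmn -d' | awk '{print $2}')

# Print the kill commands for review; remove the echo to actually send the signal.
for pid in $pids; do
  echo kill -9 "$pid"
done
```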

source : https://forums.oracle.com/forums/thread.jspa?messageID=2365576

CACHE | NOCACHE | CACHE READS

Use the CACHE clauses to indicate how Oracle Database should store blocks in the buffer cache. If you specify neither CACHE nor NOCACHE, then:

  • In a CREATE TABLE statement, NOCACHE is the default.
  • In an ALTER TABLE statement, the existing value is not changed.

CACHE For data that is accessed frequently, this clause indicates that the blocks retrieved for this table are placed at the most recently used end of the least recently used (LRU) list in the buffer cache when a full table scan is performed. This attribute is useful for small lookup tables.

As a parameter in the LOB_storage_clause, CACHE specifies that the database places LOB data values in the buffer cache for faster access.

Restriction on CACHE You cannot specify CACHE for an index-organized table. However, index-organized tables implicitly provide CACHE behavior.

NOCACHE For data that is not accessed frequently, this clause indicates that the blocks retrieved for this table are placed at the least recently used end of the LRU list in the buffer cache when a full table scan is performed. NOCACHE is the default for LOB storage.

As a parameter in the LOB_storage_clause, NOCACHE specifies that the LOB value either is not brought into the buffer cache or is brought into the buffer cache and placed at the least recently used end of the LRU list. The latter is the default behavior.

Restriction on NOCACHE You cannot specify NOCACHE for an index-organized table.

CACHE READS CACHE READS applies only to LOB storage. It specifies that LOB values are brought into the buffer cache only during read operations but not during write operations.
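As an illustration, the table-level clause can be set at creation time and changed afterwards (the table name is made up for the example):

```
create table lookup_codes
  ( code  integer,
    descr varchar2(30)
  ) cache;

alter table lookup_codes nocache;
```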

parallel_clause

The parallel_clause lets you parallelize creation of the table and set the default degree of parallelism for queries and for the DML statements INSERT, UPDATE, DELETE, and MERGE after table creation.

Note:

The syntax of the parallel_clause supersedes syntax appearing in earlier releases of Oracle. Superseded syntax is still supported for backward compatibility but may result in slightly different behavior from that documented.

NOPARALLEL Specify NOPARALLEL for serial execution. This is the default.

PARALLEL Specify PARALLEL if you want Oracle to select a degree of parallelism equal to the number of CPUs available on all participating instances times the value of the PARALLEL_THREADS_PER_CPU initialization parameter.

PARALLEL integer Specification of integer indicates the degree of parallelism, which is the number of parallel threads used in the parallel operation. Each parallel thread may use one or two parallel execution servers. Normally Oracle calculates the optimum degree of parallelism, so it is not necessary for you to specify integer.
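For instance, a fixed degree of parallelism can be requested at creation time and removed later (the table names are illustrative):

```
create table big_sales parallel 4 as
  select * from sales;

alter table big_sales noparallel;
```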

MONITORING | NOMONITORING

In earlier releases, you could use these clauses to start or stop the collection of modification statistics on this table. These clauses have been deprecated.

If you formerly collected modification statistics on tables by using the DBMS_STATS package in GATHER AUTO or GATHER STALE mode, then you no longer have to do anything. Oracle Database now collects these statistics automatically, and the MONITORING and NOMONITORING keywords in existing code will not cause an error.

If, for performance reasons, you do not want to collect modification statistics on any tables, then you should set the STATISTICS_LEVEL initialization parameter to BASIC. Be aware, however, that doing so disables a number of manageability features.
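If you do decide to turn the collection off, the parameter change itself is a one-line statement; weigh the manageability trade-off first:

```
alter system set statistics_level = basic;
```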

source : http://docs.oracle.com

Oracle 11g: Access Control List and ORA-24247

With ACLs, Oracle offers fine-grained control over which external network resources users can access.

The packages UTL_MAIL, UTL_SMTP, UTL_HTTP, UTL_TCP, etc. allow communication from the database server to the outside world, but once execute access is granted, all hosts can be reached. This can be seen as a security flaw, as no login is required when using UTL_TCP, for example. DBAs are advised to revoke the execute privilege on these kinds of packages from PUBLIC.

Oracle 11g introduced the Access Control List. Not only can you control who has access to these packages through grants, you can now also control which network resources they can reach.
For instance, when a user is allowed to send email using UTL_MAIL, you can also restrict him/her to a specific mail server.
At first this looks like an obstacle (ORA-24247), but after the Voyager worm struck Oracle databases, it was introduced as an extra security measure.

I will use the UTL_MAIL package as an example, please scroll to the end of this blog to enable UTL_MAIL as it is disabled by default.

ORA-24247: network access denied by access control list (ACL)

If a user is not allowed to connect to a specific server due to ACL restrictions, the following message will appear: ORA-24247: network access denied by access control list (ACL).

I will walk through the solution by using UTL_MAIL as an example; try to execute the following as SCOTT:

SQL> connect scott/tiger
Connected.
begin
  utl_mail.send(
  sender     => 'scott@tiger.com',
  recipients => 'john@doe.org',
  message    => 'Hello World'
  );
end;
/
ERROR at line 1:
ORA-24247: network access denied by access control list (ACL)
ORA-06512: at "SYS.UTL_MAIL", line 654
ORA-06512: at "SYS.UTL_MAIL", line 671
ORA-06512: at line 1

This is because SCOTT does not have the privilege to access the mail/SMTP server, so the privilege must be added to an ACL.

ACL: Access Control List

The ACL is created as a file, and its file name is used as the key when adding and removing privileges.

Create ACL and privileges

Now first create an ACL as SYS (or any other user with execute granted on DBMS_NETWORK_ACL_ADMIN); this will hold the privileges. You can add as many privileges as you like to this file, but I would recommend splitting privileges into ACLs for specific tasks or users. You must create an ACL with at least one privilege, so let's start with the 'connect' privilege for user SCOTT (a role can also be added as principal):

begin
  dbms_network_acl_admin.create_acl (
    acl         => 'utl_mail.xml',
    description => 'Allow mail to be send',
    principal   => 'SCOTT',
    is_grant    => TRUE,
    privilege   => 'connect'
    );
    commit;
end;
/

Add Privilege

Great, now that the ACL is created, you can add more privileges like the ‘resolve’ privilege:

begin
  dbms_network_acl_admin.add_privilege (
  acl       => 'utl_mail.xml',
  principal => 'SCOTT',
  is_grant  => TRUE,
  privilege => 'resolve'
  );
  commit;
end;
/

Assign ACL

Cool, you granted SCOTT the connect and resolve privileges, but you have not yet defined which resources he is allowed to connect to:

begin
  dbms_network_acl_admin.assign_acl(
  acl  => 'utl_mail.xml',
  host => 'smtp server host name or address'
  );
  commit;
end;
/

Try again

SQL> connect scott/tiger
Connected.
begin
  utl_mail.send(
  sender     => 'scott@tiger.com',
  recipients => 'john@doe.org',
  message    => 'Hello World'
  );
  commit;
end;
/
PL/SQL procedure successfully completed.

Access to websites and ports

The ACL also allows you to control lower and upper ports, and start and end dates.

Run as SCOTT:

SQL> select utl_http.request('http://www.tiger.com') from dual;
select utl_http.request('http://www.tiger.com') from dual
       *
ERROR at line 1:
ORA-29273: HTTP request failed
ORA-06512: at "SYS.UTL_HTTP", line 1722
ORA-24247: network access denied by access control list (ACL)
ORA-06512: at line 1

Solution: create an ACL, add privileges and assign the ACL with ports

Run as SYS:

begin
  dbms_network_acl_admin.create_acl (
    acl         => 'utl_http.xml',
    description => 'HTTP Access',
    principal   => 'SCOTT',
    is_grant    => TRUE,
    privilege   => 'connect',
    start_date  => null,
    end_date    => null
  );

  dbms_network_acl_admin.add_privilege (
    acl        => 'utl_http.xml',
    principal  => 'SCOTT',
    is_grant   => TRUE,
    privilege  => 'resolve',
    start_date => null,
    end_date   => null
  );

  dbms_network_acl_admin.assign_acl (
    acl        => 'utl_http.xml',
    host       => 'www.tiger.com',
    lower_port => 80,
    upper_port => 80
  );
  commit;
end;
/

The host parameter of dbms_network_acl_admin.assign_acl can also contain wildcards like ‘*.tiger.com’ or even ‘*’.
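For instance, a sketch that assigns the same ACL to every host in the tiger.com domain (the domain is illustrative):

begin
  dbms_network_acl_admin.assign_acl(
    acl        => 'utl_http.xml',
    host       => '*.tiger.com',
    lower_port => 80,
    upper_port => 80
  );
  commit;
end;
/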

Try again

Run as SCOTT:

SQL> select utl_http.request('http://www.tiger.com') from dual;
UTL_HTTP.REQUEST('HTTP://WWW.TIGER.COM')
----------------------------------------
[result here]

Now try to access the same URL, but with another port. You will see this fails, because only port 80 is privileged.

SQL> select utl_http.request('http://www.tiger.com:1234') from dual;
select utl_http.request('http://www.tiger.com:1234') from dual
       *
ERROR at line 1:
ORA-29273: HTTP request failed
ORA-06512: at "SYS.UTL_HTTP", line 1722
ORA-24247: network access denied by access control list (ACL)
ORA-06512: at line 1

You can specify lower and upper ports as a range when assigning ACLs.

dba_network_acls

You can view ACLs and their host assignments by querying dba_network_acls:

select host, lower_port, upper_port, acl
  from dba_network_acls
 where acl = '/sys/acls/utl_http.xml';
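To see which principals hold which privileges in an ACL, you can also query dba_network_acl_privileges (a sketch using the ACL created earlier):

select acl, principal, privilege, is_grant
  from dba_network_acl_privileges
 where acl = '/sys/acls/utl_http.xml';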

Removing ACL and privileges

Of course, removing ACLs and privileges is also possible and fairly self-explanatory.

Run the following as SYS:

Unassign ACL

begin
  dbms_network_acl_admin.unassign_acl(
    acl        => 'utl_http.xml',
    host       => 'www.tiger.com',
    lower_port => 80,
    upper_port => 80
  );
end;
/

Delete Privilege

begin
  dbms_network_acl_admin.delete_privilege(
    'utl_http.xml', 'SCOTT', NULL, 'connect'
  );
end;
/

Drop ACL

begin
  dbms_network_acl_admin.drop_acl(
    'utl_http.xml'
  );
end;
/

Enabling UTL_MAIL

In these examples I use UTL_MAIL, this package is disabled by default, run the following statements as SYS to enable it:

SQL> @?/rdbms/admin/utlmail.sql
SQL> @?/rdbms/admin/prvtmail.plb
SQL> alter system set smtp_out_server = '<smtp host>' scope=spfile;
SQL> shutdown immediate
SQL> startup

Source : http://blog.whitehorses.nl/2010/03/17/oracle-11g-access-control-list-and-ora-24247/

Interpreting Explain Plan

What’s an explain plan?

An explain plan is a representation of the access path that is taken when a query is executed within Oracle.

Query processing can be divided into 7 phases:
[1] Syntactic 	Checks the syntax of the query
[2] Semantic 	Checks that all objects exist and are accessible
[3] View Merging 	Rewrites the query as a join on base tables as opposed to using views
[4] Statement
     Transformation 	Rewrites the query, transforming some complex constructs into simpler ones where appropriate (e.g. subquery merging, in/or transformation)
[5] Optimization 	Determines the optimal access path for the query to take. With the Rule Based Optimizer (RBO) it uses a set of heuristics to determine the access path. With the Cost Based Optimizer (CBO) we use statistics to analyze the relative costs of accessing objects.
[6] QEP Generation 	Generates the Query Evaluation Plan (QEP)
[7] QEP Execution 	Executes the Query Evaluation Plan

Steps [1]-[6] are handled by the parser. Step [7] is the execution of the statement.

The explain plan is produced by the parser. Once the access path has been decided upon it is stored in the library cache together with the statement itself. We store queries in the library cache based upon a hashed representation of that query. When looking for a statement in the library cache, we first apply a hashing algorithm to the statement and then we look for this hash value in the library cache. This access path will be used until the query is reparsed.
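You can see the hash value Oracle computed for a cached statement by querying v$sqlarea (the sql_text filter below is illustrative):

select sql_text, hash_value, executions
  from v$sqlarea
 where sql_text like 'select empno,ename from emp%';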

Terminology:

Row Source 	A set of rows used in a query; may be a select from a base object or the result set returned by joining 2 earlier row sources
Predicate 	The where clause of a query
Tuples 	Rows
Driving Table 	The row source that we use to seed the query. If this returns a lot of rows then this can have a negative effect on all subsequent operations
Probed Table 	The object we look up data in after we have retrieved relevant key data from the driving table

How does Oracle access data?

At the physical level Oracle reads blocks of data. The smallest amount of data read is a single Oracle block, the largest is constrained by operating system limits (and multiblock i/o). Logically Oracle finds the data to read by using the following methods:

#Full Table Scan (FTS)
#Index Lookup (unique & non-unique)
#Rowid

Explain plan Hierarchy

Simple explain plan:

Query Plan
-----------------------------------------
SELECT STATEMENT     [CHOOSE] Cost=1234
  TABLE ACCESS FULL LARGE [:Q65001] [ANALYZED]

The rightmost uppermost operation of an explain plan is the first thing that the explain plan will execute. In this case TABLE ACCESS FULL LARGE is the first operation. This statement means we are doing a full table scan of table LARGE. When this operation completes then the resultant row source is passed up to the
next level of the query for processing. In this case it is the SELECT STATEMENT which is the top of the query.

[CHOOSE] is an indication of the optimizer_goal for the query. This DOES NOT necessarily indicate that plan has actually used this goal. The only way to confirm this is to check the
cost= part of the explain plan as well. For example the following query indicates that the CBO has been used because there is a cost in the cost field:

SELECT STATEMENT     [CHOOSE] Cost=1234

However the explain plan below indicates the use of the RBO because the cost field is blank:

SELECT STATEMENT     [CHOOSE] Cost=

The cost field is a comparative cost that is used internally to determine the best cost for particular plans. The costs of different statements are not really directly comparable.

[:Q65001] indicates that this particular part of the query is being executed in parallel. This number indicates that the operation will be processed by a parallel query slave as opposed to being executed serially.

[ANALYZED] indicates that the object in question has been analyzed and there are currently statistics available for the CBO to use. There is no indication of the ‘level’ of analysis done.

Access Methods in detail

Full Table Scan (FTS)

In a FTS operation, the whole table is read up to the high water mark (HWM). The HWM marks the last block in the table that has ever had data written to it. If you have deleted all the rows then you will still read up to the HWM. Truncate resets the HWM back to the start of the table. FTS uses multiblock i/o to read the blocks from disk. Multiblock i/o is controlled by the parameter db_file_multiblock_read_count.

This defaults to:

db_block_buffers / ( (PROCESSES+3) / 4 )

Maximum values are OS dependent.

Buffers from FTS operations are placed on the Least Recently Used (LRU) end of the buffer cache so will be quickly aged out. FTS is not recommended for large tables unless you are reading >5-10% of it (or so) or you intend to run in parallel.

Example FTS explain plan:

SQL> explain plan for select * from dual;

Query Plan
-----------------------------------------
SELECT STATEMENT     [CHOOSE] Cost=
  TABLE ACCESS FULL DUAL

Index lookup

Data is accessed by looking up key values in an index and returning rowids. A rowid uniquely identifies an individual row in a particular data block. This block is read via single block i/o.

In this example an index is used to find the relevant row(s) and then the table is accessed to lookup the ename column (which is not included in the index):

SQL> explain plan for
select empno,ename from emp where empno=10;
Query Plan

------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
    INDEX UNIQUE SCAN EMP_I1

Notice the ‘TABLE ACCESS BY ROWID’ section. This indicates that the table data is not being accessed via a FTS operation but rather by a rowid lookup. In this case the rowid has been produced by looking up values in the index first. The index is being accessed by an ‘INDEX UNIQUE SCAN’ operation. This is explained below. The index name in this case is EMP_I1. If all the required data resides in the index then a table lookup may be unnecessary and all you will see is an index access with no table access.

In the following example all the columns (empno) are in the index. Notice that no table access takes place:

SQL> explain plan for
select empno from emp where empno=10;

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
  INDEX UNIQUE SCAN EMP_I1

Indexes are presorted, so sorting may be unnecessary if the sort order required is the same as the index.

SQL> explain plan for select empno,ename from emp
where empno > 7876 order by empno;

Query Plan
-------------------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
  INDEX RANGE SCAN EMP_I1 [ANALYZED]

In this case the index is sorted, so the rows will be returned in the order of the index; hence a sort is unnecessary.

SQL> explain plan for
select /*+ Full(emp) */ empno,ename from emp
where empno> 7876 order by empno; 

Query Plan
-------------------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=9
  SORT ORDER BY
    TABLE ACCESS FULL EMP [ANALYZED]  Cost=1 Card=2 Bytes=66

Because we have forced a FTS the data is unsorted and so we must sort the data
after it has been retrieved.
There are 4 methods of index lookup:
index unique scan
index range scan
index full scan
index fast full scan


Index unique scan

Method for looking up a single key value via a unique index; it always returns a single value. You must supply AT LEAST the leading column of the index to access data via the index. Note that if only some of the leading columns of a concatenated unique index are supplied, this may return > 1 row, as uniqueness is no longer guaranteed.

SQL> explain plan for
select empno,ename from emp where empno=10;

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
INDEX UNIQUE SCAN EMP_I1

Index range scan

Method for accessing multiple column values. You must supply AT LEAST the leading column of the index to access data via the index. Can be used for range operations (e.g. > < >= <= between).

SQL> explain plan for select empno,ename from emp
where empno > 7876 order by empno;

Query Plan
-------------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
INDEX RANGE SCAN EMP_I1 [ANALYZED]

A non-unique index may return multiple values for the predicate col1 = 5 and will use an index range scan

SQL> explain plan for select mgr from emp where mgr = 5

Query plan
--------------------
SELECT STATEMENT [CHOOSE] Cost=1
INDEX RANGE SCAN EMP_I2 [ANALYZED]

Index Full Scan

In certain circumstances it is possible for the whole index to be scanned as opposed to a range scan (i.e. where no constraining predicates are provided for a table). Full index scans are only available in the CBO as otherwise we are unable to determine whether a full scan would be a good idea or not. We choose an index Full Scan when we have statistics that indicate that it is going to be more efficient than a Full table scan and a sort.

For example we may do a Full index scan when we do an unbounded scan of an index and want the data to be ordered in the index order. The optimizer may decide that selecting all the information from the index and not sorting is more efficient than doing a FTS or a Fast Full Index Scan and then sorting.

An index full scan will perform single block i/o's, so it may prove to be inefficient. In the example below, index BE_IX is a concatenated index on big_emp (empno,ename).

SQL> explain plan for select empno,ename
     from big_emp order by empno,ename; 

Query Plan
------------------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=26
  INDEX FULL SCAN BE_IX [ANALYZED]

Index Fast Full Scan

Scans all the blocks in the index. Rows are not returned in sorted order. Introduced in 7.3; requires V733_PLANS_ENABLED=TRUE and the CBO. May be hinted using the INDEX_FFS hint. Uses multiblock i/o, can be executed in parallel, and can be used to access the second column of concatenated indexes. This is because we are selecting all of the index.

Note that INDEX FAST FULL SCAN is the mechanism behind fast index create and recreate. Index BE_IX is a concatenated index on big_emp (empno,ename)

SQL> explain plan for select empno,ename from big_emp;

Query Plan
------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=1
  INDEX FAST FULL SCAN BE_IX [ANALYZED]

Selecting the 2nd column of concatenated index:

SQL> explain plan for select ename from big_emp;

Query Plan
------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=1
INDEX FAST FULL SCAN BE_IX [ANALYZED]

Rowid

This is the quickest access method available. Oracle simply retrieves the block specified and extracts the rows it is interested in. Most frequently seen in explain plans as table access by rowid.

SQL> explain plan for select * from dept where rowid = ':x';

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS BY ROWID DEPT [ANALYZED] 

Table is accessed by rowid following index lookup: 

SQL> explain plan for
select empno,ename from emp where empno=10;

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
    INDEX UNIQUE SCAN EMP_I1

Joins

A join is a predicate that attempts to combine 2 row sources. We only ever join 2 row sources together at a time. Join steps are always performed serially, even though the underlying row sources may have been accessed in parallel. The join order is the order in which the joins are performed.

The join order makes a significant difference to the way in which the query is executed. By accessing particular row sources first, certain predicates may be satisfied that are not satisfied with other join orders. This may prevent certain access paths from being taken.

Suppose there is a concatenated index on A(a.col1,a.col2). Note that a.col1 is the leading column. Consider the following query:

select A.col4
from   A,B,C
where  B.col3 = 10
and    A.col1 = B.col1
and    A.col2 = C.col2
and    C.col3 = 5 

We could represent the joins present in the query using the following schematic: 

  B      A     C
col3=10                col3=5

There are really only 2 ways we can drive the query: via B.col3 or C.col3. We would have to do a Full scan of A to be able to drive off it. This is unlikely to be efficient with large tables;

If we drive off table B, using predicate B.col3=10 (as a filter or lookup key) then we will retrieve the value for B.col1 and join to A.col1. Because we have now filled the leading column of the concatenated index on table A we can use this index to give us values for A.col2 and join to A.

However, if we drive off table C, then we only get a value for A.col2, and since this is a trailing column of the concatenated index and the leading column has not been supplied at this point, we cannot use the index on A to look up the data.

So it is likely that the best join order will be B A C. The CBO will obviously use costs to establish whether the individual access paths are a good idea or not.

If the CBO does not choose this join order then we can hint it by changing the from
clause to read:

from B,A,C 

and using the /*+ ordered */ hint. The resultant query would be: 

select /*+ ordered */ A.col4
from   B,A,C
where  B.col3 = 10
and    A.col1 = B.col1
and    A.col2 = C.col2
and    C.col3 = 5 

Join Types
Sort Merge Join (SMJ)
Nested Loops (NL)
Hash Join

Sort Merge Join

Rows are produced by Row Source 1 and are then sorted. Rows from Row Source 2 are then produced and sorted by the same sort key as Row Source 1. Row Source 1 and 2 are NOT accessed concurrently. Sorted rows from both sides are then merged together (joined):

                   MERGE
                 /      \
            SORT        SORT
             |             |
        Row Source 1  Row Source 2

If the row sources are already (known to be) sorted then the sort operation is unnecessary, as long as both ‘sides’ are sorted using the same key. Presorted row sources include indexed columns and row sources that have already been sorted in earlier steps. Although the merge of the 2 row sources is handled serially, the row sources could be accessed in parallel.

SQL> explain plan for
select /*+ ordered */ e.deptno,d.deptno
from emp e,dept d
where e.deptno = d.deptno
order by e.deptno,d.deptno; 

Query Plan
-------------------------------------
SELECT STATEMENT [CHOOSE] Cost=17
  MERGE JOIN
    SORT JOIN
      TABLE ACCESS FULL EMP [ANALYZED]
    SORT JOIN
      TABLE ACCESS FULL DEPT [ANALYZED]

Sorting is an expensive operation, especially with large tables. Because of this, SMJ is often not a particularly efficient join method.

Nested Loops

First we return all the rows from row source 1. Then we probe row source 2 once for each row returned from row source 1:

Row source 1
~~~~~~~~~~~~
Row 1 --------------       -- Probe ->       Row source 2
Row 2 --------------       -- Probe ->       Row source 2
Row 3 --------------       -- Probe ->       Row source 2 

Row source 1 is known as the outer table
Row source 2 is known as the inner table

Accessing row source 2 is known as probing the inner table. For nested loops to be efficient, it is important that the first row source returns as few rows as possible, as this directly controls the number of probes of the second row source. It also helps if the access method for row source 2 is efficient, as this operation is repeated once for every row returned by row source 1.

SQL> explain plan for
select a.dname,b.sal
from dept a,emp b
where a.deptno = b.deptno; 

Query Plan
-------------------------
SELECT STATEMENT [CHOOSE] Cost=5
  NESTED LOOPS
    TABLE ACCESS FULL DEPT [ANALYZED]
    TABLE ACCESS FULL EMP [ANALYZED]

Hash Join

A new join type introduced in 7.3. In theory it is more efficient than NL & SMJ, and it is only available via the CBO. The smallest row source is chosen and used to build a hash table and a bitmap. The second row source is hashed and checked against the hash table, looking for joins. The bitmap is used as a quick lookup to check if rows are in the hash table, and is especially useful when the hash table is too large to fit in memory.

SQL> explain plan for
select /*+ use_hash(emp) */ empno
from emp,dept
where emp.deptno = dept.deptno; 

Query Plan
----------------------------
SELECT STATEMENT  [CHOOSE] Cost=3
  HASH JOIN
    TABLE ACCESS FULL DEPT
    TABLE ACCESS FULL EMP

Hash joins are enabled by the parameter HASH_JOIN_ENABLED=TRUE in the init.ora or at session level. TRUE is the default in 7.3.
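As a sketch for these older releases, the parameter can be toggled at session level to compare plans with and without hash joins:

SQL> alter session set hash_join_enabled = false;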

Cartesian Product

A cartesian product is done when there are no join conditions between 2 row sources and there is no alternative method of accessing the data. It is not really a join as such, as there is no join condition! Typically this is caused by a coding mistake where a join has been left out. It can be useful in some circumstances, though: star joins use cartesian products.

Notice that there is no join between the 2 tables:

SQL> explain plan for
select emp.deptno,dept.deptno
from emp,dept;

Query Plan
------------------------------
SELECT STATEMENT [CHOOSE] Cost=5
  MERGE JOIN CARTESIAN
    TABLE ACCESS FULL DEPT
    SORT JOIN
      TABLE ACCESS FULL EMP

The CARTESIAN keyword indicates that we are doing a cartesian product.

Operations

Operations that show up in explain plans:
#sort
#filter
#view

Sorts

There are a number of different operations that promote sorts:

#order by clauses
#group by
#sort merge join

Note that if the row source is already appropriately sorted then no sorting is required. This is now indicated in 7.3:

SORT GROUP BY NOSORT
INDEX FULL SCAN …..

In this case the group by operation simply groups the rows it does not do the sort operation as this has already been completed.

Sorts are expensive operations, especially on large tables where the rows do not fit in memory and spill to disk. By default sort blocks are placed into the buffer cache. This may result in aging out of other blocks that may be reread by other processes. To avoid this you can use the parameter SORT_DIRECT_WRITES, which does not place sort blocks into the buffer cache.
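For example, in these older releases the behaviour could be enabled at session level (a sketch; verify the parameter exists in your version before relying on it):

SQL> alter session set sort_direct_writes = true;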

Filter

Filter has a number of different meanings. It is used to indicate partition elimination; it may also indicate an actual filter step where one row source is filtering another; and functions such as min may introduce filter steps into query plans.

In this example there are 2 filter steps. The first is effectively like a NL except that it stops when it gets something that it doesn’t like (i.e. a bounded NL). This is there because of the not in. The second is filtering out the min value:

SQL> explain plan for select * from emp
     where empno not in (select min(empno)
     from big_emp group by empno); 

Query Plan
------------------
SELECT STATEMENT [CHOOSE]  Cost=1
  FILTER     **** This is like a bounded nested loops
    TABLE ACCESS FULL EMP [ANALYZED]
     FILTER   **** This filter is introduced by the min
        SORT GROUP BY NOSORT
          INDEX FULL SCAN BE_IX

This example is also interesting in that it has a NOSORT function. The group by does not need to sort because the index row source is already presorted.

Views

When a view cannot be merged into the main query you will often see a projection view operation. This indicates that the ‘view’ will be selected from directly, as opposed to being broken down into joins on the base tables. A number of constructs make a view non-mergeable; inline views are also non-mergeable.

In the following example the select contains an inline view which cannot be merged:

SQL> explain plan for
select ename,tot
from emp,
    (select empno,sum(empno) tot from big_emp group by empno) tmp
where emp.empno = tmp.empno; 

Query Plan
------------------------
SELECT STATEMENT [CHOOSE]
  HASH JOIN
    TABLE ACCESS FULL EMP [ANALYZED]
    VIEW
      SORT GROUP BY
        INDEX FULL SCAN BE_IX

In this case the inline view tmp which contains an aggregate function cannot be merged into the main query. The explain plan shows this as a view step.

Partition Views

Partition views allow a large table to be broken up into a number of smaller partitions which can be queried much more quickly than the table as a whole. A union all view is built over the top to provide the original functionality. Check constraints or where clauses provide partition elimination capabilities.

SQL> explain plan for
select /*+ use_nl(p1,kbwyv1) ordered */  sum(prc_pd)
from parent1 p1,  kbwyv1
where p1.class = 22
and   kbwyv1.bitm_numb = p1.bitm_numb
and   kbwyv1.year = 1997
and   kbwyv1.week between 32 and 33 ; 

Query Plan
-----------------------------------------
SELECT STATEMENT   [FIRST_ROWS] Cost=1780
  SORT AGGREGATE
    NESTED LOOPS   [:Q65001] Ct=1780 Cd=40 Bt=3120
      TABLE ACCESS FULL PARENT1 [:Q65000] [AN] Ct=20 Cd=40 Bt=1040
      VIEW  KBWYV1 [:Q65001]
        UNION-ALL PARTITION  [:Q65001]
          FILTER   [:Q64000]
            TABLE ACCESS FULL KBWYT1 [AN] Ct=11 Cd=2000 Bt=104000
          TABLE ACCESS FULL KBWYT2 [AN] Ct=11 Cd=2000 Bt=104000
          TABLE ACCESS FULL KBWYT3 [AN] Ct=11 Cd=2000 Bt=104000
          FILTER   [:Q61000]
            TABLE ACCESS FULL KBWYT4 [AN] Ct=11 Cd=2000 Bt=104000

KBWYV1 is a view on 4 tables KBWYT1-4. KBWYT1-4 contain rows for weeks 31-34 respectively and are maintained by check constraints. This query should only return rows from partitions 2 & 3. The filter operation indicates this: partitions 1 & 4 are eliminated at execution time. The view line indicates that the view is not merged. The union-all partition information indicates that we have recognised this as a partition view. Note that the tables can be accessed in parallel.

Remote Queries

Remote queries only show REMOTE in the OPERATION column. The OTHER column shows the query executed on the remote node, and OTHER_NODE shows where it is executed. There are different operational characteristics for the RBO & CBO:

RBO – Drags everything across the link and joins locally
CBO – Uses cost estimates to determine whether to execute remotely or locally

SQL>  explain plan for
select *
from dept@loop_link; 

Query Plan
-------------------------------------------------------
SELECT STATEMENT REMOTE  [CHOOSE] Cost=1
  TABLE ACCESS FULL DEPT [SJD.WORLD] [ANALYZED]

In this case the whole query has been sent to the remote site. The OTHER column shows nothing.

SQL> explain plan for
select a.dname,avg(b.sal),max(b.sal)
from dept@loop_link a, emp b
where a.deptno=b.deptno
group by a.dname
order by max(b.sal),avg(b.sal) desc; 

Query Plan
-----------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=20
  SORT ORDER BY  [:Q137003] [PARALLEL_TO_SERIAL]
    SORT GROUP BY  [:Q137002] [PARALLEL_TO_PARALLEL]
      NESTED LOOPS   [:Q137001] [PARALLEL_TO_PARALLEL]
        REMOTE   [:Q137000] [PARALLEL_FROM_SERIAL]
        TABLE ACCESS FULL EMP [:Q137001] [ANALYZED]
        [PARALLEL_COMBINED_WITH_PARENT]

Bind Variables

Bind variables are recommended in most cases because they promote sharing of SQL code.
At parse time the parser has NO IDEA what the bind variable contains. With the RBO this makes no difference, but with the CBO, which relies on accurate statistics to produce plans, this can be a problem.

Defining a bind variable in SQL*Plus:

variable x varchar2(18);

Assigning a value:

begin
  :x := 'hello';
end;
/

SQL> explain plan for
select *
from dept
where rowid = ':x'; 

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
  TABLE ACCESS BY ROWID DEPT [ANALYZED]

Parallel Query

Main indicators that a query is using PQO:

[:Q1000004] entries in the explain plan

Check out the OTHER column for details of what the slaves are executing.

v$pq_slave will show any parallel activity

Columns to look in for information

other – contains the query passed to the slaves

other_tag – describes the contents of other

object_node – indicates order of pqo slaves

Parallel Query operates on a producer/consumer basis. When you specify parallel degree 4, Oracle tries to allocate 4 producer slaves and 4 consumer slaves. The producers can feed any of the consumers. If there are only 2 slaves available then we use these. If there is only 1 slave available, or none, then we run serially. If PARALLEL_MIN_PERCENT is set then we raise ORA-12827 instead of using a lower number of slaves or going serial.
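For instance, to insist that at least half of the requested slaves are available before the query will run (a sketch; the value is a percentage):

SQL> alter session set parallel_min_percent = 50;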

Consumer processes typically perform a sorting function. If there is no requirement for the data to be sorted then the consumer slaves are not produced and we end up with the number of slaves used matching the degree of parallelism as opposed to being 2x the degree.

Parallel terms:

PARALLEL_FROM_SERIAL - the source of the data is serial but it is passed to a parallel consumer.
PARALLEL_TO_PARALLEL - both the producer and the consumer are parallel.
PARALLEL_COMBINED_WITH_PARENT - this operation has been combined with the parent operator. For example, in a sort merge join the sort operations would be shown as PARALLEL_COMBINED_WITH_PARENT because the sort and the merge are handled as 1 operation.
PARALLEL_TO_SERIAL - the source of the data is parallel but it is passed to a serial consumer. This typically happens at the top of the explain plan but can occur anywhere.

Examples of parallel queries

Assumptions

OPTIMIZER_MODE = CHOOSE
DEPT is small compared to EMP
DEPT has an index (DEPT_INDX) on deptno column

Three examples are presented

Query #1: Serial
Query #2: Parallel
Query #3: Parallel, with forced optimization to RULE and forced usage of DEPT_INDX

Sample Query #1 (Serial)

select A.dname, avg(B.sal), max(B.sal)
from  dept A, emp B
where A.deptno = B.deptno
group by A.dname
order by max(B.sal), avg(B.sal) desc; 

Execution Plan #1 (Serial) 

OBJECT_NAME                      OBJECT_NODE OTHER
-------------------------------  ----------- -------
SELECT STATEMENT
 SORT ORDER BY
   SORT GROUP BY
     MERGE JOIN
       SORT JOIN
         TABLE ACCESS FULL emp
       SORT JOIN
         TABLE ACCESS FULL dept

Notice that the object_node and other columns are empty

Sample Query #2 (Query #1 with parallel hints) 

select /*+ parallel(B,4) parallel(A,4) */
A.dname, avg(B.sal), max(B.sal)
from  dept A, emp B
where A.deptno = B.deptno
group by A.dname
order by max(B.sal), avg(B.sal) desc; 

Execution Plan #2  (Parallel) 

OBJECT_NAME                      OBJECT_NODE OTHER
-------------------------------  ----------- -------
SELECT STATEMENT      Cost = ??
 SORT ORDER BY                   :Q55004     **[7]**
   SORT GROUP BY                 :Q55003     **[6]**
     MERGE JOIN                  :Q55002     **[5]**
       SORT JOIN                 :Q55002     **[4]**
         TABLE ACCESS FULL emp   :Q55001     **[2]**
       SORT JOIN                 :Q55002     **[3]**
         TABLE ACCESS FULL dept  :Q55000     **[1]** 

Execution Plan #2  -- OTHER column 

**[1]**  (:Q55000) "PARALLEL_FROM_SERIAL" 

Serial execution of SELECT DEPTNO, DNAME FROM DEPT 

**[2]**  (:Q55001) "PARALLEL_TO_PARALLEL" 

        SELECT /*+ ROWID(A1)*/
        A1."DEPTNO" C0, A1."SAL" C1
        FROM "EMP" A1
        WHERE ROWID BETWEEN :1 AND :2 

**[3]**  (:Q55002) "PARALLEL_COMBINED_WITH_PARENT"
**[4]**  (:Q55002) "PARALLEL_COMBINED_WITH_PARENT"
**[5]**  (:Q55002) "PARALLEL_TO_PARALLEL" 

        SELECT /*+ ORDERED USE_MERGE(A2)*/
        A2.C1 C0, A1.C1 C1
        FROM :Q55001 A1,:Q55000 A2
        WHERE A1.C0=A2.C0 

**[6]**  (:Q55003) "PARALLEL_TO_PARALLEL" 

        SELECT MAX(A1.C1) C0, AVG(A1.C1) C1, A1.C0 C2
        FROM :Q55002 A1
        GROUP BY A1.C0 

**[7]**  (:Q55004) "PARALLEL_TO_SERIAL" 

        SELECT A1.C0 C0, A1.C1 C1, A1.C2 C2
        FROM :Q55003 A1
        ORDER BY A1.C0, A1.C1 DESC 

Sample Query #3 (Query #2 with a forced index hint) 

select /*+ index(A dept_indx) parallel(B,4) parallel(A,4) */
      A.dname, avg(B.sal), max(B.sal)
from  dept A, emp B
where A.deptno = B.deptno
group by A.dname
order by max(B.sal), avg(B.sal) desc;

Execution Plan #3 (Parallel)

OBJECT_NAME                         OBJECT_NODE OTHER
----------------------------------- ----------- -------
SELECT STATEMENT          Cost = ??
 SORT ORDER BY                      :Q58002     **[6]**
   SORT GROUP BY                    :Q58001     **[5]**
     NESTED LOOPS JOIN              :Q58000     **[4]**
       TABLE ACCESS FULL emp        :Q58000     **[3]**
       TABLE ACCESS BY ROWID dept   :Q58000     **[2]**
         INDEX RANGE SCAN dept_indx :Q58000     **[1]** 

Execution Plan #3  -- OTHER column 

**[1]**  (:Q58000) "PARALLEL_COMBINED_WITH_PARENT"
**[2]**  (:Q58000) "PARALLEL_COMBINED_WITH_PARENT"
**[3]**  (:Q58000) "PARALLEL_COMBINED_WITH_PARENT"
**[4]**  (:Q58000) "PARALLEL_TO_PARALLEL" 

        SELECT /*+ ORDERED USE_NL(A2) INDEX(A2) */
        A2."DNAME" C0, A1.C0 C1
        FROM
          (SELECT /*+ ROWID(A3) */
           A3."SAL" C0, A3."DEPTNO" C1
           FROM "EMP" A3
           WHERE ROWID BETWEEN :1 AND :2) A1,
          "DEPT" A2
        WHERE A2."DEPTNO" = A1.C1 

**[5]**  (:Q58001) "PARALLEL_TO_PARALLEL" 

        SELECT MAX(A1.C1) C0, AVG(A1.C1) C1, A1.C0 C2
        FROM :Q58000 A1
        GROUP BY A1.C0 

**[6]**  (:Q58002) "PARALLEL_TO_SERIAL" 

        SELECT A1.C0 C0, A1.C1 C1, A1.C2 C2
        FROM :Q58001 A1
        ORDER BY A1.C0, A1.C1 DESC

How to obtain explain plans

Explain plan for

The main advantage is that it does not actually run the query – it just parses the SQL – so it executes quickly. In the early stages of tuning, EXPLAIN PLAN gives you an idea of the potential performance of your query without running it. You can then make a judgement as to any modifications you may choose to make.
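A minimal sketch of the EXPLAIN PLAN workflow, using the sample query from this article; the statement ID 'demo' is arbitrary, and DBMS_XPLAN.DISPLAY assumes Oracle 9i Release 2 or later:

```sql
EXPLAIN PLAN SET STATEMENT_ID = 'demo' FOR
SELECT A.dname, AVG(B.sal), MAX(B.sal)
FROM   dept A, emp B
WHERE  A.deptno = B.deptno
GROUP  BY A.dname
ORDER  BY MAX(B.sal), AVG(B.sal) DESC;

-- Format the stored plan rows into a readable report
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'demo'));
```

On older releases, the same plan rows can be retrieved from PLAN_TABLE directly with a CONNECT BY query.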

Autotrace

Autotrace can be configured either to run the SQL and report the plan and statistics afterwards, or to display just the explain plan without executing the query.
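For example, in SQL*Plus (the query shown is illustrative; AUTOTRACE requires the PLUSTRACE role and a PLAN_TABLE):

```sql
SET AUTOTRACE TRACEONLY EXPLAIN   -- show the plan only; the query is not executed
SELECT dname FROM dept;

SET AUTOTRACE ON                  -- execute the query, then show plan and statistics
SELECT dname FROM dept;

SET AUTOTRACE OFF
```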

Tkprof

TKPROF analyzes a raw SQL trace file and formats it into a readable report showing, for each statement, the execution statistics and (optionally) the execution plan.
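A sketch of the typical workflow: enable tracing for the session, run the statement, then format the resulting trace file. File names, paths, and credentials below are illustrative only.

```sql
-- In the session to be traced:
ALTER SESSION SET sql_trace = TRUE;
SELECT dname FROM dept;
ALTER SESSION SET sql_trace = FALSE;
```

The trace file is written to the directory given by the user_dump_dest parameter, and is then formatted from the operating system prompt, e.g.:

```
$ tkprof ora_12345.trc report.txt explain=scott/tiger sys=no
```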

Source : http://www.akadia.com/services/ora_interpreting_explain_plan.html