SQL Loader Topics
SQL*Loader is a bulk loader utility used for moving data from external files into
the Oracle database. Its syntax is similar to that of the DB2 Load utility, but
comes with more options. SQL*Loader supports various load
formats, selective loading, and multi-table loads. 
One can load data into an Oracle
database by using the sqlldr (sqlload on some platforms) utility. Invoke the utility
without arguments to get a list of available parameters. Look at the following
example: 
        sqlldr scott/tiger control=loader.ctl
This sample control file (loader.ctl) will
load an external data file containing delimited data: 
        load data
          infile 'c:\data\mydata.csv'
                 
into table emp
          fields terminated by ","
optionally enclosed by '"'                   
          ( empno, empname, sal, deptno )
The mydata.csv file may look like
this: 
        10001,"Scott Tiger", 1000, 40
        10002,"Frank Naude", 500, 20
Another Sample control file with in-line data formatted
as fix length records. The trick is to specify "*" as the name of the
data file, and use BEGINDATA to start the data section in the control file. 
        load data
          infile *
          replace
          into table departments
          ( 
dept     position (02:05) char(4),
             deptname position (08:27) char(20)
          )
        begindata
          COSC 
COMPUTER SCIENCE
          ENGL 
ENGLISH LITERATURE
          MATH 
MATHEMATICS
          POLY 
POLITICAL SCIENCE
Oracle does not supply any data
unload utilities. However, you can use SQL*Plus to select and format your data and then spool it to a
file: 
        set echo off newpage 0 space 0 pagesize
0 feed off head off trimspool on
        spool oradata.txt
        select col1 || ',' || col2 || ',' ||
col3
        from  
tab1
        where 
col2 = 'XYZ';
        spool off
Alternatively use the UTL_FILE PL 
        rem Remember to update initSID.ora, utl_file_dir='c:\oradata'
parameter
        declare
           fp utl_file.file_type;
        begin
           fp :=
utl_file.fopen('c:\oradata','tab1.txt','w');
           utl_file.putf(fp, '%s, %s\n',
'TextField', 55);
           utl_file.fclose(fp);
        end;
        /
You might also want to investigate third
party tools like SQLWays from Ispirer Systems,
TOAD from Quest, or ManageIT Fast Unloader from CA to help you unload data from
Oracle. 
Yes, look at the following control
file examples. In the first we will load delimited data (variable length):
   LOAD DATA
   INFILE *
   INTO TABLE load_delimited_data
   FIELDS TERMINATED BY
"," OPTIONALLY ENCLOSED BY '"'
   TRAILING NULLCOLS
   ( 
data1,
      data2
   )
   BEGINDATA
   11111,AAAAAAAAAA
   22222,"A,B,C,D,"
If you need to load positional
data (fixed length), look at the following control file example: 
   LOAD DATA
   INFILE *
   INTO TABLE load_positional_data
   ( 
data1 POSITION(1:5),
      data2 POSITION(6:15)
   )
   BEGINDATA
   11111AAAAAAAAAA
   22222BBBBBBBBBB
One can skip header records or
continue an interrupted load (for example if you run out of space) by
specifying the "SKIP n" keyword. "n" specifies the number
of logical rows to skip. Look at this example: 
   LOAD DATA (SKIP 5)
   INFILE *
   INTO TABLE load_positional_data
   ( 
data1 POSITION(1:5),
      data2 POSITION(6:15)
   )
   BEGINDATA
   11111AAAAAAAAAA
   22222BBBBBBBBBB
If you are continuing a multiple
table direct path load, you may need to use the CONTINUE_LOAD clause instead of
the SKIP parameter. CONTINUE_LOAD allows you to specify a different number of
rows to skip for each of the tables you are loading. 
Data can be modified as it loads
into the Oracle Database. Note that this only applies for the conventional load
path and not for direct path loads. 
   LOAD DATA
   INFILE *
   INTO TABLE modified_data
   ( 
rec_no                      "my_db_sequence.nextval",
      region                      CONSTANT
'31',
      time_loaded                 "to_char(SYSDATE,
'HH24:MI')",
      data1        POSITION(1:5)  ":data1/100",
      data2        POSITION(6:15) "upper(:data2)",
      data3        POSITION(16:22)"to_date(:data3,
'YYMMDD')"
   )
   BEGINDATA
   11111AAAAAAAAAA991201
   22222BBBBBBBBBB990112
   LOAD DATA
   INFILE 'mail_orders.txt'
   BADFILE 'bad_orders.txt'
   APPEND
   INTO TABLE mailing_list
   FIELDS TERMINATED BY ","
   ( 
addr,
      city,
      state,
      zipcode,
      mailing_addr   "decode(:mailing_addr,
null, :addr, :mailing_addr)",
      mailing_city   "decode(:mailing_city,
null, :city, :mailing_city)",
      mailing_state
   )
Look at the following control file: 
   LOAD DATA
   INFILE *
   REPLACE
   INTO TABLE emp
        WHEN empno != ' '
   ( empno 
POSITION(1:4)   INTEGER EXTERNAL,
     ename 
POSITION(6:15)  CHAR,
     deptno POSITION(17:18) CHAR,
     mgr   
POSITION(20:23) INTEGER EXTERNAL
   )
   INTO TABLE proj
        WHEN projno != ' '
   ( 
projno POSITION(25:27) INTEGER EXTERNAL,
      empno 
POSITION(1:4)   INTEGER EXTERNAL
   )
Look at this example, (01) is the
first character, (30:37) are characters 30 to 37: 
   LOAD DATA
   INFILE 
'mydata.dat' BADFILE  'mydata.bad'
DISCARDFILE 'mydata.dis'
   APPEND
   INTO TABLE my_selective_table
   WHEN (01) <> 'H'
and (01) <> 'T' and (30:37) = '19991217'
   (
      region              CONSTANT '31',
      service_key         POSITION(01:11)   INTEGER EXTERNAL,
      call_b_no           POSITION(12:29)   CHAR
   )
One cannot use POSTION(x:y) with
delimited data. Luckily, from Oracle 8i one can specify FILLER columns. FILLER
columns are used to skip columns/fields in the load file, ignoring fields that
one does not want. Look at this example: 
        LOAD DATA
        TRUNCATE INTO TABLE T1
        FIELDS TERMINATED BY ','
        ( field1,
          field2 FILLER,
          field3
        )
One can create one logical record
from multiple physical records using one of the following two clauses: 
- CONCATENATE: - use when SQL*Loader
     should combine the same number of physical records together to form one
     logical record. 
- CONTINUEIF - use if a condition
     indicates that multiple records should be treated as one. Eg. by having a
     '#' character in column 1. 
One cannot, but by setting the ROWS=
parameter to a large value, committing can be reduced. Make sure you have big
rollback segments ready when you use a high value for ROWS=. 
1.    
A very simple but easily overlooked hint is not to have any
indexes and/or constraints (primary key) on your load tables during the load
process. This will significantly slow down load times even with ROWS= set to a
high value. 
2.    
Add the following option in the command line: DIRECT=TRUE. This
will effectively bypass most of the RDBMS processing. However, there are cases
when you can't use direct load. Refer to chapter 8 on Oracle server Utilities
manual. 
3.    
Turn off database logging by specifying the UNRECOVERABLE
option. This option can only be used with direct data loads. 
4.    
Run multiple load jobs concurrently. 
SQL*Loader can load data from a
"primary data file", SDF (Secondary Data file - for loading nested
tables and VARRAYs) or LOBFILE. The LOBFILE method provides an easy way to load
documents, images and audio clips into BLOB and CLOB columns. Look at this example:
Given the following table: 
CREATE TABLE
image_table (
        image_id   NUMBER(5),
        file_name  VARCHAR2(30),
        image_data BLOB);
Control File: 
LOAD DATA
INFILE *
INTO TABLE
image_table
REPLACE
FIELDS TERMINATED BY
','
(
 image_id  
INTEGER(5),
 file_name 
CHAR(30),
 image_data LOBFILE (file_name) TERMINATED BY
EOF
)
BEGINDATA
001,image1.gif
002,image2.jpg
The conventional path loader
essentially loads the data by using standard INSERT statements. The direct path
loader (DIRECT=TRUE) bypasses much of the logic involved with that, and loads
directly into the Oracle data files. More information about the restrictions of
direct path loading can be obtained from the Utilities Users Guide. 
 
No comments:
Post a Comment