
SQL*Loader Features

SQL*Loader loads data from external files into tables of an Oracle database. It has a powerful data parsing engine that puts little limitation on the format of the data in the datafile. You can use SQL*Loader to do the following:
- Load data across a network. This means that you can run the SQL*Loader client on a different system from the one that is running the SQL*Loader server.
- Load data from multiple datafiles during the same load session.
- Load data into multiple tables during the same load session.
- Specify the character set of the data.
- Selectively load data (you can load records based on the records' values).
- Manipulate the data before loading it, using SQL functions.
- Generate unique sequential key values in specified columns.
- Use the operating system's file system to access the datafiles.
- Load data from disk, tape, or named pipe.
- Generate sophisticated error reports, which greatly aid troubleshooting.
- Load arbitrarily complex object-relational data.
- Use secondary datafiles for loading LOBs and collections.
- Use either conventional or direct path loading. While conventional path loading is very flexible, direct path loading provides superior loading performance. See Chapter 11.

A typical SQL*Loader session takes as input a control file, which controls the behavior of SQL*Loader, and one or more datafiles. The output of SQL*Loader is an Oracle database (where the data is loaded), a log file, a bad file, and potentially, a discard file. An example of the flow of a SQL*Loader session is shown in Figure 6-1.

Figure 6-1 SQL*Loader Overview


SQL*Loader Parameters
SQL*Loader is invoked when you specify the sqlldr command and, optionally, parameters that establish session characteristics. In situations where you always use the same parameters for which the values seldom change, it can be more efficient to specify parameters using the following methods, rather than on the command line:
- Parameters can be grouped together in a parameter file. You could then specify the name of the parameter file on the command line using the PARFILE parameter.
- Certain parameters can also be specified within the SQL*Loader control file by using the OPTIONS clause.

Parameters specified on the command line override any parameter values specified in a parameter file or OPTIONS clause.

See Also:
- Chapter 7 for descriptions of the SQL*Loader parameters
- PARFILE (parameter file)
- OPTIONS Clause

SQL*Loader Control File


The control file is a text file written in a language that SQL*Loader understands. The control file tells SQL*Loader where to find the data, how to parse and interpret the data, where to insert the data, and more. Although not precisely defined, a control file can be said to have three sections. The first section contains session-wide information, for example:
- Global options such as bindsize, rows, records to skip, and so on
- INFILE clauses to specify where the input data is located
- Data to be loaded

The second section consists of one or more INTO TABLE blocks. Each of these blocks contains information about the table into which the data is to be loaded, such as the table name and the columns of the table.

The third section is optional and, if present, contains input data.

Some control file syntax considerations to keep in mind are:
- The syntax is free-format (statements can extend over multiple lines).
- It is case insensitive; however, strings enclosed in single or double quotation marks are taken literally, including case.

In control file syntax, comments extend from the two hyphens (--) that mark the beginning of the comment to the end of the line. The optional third section of the control file is interpreted as data rather than as control file syntax; consequently, comments in this section are not supported.

The keywords CONSTANT and ZONE have special meaning to SQL*Loader and are therefore reserved. To avoid potential conflicts, Oracle recommends that you do not use either CONSTANT or ZONE as a name for any tables or columns.

See Also: Chapter 8 for details about control file syntax and semantics

Input Data and Datafiles


SQL*Loader reads data from one or more files (or operating system equivalents of files) specified in the control file. From SQL*Loader's perspective, the data in the datafile is organized as records. A particular datafile can be in fixed record format, variable record format, or stream record format. The record format can be specified in the control file with the INFILE parameter. If no record format is specified, the default is stream record format.

Note: If data is specified inside the control file (that is, INFILE * was specified in the control file), then the data is interpreted in the stream record format with the default record terminator.

Fixed Record Format


A file is in fixed record format when all records in a datafile are the same byte length. Although this format is the least flexible, it results in better performance than variable or stream format. Fixed format is also simple to specify. For example:
INFILE datafile_name "fix n"

This example specifies that SQL*Loader should interpret the particular datafile as being in fixed record format where every record is n bytes long.

Example 6-1 shows a control file that specifies a datafile that should be interpreted in the fixed record format. The datafile in the example contains five physical records. Assuming that a period (.) indicates a space, the first physical record is [001,...cd,.] which is exactly eleven bytes (assuming a single-byte character set). The second record is [0002,fghi,\n] followed by the newline character (which is the eleventh byte), and so on. Note that newline characters are not required with the fixed record format.

Note that the length is always interpreted in bytes, even if character-length semantics are in effect for the file. This is necessary because the file could contain a mix of fields, some of which are processed with character-length semantics and others which are processed with byte-length semantics. See Character-Length Semantics.

Example 6-1 Loading Data in Fixed Record Format

load data
infile 'example.dat' "fix 11"
into table example
fields terminated by ',' optionally enclosed by '"'
(col1, col2)

example.dat:
001,   cd, 0002,fghi,
00003,lmn,
1, "pqrs",
0005,uvwx,
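To make the byte counting in Example 6-1 concrete, the following Python sketch cuts the same data into 11-byte physical records. It is only an illustration of the "fix n" idea, not how SQL*Loader is implemented; the function name split_fixed_records is hypothetical, and rejecting a trailing partial record is a choice made for this sketch.

```python
def split_fixed_records(data: bytes, record_length: int) -> list[bytes]:
    """Cut a byte string into physical records of exactly record_length bytes."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    # Sketch-only choice: refuse input that ends in a partial record.
    if len(data) % record_length != 0:
        raise ValueError("data length is not a multiple of the record length")
    return [data[i:i + record_length] for i in range(0, len(data), record_length)]


# Contents of example.dat from Example 6-1 (single-byte character set).
# The first record has no newline; the others end with one as the 11th byte.
EXAMPLE_DAT = (
    b"001,   cd, "
    b"0002,fghi,\n"
    b"00003,lmn,\n"
    b'1, "pqrs",\n'
    b"0005,uvwx,\n"
)

records = split_fixed_records(EXAMPLE_DAT, 11)
for record in records:
    print(record)
```

Because the split is purely positional, the newline characters are ordinary data bytes here, which is why they are optional in this format.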

Variable Record Format


A file is in variable record format when the length of each record, stored in a character field, is included at the beginning of that record in the datafile. This format provides some added flexibility over the fixed record format and a performance advantage over the stream record format. For example, you can specify a datafile that is to be interpreted as being in variable record format as follows:
INFILE "datafile_name" "var n"

In this example, n specifies the number of bytes in the record length field. If n is not specified, SQL*Loader assumes a length of 5 bytes. Specifying n larger than 40 will result in an error.

Example 6-2 shows a control file specification that tells SQL*Loader to look for data in the datafile example.dat and to expect variable record format where the record length fields are 3 bytes long. The example.dat datafile consists of three physical records. The first is specified to be 009 (that is, 9) bytes long, the second is 010 bytes long (that is, 10, including a 1-byte newline), and the third is 012 bytes long (also including a 1-byte newline). Note that newline characters are not required with the variable record format. This example also assumes a single-byte character set for the datafile.

The lengths are always interpreted in bytes, even if character-length semantics are in effect for the file. This is necessary because the file could contain a mix of fields, some processed with character-length semantics and others processed with byte-length semantics. See Character-Length Semantics.

Example 6-2 Loading Data in Variable Record Format

load data
infile 'example.dat' "var 3"
into table example
fields terminated by ',' optionally enclosed by '"'
(col1 char(5), col2 char(7))

example.dat:
009hello,cd,010world,im,
012my,name is,
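The length-prefix layout of Example 6-2 can be traced with a short Python sketch. It reads a 3-byte length field, takes that many bytes as the record, and repeats. This is an illustration under the example's assumptions (single-byte character set, decimal digits in the length field); the function name split_variable_records is hypothetical and the error handling is a choice made for the sketch.

```python
def split_variable_records(data: bytes, length_field_size: int = 5) -> list[bytes]:
    """Split data in which each record is preceded by a fixed-width length field."""
    records = []
    pos = 0
    while pos < len(data):
        length_field = data[pos:pos + length_field_size]
        if len(length_field) < length_field_size:
            raise ValueError("truncated length field")
        # The length field holds the record length in bytes as decimal characters.
        record_length = int(length_field.decode("ascii"))
        pos += length_field_size
        record = data[pos:pos + record_length]
        if len(record) < record_length:
            raise ValueError("truncated record")
        records.append(record)
        pos += record_length
    return records


# Contents of example.dat from Example 6-2; the newlines count toward
# the lengths of the second and third records.
EXAMPLE_DAT = b"009hello,cd,010world,im,\n012my,name is,\n"

records = split_variable_records(EXAMPLE_DAT, 3)
for record in records:
    print(record)
```

The default of 5 for length_field_size mirrors the default stated above for an omitted n.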

Stream Record Format


A file is in stream record format when the records are not specified by size; instead SQL*Loader forms records by scanning for the record terminator. Stream record format is the most flexible format, but there can be a negative effect on performance. The specification of a datafile to be interpreted as being in stream record format looks similar to the following:
INFILE datafile_name ["str terminator_string"]

The terminator_string is specified as either 'char_string' or X'hex_string' where:

- 'char_string' is a string of characters enclosed in single or double quotation marks
- X'hex_string' is a byte string in hexadecimal format

When the terminator_string contains special (nonprintable) characters, it should be specified as a X'hex_string'. However, some nonprintable characters can be specified as ('char_string') by using a backslash. For example:

- \n indicates a line feed
- \t indicates a horizontal tab
- \f indicates a form feed
- \v indicates a vertical tab
- \r indicates a carriage return

If the character set specified with the NLS_LANG parameter for your session is different from the character set of the datafile, character strings are converted to the character set of the datafile. This is done before SQL*Loader checks for the default record terminator. Hexadecimal strings are assumed to be in the character set of the datafile, so no conversion is performed.

On UNIX-based platforms, if no terminator_string is specified, SQL*Loader defaults to the line feed character, \n. On Windows NT, if no terminator_string is specified, then SQL*Loader uses either \n or \r\n as the record terminator, depending on which one it finds first in the datafile. This means that if you know that one or more records in your datafile has \n embedded in a field, but you want \r\n to be used as the record terminator, you must specify it.

Example 6-3 illustrates loading data in stream record format where the terminator string is specified using a character string, '|\n'. The use of the backslash character allows the character string to specify the nonprintable line feed character.

Example 6-3 Loading Data in Stream Record Format

load data
infile 'example.dat' "str '|\n'"
into table example
fields terminated by ',' optionally enclosed by '"'
(col1 char(5), col2 char(7))

example.dat:
hello,world,|
james,bond,|
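As a rough illustration of Example 6-3, the Python sketch below forms records by scanning for the terminator string '|\n' instead of counting bytes. The function name split_stream_records is hypothetical, and dropping the empty piece after the final terminator is a choice made for this sketch rather than documented SQL*Loader behavior.

```python
def split_stream_records(data: bytes, terminator: bytes = b"\n") -> list[bytes]:
    """Form records by scanning for a terminator byte string."""
    if not terminator:
        raise ValueError("terminator must not be empty")
    records = data.split(terminator)
    # A file that ends with the terminator yields an empty last piece; drop it.
    if records and records[-1] == b"":
        records.pop()
    return records


# Contents of example.dat from Example 6-3, terminated by '|' plus line feed.
EXAMPLE_DAT = b"hello,world,|\njames,bond,|\n"

records = split_stream_records(EXAMPLE_DAT, b"|\n")
for record in records:
    print(record)
```

Every byte has to be compared against the terminator, which hints at why stream format can be slower than the fixed and variable formats.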

Logical Records
SQL*Loader organizes the input data into physical records, according to the specified record format. By default a physical record is a logical record, but for added flexibility, SQL*Loader can be instructed to combine a number of physical records into a logical record. SQL*Loader can be instructed to follow one of the following logical record-forming strategies:
- Combine a fixed number of physical records to form each logical record.
- Combine physical records into logical records while a certain condition is true.

See Also:
- Assembling Logical Records from Physical Records
- Case study 4, Loading Combined Physical Records (see SQL*Loader Case Studies for information on how to access case studies)
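The first strategy, combining a fixed number of physical records, can be pictured with the following Python sketch. It joins every group of count physical records into one logical record. The function name and the sample records are hypothetical, and how SQL*Loader treats an incomplete final group is not covered here; the sketch simply keeps it.

```python
def combine_fixed_count(physical_records: list[str], count: int) -> list[str]:
    """Join each run of `count` physical records into one logical record."""
    if count <= 0:
        raise ValueError("count must be positive")
    return [
        "".join(physical_records[i:i + count])
        for i in range(0, len(physical_records), count)
    ]


# Hypothetical physical records: each logical record spans two of them.
physical = ["7782,CLARK,", "MANAGER", "7839,KING,", "PRESIDENT"]

logical = combine_fixed_count(physical, 2)
for record in logical:
    print(record)
```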

Data Fields
Once a logical record is formed, field setting on the logical record is done. Field setting is a process in which SQL*Loader uses control-file field specifications to determine which parts of logical record data correspond to which control-file fields. It is possible for two or more field specifications to claim the same data. Also, it is possible for a logical record to contain data that is not claimed by any control-file field specification. Most control-file field specifications claim a particular part of the logical record. This mapping takes the following forms:
- The byte position of the data field's beginning, end, or both, can be specified. This specification form is not the most flexible, but it provides high field-setting performance.
- The strings delimiting (enclosing and/or terminating) a particular data field can be specified. A delimited data field is assumed to start where the last data field ended, unless the byte position of the start of the data field is specified.
- The byte offset and/or the length of the data field can be specified. This way each field starts a specified number of bytes from where the last one ended and continues for a specified length.
- Length-value datatypes can be used. In this case, the first n number of bytes of the data field contain information about how long the rest of the data field is.

See Also:
- Specifying the Position of a Data Field
- Specifying Delimiters

Data Conversion and Datatype Specification


During a conventional path load, data fields in the datafile are converted into columns in the database (direct path loads are conceptually similar, but the implementation is different). There are two conversion steps:

1. SQL*Loader uses the field specifications in the control file to interpret the format of the datafile, parse the input data, and populate the bind arrays that correspond to a SQL INSERT statement using that data.
2. The Oracle database accepts the data and executes the INSERT statement to store the data in the database.

The Oracle database uses the datatype of the column to convert the data into its final, stored form. Keep in mind the distinction between a field in a datafile and a column in the database. Remember also that the field datatypes defined in a SQL*Loader control file are not the same as the column datatypes.

Discarded and Rejected Records


Records read from the input file might not be inserted into the database. Such records are placed in either a bad file or a discard file.

The Bad File


The bad file contains records that were rejected, either by SQL*Loader or by the Oracle database. If you do not specify a bad file and there are rejected records, then SQL*Loader automatically creates one. It will have the same name as the data file, with a .bad extension. Some of the possible reasons for rejection are discussed in the next sections.

SQL*Loader Rejects

Datafile records are rejected by SQL*Loader when the input format is invalid. For example, if the second enclosure delimiter is missing, or if a delimited field exceeds its maximum length, SQL*Loader rejects the record. Rejected records are placed in the bad file.

Oracle Database Rejects

After a datafile record is accepted for processing by SQL*Loader, it is sent to the Oracle database for insertion into a table as a row. If the Oracle database determines that the row is valid, then the row is inserted into the table. If the row is determined to be invalid, then the record is rejected and SQL*Loader puts it in the bad file. The row may be invalid, for example, because a key is not unique, because a required field is null, or because the field contains invalid data for the Oracle datatype.

See Also:
- Specifying the Bad File
- Case study 4, Loading Combined Physical Records (see SQL*Loader Case Studies for information on how to access case studies)

The Discard File


As SQL*Loader executes, it may create a file called the discard file. This file is created only when it is needed, and only if you have specified that a discard file should be enabled. The discard file contains records that were filtered out of the load because they did not match any record-selection criteria specified in the control file.

The discard file therefore contains records that were not inserted into any table in the database. You can specify the maximum number of such records that the discard file can accept. Data written to any database table is not written to the discard file.

See Also:
- Case study 4, Loading Combined Physical Records (see SQL*Loader Case Studies for information on how to access case studies)
- Specifying the Discard File
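The difference between the bad file and the discard file can be summarized as a small decision procedure. The Python sketch below is a conceptual model only: the three check functions, the sample records, and the order of the checks are assumptions made for illustration, not SQL*Loader internals.

```python
def route_record(record, is_well_formed, matches_selection, database_accepts):
    """Return where a record ends up: 'loaded', 'bad', or 'discard'."""
    if not is_well_formed(record):      # rejected by the loader: invalid input format
        return "bad"
    if not matches_selection(record):   # filtered out by the record-selection criteria
        return "discard"
    if not database_accepts(record):    # rejected by the database, e.g. duplicate key
        return "bad"
    return "loaded"


# Hypothetical checks for records of the form "key,name".
seen_keys = set()

def is_well_formed(record):
    return record.count(",") == 1

def matches_selection(record):
    return record.split(",")[1] != "skip"

def database_accepts(record):
    key = record.split(",")[0]
    if key in seen_keys:
        return False
    seen_keys.add(key)
    return True


sample = ["1,alice", "broken", "2,skip", "1,bob"]
results = [
    route_record(r, is_well_formed, matches_selection, database_accepts)
    for r in sample
]
print(results)
```

The point of the model is that both kinds of rejection land in the bad file, while the discard file only ever receives records that were never candidates for any table.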

Log File and Logging Information


When SQL*Loader begins execution, it creates a log file. If it cannot create a log file, execution terminates. The log file contains a detailed summary of the load, including a description of any errors that occurred during the load.

Conventional Path Loads, Direct Path Loads, and External Table Loads
SQL*Loader provides the following methods to load data:
- Conventional Path Loads
- Direct Path Loads
- External Table Loads

Conventional Path Loads


During conventional path loads, the input records are parsed according to the field specifications, and each data field is copied to its corresponding bind array. When the bind array is full (or no more data is left to read), an array insert is executed.

See Also:
- Data Loading Methods
- Bind Arrays and Conventional Path Loads

SQL*Loader stores LOB fields after a bind array insert is done. Thus, if there are any errors in processing the LOB field (for example, the LOBFILE could not be found), the LOB field is left empty. Note also that because LOB data is loaded after the array insert has been performed, BEFORE and AFTER row triggers may not work as expected for LOB columns. This is because the triggers fire before SQL*Loader has a chance to load the LOB contents into the column. For instance, suppose you are loading a LOB column, C1, with data and that you want a BEFORE row trigger to examine the contents of this LOB column and derive a value to be loaded for some other column, C2, based on its examination. This is not possible because the LOB contents will not have been loaded at the time the trigger fires.
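The bind array behavior described at the start of this section (fill the array, insert when it is full or when the input runs out) is an ordinary batching pattern. The following Python sketch shows that pattern in isolation; the function name load_in_batches and the array_insert callback are hypothetical stand-ins, and no database is involved.

```python
from typing import Callable, Iterable


def load_in_batches(
    rows: Iterable[tuple],
    bind_array_rows: int,
    array_insert: Callable[[list], None],
) -> int:
    """Buffer rows and call array_insert whenever the buffer is full.

    Returns the number of array inserts that were executed.
    """
    bind_array = []
    inserts = 0
    for row in rows:
        bind_array.append(row)
        if len(bind_array) == bind_array_rows:
            array_insert(bind_array)   # bind array is full
            inserts += 1
            bind_array = []
    if bind_array:
        array_insert(bind_array)       # no more data left to read
        inserts += 1
    return inserts


# Ten hypothetical rows with a 4-row bind array; record each batch as it is "inserted".
batches = []
insert_count = load_in_batches([(i,) for i in range(10)], 4, batches.append)
print(insert_count, [len(b) for b in batches])
```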

Direct Path Loads


A direct path load parses the input records according to the field specifications, converts the input field data to the column datatype, and builds a column array. The column array is passed to a block formatter, which creates data blocks in Oracle database block format. The newly formatted database blocks are written directly to the database, bypassing much of the data processing that normally takes place. Direct path load is much faster than conventional path load, but entails several restrictions.

See Also: Direct Path Load

Parallel Direct Path

A parallel direct path load allows multiple direct path load sessions to concurrently load the same data segments (allows intrasegment parallelism). Parallel direct path is more restrictive than direct path.

See Also: Parallel Data Loading Models

External Table Loads


An external table load creates an external table for data that is contained in a datafile. The load executes INSERT statements to insert the data from the datafile into the target table. The advantages of using external table loads over conventional path and direct path loads are as follows:
- An external table load attempts to load datafiles in parallel. If a datafile is big enough, it will attempt to load that file in parallel.
- An external table load allows modification of the data being loaded by using SQL functions and PL/SQL functions as part of the INSERT statement that is used to create the external table.

Note: An external table load is not supported using a named pipe on Windows NT.

See Also:
- Chapter 12, "External Tables Concepts"
- Chapter 13, "The ORACLE_LOADER Access Driver"

Choosing External Tables Versus SQL*Loader


The record parsing of external tables and SQL*Loader is very similar, so normally there is not a major performance difference for the same record format. However, due to the different architecture of external tables and SQL*Loader, there are situations in which one method is more appropriate than the other. In the following situations, use external tables for the best load performance:
- You want to transform the data as it is being loaded into the database.
- You want to use transparent parallel processing without having to split the external data first.

However, in the following situations, use SQL*Loader for the best load performance:
- You want to load data remotely.
- Transformations are not required on the data, and the data does not need to be loaded in parallel.

SQL*Loader Case Studies


SQL*Loader features are illustrated in a variety of case studies. The case studies are based upon the Oracle demonstration database tables, emp and dept, owned by scott/tiger. (In some case studies, additional columns have been added.) The case studies are numbered 1 through 11, starting with the simplest scenario and progressing in complexity. The following is a summary of the case studies:

- Case Study 1: Loading Variable-Length Data - Loads stream format records in which the fields are terminated by commas and may be enclosed by quotation marks. The data is found at the end of the control file.
- Case Study 2: Loading Fixed-Format Fields - Loads data from a separate datafile.
- Case Study 3: Loading a Delimited, Free-Format File - Loads data from stream format records with delimited fields and sequence numbers. The data is found at the end of the control file.
- Case Study 4: Loading Combined Physical Records - Combines multiple physical records into one logical record corresponding to one database row.
- Case Study 5: Loading Data into Multiple Tables - Loads data into multiple tables in one run.
- Case Study 6: Loading Data Using the Direct Path Load Method - Loads data using the direct path load method.
- Case Study 7: Extracting Data from a Formatted Report - Extracts data from a formatted report.
- Case Study 8: Loading Partitioned Tables - Loads partitioned tables.
- Case Study 9: Loading LOBFILEs (CLOBs) - Adds a CLOB column called resume to the table emp, uses a FILLER field (res_file), and loads multiple LOBFILEs into the emp table.
- Case Study 10: REF Fields and VARRAYs - Loads a customer table that has a primary key as its OID and stores order items in a VARRAY. Loads an order table that has a reference to the customer table and the order items in a VARRAY.
- Case Study 11: Loading Data in the Unicode Character Set - Loads data in the Unicode character set, UTF16, in little-endian byte order. This case study uses character-length semantics.

Case Study Files


Generally, each case study consists of the following types of files:

- Control files (for example, ulcase5.ctl)
- Datafiles (for example, ulcase5.dat)
- Setup files (for example, ulcase5.sql)

These files are installed when you install Oracle Database. They are located in the $ORACLE_HOME/rdbms/demo directory.

If the sample data for the case study is contained within the control file, then there will be no .dat file for that case.

Case study 2 does not require any special set up, so there is no .sql script for that case. Case study 7 requires that you run both a starting (setup) script and an ending (cleanup) script.

Table 6-1 lists the files associated with each case.

Table 6-1 Case Studies and Their Related Files

Case   .ctl           .dat           .sql
1      ulcase1.ctl    N/A            ulcase1.sql
2      ulcase2.ctl    ulcase2.dat    N/A
3      ulcase3.ctl    N/A            ulcase3.sql
4      ulcase4.ctl    ulcase4.dat    ulcase4.sql
5      ulcase5.ctl    ulcase5.dat    ulcase5.sql
6      ulcase6.ctl    ulcase6.dat    ulcase6.sql
7      ulcase7.ctl    ulcase7.dat    ulcase7s.sql, ulcase7e.sql
8      ulcase8.ctl    ulcase8.dat    ulcase8.sql
9      ulcase9.ctl    ulcase9.dat    ulcase9.sql
10     ulcase10.ctl   N/A            ulcase10.sql
11     ulcase11.ctl   ulcase11.dat   ulcase11.sql

Running the Case Studies


In general, you use the following steps to run the case studies (be sure you are in the $ORACLE_HOME/rdbms/demo directory, which is where the case study files are located):

1. Start SQL*Plus as scott/tiger by entering the following at the system prompt:

sqlplus scott/tiger

The SQL prompt is displayed.

2. At the SQL prompt, execute the SQL script for the case study. For example, to execute the SQL script for case study 1, enter the following:

SQL> @ulcase1

This prepares and populates tables for the case study and then returns you to the system prompt.

3. At the system prompt, invoke SQL*Loader and run the case study, as follows:

sqlldr USERID=scott/tiger CONTROL=ulcase1.ctl LOG=ulcase1.log

Substitute the appropriate control file name and log file name for the CONTROL and LOG parameters. Be sure to read the control file for any notes that are specific to the particular case study you are executing. For example, case study 6 requires that you add DIRECT=TRUE to the SQL*Loader command line.

Case Study Log Files


Log files for the case studies are not provided in the $ORACLE_HOME/rdbms/demo directory. This is because the log file for each case study is produced when you execute the case study, provided that you use the LOG parameter. If you do not wish to produce a log file, omit the LOG parameter from the command line.

Checking the Results of a Case Study


To check the results of running a case study, start SQL*Plus and perform a select operation from the table that was loaded in the case study. This is done as follows:

1. Start SQL*Plus as scott/tiger by entering the following at the system prompt:

sqlplus scott/tiger

The SQL prompt is displayed.

2. At the SQL prompt, use the SELECT statement to select all rows from the table that the case study loaded. For example, if the table emp was loaded, enter:

SQL> SELECT * FROM emp;

The contents of each row in the emp table will be displayed.

SQL*Loader - Loading data using OEM


In this tutorial you will learn about SQL*Loader input data and datafiles: fixed record format, variable record format, and stream record format.

SQL*Loader is useful when you need to load files in batch mode. SQL*Loader supports three different types of datafiles. You specify the record format, together with any additional parameters it requires, with the INFILE parameter.

Fixed Record Format: This format is useful when you have a data file with a fixed layout.

Variable Record Format: This format is used when the records in the data file have different lengths. You need to specify the record length at the beginning of each record. This format provides greater flexibility than fixed record format.

Stream Record Format: This format is used when the records are not of a fixed or specified size. Each record can be any length, and records are identified by the record terminator.

You can specify terminator_string in either character or hexadecimal format. A character string is enclosed in single or double quotes; hexadecimal format should be used when the terminator contains nonprintable characters such as line feed or tab. A few nonprintable characters can also be written in a character string by using a backslash. Note that the default terminator depends on the operating system you are using: on UNIX/Linux systems the default is the line feed character, \n, and Windows uses either \n or \r\n as the default record terminator. To avoid issues with character sets, check the NLS_LANG setting for your session. Also make sure that your record terminator does not appear inside the data records. Various other options are available; run sqlldr without arguments to see the parameters and their usage.

Now let us take a look at loading some sample data into a table using the command prompt and the OEM interface.

- Log in to OEM. (Note: If you followed the default installation with the starter database, the Oracle menu contains a link to Database Control for your database name.)
- Log in and select Data Movement.

Select the Load Data from user file option. (If you already have a control file, you can use it; otherwise select the "Automatically Generate Control File" option.)

In this tutorial we are going to select the first option, then use the generated control file to load the data using the command prompt.

Oracle needs access to the host, so enter the server login and password. If you prefer, check Save as Preferred Credentials; otherwise leave it unchecked.

Step 1 - Load Data: Data Files

Here is the sample file we are going to use.

FirstName1,LastName1,Address1,City1,Country1
FirstName2,LastName2,Address2,City2,Country2
FirstName3,LastName3,Address3,City3,Country3
FirstName4,LastName4,Address4,City4,Country4
Copyright exforsys.com

Step 2 - Load Data: Table and File Format

Enter the database name and the table name. If the table does not exist yet, select the Create new table option; otherwise just enter the name of the existing table.

CREATE TABLE customer
(First_Name char(50),
Last_Name char(50),
Address char(50),
City char(50),
Country char(25));

Step 3 - Character Delimiters


Here you can change the settings. In our case the field delimiter is a comma and the optional field enclosure is double quotes.

Verify the settings and click Next to continue with Step 4.

Step 4 - Load Data: Load Method

There are various methods you can use to load the data; the right one depends on your needs and on how much data you are loading. We will discuss these methods in detail later. For simplicity we are going to use the conventional path method.

Step 5 - Load Data: Options

If you would like rejected records to be written to a file, select the Bad file option and enter the path for the file to be generated. Keep in mind that all of these paths refer to the server, not your local PC.

Step 6 - Load Data: Schedule

If you would like to schedule the job to run at a later date, you can do so in this step; otherwise click Next.

Step 7 - Review

Verify the settings and submit the job.

Click on the Job link to see the status

Control file created

LOAD DATA
INFILE 'D:\APP\EXFORSYS\ORADATA\EXFORSYS\example1.dat' "STR '\r\n'"
APPEND
INTO TABLE customer
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(
FIRST_NAME CHAR,
LAST_NAME CHAR,
ADDRESS CHAR,
CITY CHAR,
COUNTRY CHAR
)

If you see an error during job submission, here are a few things you will need to verify.

Make sure the service is running.

You will also need to run the following command. Log on as SYSMAN and run:

execute MGMT_USER.MAKE_EM_USER(username);

Here username is the username that you are using to load the data.

After you complete the above, go back and repeat the step where you received the error.

Log in to SQL*Plus to remove the data we have loaded from OEM.

Launch Command Prompt and run:

sqlldr username/password@dbname control=commandload.ctl

Here is a copy of the control file used in the example.

LOAD DATA
INFILE 'E:\oracle\sqlloader\commandload.dat' "STR '\r\n'"
APPEND
INTO TABLE customer
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(
FIRST_NAME CHAR,
LAST_NAME CHAR,
ADDRESS CHAR,
CITY CHAR,
COUNTRY CHAR
)

Here is the log file generated from the above demo.

SQL*Loader: Release 11.2.0.1.0 - Production on Sun Mar 6 13:19:31 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

Control File:   commandload.ctl
Data File:      E:\oracle\sqlloader\commandload.dat
  File processing option string: "STR '\r\n'"
  Bad File:     commandload.bad
  Discard File: none specified

 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array:     64 rows, maximum of 256000 bytes
Continuation:   none specified
Path used:      Conventional

Table CUSTOMER, loaded from every logical record.
Insert option in effect for this table: APPEND

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
FIRST_NAME                          FIRST     *   ,  O(") CHARACTER
LAST_NAME                            NEXT     *   ,  O(") CHARACTER
ADDRESS                              NEXT     *   ,  O(") CHARACTER
CITY                                 NEXT     *   ,  O(") CHARACTER
COUNTRY                              NEXT     *   ,  O(") CHARACTER

Table CUSTOMER:
  4 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Space allocated for bind array:    82560 bytes(64 rows)
Read buffer bytes:               1048576

Total logical records skipped:     0
Total logical records read:        4
Total logical records rejected:    0
Total logical records discarded:   0

Run began on Sun Mar 06 13:19:31 2011
Run ended on Sun Mar 06 13:19:31 2011

Elapsed time was:     00:00:00.05
CPU time was:         00:00:00.03

Database Links
The central concept in distributed database systems is a database link. A database link is a connection between two physical database servers that allows a client to access them as one logical database. This section contains the following topics:
- What Are Database Links?
- Why Use Database Links?
- Global Database Names in Database Links
- Names for Database Links
- Types of Database Links
- Users of Database Links
- Creation of Database Links: Examples
- Schema Objects and Database Links
- Database Link Restrictions

What Are Database Links?


A database link is a pointer that defines a one-way communication path from an Oracle Database server to another database server. The link pointer is actually defined as an entry in a data dictionary table. To access the link, you must be connected to the local database that contains the data dictionary entry.

A database link connection is one-way in the sense that a client connected to local database A can use a link stored in database A to access information in remote database B, but users connected to database B cannot use the same link to access data in database A. If local users on database B want to access data on database A, then they must define a link that is stored in the data dictionary of database B.

A database link connection allows local users to access data on a remote database. For this connection to occur, each database in the distributed system must have a unique global database name in the network domain. The global database name uniquely identifies a database server in a distributed system.

Figure 29-3 shows an example of user scott accessing the emp table on the remote database with the global name hq.acme.com:

Figure 29-3 Database Link

Database links are either private or public. If they are private, then only the user who created the link has access; if they are public, then all database users have access. One principal difference among database links is the way that connections to a remote database occur. Users access a remote database through the following types of links:
Type of Link: Connected user link
Description: Users connect as themselves, which means that they must have an account on the remote database with the same username and password as their account on the local database.

Type of Link: Fixed user link
Description: Users connect using the username and password referenced in the link. For example, if Jane uses a fixed user link that connects to the hq database with the username and password scott/tiger, then she connects as scott. Jane has all the privileges in hq granted to scott directly, and all the default roles that scott has been granted in the hq database.

Type of Link: Current user link
Description: A user connects as a global user. A local user can connect as a global user in the context of a stored procedure, without storing the global user's password in a link definition. For example, Jane can access a procedure that Scott wrote, accessing Scott's account and Scott's schema on the hq database. Current user links are an aspect of Oracle Advanced Security.

Create database links using the CREATE DATABASE LINK statement. After a link is created, you can use it to specify schema objects in SQL statements.

See Also:
- Oracle Database SQL Language Reference for syntax of the CREATE DATABASE LINK statement
- Oracle Database Advanced Security Administrator's Guide for information about Oracle Advanced Security

What Are Shared Database Links?


A shared database link is a link between a local server process and the remote database. The link is shared because multiple client processes can use the same link simultaneously. When a local database is connected to a remote database through a database link, either database can run in dedicated or shared server mode. The following table illustrates the possibilities:
Local Database Mode    Remote Database Mode
Dedicated              Dedicated
Dedicated              Shared server
Shared server          Dedicated
Shared server          Shared server

A shared database link can exist in any of these four configurations. Shared links differ from standard database links in the following ways:
- Different users accessing the same schema object through a database link can share a network connection.
- When a user needs to establish a connection to a remote server from a particular server process, the process can reuse connections already established to the remote server. The reuse of the connection can occur if the connection was established on the same server process with the same database link, possibly in a different session. In a non-shared database link, a connection is not shared across multiple sessions.

When you use a shared database link in a shared server configuration, a network connection is established directly out of the shared server process in the local server. For a non-shared database link on a local shared server, this connection would have been established through the local dispatcher, requiring context switches for the local dispatcher, and requiring data to go through the dispatcher.

See Also:
Oracle Database Net Services Administrator's Guide for information about shared server

Why Use Database Links?


The great advantage of database links is that they allow users to access another user's objects in a remote database so that they are bounded by the privilege set of the object owner. In other words, a local user can access a link to a remote database without having to be a user on the remote database.

For example, assume that employees submit expense reports to Accounts Payable (A/P), and further suppose that a user using an A/P application needs to retrieve information about employees from the hq database. The A/P users should be able to connect to the hq database and execute a stored procedure in the remote hq database that retrieves the desired information. The A/P users should not need to be hq database users to do their jobs; they should only be able to access hq information in a controlled way as limited by the procedure.

See Also:
- "Users of Database Links" for an explanation of database link users
- "Viewing Information About Database Links" for an explanation of how to hide passwords from non-administrative users

Global Database Names in Database Links


To understand how a database link works, you must first understand what a global database name is. Each database in a distributed database is uniquely identified by its global database name. The database forms a global database name by prefixing the database network domain, specified by the DB_DOMAIN initialization parameter at database creation, with the individual database name, specified by the DB_NAME initialization parameter.

For example, Figure 29-4 illustrates a representative hierarchical arrangement of databases throughout a network.

Figure 29-4 Hierarchical Arrangement of Networked Databases

The name of a database is formed by starting at the leaf of the tree and following a path to the root. For example, the mfg database is in division3 of the acme_tools branch of the com domain. The global database name for mfg is created by concatenating the nodes in the tree as follows:
mfg.division3.acme_tools.com

While several databases can share an individual name, each database must have a unique global database name. For example, the network domains us.americas.acme_auto.com and uk.europe.acme_auto.com each contain a sales database. The global database naming system distinguishes the sales database in the americas division from the sales database in the europe division as follows:
sales.us.americas.acme_auto.com
sales.uk.europe.acme_auto.com
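The naming rule can be sketched in a few lines of Python. The helper names below are hypothetical and exist only for illustration; they simply concatenate the values that DB_NAME and DB_DOMAIN would hold, or the nodes on the path from a leaf of the hierarchy to its root.

```python
def global_database_name(db_name: str, db_domain: str) -> str:
    """Prefix the network domain (DB_DOMAIN) with the database name (DB_NAME)."""
    return f"{db_name}.{db_domain}" if db_domain else db_name


def name_from_path(leaf_to_root: list[str]) -> str:
    """Join the tree nodes, starting at the leaf and ending at the root."""
    return ".".join(leaf_to_root)


mfg = name_from_path(["mfg", "division3", "acme_tools", "com"])
sales_us = global_database_name("sales", "us.americas.acme_auto.com")
sales_uk = global_database_name("sales", "uk.europe.acme_auto.com")

print(mfg)
print(sales_us)
print(sales_uk)
```

Both sales databases share the individual name sales, but the resulting global names differ because their domains differ.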

See Also:
"Managing Global Names in a Distributed System" to learn how to specify and change global database names

Names for Database Links

Typically, a database link has the same name as the global database name of the remote database that it references. For example, if the global database name of a database is sales.us.oracle.com, then the database link is also called sales.us.oracle.com.

When you set the initialization parameter GLOBAL_NAMES to TRUE, the database ensures that the name of the database link is the same as the global database name of the remote database. For example, if the global database name for hq is hq.acme.com, and GLOBAL_NAMES is TRUE, then the link name must be called hq.acme.com. Note that the database checks the domain part of the global database name as stored in the data dictionary, not the DB_DOMAIN setting in the initialization parameter file (see "Changing the Domain in a Global Database Name").

If you set the initialization parameter GLOBAL_NAMES to FALSE, then you are not required to use global naming. You can then name the database link whatever you want. For example, you can name a database link to hq.acme.com as foo.

Note:
Oracle recommends that you use global naming because many useful features, including Replication, require global naming.
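The effect of GLOBAL_NAMES on link naming can be summarised as a simple predicate. This is a simplified model, not Oracle's implementation: the function name is hypothetical, and the case-insensitive comparison is an assumption made for the sketch.

```python
def link_name_allowed(link_name: str, remote_global_name: str,
                      global_names: bool) -> bool:
    """Return True if the link name is acceptable under the GLOBAL_NAMES setting."""
    if not global_names:
        # GLOBAL_NAMES = FALSE: any link name is acceptable.
        return True
    # GLOBAL_NAMES = TRUE: the link name must match the remote global name.
    # Case-insensitive comparison is an assumption of this sketch.
    return link_name.lower() == remote_global_name.lower()


print(link_name_allowed("hq.acme.com", "hq.acme.com", global_names=True))
print(link_name_allowed("foo", "hq.acme.com", global_names=True))
print(link_name_allowed("foo", "hq.acme.com", global_names=False))
```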

After you have enabled global naming, database links are essentially transparent to users of a distributed database because the name of a database link is the same as the global name of the database to which the link points. For example, the following statement creates a database link in the local database to remote database sales:
CREATE PUBLIC DATABASE LINK sales.division3.acme.com USING 'sales1';

See Also:
Oracle Database Reference for more information about specifying the initialization parameter GLOBAL_NAMES

Types of Database Links


Oracle Database lets you create private, public, and global database links. These basic link types differ according to which users are allowed access to the remote database:
Type: Private
Owner: User who created the link. View ownership data through DBA_DB_LINKS, ALL_DB_LINKS, and USER_DB_LINKS.
Description: Creates link in a specific schema of the local database. Only the owner of a private database link or PL/SQL subprograms in the schema can use this link to access database objects in the corresponding remote database.

Type: Public
Owner: User called PUBLIC. View ownership data through views shown for private database links.
Description: Creates a database-wide link. All users and PL/SQL subprograms in the database can use the link to access database objects in the corresponding remote database.

Type: Global
Owner: User called PUBLIC. View ownership data through views shown for private database links.
Description: Creates a network-wide link. When an Oracle network uses a directory server, the directory server automatically creates and manages global database links (as net service names) for every Oracle Database in the network. Users and PL/SQL subprograms in any database can use a global link to access objects in the corresponding remote database.

Note: In earlier releases of Oracle Database, a global database link referred to a database link that was registered with an Oracle Names server. The use of an Oracle Names server has been deprecated. In this document, global database links refer to the use of net service names from the directory server.

Determining the type of database links to employ in a distributed database depends on the specific requirements of the applications using the system. Consider these features when making your choice:
Type of Link: Private database link
Features: This link is more secure than a public or global link, because only the owner of the private link, or subprograms within the same schema, can use the link to access the remote database.

Type of Link: Public database link
Features: When many users require an access path to a remote Oracle Database, you can create a single public database link for all users in a database.

Type of Link: Global database link
Features: When an Oracle network uses a directory server, an administrator can conveniently manage global database links for all databases in the system. Database link management is centralized and simple.

See Also:
- "Specifying Link Types" to learn how to create different types of database links
- "Viewing Information About Database Links" to learn how to access information about links

Users of Database Links


When creating the link, you determine which user should connect to the remote database to access the data. The following table explains the differences among the categories of users involved in database links:
User Type: Connected user
Description: A local user accessing a database link in which no fixed username and password have been specified. If SYSTEM accesses a public link in a query, then the connected user is SYSTEM, and the database connects to the SYSTEM schema in the remote database.
Note: A connected user does not have to be the user who created the link, but is any user who is accessing the link.
Sample Link Creation Syntax:
CREATE PUBLIC DATABASE LINK hq USING 'hq';

User Type: Current user
Description: A global user in a CURRENT_USER database link. The global user must be authenticated by an X.509 certificate (an SSL-authenticated enterprise user) or a password (a password-authenticated enterprise user), and be a user on both databases involved in the link. Current user links are an aspect of the Oracle Advanced Security option.
See Oracle Database Advanced Security Administrator's Guide for information about global security
Sample Link Creation Syntax:
CREATE PUBLIC DATABASE LINK hq CONNECT TO CURRENT_USER using 'hq';

User Type: Fixed user
Description: A user whose username/password is part of the link definition. If a link includes a fixed user, the fixed user's username and password are used to connect to the remote database.
Sample Link Creation Syntax:
CREATE PUBLIC DATABASE LINK hq CONNECT TO jane IDENTIFIED BY doe USING 'hq';

See Also:
"Specifying Link Users" to learn how to specify users when creating links

Connected User Database Links


Connected user links have no connect string associated with them. The advantage of a connected user link is that a user referencing the link connects to the remote database as the same user, and credentials don't have to be stored in the link definition in the data dictionary.

Connected user links have some disadvantages. Because these links require users to have accounts and privileges on the remote databases to which they are attempting to connect, they require more privilege administration for administrators. Also, giving users more privileges than they need violates the fundamental security concept of least privilege: users should only be given the privileges they need to perform their jobs.

The ability to use a connected user database link depends on several factors, chief among them whether the user is authenticated by the database using a password, or externally authenticated by the operating system or a network authentication service. If the user is externally authenticated, then the ability to use a connected user link also depends on whether the remote database accepts remote authentication of users, which is set by the REMOTE_OS_AUTHENT initialization parameter.

The REMOTE_OS_AUTHENT parameter operates as follows:

REMOTE_OS_AUTHENT Value: TRUE for the remote database
Consequences: An externally-authenticated user can connect to the remote database using a connected user database link.

REMOTE_OS_AUTHENT Value: FALSE for the remote database
Consequences: An externally-authenticated user cannot connect to the remote database using a connected user database link unless a secure protocol or a network authentication service supported by the Oracle Advanced Security option is used.

Note:
The REMOTE_OS_AUTHENT initialization parameter is deprecated. It is retained for backward compatibility only.

Fixed User Database Links


A benefit of a fixed user link is that it connects a user in a primary database to a remote database with the security context of the user specified in the connect string. For example, local user joe can create a public database link in joe's schema that specifies the fixed user scott with password tiger. If jane uses the fixed user link in a query, then jane is the user on the local database, but she connects to the remote database as scott/tiger. Fixed user links have a username and password associated with the connect string. The username and password are stored with other link information in data dictionary tables.

Current User Database Links


Current user database links make use of a global user. A global user must be authenticated by an X.509 certificate or a password, and be a user on both databases involved in the link.

The user invoking the CURRENT_USER link does not have to be a global user. For example, if jane is authenticated (not as a global user) by password to the Accounts Payable database, she can access a stored procedure to retrieve data from the hq database. The procedure uses a current user database link, which connects her to hq as global user scott. User scott is a global user and authenticated through a certificate over SSL, but jane is not.

Note that current user database links have these consequences:

- If the current user database link is not accessed from within a stored object, then the current user is the same as the connected user accessing the link. For example, if scott issues a SELECT statement through a current user link, then the current user is scott.
- When executing a stored object such as a procedure, view, or trigger that accesses a database link, the current user is the user that owns the stored object, and not the user that calls the object. For example, if jane calls procedure scott.p (created by scott), and a current user link appears within the called procedure, then scott is the current user of the link.
- If the stored object is an invoker-rights function, procedure, or package, then the invoker's authorization ID is used to connect as a remote user. For example, if user jane calls procedure scott.p (an invoker-rights procedure created by scott), and the link appears inside procedure scott.p, then jane is the current user.
- You cannot connect to a database as an enterprise user and then use a current user link in a stored procedure that exists in a shared, global schema. For example, if user jane accesses a stored procedure in the shared schema guest on database hq, she cannot use a current user link in this schema to log on to a remote database.

See Also:
- "Distributed Database Security" for more information about security issues relating to database links
- Oracle Database Advanced Security Administrator's Guide
- Oracle Database PL/SQL Language Reference for more information about invoker-rights functions, procedures, or packages.

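The first three rules for current user links amount to a small decision procedure, sketched here in Python. The StoredObject class and the current_user function are hypothetical names introduced for illustration; the sketch ignores the shared-schema restriction and models only who the link connects as.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredObject:
    """A procedure, view, or trigger that references a current user link."""
    owner: str
    invoker_rights: bool = False  # definer rights unless stated otherwise


def current_user(session_user: str,
                 stored_object: Optional[StoredObject] = None) -> str:
    """Return the user that a CURRENT_USER link connects as."""
    if stored_object is None:
        # Link used directly in a statement: the connected user.
        return session_user
    if stored_object.invoker_rights:
        # Invoker-rights unit: the caller's authorization ID is used.
        return session_user
    # Definer-rights stored object: the owner of the object.
    return stored_object.owner


direct = current_user("scott")
definer = current_user("jane", StoredObject(owner="scott"))
invoker = current_user("jane", StoredObject(owner="scott", invoker_rights=True))
print(direct, definer, invoker)
```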
Creation of Database Links: Examples


Create database links using the CREATE DATABASE LINK statement. The table gives examples of SQL statements that create database links in a local database to the remote sales.us.americas.acme_auto.com database:
SQL Statement:
CREATE DATABASE LINK sales.us.americas.acme_auto.com USING 'sales_us';
Connects To Database: sales using net service name sales_us
Connects As: Connected user
Link Type: Private connected user

SQL Statement:
CREATE DATABASE LINK foo CONNECT TO CURRENT_USER USING 'am_sls';
Connects To Database: sales using service name am_sls
Connects As: Current global user
Link Type: Private current user

SQL Statement:
CREATE DATABASE LINK sales.us.americas.acme_auto.com CONNECT TO scott IDENTIFIED BY tiger USING 'sales_us';
Connects To Database: sales using net service name sales_us
Connects As: scott using password tiger
Link Type: Private fixed user

SQL Statement:
CREATE PUBLIC DATABASE LINK sales CONNECT TO scott IDENTIFIED BY tiger USING 'rev';
Connects To Database: sales using net service name rev
Connects As: scott using password tiger
Link Type: Public fixed user

SQL Statement:
CREATE SHARED PUBLIC DATABASE LINK sales.us.americas.acme_auto.com CONNECT TO scott IDENTIFIED BY tiger AUTHENTICATED BY anupam IDENTIFIED BY bhide USING 'sales';
Connects To Database: sales using net service name sales
Connects As: scott using password tiger, authenticated as anupam using password bhide
Link Type: Shared public fixed user

See Also:
- "Creating Database Links" to learn how to create links
- Oracle Database SQL Language Reference for information about the CREATE DATABASE LINK statement syntax

Schema Objects and Database Links


After you have created a database link, you can execute SQL statements that access objects on the remote database. For example, to access remote object emp using database link foo, you can issue:
SELECT * FROM emp@foo;

You must also be authorized in the remote database to access specific remote objects. Constructing properly formed object names using database links is an essential aspect of data manipulation in distributed systems.

Naming of Schema Objects Using Database Links

Oracle Database uses the global database name to name the schema objects globally using the following scheme:
schema.schema_object@global_database_name

where:
- schema is a collection of logical structures of data, or schema objects. A schema is owned by a database user and has the same name as that user. Each user owns a single schema.
- schema_object is a logical data structure like a table, index, view, synonym, procedure, package, or a database link.
- global_database_name is the name that uniquely identifies a remote database. This name must be the same as the concatenation of the remote database initialization parameters DB_NAME and DB_DOMAIN, unless the parameter GLOBAL_NAMES is set to FALSE, in which case any name is acceptable.

For example, using a database link to database sales.division3.acme.com, a user or application can reference remote data as follows:
SELECT * FROM scott.emp@sales.division3.acme.com;     -- emp table in scott's schema
SELECT loc FROM scott.dept@sales.division3.acme.com;

If GLOBAL_NAMES is set to FALSE, then you can use any name for the link to sales.division3.acme.com. For example, you can call the link foo. Then, you can access the remote database as follows:
SELECT name FROM scott.emp@foo;     -- link name different from global name
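The schema.schema_object@global_database_name scheme can be illustrated with a small Python parser. This is a sketch only: the RemoteObjectName type and parse_object_name function are hypothetical, and quoted identifiers are not handled.

```python
from typing import NamedTuple, Optional


class RemoteObjectName(NamedTuple):
    schema: Optional[str]
    schema_object: str
    link: Optional[str]


def parse_object_name(name: str) -> RemoteObjectName:
    """Split [schema.]schema_object[@link] into its parts."""
    # Everything after the first '@' is the database link name.
    local_part, _, link = name.partition("@")
    # The part before the first '.' is the schema, if one is given.
    schema, dot, obj = local_part.partition(".")
    if not dot:
        return RemoteObjectName(None, local_part, link or None)
    return RemoteObjectName(schema, obj, link or None)


full = parse_object_name("scott.emp@sales.division3.acme.com")
short = parse_object_name("scott.emp@foo")
local = parse_object_name("emp")
print(full)
print(short)
print(local)
```

The link part is kept whole because a global database name itself contains dots; only the portion before the @ sign is split into schema and object.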

Authorization for Accessing Remote Schema Objects


To access a remote schema object, you must be granted access to the remote object in the remote database. Further, to perform any updates, inserts, or deletes on the remote object, you must be granted the SELECT privilege on the object, along with the UPDATE, INSERT, or DELETE privilege. Unlike when accessing a local object, the SELECT privilege is necessary for accessing a remote object because the database has no remote describe capability. The database must do a SELECT * on the remote object in order to determine its structure.

Synonyms for Schema Objects


Oracle Database lets you create synonyms so that you can hide the database link name from the user. A synonym allows access to a table on a remote database using the same syntax that you would use to access a table on a local database. For example, assume you issue the following query against a table in a remote database:
SELECT * FROM emp@hq.acme.com;

You can create the synonym emp for emp@hq.acme.com so that you can issue the following query instead to access the same data:

SELECT * FROM emp;

See Also:
"Using Synonyms to Create Location Transparency" to learn how to create synonyms for objects specified using database links

Schema Object Name Resolution


To resolve application references to schema objects (a process called name resolution), the database forms object names hierarchically. For example, the database guarantees that each schema within a database has a unique name, and that within a schema each object has a unique name. As a result, a schema object name is always unique within the database. Furthermore, the database resolves application references to the local name of the object. In a distributed database, a schema object such as a table is accessible to all applications in the system. The database extends the hierarchical naming model with global database names to effectively create global object names and resolve references to the schema objects in a distributed database system. For example, a query can reference a remote table by specifying its fully qualified name, including the database in which it resides. For example, assume that you connect to the local database as user SYSTEM:
CONNECT SYSTEM@sales1

You then issue the following statements using database link hq.acme.com to access objects in the scott and jane schemas on remote database hq:
SELECT * FROM scott.emp@hq.acme.com;

INSERT INTO jane.accounts@hq.acme.com (acc_no, acc_name, balance)
  VALUES (5001, 'BOWER', 2000);

UPDATE jane.accounts@hq.acme.com
  SET balance = balance + 500;

DELETE FROM jane.accounts@hq.acme.com
  WHERE acc_name = 'BOWER';

Database Link Restrictions


You cannot perform the following operations using database links:
- Grant privileges on remote objects
- Execute DESCRIBE operations on some remote objects. The following remote objects, however, do support DESCRIBE operations:
  o Tables
  o Views
  o Procedures
  o Functions
- Analyze remote objects
- Define or enforce referential integrity
- Grant roles to users in a remote database
- Obtain nondefault roles on a remote database. For example, if jane connects to the local database and executes a stored procedure that uses a fixed user link connecting as scott, jane receives scott's default roles on the remote database. Jane cannot issue SET ROLE to obtain a nondefault role.
- Execute hash query joins that use shared server connections
- Use a current user link without authentication through SSL, password, or NT native authentication

104. Describe the two phases of two-phase commit.
Prepare phase - The global coordinator (the initiating node) asks the participants to prepare, that is, to promise to commit or roll back the transaction even if there is a failure.
Commit phase - If all participants respond to the coordinator that they are prepared, the coordinator asks all nodes to commit the transaction. If any participant cannot prepare, the coordinator asks all nodes to roll back the transaction.
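The two phases can be illustrated with a toy coordinator in Python. This is a conceptual sketch, not Oracle's distributed transaction mechanism: the Participant class and two_phase_commit function are hypothetical, and failure recovery, logging, and timeouts are left out.

```python
class Participant:
    """A node taking part in a distributed transaction."""

    def __init__(self, name: str, can_prepare: bool = True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "active"

    def prepare(self) -> bool:
        # Promise to commit or roll back later, if able to.
        self.state = "prepared" if self.can_prepare else "failed"
        return self.can_prepare

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled back"


def two_phase_commit(participants: list[Participant]) -> str:
    """Run the prepare phase, then commit or roll back every node."""
    # Phase 1: ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if all participants are prepared.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled back"


ok_nodes = [Participant("hq"), Participant("sales")]
bad_nodes = [Participant("hq"), Participant("sales", can_prepare=False)]
ok_result = two_phase_commit(ok_nodes)
bad_result = two_phase_commit(bad_nodes)
print(ok_result, bad_result)
```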

Autonomous Transactions
Autonomous transactions allow you to leave the context of the calling transaction, perform an independent transaction, and return to the calling transaction without affecting its state. The autonomous transaction has no link to the calling transaction, so only committed data can be shared by both transactions. The following types of PL/SQL blocks can be defined as autonomous transactions:

- Stored procedures and functions.
- Local procedures and functions defined in a PL/SQL declaration block.
- Packaged procedures and functions.
- Type methods.
- Top-level anonymous blocks.

The easiest way to understand autonomous transactions is to see them in action. To do this, we create a test table and populate it with two rows. Notice that the data is not committed.

CREATE TABLE at_test (
  id           NUMBER       NOT NULL,
  description  VARCHAR2(50) NOT NULL
);

INSERT INTO at_test (id, description) VALUES (1, 'Description for 1');
INSERT INTO at_test (id, description) VALUES (2, 'Description for 2');

SELECT * FROM at_test;

        ID DESCRIPTION
---------- --------------------------------------------------
         1 Description for 1
         2 Description for 2

2 rows selected.

SQL>

Next, we insert another 8 rows using an anonymous block declared as an autonomous transaction, which contains a commit statement.
DECLARE
  PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  FOR i IN 3 .. 10 LOOP
    INSERT INTO at_test (id, description)
    VALUES (i, 'Description for ' || i);
  END LOOP;
  COMMIT;
END;
/

PL/SQL procedure successfully completed.

SELECT * FROM at_test;

        ID DESCRIPTION
---------- --------------------------------------------------
         1 Description for 1
         2 Description for 2
         3 Description for 3
         4 Description for 4
         5 Description for 5
         6 Description for 6
         7 Description for 7
         8 Description for 8
         9 Description for 9
        10 Description for 10

10 rows selected.

SQL>

As expected, we now have 10 rows in the table. If we now issue a rollback statement we get the following result.
ROLLBACK;
SELECT * FROM at_test;

        ID DESCRIPTION
---------- --------------------------------------------------
         3 Description for 3
         4 Description for 4
         5 Description for 5
         6 Description for 6
         7 Description for 7
         8 Description for 8
         9 Description for 9
        10 Description for 10

8 rows selected.

SQL>

The 2 rows inserted by our current session (transaction) have been rolled back, while the rows inserted by the autonomous transaction remain. The presence of the PRAGMA AUTONOMOUS_TRANSACTION compiler directive made the anonymous block run in its own transaction, so the internal commit statement did not affect the calling session. As a result, the rollback was still able to undo the DML issued by the calling session.

Autonomous transactions are commonly used by error logging routines, where the error messages must be preserved regardless of the commit/rollback status of the transaction. For example, the following table holds basic error messages.
CREATE TABLE error_logs (
  id             NUMBER(10)     NOT NULL,
  log_timestamp  TIMESTAMP      NOT NULL,
  error_message  VARCHAR2(4000),
  CONSTRAINT error_logs_pk PRIMARY KEY (id)
);

CREATE SEQUENCE error_logs_seq;

We define a procedure to log error messages as an autonomous transaction.

CREATE OR REPLACE PROCEDURE log_errors (p_error_message IN VARCHAR2) AS
  PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  INSERT INTO error_logs (id, log_timestamp, error_message)
  VALUES (error_logs_seq.NEXTVAL, SYSTIMESTAMP, p_error_message);
  COMMIT;
END;
/

The following code forces an error, which is trapped and logged.


BEGIN
  INSERT INTO at_test (id, description)
  VALUES (998, 'Description for 998');

  -- Force invalid insert.
  INSERT INTO at_test (id, description)
  VALUES (999, NULL);
EXCEPTION
  WHEN OTHERS THEN
    log_errors (p_error_message => SQLERRM);
    ROLLBACK;
END;
/

PL/SQL procedure successfully completed.

SELECT * FROM at_test WHERE id >= 998;

no rows selected

SELECT * FROM error_logs;

        ID LOG_TIMESTAMP
---------- ---------------------------------------------------------------------------
ERROR_MESSAGE
----------------------------------------------------------------------------------------------------
         1 28-FEB-2006 11:10:10.107625
ORA-01400: cannot insert NULL into ("TIM_HALL"."AT_TEST"."DESCRIPTION")

1 row selected.

SQL>

From this we can see that the LOG_ERRORS transaction was separate from the anonymous block. If it weren't, we would expect the first insert in the anonymous block to be preserved by the commit statement in the LOG_ERRORS procedure.

Be careful how you use autonomous transactions. If they are used indiscriminately they can lead to deadlocks, and cause confusion when analyzing session trace. To hammer this point home, here's a quote from Tom Kyte posted on my blog:

"... in 999 times out of 1000, if you find yourself "forced" to use an autonomous transaction it likely means you have a serious data integrity issue you haven't thought about.

Where do people try to use them?


- in that trigger that calls a procedure that commits (not an error logging routine). Ouch, that has to hurt when you rollback.
- in that trigger that is getting the mutating table constraint. Ouch, that hurts *even more*

Error logging - OK. Almost everything else - not OK."
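The at_test walkthrough earlier in this section can be mimicked with a toy model in Python. This is a conceptual sketch only: the Table and Transaction classes are hypothetical, there is no locking or isolation, and each transaction simply keeps its own list of pending rows until it commits or rolls back.

```python
class Table:
    """Holds only the rows that have been committed."""

    def __init__(self):
        self.committed = []


class Transaction:
    """Keeps uncommitted rows separate from every other transaction."""

    def __init__(self, table: Table):
        self.table = table
        self.pending = []

    def insert(self, row: int) -> None:
        self.pending.append(row)

    def commit(self) -> None:
        self.table.committed.extend(self.pending)
        self.pending = []

    def rollback(self) -> None:
        self.pending = []

    def visible_rows(self) -> list[int]:
        # A transaction sees committed rows plus its own pending rows.
        return sorted(self.table.committed + self.pending)


at_test = Table()

# Calling transaction inserts rows 1 and 2 without committing.
caller = Transaction(at_test)
caller.insert(1)
caller.insert(2)

# The "autonomous" block is an independent transaction with its own commit.
autonomous = Transaction(at_test)
for i in range(3, 11):
    autonomous.insert(i)
autonomous.commit()

rows_before_rollback = caller.visible_rows()
caller.rollback()
rows_after_rollback = caller.visible_rows()

print(rows_before_rollback)
print(rows_after_rollback)
```

The commit in the second transaction persists only its own rows, so the later rollback in the caller still discards rows 1 and 2, matching the 10-row and 8-row results shown in the walkthrough.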