
Knowledge Modules (KMs), components of Oracle Data Integrator's Open Connector Technology, are
generic, highly reusable code templates that define the overall data integration process.
Each Knowledge Module contains the knowledge required by ODI to perform a specific set of actions or
tasks against a specific technology or set of technologies, such as connecting to the technology,
extracting data from it, transforming the data, checking it, and integrating it.

ODI Uses Six Different Types of Knowledge Modules

1. RKMs (Reverse-engineering Knowledge Modules) are used to perform customized reverse-engineering of
data models for a specific technology. An RKM extracts metadata from a metadata provider into the ODI
repository. RKMs are used in data models. A data model corresponds to a group of tabular data
structures stored in a data server, is based on a Logical Schema defined in the topology, and
contains only metadata.
2. LKMs (Loading Knowledge Modules) are used to extract data from heterogeneous source
systems (files, middleware, databases, etc.) into a staging area. LKMs are used in
interfaces. An interface consists of a set of rules that define the loading of a datastore or a
temporary target structure from one or more source datastores.
3. JKMs (Journalizing Knowledge Modules) are used to create a journal of data modifications
(insert, update and delete) in the source databases to keep track of changes. JKMs are used
in data models and support Changed Data Capture.
4. IKMs (Integration Knowledge Modules) are used to integrate (load) data from the staging
area into target tables. IKMs are used in interfaces.
5. CKMs (Check Knowledge Modules) are used to check data consistency, i.e., that constraints on the
sources and targets are not violated. CKMs are used in data models' static checks and in
interfaces' flow checks. A static check refers to constraints or rules defined in the data model to
verify the integrity of source or application data. A flow check refers to declarative rules defined in
an interface to verify an application's incoming data before loading it into target tables.
6. SKMs (Service Knowledge Modules) are used to generate the code required for data services.
SKMs are used in data models. Data services are specialized web services that enable
access to application data in datastores, and to the changes captured for these datastores
using Changed Data Capture.
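As a quick summary, the six KM types and where they plug in can be captured in a small lookup table. This is a hypothetical sketch in Python; the wording is illustrative and these are not ODI API objects:

```python
# Illustrative summary of the six ODI Knowledge Module types described above.
# The "purpose" and "used_in" strings are shorthand, not ODI terminology.
KM_TYPES = {
    "RKM": {"purpose": "reverse-engineer metadata", "used_in": "data models"},
    "LKM": {"purpose": "load source data to staging (C$)", "used_in": "interfaces"},
    "JKM": {"purpose": "journalize changes (CDC)", "used_in": "data models"},
    "IKM": {"purpose": "integrate staging data into target", "used_in": "interfaces"},
    "CKM": {"purpose": "check constraints (static/flow)", "used_in": "models and interfaces"},
    "SKM": {"purpose": "generate data service code", "used_in": "data models"},
}

def km_for_task(keyword: str) -> str:
    """Return the first KM type whose purpose mentions the given keyword."""
    for km, info in KM_TYPES.items():
        if keyword in info["purpose"]:
            return km
    raise KeyError(keyword)
```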
Loading Knowledge Module (LKM)

The LKM is used to load data from a source datastore into staging tables. This loading comes into play
when transformations take place in the staging area and the source datastore is on a different data
server than the staging area. The LKM is not required when all source datastores reside on the same
data server as the staging area.
An interface consists of a set of declarative rules that define the loading of a datastore or a temporary
target structure from one or more source datastores. The LKM executes the declarative rules on the
source server and retrieves a single result set that it stores in a "C$" table in the staging area using the
defined loading method. An interface may require several LKMs when it uses datastores from
heterogeneous sources.
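The single-result-set behavior described above can be sketched as a generator of the staging SQL. This is a hypothetical illustration; ODI generates such code from the KM template, and all names here are made up:

```python
# Sketch of how an LKM might stage a source result set into a "C$" table.
# Table and column names are illustrative, not actual ODI-generated code.
def lkm_staging_sql(src_table, columns, where=None):
    """Return the (create, insert) statement pair for a C$ loading table."""
    c_table = f"C${src_table}"
    col_list = ", ".join(columns)
    create = f"CREATE TABLE {c_table} ({col_list})"  # column types omitted in this sketch
    select = f"SELECT {col_list} FROM {src_table}"
    if where:
        select += f" WHERE {where}"                  # source-side filter, applied before staging
    insert = f"INSERT INTO {c_table} ({col_list}) {select}"
    return create, insert

create_sql, insert_sql = lkm_staging_sql("ORDERS", ["ORDER_ID", "AMOUNT"], "AMOUNT > 0")
```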

LKM Loading Methods Are Classified as Follows

1. Loading Using the Run-time Agent - This is a standard Java connectivity method (JDBC,
JMS, etc.). The agent reads data from the source using a JDBC connector and writes it to the
staging table using JDBC. This method is not suitable for loading large data volumes, as it reads
from the source row by row into an array and writes to staging row by row in batches.
2. Loading File Using Loaders - This method leverages the most efficient loading utility
available for the staging area technology (e.g., Oracle's SQL*Loader, Microsoft SQL Server
bcp, Teradata FastLoad or MultiLoad) when the interface uses a flat file as a source.
3. Loading Using Unload/Load - This is an alternative to the run-time agent when dealing with
large volumes of data across heterogeneous sources. Data is extracted from the sources into a flat
file, and the file is then loaded into the staging table.
4. Loading Using RDBMS-Specific Strategies - This method leverages RDBMS mechanisms for
data transfer across servers (e.g., Oracle's database links, Microsoft SQL Server's linked servers, etc.).
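The trade-offs between the four methods can be sketched as a simple chooser. This is an illustrative heuristic only; in ODI the choice of LKM is made by the interface designer, and the thresholds here are invented:

```python
# Illustrative heuristic for picking among the four LKM loading methods above.
# The one-million-row threshold is an arbitrary example, not an ODI rule.
def choose_loading_method(source_type, row_count, same_vendor=False):
    if source_type == "file":
        return "loader utility"           # e.g. SQL*Loader, bcp, FastLoad/MultiLoad
    if same_vendor:
        return "rdbms-specific strategy"  # e.g. database links, linked servers
    if row_count > 1_000_000:
        return "unload/load"              # dump to flat file, then bulk load to staging
    return "run-time agent"               # JDBC row-by-row, fine for small sets
```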
A Typical LKM Loading Process Works in the Following Way

1. The loading process drops the temporary loading table "C$" (if it exists) and then creates the loading
table in the staging area. The loading table represents a source set, i.e., the images of the columns that
take part in the transformations, not the source datastore itself. This can be explained with a few examples:
 If only a few columns from a source table are used in mappings and joins on the staging area,
then the loading table contains images of only those columns. Source columns that are not
required in the rest of the integration flow will not appear in the loading table.

 If a source column is used only as a filter constraint to filter out certain rows and is not used
afterward in the interface, then the loading table will not include this column.

 If two tables are joined in the source and the resulting source set is used in transformations in the
staging area, then the loading table contains the combined columns from both tables.
 If all the columns of a source datastore are mapped in the interface and this datastore is not joined
in the source, then the loading table is an exact image of the source datastore. This is the case,
for example, when a file is used as a source.
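The column-pruning rules in the examples above amount to keeping only the columns that the rest of the integration flow still needs; a minimal sketch:

```python
# Sketch of the column-pruning rules above: the C$ loading table carries only
# columns needed downstream (mappings and joins on the staging area), not
# columns used solely at the source. Column names are made up.
def loading_table_columns(source_columns, used_downstream):
    """Return the subset of source columns the loading table must carry."""
    return [c for c in source_columns if c in used_downstream]

# STATUS is only a source-side filter and NOTES is unused, so neither is staged.
cols = loading_table_columns(["ID", "NAME", "STATUS", "NOTES"], {"ID", "NAME"})
```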

2. Data is loaded from the source datastores into the loading table using the appropriate LKM
loading method (run-time agent, RDBMS-specific strategy, etc.).

3. Data from the loading table is then used in the integration phase to load the integration table.

4. After the integration phase, and before the interface completes, the temporary loading table is dropped.
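The four steps above can be written out as the sequence of SQL actions a loading task might emit. This is illustrative; actual LKM-generated code also handles data types, loader options and error cases:

```python
# The four-step LKM loading cycle, as a sequence of illustrative SQL actions.
def lkm_steps(src, c_table=None):
    c_table = c_table or f"C${src}"
    return [
        f"DROP TABLE {c_table}",                                     # step 1: drop if it exists
        f"CREATE TABLE {c_table} AS SELECT * FROM {src} WHERE 1=0",  # step 1: create empty loading table
        f"INSERT INTO {c_table} SELECT * FROM {src}",                # step 2: load the source set
        # step 3: the integration phase reads C$ (performed by the IKM)
        f"DROP TABLE {c_table}",                                     # step 4: cleanup
    ]

steps = lkm_steps("ORDERS")
```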

LKM Naming Convention

LKM <source technology> to <target technology> [(loading method)]

Oracle Data Integrator provides a large number of Loading Knowledge Modules out of the box. The list of
supported LKMs can be found in ODI Studio and in the installation directory <ODI
Home>\oracledi\xml-reference.
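The naming convention is regular enough to parse mechanically; a small hypothetical parser:

```python
import re

# Hypothetical parser for the LKM naming convention:
# "LKM <source technology> to <target technology> [(loading method)]".
def parse_lkm_name(name):
    m = re.fullmatch(r"LKM (.+) to (.+?)(?: \((.+)\))?", name)
    if not m:
        raise ValueError(f"not an LKM name: {name}")
    source, target, method = m.groups()  # method is None when omitted
    return {"source": source, "target": target, "method": method}

info = parse_lkm_name("LKM File to Oracle (EXTERNAL TABLE)")
```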

Below are examples of a few LKMs:

LKM File to SQL - Loads data from an ASCII or EBCDIC file to any ISO-92 compliant database.

LKM File to MSSQL (BULK) - Loads data from a file to a Microsoft SQL Server database using the BULK INSERT SQL statement.

LKM File to Oracle (EXTERNAL TABLE) - Loads data from a file to an Oracle staging area using the EXTERNAL TABLE SQL command.

LKM MSSQL to MSSQL (LINKED SERVERS) - Loads data from a Microsoft SQL Server database to a Microsoft SQL Server database using the linked servers mechanism.

LKM MSSQL to Oracle (BCP SQLLDR) - Loads data from a Microsoft SQL Server database to an Oracle database (staging area) using the BCP and SQL*Loader utilities.

LKM Oracle BI to Oracle (DBLINK) - Loads data from any Oracle BI physical layer to an Oracle target database using a database link.

LKM Oracle to Oracle (datapump) - Loads data from an Oracle source database to an Oracle staging area database using external tables in the datapump format.

Integration Knowledge Module (IKM)

The IKM takes place in the interface during the integration process. It integrates data into the target
datastore either from the source (when the source datastore is on the same data server as the staging
area) or from the loading table (the "C$" table loaded by the LKM when the source datastore is on a data
server separate from the staging area). Depending on the selected integration mode, data may be
inserted, updated, or used to capture slowly changing dimensions.

ODI Supports the Integration Modes Below

 Append - Rows are appended to the target table. Existing records are not updated. It is possible to
delete all rows before performing the insert by setting the optional truncate property.
 Control Append - Performs the same operation as Append, but in addition the data flow can be
checked by setting the flow control property. Flow control checks data quality to ensure that all
references are validated before loading into the target.
 Incremental Update - Used for performing inserts and updates. Existing rows are updated and
non-existent rows are inserted using the Natural Key defined in the interface, along with
flow control checking.
 Slowly Changing Dimension - Used to maintain Type 2 SCD history for slowly changing attributes.
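The difference between the modes is easiest to see in the shape of the SQL each one leads to; a rough sketch (real IKMs generate much more, including I$ staging, flow control and SCD flagging, and the ID key and VAL column here are made up):

```python
# Rough sketch of the SQL shape produced by two of the integration modes above.
# Table, key and column names are invented for illustration.
def integration_sql(mode, target, truncate=False):
    stmts = []
    if mode == "append":
        if truncate:                      # optional truncate property
            stmts.append(f"TRUNCATE TABLE {target}")
        stmts.append(f"INSERT INTO {target} SELECT * FROM I${target}")
    elif mode == "incremental update":    # update existing rows, insert new ones by natural key
        stmts.append(
            f"MERGE INTO {target} t USING I${target} s ON (t.ID = s.ID) "
            "WHEN MATCHED THEN UPDATE SET t.VAL = s.VAL "
            "WHEN NOT MATCHED THEN INSERT (t.ID, t.VAL) VALUES (s.ID, s.VAL)"
        )
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return stmts
```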

IKM Integration Process Works in Two Ways


1. When staging is on the same server as the target.
2. When staging is on a different server than the target (also referred to as multi-technology IKMs).
1. When Staging is on the Same Data Server as the Target

This configuration is useful for performing complex integration strategies, recycling rejected records from
previous runs, and implementing technology-specific optimized integration methods before loading data
into the target.

Typical Flow Process

1. The IKM executes a single set-oriented SQL statement that applies the staging area and
target declarative rules to all "C$" tables and local source tables to generate the
result set.
2. The IKM then writes the result set directly into the target table (in the case of the "Append"
integration mode) or into an integration table "I$" (in the case of more complex integration
modes, e.g., Incremental Update or SCD) before loading it into the target. The integration
table, or flow table, is an image of the target table with a few extra fields required to carry out
specific operations on the data before loading it into the target. Data in this table is flagged for
insert or update, transformed, and checked against constraints using the CKM; erroneous
rows are loaded into the "E$" table and removed from the "I$" table.
3. The IKM loads the records from the "I$" table into the target table using the defined
integration mode (Control Append, Incremental Update, etc.).
4. After data loading completes, the IKM drops the temporary integration tables.
5. The IKM can optionally call the CKM to check the consistency of the target datastore.
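The CKM-style check in step 2, which splits flow rows between the "I$" and "E$" tables, can be sketched as a simple partition (the constraint is a plain predicate here; real CKM checks run as SQL against the declared constraints):

```python
# Sketch of the flow check: rows satisfying the constraint stay in the I$ flow
# table; violating rows go to the E$ error table. Field names are made up.
def check_flow(rows, constraint):
    i_table, e_table = [], []
    for row in rows:
        (i_table if constraint(row) else e_table).append(row)
    return i_table, e_table

valid, errors = check_flow(
    [{"id": 1, "amount": 10}, {"id": 2, "amount": -5}],
    lambda r: r["amount"] >= 0,   # e.g. a CHECK-style constraint on AMOUNT
)
```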
The IKM can also be configured to recycle rejected records from previous runs from the error table "E$"
into the integration table "I$" by setting the RECYCLE_ERRORS property in the interface before calling
the CKM. This is useful, for example, when a fact or transaction row references the INTEGRATION_ID of
a dimension that did not exist during a previous run of the interface but is available in the current run. The
error record then becomes valid and needs to be reapplied to the target.
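The dimension-reference example can be sketched as follows (field names are invented; real recycling happens in SQL between the E$ and I$ tables):

```python
# Sketch of RECYCLE_ERRORS: a fact row rejected earlier because its dimension
# key was missing becomes valid once the key exists, so it rejoins the flow.
def recycle_errors(flow_rows, error_rows, dimension_keys):
    recycled = [r for r in error_rows if r["dim_id"] in dimension_keys]
    remaining = [r for r in error_rows if r["dim_id"] not in dimension_keys]
    return flow_rows + recycled, remaining

flow, errs = recycle_errors(
    flow_rows=[{"fact": "A", "dim_id": 1}],
    error_rows=[{"fact": "B", "dim_id": 2}],  # rejected in a previous run
    dimension_keys={1, 2},                    # dim_id 2 now exists in the dimension
)
```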
2. When Staging is on a Different Data Server Than Target

This configuration is mainly used for data servers with no transformation capabilities, where only simple
integration modes are possible (e.g., server to file). CKM operations cannot be performed in this
strategy.

Typical Flow Process

1. The IKM executes a single set-oriented SQL statement that applies the staging area and
target declarative rules to all "C$" tables and local source tables to generate the
result set.
2. The IKM then writes the result set directly into the target table using the defined integration
mode (append or incremental update).
IKM Naming Convention

IKM [<staging technology>] <target technology> [<integration mode>] [(<integration method>)]

The list of supported IKMs can be found in ODI Studio and in the installation directory <ODI
Home>\oracledi\xml-reference.

Below Are Examples of a Few IKMs

IKM Oracle Incremental Update (MERGE) - Integrates data into an Oracle target table in incremental update mode using the MERGE DML statement. Erroneous data can be isolated into an error table and recycled in the next execution of the interface. When this module is used with a journalized source table, it is possible to synchronize deletions.

IKM Oracle Incremental Update - Integrates data into an Oracle target table in incremental update mode. Erroneous data can be isolated into an error table and recycled in the next execution of the interface. When this module is used with a journalized source table, it is possible to synchronize deletions.

IKM Oracle Slowly Changing Dimension - Integrates data into an Oracle target table by maintaining SCD Type 2 history. Erroneous data can be isolated into an error table and recycled in the next execution of the interface.

IKM Oracle Multi Table Insert - Integrates data from one source into one or more Oracle target tables in append mode, using a multi-table insert statement.
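As an illustration of the last module, a multi-table insert statement of the kind it produces can be sketched like this (table and column names are invented, and real generated code maps columns per target rather than reusing one list):

```python
# Illustrative generator for an Oracle multi-table INSERT ALL statement of the
# kind "IKM Oracle Multi Table Insert" produces. All names are made up.
def multi_table_insert(source, targets):
    clauses = " ".join(
        f"INTO {t} ({', '.join(cols)}) VALUES ({', '.join(cols)})"
        for t, cols in targets.items()
    )
    return f"INSERT ALL {clauses} SELECT * FROM {source}"

sql = multi_table_insert("C$SALES", {"SALES_US": ["ID", "AMT"], "SALES_EU": ["ID", "AMT"]})
```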