
InfoSphere Change Data Capture for Oracle: Configurations

This document discusses the different Oracle configurations supported by InfoSphere Change Data Capture. It covers the Real Application Clusters (RAC) and Automatic Storage Management (ASM) options available with Oracle, as well as the remote log reading and remote apply options available with InfoSphere CDC. It provides a step-by-step approach to the specific tasks that need to be performed while installing and configuring InfoSphere CDC.

RAC
Considerations
One of the main reasons to use a RAC environment is to enable failover capabilities. To incorporate InfoSphere CDC into the same concept, the configuration must satisfy two criteria:

- Ability to restart InfoSphere CDC from a different location; that is, the InfoSphere CDC binaries, configuration files, and operational metadata must be accessible from both locations.
- Ability for external clients or processes to connect to InfoSphere CDC (for example, for subscriptions targeting the failed InfoSphere CDC instance).

With this in mind, here are some best practices for installing InfoSphere CDC in a RAC environment:

- InfoSphere CDC can be installed on a node of the cluster or outside of the cluster. Installing outside the cluster is preferable, to facilitate failover scenarios.
- InfoSphere CDC should be installed on the mount point of a SAN or NFS share.
- InfoSphere CDC must have access to all archived and online redo logs generated by all nodes of the RAC.

Steps
1. Install InfoSphere CDC on a node inside or outside of the Oracle RAC, but on the mount point of a SAN.
2. Create an entry in /etc/hosts on each node involved in the configuration, using a common host_name and pointing to that node's IP address. This way, regardless of the node on which InfoSphere CDC is started, the host_name will be the same and the configuration won't be affected. For example, say we have NODE A and NODE B in the RAC:
/etc/hosts on NODE A:
<IP address of NODE A> cdc_host
/etc/hosts on NODE B:
<IP address of NODE B> cdc_host
3. A similar approach must be taken for the tnsnames.ora configuration. Create a tnsnames.ora entry on each node of the RAC environment, using the host_name from the previous step and pointing to that RAC node's SID. Continuing the previous example:
On NODE A:
SID_CDC=
 (DESCRIPTION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=cdc_host)(PORT=1521))
  (CONNECT_DATA=(server=DEDICATED)
   (SERVICE_NAME=SID_A)
  )
 )
On NODE B:
SID_CDC=
 (DESCRIPTION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=cdc_host)(PORT=1521))
  (CONNECT_DATA=(server=DEDICATED)
   (SERVICE_NAME=SID_B)
  )
 )
To verify that the entries are properly defined, check connectivity to the database from the command line by running sqlplus user/pass@SID_CDC.
4. While creating the CDC instance, the two items related specifically to RAC are:
ORACLE_HOME: full path of the ORACLE_HOME.
SERVICE NAME: the tnsnames.ora entry name you created in step 3.
NOTE: If the Oracle RAC uses ASM, then also refer to the details under the ASM configuration.
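Steps 2 and 3 above can be scripted per node. The sketch below generates the /etc/hosts line and the tnsnames.ora entry for one node; the IP address and SID are illustrative placeholders, to be substituted with the real values for each node:

```shell
#!/bin/sh
# Generate the /etc/hosts line and tnsnames.ora entry for one RAC node.
# node_ip and node_sid are illustrative; substitute the real values per node.
node_ip="192.0.2.10"
node_sid="SID_A"

# /etc/hosts format is "<IP> <hostname>"; the same cdc_host name is used on
# every node so the InfoSphere CDC configuration is node-independent.
hosts_line="${node_ip} cdc_host"

tns_entry="SID_CDC=
 (DESCRIPTION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=cdc_host)(PORT=1521))
  (CONNECT_DATA=(server=DEDICATED)
   (SERVICE_NAME=${node_sid})
  )
 )"

echo "$hosts_line"
printf '%s\n' "$tns_entry"
```

Append the generated lines to /etc/hosts and tnsnames.ora on each node, then run the sqlplus verification above.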

ASM
Considerations
There are special considerations when installing InfoSphere CDC against an Oracle database where the logs are managed by Automatic Storage Management (ASM):

- InfoSphere CDC requires an Oracle ASM connection (user name and password) with sysdba privileges.
- InfoSphere CDC requires physical access to the underlying raw block device storage.

InfoSphere CDC can be configured to connect to ASM locally or remotely (in RAC and non-RAC scenarios). The configuration is slightly different in those scenarios:
Connecting to ASM locally
1. Install InfoSphere CDC on a server that can access ASM locally.
2. Create an entry in /etc/hosts on each node involved in the configuration, using a common host_name and pointing to the corresponding IP address.
3. A similar approach has to be taken for the tnsnames.ora configuration. Create two tnsnames.ora entries using the host_name from the previous step.
One for the database, pointing to the proper local SID (note that if RAC is being used, SID_DB should correspond to the SID of one node, not the global one):
SID_CDC=
 (DESCRIPTION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=cdc_host)(PORT=1521))
  (CONNECT_DATA=(server=DEDICATED)
   (SERVICE_NAME=SID_DB)
  )
 )
And one for the ASM:
SID_ASM=
(DESCRIPTION=
(ADDRESS=(PROTOCOL=TCP)(HOST=cdc_host)(PORT=1521))
(CONNECT_DATA=(server=DEDICATED)
(SERVICE_NAME=SID_ASM)
)
)
To verify that the entries are properly defined, check connectivity from the command line by running sqlplus user/pass@SID_CDC and sqlplus user/pass@SID_ASM.

4. While creating the CDC instance, ensure the following items are entered correctly:
ORACLE_HOME: full path of the ORACLE_HOME.
SERVICE NAME: the tnsnames.ora entry name you created in step 3.
ASM ORACLE_PATH: full path of the ASM ORACLE_HOME.
ASM USER: user with sysdba privileges.
ASM PASSWORD: password for the ASM USER.
5. MIRROR_ASM_ORCL_PATH
If ASM uses ASMLib (a Linux library set produced by Oracle) to manage the raw devices on a Linux platform, then this parameter is needed to tell CDC where to locate the path to the ASM disks. Normally this value is available from V$ASM_DISK, but with ASMLib the PATH column in V$ASM_DISK has the prefix "ORCL:", and CDC needs to know what path "ORCL" represents. To get the value, run the "oracleasm status" command; the output will look something like this:
oracleasm status
Checking if ASM is loaded: yes
Checking if /dev/oracleasm is mounted: yes
The second line shows where ORCL is mounted (/dev/oracleasm); append the word "disks" to it. With the sample shown above, mirror_asm_orcl_path would be set to "/dev/oracleasm/disks".
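The derivation above can be sketched in shell: parse the mount point out of the oracleasm status output and append "disks". The status text is hard-coded here so the sketch is self-contained; on a real system you would capture it from the oracleasm command directly:

```shell
#!/bin/sh
# Simulated `oracleasm status` output; on a real system use:
#   status_output="$(oracleasm status)"
status_output="Checking if ASM is loaded: yes
Checking if /dev/oracleasm is mounted: yes"

# The mount point is the third field of the "is mounted" line; append "disks"
# to obtain the value for mirror_asm_orcl_path.
mount_point=$(printf '%s\n' "$status_output" | awk '/is mounted/ {print $3}')
mirror_asm_orcl_path="${mount_point}/disks"
echo "$mirror_asm_orcl_path"
```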
Connecting to ASM remotely
The steps here are the same as above, with two additional considerations:

- InfoSphere CDC must be able to access all disks managed by the Oracle ASM instance.
- The ASM devices must be mounted on the machine where InfoSphere CDC is installed with exactly the same names as appear in the PATH column of the V$ASM_DISK view.

1. Install InfoSphere CDC on a server that can access the Oracle ASM instance remotely.
2. Create an entry in /etc/hosts for each node involved in the configuration, using a common host_name and pointing to the corresponding IP address.
3. A similar approach has to be taken for the tnsnames.ora configuration. In this setting we need to create two tnsnames.ora entries:

One for the database (note that if RAC is being used, then SID_DB should correspond to the SID of one node, not the global one):
SID_CDC=
(DESCRIPTION=
(ADDRESS=(PROTOCOL=TCP)(HOST=cdc_host)(PORT=1521))
(CONNECT_DATA=(server=DEDICATED)
(SERVICE_NAME=SID_DB)
)
)
And one for the ASM:
SID_ASM=
(DESCRIPTION=
(ADDRESS=(PROTOCOL=TCP)(HOST=cdc_host)(PORT=1521))
(CONNECT_DATA=(server=DEDICATED)
(SERVICE_NAME=SID_ASM)
)
)
To verify that the entries are properly defined, check connectivity from the command line by running sqlplus user/pass@SID_CDC and sqlplus user/pass@SID_ASM.
4. While creating the CDC instance, ensure the following items are configured correctly:
ORACLE_HOME: full path of the ORACLE_HOME where the tnsnames.ora containing the connection strings can be found.
SERVICE NAME: the tnsnames.ora entry name you created in step 3 for the database.
MIRROR_ASM_ORCL_PATH: full path to the disks if using ASMLib (see the previous section for a full description).
ASM ORACLE_PATH: full path of the ASM ORACLE_HOME where the tnsnames.ora containing the connection strings can be found.
ASM USER: user with sysdba privileges.
ASM PASSWORD: password for the ASM user.

Archive only mode


Considerations
Online logs are a source of disk contention because both Oracle and InfoSphere CDC can access them simultaneously, so the operating system must share the underlying device between the reader (CDC) and the writer (Oracle). If the physical disks cannot sustain enough I/O operations per second, both products will be constrained: Oracle will find that writes queue and take longer to complete, and CDC will stall on disk reads. If this is problematic in your environment, InfoSphere CDC can be configured to read only archive logs. Be aware that this will affect latency.
Steps
1. Set the oracle_archive_logs_only system parameter to true.

Read-only source support


Considerations

Use this configuration when the user is not allowed write access to the source database.
To configure this feature you must:
- Create a read-only user in the Oracle source database.
- Enable table-level supplemental logging for the tables that will be replicated, before installing and configuring InfoSphere CDC.
Steps
1. Create a read-only user.
2. Ensure that the DBMS_FLASHBACK package is installed. (This step is no longer necessary if you are using version 6.5.2 IF2 or later.)
3. Enable table-level supplemental logging.
4. While configuring the CDC instance, select the read-only database option.
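As a sketch of steps 1 and 3, the sqlplus session below creates a hypothetical read-only user and enables table-level supplemental logging for one table to be replicated. The user name, password, table name, and the exact set of grants are illustrative assumptions; consult the InfoSphere CDC documentation for the privileges your version requires:

```shell
sqlplus / as sysdba <<'SQL'
-- Hypothetical read-only replication user (name and password are placeholders)
CREATE USER cdc_ro IDENTIFIED BY change_me;
GRANT CREATE SESSION TO cdc_ro;
GRANT SELECT ANY TABLE TO cdc_ro;
GRANT SELECT ANY DICTIONARY TO cdc_ro;
-- Needed for step 2 on versions before 6.5.2 IF2
GRANT EXECUTE ON SYS.DBMS_FLASHBACK TO cdc_ro;

-- Step 3: table-level supplemental logging for a table to be replicated
-- (repeat for every replicated table; APP.ORDERS is illustrative)
ALTER TABLE app.orders ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
SQL
```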

Remote log reading


Considerations
Use this configuration when:
- InfoSphere CDC has no direct access to the Oracle online redo log files and archived log files because it is installed on a separate server.
- The source server is overloaded and cannot accommodate the additional resources InfoSphere CDC requires.

This configuration can work either in regular mode (online and archived logs) or in archive-only mode.
Steps
1. Install InfoSphere CDC on a server different from the database server.
2. Create an entry in /etc/hosts for the database server pointing to the corresponding IP address, for example:
<IP address of database host> db_host
3. Create a tnsnames.ora entry pointing to the remote database, for example:
SID_CDC=
 (DESCRIPTION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=db_host)(PORT=1521))
  (CONNECT_DATA=(server=DEDICATED)
   (SERVICE_NAME=SID)
  )
 )
To verify that the entry is properly defined, check connectivity to the database from the command line by running sqlplus user/pass@SID_CDC.
4. While creating the CDC instance, ensure the following fields are set correctly:
ORACLE_HOME: full path of the Oracle client on the local box where tnsnames.ora can be found.
SERVICE NAME: the tnsnames.ora entry name you created in step 3.

5. If the location where the logs (online and archived) are available on the local host does not match the location on the database server, then the system parameter oracle_archive_dir can be set to the full directory path of the local location.

Log shipping
The log shipping feature was made available in the CDC for Oracle product in version 6.3.1. You can
configure InfoSphere CDC for Oracle databases to use copies of complete Oracle archive logs that
are shipped to a secondary system that is accessible to InfoSphere CDC.
There are two main reasons for using this feature:
Avoid overloading the source server moving the processing to another server.
Avoid log retention dependencies on the source system
If you decide to ship your logs, InfoSphere CDC latency is affected by the Oracle log switch
frequency and the amount of time required to physically ship the logs to the remote destination.
Latency will always be at least as high as the amount of time taken by Oracle to create an archive
log and may increase if the log switch interval and the log shipping time increase.

[Diagram: Log shipping environment. The production Oracle database machine generates archive logs; a log shipping process transfers the files to the InfoSphere CDC machine, where CDC reads the copied archive logs.]

To use this feature, InfoSphere CDC for Oracle databases must be configured to use only Oracle archive logs. Be aware that this configuration will affect the latency of the overall system. You can ship your database logs with:
- Oracle Data Guard log transport services
- A customized log shipping process that you develop and maintain.
You can only use one method; InfoSphere CDC cannot be run with a hybrid of both. Also note that the endianness of the server where InfoSphere CDC is installed must match the endianness of the system where the logs were created.

Prerequisites and Requirements

- The Oracle system parameter log_archive_format must contain %t (thread id) and %s (sequence number) in order to identify which logs are associated with each node. CDC will fail with an error if those fields are not specified.
- The endianness of the CDC server and of the origin of the archive logs must be the same; i.e., CDC installed on a little-endian machine cannot read logs that are shipped from a big-endian machine.
- You must use either manual log shipping or Data Guard log shipping. CDC cannot be run with a hybrid of both.
- If Data Guard log shipping is used:
  o The specified oracle_archive_destination_id must reference a STANDBY database.
  o The log naming format specified by log_archive_format must be identical between the primary and standby databases.

Setup Instructions
1) Stop all CDC replication activities from the CDC remote source.
2) Set up the InfoSphere CDC for Oracle system parameters:

System Parameter: oracle_archive_logs_only = true
Description: Use this system parameter to indicate that InfoSphere CDC will only use Oracle archive logs, not online redo logs.

System Parameter: oracle_log_shipping = true
Description: Use this system parameter to indicate whether or not InfoSphere CDC replication processes will use copies of complete Oracle archive logs that are shipped to a secondary system that is accessible to InfoSphere CDC.

System Parameter: oracle_archive_dir = <directory>
Description: Use this system parameter to specify the fully qualified path to the local directory (on the machine where InfoSphere CDC is installed) to which the logs are shipped. For example, you can specify the following value: /archivelog/mycdcsystem/
Note: for RAC environments, InfoSphere CDC will read the archive logs from subdirectories that are named using the thread number; for example, /archivelog/mycdcsystem/1 for thread 1 and /archivelog/mycdcsystem/2 for thread 2. In a two-node RAC environment, ship the archive logs for node 1 to /archivelog/mycdcsystem/1 and, similarly, ship the archive logs for node 2 to /archivelog/mycdcsystem/2. If oracle_archive_dir is specified when log shipping is disabled, CDC will read logs from the first available destination id.

System Parameter: oracle_log_path_userexit = <path>
Description: By default the product assumes that the archive logs will be located in the following:
Script based: <ORACLE_ARCHIVE_DIR>/<threadId>/
Data Guard: <ORACLE_ARCHIVE_DIR>/
If the directory may change, then a user exit can be written to return the directory in which InfoSphere CDC will look to find the next archive log. This parameter specifies the path where the user exit resides. A sample is shipped with the product:
<installdir>\samples\com\datamirror\ts\target\publication\userexit\sample

If Data Guard log shipping is used, two additional parameters need to be configured:

System Parameter: oracle_using_log_transport_services = true
Description: This parameter specifies that CDC will identify when logs are available using the Data Guard log shipping status.

System Parameter: oracle_archive_destination_id = <dest_id>
Description: This parameter specifies the destination id that Oracle Data Guard will use to ship the logs (the DEST_ID column on the V$ARCHIVED_LOG view, where DEST_ID identifies the Oracle archiver process that sends archived logs to the Oracle standby database).

3) Set up the shipped logs, either:
a. Configure Data Guard log shipping, or
b. Manually ship the Oracle archive logs to the remote server.
If you are shipping the archive logs using a custom solution, each log can be shipped as it is created, or the new logs can be shipped on a periodic basis using a cron job. In a RAC environment the script must be run on EACH active node. Here is example pseudo code for the script that is to be run on each node:

Query v$archived_log to find out which archive logs have been completely written by Oracle,
by checking STATUS = A and THREAD# = <the node you are running on>
For each new archive log file loop
    Perform a CRC check (cksum <local filename>) on the local file. Save the CRC
    checksum for the local file.
    Ship the file to the remote server using ftp, sftp, or scp. Ensure that the name
    of the file remains unchanged. Depending on the node number, make sure you ship
    it to /archivelog/mycdcsystem/<node number>
    Using a remote shell (rsh), perform a CRC check (cksum <remote filename>) on the
    remote file. Check that the checksum matches the local one. If not, resend the
    file.
    Inform CDC that the archive log file is available, using a remote shell (rsh) to
    call this command:
    dmarchivelogavailable -I <instance name> -t <node number> -f <archive log file name>
End loop
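The ship-and-verify step of the pseudo code can be sketched in shell. To keep the sketch self-contained and runnable anywhere, the scp/rsh transfer is replaced with a local copy; in a real deployment you would substitute scp for the copy and run the remote cksum through ssh or rsh. All paths and file names are illustrative:

```shell
#!/bin/sh
# Ship one archive log into the thread subdirectory and verify its checksum.
# "$1" is the local archive log, "$2" the destination directory
# (e.g. /archivelog/mycdcsystem/<node number>).
ship_archive_log() {
  src="$1"; dest_dir="$2"
  local_sum=$(cksum "$src" | awk '{print $1}')
  cp "$src" "$dest_dir/"    # stands in for ftp/sftp/scp; file name is unchanged
  remote_sum=$(cksum "$dest_dir/$(basename "$src")" | awk '{print $1}')
  if [ "$local_sum" != "$remote_sum" ]; then
    echo "checksum mismatch, resending $src" >&2
    cp "$src" "$dest_dir/"  # resend once on mismatch
  fi
  # On a real system, now notify CDC over a remote shell:
  # rsh cdc_host dmarchivelogavailable -I <instance name> -t <node number> \
  #   -f "$(basename "$src")"
}

# Example: ship a fake archive log for RAC thread 1
mkdir -p /tmp/archivelog/mycdcsystem/1
printf 'redo data\n' > /tmp/arch_1_100.arc
ship_archive_log /tmp/arch_1_100.arc /tmp/archivelog/mycdcsystem/1
```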

4) Create a script on the remote server to periodically remove the archive logs that are no longer required by CDC. There are two CDC commands to accomplish this:
dmshowlogdependency
Use this command to display information about the database logs that are used by CDC and are required for replication. Use this command to implement a log retention policy.
Syntax:
dmshowlogdependency -I <INSTANCE_NAME> -A -i

Example output:
/archivelog/dmc9264/arch_1_22781,Available,Exists
/archivelog/dmc9264/arch_1_22782,Not Available,Missing
dmarchivelogremoved
Use this command to specify the shipped Oracle archive logs that are no longer needed by InfoSphere CDC. You should run this command before removing a shipped Oracle archive log. It will clean up the CDC metadata, which holds the list of all available archive logs.
Syntax:
dmarchivelogremoved -I <INSTANCE_NAME> -f <file_name> [-a]
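A retention script can combine the two commands: parse the dmshowlogdependency output to learn which logs CDC still requires, then notify CDC and delete everything else. The sketch below hard-codes the "required" list so it can run anywhere; on a real system you would capture it from dmshowlogdependency as shown in the comment, and the dmarchivelogremoved call is commented out since it needs a live instance. All paths are illustrative:

```shell
#!/bin/sh
# Simulated "required logs" list; on a real system capture it with:
#   required="$(dmshowlogdependency -I <INSTANCE_NAME> -A -i | cut -d, -f1)"
required="/tmp/archivelog/demo/arch_1_22781
/tmp/archivelog/demo/arch_1_22782"

# Create a fake shipped-log directory: one obsolete log plus two required ones.
mkdir -p /tmp/archivelog/demo
touch /tmp/archivelog/demo/arch_1_22780 \
      /tmp/archivelog/demo/arch_1_22781 \
      /tmp/archivelog/demo/arch_1_22782

# Remove every shipped log that is not in the required list.
for f in /tmp/archivelog/demo/arch_*; do
  if ! printf '%s\n' "$required" | grep -qx "$f"; then
    # Tell CDC first so its metadata stays consistent, then delete the file:
    # dmarchivelogremoved -I <INSTANCE_NAME> -f "$f"
    rm "$f"
  fi
done
```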

Remote Apply
Considerations
This InfoSphere CDC configuration is used when the target database resides on a different server than InfoSphere CDC.
Steps
1. Install InfoSphere CDC on a server different from the target database server.
2. Create an entry in /etc/hosts for the target database server pointing to the corresponding IP address, for example:
<IP address of target database host> db_host
3. Create a tnsnames.ora entry pointing to the remote target database, for example:
SID_CDC=
 (DESCRIPTION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=db_host)(PORT=1521))
  (CONNECT_DATA=(server=DEDICATED)
   (SERVICE_NAME=SID)
  )
 )
4. While creating the CDC instance, ensure the following fields are set correctly:
ORACLE_HOME: full path of the Oracle client on the local box where tnsnames.ora can be found.
SERVICE NAME: the tnsnames.ora entry name you created in step 3.
