Informatica PowerExchange
(Version 8.5.1)
DISCLAIMER: Informatica Corporation provides this documentation as is without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is subject to change at any time without notice.
Table of Contents
List of Figures
List of Tables
Preface
  Informatica Resources
    Informatica Customer Portal
    Informatica Web Site
    Informatica Knowledge Base
    Informatica Global Customer Support
Configuring Connections
  DB2 Batch Mode Relational Database Connections
  DB2 CDC Mode Application Connections
  NRDB Batch Mode Application Connections
  NRDB CDC Mode Application Connections
  NRDB Lookup Relational Connections
  MSSQL Batch Mode Relational Connections
  MSSQL CDC Mode Application Connections
  Oracle Batch Mode Relational Connections
  Oracle CDC Mode Application Connections
  Sybase Batch Mode Relational Connections
Configuring Connection Attributes
  Common Connection Attributes
  Batch Application and Relational Connection Attributes
  CDC-Specific Connection Attributes
Understanding Commit Processing with PWXPC
Understanding PWXPC Restart and Recovery
  Session Recovery
  Recovery Tables
  Recovery State Table
  Recovery State File
  The Restart Token File
  Determining the Restart Point
  Initializing and Running CDC Sessions
  Ending CDC Sessions
Creating Recovery Tables
  Recovery Table Creation with PowerExchange Targets
  Creating the Recovery Tables Manually
Configuring the Restart Token File
  Syntax Rules
  Statement Syntax Format
  Restart Token File Statements
  Restart Token File Example
PWXPC Restart and Recovery Operation
  Enabling Session Recovery
  Configuring CDC Sessions
  Application Names
  Using DTLUAPPL with CDC Sessions
  Starting CDC Sessions
  Stopping CDC Sessions
  Changing CDC Sessions
  Recovering from CDC Session Failures
  Managing Session Log and Restart Token File History
Index
List of Figures
Figure 1-1. PWXPC Batch Extraction Mode Data Flow
Figure 1-2. PWXPC Change Mode Extraction Data Flow
Figure 1-3. PWXPC Real-time Mode Extraction Data Flow
Figure 3-1. Import from PowerExchange
Figure 3-2. Import from PowerExchange - Source
Figure 3-3. Import from PowerExchange - DB2/390 Source
Figure 3-4. Import from PowerExchange - DB2/400 or DB2/UDB Source
Figure 3-5. Import from PowerExchange - DB2390 Select Datamaps List
Figure 3-6. Import from PowerExchange - Microsoft SQL Server Source
Figure 3-7. Import from PowerExchange - MSSQL Select Datamaps List
Figure 3-8. Import from PowerExchange - Oracle Source
Figure 3-9. Import from PowerExchange - Oracle Select Datamaps List
Figure 3-10. Import from PowerExchange - Sybase Source
Figure 3-11. Import from PowerExchange - Sybase Select Datamaps List
Figure 3-12. Non-Relational Source Definition with Multiple Records
Figure 3-13. Import from PowerExchange - Non-Relational Source
Figure 3-14. Import from PowerExchange - Single Record Source Data Maps
Figure 3-15. Import from PowerExchange - Multiple Record Source Data Maps
Figure 3-16. Import from PowerExchange - Non-Relational Target
Figure 3-17. Import from PowerExchange - Non-Relational Target Select Datamaps List
Figure 3-18. Non-Relational Table - Attributes
Figure 3-19. Non-Relational Table - Metadata Extensions
Figure 3-20. Import from PowerExchange - CDC Datamaps
Figure 3-21. Import from PowerExchange - CDC Select Datamaps List
Figure 3-22. Extraction Map Table - Attributes
Figure 3-23. Extraction Map Table - Metadata Extensions
Figure 3-24. Multi-record Non-Relational Source Definition
Figure 3-25. Group Source Mapping Example
Figure 4-1. Relational Connection Browser
Figure 4-2. Application Connection Browser
Figure 4-3. Application Connection Editor
Figure 5-1. Session Mapping Tab - Batch VSAM Reader
Figure 5-2. Session Mapping Tab - DB2 Readers
Figure 5-3. Session Properties Tab
Figure 5-4. Session Mapping Tab - Extraction Map Source
Figure 5-5. Session Mapping Tab - Relational Source
Figure 5-6. Session Mapping Tab - Relational Targets
Figure 5-7. Session Mapping Tab - Non-Relational Targets
Figure 6-1. Application Connection - Number of Restart Token Files
Figure 7-1. Primary key updates from a source relational table
Figure 7-2. DB2 Extraction Map Source Mapping
Figure 7-3. DB2 Target Table Mapping
Figure 7-4. DB2 Source to DB2 Target CDC mapping
Figure 10-1. ODBC - Connection Object Definition
Figure A-1. Filter Overrides: Single-Record Filter
Figure A-2. Filter Overrides: Multi-Record Filter
Figure A-3. Pre-Session Command - DTLREXE
Figure A-4. Workflow Link Condition - DTLREXE
Figure A-5. Command Task Expression Editor - DTLREXE
Figure A-6. Session Mapping Tab - File Create Pre-SQL Command
List of Tables
Table 1-1. Functional Comparison between PWXPC and PowerExchange ODBC
Table 1-2. PowerExchange Client for PowerCenter Extract and Load Capabilities
Table 1-3. Group Source Usage by PowerExchange Database Type
Table 1-4. PowerExchange ODBC Extract and Load Capabilities
Table 2-1. Valid Version Combinations for PowerExchange and PWXPC
Table 3-1. Attributes of Fields in a Non-Relational Source Definition
Table 3-2. Non-Relational Source Definition Metadata Extensions
Table 3-3. Attributes of Fields in a Extraction Map Definition
Table 3-4. Extraction Map Definition Metadata Extensions
Table 4-1. Connection Types for Extracting Source Data
Table 4-2. Connection Types for Loading Target Data
Table 4-3. PWX DB2390, DB2400, and DB2UDB Relational Database Connection Attributes
Table 4-4. DB2390, DB2400, and DB2UDB CDC Mode Application Connection Attributes
Table 4-5. NRDB Batch Mode Application Connection Attributes
Table 4-6. NRDB CDC Mode Application Connection Attributes
Table 4-7. NRDB Lookup Relational Connection Attributes
Table 4-8. MSSQL Batch Mode Relational Connection Attributes
Table 4-9. MSSQL CDC Mode Application Connection Attributes
Table 4-10. Oracle Batch Mode Relational Connection Attributes
Table 4-11. Oracle CDC Mode Application Connection Attributes
Table 4-12. Sybase Batch Mode Relational Connection Attributes
Table 4-13. Encryption and Compression Connection Attributes
Table 4-14. Pacing Size Connection Attributes
Table 4-15. Convert Character Data Connection Attribute
Table 4-16. Write Mode Connection Attribute
Table 4-17. Retrieve PWX Log Entries Connection Attribute
Table 4-18. Image Type Connection Attribute
Table 4-19. Event Table Connection Attribute
Table 4-20. CAPI Connection Override Connection Attribute
Table 4-21. Idle Time Connection Attribute
Table 4-22. Change and Real-time Mode Restart Connection Attributes
Table 4-23. UOW Count Connection Attribute
Table 4-24. Real-Time Flush Latency Connection Attribute
Table 4-25. Commit Threshold Connection Attribute
Table 5-1. Session Properties for Non-Relational Targets
Table 6-1. Default Starting Extraction Points for Sources
Table 6-2. Recovery Table SQL Scripts
Table 6-3. CDC Sessions - Recommended Settings
Table 10-1. Connection Types for Extracting Source Data
Table 10-2. Target Database Types
Table 10-3. ODBC Connection Object Definition Table
Table 11-1. Partition Types for Partitioning Points for Sources
Table B-1. PowerExchange and Transformation Datatypes
Preface
PowerExchange Interfaces for PowerCenter describes the Informatica interface between PowerExchange and PowerCenter. It is written for developers and administrators who are responsible for creating, running, and administering workflows and sessions that interface with PowerExchange. This manual assumes that you have knowledge of your operating systems, relational database concepts, and the database engines and non-relational files in your environment. This manual also assumes that you are familiar with the basic operation of PowerExchange and PowerCenter.
This manual discusses:
The PowerExchange Client for PowerCenter (PWXPC) interface
The PowerExchange ODBC interface with PowerCenter
Informatica Resources
Informatica Customer Portal
As an Informatica customer, you can access the Informatica Customer Portal site at http://my.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica Knowledge Base, Informatica Documentation Center, and access to the Informatica user community.
support@informatica.com for technical inquiries
support_admin@informatica.com for general customer service requests
WebSupport requires a user name and password. You can request a user name and password at http://my.informatica.com.
Use the following telephone numbers to contact Informatica Global Customer Support:
North America / South America
Informatica Corporation Headquarters
100 Cardinal Way
Redwood City, California 94063
United States

Europe / Middle East / Africa
Informatica Software Ltd.
6 Waltham Park
Waltham Road, White Waltham
Maidenhead, Berkshire SL6 3TN
United Kingdom
Standard Rate
Belgium: +32 15 281 702
France: +33 1 41 38 92 26
Germany: +49 1805 702 702
Netherlands: +31 306 022 797
United Kingdom: +44 1628 511 445

Asia / Australia
Informatica Business Solutions Pvt. Ltd.
Diamond District
Tower B, 3rd Floor
150 Airport Road
Bangalore 560 008
India
Toll Free
Australia: 1 800 151 830
Singapore: 001 800 4632 4357
Standard Rate
India: +91 80 4112 5738
Part 1: Introduction
Chapter 1
Overview
You can use the following interfaces to extract and load data through PowerExchange when using PowerCenter:
PowerExchange Client for PowerCenter (PWXPC). Chapters 2 to 6 describe PowerExchange Client for PowerCenter. You can use it to extract and load data through PowerExchange for a variety of data types on a variety of platforms. PowerExchange Client for PowerCenter is fully integrated into PowerCenter.
PowerExchange ODBC. Chapters 8 to 12 describe the PowerExchange ODBC interface. You can use PowerExchange ODBC connections with PowerCenter to extract and load data through PowerExchange for a variety of data types on a variety of platforms.
Compared with PowerExchange ODBC, PWXPC has additional functionality as well as improved performance and superior CDC recovery and restart. Table 1-1 compares the interface functionality of the PowerExchange Client for PowerCenter and PowerExchange ODBC:
Table 1-1. Functional Comparison between PWXPC and PowerExchange ODBC
Function | PWXPC | ODBC | Description
Extract data in batch and CDC mode. | Yes | Yes | PowerExchange extracts relational and non-relational data in batch mode and changed data in CDC mode.
Extract data using Group Source. | Yes | No | PowerExchange Group Source processes change data for multiple sources, or multiple record types in VSAM and sequential files, in a single pass.
Save target data and CDC restart information in a single commit. | Yes | No | CDC restart information is stored in the same database as the relational target table. The restart information is updated in the same commit as the target data, providing guaranteed restart and recovery for CDC data.
Use PowerCenter graceful stop for real-time sessions. | Yes | No | PowerCenter stops real-time sessions after all data in the pipeline is written to the targets.
Use the change indicator to determine the type of change record. | Yes | No | Each changed data record indicates whether it is an insert, update, or delete. When the change indicator is used, an Update Strategy transformation is not required to process inserts, updates, and deletes.
Create source definitions from PowerExchange extraction maps. | Yes | No | Extraction maps contain the PowerExchange autogenerated columns, minimizing modification of the source definition in Designer.
Modify the file name in the PowerCenter source definition. | Yes | No | The PowerCenter source definition can specify the file name and override the file name specified in the PowerExchange data map.
For an overview of PowerExchange, see the PowerExchange Getting Started Guide. For an overview of PowerCenter, see the PowerCenter Getting Started Guide.
Chapter 1: Understanding PowerExchange Interfaces for PowerCenter
Batch
Change Data Capture (CDC) Real-time
Change Data Capture (CDC) batch extraction mode from condense files
Change Data Capture (CDC) continuous extraction mode from condense files
Table 1-2 lists the database types that PowerExchange Client for PowerCenter can access to extract data or load data:
Table 1-2. PowerExchange Client for PowerCenter Extract and Load Capabilities
Database Type | Batch Mode Extract/Load | CDC Real Time Extraction Mode | CDC Batch Extraction Mode | CDC Continuous Extraction Mode
Adabas | Yes/Yes | Yes | Yes | Yes
Datacom | Yes/No | Yes | No | Yes
DB2 for z/OS | Yes/Yes | Yes | Yes | Yes
DB2 for i5/OS | Yes/Yes | Yes | Yes | Yes
DB2 for Linux, UNIX, and Windows | Yes/Yes | Yes | Yes | No
IDMS | Yes/No | Yes | No | Yes
IMS | Yes/Yes | Yes | Yes | Yes
MSSQL | Yes/Yes | Yes | Yes | No
Oracle | Yes/Yes | Yes | Yes | Yes
Sequential files | Yes/Yes | No | Yes | No
Sybase | Yes/Yes | No | Yes | No
VSAM | Yes/Yes* | Yes | Yes* | Yes

* With ESDS and RRDS VSAM data sets, only inserts are allowed. Inserts, updates, and deletes are allowed for KSDS VSAM data sets.
PowerExchange group source reads all data from the same physical source in a single pass. PWXPC uses PowerExchange group source to extract changed data from the change stream. PWXPC also uses group source to extract data from VSAM data sets and sequential files containing multiple record types. As a result, PWXPC connections process data faster than PowerExchange ODBC connections and reduce PowerExchange resource consumption on the source or extraction platform. For more information about group source, see PowerExchange Group Source on page 71.
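The difference between a single pass and one pass per record type can be sketched as follows. This is an illustration only, not PowerExchange code: the record layout, the RECORDS sample data, and both function names are hypothetical, and the sketch simply counts how many records each strategy has to read.

```python
from collections import defaultdict

# Hypothetical multi-record file: each record is (record_type, payload).
RECORDS = [
    ("HEADER", "batch-001"),
    ("DETAIL", "line 1"),
    ("DETAIL", "line 2"),
    ("TRAILER", "count=2"),
]


def read_per_record_type(records, record_types):
    """Scan the whole file once for each record type (one pass per type)."""
    groups, records_read = {}, 0
    for wanted in record_types:
        groups[wanted] = []
        for record_type, payload in records:
            records_read += 1
            if record_type == wanted:
                groups[wanted].append(payload)
    return groups, records_read


def read_single_pass(records):
    """Scan the file once and route every record to the group for its type."""
    groups, records_read = defaultdict(list), 0
    for record_type, payload in records:
        records_read += 1
        groups[record_type].append(payload)
    return dict(groups), records_read
```

Both functions return the same groups, but with three record types the per-type approach reads every record three times while the single-pass approach reads each record once.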
PowerExchange Client for PowerCenter
Table 1-3 lists the PowerExchange database types that read sources in a single pass during extraction:
Table 1-3. Group Source Usage by PowerExchange Database Type
Database Type | Batch Extraction Mode | CDC Real Time Extraction Mode | CDC Batch Extraction Mode | CDC Continuous Extraction Mode
Adabas | No | Yes | Yes | No
Datacom | No | Yes | Yes | No
DB2 for z/OS | No | Yes | Yes | No
DB2 for i5/OS | No | Yes | Yes | No
DB2 for Linux, UNIX, and Windows | No | Yes | No | No
IDMS | No | Yes | Yes | No
IMS | No | Yes | Yes | No
MSSQL | No | Yes | No | No
Oracle | No | Yes | Yes | Yes
Sequential files | Yes | No | No | No
Sybase | No | No | No | No
VSAM | Yes | Yes | Yes | No
Batch Mode
Use PWX batch application and relational connections to extract and load data for relational databases and non-relational data sets and files through PowerExchange. PWXPC connects to PowerExchange using the PowerExchange Call Level Interface (SCLI). You can extract all records in multiple record types from VSAM and sequential files with a single pass of the data using PWXPC. In contrast, ODBC connections read a single record type at a time requiring multiple passes of the data.
The following diagram shows the data flow of source data from PowerExchange through PWXPC and PowerCenter to the target tables:
Figure 1-1. PWXPC Batch Extraction Mode Data Flow
Batch extraction mode from condense files. Use PWX CDC Change connections to extract changed data from condense files in batch extraction mode. PWXPC uses the PowerExchange CAPX access method when processing data with CDC Change connections. PowerExchange stops the extraction after the data from all condense files is read.

Continuous extraction mode from condense files. Use PWX CDC Real Time connections for Oracle sources to extract changed data from condense files in continuous extraction mode. PWXPC uses the PowerExchange CAPXRT access method when processing data with CDC Real Time connections. PowerExchange runs the extraction until stopped. See the PowerExchange Oracle Adapter Guide for more information.

PWXPC connects to PowerExchange using the PowerExchange Call Level Interface (SCLI). PowerExchange reads the changed data from each condense file once for all sources in the mapping in a single pass. The following diagram shows the data flow of condensed changed data from PowerExchange through PWXPC and PowerCenter to the target tables:
Figure 1-2. PWXPC Change Mode Extraction Data Flow
change stream for all sources in the mapping in a single pass. PWXPC real-time sessions run for a specified period or continuously until stopped. The following diagram shows the data flow of changed data from PowerExchange through PWXPC and PowerCenter to the target tables:
Figure 1-3. PWXPC Real-time Mode Extraction Data Flow
PowerExchange ODBC
PowerExchange provides a thin ODBC driver that you can use with PowerCenter. The Integration Service uses PowerExchange ODBC to connect to PowerExchange either locally or remotely. Using PowerExchange ODBC, you can extract and load relational and non-relational data. You can also extract changed data. Use the following modes to extract relational and non-relational data:

Batch. PowerExchange ODBC extracts and loads data from a relational table or non-relational file through PowerExchange. You can read multiple-record VSAM data sets and sequential files using the ODBC interface, with multiple passes of the data to read all record types.

Change Data Capture (CDC) batch extraction mode from condense files. PowerExchange ODBC extracts changed data from condense files through PowerExchange, reading all of the changes captured in condense files since the last extraction session. PowerExchange ODBC reads the changed data once for each source in the mapping, resulting in multiple passes of the condense files. The extraction session ends when all captured changes are read. PowerExchange maintains restart information in the CDEP file on the source machine. PowerExchange ODBC has limited restart capability.

Change Data Capture (CDC) Real-time. PowerExchange ODBC extracts changed data in real time from the change stream using one pass of the data for each source in the mapping. You can run real-time extractions for a specified time period or continuously until stopped. PowerExchange maintains restart information in the CDEP file on the source machine. PowerExchange ODBC has limited restart capability.
Table 1-4 shows the PowerExchange ODBC extract and load capabilities:
Table 1-4. PowerExchange ODBC Extract and Load Capabilities
Database Type | Batch Mode Extract | Batch Mode Load | CDC Batch Extraction Mode | CDC Real-time Extraction Mode
Adabas | Yes | Yes | Yes | Yes
Datacom | Yes | No | Yes | Yes
DB2 for z/OS | Yes | Yes | Yes | Yes
DB2 for i5/OS | Yes | Yes | Yes | Yes
DB2 for Linux, UNIX, and Windows | Yes | Yes | No | Yes
IDMS | Yes | No | Yes | Yes
IMS | Yes | Yes | Yes | Yes
MSSQL | No | Yes | No | Yes
Oracle | No | Yes | Yes | Yes
Sequential/flat files | Yes | Yes | No | No

* With ESDS and RRDS VSAM data sets, only inserts are allowed. Inserts, updates, and deletes are allowed for KSDS VSAM data sets.
Installing PWXPC
Working with Mappings
Configuring Connections
Working with Sessions
Restart and Recovery
Flexible Key Custom Transformation
Chapter 2
Installing PWXPC
Overview
Installing PowerExchange Client for PowerCenter (PWXPC)
Modifying the PowerExchange Configuration Files
Working with PowerExchange and PowerExchange Client for PowerCenter
Overview
This chapter describes how to install and configure PowerExchange Client for PowerCenter (PWXPC).
PowerCenter 8.5.1. This includes the PWXPC plug-in. For more information about installing PowerCenter, see the PowerCenter Installation Guide.
PowerExchange 8.5.1. Install PowerExchange on the PowerCenter Client and Integration Service machines. For more information about installing PowerExchange, see the PowerExchange Installation Guide.
If you install the PowerCenter Integration Service on a 32-bit machine, install the 32-bit version of PowerExchange on the same machine. If you install the Integration Service on a 64-bit machine, install the 64-bit version of PowerExchange on the same machine. PowerExchange Navigator is a 32-bit application. You can use a 32-bit PowerCenter Client and PowerExchange Navigator to communicate with a 64-bit version of either product.
Note: If the appropriate version of PowerExchange is not installed and available on the PowerCenter Client platform, the Import from PowerExchange dialog box will not function.
Installation Steps
The PowerExchange Client for PowerCenter (PWXPC) is installed when you install PowerCenter on the client and Integration Service machines. You must still configure PowerExchange configuration files on the Integration Service node. To configure PowerExchange Client for PowerCenter for use on the PowerCenter Integration Service and Client, you must add NODE statements in the PowerExchange dbmover.cfg file on the PowerCenter Client and Integration Service machines for those PowerExchange Listeners to which you wish to connect. See Modifying the PowerExchange Configuration Files on page 18. Read the Release Notes and PowerExchange Migration Guide for any changes to installation or connectivity.
Upgrading
When upgrading from a previous release of PowerCenter, you must perform a repository upgrade. This process registers the PWXPC plug-in. No repository upgrade is necessary if upgrading from 8.5 to 8.5.1. When upgrading from PowerCenter 8.1.1 SP2 (or any higher 8.1.1 Service Pack) and using enhanced restart for CDC sessions, do the following:
Upgrading for PWXPC enhanced restart users:
1. Prior to migrating to PowerCenter 8.5.x, cleanly shut down all CDC sessions and run recovery on all CDC sessions. PWXPC creates a backup restart token file with a timestamp appended. Save this file.
2. As a precaution, back up the relational tables that are targets in the CDC sessions. Also back up the PowerCenter recovery tables.
3. After completing the migration to PowerCenter 8.5.x, copy the backup restart token files that PWXPC created to the appropriate restart token file for each CDC session. This ensures that the restart token files contain the restart points from the point of interruption on the previous release.
4. Cold start the session so PWXPC uses only the newly populated restart token file to restart the CDC session.
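The copy in the third upgrade step can be scripted. The following is a minimal sketch under stated assumptions, not an Informatica utility: it assumes each backup sits in a directory you specify and that its name is the restart token file name followed by the appended timestamp, and it picks the most recently modified match. The function name restore_restart_token_file is hypothetical. Check the chosen backup by hand before cold starting a session.

```python
import shutil
from pathlib import Path


def restore_restart_token_file(token_file, backup_dir):
    """Copy the newest backup of token_file over token_file; return the backup used.

    Assumption: backups are named <token file name><timestamp suffix>.
    """
    token_file = Path(token_file)
    candidates = [
        p
        for p in Path(backup_dir).iterdir()
        if p.is_file()
        and p.name.startswith(token_file.name)
        and p.name != token_file.name
    ]
    if not candidates:
        raise FileNotFoundError(
            f"no backup found for {token_file.name} in {backup_dir}"
        )
    # Choose the most recently modified backup file.
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    shutil.copyfile(newest, token_file)
    return newest
```

Returning the path of the backup that was used makes it easy to record which restart points each CDC session was restarted from.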
the Integration Service. In local mode, a PowerExchange Listener is not required. If local mode is used, there is no need to update the PowerExchange dbmover.cfg file. Specify local in the Location attribute in PWXPC connections.
1. Locate the dbmover.cfg file in the PowerExchange root directory.
2. Open the file with a text editor.
3. Create a node for each PowerExchange Listener you want to register using the following guidelines:
NODE=(<node name>,TCPIP,<hostname>,<port_number>)
where <node name> is a logical name used to reference the PowerExchange Listener and <hostname> and <port_number> are the host name (or IP address) and port number of the PowerExchange Listener.
4.
For more information, see Configuration File Parameters in the PowerExchange Reference Manual.
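The NODE syntax shown above can be generated and checked with a small helper. This is a sketch, not part of PowerExchange: it handles only the four-field TCPIP form given in this section (any additional NODE parameters documented in the PowerExchange Reference Manual are not handled), and the function names are hypothetical.

```python
import re

# Matches only the four-field form shown in this section:
#   NODE=(<node name>,TCPIP,<hostname>,<port_number>)
NODE_PATTERN = re.compile(
    r"^\s*NODE=\(\s*([^,\s]+)\s*,\s*TCPIP\s*,\s*([^,\s]+)\s*,\s*(\d+)\s*\)\s*$"
)


def format_node(node_name, hostname, port_number):
    """Build a NODE statement for one PowerExchange Listener."""
    return f"NODE=({node_name},TCPIP,{hostname},{port_number})"


def parse_node(line):
    """Return the fields of a four-field NODE statement, or None if it does not match."""
    match = NODE_PATTERN.match(line)
    if match is None:
        return None
    name, host, port = match.groups()
    return {"node_name": name, "hostname": host, "port_number": int(port)}
```

For example, format_node("mvs1", "mvs1.example.com", 2480) produces a line for a Listener on a made-up host and port, and parse_node can be run over each line of dbmover.cfg to list the Listeners that are already registered.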
1. Support for PWXPC V7.1.1 was introduced with PowerExchange V5.2.0 Patch 02.
2. Support for PWXPC V7.1.2 was introduced with PowerExchange V5.2.2 Patch 01.
3. Support for PWXPC V7.1.3 was introduced with PowerExchange V5.2.2 Patch 02, which is the minimum level required for V7.1.4 and V7.1.5.
4. See Using Versions of PowerCenter Earlier than v8.x with PowerExchange v8.x in the PowerExchange Migration Guide.
Chapter 3
Overview
Source and Target Definitions
Working with Relational Source and Target Definitions
Working with Non-Relational Source and Target Definitions
Working with Extraction Map Definitions
Previewing PowerExchange Data in Designer
PowerExchange Group Source
Working with Source Qualifiers
Using Lookup Transformations
Overview
A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets.

Source and target definitions represent metadata for sources and targets. When you create a source definition, its structure differs depending on the type of source it represents. Non-relational sources require a multi-group source definition. Relational sources use a single-group source definition. The source qualifier for a source definition also differs in structure depending on the type of source definition.

After you create a source or target definition, you can include it in a mapping to extract data from the source or load data to the target. You can extract source data in batch, change, or real-time mode. For a list of sources and targets that PowerExchange Client for PowerCenter supports, see Table 1-2 on page 5. This table also lists whether the Integration Service can read the source data in batch, change, or real-time mode.

With CDC mappings, it is generally necessary to have multiple mappings: a batch mapping to materialize the target tables from the source tables in preparation for CDC, and the CDC mapping itself, which then uses extraction map sources for the source tables. To minimize the effort in creating these mappings, create any business rules applicable to both the batch and CDC sessions in Mapplets. For more information about Mapplets, see the PowerCenter Designer Guide.
This displays the Import from PowerExchange dialog box. The process and the dialog box displayed are the same for targets.
Figure 3-2 shows the dialog box used to import all PowerExchange sources and targets, including relational metadata, PowerExchange data maps, and PowerExchange capture extraction maps:
Figure 3-2. Import from PowerExchange - Source
Additional input fields appear depending upon the Source Type chosen. You create the source and target definitions differently depending on the database type. After you create a source or target definition, you can edit it.
Note: The Owner name is included in the source definitions for relational metadata and in the source and target definitions for PowerExchange data maps imported using this dialog. This information is used, unless overridden, when the source or target is accessed from the Integration Service node. This eliminates the need to provide the Owner Name attribute in the Session Properties for all source types and the Table Name Prefix attribute in the Session Properties for non-relational PowerExchange targets.
- DB2 UDB for z/OS (DB2/390)
- DB2 UDB for iSeries (DB2/400)
- DB2 for Linux, UNIX, and Windows (DB2/UDB)
- Microsoft SQL Server
- Oracle
- Sybase
- Import table definitions from the DB2 catalog using PowerExchange.
- Import DB2 or DB2 unload (DB2UNLD) data map definitions from PowerExchange. For more information, see Importing Non-Relational Source Definitions on page 44.
- Import extraction map definitions for PowerExchange. For more information, see Working with Extraction Map Definitions on page 57.
- Manually create a DB2 definition.

- Import table definitions from the DB2 catalog using PowerExchange.
- Manually create a DB2 definition.
- Create a DB2 target definition from a DB2 source definition. In the Target Designer, drag a DB2 source definition to the workspace.
See the PowerCenter Designer Guide for more information about using PowerCenter to create source and target definitions.
Tip: If your repository already contains DB2 definitions, you can use them to extract data from or load data to a DB2 table. However, the metadata definition must match the table structure of the DB2 table.
DB2 tables that are mapped in PowerExchange as either DB2 data maps or DB2UNLD (DB2 database unload data set) data maps (DB2/390 only) are imported in the same manner as non-relational data map sources. For more information on how to import these sources, see Importing Non-Relational Source Definitions on page 44. Use the following procedure to import DB2/390, DB2/400, and DB2/UDB source or target definitions.
To import a DB2/390, DB2/400, or DB2/UDB source or target definition:
1. To import a DB2 source definition, select Sources > Import from PowerExchange and select a source type of DB2390, DB2400, or DB2UDB. To import a DB2 target definition, select Targets > Import from PowerExchange and select a source type of DB2390, DB2400, or DB2UDB.
The dialog box for a DB2/390 source definition import looks as follows:
Figure 3-3. Import from PowerExchange - DB2/390 Source
The target dialog box looks similar but does not contain the Multi-Record Datamaps and CDC Datamaps options.
2.
User Name Password Multi-Record Datamaps Source Type CDC Datamaps Subsystem Id
Description Name of the database for connection. DB2400 and DB2UDB only. Enter a schema name to filter the resulting data maps. Enter a table name to filter the resulting data maps.
Optionally, enter a filter to view particular schemas and tables from the database. Enter a schema and/or table name. Or, enter a filter condition to display schemas and/or tables that meet the filter condition. Use one of the following wildcard characters in the filter condition:
- * (asterisk). Represents one or more characters.
- ? (question mark). Represents one character.
- Enter the filter condition as a prefix. For example, enter A* to display schemas and/or tables that begin with an A.
- Enter the filter condition as a suffix. For example, enter *A to display schemas and/or tables that end with an A.
- Enter the filter condition as a substring. For example, enter *cust* to display schemas and/or tables that contain cust.
Click Connect. The available tables, based on the values specified in the dialog box, will appear in the Selected Datamaps list. If no tables are found, No Data Found will appear in the Selected Datamaps list.
5.
- Hold down the Shift key to select blocks of tables.
- Hold down the Ctrl key to make non-contiguous selections within a schema.
- Use the Select all button to select all tables.
- Use the Select none button to clear all highlighted selections.
6.
- Import table definitions from Microsoft SQL Server using PowerExchange.
- Import extraction map definitions from PowerExchange. For more information, see Working with Extraction Map Definitions on page 57.
- Import table definitions from Microsoft SQL Server using the PowerCenter ODBC interface.
- Manually create a Microsoft SQL Server definition.

You can create a Microsoft SQL Server target definition in the following ways:
- Import table definitions from Microsoft SQL Server using PowerExchange.
- Import table definitions from Microsoft SQL Server using the PowerCenter ODBC interface.
- Manually create a Microsoft SQL Server definition.
See the PowerCenter Designer Guide for more information about using PowerCenter to create source and target definitions.
Tip: If your repository contains Microsoft SQL Server definitions, you can use them to extract data from or load data to a Microsoft SQL Server table. However, the metadata definition must match the table structure of the Microsoft SQL Server table.
To import a Microsoft SQL Server source definition, select Sources > Import from PowerExchange and select a source type of MSSQL. To import a Microsoft SQL Server target definition, select Targets > Import from PowerExchange and select a source type of MSSQL.
The dialog box for the Microsoft SQL Server source definition import looks as follows:
Figure 3-6. Import from PowerExchange - Microsoft SQL Server Source
The target dialog box looks similar but does not contain the Multi-Record Datamaps and CDC Datamaps options.
2.
User Name Password Multi-Record Datamaps Source Type CDC Datamaps Server Name
Description Database name in the SQL instance specified. Enter a schema name to filter the resulting data maps. Enter a table name to filter the resulting data maps.
Optionally, enter a filter to view particular schemas and tables from the database. Enter a schema and/or table name. Or, enter a filter condition to display schemas and/or tables that meet the filter condition. Use one of the following wildcard characters in the filter condition:
- * (asterisk). Represents one or more characters.
- ? (question mark). Represents one character.
- Enter the filter condition as a prefix. For example, enter A* to display schemas and/or tables that begin with an A.
- Enter the filter condition as a suffix. For example, enter *A to display schemas and/or tables that end with an A.
- Enter the filter condition as a substring. For example, enter *cust* to display schemas and/or tables that contain cust.
Click Connect. The available tables, based on the values specified in the dialog box, will appear in the Selected Datamaps list. If no tables are found, No Data Found will appear in the Selected Datamaps list.
5.
- Hold down the Shift key to select blocks of tables.
- Hold down the Ctrl key to make non-contiguous selections within a schema.
- Use the Select all button to select all tables.
- Use the Select none button to clear all highlighted selections.
6.
- Import table definitions from Oracle using PowerExchange.
- Import extraction map definitions from PowerExchange. For more information, see Working with Extraction Map Definitions on page 57.
- Import table definitions from Oracle using the PowerCenter ODBC interface.
- Manually create an Oracle source definition.

- Import table definitions from Oracle using PowerExchange.
- Import table definitions from Oracle using the PowerCenter ODBC interface.
- Manually create an Oracle source definition.
See the PowerCenter Designer Guide for more information about using PowerCenter to create source and target definitions.
Tip: If your repository contains Oracle definitions, you can use them to extract data from or load data to an Oracle table. However, the metadata definition must match the table structure of the Oracle table.
To import an Oracle source definition, select Sources > Import from PowerExchange and select a source type of Oracle. To import an Oracle target definition, select Targets > Import from PowerExchange and select a source type of Oracle.
The Import from PowerExchange > Oracle dialog box looks as follows:
Figure 3-8. Import from PowerExchange - Oracle Source
The target dialog box looks similar but does not contain the Multi-Record Datamaps and CDC Datamaps options.
2.
User Name Password Multi-Record Datamaps Source Type CDC Datamaps TNS Name
Description Enter a schema name to filter the resulting data maps. Enter a table name to filter the resulting data maps.
Optionally, enter a filter to view particular schemas and tables from the database. Enter a schema and/or table name. Or, enter a filter condition to display schemas and/or tables that meet the filter condition. Use one of the following wildcard characters in the filter condition:
- * (asterisk). Represents one or more characters.
- ? (question mark). Represents one character.
- Enter the filter condition as a prefix. For example, enter A* to display schemas and/or tables that begin with an A.
- Enter the filter condition as a suffix. For example, enter *A to display schemas and/or tables that end with an A.
- Enter the filter condition as a substring. For example, enter *cust* to display schemas and/or tables that contain cust.
Click Connect. The available tables, based on the values specified in the dialog box, will appear in the Selected Datamaps list. If no tables are found, No Data Found will appear in the Selected Datamaps list.
5.
- Hold down the Shift key to select blocks of tables.
- Hold down the Ctrl key to make non-contiguous selections within a schema.
- Use the Select all button to select all tables.
- Use the Select none button to clear all highlighted selections.
6.
- Import table definitions from Sybase using PowerExchange.
- Import table definitions from Sybase using the PowerCenter ODBC interface.
- Manually create a Sybase source definition.

- Import table definitions from Sybase using PowerExchange.
- Import table definitions from Sybase using the PowerCenter ODBC interface.
- Manually create a Sybase target definition.
See the PowerCenter Designer Guide for more information about using PowerCenter to create source and target definitions.
Tip: If your repository already contains Sybase definitions, you can use them to extract data from a Sybase source. However, the metadata definition must match the table structure of the Sybase table.
To import a Sybase source definition, select Sources > Import from PowerExchange and select a source type of Sybase. To import a Sybase target definition, select Targets > Import from PowerExchange and select a source type of Sybase.
The Import from PowerExchange > Sybase dialog box looks as follows:
Figure 3-10. Import from PowerExchange - Sybase Source
2.
User Name Password Multi-Record Datamaps Source Type CDC Datamaps Server Name Database Name
Description Enter a schema name to filter the resulting data maps. Enter a table name to filter the resulting data maps.
Optionally, enter a filter to view particular schemas and tables from the database. Enter a schema and/or table name. Or, enter a filter condition to display schemas and/or tables that meet the filter condition. Use one of the following wildcard characters in the filter condition:
- * (asterisk). Represents one or more characters.
- ? (question mark). Represents one character.
- Enter the filter condition as a prefix. For example, enter A* to display schemas and/or tables that begin with an A.
- Enter the filter condition as a suffix. For example, enter *A to display schemas and/or tables that end with an A.
- Enter the filter condition as a substring. For example, enter *cust* to display schemas and/or tables that contain cust.
Click Connect. The available tables, based on the values specified in the dialog box, will appear in the Selected Datamaps list. If no tables are found, No Data Found will appear in the Selected Datamaps list.
5.
- Hold down the Shift key to select blocks of tables.
- Hold down the Ctrl key to make non-contiguous selections within a schema.
- Use the Select all button to select all tables.
- Use the Select none button to clear all highlighted selections.
6.
- Modify column names.
- Modify column data types.
- Add or delete columns.
Note: If you use the relational source in a CDC session, you do not need to add the DTL__CAPXACTION field or include an Update Strategy transformation. PWXPC automatically includes the DTL__CAPXACTION column in its SELECT statement for CDC sources. It then uses the value of DTL__CAPXACTION to construct the appropriate SQL statement (INSERT, UPDATE, or DELETE).

For more information about editing source and target definitions, see the PowerCenter Designer Guide.
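The dispatch that the note describes can be pictured with a short Python sketch. This is not PWXPC's implementation. The action codes I, U, and D, the key-column handling, and the function name are assumptions made for illustration, since this section names only the resulting statement types.

```python
from typing import Dict, List, Tuple

# Assumed action codes; the text names only INSERT, UPDATE, and DELETE.
ACTION_TO_STATEMENT = {"I": "INSERT", "U": "UPDATE", "D": "DELETE"}


def build_statement(table: str, key_columns: List[str],
                    row: Dict[str, object]) -> Tuple[str, list]:
    """Build a parameterized SQL statement from one change record."""
    action = row["DTL__CAPXACTION"]
    kind = ACTION_TO_STATEMENT.get(action)
    if kind is None:
        raise ValueError(f"unexpected DTL__CAPXACTION value: {action!r}")

    # Only the source columns are written to the target, not DTL__ columns.
    data = {c: v for c, v in row.items() if not c.startswith("DTL__")}
    where = " AND ".join(f"{k} = ?" for k in key_columns)
    key_params = [data[k] for k in key_columns]

    if kind == "INSERT":
        cols = list(data)
        col_list = ", ".join(cols)
        markers = ", ".join("?" for _ in cols)
        return f"INSERT INTO {table} ({col_list}) VALUES ({markers})", [data[c] for c in cols]
    if kind == "UPDATE":
        non_keys = [c for c in data if c not in key_columns]
        assignments = ", ".join(f"{c} = ?" for c in non_keys)
        return (f"UPDATE {table} SET {assignments} WHERE {where}",
                [data[c] for c in non_keys] + key_params)
    return f"DELETE FROM {table} WHERE {where}", key_params
```

Because the statement type is derived from the change record itself, the mapping does not need an Update Strategy transformation to flag rows.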
- Adabas
- Datacom (source only)
- DB2 data maps (DB2MAP) (source only)
- DB2/390 unload files (DB2UNLD) (source only)
- IDMS (source only)
- IMS
- SEQ
- VSAM
You can create non-relational source and target definitions by importing a data map from a PowerExchange Listener. Non-relational definitions represent the data map metadata in groups. Each group represents a table in the data map. Each group also contains metadata for the fields in the table. The following diagram shows a non-relational source definition for a VSAM data map that contains multiple tables representing multiple records in the VSAM file. The tables in the VSAM data map are represented as groups in the source definition:
Figure 3-12. Non-Relational Source Definition with Multiple Records
In this example, the source definition contains four groups: V07A_RECORD_LAYOUT, V07B_RECORD_LAYOUT, V07C_RECORD_LAYOUT, and V07D_RECORD_LAYOUT. These groups are tables in the data map. The groups contain metadata for the fields in the tables. Some data maps contain records that have hierarchical relationships with each other. For example, records can have parent/child relationships. When you import data maps with hierarchies, the Designer imports the data map as a single group.
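As a way to picture the structure, the following Python sketch models a multi-group source definition as a data map that holds named groups, each with its own field list. The class names, the data map name, and the field names are hypothetical. Only the four group names come from the example above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Group:
    """One table in the data map, with the names of its fields."""
    name: str
    fields: List[str] = field(default_factory=list)


@dataclass
class NonRelationalSourceDefinition:
    """A source definition that represents data map tables as groups."""
    data_map: str
    groups: Dict[str, Group] = field(default_factory=dict)

    def add_group(self, name: str, fields: List[str]) -> Group:
        group = Group(name, list(fields))
        self.groups[name] = group
        return group


# Hypothetical data map and field names; the group names are from the example.
definition = NonRelationalSourceDefinition("v07_vsam_map")
for suffix in "ABCD":
    definition.add_group(f"V07{suffix}_RECORD_LAYOUT", ["RECORD_TYPE", "RECORD_KEY"])
```

A data map with hierarchical records would, by contrast, be held as a single group.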
To import a non-relational source definition, select Sources > Import from PowerExchange and select the desired source type. The dialog box and parameters displayed are the same for each non-relational source type. The Import from PowerExchange dialog box looks as follows:
Figure 3-13. Import from PowerExchange - Non-Relational Source
2.
Description Select to list CDC extraction maps - source only. Enter a schema name to filter the resulting data maps. Enter a data map name to filter the resulting data maps. Lists the available data maps for the connection, database and filter details that you entered.
3. Optionally, enter a filter to view particular schemas and tables from the database. Enter a schema and/or table name. Or, enter a filter condition to display schemas and/or tables that meet the filter condition. Use one of the following wildcard characters in the filter condition:
- * (asterisk). Represents one or more characters.
- ? (question mark). Represents one character.
- Enter the filter condition as a prefix. For example, enter A* to display schemas and/or tables that begin with an A.
- Enter the filter condition as a suffix. For example, enter *A to display schemas and/or tables that end with an A.
- Enter the filter condition as a substring. For example, enter *cust* to display schemas and/or tables that contain cust.
Click Connect. The available data maps appear in the Selected Datamaps list. The Designer displays metadata to import. The following two examples show the results when single record VSAM data maps and multi-record VSAM data maps are selected. Each record in the multi-record data map will display in Selected Datamaps list when importing single record data maps. Each record in a multi-record data map is effectively a single record data map. It is possible to import only a single record within a multi-record data map as a source.
This example shows the results for single record VSAM data maps:
Figure 3-14. Import from PowerExchange - Single Record Source Data Maps
This example shows the results for multi-record VSAM data maps:
Figure 3-15. Import from PowerExchange - Multiple Record Source Data Maps
5.
- Hold down the Shift key to select blocks of tables.
- Hold down the Ctrl key to make non-contiguous selections within a data map.
- Use the Select all button to select all tables.
- Use the Select none button to clear all highlighted selections.
6. Click OK. The source definitions appear. The Designer uses the data map name as the name of the source definition.
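The wildcard rules used in the filter step of this procedure can be expressed as a small matcher. The following Python sketch is an illustration, not the Designer's implementation. It takes the description literally, so * must match at least one character, and it compares names without regard to case, which is an assumption. The function names and the sample names are hypothetical.

```python
import re
from typing import Iterable, List


def filter_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a filter condition into a regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".+")  # one or more characters, as described in the text
        elif ch == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
    # Case-insensitive matching is an assumption of this sketch.
    return re.compile("".join(parts), re.IGNORECASE)


def apply_filter(pattern: str, names: Iterable[str]) -> List[str]:
    """Return the names that match the whole filter condition."""
    regex = filter_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


if __name__ == "__main__":
    sample = ["ACCOUNTS", "CUSTOMER", "DATA", "A", "OLD_CUST_HIST"]
    print(apply_filter("A*", sample))
    print(apply_filter("*cust*", sample))
```

Under this literal reading, A* does not match a name that is exactly A, and *cust* does not match a name that starts with cust. Whether the Designer also lets * match an empty string is not stated in this section.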
To import a non-relational target definition, select Targets > Import from PowerExchange and select the desired target type. The dialog box and parameters displayed are the same for each non-relational source type. The Import from PowerExchange dialog box for targets looks as follows:
Figure 3-16. Import from PowerExchange - Non-Relational Target
2.
Description Enter a data map name to filter the resulting data maps. Lists the available data maps for the connection, database and filter details that you entered.
3. Optionally, enter a filter to view particular schemas and tables from the database. Enter a schema and/or table name. Or, enter a filter condition to display schemas and/or tables that meet the filter condition. Use one of the following wildcard characters in the filter condition:
- * (asterisk). Represents one or more characters.
- ? (question mark). Represents one character.
- Enter the filter condition as a prefix. For example, enter A* to display schemas and/or tables that begin with an A.
- Enter the filter condition as a suffix. For example, enter *A to display schemas and/or tables that end with an A.
- Enter the filter condition as a substring. For example, enter *cust* to display schemas and/or tables that contain cust.
Click Connect. The available data maps appear in the Selected Datamaps list. The Designer displays metadata to import.
The following example shows the results when a Source Type of VSAM is selected:
Figure 3-17. Import from PowerExchange - Non-Relational Target Select Datamaps List
5.
- Hold down the Shift key to select blocks of tables.
- Hold down the Ctrl key to make non-contiguous selections within a data map.
- Use the Select all button to select all tables.
- Use the Select none button to clear all highlighted selections.
6. Click OK. The target definitions appear. The Designer uses the data map name as the name of the target definition.
Table 3-1 describes the attributes that the Attributes tab displays for each field in the non-relational definition:
Table 3-1. Attributes of Fields in a Non-Relational Source Definition
Attribute Name    Description
column_name       Name of the field in the data map.
base_rec          Name of the record to which the field belongs. This corresponds to name of the group the field belongs to in the source definition.
base_fld          Name of the base record and table field name in the following format: <Base_Field_Name>:<Table_Field_Name>
base_fld_tpe      PowerExchange data type of the field.
base_fld_offset   Offset value from which the field starts. For example, if the value is 5, the field starts at the fifth position. You determine the offset value of each field based on the order of fields in the data map.
The following table describes the extensions on the Metadata Extensions tab for a non-relational definition:
Table 3-2. Non-Relational Source Definition Metadata Extensions
Extension Name    Description
Access Method     Method you specified in the data map to access the source database:
                  - A = Adabas
                  - D = IMS DL1
                  - E = VSAM ESDS
                  - I = IDMS
                  - K = VSAM KSDS
                  - O = IMS ODBA
                  - N = VSAM RRDS
                  - S = sequential (SEQ)
                  - W = DB2 unload file (DB2UNLD)
                  - X = Datacom
                  - Z = DB2 data map
comments          Any comments.
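When you inspect exported metadata in a script, the one-letter Access Method value can be decoded with a lookup. The following Python sketch uses only the codes listed in Table 3-2. The dictionary and function names are hypothetical.

```python
# Access Method codes for non-relational definitions, as listed in Table 3-2.
ACCESS_METHODS = {
    "A": "Adabas",
    "D": "IMS DL1",
    "E": "VSAM ESDS",
    "I": "IDMS",
    "K": "VSAM KSDS",
    "O": "IMS ODBA",
    "N": "VSAM RRDS",
    "S": "sequential (SEQ)",
    "W": "DB2 unload file (DB2UNLD)",
    "X": "Datacom",
    "Z": "DB2 data map",
}


def describe_access_method(code: str) -> str:
    """Return the access method name for a one-letter code."""
    try:
        return ACCESS_METHODS[code.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown Access Method code: {code!r}") from None
```

Codes that appear only for extraction map definitions, such as those for relational sources, are intentionally not covered by this lookup.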
- Modify a column data type
- Modify the owner name
- Modify column key relationships
- Add or delete columns
- Add a description of the definition
- Create metadata extensions
Note: If you use the non-relational source in a CDC session, you do not need to add the DTL__CAPXACTION field or include an Update Strategy transformation. PWXPC automatically includes the DTL__CAPXACTION column in its SELECT statement for CDC sources. It then uses the value of DTL__CAPXACTION to construct the appropriate SQL statement (INSERT, UPDATE, or DELETE).

For more information about editing source definitions, see the PowerCenter Designer Guide.
blank File Name. If desired, this field can be populated manually or by re-importing the data map.
The following procedure explains how to manually update the File Name field to add or change the file name in a VSAM or sequential definition.
To manually update the file name field:
1. Double-click the source or target definition in the workspace.
2. Select the Metadata Extensions tab. PWXPC populates the File Name field with the File Name for the PowerExchange data map. The following example shows the metadata extensions for a VSAM definition. The VSAM definition was created prior to PowerCenter V8.5, so the File Name field is blank.
3.
The Edit Metadata Extension Value panel displays allowing you to enter or change the file name.
You can also re-import the PowerExchange data map to automatically populate the File Name field with the information contained in the data map.
The extraction map for the source contains a subset of the actual columns defined for the source. The PowerExchange generated columns like DTL__CAPXTIMESTAMP, DTL__CAPXACTION, and DTL__BI before image columns are needed.
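The PowerExchange-generated columns can be told apart from the source columns by their name prefixes. The following Python sketch groups extraction map column names by prefix. The DTL__CAPX and DTL__BI prefixes come from the column names mentioned above, and DTL__CI is the change indicator prefix mentioned elsewhere in this guide. The function name, the category labels, and the sample column names are hypothetical.

```python
from typing import Dict, Iterable, List

# Prefix-to-category table; the category labels are illustrative.
PREFIXES = (
    ("DTL__CAPX", "capture"),
    ("DTL__BI", "before_image"),
    ("DTL__CI", "change_indicator"),
)


def classify_columns(columns: Iterable[str]) -> Dict[str, List[str]]:
    """Group column names into generated-column categories and source columns."""
    result: Dict[str, List[str]] = {label: [] for _, label in PREFIXES}
    result["source"] = []
    for column in columns:
        for prefix, label in PREFIXES:
            if column.startswith(prefix):
                result[label].append(column)
                break
        else:
            # No generated-column prefix matched.
            result["source"].append(column)
    return result
```

For example, given the hypothetical columns DTL__CAPXTIMESTAMP, DTL__CAPXACTION, DTL__BI_CUST_NAME, and CUST_NAME, only CUST_NAME is reported as a source column.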
You can connect to PowerExchange locally or to a PowerExchange Listener to import an extraction map definition. Connect to the capture source platform location where the extraction maps are stored. When you connect to PowerExchange, the Designer displays the extraction map schemas and tables. Before you connect to PowerExchange, you can filter the metadata the Designer displays by schema and/or data map name. Select an extraction map to create the source definition.
To import an extraction map as a source definition:
1. To import an extraction map source definition, select Sources > Import from PowerExchange and select the CDC Datamaps box and then select the desired source type. The dialog box and parameters displayed are the same for each non-relational source type.
User Name Password Multi-Record Datamaps Source Type CDC Datamaps Schema Map name
2. Optionally, enter a filter to view particular schemas and tables from the database. Enter a schema, table name, or both. Or enter a filter condition to display schemas or tables that meet the filter condition. Use one of the following wildcard characters in the filter condition:
- * (asterisk). Represents one or more characters.
- ? (question mark). Represents one character.
- Enter the filter condition as a prefix. For example, enter A* to display schemas or tables that begin with an A.
- Enter the filter condition as a suffix. For example, enter *A to display schemas or tables that end with an A.
- Enter the filter condition as a substring. For example, enter *cust* to display schemas or tables that contain cust.
Click Connect. The Designer displays the metadata to import. The extraction maps shown will be filtered based on the source type specified.
Figure 3-21. Import from PowerExchange - CDC Select Datamaps List
In this example, only extraction maps for a specific Schema and Map name appear in the Designer.
4.
- Hold down the Shift key to select blocks of tables.
- Hold down the Ctrl key to make non-contiguous selections within a data map.
- Use the Select all button to select all tables.
- Use the Select none button to clear all highlighted selections.
5. Click OK. The source definition appears in the workspace. The Designer uses the data map name as the name of the source definition.
Table 3-3 describes the attributes that the Attributes tab displays for each field in the source definition:
Table 3-3. Attributes of Fields in an Extraction Map Definition
Attribute Name    Description
column_name       Name of the field in the extraction map.
base_rec          Blank.
base_fld          Blank.
base_fld_tpe      Blank.
base_fld_offset   Blank.
The following table describes the extensions on the Metadata Extensions tab for an extraction map definition:
Table 3-4. Extraction Map Definition Metadata Extensions
Extension Name    Description
Access Method     Method you specified in the data map to access the source database:
                  - A = Adabas
                  - B = DB2/390 and DB2/400
                  - D = IMS
                  - E = VSAM ESDS
                  - I = IDMS
                  - K = VSAM KSDS
                  - L = MSSQL
                  - N = VSAM RRDS
                  - P = Oracle
                  - V = DB2/UDB
                  - X = Datacom
                  Any comments.
                  Name of the extraction map.
                  Original table name in relational database or PowerExchange data map.
                  Original schema or owner name in relational database or PowerExchange data map.
                  Name of the schema for the extraction map.
- Modify a column data type.
- Modify the owner name.
- Add or delete columns.
- Add a description of the definition.
- Create metadata extensions.
When using extraction maps, you do not need to add the DTL__CAPXACTION field nor do you need to include an Update Strategy transformation. PWXPC will automatically include the DTL__CAPXACTION column in its SELECT statement for CDC sources. It then uses the value of the DTL__CAPXACTION to construct the appropriate SQL statement (INSERT, UPDATE, or DELETE).
Warning: Changing column information in the extraction map could result in failures in the session or workflow during the extraction process.
For more information about editing source definitions, see the PowerCenter Designer Guide.
- Relational, such as DB2 for DB2 metadata.
- Non-relational, such as PWX_VSAM_NRDB2 for VSAM data maps.
- Extraction Maps, such as PWX_DB2390_CDC for DB2 CDC data maps.

You can preview source or target data in the following Designer tools:
- Source Analyzer. Preview source data in the Source Analyzer after you import the source.
- Target Designer. Preview target data in the Target Designer after you import a target.
- Mapplet Designer. Preview source data in the Mapplet Designer while creating a mapplet.
- Mapping Designer. Preview source and target data in the Mapping Designer while creating a mapping.
For sources and targets other than those accessed through PowerExchange, see the PowerCenter Designer Guide.
1. Select a relational source or target definition in the workspace.
2. Right-click the source or target definition in the workspace and choose Preview Data.
3. Select an ODBC data source name. You can add a new ODBC data source using the ... button.
4. If necessary, enter the Username and Password. For PowerExchange sources and targets on MVS and AS/400, this is only necessary if connecting to a PowerExchange Listener configured with security (SECURITY=1 or SECURITY=2 in DBMOVER).
5. Enter the database table owner name.
6. Enter the number of rows you want to preview. The default is 100. The Preview Data dialog box can display up to 500 rows and up to 65,000 columns.
7. Click Connect. The contents of the table appear in the Preview Data dialog box.
8. To change the number of rows you want to preview, enter a new number and click Refresh.
9. Click Close to exit.
1. Select a non-relational source or target definition in the workspace.
2. Right-click the source or target definition in the workspace and choose Preview Data.
3. Select a Location name. The Location names are retrieved from the NODE statements in the dbmover.cfg file on the Designer platform. To add additional Locations, edit this file and add additional NODE statements.
4. If necessary, enter the Username and Password. For PowerExchange sources and targets on MVS and AS/400, this is only necessary if connecting to a PowerExchange Listener configured with security (SECURITY=1 or SECURITY=2 in DBMOVER).
5. The Schema field is automatically populated with the PowerExchange data map schema name. This can be changed if desired.
6. Enter the number of rows you want to preview. The default is 10. The Preview Data dialog box can display up to 500 rows and up to 65,000 columns.
7. Click Connect. The contents of the table appear in the Preview Data dialog box.
8. To return more data, click More.
9. To change the number of rows you want to preview, enter a new number and click Connect.
10. Click Close to exit.
1. Select an extraction map source definition in the workspace.
2. Right-click the source definition in the workspace and choose Preview Data.
3. Select a Location name. The Location names are retrieved from the NODE statements in the dbmover.cfg file on the Designer platform. To add additional Locations, edit this file and add additional NODE statements.
4. If necessary, enter the Username and Password. For PowerExchange sources and targets on MVS and AS/400, this is only necessary if connecting to a PowerExchange Listener configured with security (SECURITY=1 or SECURITY=2 in DBMOVER).
5. The Schema field is automatically populated with the PowerExchange extraction map schema name. This can be changed if desired.
6. Select either Real Time or Change. Real Time extracts data in real time from the change stream using the earliest starting restart point. Change extracts data from condense files using the earliest starting restart point. See Default Restart Points on page 157 for more information about default restart points.
7. The time out value indicates the maximum time (in seconds) to wait for additional data at the end of log. After this time period expires, EOF is returned and the number of rows requested is displayed. Returning EOF terminates any further retrieval of data from the change stream. The default is 10 seconds. The value can be any number between 0 and 86400. A value of 0 indicates that EOF is returned as soon as the end of log is reached, whereas 86400 indicates that EOF is never returned. Do not specify 86400, because no data is displayed and the Preview Data session hangs until the extraction task in the PowerExchange Listener is stopped.
8. Enter the number of rows you want to preview. The default is 10. The Preview Data dialog box can display up to 500 rows and up to 65,000 columns.
9. Click Connect. The contents of the table appear in the Preview Data dialog box.
10. To return more data, click More. When the end of log is reached, the More button is greyed out.
11. To change the number of rows you want to preview, enter a new number and click Connect.
12. Click Close to exit.
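The time out rules in the steps above can be summarized in a short Python sketch. The function name `describe_preview_timeout` is hypothetical and is not part of PowerExchange or PWXPC; it only encodes the value ranges stated in this procedure.

```python
def describe_preview_timeout(seconds: int) -> str:
    """Explain how a Preview Data time out value is treated (illustrative only)."""
    if not 0 <= seconds <= 86400:
        raise ValueError("time out must be between 0 and 86400 seconds")
    if seconds == 0:
        return "EOF is returned as soon as the end of log is reached"
    if seconds == 86400:
        # The guide warns against this value for Preview Data.
        return "EOF is never returned; avoid this value in Preview Data"
    return f"EOF is returned after waiting {seconds} seconds at the end of log"


print(describe_preview_timeout(10))
```

A check like this could be run before a value is typed into the dialog box, so that 86400 is caught before the Preview Data session hangs.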
The PowerExchange NRDB Batch application connection is used to read the data and uses Group Source functionality. Group Source processing for multi-record non-relational data maps is done for each source definition. Each source definition in a mapping results in a connection to PowerExchange to read the source data. It is possible to have multiple multi-record source definitions in a mapping, and each one can use Group Source processing.
- Import table definitions from the relational database using PowerExchange. For more information, see Working with Relational Source and Target Definitions on page 25.
- Import data map table definitions from PowerExchange (non-relational and DB2/390 sources). For more information, see Working with Non-Relational Source and Target Definitions on page 43.
- Import extraction map definitions from PowerExchange for all source types. For more information, see Working with Extraction Map Definitions on page 57.
- Import relational table definitions using the PowerCenter ODBC interface. For more information, see the PowerCenter Designer Guide.
For non-relational CDC sources, the source metadata must be imported using PWXPC.
Tip: Use extraction maps, or CDC Datamaps, for CDC sources as this eliminates the need to specify the extraction map name in the Session Properties. It also eliminates the need to add any of the special DTL columns: the DTL__CAPX fields, the DTL__CI change indicator fields, and the DTL__BI before image fields. This can significantly simplify the mapping design process.

Group Source functionality is invoked for each source type. A mapping should contain only one source type. Sessions with mappings that contain multiple source types, even if the same change stream is being read, fail with:
PWXPC_10080 [ERROR] All the readers should be of one database type only
For example, a mapping containing both VSAM and IMS sources fails with the PWXPC_10080 message. Create two separate mappings: one for the VSAM sources and one for the IMS sources. If these two mappings are included in two sessions in the same workflow, they result in separate Group Source connections to the change stream: one for VSAM and one for IMS.

Group Source for batch access to non-relational sources requires that the source be imported as a multi-record data map and is used for an individual source definition. With CDC access, Group Source is invoked at a mapping level for all source definitions rather than at an individual source definition level. The invocation of Group Source occurs automatically when a PWX Change or Real-Time connection is used in a session, regardless of the number of sources included in the session. It also occurs automatically if a multi-record source definition exists in a mapping.

The following diagram shows an example of a mapping for three DB2 sources:
Figure 3-25. Group Source Mapping Example
When you include this mapping in a session that uses the PowerExchange DB2 CDC application connection, PowerExchange reads through the change stream a single time, using Group Source, to extract the changes for all three source tables. The changes for each source are provided to the source qualifier in the chronological order in which each unit of work (UOW) completed. When you include this mapping in a session that uses a PowerExchange DB2 relational connection, PowerExchange reads each source table separately. A unique pipeline is created for each source which results in three unique tasks in the PowerExchange Listener.
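The one-source-type-per-mapping rule behind the PWXPC_10080 error can be illustrated with a small Python sketch. It is a planning aid only, not PWXPC code: the function `split_by_source_type` and the source names are hypothetical, and the sketch simply groups source definitions so that each planned mapping holds a single source type.

```python
from collections import defaultdict


def split_by_source_type(sources):
    """Group (name, source_type) pairs so that each group holds one source type."""
    groups = defaultdict(list)
    for name, source_type in sources:
        groups[source_type].append(name)
    return dict(groups)


# Hypothetical source definitions: two VSAM files and one IMS segment.
sources = [("ACCOUNTS", "VSAM"), ("ORDERS", "IMS"), ("CUSTOMERS", "VSAM")]
planned_mappings = split_by_source_type(sources)
print(planned_mappings)
```

Each key in the result corresponds to one mapping, and therefore to one Group Source connection to the change stream when the mappings run in CDC sessions.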
Relational source definitions use a Source Qualifier transformation. Non-relational source definitions use an Application Multi-Group Source Qualifier transformation.
Transformation Datatypes
The transformation datatypes in Source Qualifier and Application Multi-Group Source Qualifier transformations are internal datatypes based on ANSI SQL-92 generic datatypes, which PowerCenter uses to move data across platforms. When the Integration Service reads data from a source, it converts the data from the PowerExchange datatype to the transformation datatype. When you run a session, the Integration Service performs transformations based on the transformation datatypes. When writing data to a target, the Integration Service converts the data based on the datatypes in the target definition.

The transformation datatypes for all ports in the Application Multi-Group Source Qualifier transformation are predefined. You cannot change the datatype for any of the fields in the Application Multi-Group Source Qualifier transformation. For more information about transformation datatypes, see the PowerCenter Designer Guide.
The Lookup transformation import process uses ODBC for non-relational files and relational tables. To use PWXPC to import definitions for non-relational files or relational tables, first import the definitions using the Import from PowerExchange dialog box in either the Source Analyzer or Target Designer prior to configuring the lookup in the mapping.
You can use PWXPC connections to look up both non-relational files and relational tables:
- For relational tables, select the appropriate PWXPC relational connection for the database type, such as PWX DB2390, PWX DB2400, PWX DB2UDB, PWX Microsoft SQL Server, PWX Oracle, or PWX Sybase. See Configuring Connections on page 83 for connection configuration information.
- For non-relational files, select the PWXPC relational connection for NRDB lookups, PWX NRDB Lookup. See NRDB Lookup Relational Connections on page 96 for connection configuration information.
- When using Lookup transformations with a resume recovery strategy, select the Lookup source is static transformation attribute to avoid failures during session execution.
- When using Lookup transformations with IMS databases, careful consideration needs to be given to the fields used to perform the search of the IMS database. Concatenated key (CCK) fields achieve the best performance with the least impact on the IMS database. For more information about search fields for IMS lookups, see Configuring Lookups for IMS on page 76.
- When using Lookup transformations for targets being updated with CDC data in the same mapping, use special custom properties to ensure change stream data is accessible across pipelines. For more information, see Configuring Lookups for CDC Data on page 77.
Concatenated Key (CCK) fields allow PowerExchange to construct a fully qualified Segment Search Argument (SSA), thereby improving IMS database search efficiency. See the PowerExchange IMS Adapter Guide for more information about creating CCK fields. Fields specified in the Lookup condition transformation attribute are used by PowerExchange to create the SSA. For a field to be used in the SSA, it must be marked as a key in the IMS source or target definition in Designer. To provide search keys for IMS database lookups, use the following types of fields in the Lookup condition transformation attribute:
- Concatenated Key (CCK) fields. Specify these fields as keys in the IMS source or target definition and use them in the Lookup condition attribute. Using CCK fields results in the most efficient search of the IMS database.
- Key fields. Specify these fields as keys in the IMS source or target definition and use them in the Lookup condition attribute. You can specify either the CCK field or the key field for the desired segment, as both exist in the IMS source or target definition. If the segment is not the root, a combination of both CCK fields and key fields will likely be needed in the Lookup condition.
- Search fields defined in the IMS Database Definition (DBD). Specify these as keys in the IMS source or target definition and use them in the Lookup condition attribute. If the segment does not have a key, IMS can still scan the IMS segments using an IMS search field. This type of search call is not as efficient as a keyed search with CCK fields or key fields. Assuming the root segment is keyed, include its CCK field with the search fields to limit the amount of data IMS scans and therefore the impact on the database.
- Non-key or non-search fields. The least efficient search method is to mark non-key fields or non-search fields as keys in the IMS source or target definition and use them in the Lookup condition attribute. This causes a scan of the IMS database in order to find a match. This can adversely affect your IMS operational system and therefore should be avoided.
Tip: You can limit the amount of the database that will be scanned by specifying as many CCK and key fields as possible. If using Search fields, include as many CCK fields as possible and, at minimum, the root CCK field. Only use non-key or non-search fields as a last resort.
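The preference order for IMS lookup fields can be expressed as a simple ranking. The following Python sketch is only a review aid built on that order; the category labels (`cck`, `key`, `search`, `other`), the function names, and the field names are all hypothetical and do not correspond to any PowerExchange API.

```python
# Lower rank means a more efficient search, following the order given in the text.
SEARCH_RANK = {"cck": 0, "key": 1, "search": 2, "other": 3}


def least_efficient_field(lookup_fields):
    """Return the (name, kind) pair most likely to force a scan of the database."""
    return max(lookup_fields, key=lambda field: SEARCH_RANK[field[1]])


def uses_last_resort_fields(lookup_fields):
    """True if the lookup condition relies on non-key, non-search fields."""
    return any(kind == "other" for _, kind in lookup_fields)


# Hypothetical lookup condition: root CCK field plus a DBD search field.
condition = [("CCK_ROOT_ID", "cck"), ("CUST_REGION", "search")]
print(least_efficient_field(condition))
print(uses_last_resort_fields(condition))
```

A lookup condition for which `uses_last_resort_fields` returns True is the case the tip above recommends avoiding.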
MergeCDCReaders=Yes
SingleThreadExecutionModel=Yes
These custom properties remove any partition points from the PWXPC CDC Reader through the transformations to the Writer. As a result, the order of the changes read from the change stream is maintained until the changes reach the Writer. This then ensures that any CDC data placed into a dynamic lookup cache is accessible to lookups sharing that cache in other pipelines.
Warning: The use of these custom properties reduces session throughput because all source data is single-threaded from the Reader through to the Writer. Specify these custom properties only when you need to share CDC data stored in a dynamic cache across pipelines.
Chapter 4
Configuring Connections
Overview, 80
DB2 Batch Mode Relational Database Connections, 86
DB2 CDC Mode Application Connections, 89
NRDB Batch Mode Application Connections, 92
NRDB CDC Mode Application Connections, 93
NRDB Lookup Relational Connections, 96
MSSQL Batch Mode Relational Connections, 97
MSSQL CDC Mode Application Connections, 99
Oracle Batch Mode Relational Connections, 102
Oracle CDC Mode Application Connections, 104
Sybase Batch Mode Relational Connections, 107
Configuring Connection Attributes, 108
Understanding Commit Processing with PWXPC, 123
Overview
Before the Integration Service can access a source or target in a session, you must configure connections in the Workflow Manager. When you create or modify a session that reads from or writes to a database or file, you can select only configured source and target databases. Connections are saved in the repository. For PowerExchange Client for PowerCenter, you configure relational database or application connections, depending upon the source or the target type.
Configuring Connections
The connection you configure depends upon the database or data structure. Source and target connections can be configured for:
- Extracting data (Batch) from relational or non-relational sources.
- Extracting changed data (Change or Real-Time) from non-relational or relational sources.
- Loading data (Batch) to a relational target.
- Loading data (Batch) to a non-relational target.
For more information about available connection types, see Table 4-1 on page 80 and Table 4-2 on page 81.
To configure connections:
1. In the Workflow Manager, connect to a PowerCenter repository.
2. To configure a PowerExchange Batch relational database connection, click Connections > Relational. The Relational Connection Browser dialog box appears:
Figure 4-1. Relational Connection Browser
To configure a PowerExchange application connection, click Connections > Application. The Application Connection Browser dialog box appears:
Figure 4-2. Application Connection Browser
In the Select Type field, select the type of connection you want to create. For a list of connections to configure according to data source and extraction mode, see Table 4-1 on page 80. For a list of connections to configure according to target data type, see Table 4-2 on page 81.
3. Click New. The Connection Object Definition dialog box appears. The relational and application dialog boxes are very similar. An application dialog box is shown here:
Figure 4-3. Application Connection Editor
4. Enter the values for the connection attributes. The various connection types are described in subsequent sections in this chapter.
5. Click OK. The new connection appears in the Application or Relational Object Browser.
To edit or delete a relational database or application connection, select the connection from the list and click the appropriate button.
Description
Name of the relational database connection.
Select the code page for the Integration Server to use to extract data from the source database. Note: In Unicode mode, PWXPC sets the code page with this value, overriding any code page specification in the PowerExchange configuration file.
Location of the source or target database, as specified by a node in the PowerExchange configuration file dbmover.cfg.
DB2 subsystem or database instance name.
User name for the database connected to.
Password for the user name.
Commit scope of the transaction. Default is CS.
Overrides any occurrence of the specified filename (in any SQL statement) with the library/filename/member specified, regardless of whether the file is qualified or not. This includes any files qualified using Library List. Separate libraries with semicolons. Note: If both Library List and Database file overrides are specified and a table exists in both, the Database file overrides value takes precedence.
Location Database Name User Name Password Isolation Level Database file overrides
Library List
No
DB2400
List of libraries that PowerExchange searches to qualify the table name for Select, Insert, Delete, or Update statements. PowerExchange searches the list if the table name is unqualified. Separate libraries with semicolons. Note: If both Library List and Database file overrides are specified and a table exists in both, the Database file overrides value takes precedence.
No No
All All
SQL commands run in the database environment. Select to compress source data during the PowerCenter session. See Configuring Encryption and Compression on page 108.
Table 4-3. PWX DB2390, DB2400, and DB2UDB Relational Database Connection Attributes
Connection Attribute Encryption Type Encryption Level Pacing Size Interpret as Rows Bulk Load* DB2390 DB2400 DB2UDB All All All All DB2390
Description
Select the encryption type. See Configuring Encryption and Compression on page 108.
Select the encryption level. See Configuring Encryption and Compression on page 108.
Enter the pacing size. See Configuring Pacing on page 109.
Specifies whether or not the pacing size is in number of rows. See Configuring Pacing on page 109.
Select to cause PowerExchange to load data to DB2 for z/OS targets using the DB2 LOAD utility. If you select this option, you can configure the remaining connection attributes. Otherwise, PowerExchange ignores these attributes.
Specifies the data set prefix PowerExchange uses to create the temporary files needed when using the DB2 LOAD utility to load data into a DB2 table.
Enter one of the following values to allocate MVS space in tracks or cylinders:
- TRACK
- CYLINDER
Default is TRACK.
Value for the primary space on MVS. Default is 0.
Value for the secondary space on MVS. Default is 0.
Specifies how PowerExchange should handle the temporary files it creates when using the DB2 LOAD utility to load data into a DB2 table. Select one of the following values:
- NO does not delete the temporary files.
- BEFORE deletes the temporary files before running the utility.
- AFTER SUCCESS ONLY deletes the temporary files after running the utility if it ends with a return code 0.
- AFTER deletes the temporary files after running the utility.
Default is NO.
Specifies the name of the JCL template for the DB2 LOAD utility on the PowerExchange target system. Default is DB2LDJCL.
Specifies the name of the control file template for the DB2 LOAD utility on the PowerExchange target system. Default is DB2LDCTL.
Filename*
No
DB2390
Space*
Yes
DB2390
JCL Template*
Yes
DB2390
CTL Template*
Yes
DB2390
Connection Attribute Load Options* DB2390 DB2400 DB2UDB DB2390
Required Yes
Description
Specifies how PowerExchange should handle the data when using the DB2 LOAD utility to load data into a DB2 table. Select one of the following values:
- INSERT
- REPLACE
Default is INSERT.
Specifies how PowerExchange should run the DB2 LOAD utility to load data into a DB2 table. Select one of the following values:
- TASK runs the LOAD utility as a sub-task under the PowerExchange Listener.
- JOB submits a separate job to run the DB2 LOAD utility.
- NOSUBMIT creates the files and JCL to run the DB2 LOAD utility but does not submit the job.
Default is TASK.
Specifies how PowerExchange handles the execution of the DB2 LOAD utility. Select one of the following values:
- WAIT waits for the job to end before returning control to PowerCenter. WAIT can only be specified with Mode Type JOB or TASK.
- NO WAIT returns to PowerCenter without waiting for the job to end. NO WAIT can only be specified with Mode Type JOB or NOSUBMIT.
- TIMED waits the number of seconds specified in Time before returning control to PowerCenter. TIMED can only be specified with Mode Type JOB.
- DATAONLY creates the files and JCL to run the DB2 LOAD utility but does not submit the job.
Default is WAIT.
Specifies the wait time in seconds when you select JOB for the Mode Type and TIMED for Mode Time. Valid values are 1 to 99998. Default is 0.
Convert embedded nulls in character fields to spaces. See Converting Character Data to Strings on page 110. Default is no.
Select the write mode. See Configuring Write Mode on page 111. Default is Confirm Write On.
Overrides the default prefix of PWXR for the reject file. PowerExchange creates the reject file on the target machine when the Write Mode is Asynchronous with Fault Tolerance. Note: Specifying PWXDISABLE prevents creation of the reject files. See the PowerExchange Reference Manual for further information.
Mode Type*
Yes
DB2390
Mode Time*
Yes
DB2390
Time*
Yes
DB2390
No
All
No No
All All
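The Mode Type, Mode Time, and Time descriptions in Table 4-3 imply a small compatibility matrix. The Python sketch below encodes only the restrictions stated in the table; the function `check_load_mode` is a hypothetical helper, not a PowerExchange interface, and it makes no claim about combinations the table does not mention (for example, DATAONLY).

```python
# Mode Time value -> Mode Type values it may be combined with, per Table 4-3.
ALLOWED_MODE_TYPES = {
    "WAIT": {"JOB", "TASK"},
    "NO WAIT": {"JOB", "NOSUBMIT"},
    "TIMED": {"JOB"},
}


def check_load_mode(mode_type, mode_time, time_seconds=0):
    """Return a list of problems with a Mode Type / Mode Time / Time combination."""
    problems = []
    allowed = ALLOWED_MODE_TYPES.get(mode_time)
    # DATAONLY has no stated restriction, so it is not checked here.
    if allowed is not None and mode_type not in allowed:
        problems.append(f"{mode_time} cannot be used with Mode Type {mode_type}")
    if mode_time == "TIMED" and not 1 <= time_seconds <= 99998:
        problems.append("Time must be between 1 and 99998 seconds for TIMED")
    return problems


print(check_load_mode("TASK", "WAIT"))       # default combination
print(check_load_mode("TASK", "TIMED", 30))  # TIMED requires Mode Type JOB
```

An empty list means the combination does not violate any rule stated in the table. Note that the default Time of 0 is outside the valid range for TIMED, so a Time value must be set explicitly in that case.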
Encryption Type Encryption Level Pacing Size Interpret as Rows Image Type
Yes No Yes No No
Application Name
No
Both
Yes
Both
Table 4-4. DB2390, DB2400, and DB2UDB CDC Mode Application Connection Attributes
Connection Attribute | Required | Change or Real-time | Description
RestartToken File Name | No | Both | Specify the restart token file. See Configuring CDC Restart Attributes on page 117. Default is the Application Name if specified or the workflow name if Application Name is not specified.
Specify the maximum number of backup copies to keep of the Restart Token File. See Managing Session Log and Restart Token File History on page 175. Default is 0.
Specify the file cache folder to enable recovery for the session. See Enabling Session Recovery on page 165. Default is $PMRootDir/Cache.
Specifies the number of units of work (UOWs) you want PWXPC to read from the source before flushing data to the target. If you enter:
-1 = UOW count is not used
0 = UOW count is not used
n = n is the count of UOWs
See Configuring UOW Count on page 118 and Understanding Commit Processing with PWXPC on page 123. Default is 1.
Reader Time Limit | No | Real Time | Specifies the number of seconds that the Integration Service reads data from the source before stopping. If you enter 0, Reader Time Limit does not limit the reader time. This attribute is intended for testing purposes only. Tip: Use Idle Time instead of Reader Time Limit. Default is 0.
Specifies the number of seconds the PowerExchange Listener remains idle after reaching the end of the change log (as indicated by message PWX-09967) before returning an end-of-file (EOF). If you enter:
-1 = EOF is never returned; the session runs continuously.
0 = EOF is returned at the end of log; the session terminates successfully.
n = n is the number of seconds.
See Configuring Idle Time on page 115. Default is -1.
No
Both
No
Both
No
Both
Idle Time
No
Real Time
Connection Attribute | Required | Change or Real-time | Description
Real-Time Flush Latency | No | Real Time | Specifies the milliseconds between buffer flushes. Valid values are between 0 and 86400 milliseconds. PWXPC sets values between 0 and 2000 to 2000. See Configuring Real-Time Flush Latency on page 119. Default is 0.
Specifies the number of change records (not UOWs) after which a commit should be inserted into the change stream. See Configuring Commit Threshold on page 121. Default is 0.
Overrides the library and journal name in the PowerExchange CAPI_CONNECTION. Specify complete library and journal names in the format: library/journal
Overrides the library and file name in the extraction map. Specify complete library and file names in the format: library/file This attribute overrides the Library/File Override value on the application connection. Warning: Do not specify an asterisk for library name if using PWXPC restart.
Converts embedded nulls in character fields to spaces. See Converting Character Data to Strings on page 110. Default is no.
Specifies the PowerExchange extraction map name used for event processing. See Configuring Event Table Processing on page 114.
Overrides the default CAPI connection name. See Configuring the CAPI Connection Name Override on page 115.
Includes all related PowerExchange log entries in the session log. See Retrieving PWX Log Entries on page 112.
Commit Threshold
No
Real Time
Journal Name*
No
Both
Library/File Override*
No
Both
No
Both
No
Real Time
No
Real Time
No
Both
* These attributes only apply to PWX CDC DB2400 Real Time application connections.
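The special values described in Table 4-4 for UOW Count, Idle Time, and Real-Time Flush Latency can be summarized in a short Python sketch. The function names are hypothetical and are not part of PWXPC; the sketch only restates the value rules from the table, and it treats Idle Time values below -1 as errors because the table does not describe them (an assumption).

```python
def uow_count_in_use(uow_count: int) -> bool:
    """-1 and 0 both mean the UOW count is not used; n > 0 is a count of UOWs."""
    return uow_count > 0


def idle_time_behavior(idle_time: int) -> str:
    """Describe when EOF is returned for a given Idle Time value."""
    if idle_time == -1:
        return "EOF is never returned; the session runs continuously"
    if idle_time == 0:
        return "EOF is returned at the end of log"
    if idle_time > 0:
        return f"EOF is returned after {idle_time} idle seconds at the end of log"
    raise ValueError("Idle Time values below -1 are not described in the table")


def effective_flush_latency(milliseconds: int) -> int:
    """Values from 0 to 2000 are raised to 2000, as stated in the table."""
    if not 0 <= milliseconds <= 86400:
        raise ValueError("Real-Time Flush Latency must be between 0 and 86400")
    return max(milliseconds, 2000)


print(uow_count_in_use(1), idle_time_behavior(-1), effective_flush_latency(0))
```

With the documented defaults (UOW Count 1, Idle Time -1, Real-Time Flush Latency 0), the sketch reports that the UOW count is in use, the session runs continuously, and the effective flush latency is 2000 milliseconds.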
No
Encryption Type Encryption Level Pacing Size Interpret as Rows Image Type
Yes No Yes No No
Application Name
No
Both
Yes
Both
No
Both
No
Both
No
Both
Idle Time
No
Real Time
Commit Threshold
No
Real Time
No
Both
No
Real Time
No
Real Time
No
Both
Encryption Type
No
Encryption Level
No
Pacing Size
No
Interpret as Rows
No
No
Application Name
No
Yes
No
No
No
Idle Time
No
Commit Threshold
No
No No No
Encryption Type Encryption Level Pacing Size Interpret as Rows Image Type
Yes No Yes No No
Application Name
No
Both
Yes
Both
No
Both
No
Both
No
Both
Idle Time
No
Real Time
Commit Threshold
No
Real Time
Instance Name
No
Real Time
Connect String
No
Real Time
No
Both
No
Both
No
Real Time
No
Real Time
No
Both
No
- Encryption and compression
- Pacing
- Convert character data to string
- Write Mode
- Retrieve PWX log entries
- Image type
- Event Table
- CAPI Connection Name
- Idle time
- CDC Restart
- UOW Count
- Real-Time Flush Latency
- Commit Threshold
Configuring Pacing
You can configure the pacing size to slow the data transfer rate from the PowerExchange Listener. The pacing size determines the amount of data the PowerExchange Listener passes to the source or target. Configure the pacing size if an external application, database, or the Integration Service node is a bottleneck during the session. For more information about pacing size, see the PowerExchange Reference Manual. Table 4-14 describes the pacing attributes:
Table 4-14. Pacing Size Connection Attributes
Connection Attribute | Required | Description
Pacing Size | No | Enter the amount of data the source system can pass to the PowerExchange Listener. The lower the value, the faster the session performance. Minimum value is 0. Enter 0 for maximum performance. Default is 0.
Interpret as Rows | No | Select to represent the pacing size in number of rows. If you clear this option, the pacing size represents kilobytes. This option is selected by default.
The application that processes this field uses the x00 as a delimiter and parses the field into three strings:
If this field is read from PowerExchange by PowerCenter, the result would be only the string ABC. The rest of the field would be truncated when the first null indicator is detected in the data. This connection attribute exists to allow these types of fields to be extracted. If selected, embedded null indicators (x00) are converted to spaces (x40). As a result, the example above would result in the field containing the following hexadecimal EBCDIC data:
C1C2C340C4C5C6C740C8C9
which, when read by PowerCenter, results in the string ABC DEFG HI. Table 4-15 describes the Convert character data to string attribute:
Table 4-15. Convert Character Data Connection Attribute
Connection Attribute | Required | Description
Convert character data to string | No | Select to convert embedded nulls in character fields to spaces. Default is to leave embedded nulls as-is.
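The effect of the Convert character data to string attribute can be reproduced in a few lines of Python. This is an illustration only, not how PowerExchange implements the conversion. The input bytes are an assumption derived from the converted value shown above (with x00 in place of each x40), the function name is hypothetical, and `cp037` is used as one common EBCDIC code page; the actual code page depends on the source system.

```python
def convert_embedded_nulls(field: bytes) -> bytes:
    """Replace embedded null indicators (x00) with EBCDIC spaces (x40)."""
    return field.replace(b"\x00", b"\x40")


# Assumed original field content: three strings delimited by x00.
raw = bytes.fromhex("C1C2C300C4C5C6C700C8C9")

# Without conversion, only the data before the first null survives.
truncated = raw.split(b"\x00")[0].decode("cp037")

converted = convert_embedded_nulls(raw)
print(truncated)
print(converted.hex().upper())
print(converted.decode("cp037"))
```

The second printed value matches the hexadecimal data shown in the text, and the third is the string that PowerCenter would see after conversion.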
Asynchronous with Fault Tolerance is only available for PWX DB2390, PWX DB2400, PWX DB2UDB and PWX Oracle relational connections.

Confirm Write On ensures that data is sent synchronously to the PowerExchange Listener (rather than buffered). After a SQL request is sent, the sender then waits for the response from PowerExchange before the next SQL request is sent. This is important if good error recovery is a priority. It has the drawback of slowing data transfer rates. In order to stop session execution when errors are encountered, specify a value larger than 0 in the Session Error handling option Stop on errors on the Config Object tab.

Confirm Write Off sends data asynchronously to the PowerExchange Listener by buffering the data. While this method provides greater speed compared to Confirm Write On, it removes the ability to determine exactly which SQL statement failed in error situations. As a result, you must reload the entire table if an error occurs to ensure data integrity. Use this setting only when loading tables.
Note: The PowerCenter statistics are unreliable when using Confirm Write Off.
Asynchronous (write) with Fault Tolerance combines the speed of Confirm Write Off with the error detection of Confirm Write On. Data is buffered and sent asynchronously to the PowerExchange Listener. A reject file is created on the target machine when SQL errors occur, allowing any errors to be corrected without reloading the entire table. You can also specify how to handle specific SQL return codes. In order to stop session execution when errors are encountered, specify a value larger than 0 in the Session Error handling option Stop on errors on the Config Object tab. See the PowerExchange Reference Manual for a complete description of Asynchronous Write with Fault Tolerance.
Retrieving PowerExchange log entries adds to the PowerCenter session log the messages related to the session that are normally found only in the PowerExchange log. This allows a single log to provide a view of both PowerCenter and PowerExchange processing, speeding diagnosis when errors occur. The PowerExchange messages related to the session are returned in the session log as a part of message PWXPC_10091.
- CDC-Specific Connection Attributes on page 112
- Extracting CDC Data in Change and Real-time Modes on page 135
- Understanding PWXPC Restart and Recovery on page 151
- Enabling Session Recovery on page 165
- Configuring CDC Sessions on page 166
You can configure whether before-image data is extracted for update operations using the Image Type specification. PowerExchange captures before-image and after-image data for all updates, regardless of source type. The before-image data can always be extracted in real-time mode. In change mode, it is possible that only after-image data is available if the changes have been specifically condensed with only after-images. See the appropriate PowerExchange Adapter guide for the source type for additional information about the change Condense process.

If you specify Image Type=BA, the before-image and after-image data of the entire row that was updated are presented as two separate rows: a delete with the before-image data and an insert with the after-image data.

If you specify Image Type=AI, then only after-image data is provided for update records (unless you explicitly request before-image data). With AI processing, updates are passed as update records and not changed to a delete/insert pair as occurs in BA processing.

It is possible, selectively by column, to request that before-image column data be embedded within the after-image update record. When this form of before-image data is used, the change remains an update (as opposed to being changed into a delete/insert pair). When you use embedded before-image columns, you should specify AI for Image Type.

To request that the before-image of a column be embedded into the update row, you must alter the PowerExchange extraction map. In the PowerExchange Navigator, select the columns for which you would like before-image data. This creates before-image columns (DTL__BI_columnname) within the extraction map for the selected columns. The before-image data can then be easily manipulated in your mapping because it is contained in the same update record as the after-image data. One possible use for embedded before-image data is to handle update records where the primary key has been updated.
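The difference between the BA and AI image types described above can be shown with a minimal Python sketch. It models only how one captured update is presented to the mapping; the function `present_update`, the operation labels, and the sample column names are hypothetical and are not PWXPC structures.

```python
def present_update(before: dict, after: dict, image_type: str) -> list:
    """Return the rows produced for one captured update, by Image Type."""
    if image_type == "BA":
        # The update is presented as a delete/insert pair.
        return [("DELETE", before), ("INSERT", after)]
    if image_type == "AI":
        # The update stays an update and carries only the after-image.
        return [("UPDATE", after)]
    raise ValueError("Image Type must be BA or AI")


before = {"CUST_ID": 100, "CITY": "Leeds"}
after = {"CUST_ID": 100, "CITY": "York"}
print(present_update(before, after, "BA"))
print(present_update(before, after, "AI"))
```

With BA, a target that receives both rows sees a delete followed by an insert; with AI, it sees a single update record.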
In some relational databases (such as DB2/390), it is possible to update the primary key (thereby changing the key value). The RDBMS understands that this operation is equivalent to deleting the row and then re-adding it with a new primary key. This activity is logged as an update and so is passed as an update record when extracted. In some circumstances this may cause problems when attempting to apply the update to the target database, because some relational databases do not allow primary key values to be updated. Including the before-image data for key columns allows this type of activity to be detected. The Flexible Key Custom transformation allows it to be handled properly at the target.
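Detecting a primary key update from embedded before-image columns amounts to comparing each key column with its DTL__BI_ counterpart. The Python sketch below illustrates that comparison on a row represented as a dictionary; it is not the Flexible Key Custom transformation, and the function name and the sample column names are hypothetical. Only the DTL__BI_columnname naming pattern comes from the text above.

```python
BI_PREFIX = "DTL__BI_"


def changed_key_columns(update_row: dict, key_columns: list) -> list:
    """List key columns whose embedded before-image differs from the after-image."""
    return [
        column
        for column in key_columns
        if BI_PREFIX + column in update_row
        and update_row[BI_PREFIX + column] != update_row[column]
    ]


# Hypothetical update record with an embedded before-image for the key column.
row = {"CUST_ID": 200, "DTL__BI_CUST_ID": 100, "CITY": "York"}
print(changed_key_columns(row, ["CUST_ID"]))
```

A non-empty result indicates that the update changed a key value, which is the case that may need to be applied at the target as a delete of the old key followed by an insert of the new one.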
Note: To use the Flexible Key Custom transformation, you must configure before-image columns in the PowerExchange extraction map. For more information about Flexible Key Custom transformations, see Flexible Key Custom Transformation on page 177. For additional information about configuring before-image columns, see the PowerExchange Adapter guide for the source type.
By using an event table, you can stop real-time CDC sessions based on an external event. For example, if you want to stop a CDC extraction nightly, after all of the day's changes have been processed, you can make a change to an event table at midnight. When PowerExchange processes the change for the event table, it stops reading changes at that point and shuts down the extraction.
To use event table processing, complete the following tasks:
1. Register the event table for CDC. The event table must be the same source type and on the same machine as the CDC data being extracted. For example, if you are extracting DB2 changes from MVS, then the event table must be a DB2 table in the same DB2 subsystem as the DB2 changes.
2. Specify the extraction map name for the event table in the connection Event Table attribute for the CDC sessions you wish to stop based on an event.
3. When the event occurs, make a change to the event table.
4. When PowerExchange reads the change to the event table, it places an end-of-file (EOF) in the change stream.
5. PWXPC processes the EOF, passes it along to the Integration Service, and shuts down the reader.
6. The Integration Service completes writing all of the data currently in the pipeline and ends the session.
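The stop-on-event behavior in the steps above can be modeled as a reader that ends as soon as it meets a change for the event table. The following Python sketch is a conceptual simulation only, assuming a change stream represented as (extraction map name, row) pairs; the function name and the map names are hypothetical and nothing here calls PowerExchange.

```python
def read_until_event(changes, event_map_name):
    """Yield changes until one arrives for the event table, then stop reading."""
    for map_name, row in changes:
        if map_name == event_map_name:
            # Stands in for the EOF that is placed in the change stream.
            return
        yield map_name, row


# Hypothetical change stream; the third change is made to the event table.
stream = [
    ("d1db2.orders", {"ORDER_ID": 1}),
    ("d1db2.orders", {"ORDER_ID": 2}),
    ("d1db2.stop_event", {"STOP": "Y"}),
    ("d1db2.orders", {"ORDER_ID": 3}),
]
processed = list(read_until_event(stream, "d1db2.stop_event"))
print(len(processed))
```

Changes read before the event table change are still delivered, which mirrors the final step: data already in the pipeline is written before the session ends.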
PowerExchange allows a maximum of eight CAPI_CONNECTION statements in the PowerExchange DBMOVER configuration file. You use multiple CAPI_CONNECTION statements to capture changes from more than one database type through a single PowerExchange Listener on a single machine. For example, you can capture changes for Oracle and UDB using a single PowerExchange Listener using multiple CAPI_CONNECTION statements. You can specify the default CAPI_CONNECTION statement by coding the CAPI_SRC_DFLT statement in the PowerExchange configuration file. You request other CAPI_CONNECTION statements by specifying the CAPI Connection Name Override attribute. See the PowerExchange Reference Manual for additional information about CAPI_CONNECTION statements.
Use the Idle Time terminating condition to indicate whether the real-time session should run continuously (forever) or shut down after a specified period of time. This parameter requires a valid value and has a valid default value.
Configuring Connection Attributes 115
The Idle Time timing starts when the PowerExchange Listener begins reading changed data for the source(s). If -1 is entered for Idle Time, PowerExchange never returns an end-of-file (EOF) to the Integration Service, which causes the session to run forever. This is generally how a real-time session is set up, and -1 is the default value for Idle Time in all of the real-time connections. Continuous extraction sessions must be stopped using either the PowerExchange STOPTASK command or through PowerCenter, using Workflow Monitor Stop/Abort or the pmcmd commands to stop and abort tasks and workflows.
If you stop the session or workflow using the PowerCenter Workflow Monitor or using pmcmd, this is considered a normal termination. PowerCenter performs a graceful stop, instructing the CDC reader and the writers to shut down and waiting until all data currently in the pipeline is processed. For more information about stopping real-time sessions and workflows, see the PowerCenter Workflow Administration Guide.
If you abort the session or workflow using the PowerCenter Workflow Monitor or using pmcmd, this is considered an abnormal termination because PowerCenter does not wait for the reader and writer to shut down or for all data in the pipeline to be processed. For more information about aborting sessions and workflows, see the PowerCenter Workflow Administration Guide.
The PowerExchange STOPTASK command stops the extraction task in the PowerExchange Listener and passes an EOF to the Integration Service, which terminates the session successfully. For more information on the PowerExchange STOPTASK command, see the Task Utility (DTLUTSK) documented in the PowerExchange Utilities Guide.
Warning: Ensure that you have switched the Session Properties Commit Type attribute to Source and unchecked the Commit at End of File attribute. By default, Commit at End of File is on and it will cause data to be committed after the CDC reader has shut down and committed the restart tokens. As a result, when the session is restarted, duplicate data will be sent to the target.
If 0 is entered for Idle Time, PowerExchange returns an EOF to the Integration Service when the end-of-log (EOL) is reached. After the EOF is received, the Integration Service terminates the session successfully, meaning that all data is committed and the restart token file is updated. The end-of-log is determined by what was the current end of the change stream at the point that PowerExchange started to read the change stream. This concept of EOL is required because the change stream is generally not static, so the actual end-of-log is continually moving forward. PowerExchange issues the following message when EOL is reached:
PWX-09967 CAPI i/f: End of log for time <yy/mm/dd> <hh:mm:ss> reached
For example, if a session starts reading a change stream at 10:00 a.m., the EOL at that point is determined. After PowerExchange reaches that point in the change stream, it returns EOF to the Integration Service. This means that changes recorded in the change stream after 10:00 a.m. will not be processed. Specifying 0 for Idle Time is useful in situations where you want to extract changed data for sources periodically as opposed to continuously.
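The end-of-log behavior for an Idle Time of 0 can be pictured with a small Python sketch. It is an illustration only and not PowerExchange code: the change stream is modeled as a plain list, and the names extract_to_eol and on_change are hypothetical. What it shows is that the EOL is fixed once, when reading starts, so changes that arrive during the run are left for a later session.

```python
def extract_to_eol(change_stream, on_change):
    """Process changes up to the end-of-log captured when reading starts.

    change_stream is a list standing in for the change stream (an assumption
    made for illustration); on_change is called once per change processed.
    Returns the EOL position that was captured at the start.
    """
    eol = len(change_stream)  # EOL is determined once, at the start of the read
    for position in range(eol):
        on_change(change_stream[position])
    return eol  # reaching this point corresponds to returning EOF


# The stream holds three changes when the session starts.
stream = ["change-1", "change-2", "change-3"]
processed = []


def handle(change):
    processed.append(change)
    # Simulate the source system recording a new change while we are reading.
    stream.append("late-" + change)


eol = extract_to_eol(stream, handle)
```

In this run only the three changes that existed at the start are processed, even though the list has grown to six entries by the time the read ends.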
116 Chapter 4: Configuring Connections
If a positive number is specified for Idle Time, then the session will run until no data is returned for the period of time specified. After Idle Time is reached, PowerExchange will send an end-of-file (EOF) to the Integration Service and the session will terminate successfully. Specifying a low Idle Time (1 for example) can result in this time being reached before all available data in the change stream has been read. The following message is issued to indicate that the Idle Time has been reached:
[PWXPC_10072] [INFO] [CDCDispatcher] session ended after waiting for [idle_time] seconds. Idle Time limit is reached
This message is also issued when a continuous extraction is stopped using the PowerExchange STOPTASK command. In this case, the idle_time variable in the message will show 86400 which is the never expire time limit used when an Idle Time of -1 is specified.
Tip: In highly active systems, a positive value for Idle Time may never match. Use 0 if you do not want the session to run continuously.
For example, if you specify an Idle Time of 10 seconds and PowerExchange finds no data for the source(s) in the change stream for a 10 second period, PowerExchange will return an EOF to the Integration Service which will cause it to terminate successfully.
If you specify values for Reader Time Limit and Idle Time, the Integration Service stops reading data from the source at the point based on whichever one of these terminating conditions is reached first. So, if the Reader Time Limit is reached prior to the Idle Time limit, the session will stop at that point regardless of the fact that Idle Time has not yet been reached.
Warning: Reader Time Limit does not result in normal termination of a CDC session. Use Idle Time instead of Reader Time Limit.
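As a rough model of a positive Idle Time combined with a Reader Time Limit, the following Python sketch finds which terminating condition is reached first for a given pattern of change arrivals. It is a simplified illustration, not product code: the function name session_stop, the use of None for "no Reader Time Limit", and the treatment of time as seconds since reading began are all assumptions of the sketch.

```python
import math


def session_stop(arrival_times, idle_time, reader_time_limit=None):
    """Return (stop_time, reason) for a positive Idle Time, in seconds.

    arrival_times are the times at which changes arrive, measured from the
    moment reading starts. reader_time_limit=None means no limit is set
    (a convention of this sketch only).
    """
    if idle_time <= 0:
        raise ValueError("this sketch only models a positive Idle Time")
    limit = math.inf if reader_time_limit is None else reader_time_limit
    last_activity = 0  # idle timing starts when reading begins
    for t in sorted(arrival_times):
        # Stop looking at arrivals once a terminating condition is hit first.
        if min(last_activity + idle_time, limit) <= t:
            break
        last_activity = t
    idle_deadline = last_activity + idle_time
    if limit < idle_deadline:
        return limit, "reader_time_limit"
    return idle_deadline, "idle_time"


arrivals = [2, 5, 9, 30]
idle_only = session_stop(arrivals, idle_time=10)
with_limit = session_stop(arrivals, idle_time=10, reader_time_limit=8)
```

With an Idle Time of 10 seconds and no limit, the gap after the change at 9 seconds ends the run at 19 seconds, so the change at 30 seconds is not read. With a Reader Time Limit of 8 seconds, that limit is reached first.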
All of these parameters require a valid value and all have valid default values. There are numerous CDC reader application connection attributes that specify restart information. PWXPC uses the restart information to tell PowerExchange from which point to start reading the captured changed data.
Warning: Care must be taken when using the Application Name default as it may not result in a unique name for the application name. It is imperative that the Application Name value and the Restart Token File Name be unique for every session. Results are unpredictable and include session failures and potential data loss if a non-unique name is used for either of these attributes.
For more information about restart token files, see Configuring the Restart Token File on page 162.
A unit of work (UOW) is a collection of changes within a single commit scope made by a transaction on the source system. Each unit of work may consist of a different number of changes. This parameter requires a valid value and has a valid default value.
When the session runs, the PWXPC reader begins to read data from the PowerExchange Listener. After data is provided to the source qualifier, the UOW Count begins. When you use a non-zero value for the UOW Count session attribute, PWXPC issues a commit to the target when it reaches the number of units of work specified in this terminating condition. When the UOW Count is reached, a real-time flush will be triggered to flush the buffers so that the changed data is committed to the target. The following message appears in the session log to indicate that this has occurred:
[PWXPC_10081] [INFO] [CDCDispatcher] raising real-time flush with restart tokens [restart1_token], [restart2_token] because UOW Count [uow_count] is reached
The commit to the target when reading CDC data is not strictly controlled by the UOW Count specification. The Real-Time Flush Latency and the Commit Threshold values also determine the commit frequency. To understand the effect that all of these values have on commit processing, see Understanding Commit Processing with PWXPC on page 123.
For example, if the value for UOW Count is 10, the Integration Service commits all data read from the source to its target after the 10th unit of work is processed (assuming the Real-Time Flush Latency period has not expired first). The lower you set the value, the faster the Integration Service commits data to the target. Therefore, if you require the lowest possible latency for applying changes to the target, you should specify a UOW Count of 1.
Warning: When you specify a low UOW Count value, the session might consume more system resources on the target platform. This is because it will commit to the target more frequently. You need to balance performance and resource consumption with latency requirements when choosing the UOW Count and Real-Time Flush Latency values.
Use the Real-Time Flush Latency terminating condition to control the target commit latency when running in real-time mode. PWXPC commits source data to the target at the end of the specified maximum latency period. This parameter requires a valid value and has a valid default value.
When the session runs, PWXPC begins to read data from the source. After data is provided to the source qualifier, the Real-Time Flush Latency interval begins. At the end of each Real-Time Flush Latency interval, once an end-UOW boundary is reached, PWXPC issues a commit to the target. The following message appears in the session log to indicate that this has occurred:
[PWXPC_10082] [INFO] [CDCDispatcher] raising real-time flush with restart tokens [restart1_token], [restart2_token] because Real-time Flush Latency [RTF_millisecs] occurred
Only complete UOWs are committed during real-time flush processing. The commit to the target when reading CDC data is not strictly controlled by the Real-Time Flush Latency specification. The UOW Count and the Commit Threshold values also determine the commit frequency. To understand the effect that all of these values have on commit processing, see Understanding Commit Processing with PWXPC on page 123.
The value specified for Real-Time Flush Latency also controls the PowerExchange Consumer API (CAPI) interface timeout value (PowerExchange latency) on the source platform. The CAPI interface timeout value is displayed in the following PowerExchange message on the source platform (and in the session log if Retrieve PWX Log Entries is specified in the Connection Attributes):
PWX-09957 CAPI i/f: Read times out after <n> seconds
The CAPI interface timeout also affects latency because it determines how quickly changes are returned to the PWXPC reader by PowerExchange. PowerExchange ensures that it returns control to PWXPC at least once every CAPI interface timeout period. This allows PWXPC to regain control and, if necessary, perform the real-time flush of the data returned. A high RTF Latency specification also impacts the speed with which stop requests from PowerCenter are handled, because the PWXPC CDC Reader must wait for PowerExchange to return control before it can handle the stop request.
Tip: Use the PowerExchange STOPTASK command to shut down more quickly when using a high RTF Latency value.
For example, if the value for Real-Time Flush Latency is 10 seconds, PWXPC issues a commit for all data read after 10 seconds have elapsed and the next end-UOW boundary is received. The lower you set the value, the faster PWXPC commits data to the target. Therefore, if you require the lowest possible latency for applying changes to the target, you should specify a low Real-Time Flush Latency value.
Warning: When you specify a low Real-Time Flush Latency interval, the session might consume more system resources on the source and target platforms. This is because:
The session will commit to the target more frequently, therefore consuming more target resources.
PowerExchange will return more frequently to the PWXPC reader, thereby passing fewer rows on each iteration and consuming more resources on the source PowerExchange platform.
You need to balance performance and resource consumption with latency requirements when choosing the UOW Count and Real-Time Flush Latency values.
Commit Threshold is only applicable to Real-Time CDC sessions. Use the Commit Threshold terminating condition to cause commits before reaching the end of the UOW when processing large UOWs. This parameter requires a valid value and has a valid default value.
Commit Threshold can be used to cause a commit before the end of a UOW is received, a process also referred to as sub-packet commit. The value specified in the Commit Threshold is the number of records within a source UOW to process before inserting a commit into the change stream. This attribute is different from the UOW Count attribute in that it is a count of records within a UOW rather than of complete UOWs. The Commit Threshold counter is reset when either the number of records specified or the end of the UOW is reached. This attribute is useful when there are extremely large UOWs in the change stream that might cause locking issues on the target database or resource issues on the PowerCenter Integration Service.
The Commit Threshold count is cumulative across all sources in the group. This means that sub-packet commits are inserted into the change stream when the count specified is reached, regardless of the number of sources to which the changes actually apply. For example, a UOW contains 900 changes for one source followed by 100 changes for a second source and then 500 changes for the first source. If you set the Commit Threshold to 1000, the commit record is inserted after the 1000th change record, which is after the 100 changes for the second source.
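The 900/100/500 example above can be traced with a short Python sketch. This is an illustrative model only, not PowerExchange logic: the function name sub_packet_commit_points, the source names SOURCE1 and SOURCE2, and the choice not to insert an extra commit record when the threshold coincides with the end of the UOW are assumptions of the sketch.

```python
def sub_packet_commit_points(uow_changes, commit_threshold):
    """Return (position, source) pairs after which a commit record is inserted.

    uow_changes lists (source_name, change_count) runs of one UOW in stream
    order. Positions are 1-based record numbers within the UOW. A threshold
    of 0 means commits occur on UOW boundaries only.
    """
    points = []
    if commit_threshold <= 0:
        return points
    total = sum(count for _, count in uow_changes)
    position = 0
    counter = 0  # cumulative across all sources in the group
    for source, count in uow_changes:
        for _ in range(count):
            position += 1
            counter += 1
            # The end of the UOW is already a commit point, so skip it here.
            if counter == commit_threshold and position < total:
                points.append((position, source))
                counter = 0
    return points


uow = [("SOURCE1", 900), ("SOURCE2", 100), ("SOURCE1", 500)]
points = sub_packet_commit_points(uow, commit_threshold=1000)
```

The single commit record lands after record 1000, which is the last of the 100 changes for the second source, because the counter does not restart when the source changes.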
Warning: A UOW may contain changes for multiple source tables. Using Commit Threshold can cause commits to be generated at points in the change stream where the relationship between these tables is inconsistent. This may then result in target commit failures.
If 0 or no value is specified, commits will occur on UOW boundaries only. Otherwise, the value specified is used to insert commit records into the change stream between UOW boundaries, where applicable. The value of this attribute overrides the value specified in the PowerExchange DBMOVER configuration file parameter SUBCOMMIT_THRESHOLD. For more information on this PowerExchange parameter, see the PowerExchange Reference Manual.
The commit to the target when reading CDC data is not strictly controlled by the Commit Threshold specification. The commit records inserted into the change stream as a result of the Commit Threshold value affect the UOW Count counter. The UOW Count and the Real-Time Flush Latency values determine the target commit frequency. For example, a UOW contains 1,000 change records (any combination of inserts, updates, and deletes). If you specify 100 for Commit Threshold and 5 for UOW Count, then a commit record will be inserted after each 100 records and a target commit will be issued after every 500 records. To understand the effect that all of these values have on commit processing, see Understanding Commit Processing with PWXPC on page 123.
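The arithmetic in the example above (Commit Threshold of 100, UOW Count of 5, target commit every 500 records) can be checked with a small Python model. It is a sketch under stated assumptions: Real-Time Flush Latency is ignored, every commit point (a sub-packet commit record or an end of UOW) counts once toward UOW Count, and the function name target_commit_positions is hypothetical.

```python
def target_commit_positions(uow_sizes, commit_threshold, uow_count):
    """Return cumulative record positions at which a target commit is issued.

    uow_sizes lists the number of change records in each UOW, in order.
    Real-Time Flush Latency is deliberately left out of this model.
    """
    commits = []
    position = 0       # records read so far across all UOWs
    commit_points = 0  # commit points counted toward UOW Count
    for size in uow_sizes:
        since_point = 0  # Commit Threshold counter, reset at each commit point
        for record in range(1, size + 1):
            position += 1
            since_point += 1
            end_of_uow = record == size
            threshold_hit = commit_threshold > 0 and since_point == commit_threshold
            if end_of_uow or threshold_hit:
                since_point = 0
                commit_points += 1
                if commit_points == uow_count:
                    commits.append(position)
                    commit_points = 0
    return commits


with_threshold = target_commit_positions([1000], commit_threshold=100, uow_count=5)
without_threshold = target_commit_positions([1000] * 5, commit_threshold=0, uow_count=5)
```

With the threshold set, the one 1,000-record UOW produces target commits after records 500 and 1,000. Without it, five such UOWs must complete before the first target commit.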
Warning: Duplicate data on targets can occur if the Commit On End Of File option in the Session Properties is enabled. To prevent this, change the Commit Type to Source and then disable the Commit On End Of File option in the Session Properties. This ensures that PWXPC controls commit processing, so that the target data and restart tokens stay in sync.
Source-based commit sessions have partitioning restrictions. For more information, see the PowerCenter Workflow Administration Guide.
There are three connection attributes which affect commit processing when running real-time CDC sessions:
UOW Count
Real-Time Flush Latency
Commit Threshold
Note: When using PWXPC CDC Change connections, the only connection attribute that affects commit processing is UOW Count.
With the exception of Commit Threshold, all source-based commits with PWXPC are done on end-UOW boundaries. Commit Threshold exists to provide sub-packet commit capability; that is, to commit data after a specified number of records within a single UOW.
If you specify values for UOW Count and Real-Time Flush Latency, then PWXPC issues commits to the target when it reaches the UOW Count or when the Real-Time Flush Latency period expires, whichever occurs first. When the commit is issued, both the Real-Time Flush (RTF) Latency period and the UOW Count counter are reset. PWXPC continues to read data from the source until either the RTF Latency period or the UOW Count matches, at which point it issues another commit. This processing continues until the session terminates.
Commits inserted into the change stream as a result of the Commit Threshold specification are counted in the UOW Count counter. Commit Threshold itself does not result in PWXPC issuing a commit to the target. Only UOW Count and Real-Time Flush Latency will cause that to happen. A final commit is also done by PWXPC during reader termination to ensure that all buffered and complete UOWs are committed.
Idle Time can impact the commit process by causing the real-time session to terminate. For example, you set the Idle Time value to 10 seconds and the UOW Count to five units of work. When PWXPC reaches the UOW Count, it commits data to the target and continues to read data from the source. If the Idle Time is then reached, PowerExchange stops reading from the source and sends an EOF to the Integration Service, which terminates the session. If only three complete UOWs have been read since the previous UOW count-based commit, these will be committed when the session terminates due to the final commit done by PWXPC.
To illustrate the interaction of all of the values affecting commit processing, assume the following settings:
Idle Time is -1 (never timeout)
Commit Threshold is 0 (no sub-packet commit)
RTF Latency is 5000 (5 seconds)
UOW Count is 1,000
The PWXPC reader receives 900 complete UOWs in five seconds after the first change row enters the source qualifier. Because the RTF Latency value has matched, a source-based commit is issued at this point. Both the UOW Count and RTF Latency period are then reset. So, another commit will not be issued when the 1,000th UOW is read. The next 1,000 UOWs are read in 4 seconds. So, a commit is issued after these 1,000 UOWs because the UOW Count matched. Again, the RTF Latency period and the UOW Count are reset at this point. More changed data is read and commits continue based on whichever attribute matches first.
So, in this example, commits were issued after the first 900 UOWs because RTF Latency matched first and then again after the 1,900th UOW because the UOW Count then matched first. It is therefore possible to have both the Real-Time Flush Latency period and the UOW Count control commits of the data. Commits are always done on a UOW boundary (except when Commit Threshold is specified), based on whichever attribute matches first.
For the lowest latency in getting changed data to the target, use an RTF Latency of 2000 (the default) and a UOW Count of 1. This will cause a commit at each commit point in the source data. Since the RTF Latency will only commit on a UOW boundary and a UOW Count of 1 causes a commit after each complete UOW is received, the effect is to have only UOW Count control the commit process. Of course, this is the most resource-intensive setting from the target DBMS perspective.
The RTF Latency specification controls:
The commit latency for the target tables
The PowerExchange Consumer API (CAPI) interface timeout value (PowerExchange latency) on the source platform
In addition to impacting the target latency, high values for RTF Latency will impact the speed with which stop requests from PowerCenter are handled. The PWXPC CDC Reader must wait for PowerExchange to return control before it can handle the stop request.
Tip: Use the PowerExchange STOPTASK command to shut down more quickly when using a high RTF Latency value.
If you want to use fewer resources on the target system and want only UOW Count to control the commit process, then ensure that Real-Time Flush Latency is set sufficiently high so as not to be a factor. The value necessary to do this is customer-dependent. The size of the UOWs and the speed at which they are read will affect what value represents a high enough RTF Latency period to eliminate it as a factor.
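The worked example earlier in this section (RTF Latency of 5000, UOW Count of 1,000, 900 UOWs in five seconds followed by 1,000 UOWs in four seconds) can be replayed with a short Python simulation. It is a simplified model, not PWXPC code: the function name commit_points is hypothetical, Commit Threshold and the final commit at reader termination are left out, and the latency period is assumed to restart at the moment of each commit and to be checked only on end-UOW boundaries.

```python
def commit_points(uow_end_times_ms, uow_count, rtf_latency_ms):
    """Return the 1-based UOW numbers after which a commit is issued.

    uow_end_times_ms holds the time, in milliseconds since the first change
    row reached the source qualifier, at which each complete UOW was received.
    """
    commits = []
    interval_start = 0  # start of the current RTF Latency period
    uows_since_commit = 0
    for number, end_time in enumerate(uow_end_times_ms, start=1):
        uows_since_commit += 1
        count_matched = uows_since_commit >= uow_count
        latency_matched = end_time - interval_start >= rtf_latency_ms
        if count_matched or latency_matched:
            commits.append(number)
            # Both the counter and the latency period are reset on commit.
            uows_since_commit = 0
            interval_start = end_time
    return commits


# 900 UOWs arrive in the first 5 seconds, then 1,000 more in the next 4 seconds.
first_burst = [(i * 5000) // 900 for i in range(1, 901)]
second_burst = [5000 + 4 * j for j in range(1, 1001)]
result = commit_points(first_burst + second_burst, uow_count=1000, rtf_latency_ms=5000)
lowest_latency = commit_points([10, 20, 30], uow_count=1, rtf_latency_ms=2000)
```

The simulation commits after the 900th UOW (latency matched first) and again after the 1,900th UOW (count matched first). With a UOW Count of 1, every UOW is committed and the latency setting never comes into play.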
Chapter 5
Overview, 128
Extracting Data in Batch Mode, 129
Extracting CDC Data in Change and Real-time Modes, 135
Loading Data to PowerExchange Targets, 142
Overview
After you create mappings in the PowerCenter Designer, you can create a session and use the session in a workflow to extract, transform, and load data. You create sessions and workflows in the Workflow Manager. You can create a session in a workflow to extract data in batch, change, or real-time mode. You determine how you want the Integration Service to extract the data when you configure the session. You can also create a session to load data to a target. For more information about creating, configuring, and scheduling workflows, see the PowerCenter Workflow Administration Guide.
Pipeline Partitioning
Pipeline partitioning cannot be used for sources in CDC sessions. You can use it for targets in CDC sessions. For more information about partitioning and a list of all partitioning restrictions, see the PowerCenter Workflow Administration Guide.
Constraint-Based Loading for Relational Targets on page 129. PowerCenter Workflow Administration Guide.
1. In the Task Developer, double-click a session with a non-relational source to open the session properties.
2. Click the Sources view on the Mapping tab.
3. In the Reader field of the Readers settings, the PowerExchange Batch Reader for the specific source type will be shown. The Reader names for non-relational batch sources have one of the following formats:
PowerExchange Batch Reader for <database_type>
PowerExchange Reader for <database_type>

ADABAS
ADABAS Unload Files
DB2 Datamaps
DB2 Unload Datasets
DATACOM
IDMS
IMS
Sequential Files
VSAM Files
The name of the reader cannot be altered. The only exception to this is Adabas where you can choose between ADABAS and ADABAS Unload Files. For example, a VSAM source looks as follows:
Figure 5-1. Session Mapping Tab - Batch VSAM Reader
4. In the Connection Value field, select the application connection from the available PWX NRDB Batch connections.
5. Click the Sources view on the Mapping tab.
6. In the Properties settings, configure the following attributes (all fields are optional except where noted):
Attribute Name | Source Type | Description
Schema Name Override | All | Overrides the source PowerExchange data map schema name.
Map Name Override | All | Overrides the source PowerExchange data map name.
File Name | ADABAS Unload | Specifies the file name of the unloaded Adabas database. Required for ADABAS Unload.
ADABAS Password | ADABAS | Password for the ADABAS database.
Database Id Override | ADABAS, ADABAS Unload | Overrides the ADABAS Database Id in the PowerExchange data map.
File Id Override | ADABAS, ADABAS Unload | Overrides the Adabas file id in the PowerExchange data map.
DB2 Sub System Id | DB2 Datamaps | Overrides the DB2 instance name in the PowerExchange data map.
DB2 Table name | DB2 Datamaps | Overrides the DB2 Table name in PowerExchange data map.
Unload File Name | DB2 Unload Datasets | Overrides the unload file name in PowerExchange data map.
Filter Overrides | All | Allows data to be filtered at the source by PowerExchange. Enter a filter override using the following syntax:
<filter condition1>;<filter condition2>;...
Use the <group name> syntax to limit the application of the filter to a specific record. If you do not specify <group name> and the source mapping is a multi-record source, then the filter condition applies to all records in the source mapping. For example, you can select only records with ID column values that contain DBA for a multi-record source with two records called USER1 and USER2 by specifying either:
USER1=ID=DBA;USER2=ID=DBA
- or -
ID=DBA
See Filtering Source Data Using PWXPC on page 230.
IMS Unload File Name | IMS | IMS database unload file name. Required if you want to read source data from the backup file instead of the IMS database.
File Name Override | | Overrides the data set or file name in the PowerExchange data map. Enter the complete data set or file name. For the AS400, this name should be: library_name/file_name. If you selected the Filelist File check box, enter the filelist file name in the File Name Override.
Filelist File | VSAM, SEQ | Select if the File Name Override field contains the data set name of a list of files. Only select this option if you have entered a filelist file for File Name Override.
SQL Query Override | All | Overrides the SQL query sent to PowerExchange, including any Filter Overrides.
For information about other session properties, see the PowerCenter Workflow Administration Guide.
7. Click OK.
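The Filter Overrides syntax described in the attribute table for this procedure is a semicolon-separated list of conditions, each optionally prefixed with a group (record) name. The following Python helper is a minimal sketch for assembling such a string before pasting it into the session attribute. The function name build_filter_override is hypothetical, and the sketch does no validation of the conditions themselves.

```python
def build_filter_override(conditions):
    """Join filter conditions into a single Filter Overrides string.

    Each item is either a condition string such as "ID=DBA" or a
    (group_name, condition) pair that limits the condition to one record.
    """
    parts = []
    for item in conditions:
        if isinstance(item, str):
            parts.append(item)
        else:
            group_name, condition = item
            parts.append(f"{group_name}={condition}")
    return ";".join(parts)


# Limit the filter to the USER1 and USER2 records of a multi-record source.
per_record = build_filter_override([("USER1", "ID=DBA"), ("USER2", "ID=DBA")])
# Apply the same condition to all records in the source mapping.
all_records = build_filter_override(["ID=DBA"])
```

The two results correspond to the two equivalent forms shown in the Filter Overrides description.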
1. In the Task Developer, double-click a session with a relational source to open the session properties.
2. Click the Sources view on the Mapping tab.
3. In the Reader field of the Readers settings, select Relational Reader. For DB2 on z/OS, you can also select PowerExchange Reader for DB2 Image Copy.
For example, the available readers for a DB2 source look as follows:
Figure 5-2. Session Mapping Tab - DB2 Readers
Note: The PowerExchange Reader for DB2 Image Copy is only applicable to DB2 for z/
4. In the Connection Value field, select PWX NRDB Batch application connection if using the DB2 Image Copy reader. Select the appropriate PWX relational database connection if using the relational reader:
PWX DB2390 for DB2 for z/OS
PWX DB2400 for DB2 for i5/OS
PWX DB2UDB for DB2 for Linux, UNIX, and Windows
PWX Oracle for Oracle
PWX MSSQLServer for Microsoft SQL Server
PWX Sybase for Sybase
5. You can configure the following attributes for the PowerExchange Reader for DB2 Image Copy:
Attribute Name | Description
Schema Name Override | Overrides the source schema name.
Map Name Override | Overrides the source table name.
DB2 Sub System Id | Overrides the DB2 instance name in the PowerExchange data map.
Image Copy Dataset | Provides the image copy data set name. If not specified, the most current image copy data set is used.
Disable Consistency Checking | Overrides the unload file name in PowerExchange data map.
Filter Overrides | Allows data to be filtered at the source by PowerExchange. Enter a filter override using the following syntax:
<filter condition1>;<filter condition2>;...
For example, you can select only records with ID column values that contain DBA by specifying:
ID=DBA
See Filtering Source Data Using PWXPC on page 230.
SQL Query Override | Overrides the SQL query sent to PowerExchange, including any filter overrides.
For information about other session properties, see the PowerCenter Workflow Administration Guide.
6. Click OK.
7.x can still select either Batch or CDC readers and application connections. Non-relational source definitions imported in PowerCenter 8.x automatically have the appropriate Batch Reader selected for the source type. This reader selection cannot be changed. In order to properly configure CDC sessions, review the following topics:
CDC-Specific Connection Attributes on page 112
Understanding Commit Processing with PWXPC on page 123
Extracting CDC Data in Change and Real-time Modes on page 135
Understanding PWXPC Restart and Recovery on page 151
Configuring the Restart Token File on page 162
Enabling Session Recovery on page 165
Configuring CDC Sessions on page 166
the only choice if you want to use a CDC reader. It is no longer possible to select a CDC reader with sources created from PowerExchange data maps in PowerCenter V8.x (that is, sources with Database Type of PWX_source_NRDB2).
If you want to extract change data from a multi-record non-relational source using extraction maps, you must create a PowerExchange capture registration for every table in the data map. This creates an extraction map for each table. You can then either import the data map as a multi-record non-relational source (for batch usage) or import the extraction maps for each table (for CDC usage).
include sources with multiple data types will fail when executed.
The Custom Property can be set in the session Config Object tab in the Custom Properties attribute. This property can also be set in the Integration Service making it applicable to all workflows and sessions that use that Integration Service. See Knowledge Base Article #18015 for information about how to set Custom Properties in the Integration Service. If you are using full constraint-based loading, then your mapping must not contain active transformations which change the Row Id generated by the CDC Readers. The transformations that change the Row Id are: Aggregator, Custom (configured as active), Joiner, Normalizer, Rank, and Sorter transformations. All other transformations can be used. For more real-time mode limitations, see the PowerCenter Workflow Administration Guide. For more information about constraint-based loading, see the PowerCenter Workflow Administration Guide.
1. In the Task Developer, double-click the session to edit it.
2. Click the Properties tab and change the following:
Commit Type field - Change to Source.
Commit on End of File field - Clear this field to turn this off.
To enable recovery for the session, change the Recovery Strategy attribute to Resume from last checkpoint. Enabling recovery for CDC sessions is important to ensure that data and restart tokens are properly handled. For more information, see Restart and Recovery on page 149.
3. Click the Sources view on the Mapping tab. For relational sources, you will have to choose the desired CDC Reader (see page 139). With extraction map sources, the reader is automatically chosen based on the source type of the extraction map. In this example, the extraction map used in the mapping is for a DB2/390 source so the PowerExchange CDC Reader for DB2/390 is chosen:
Figure 5-4. Session Mapping Tab - Extraction Map Source
In the Connection Value field, select CDC Real Time or CDC Change application connection types. PWXPC displays the valid connections for the source type in the Application Connection Browser.
4. Optionally, open the application connection to override any connection values. See Configuring Connections on page 79.
5. In the Session Properties settings, configure the following optional attributes:
Attribute Name | Source Type | Description
Schema Name Override | All | Overrides the source PowerExchange extraction map schema name.
Map Name Override | All | Overrides the source PowerExchange extraction map name.
Database Id Override | ADABAS | Overrides the ADABAS Database Id in the PowerExchange data map.
File Id Override | ADABAS | Overrides the Adabas file id in the PowerExchange data map.
Library/File Override | | Overrides the library and file name in the extraction map. Specify complete library and file names in the format: library/file. This attribute overrides the Library/File Override value on the application connection. Warning: Do not specify an asterisk for library name if using PWXPC restart.
Source Schema Override | Oracle | Overrides the source schema name.
Filter Overrides | | Allows data to be filtered at the source by PowerExchange. Enter a filter override using the following syntax:
<filter condition1>;<filter condition2>;...
For example, you can ask PowerExchange to filter change records so that only ones where columns ID and ACCOUNT have changed are passed by specifying:
DTL__CI_ID=Y;DTL__CI_ACCOUNT=Y
See Filtering Source Data Using PWXPC on page 230.
SQL Query Override | | Overrides the SQL query sent to PowerExchange, including any Filter Overrides.
For information about other properties settings, see the PowerCenter Workflow Administration Guide.
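The change-indicator filter in the example above follows a regular pattern: the column name prefixed with DTL__CI_ and compared to Y, with conditions joined by semicolons. A minimal Python sketch that reproduces that string for a list of columns is shown below. The function name changed_columns_filter is hypothetical, and the sketch only mirrors the pattern visible in the example rather than a documented API.

```python
def changed_columns_filter(column_names):
    """Build the Filter Overrides string from the example for the given columns.

    Each column becomes a DTL__CI_<column>=Y condition, following the pattern
    shown in the filter example; conditions are separated by semicolons.
    """
    return ";".join(f"DTL__CI_{name}=Y" for name in column_names)


filter_override = changed_columns_filter(["ID", "ACCOUNT"])
```

For the columns ID and ACCOUNT this yields the same string as the example in the Filter Overrides description.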
To configure a change or real-time mode session (relational sources):
1. In the Task Developer, double-click the session to edit it.
2. Click the Properties tab and change the following:
Commit Type field - Change to Source.
Commit on End of File field - Clear this field to turn this off.
See Figure 5-3 on page 137 for an example of the Properties tab. To enable recovery for the session, change the Recovery Strategy attribute to Resume from last checkpoint. Enabling recovery for CDC sessions is important to ensure that data and restart tokens are properly handled. For more information, see Restart and Recovery on page 149.
3. In the Reader field of the Readers settings, select a CDC Reader from those available based on the source type:
DB2390: PowerExchange CDC Change or PowerExchange CDC Real-time
DB2400: PowerExchange CDC Change or PowerExchange CDC Real-time
DB2UDB: PowerExchange Real-time
Oracle: PowerExchange CDC Change or PowerExchange CDC Real-time
MSSQL: PowerExchange Real-time
4.
the first CDC source. For subsequent CDC sources, choose a Connection Type of None. PowerExchange group source processing only uses the information on the first application connection. Subsequent application connection specifications are not required and may cause session failures.
5. Optionally, open the application connection to override any connection values. See Configuring Connections on page 79.
6. In the Properties settings, configure the following attributes. All fields are optional except where noted:
Extraction Map Name
   Source Type: All
   Description: The PowerExchange extraction map name for the CDC source. Required if using a CDC Change or Real Time Reader. You must specify the extraction map name for the relational source.
Library/File Override
   Description: Overrides the library and file name in the extraction map. Specify complete library and file names in the format: library/file. This attribute overrides the Library/File Override value on the application connection. Warning: Do not specify an asterisk for the library name if using PWXPC restart.
Source Schema Override
   Source Type: Oracle Change and Real Time
   Description: Overrides the source schema name.
For information about other properties settings, see the PowerCenter Workflow Administration Guide.
7. Click OK.
imported into PowerExchange as a sequential data map, you can use a PWX NRDB Batch application connection to write to it.
To configure sessions to load data to relational targets:
1. In the Task Developer, double-click the session to edit it.
2. Click the Targets view on the Mapping tab.
3. In the Writers setting, select Relational Writer to run sessions with relational targets.
4. In the Connections Value field, select a relational database connection from one of the following types:
   PWX DB2390 for DB2 for z/OS
   PWX DB2400 for DB2 for i5/OS
   PWX DB2UDB for DB2 for Linux, UNIX, and Windows
   PWX Oracle for Oracle
   PWX MSSQLServer for Microsoft SQL Server
   PWX Sybase for Sybase
5. 6.
For information about properties settings, see the PowerCenter Workflow Administration Guide.
Adabas
IMS
Sequential, including flat files on AS/400, Linux, UNIX, and Windows
VSAM
The writer is set to the correct PowerExchange Writer based on the target type. You must select a PWX NRDB Batch application connection. You can then configure properties for the session as you would for any other target.
To configure sessions to load data to non-relational targets:
1. In the Task Developer, double-click the session to edit it.
2. Click the Targets view on the Mapping tab. The writer value is set based on the target type.
3. In the Connections Value field, select a PWX NRDB Batch application connection.
4. In the Properties settings, configure the PWXPC session properties. See Table 5-1 on page 146.
5. Configure any other session properties.
6. Click OK.
For information about other Properties settings, see the PowerCenter Workflow Administration Guide.
SEQ (MVS only) ADABAS, IMS, VSAM ADABAS, IMS, VSAM SEQ (MVS only)
Chapter 6
Overview, 150
Understanding PWXPC Restart and Recovery, 151
Creating Recovery Tables, 160
Configuring the Restart Token File, 162
PWXPC Restart and Recovery Operation, 165
Overview
This chapter describes PWXPC restart and recovery processing as well as how to configure your CDC sessions to use this processing.
Each source in a CDC session has unique restart information, also referred to as restart tokens. PWXPC manages the CDC restart information. The Integration Service provides recovery for the target files and tables in CDC sessions.
In order to extract change data from the change stream, PWXPC provides restart information for the CDC sources to PowerExchange. PowerExchange reads the change stream on the CDC source platform and provides complete units of work to PWXPC. A unit of work (UOW) is a collection of changes within a single commit scope made by a transaction on the source system. Using the commit interval information specified in the CDC session connection, PWXPC periodically flushes complete UOWs to the Integration Service.
Target recovery and restart information is stored as the target tables and files are updated by the Integration Service. The Integration Service and PWXPC use this information to recover and restart stopped or failed sessions from the point of interruption.
In order to properly configure CDC sessions, review the following topics:
   CDC-Specific Connection Attributes on page 112
   Understanding Commit Processing with PWXPC on page 123
   Extracting CDC Data in Change and Real-time Modes on page 135
   Understanding PWXPC Restart and Recovery on page 151
   Configuring the Restart Token File on page 162
   Enabling Session Recovery on page 165
   Configuring CDC Sessions on page 166
In order to manage CDC sessions, review PWXPC Restart and Recovery Operation on page 165.
When you enable a resume recovery strategy, the Integration Service provides recovery for the target tables and files and PWXPC provides recovery for the CDC restart information. PWXPC issues the following message indicating that recovery is in effect:
PWXPC_12094 [INFO] [CDCRestart] Advanced GMD recovery in affect. Recovery is automatic
The Integration Service stores the session state of operation in the shared location, $PMStorageDir. The Integration Service saves relational target recovery in the target database. CDC restart information, also called restart tokens, originates from PowerExchange on the CDC source platform. PWXPC stores CDC restart information in different locations based upon the target type:
   For non-relational targets, PWXPC stores the CDC restart information in the shared location, $PMStorageDir, in state files on the Integration Service platform.
   For relational targets, PWXPC stores the CDC restart information in state tables in the target database.
When the Integration Service performs recovery, it restores the state of operation to recover the session from the point of interruption. It uses the target recovery data to determine how to recover the target tables. PWXPC and PowerExchange use the CDC restart information to determine the correct point in the change stream from which to restart the extraction.
Recovery Tables
For relational targets, the Integration Service creates the following recovery tables in the target database:
PM_RECOVERY. This table contains target load information for the session run. The Integration Service removes the information from this table after each successful session and initializes the information at the beginning of subsequent sessions.
PM_TGT_RUN_ID. This table contains information the Integration Service uses to identify each target on the database. The information remains in the table between session runs. If you manually create this table, you must create a row and enter a value other than zero for LAST_TGT_RUN_ID to ensure that the session recovers successfully.
Understanding PWXPC Restart and Recovery 151
PM_REC_STATE. This table contains restart information for CDC sessions. The restart information recorded in the table contains the application name and restart tokens for the session. The restart information remains in the table permanently. The Integration Service updates it with each commit to the target tables.
If you edit or drop the recovery tables before you recover a session, the Integration Service cannot recover the session. If you disable recovery, the Integration Service does not remove the recovery tables from the target database. You must manually remove the recovery tables.
If you want the Integration Service to create the recovery tables, grant table creation privilege to the database user name for the target database connection. For the database user name used with PowerExchange relational targets, see Recovery Table Creation with PowerExchange Targets on page 160. If you do not want the Integration Service to create the recovery tables, create the recovery tables manually.
Tip: If you are using PowerExchange relational target connections, manually create these tables so you can assign the desired database attributes. See Creating the Recovery Tables Manually on page 161.
For more information about the PM_RECOVERY and PM_TGT_RUN_ID tables, see the PowerCenter Workflow Administration Guide.
   OWNER_TYPE_ID - PowerCenter-defined identifier
   REP_GID - Global unique identifier of the repository
   FOLDER_ID - Folder identifier
   WFLOW_ID - Workflow identifier to which the session belongs
   WFLOW_RUN_INST_NAME - Workflow run instance name
   WLET_ID - Worklet identifier
   TASK_INST_ID - Session (task) instance identifier
   WID_INST_ID - Reader widget instance identifier
   GROUP_ID - Partition group identifier
   PART_ID - Partition identifier
   PLUGIN_ID - Application connection plug-in subtype identifier
   APPL_ID - Application name from the source application connection
   SEQ_NUM - Entry sequence number
   VERSION - Session version number
   CHKPT_NUM - Session checkpoint number
   STATE_DATA - Restart state data for the session
The APPL_ID column contains the application name specified in the source application connection. The STATE_DATA column, which contains the restart tokens for the session, is a variable-length 1024-byte binary column. If the number of restart tokens for a session causes the data to exceed 1024 bytes in length, additional rows are added to accommodate the remainder of the restart information. The SEQ_NUM field is increased by one, starting from zero, for each additional row added for a session entry.
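The way restart data spills into additional rows can be pictured with a short Python sketch. It is a simplified model, not the Integration Service implementation: the function names are hypothetical, and the only facts taken from the text are the 1024-byte STATE_DATA width and SEQ_NUM values that start at zero and increase by one per additional row.

```python
STATE_DATA_WIDTH = 1024  # width of the STATE_DATA column described above


def split_state_data(state, width=STATE_DATA_WIDTH):
    """Return (seq_num, chunk) pairs, one per row needed to hold the state data."""
    if not state:
        return [(0, b"")]
    return [
        (offset // width, state[offset:offset + width])
        for offset in range(0, len(state), width)
    ]


def join_state_data(rows):
    """Rebuild the state data by concatenating chunks in SEQ_NUM order."""
    return b"".join(chunk for _, chunk in sorted(rows, key=lambda row: row[0]))


rows = split_state_data(b"x" * 2500)
```

With 2500 bytes of restart data, this model produces three rows with SEQ_NUM values 0, 1, and 2.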
The majority of the columns in the table are task and workflow repository attributes. These repository attributes remain static unless the task or workflow is altered. The following examples are actions that alter these repository attributes:
   Adding or removing sources or targets from the mapping used by the session
   Moving the workflow or session to a different folder
   Moving the session to a different workflow
See Changing CDC Sessions on page 171 for additional information.
During session initialization, the Integration Service reads the state table looking for an entry that matches the session data. All column data (with the exception of VERSION, CHKPT_NUM, and STATE_DATA) must match the task and workflow repository attributes for the Integration Service to use an entry. If a match is found, the Integration Service uses that entry for target recovery processing. PWXPC uses the CDC restart information stored in the STATE_DATA column to perform restart and recovery processing.
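The matching rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the Integration Service logic: rows and session attributes are modeled as plain dictionaries, the function name is hypothetical, and the example uses a reduced set of columns.

```python
# Columns that do not take part in the match, per the text above.
MATCH_EXEMPT = {"VERSION", "CHKPT_NUM", "STATE_DATA"}


def find_state_entry(rows, session_attrs):
    """Return the first row whose non-exempt columns all equal the session attributes."""
    for row in rows:
        compared = [column for column in row if column not in MATCH_EXEMPT]
        if all(
            column in session_attrs and row[column] == session_attrs[column]
            for column in compared
        ):
            return row
    return None


# Hypothetical, reduced example row.
state_rows = [
    {"FOLDER_ID": 12, "WFLOW_ID": 40, "TASK_INST_ID": 7, "APPL_ID": "app1",
     "VERSION": 3, "CHKPT_NUM": 118, "STATE_DATA": b"tokens"},
]
same_session = {"FOLDER_ID": 12, "WFLOW_ID": 40, "TASK_INST_ID": 7, "APPL_ID": "app1"}
moved_session = dict(same_session, FOLDER_ID=13)  # session moved to another folder
```

In this model, moving the session to a different folder changes FOLDER_ID, so the existing entry no longer matches and its restart tokens are not picked up.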
The Integration Service uses the application name from the source CDC connection for the application name value in the state file name prefix. The Integration Service includes the complete file name in message CMN_65003.
The remainder of the fields in the file name are task and workflow repository attributes. These repository attributes remain static unless the task or workflow is altered. The following examples are actions that alter these repository attributes:
   Adding or removing sources or targets from the mapping used by the session
   Moving the workflow or session to a different folder
   Moving the session to a different workflow
   Initial restart tokens for new CDC sessions
   Overrides for the restart tokens in the state table or file for existing CDC sessions
PWXPC uses the restart token file in the folder specified in the RestartToken File Folder attribute of the source CDC connection. PWXPC automatically creates this folder, if it does not exist, when the attribute contains the default value of $PMRootDir/Restart. PWXPC does not automatically create any other restart token folder name.
During session initialization, PWXPC:
   Uses the name specified in the RestartToken File Name attribute to create an empty restart token file, if one does not already exist.
   Creates a merged view of the restart tokens by reconciling the restart tokens specified in the restart token file with those in the state tables and the state file for all relational and non-relational targets, respectively. For more information on the reconciliation process, see Determining the Restart Point on page 155.
   Places the results of the restart token reconciliation process into an initialization file in the restart token file directory and empties out the restart token file. Emptying the restart token file ensures that it does not override the state table or state file restart tokens with the same restart information the next time the session is run.
During normal termination, PWXPC writes the ending restart tokens into a termination file in the restart token file directory. The restart token files containing the initialization and termination restart tokens have the following names:
   <restart_token_file_name>yyyymmddhhmmss_init
   <restart_token_file_name>yyyymmddhhmmss_term
Where:
   restart_token_file_name is the restart token file name from the CDC connection
   yyyymmddhhmmss is the initialization file creation timestamp
   init or term is for initialization and termination files, respectively
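The naming pattern can be expressed as a small Python sketch. The function name and the timestamp used here are hypothetical. The pattern itself (file name, then a yyyymmddhhmmss timestamp, then _init or _term) is taken from the description above.

```python
from datetime import datetime


def paired_file_names(restart_token_file_name, created):
    """Return the (initialization, termination) file names for one session run."""
    stamp = created.strftime("%Y%m%d%H%M%S")  # yyyymmddhhmmss
    return (
        f"{restart_token_file_name}{stamp}_init",
        f"{restart_token_file_name}{stamp}_term",
    )


# Hypothetical creation time for illustration.
init_name, term_name = paired_file_names("my.app.txt", datetime(2008, 3, 14, 9, 26, 53))
```

Both names share the initialization file creation timestamp, which is how the pair is tied to a single run.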
For example, a CDC source application connection specifies a restart token file name of my.app.txt, which does not exist. PWXPC creates the following files on the Integration Service platform in the restart token file folder specified in the connection:
The restart token file, my.app.txt, is empty. The timestamps on both the initialization and termination files are the same to indicate that they are related to the same run. The termination file may not exist or may be empty if the session fails.
If you are using the default value of zero for the connection attribute Number of Runs to Keep RestartToken File, PWXPC keeps only one copy of the paired initialization and termination files. Otherwise, PWXPC uses the value specified in that attribute to determine the number of backup copies of these paired files to keep. During termination, PWXPC removes any additional pairs of the backup files beyond the Number of Runs to Keep RestartToken File value.
Cold start. When you cold start a CDC session, PWXPC reads only the restart token file to acquire restart tokens for all sources and makes no attempt to recover the session. The session continues to run until stopped or interrupted.
Warm start. When you warm start a CDC session, PWXPC reconciles the restart tokens provided in the restart token file, if any, with any restart tokens that exist in the state file or state tables. If necessary, PWXPC performs recovery processing. The session continues to run until stopped or interrupted.
Recover. When you recover a CDC session, PWXPC reads the restart tokens from the state file and state tables and writes them into the restart token file. If necessary, PWXPC performs recovery processing. After PWXPC finishes updating the restart token file and doing any necessary recovery, the session ends.
See Starting CDC Sessions on page 169 for more details.
Each CDC source in the CDC session has its own unique restart point. You should create and populate the restart token file with restart points for each source prior to running a CDC session for the first time. If you do not provide restart tokens in the restart token file and no entry exists for the session in the state tables or the state file, then PWXPC passes null restart tokens to PowerExchange for all sources in the session. See Default Restart Points on page 157 for further information about null restart tokens.
The restart tokens PWXPC uses vary based on whether you warm or cold start the CDC session and whether you provide any overriding restart tokens in the restart token file.
Restart token file is empty or does not exist. PWXPC assigns null restart tokens to all sources in the session. See Default Restart Points on page 157 for further information about null restart tokens.
Restart token file contains explicit override statements. PWXPC assigns the restart tokens supplied in the restart token file to the specified sources. PWXPC assigns the oldest restart point of the restart tokens specified to all remaining sources. See Configuring the Restart Token File on page 162 for further information about explicit override statements.
Restart token file contains special override statement. PWXPC assigns the restart tokens supplied in the restart token file to all sources. See Configuring the Restart Token File on page 162 for further information about the special override statement.
Restart token file contains special override statement and explicit override statements. PWXPC assigns the restart tokens supplied in the restart token file in the explicit override statements to the specified sources. PWXPC assigns the restart tokens supplied in the special override statement to all remaining sources.
If no state file (non-relational target) or entry in a state table (relational target) exists for the session: PWXPC assigns null restart tokens to all sources in the session. See Default Restart Points on page 157 for further information about null restart tokens.
If a state file (non-relational target) or an entry in a state table (relational target) exists for some but not all sources in the session: PWXPC assigns the restart tokens found in the state file or state tables to the appropriate sources. PWXPC assigns the oldest restart point of the restart tokens available to all remaining sources without restart tokens.
If a state file (non-relational target) or an entry in a state table (relational target) exists for all sources in the session: PWXPC uses the restart tokens from the state file or state tables.
If no state file (non-relational target) or entry in a state table (relational target) exists for the session: PWXPC assigns the restart tokens supplied in the restart token file to the specified sources. PWXPC assigns the oldest restart point of the restart tokens specified in the restart token file to all remaining sources without restart tokens. See Configuring the Restart Token File on page 162 for further information about explicit override statements.
If a state file (non-relational target) or an entry in a state table (relational target) exists for some but not all sources in the session: PWXPC assigns the restart tokens supplied in the restart token file to the specified sources. PWXPC assigns the restart tokens found in the state file or state tables to the appropriate sources provided they have not been supplied in the restart token file. PWXPC assigns the oldest restart point of the restart tokens available to all remaining sources without restart tokens.
If a state file (non-relational target) or an entry in a state table (relational target) exists for all sources in the session: PWXPC assigns the restart tokens supplied in the restart token file to the specified sources in the session. PWXPC assigns the restart tokens from the state file or state tables to all remaining sources without restart tokens.
Restart token file contains special override statement. PWXPC assigns the restart tokens supplied in the special override statement in the restart token file to all sources. See Configuring the Restart Token File on page 162 for further information about the special override statement.
Restart token file contains special override statement and explicit override statements. PWXPC assigns the restart tokens supplied in the restart token file in the explicit override statements to the specified sources. PWXPC assigns the restart tokens supplied in the special override statement to all remaining sources without restart tokens.
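The cold start and warm start rules above can be condensed into one Python sketch. It is a model of the documented rules, not PWXPC code. The function and parameter names are hypothetical, restart tokens are modeled as (restart1, restart2) string pairs, null restart tokens are modeled as None, and the age_key ordering used to find the oldest restart point is an assumption (real restart tokens may not compare in simple string order).

```python
def resolve_restart_tokens(sources, explicit, special, state, cold_start,
                           age_key=lambda tokens: tokens):
    """Assign restart tokens to every source following the rules described above.

    explicit: {source: tokens} from explicit override statements
    special:  tokens from the special override statement, or None
    state:    {source: tokens} from the state file or state tables
    """
    sources = list(sources)
    assigned = {}
    for name in sources:
        if name in explicit:                      # explicit override wins
            assigned[name] = explicit[name]
        elif special is not None:                 # then the special override
            assigned[name] = special
        elif not cold_start and name in state:    # warm start only: saved state
            assigned[name] = state[name]
    # Remaining sources get the oldest assigned restart point, or null (None)
    # when no source received any restart tokens.
    oldest = min(assigned.values(), key=age_key) if assigned else None
    return {name: assigned.get(name, oldest) for name in sources}


# Sources A and B are supplied in the restart token file and A is older; C has nothing.
token_a, token_b = ("0001", "0001"), ("0002", "0002")
result = resolve_restart_tokens(
    ["A", "B", "C"], {"A": token_a, "B": token_b}, None, {}, cold_start=False
)
```

In this run, source C receives the same restart point as source A because A holds the oldest supplied restart point.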
Start your extraction for a new CDC session at a point in the change stream where the source and its target are in a consistent state. You do this by placing the restart tokens that match that point in the change stream in the restart token file and doing a cold start of the CDC session.
For example, a target table has been materialized from its source data and no new changes have been made to the source data. Now you need to establish a starting extraction, or restart, point in the change stream. You do this by using DTLUAPPL. DTLUAPPL is a PowerExchange utility that generates restart points. After you have run DTLUAPPL, place the generated restart tokens in the restart token file specified in the source CDC connection and cold start the CDC session. PWXPC passes the restart tokens from the restart token file to PowerExchange. PowerExchange extracts changes from the change stream from that restart point forward.
Table 6-1 describes the earliest starting extraction (restart) points PowerExchange uses if null restart tokens are supplied for all sources:
Table 6-1. Default Starting Extraction Points for Sources
Source Platform/ Database MVS (all sources)
CDC Real-time Connection Logger selects the best available restart point. This is the oldest restart point for which an archive log is available, or active log if there are no available archive logs. Oldest journal receiver still on the journal receiver chain. Most current Oracle catalog dump. Oldest data available in the Publication database. Current log position at the time the capture catalog was created.
Oldest Condense file recorded in the CDCT. Oldest Condense file recorded in the CDCT. n/a n/a
PowerExchange only uses the default starting extraction point if all sources have null restart tokens. PWXPC assigns the oldest restart point of the restart tokens available if there are some sources without restart tokens.
For example, a new CDC session contains three sources called A, B, and C. The restart token file contains restart points for sources A and B. The restart point for source A is older than the restart point for source B. Source C has no existing or supplied restart point. When you run the session, PWXPC assigns source C the same restart point as source A since it is the oldest supplied restart point. PWXPC does not assign the default starting extraction point discussed in Table 6-1 to source C because some sources have restart points.
   Flushes the restart tokens to the state tables for relational targets and to the state file for non-relational targets
   Writes an empty restart token file
   Creates the initialization restart token file containing the reconciled restart information
PWXPC passes the restart tokens for all sources to PowerExchange. PowerExchange uses the oldest restart token passed by PWXPC to start extracting data from the change stream. PowerExchange does not pass data for a source until its restart point is reached. This prevents targets from being updated with records processed in previous extraction runs.
PWXPC continually updates the restart tokens for each source in the state table or the state file as it flushes target data. With relational target tables in the same database, the Integration Service updates both the target tables and the restart tokens within a single commit. The Integration Service does separate commits for each unique relational database. With heterogeneous targets, the restart tokens in one relational database may differ from those in another relational database at specific points in time.
When using non-relational targets, the state file and the targets likely exist on completely different machines. With non-relational targets, the Integration Service updates the targets and the state file in separate operations. If the session fails after the Integration Service commits data to the target but before it updates the restart tokens in the state file, targets may receive duplicate data when restarted. On warm start, PWXPC uses the last restart tokens written prior to the failure. As a result, PWXPC re-sends data which has already been applied to the non-relational targets.
The Integration Service commits the flushed data to the targets, including the restart tokens for relational targets. After the Integration Service writes the flushed data to any non-relational targets, it updates the state file with the restart tokens.
If the session fails, the Integration Service rolls back any uncommitted data and the related restart tokens for relational targets. This leaves only the last successfully committed UOW data and restart tokens in the relational target tables. The Integration Service uses relational database rollback capabilities to ensure that uncommitted data is removed during session termination. Consistency between the restart tokens and the relational target data is guaranteed because they are both committed within the same commit scope.
The Integration Service does not do rollback processing for non-relational targets. As a result, duplicate data can occur on restart. You should account for this in your CDC session design.
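The single commit scope for relational targets can be illustrated with a small Python sketch that uses the standard sqlite3 module as a stand-in database. Everything in it is an assumption made for illustration: the table names target_rows and restart_state, the apply_uow function, and the token values are hypothetical and do not reflect the actual PM_REC_STATE layout or Integration Service code.

```python
import sqlite3


def apply_uow(conn, changes, restart_tokens, fail_before_commit=False):
    """Write one flushed UOW and its restart tokens in a single transaction."""
    try:
        conn.executemany("INSERT INTO target_rows (id, value) VALUES (?, ?)", changes)
        conn.execute("UPDATE restart_state SET tokens = ? WHERE app = 'demo'",
                     (restart_tokens,))
        if fail_before_commit:
            raise RuntimeError("simulated session failure")
        conn.commit()    # target data and restart tokens become visible together
    except Exception:
        conn.rollback()  # neither the data nor the tokens survive a failure
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target_rows (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE restart_state (app TEXT PRIMARY KEY, tokens TEXT)")
conn.execute("INSERT INTO restart_state VALUES ('demo', 'T0')")
conn.commit()

apply_uow(conn, [(1, "a")], "T1")  # committed UOW
try:
    apply_uow(conn, [(2, "b")], "T2", fail_before_commit=True)  # rolled back UOW
except RuntimeError:
    pass

row_count = conn.execute("SELECT COUNT(*) FROM target_rows").fetchone()[0]
saved_tokens = conn.execute("SELECT tokens FROM restart_state").fetchone()[0]
```

After the simulated failure, the target holds only the first UOW and the saved restart tokens still point to it, so a restart from those tokens neither skips nor repeats data. A non-relational target has no equivalent of this shared transaction, which is why duplicates are possible there.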
Tip: If the possibility of duplicate data is unacceptable to your application, then design your
   The default DB2 database (DSNDB04)
   The PowerExchange Listener userid if it is running with SECURITY=0 or SECURITY=1, so this user must be granted the appropriate table creation privilege
   The PowerExchange Listener userid if it is running with SECURITY=2 and MVSDB2AF=CAF, so this user must be granted the appropriate table creation privilege
   The database user name in the target connection if the PowerExchange Listener is running with SECURITY=2 and MVSDB2AF=RRSAF, so this user must be granted the appropriate table creation privilege

   The PowerExchange Listener userid if it is running with SECURITY=0 or SECURITY=1, so this user must be granted the appropriate table creation privilege
   The database user name in the target connection if the PowerExchange Listener is running with SECURITY=2, so this user must be granted the appropriate table creation privilege
   The default journal, so it must be enabled for the user name

   The default tablespace for user-defined tables
   The database user name in the target connection, so this user must be granted the appropriate table creation privilege
Run one of the following scripts to create the recovery tables in the target database:
Table 6-2. Recovery Table SQL Scripts
Script                  Database
create_schema_db2.sql   DB2
create_schema_inf.sql   Informix
create_schema_ora.sql   Oracle
create_schema_sql.sql   SQL Server
create_schema_syb.sql   Sybase
create_schema_ter.sql   Teradata
This is generic DDL. Make the appropriate changes for your environment.
   Look at the PWXPC_12057 message in the session log. PWXPC includes the restart token file folder and the restart token file name in this message.
   Open the application connection associated with the source. The application connection contains the restart token file name and folder location.
This file name overrides the file name you specified in the application connection.
If the restart token file name is not specified in the application connection, PWXPC uses the application name, if specified. Otherwise, PWXPC uses the workflow name.
Warning: The Restart Token File Name must be unique for every session. Using non-unique names causes unpredictable results including session failures and potential data loss.
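The fallback order for the file name, together with the uniqueness requirement in the warning, can be sketched in Python. Both helper names are hypothetical, and the check is an illustration of the stated rules rather than something PWXPC performs for you.

```python
from collections import Counter


def effective_restart_token_file_name(connection_file_name, application_name, workflow_name):
    """Apply the documented fallback: connection value, then application name, then workflow name."""
    for candidate in (connection_file_name, application_name):
        if candidate:
            return candidate
    return workflow_name


def duplicate_names(names):
    """Return the restart token file names used by more than one session."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Hypothetical sessions: (connection file name, application name, workflow name).
sessions = [
    ("orders.txt", "app_orders", "wf_orders"),
    ("", "app_billing", "wf_billing"),
    ("", "", "wf_audit"),
    ("orders.txt", "app_returns", "wf_returns"),
]
names = [effective_restart_token_file_name(*session) for session in sessions]
clashes = duplicate_names(names)
```

A non-empty clashes list points to sessions that would share a restart token file and should be given distinct names.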
Syntax Rules
The restart token file has these syntax rules:
Comment Statement
<!-- comment text
Use the comment statement anywhere in the restart token file. The <!-- is required.
The explicit override statement specifies restart tokens for a specific source. The source is defined by specifying the extraction map name. Sources can have multiple extraction mappings and therefore multiple extraction map names. Each source specification must consist of a pair of lines with:
   The source extraction map name (extraction_map_name) specified with the restart1_token value
   The source extraction map name (extraction_map_name) specified with the restart2_token value
The extraction map name specified in the restart token file must match what is defined in the CDC session. To determine the extraction map name:
   Check the Extraction Map Name attribute in the Session Properties for relational sources
   Check the Schema Name Override and Map Name Override attributes in the Session Properties if using CDC data map sources. See Figure 5-4 on page 138. These attributes override the source name in the CDC data map source.
   Check the Schema Name and Map Name values in the source Metadata Extensions in Designer if using CDC data map sources. See Figure 3-23 on page 61.
The restart1_token value varies based on the capture source and is found in the following:
   Sequence= value, minus the trailing 8 zeros, in the DTLUAPPL PRINT output
   DTL__CAPXRESTART1 value when extracting data
   Sequence token in PowerExchange messages (e.g., PWX-04564, PWX-09959)
   Restart Token 1 in PWXPC messages (e.g., PWXPC_12060, PWXPC_12069)
The restart2_token value varies based on the capture source and is found in the following:
   Restart= value in the DTLUAPPL PRINT output
   DTL__CAPXRESTART2 value when extracting data
   Logger token in PowerExchange messages (e.g., PWX-04564, PWX-09959)
   Restart Token 2 in PWXPC messages (e.g., PWXPC_12060, PWXPC_12069)
If the session includes source extraction maps that do not have entries in the existing restart token file, then the session executes without error. See Default Restart Points on page 157 for an explanation of what restart tokens PWXPC and PowerExchange use in this case.
The special override statement specifies restart tokens for all sources in a session. The restart token values (restart1_token and restart2_token) are described in detail in the Explicit Override Statement. If used, both RESTART1= and RESTART2= must be specified.
Configuring the Restart Token File 163
This override can be used in conjunction with explicit override statements to provide restart tokens for sources which do not have explicit override statements. An explicit override statement for a source takes precedence over the special override statement for that source.
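A reader for the restart token file might look like the following Python sketch. It rests on assumptions that this text does not spell out: that every statement has the form name=value on its own line, and that the first of a source's two lines carries restart1_token and the second carries restart2_token. The function name and the restart2 values in the sample are hypothetical placeholders.

```python
def parse_restart_token_file(text):
    """Return ({extraction_map_name: (restart1, restart2)}, special_pair_or_None)."""
    explicit, special = {}, {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("<!--"):  # blank line or comment statement
            continue
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"not a name=value statement: {line!r}")
        key, value = key.strip(), value.strip()
        if key.upper() in ("RESTART1", "RESTART2"):  # special override statement
            special[key.upper()] = value
        else:                                        # explicit override statement
            explicit.setdefault(key, []).append(value)
    if special and set(special) != {"RESTART1", "RESTART2"}:
        raise ValueError("both RESTART1= and RESTART2= must be specified")
    for name, tokens in explicit.items():
        if len(tokens) != 2:
            raise ValueError(f"{name} needs exactly one pair of lines")
    special_pair = (special["RESTART1"], special["RESTART2"]) if special else None
    return {name: tuple(tokens) for name, tokens in explicit.items()}, special_pair


# The restart2 values below are placeholders, not real tokens.
sample = """<!-- restart tokens for one source plus a special override
d1dsn7.rrtb0001_RRTB_SRC_001=0000060D1DB2000000000000060D1DB20000000000000000
d1dsn7.rrtb0001_RRTB_SRC_001=PLACEHOLDER_RESTART2_A
RESTART1=000000AD775600000000000000AD77560000000000000000
RESTART2=PLACEHOLDER_RESTART2_B
"""
explicit_tokens, special_tokens = parse_restart_token_file(sample)
```

Sources named in explicit_tokens keep their own tokens, and special_tokens applies to every other source in the session.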
In the example, the session contains 7 source tables. The restart token file contains explicit override statements for 3 sources: RRTB_SRC_001, RRTB_SRC_002, and RRTB_SRC_004. It also contains the special override statement to provide the restart tokens for the remainder of the sources in the session. When the session executes, PWXPC issues message PWXPC_12060 as follows:
===============================
Session restart information:
===============================
Extraction Map Name           Restart Token 1                                   Source
d1dsn7.rrtb0001_RRTB_SRC_001  0000060D1DB2000000000000060D1DB20000000000000000  Restart file
d1dsn7.rrtb0002_RRTB_SRC_002  000000A3719500000000000000A371950000000000000000  Restart file
d1dsn7.rrtb0003_RRTB_SRC_003  000000AD775600000000000000AD77560000000000000000  Restart file (special override)
d1dsn7.rrtb0004_RRTB_SRC_004  000006D84E7800000000000006D84E780000000000000000  Restart file
d1dsn7.rrtb0005_RRTB_SRC_005  000000AD775600000000000000AD77560000000000000000  Restart file (special override)
d1dsn7.rrtb0006_RRTB_SRC_006  000000AD775600000000000000AD77560000000000000000  Restart file (special override)
d1dsn7.rrtb0007_RRTB_SRC_007  000000AD775600000000000000AD77560000000000000000  Restart file (special override)
PWXPC displays the sources with explicit overrides with Restart file under the Source column. The sources to which PWXPC assigns the special override restart tokens have special override in parentheses.
PWXPC automatically recovers warm started sessions when a resume recovery strategy is specified.
1. Select Resume from last checkpoint for the Recovery Strategy in the Properties tab. This is the only recovery strategy that enables PWXPC and the Integration Service to recover CDC sessions.
2. Change the Commit Type attribute from Target to Source. CDC sessions always use source-based commit processing regardless of the Commit Type attribute setting.
3. Disable the Commit at End of File attribute in the Properties tab. Otherwise, the Integration Service issues a commit when the session ends, after PWXPC has shut down. Data written to the targets after PWXPC shuts down is not reflected in the restart tokens. Disabling this attribute ensures that PWXPC issues all commits.
Warning: If you are using the File Writer to write CDC data to flat files, do not enable recovery processing. Data loss or duplication may occur since the restart tokens for all targets, including relational targets, are compromised if there is a flat file target in the same session.
If you run a session with the resume recovery strategy and the session fails, do not edit the mapping, the session, or the state table entry or file before you restart the session. Recovery is compromised if changes are made. See Recovering from CDC Session Failures on page 173. When the Integration Service resumes a session, it restores the session state of operation, including the state of each source, target, and transformation. The Integration Service, in conjunction with PWXPC, determines how much of the source data it needs to reprocess. For additional information about the Integration Service recovery processing, see the PowerCenter Workflow Administration Guide.
Application Names
PowerExchange, when using ODBC connections, stores the restart tokens in the PowerExchange CDEP file on the extraction platform. PWXPC stores the restart tokens in:
The state file on the Integration Service platform, for non-relational sources
The state table in the target database, for relational sources
PowerExchange always stores extraction history information in the CDEP file for each application name, regardless of whether the CDEP is being used to maintain restart tokens or not. With PWXPC the CDEP file is used for history only. An application name is required when using PWXPC. Each CDC session must use a unique application name in order to prevent failures due to conflicts in the CDEP. Application names cannot be shared with other CDC sessions.
Warning: Do not use a PWXPC CDC session application name when performing a Database Row Test in PowerExchange Navigator or when using the DTLUAPPL utility. Using the same application name as a PWXPC CDC session in another CDC session, a Navigator row test, or in DTLUAPPL fails with message:
PWX-04553 Error restart tokens [required | not allowed] for application "application name"
Use a unique application name when generating restart tokens with DTLUAPPL so you avoid any conflicts with existing application names used for CDC sessions. For more information on DTLUAPPL, see the PowerExchange Utilities Guide. For more information on configuring the restart token file, see Configuring the Restart Token File on page 162.
The following example generates restart tokens for source registration DB2DEMO1 using an application name of tokens and then prints those restart tokens:
MOD APPL tokens DSN7 RSTTKN GENERATE
ADD RSTTKN db2demo1
END APPL tokens
PRINT APPL tokens
DTLUAPPL prints the generated tokens because the PRINT APPL statement is specified:
Application name=<tokens> Rsttkn=<1> Ainseq=<0> Preconfig=<N>
FirstTkn =<>
LastTkn =<>
CurrentTkn=<>
Registration name=<db2demo1.1> tag=<DB2DSN7db2demo11>
Sequence=<000007248B9600000000000007248B9600000000>
Restart =<D2D1D4D34040000007248B0E00000000>
DTLUAPPL does not generate the complete restart1_token value, which is shown in the Sequence= token. You must add the trailing four bytes (eight digits) of zeros manually when you update the restart token file. DTLUAPPL does generate the complete restart2_token value in the Restart= token. You can copy this value to the restart token file.
Tip: You can use the same restart tokens for multiple source tables in the restart token file to start extracting changes from the same point in the change stream. You only need to run DTLUAPPL multiple times if you want to start extracting changes from different locations in the change stream for different sources.
Using the tokens in this example, the restart token file looks as follows:
D1DSN7.db2demo1=000007248B9600000000000007248B960000000000000000
D1DSN7.db2demo1=D2D1D4D34040000007248B0E00000000
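The manual padding step described above is easy to get wrong by hand. The following Python sketch shows one way it could be scripted. It is not part of PowerExchange: the function name restart_file_lines is hypothetical, and it assumes the DTLUAPPL output contains the Sequence= and Restart = tokens in the form printed in the example.

```python
import re


def restart_file_lines(extraction_map: str, dtluappl_output: str) -> list[str]:
    """Build the two restart token file lines for one source.

    Assumes the DTLUAPPL PRINT APPL output contains 'Sequence=<...>' and
    'Restart =<...>' tokens as shown in the example output.
    """
    seq = re.search(r"Sequence=<([0-9A-Fa-f]+)>", dtluappl_output)
    rst = re.search(r"Restart\s*=<([0-9A-Fa-f]+)>", dtluappl_output)
    if seq is None or rst is None:
        raise ValueError("Sequence= or Restart= token not found in output")
    # DTLUAPPL omits the trailing four bytes of the sequence token,
    # so eight zero digits are appended here.
    restart1 = seq.group(1) + "00000000"
    # The Restart= token is already complete and is copied unchanged.
    restart2 = rst.group(1)
    return [f"{extraction_map}={restart1}", f"{extraction_map}={restart2}"]
```

Calling the function once per extraction map name with the same DTLUAPPL output produces entries that all start from the same point in the change stream, as the Tip describes.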
PWXPC performs the following tasks for cold start:
1. PWXPC reads the restart tokens from the restart token file only. See Determining the Restart Point on page 155.
2. PWXPC commits the restart tokens to the state tables and file and issues message PWXPC_12104.
3. PWXPC continues processing and committing data and restart tokens until the session ends or is stopped.
PWXPC automatically performs recovery when a workflow or task is warm started. You do not need to recover workflows and tasks before you restart them. PWXPC performs the following tasks for warm start:
1. PWXPC reconciles the restart tokens from the restart token file and from the recovery state tables and file. See Determining the Restart Point on page 155.
Note: PWXPC does not read the restart token file if recovery is required.
2. For heterogeneous targets, PWXPC queries the Integration Service about the commit levels of all targets. If all targets in the session are at the same commit level, PWXPC skips recovery processing.
3. If recovery is required for heterogeneous targets, PWXPC re-reads the data for the last UOW committed to higher-level targets and flushes it to those targets with the lower commit level. The Integration Service commits any flushed data and restart tokens to any relational targets and updates any non-relational files.
4. If recovery is not required and the reconciled restart tokens differ from those in the state tables and file, PWXPC commits the reconciled restart tokens and issues message PWXPC_12104.
5. PWXPC continues processing and committing data and restart tokens until the session ends or is stopped.
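The commit-level comparison in the warm start steps above can be pictured with a small Python sketch. This is only an illustration of the decision, not PWXPC code: the function name targets_to_resync is hypothetical, and representing a commit level as an integer per target is an assumption made for the example.

```python
def targets_to_resync(commit_levels: dict[str, int]) -> list[str]:
    """Return the targets that lag behind the highest commit level.

    commit_levels maps a target name to its commit level. Modelling the
    level as an integer is an assumption made for this illustration.
    An empty result means all targets are at the same commit level,
    which corresponds to skipping recovery processing.
    """
    if not commit_levels:
        return []
    highest = max(commit_levels.values())
    # Targets below the highest level need the last UOW flushed to them.
    return sorted(name for name, level in commit_levels.items() if level < highest)
```

For example, with two targets at the same level the result is empty, while a target one level behind the others is reported as needing resynchronization.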
Recovery Processing
Recover workflows and tasks by selecting the recover command in Workflow Manager, Workflow Monitor, or pmcmd. When you request recovery, PWXPC issues the following message:
PWXPC_12093 [INFO] [CDCRestart] Recovery run requested. Targets will be resynchronized if required and processing will terminate
Select Recover to populate the restart token file with the restart tokens for all sources in the CDC session so that you can cold start. You can also use recovery to ensure the targets and restart tokens are in a consistent state. PWXPC automatically performs recovery when a workflow or task is warm started. You do not need to recover workflows and tasks before you restart them. PWXPC performs the following tasks for recovery:
1. PWXPC reads the restart tokens from the recovery state tables and file.
Note: PWXPC does not read the restart token file if recovery is required.
2. PWXPC creates the initialization restart token file with the reconciled restart tokens.
3. For heterogeneous targets, PWXPC queries the Integration Service about the commit levels of all targets. If all targets in the session are at the same commit level, PWXPC skips recovery processing.
4. If recovery is required for heterogeneous targets, PWXPC re-reads the data for the last UOW committed to higher-level targets and flushes it to those targets with the lower commit level. The Integration Service commits any flushed data and restart tokens to any relational targets and updates any non-relational files.
5. PWXPC updates the restart token file with the final restart tokens, creates the termination restart token file, and ends.
You can now warm start or cold start the workflow or task to process changed data from the point of interruption.
When you stop a workflow or task gracefully by issuing the stop command through PowerCenter or PowerExchange, the following actions occur:
1. The Integration Service requests PWXPC to stop if you issue the PowerCenter stop command. If you issue the PowerExchange stop command, it sends an end of file to PWXPC.
2. PWXPC performs end of file processing to flush the remaining uncommitted complete units of work to the targets and issues message PWXPC_12101. PWXPC also commits the restart tokens and issues message PWXPC_12068.
3. The Integration Service processes all of the data in the pipeline and writes it to the targets.
4. The Integration Service sends an acknowledgement to PWXPC indicating that the targets have been updated.
5. PWXPC issues message PWXPC_12075, writes the termination restart token file, and shuts down.
6. The Integration Service ends the session successfully.
Use Idle Time=0 in the PWX CDC Real Time connection, which instructs PowerExchange to stop processing at end of log. See Configuring Idle Time on page 115.
Use a PWX CDC Change connection to extract changes from condense files. When you use PowerExchange batch change extraction mode for condense files, the extraction automatically ends when all condensed data is read.
You can also stop a workflow or task using the abort command in Workflow Monitor or pmcmd. For information about the abort command, see the PowerCenter Workflow Administration Guide.
1. Gracefully stop the workflow. See Stopping CDC Sessions on page 170.
2. After the workflow stops successfully, issue the Recover command for the CDC session. When you recover tasks, PWXPC writes the ending restart tokens for the session into the restart token file.
3. Change the session or workflow as desired.
4. Ensure that the restart token file specified in the source CDC connection specifies the restart token file updated in the recovery session.
5. Optionally, update the restart token file to add or remove sources.
6. Cold start the CDC session.
1. Stop the workflow by issuing the Stop command in Workflow Monitor.
2. After the workflow stops, issue the Recover Task command from Workflow Monitor to run a recovery session. This displays the current restart points. The session log shows the following:
CDCDispatcher> PWXPC_12060 [INFO] [CDCRestart]
===============================
Session restart information:
===============================
Extraction Map Name           Restart Token 1
d1dsn7.rrtb0002_RRTB_SRC_002  000000AD220F00000000000000AD220F0000000000000000  storage
d1dsn7.rrtb0001_RRTB_SRC_001  000000AD220F00000000000000AD220F0000000000000000  storage
d1dsn7.rrtb0003_RRTB_SRC_003  000000AD220F00000000000000AD220F0000000000000000  storage
PWXPC places the restart tokens in the restart token file specified in the CDC application connection.
3. Make any necessary changes to the mapping, session, and workflow to add the new source, RRTB_SRC_004.
4. Run DTLUAPPL with RSTTKN GENERATE to generate restart tokens for the current end-of-log. Use the following DTLUAPPL control cards to do this:
mod APPL dummy DSN7 rsttkn generate
mod rsttkn rrtb004
end appl dummy
print appl dummy
Add eight zeroes to the end of the Sequence= value to create the restart token file value.
5. Update the restart token file to add the new source and its tokens. The updated file looks as follows:
<!-- existing sources
d1dsn7.rrtb0001_RRTB_SRC_001=000000AD220F00000000000000AD220F0000000000000000
d1dsn7.rrtb0001_RRTB_SRC_001=C1E4E2D34040000000AD0D9C00000000
d1dsn7.rrtb0002_RRTB_SRC_002=000000AD220F00000000000000AD220F0000000000000000
d1dsn7.rrtb0002_RRTB_SRC_002=C1E4E2D34040000000AD0D9C00000000
d1dsn7.rrtb0003_RRTB_SRC_003=000000AD220F00000000000000AD220F0000000000000000
d1dsn7.rrtb0003_RRTB_SRC_003=C1E4E2D34040000000AD0D9C00000000
<!-- new source
d1dsn7.rrtb0004_RRTB_SRC_004=00000DBF240A0000000000000DBF240A0000000000000000
d1dsn7.rrtb0004_RRTB_SRC_004=C1E4E2D3404000000DBF238200000000
6. Cold start the session. PWXPC passes these restart tokens to PowerExchange to recommence extracting changes from the change stream. Note that this restart point is earlier than the one just generated for the new source. The new source does not receive any changes until the first change following its restart point is encountered.
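Before cold starting, it can help to check that every source in the edited restart token file has both of its tokens. The Python sketch below reads a file in the layout shown in step 5. It is an illustration only: the function name parse_restart_token_file is hypothetical, and it handles just the comment lines and the two-lines-per-source form used in this example, not any other statement types the file may support.

```python
def parse_restart_token_file(text: str) -> dict[str, tuple[str, str]]:
    """Map each extraction map name to (restart token 1, restart token 2).

    Assumes comment lines start with '<!--' and that each source has exactly
    two 'name=value' lines, the first holding restart token 1 and the second
    restart token 2, as in the example file.
    """
    collected: dict[str, list[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("<!--"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a name=value line: {line!r}")
        collected.setdefault(name.strip(), []).append(value.strip())

    result: dict[str, tuple[str, str]] = {}
    for name, values in collected.items():
        if len(values) != 2:
            raise ValueError(f"{name}: expected 2 tokens, found {len(values)}")
        result[name] = (values[0], values[1])
    return result
```

A source with only one token line raises an error, which catches a common editing slip before the session is started.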
Configuring the Restart Token File on page 162
Using DTLUAPPL with CDC Sessions on page 167
Permanent errors such as source or target data errors
Transitory errors such as infrastructure problems, server crashes, and network availability issues
If the session fails because of transitory errors, restart the session after the source of the transitory error is corrected. PWXPC automatically recovers warm started sessions if required, although you can also run a recovery session. See Recovery Processing on page 170.
Note: You cannot override the restart point if recovery processing is required. PWXPC does not read the restart token file if you warm start and recovery is required or if you run a recovery session.
CDC sessions also fail because of permanent errors, such as SQL failures or other database errors. You must correct permanent errors before restarting the CDC session. With some failures, you can correct the error and then restart the CDC session. In other cases, you need to re-materialize the target table from the source table before you recommence applying changes to it. If you re-materialize the target table, you need to provide restart tokens matching the new restart point in the change stream and then cold start the CDC session.
PWXPC automatically recovers when the session is warm started. PWXPC issues the following messages displaying the restart tokens found for the session and its sources:
CDCDispatcher> PWXPC_12060 [INFO] [CDCRestart]
===============================
Session restart information:
===============================
Extraction Map Name           Restart Token 1                                   Restart Token 2                   Source
d1dsn8.rrtb0004_RRTB_SRC_004  00000FCA65840000000000000D2E004A00000000FFFFFFFF  C1E4E2D3404000000D21B1A500000000  GMD storage
d1dsn8.rrtb0009_RRTB_SRC_009  00000FCA65840000000000000D2E004A00000000FFFFFFFF  C1E4E2D3404000000D21B1A500000000  GMD storage
d1dsn8.rrtb0005_RRTB_SRC_005  00000FCA65840000000000000D2E004A00000000FFFFFFFF  C1E4E2D3404000000D21B1A500000000  GMD storage
d1dsn8.rrtb0006_RRTB_SRC_006  00000FCA65840000000000000D2E004A00000000FFFFFFFF  C1E4E2D3404000000D21B1A500000000  GMD storage
d1dsn8.rrtb0008_RRTB_SRC_008  00000FCA65840000000000000D2E004A00000000FFFFFFFF  C1E4E2D3404000000D21B1A500000000  GMD storage
d1dsn8.rrtb0003_RRTB_SRC_003  00000FCA65840000000000000D2E004A00000000FFFFFFFF  C1E4E2D3404000000D21B1A500000000  GMD storage
d1dsn8.rrtb0002_RRTB_SRC_002  00000FCA65840000000000000D2E004A00000000FFFFFFFF  C1E4E2D3404000000D21B1A500000000  GMD storage
d1dsn8.rrtb0001_RRTB_SRC_001  00000FCA65840000000000000D2E004A00000000FFFFFFFF  C1E4E2D3404000000D21B1A500000000  GMD storage
d1dsn8.rrtb0007_RRTB_SRC_007  00000FCA65840000000000000D2E004A00000000FFFFFFFF  C1E4E2D3404000000D21B1A500000000  GMD storage
PWXPC issues the PWXPC_12069 message when it detects that recovery is required. The 12069 message usually includes the begin-UOW (from) and end-UOW (to) restart tokens for the oldest uncommitted UOW that PWXPC re-reads during recovery. PWXPC stores end-UOW restart tokens in the state table and file unless sub-packet commit is used. See Configuring Commit Threshold on page 121.
CDCDispatcher> PWXPC_12069 [INFO] [CDCRestart] Running in recovery mode. Reader will resend the the oldest uncommitted UOW to resync targets:
from: Restart 1 [00000FCA65840000000000000D2E004A00000000FFFFFFFF] : Restart 2 [C1E4E2D3404000000D21B1A500000000]
to: Restart 1 [00000FCA65840000000000000D300D8000000000FFFFFFFF] : Restart 2 [C1E4E2D3404000000D21B1A500000000].
The from restart tokens are the same as those displayed in the PWXPC_12060 messages for all sources. This restart token represents the start point in the change stream for the oldest uncommitted UOW. The to restart tokens represent the end of the oldest uncommitted UOW. Since the application connection for this session specifies sub-packet commit, the Restart 2 value is the begin-UOW value in both cases. The Restart 1 values represent the start and end change records in the Restart 2 UOW. PWXPC rereads the changes between the two restart token values in the 12069 message and issues a commit for the data and the restart tokens. The Integration Service writes the data to the target tables and the restart tokens to the state table. PWXPC and the Integration Service continue to read and write data and restart tokens until the session ends or is stopped.
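When reviewing session logs, the from and to token pairs of a 12069 message can be pulled out with a short Python sketch such as the one below. It is an illustration only: the function name parse_12069 is hypothetical, and the pattern assumes the message layout matches the sample shown above.

```python
import re

# Matches "from: Restart 1 [..] : Restart 2 [..]" and the equivalent "to:" part.
_TOKEN_PAIR = re.compile(
    r"\b(from|to):\s*Restart 1 \[([0-9A-Fa-f]+)\]\s*:\s*Restart 2 \[([0-9A-Fa-f]+)\]"
)


def parse_12069(message: str) -> dict[str, tuple[str, str]]:
    """Return {'from': (restart1, restart2), 'to': (restart1, restart2)}.

    Only the parts actually present in the message are returned.
    """
    return {m.group(1): (m.group(2), m.group(3)) for m in _TOKEN_PAIR.finditer(message)}
```

Comparing the Restart 2 values of the two entries shows whether both positions fall in the same UOW, as they do in the sub-packet commit case described above.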
You can determine starting and ending restart points for each extraction using historical copies of the restart token file. You need historical copies of the session logs to re-extract changes at any point in between the session start and end. When PWXPC issues a real-time flush to commit data to the targets, it issues message PWXPC_10081. This message contains the restart tokens at that point in time:
PWXPC_10081 [INFO] [CDCDispatcher] raising real-time flush with restart tokens [<restart1_token>], [<restart2_token>] <because UOW Count [<n>] is reached.> | <because Real-time Flush Latency [<n>] occurred.>
To restart an extraction from a specific commit point, use the restart tokens in the appropriate 10081 message to populate the restart token file and cold start the CDC session. PWXPC passes the restart token file values to PowerExchange to extract the data from that point forward.
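As a sketch of the procedure above, the following Python function takes one PWXPC_10081 line from a session log and produces restart token file entries for a list of extraction map names. It is not an Informatica utility: the function name restart_lines_from_flush is hypothetical, the pattern assumes the message layout shown above, and applying the same tokens to every listed source is a choice made for this example.

```python
import re

# Captures the two bracketed restart tokens of a PWXPC_10081 message.
_FLUSH = re.compile(
    r"PWXPC_10081\b.*?restart tokens \[([0-9A-Fa-f]+)\],\s*\[([0-9A-Fa-f]+)\]"
)


def restart_lines_from_flush(log_line: str, extraction_maps: list[str]) -> list[str]:
    """Build restart token file lines from one PWXPC_10081 log line."""
    match = _FLUSH.search(log_line)
    if match is None:
        raise ValueError("line is not a PWXPC_10081 message with restart tokens")
    restart1, restart2 = match.group(1), match.group(2)
    lines: list[str] = []
    for name in extraction_maps:
        # Two lines per source: restart token 1, then restart token 2.
        lines.append(f"{name}={restart1}")
        lines.append(f"{name}={restart2}")
    return lines
```

The returned lines can be written to the restart token file named in the CDC connection before the session is cold started.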
Chapter 7
Target Key Transformations, 178
Group Source and Flexible Transformations, 187
Here we see that COL1 is updated from A to C. COL1 is marked as the primary key for the target table, so the update is treated as a delete of row A and an insert of row C.
To add the before image and change indicator, right-click the required extraction group. Open the group as shown below:
3. Open the required extraction map by either right-clicking and selecting Open, or by double-clicking the required extraction map.
4. Right-click the column which requires the before image and change indicator to be set.
5.
6. To set the change indicator, select and add the required column by double-clicking or highlighting it and then clicking Add. When all the required change indicators have been set up, click the Before Images tab.
7. Repeat the process for columns which will require the before image to be included with the change capture data.
Note: The BI and CI column names can be changed on these screens by single-clicking and editing. The following sections in this chapter will refer to the default names.
Configuring PowerCenter
The PowerExchange Change Data Capture (CDC) source will now need to be imported. This must be done through the Import from PowerExchange option under Source Analyzer in the PowerCenter Designer. For more information on importing a CDC source see Working with Extraction Map Definitions on page 57. After the CDC source has been imported, the target definitions will be required. Import them using the Target Designer in the PowerCenter Designer. For more information about importing a target see Source and Target Definitions on page 23.
The following example will work with a DB2 data capture source and a DB2 target table. The structure of the source CDC table is:
Figure 7-2. DB2 Extraction Map Source Mapping
Note: The columns prefixed DTL_CI and DTL_BI are the change indicator and before image respectively, which are vital for this flexible transformation. The DTL__CAPX columns are PowerExchange capture columns.
The target is a DB2 table of the following structure:
Figure 7-3. DB2 Target Table Mapping
The source table has a primary key of CUSTOMER_ID, and the PARTNER_CUST_ID is an attribute of the table. The transformation will load the data into the target table where the PARTNER_CUST_ID is the primary key, and the CUSTOMER_ID is an attribute of that table. In this case, if the PARTNER_CUST_ID changes in the source table, a delete and insert will be required for the target.
Now drag the PowerExchange CDC source, and the target into the mapping as below:
Figure 7-4. DB2 Source to DB2 Target CDC mapping
Note: Any of the control information including BI and CI columns could, if required, be
From the Transformation option, select Create and then select a transformation type of FLXKEYTRANS.
Link into the transformation the required columns from the source for the target. Also, the transformation will require the BI and CI indicators that were assigned earlier in PowerExchange (see Configuring the PowerExchange Extraction Map on page 179). A final column DTL__CAPXACTION should also be added to the transformation.
3. The BI and CI columns need to be linked to the Flexible transformation. Right-click the transformation you have just created and select Edit. From within the edit dialog box, select the Source Column Map tab and add the columns to which the relevant before images and change indicators are assigned.
Note: The example above shows a single column primary key. Multiple columns can be
4. Link the transformation to the target. Only link the required data columns from the transformation to the target. No links will exist from the transformation to the target for the CI and BI columns, nor for the DTL__CAPXACTION column unless required in the target. This will result in the following mapping:
5.
Note: In this example, one data map has been created for records with a REC_TYPE of A and another for records with a REC_TYPE of B. These data maps are ksdss1.ksdsm1 and ksdss2.ksdsm2 respectively.
3. Now register each of these data maps for PowerExchange capture. This process is described in the relevant PowerExchange Adapter Guide.
4. Assign the BI and CI PowerExchange fields to the extraction map as shown above in Configuring the PowerExchange Extraction Map on page 179.
5. Now import these two extraction maps as data sources into the PowerCenter Designer using Import from PowerExchange.
6. Import the relevant data targets. In this example the two input capture streams will be written to two separate DB2 tables.
7. When the mapping is created it will look similar to the following:
Note how the Flexible transformation contains the columns for both of the PowerExchange change data input sources, and that they are linked to their own individual output tables. The Flexible transformation includes input and output groups. One input group and one output group are added when the Flexible transformation is created, but additional input and output groups are required for each source. To add them, edit the transformation, select the Ports tab shown below, and then use the Create Input Group and Output Group buttons (the Create Input Group button is highlighted here):
Before validating the mapping, the before image and change indicators will need to be assigned to ports as shown in step 4 on page 187.
Installing PowerExchange ODBC, 193
Working with Mappings, 197
Configuring Connections, 205
Working with Sessions, 211
PowerExchange Restart and Recovery, 221
Chapter 8
Overview
Before installing and configuring the PowerExchange ODBC connection, you must install and configure PowerCenter and PowerExchange.
Note: When connecting to PowerExchange, Informatica recommends using PWXPC instead of PowerExchange ODBC. PWXPC has additional functionality as well as improved performance and superior CDC recovery and restart. See Functional Comparison between PWXPC and PowerExchange ODBC on page 4.
Installation Requirements
To use PowerExchange ODBC connections with PowerCenter, the following products must be installed:
PowerCenter 8.5.1. For more information about installing PowerCenter see the PowerCenter Installation Guide.
PowerExchange 8.5.1. Install PowerExchange on the PowerCenter Client and Integration Service machines. For more information about installing PowerExchange see the PowerExchange Installation Guide.
The PowerCenter Client, Integration Service and Repository Server software needs to be installed on the appropriate platforms. The PowerExchange software needs to be installed on the same PowerCenter Client and Server machines. If you have installed the 32-bit version of Integration Service, you must install the 32-bit version of PWXPC and of PowerExchange. If you have installed the 64-bit version of Integration Service, you must install the 64-bit version of PWXPC and of PowerExchange.
1. Click Control Panel > Administrative Tools > Data Sources (ODBC).
2. Click the System DSN tab.
3. Click the Add button.
4. Select the Informatica PowerExchange driver from the list of available drivers. Click Finish.
5. Enter a name for the data source in the Name box.
6. Select the location from the Location pull-down list. This is the name defined on a NODE= statement in the PowerExchange configuration file (dbmover.cfg).
7. Select the data source type from the Type pull-down list. Depending on the data source selected, you are presented with other specific properties that you can set.
8. Complete all property parameters, and click OK. The ODBC data source is created.
For more information about creating ODBC data sources, see "Using ODBC with PowerExchange" in the PowerExchange Reference Manual.
DESCRIPTION='<Descriptive Text for Data Source>'
LOCATION=<data source node from dbmover.cfg>
DBTYPE=< Access method for file or database>
(other ODBC parameters as appropriate)
For more information about ODBC data source parameters (both mandatory and optional) for a specific DBTYPE, see the PowerExchange Reference Manual. The <data source name> defined in the odbc.ini is specified in the Connect String value of the ODBC Connection in PowerCenter. This connect string causes the PowerExchange ODBC driver to be loaded and the specified location to be contacted to extract or load the data. For more information about ODBC connectivity with PowerCenter, see the PowerCenter Configuration Guide.
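As a rough illustration of the entry layout described above, the following Python sketch renders the three parameters shown for one data source. It is not an Informatica tool: the function name odbc_ini_entry is hypothetical, heading the entry with the data source name in square brackets follows the usual odbc.ini convention and is an assumption here, and any other ODBC parameters the driver needs are deliberately left out.

```python
def odbc_ini_entry(name: str, description: str, location: str, dbtype: str) -> str:
    """Render a minimal odbc.ini entry for a PowerExchange data source.

    Only the DESCRIPTION, LOCATION and DBTYPE parameters are written. The
    bracketed section header is an assumption based on the usual odbc.ini
    layout; other required ODBC parameters are not included.
    """
    lines = [
        f"[{name}]",                      # data source name used in the Connect String
        f"DESCRIPTION='{description}'",
        f"LOCATION={location}",           # node name from dbmover.cfg
        f"DBTYPE={dbtype}",               # access method, for example NRDB
    ]
    return "\n".join(lines) + "\n"
```

The name passed as the first argument is the value that would then be entered as the Connect String of the ODBC connection in PowerCenter.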
Chapter 9
Overview, 198
Working with Source and Target Definitions for PowerExchange Batch, 199
Working with Source Definitions for PowerExchange Change or Real-time, 202
Overview
A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets. Source and target definitions represent metadata for sources and targets. When you create a source definition, its structure differs depending on the type of source it represents:
Non-relational sources require a multi-group source definition.
Relational sources use a single-group source definition.
The source qualifier for a source definition also differs in structure depending on the type of source definition. After you create a source or target definition, you can include it in a mapping to extract data from the source or load data to the target. You can extract source data in batch, change, or real-time mode. You can use one source definition and one mapping for all modes. For a list of sources and targets that PowerExchange ODBC Interface supports, see Table 1-4 on page 10. This table also lists whether the Integration Service can read the source data in batch, change, or real-time mode.
Click Sources > Import from Database in the Source Analyzer if importing a source definition. Click Targets>Import from Database in the Target Designer if importing a target definition. The following Import Tables dialog box appears.
Use the Owner name field to restrict the objects retrieved. When you import PowerExchange data maps, the Owner is the Schema Name of the data map and the ODBC data source must have a DB Type of NRDB or NRDB2. When the DB2 catalog is used for DB2/390 or DB2/400, the Owner is the owner of the DB2 tables and the DB Type must be either DB2 or DB2400C. If the Listener pointed to by the ODBC data source is running with PowerExchange Security (either SECURITY=(1,x) or SECURITY=(2,x) in the PowerExchange configuration file), then a valid user ID and password must be provided. The only difference between NRDB and NRDB2 is whether a three-tier or two-tier naming convention is used in the SQL statements to extract or load data. Non-relational sources and targets mapped in PowerExchange can be referred to using either NRDB or NRDB2. The format is as follows:
schema.mapname.table for NRDB
or
schema.mapname_table for NRDB2
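The difference between the two formats can be shown with a small Python sketch. The function name qualified_table_name is hypothetical and exists only to illustrate the naming rule stated above.

```python
def qualified_table_name(db_type: str, schema: str, mapname: str, table: str) -> str:
    """Build the table reference for a PowerExchange data map.

    NRDB uses three-tier naming (schema.mapname.table), while NRDB2 uses
    two-tier naming (schema.mapname_table).
    """
    if db_type == "NRDB":
        return f"{schema}.{mapname}.{table}"
    if db_type == "NRDB2":
        return f"{schema}.{mapname}_{table}"
    raise ValueError(f"unsupported DB Type for data maps: {db_type!r}")
```

The same data map is therefore addressed with a dot before the table name under NRDB and with an underscore under NRDB2.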
2. Click Connect.
3. Hold down the Shift key to select blocks of tables. Hold down the Ctrl key to make non-contiguous selections within a schema. Click Select All to select all tables. Click Select None to clear all highlighted selections.
4.
5.
Click Sources > Import from Database in the Source Analyzer. The Import Tables dialog box appears.
Use the Owner name field to restrict the objects retrieved. When you import PowerExchange extraction maps, the Owner is the first qualifier of the extraction map name. The entire extraction map name has the following format:
<D><N><instance>.<regname>_TABLENAME
where:
D - the default entry starts with D (any user-modified maps start with U)
N - database-specific identifier (e.g., 1 for DB2/390, 2 for IMS, 3 for DB2/400, etc.)
instance - instance name chosen for the source registration
regname - the registration name chosen for the source registration.
For example, a DB2/390 extraction map name might be: d1dsn7.testdb2_KJM723TB. When you import Change Data Capture source definitions, the ODBC data source must have a DB Type of CAPX or CAPXRT. These DB Types instruct the Listener to select extraction maps rather than data maps. If the Listener pointed to by the ODBC data source is running with PowerExchange Security (either SECURITY=(1,x) or SECURITY=(2,x) in the DBMOVER configuration), then a valid user name and password must be provided.
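The naming format can be illustrated by splitting a name into its parts with a short Python sketch. The function name parse_extraction_map_name is hypothetical. The sketch assumes that the database identifier is a single character and that the registration name contains no underscore, so the first underscore after the dot separates it from the table name.

```python
import re

# <D><N><instance>.<regname>_TABLENAME, under the assumptions stated above.
_MAP_NAME = re.compile(r"^([A-Za-z])(.)([^.]+)\.([^_]+)_(.+)$")


def parse_extraction_map_name(name: str) -> dict[str, str]:
    """Split an extraction map name into its documented components."""
    match = _MAP_NAME.match(name)
    if match is None:
        raise ValueError(f"not an extraction map name: {name!r}")
    prefix, db_id, instance, regname, table = match.groups()
    return {
        "prefix": prefix,      # D for default maps, U for user-modified maps
        "db_id": db_id,        # database-specific identifier
        "instance": instance,  # instance name of the source registration
        "regname": regname,    # registration name
        "table": table,        # table name
    }
```

Applied to the example name d1dsn7.testdb2_KJM723TB, this gives the prefix d, the identifier 1, the instance dsn7, the registration name testdb2, and the table KJM723TB. The Owner used to restrict the import is the part before the dot.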
2. Click Connect.
3. Hold down the Shift key to select blocks of tables. Hold down the Ctrl key to make non-contiguous selections within a schema. Click Select All to select all tables. Click Select None to clear all highlighted selections.
4.
5.
Chapter 10
Configuring Connections
Overview
Before PowerCenter can access a source or target in a session, you must configure connections in the Workflow Manager. When you create or modify a session that reads from or writes to a database, you can select only configured source and target databases. Connections are saved in the repository. For PowerExchange ODBC, you configure relational database connections.
Extraction Mode                       Connection Type
Batch mode (non-relational data map)  ODBC with NRDB and NRDB2 Data Source
Batch mode (relational)               ODBC with Data Source of appropriate database type (DB2, DB2400C, ADAUNLD, etc.)
Change mode                           ODBC with CAPX Data Source
Real-time mode                        ODBC with CAPXRT Data Source
Note: For more information about the full range of database types that can be specified in
Database Type (Access Method)
DB2 (DB2400C)
IDMS
IMS
Sequential (NRDB/NRDB2)
VSAM-KSDS (NRDB/NRDB2)
VSAM-ESDS (NRDB/NRDB2)
VSAM-RRDS (NRDB/NRDB2)
Table 10-3 shows the Connection Object Definition dialog box for an ODBC relational connection and describes the connection attributes to configure for an ODBC relational database connection:
Table 10-3. ODBC Connection Object Definition Table
Connection Attribute         Required/Optional  Description
Name                         Required           Name for the relational database connection.
User Name                    Required           Username for the data source.
Password                     Required           Password for the User Name.
Connect String               Required           Name of the ODBC data source.
Code Page                    Required           Code page for the Integration Service to use to extract the data from the data source.
Connection Environment SQL   Optional           Executes an SQL command with each database connection. Default is disabled.
Transaction Environment SQL  Optional           Executes an SQL command before the initiation of each transaction. Default is disabled.
Connection Retry Period      Optional           Number of seconds the Integration Service attempts to reconnect to the database if the connection fails. If the Integration Service cannot connect to the database in the retry period, the session fails.
Chapter 11
Working with Workflows
Overview, 212
Extracting Data from PowerExchange in Batch Mode, 213
Extracting Data from PowerExchange in Change and Real-time Mode, 217
Loading Data to PowerExchange Targets, 219
Pipeline Partitioning
Depending on your source or target database, you can increase the number of partitions in a pipeline to improve session performance. Increasing the number of partitions allows the Integration Service to create multiple connections to sources and targets and process partitions of data concurrently. While processing data, the Integration Service may process data out of sequence due to the varying rates at which the partitions process data. When you create a session in a workflow, the Workflow Manager validates each pipeline in the mapping for partitioning. You can specify multiple partitions in a pipeline if the Integration Service can maintain data consistency when it processes the partitioned data. For more information about partitioning and a list of all partitioning restrictions, see the PowerCenter Workflow Administration Guide.
conventions is used in the SQL statements to extract or load data. Non-relational sources and targets mapped in PowerExchange can be referred to using either NRDB or NRDB2. The format is as follows:
1. In the Task Developer, double-click a session with a non-relational source to open the session properties.
2. Click the Sources view on the Mapping tab.
3. In the Reader field of the Readers settings, Relational Reader is automatically selected.
4. In the Connections Value field, select the non-relational ODBC connection (NRDB or NRDB2).
5. In the Properties settings, configure the Owner Name attribute. For more information about other Properties attributes, see the PowerCenter Workflow Administration Guide.
At minimum, the schema name of the PowerExchange data map (or the Owner name displayed when the source mapping is edited) must be specified in order to correctly construct the SQL statement during execution. PowerExchange SQL Escape Sequences can also be specified in this attribute field to override specifications in the data map. For a complete list of the SQL Escape Sequences available, see the PowerExchange Reference Manual.
The following example shows how the Owner Name attribute can be configured to provide the schema name for the source (seq) as well as an override for the physical file name in the data map (dtldsn=new.dataset.name):
seq{dtldsn=new.dataset.name}
Note: SQL escape sequences and the Owner Name can be specified in any order in the attribute field.
6. Click OK.
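The Owner Name value in the example above is the schema name combined with one escape sequence in braces. As a minimal sketch, assuming a single escape sequence, the value could be assembled in Python as follows. The helper name owner_name_value is hypothetical and is not part of PowerExchange or PowerCenter.

```python
def owner_name_value(schema, escape_key=None, escape_value=None, escape_first=False):
    """Build the text for the Owner Name attribute (hypothetical helper).

    schema       -- schema name of the data map, e.g. "seq"
    escape_key   -- optional SQL escape sequence keyword, e.g. "dtldsn"
    escape_value -- value for that keyword, e.g. "new.dataset.name"
    escape_first -- put the escape sequence before the schema name
    """
    if not schema:
        raise ValueError("the schema (owner) name is required")
    if escape_key is None:
        return schema
    # The braces and key=value layout follow the example in the text.
    escape = "{%s=%s}" % (escape_key, escape_value)
    return escape + schema if escape_first else schema + escape


print(owner_name_value("seq", "dtldsn", "new.dataset.name"))
```

The escape_first flag reflects the note that the escape sequence and the Owner Name can appear in either order.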
1. In the Task Developer, double-click a session with a relational source to open the session properties. Click the Sources view on the Mapping tab.
2. In the Reader field of the Readers settings, Relational Reader is selected.
3. In the Connections Value field, select the appropriate relational ODBC connection.
4. In the Properties settings, configure the Owner Name attribute. For more information about other Properties attributes, see the PowerCenter Workflow Administration Guide.
At minimum, the owner name of the source table must be specified in order to correctly construct the SQL statement during execution. Alternatively, the Owner Name can be specified in the source mapping, in which case it is not required here.
5. Click OK.
1. In the Task Developer, double-click a session with a relational source to open the session properties.
2. Click the Sources view on the Mapping tab.
3. In the Reader field of the Readers settings, Relational Reader is automatically selected.
4. In the Connections Value field, select a connection that points to an ODBC data source with DBType CAPX (for Change) or CAPXRT (for Real-Time) and the appropriate Location value.
5. In the Properties settings, configure the Owner Name attribute. For more information about other Properties attributes, see the PowerCenter Workflow Administration Guide.
At minimum, the schema name of the source extraction map must be specified in order to correctly construct the SQL statement during execution. This name is the first qualifier of the extraction map name shown in the PowerExchange Navigator. It can also be determined by checking the Owner Name in the source mapping. PowerExchange SQL Escape Sequences can also be specified in this attribute field to override specifications in the data map. For a complete list of the SQL Escape Sequences available, see the PowerExchange Reference Manual.
The following example shows how the Owner Name attribute can be configured to provide the schema name for the source (d6vsam) as well as an override for the application name specified in the DBQual2/Application Name field in the ODBC data source (dtlapp=new_appname):
{dtlapp=new_appname}d6vsam
Note: SQL escape sequences and the Owner Name can be specified in any order in the attribute field.
6. Click OK.
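Because the escape sequence and the Owner Name can appear in either order, a script that inspects session settings has to separate the two parts without relying on position. The following Python sketch shows one way to do that with a regular expression. It is illustrative only, is not PowerExchange's own parser, and assumes each escape sequence has the brace-delimited key=value form shown in the example.

```python
import re

# Matches one brace-delimited escape sequence such as {dtlapp=new_appname}.
_ESCAPE = re.compile(r"\{([^{}=]+)=([^{}]*)\}")


def split_owner_name(value):
    """Split an Owner Name attribute value into (schema, overrides).

    Illustrative only: this is not PowerExchange's own parser.
    """
    overrides = {key.strip(): val.strip() for key, val in _ESCAPE.findall(value)}
    # Whatever remains after removing the escape sequences is the schema name.
    schema = _ESCAPE.sub("", value).strip()
    return schema, overrides


print(split_owner_name("{dtlapp=new_appname}d6vsam"))
print(split_owner_name("d6vsam{dtlapp=new_appname}"))
```

Both calls yield the same schema name and the same override, whichever part comes first.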
1. In the Task Developer, double-click a session with a relational source to open the session properties.
2. Click the Targets view on the Mapping tab.
3. In the Writers settings, ensure that Relational Writer is selected.
4. In the Connections Value field, select a connection that points to an ODBC data source with the appropriate DBType and Location values.
5. In the Properties settings, configure the Table Name Prefix attribute. At minimum, the schema name of the target table (if relational) or the PowerExchange data map (if non-relational) must be specified in order to correctly construct the SQL statement during execution.
6. Click OK.
Chapter 12
Overview
Recovery and restart need to be considered when designing and configuring sessions and workflows using either PowerExchange Change or Real-Time. The considerations differ depending upon whether the session uses PowerExchange Client for PowerCenter (PWXPC) or PowerExchange ODBC. This is primarily due to the differences in where the restart information is maintained. With PowerExchange ODBC, the restart information is controlled and maintained on the PowerExchange Listener platform in the internal change information file (CDEP) using the application name specified in the connection.
conflict in the CDEP between multiple sessions. The CDEP information for an extraction is only updated when a session ends successfully. Failed sessions do not update the restart token information in the CDEP with the progress so far.
Warning: Failed sessions that are restarted will extract data from the last successful session (based on CAPXTYPE specified). This means that there is the possibility that duplicated changes will be sent to the target. In the event of a failure, you must either restore the targets to match the restart point prior to restarting the session or design your PowerCenter session to handle the possibility of duplicate records.
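One way to design for the duplicate records mentioned in the warning is to make the apply step idempotent, so that replaying a change leaves the target unchanged. The following Python sketch illustrates the idea with an in-memory dictionary standing in for the target table. The record layout (action, key, row) is an assumption made for illustration and is not a PowerExchange format.

```python
def apply_changes(target, changes):
    """Apply change records to `target` (a dict keyed by primary key).

    Each change is a dict with hypothetical fields:
      action -- "I" (insert), "U" (update) or "D" (delete)
      key    -- primary key value
      row    -- full row image for inserts and updates
    Inserts and updates are both treated as upserts and deletes ignore
    missing keys, so replaying the same changes has no further effect.
    """
    for change in changes:
        if change["action"] in ("I", "U"):
            target[change["key"]] = change["row"]
        elif change["action"] == "D":
            target.pop(change["key"], None)
        else:
            raise ValueError("unknown action: %r" % change["action"])
    return target


changes = [
    {"action": "I", "key": 1, "row": {"name": "A"}},
    {"action": "U", "key": 1, "row": {"name": "B"}},
    {"action": "D", "key": 2, "row": None},
]
target = apply_changes({2: {"name": "Z"}}, changes)
replayed = apply_changes(dict(target), changes)  # simulate a restarted session
print(target == replayed)
```

This approach only holds if the changes are replayed in their original order and carry complete row images. Otherwise, restoring the targets to the restart point remains the safer option.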
After a session completes successfully, PowerExchange updates the CDEP with the ending restart tokens.
The application name odbc_db2demo13ac was created using DTLUAPPL with RSTTKN GENERATE. An example of the control cards that were used is given below:
mod APPL odbc_db2demo13ac DSN7 RSTTKN GENERATE
add rsttkn db2demo1
add rsttkn db2demo2
add rsttkn db2demo3
end APPL odbc_db2demo13ac
print appl odbc_db2demo13ac
You can use either ADD APPL or MOD APPL for a new application name because, with MOD APPL, DTLUAPPL creates the application name if it does not already exist. After the new restart tokens are provided through DTLUAPPL, the session using this application name can be started, and it will use these restart tokens.
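The control cards in the example follow a regular pattern, so they can be generated when many registrations are involved. The following Python sketch only reproduces the text of the example, assuming one statement per line. The function name is hypothetical, and the statement syntax is taken from the example rather than from the full DTLUAPPL reference.

```python
def dtluappl_cards(app_name, instance, registrations):
    """Return DTLUAPPL control cards modelled on the example in the text.

    app_name      -- application name, e.g. "odbc_db2demo13ac"
    instance      -- the value that follows the application name in the
                     example ("DSN7")
    registrations -- names to list on "add rsttkn" statements
    """
    lines = ["mod APPL %s %s RSTTKN GENERATE" % (app_name, instance)]
    lines += ["add rsttkn %s" % name for name in registrations]
    lines.append("end APPL %s" % app_name)
    lines.append("print appl %s" % app_name)
    return "\n".join(lines)


print(dtluappl_cards("odbc_db2demo13ac", "DSN7",
                     ["db2demo1", "db2demo2", "db2demo3"]))
```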
Part 4: Appendices
Appendix A
Tips
Organizing Sources by Map Type in Designer
Filtering Source Data Using PWXPC
Using DTLREXE to Submit MVS Batch Jobs
Creating Sequential and GDG Data Sets
The IMS database is called IMS1T01, and this name is used as the PowerExchange Data Map Name.
A segment in that database results in a table in the data map called IMSSEG1.
The PowerExchange Schema Name used when creating the data map is IMS.
The PowerExchange map name (in the NRDB2 form) is constructed as follows:
<schema_name>.<datamap_name>_<table_name>
In this example, the PowerExchange data map name is IMS.IMS1T01_IMSSEG1. If you use the IMS database name for the Capture Registration Name as well, the resulting extraction map name will be d2reconid.IMS1T01_IMSSEG1. The map name is the same for both the data map and the extraction map. Because the schema name is not used in the source definition table name, the names will be exactly the same when imported in Designer. As a result, only one of them can be imported from the same Location. There are two choices:
1. Use a different name for the capture registration so that a unique extraction map name is created.
2. Use a different Location name to import data maps and extraction maps.
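The name clash described above can be reproduced with a few lines of Python. This is a sketch of the naming rule only. The function names are hypothetical, and dropping the schema qualifier is modelled on the statement that the schema name is not used in the source definition table name.

```python
def nrdb2_map_name(schema_name, datamap_name, table_name):
    """Build <schema_name>.<datamap_name>_<table_name>."""
    return "%s.%s_%s" % (schema_name, datamap_name, table_name)


def source_definition_name(map_name):
    """Drop the schema qualifier, as described for source definitions in Designer."""
    return map_name.split(".", 1)[1]


data_map = nrdb2_map_name("IMS", "IMS1T01", "IMSSEG1")
extraction_map = nrdb2_map_name("d2reconid", "IMS1T01", "IMSSEG1")
print(data_map)
print(extraction_map)
# The two names collide once the schema qualifier is removed.
print(source_definition_name(data_map) == source_definition_name(extraction_map))
```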
If you want to use the same names for both the data map name and the capture registration name, then you will need to use a different location name in the Import from PowerExchange to place these two source definitions in different folders. To help organize sources in Designer, use location names in the PowerExchange dbmover.cfg that indicate the type of data you are retrieving. Creating multiple NODE= statements (location names) for the same listener is perfectly acceptable. In this example, if you create a NODE called CDCMAPS in the dbmover.cfg and use this as the Location when importing the IMS extraction map, it will be stored in a subfolder called IMS_CDCMAPS. This strategy of separating extraction maps from regular relational and non-relational source metadata can be a useful way of organizing source metadata in Designer.
With some source types (for example, VSAM and IMS), you cannot limit captured changes to those in which only certain columns change.
It is possible, for certain source types, to create capture registrations in PowerExchange that register only specific columns (as opposed to all of them). However, if the RDBMS logs are used directly for extraction and they do not support this selective column capture, then the row will be extracted even if none of the columns of interest have changed (for example, DB2/400).
The source type is one where either PowerExchange itself or the RDBMS will only capture changes based on columns of interest. However, additional or all columns are registered for capture because other extractions require them.
You only want to extract columns with a specific value. For example, you want to read all of the columns in a table for a specific customer.
In these types of cases, you can use the source Filter Override attribute in the Session Properties to filter the source data. This can be done with sources using a PWXPC Batch, Change, and Real-Time connection. The filters specified are then included in the WHERE clause sent to PowerExchange. Proper SQL syntax should be followed for these overrides to prevent SQL failures. The SQL filter specified can specify any type of column that exists in the source mapping. This includes regular data columns and PowerExchange-generated columns such as DTL__CAPX columns, change indicator columns (DTL__CI) and before image columns (DTL__BI). There are two forms of the filter syntax. The simplest form is for single record sources such as CDC data maps, relational tables, or single record non-relational data maps:
<filter condition1>;<filter condition2>;...
For multi-record non-relational data map sources, there is a more complex form of the syntax:
<group name1>=<filter condition1>;<group name2>=<filter condition2>;...
This more complex form of the syntax allows you to use different filters for different record types or the same filter for only some of the record types in a multi-record source. You can also use the simple form with multi-record source data maps which then causes that filter to be applied to all records.
The following example shows how to use the PowerExchange change indicator columns (DTL__CI_column) to filter changed data. Specifically, it uses the change indicator for the ACCOUNT field, which is called DTL__CI_ACCOUNT:
Figure A-1. Filter Overrides: Single-Record Filter
In the following example, the multi-record VSAM source contains four records, each of which has unique field names. The group names for the four records are: V07A_RECORD_LAYOUT, V07B_RECORD_LAYOUT, V07C_RECORD_LAYOUT, and V07D_RECORD_LAYOUT. The filter uses the group-name filter syntax to filter data records for the first two records. No filtering is done on the other two records. The filter specified in the Filter Overrides attribute is:
V07A_RECORD_LAYOUT=V07A_RECORD_KEY=1;V07B_RECORD_LAYOUT=V07B_RECORD_KEY=2
Because there are four records in the multi-record data, there will be four SELECT statements created by PWXPC. The SELECT statements for the two records specified in the Filter Overrides attribute will also have WHERE clauses for their specific filters.
Records for V07A_RECORD_LAYOUT and V07B_RECORD_LAYOUT will be filtered whereas the other two records in the file will not be.
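Both forms of the filter syntax are semicolon-separated lists, so a Filter Overrides value can be composed programmatically. The following Python sketch builds the string for either form. The function name is hypothetical, and the sketch does not validate the SQL in each condition.

```python
def filter_override(conditions):
    """Build a Filter Overrides value.

    conditions -- either a list of filter conditions (simple form) or a
                  dict mapping group name to filter condition (group form).
    """
    if isinstance(conditions, dict):
        # Group form: <group name>=<filter condition>
        parts = ["%s=%s" % (group, cond) for group, cond in conditions.items()]
    else:
        # Simple form: the conditions are used as given.
        parts = list(conditions)
    return ";".join(parts)


print(filter_override({
    "V07A_RECORD_LAYOUT": "V07A_RECORD_KEY=1",
    "V07B_RECORD_LAYOUT": "V07B_RECORD_KEY=2",
}))
```

Groups that are left out of the dictionary, such as V07C_RECORD_LAYOUT and V07D_RECORD_LAYOUT in the example, receive no filter.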
Truncate a database table prior to loading data into it in a session. This is useful for database types for which PowerCenter does not support truncate, such as Adabas.
Notify an MVS-based job scheduler that the workflow is starting or ending. Some job schedulers provide batch posting utilities, and DTLREXE can be used to submit a batch job that runs them.
Unload a database to a flat file so it can then be used in a session to load another database.
Clean up DB2 bulk load files when the session completes successfully.
Submit any type of MVS batch job for which waiting for the completion and returning a set of messages is required.
The following example shows how to set up a DTLREXE PROG=SUBMIT command as a pre-session command:
Figure A-3. Pre-Session Command - DTLREXE
In this example, the DTLREXE command specifies mode=(job,wait) which means that the DTLREXE will wait for the job to complete. This, in turn, will cause the session to wait until this pre-session command completes. In the Error Handling section of the Config Object, you can specify how to handle errors for pre-session commands in the On Pre-session command task error field.
Note: Ensure that the JOB submitted through DTLREXE includes the appropriate DTLNTS steps if WAIT mode is requested. The PowerExchange RUNLIB, in member DTLREXE, contains sample JCL to be used with DTLREXE that includes the required DTLNTS steps.
If you are using a stand-alone command task to submit a batch JOB using DTLREXE, then there are no session configuration options to check for success or failure. If you want to test the status of the command task in the following session, you will need to use one of the task-specific workflow variables available in the Workflow Manager; that is, either PrevTaskStatus or Status. These variables can be used in link conditions to test the status of tasks in a workflow. For example:
Figure A-4. Workflow Link Condition - DTLREXE
The link condition is created by double-clicking on the link between the DTLREXE command task and the s_bulk_db2demo123_db2demoabc session to which it is connected. This will invoke the Expression Editor which allows you to add the test to ensure that the DTLREXE command task succeeded, as shown below:
Figure A-5. Command Task Expression Editor - DTLREXE
For additional information on link conditions and the expression editor, see the PowerCenter Workflow Administration Guide.
1. In Workflow Manager, right-click on the appropriate task in either Task Developer or in your workflow in Workflow Designer.
2. Select Tasks > Edit. The Edit Tasks dialog box is displayed.
3. Select the Mappings tab.
4. In the Pre SQL attribute in the Session Level Properties for the target, enter the following:
<CMD>CREATEFILE FN=data_set_name
If this is a GDG data set, then the data_set_name should be gdg_base_name(+1) to create a new generation.
5. Click OK.
The following example shows a CREATEFILE command for a new generation of the GDG data set my.gdg:
Figure A-6. Session Mapping Tab - File Create Pre-SQL Command
Note: When using this procedure for GDG data sets, the GDG base name specified must exist and GDGLOCATE=Y must be specified in your PowerExchange DBMOVER configuration file on the MVS platform.
When you run the workflow, the new generation of the GDG is created in addition to the normal processing of the workflow. The allocation parameters used to create the data set are specified in the DBMOVER configuration used by the PowerExchange Listener.
The CREATEFILE command has a number of parameters. Any number of these can be specified. The parameters and their values are separated with a space in the command. The parameters are:
Parameter | Platform | Description
FN | All | File name to be created. If specifying a relative GDG data set name, the file name must be in double quotes (""). This parameter is required.
UID | MVS / AS400 only | Userid. This parameter is required if your Listener is running with user security (SECURITY=1 or 2).
PWD | MVS / AS400 only | Password for the userid specified in UID.
EPWD | MVS / AS400 only | Encrypted password for the userid specified in UID. Only one of PWD or EPWD is required.
(not shown) | MVS only | Model DSCB to be used for the file creation. Generally, this is only required for GDG data sets which are not SMS-managed.
SPACE | MVS only | Space allocation parameters in the format SPACE=(u,p,s), where u is units (T for tracks and C for cylinders), p is the primary space allocation value, and s is the secondary space allocation value.
BS | MVS only | Block size.
RELEASE | MVS only | Release unused allocated space on CLOSE. Valid value is Y.
VOLSER | MVS only | Volume serial.
UNIT | MVS only | Unit type.
LRECL | MVS, AS400 only | Logical record length.
RECFM | MVS, AS400 | Record format.
These values override the equivalent parameters specified in the DBMOVER configuration file. Any values not specified use either those specified in DBMOVER or the standard PowerExchange defaults. See the PowerExchange Reference Manual for further information on these parameters.
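Since the CREATEFILE command is a space-separated list of name=value pairs, the Pre SQL text can be assembled from a few inputs. The following Python sketch is a hypothetical helper, not an Informatica utility. The parameter values used in the call (LRECL=80 and SPACE=(T,10,5)) are illustrative assumptions, and the quoting of relative GDG names follows the description of FN.

```python
def createfile_command(fn, relative_gdg=False, **params):
    """Build a CREATEFILE Pre SQL command string (hypothetical helper).

    fn           -- data set name, e.g. "my.data.set" or "my.gdg(+1)"
    relative_gdg -- wrap the name in double quotes, as the FN description
                    requires for relative GDG data set names
    params       -- optional parameters named in the text, e.g. LRECL=80
    """
    if relative_gdg:
        fn = '"%s"' % fn
    parts = ["<CMD>CREATEFILE", "FN=%s" % fn]
    # Parameters and their values are separated from each other by spaces.
    parts += ["%s=%s" % (name.upper(), value) for name, value in params.items()]
    return " ".join(parts)


print(createfile_command("my.gdg(+1)", relative_gdg=True, LRECL=80, SPACE="(T,10,5)"))
```

Any parameter that is omitted falls back to the DBMOVER setting or the PowerExchange default, so only the values that need to be overridden have to be passed.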
Appendix B
Datatype Reference
Overview
PowerExchange and Transformation Datatypes
Relational Datatypes
Reading and Writing Binary Data in PowerExchange Client for PowerCenter
Using Code Pages
Overview
PowerCenter uses the following datatypes when reading source data, transforming the data, and writing target data:
Native datatypes. Specific to the source and target databases or PowerExchange. Native datatypes appear in source and target definitions.
Transformation datatypes. Generic datatypes that appear in transformations. The Integration Service uses the datatypes to move data across platforms.
For more information about transformation datatypes, see the PowerCenter Designer Guide.
CHAR | 10 | String
DATE | 10 | Date/Time
18 7 3 3 5 5 10 10
Double Double Small Integer Small Integer Small Integer Integer Integer Double
NUM64U | 19 | Decimal
NUMCHAR | | String
PACKED | 15 | Decimal
TIME | | Date/Time
TIMESTAMP | | Date/Time
UPACKED | 15 | Decimal
UZONED | 15 | Decimal
VARBIN | 10 | Binary
VARCHAR | 10 | String
ZONED | 15 | Decimal
Relational Datatypes
PowerExchange Client for PowerCenter supports the same datatypes for DB2/390, DB2/400, and DB2/UDB that PowerCenter supports for DB2. It also supports the same Oracle and the same SQL Server datatypes that PowerCenter supports. For more information about PowerCenter datatypes, see the PowerCenter Designer Guide.
Appendix C
Troubleshooting
Troubleshooting
When I go into Designer, I get messages about failures to load DLLs.
This can happen when PowerExchange Client for PowerCenter plug-ins are installed but cannot be loaded, for example because an incorrect release of PowerExchange is installed or because of PATH problems. For more information, see KnowledgeBase Article # 15346.
I want to import a DB2/400 source definition, but need to determine the name of the DB2/400 database on the AS/400 machine.
Use the AS400 DSPRDBDIRE command to see a list of databases on the AS/400 machine.
The session failed with an error stating that the PowerExchange message repository cannot be loaded.
You can receive this error on UNIX when the PWX_HOME environment variable is not set to the PowerExchange installation directory. Set the PWX_HOME environment variable to the PowerExchange installation directory.
I set the Idle Time session condition to -1. However, the session completed with the following message: Idle Time limit is reached.
This can occur if EOF=Y is specified in the CAPI_CONNECTION statement in the PowerExchange configuration file (dbmover.cfg). When you set EOF=Y, PowerExchange returns an EOF (which stops the session) when it reaches the end of the change stream as determined at the time the session starts reading from it. As a result, the PowerCenter session completes instead of continuing to run. This message can also occur if the connection with PowerExchange is stopped using the PowerExchange STOPTASK command.
My session seems to be processing the pipelines serially.
The Integration Service may be configured to process master and detail pipelines sequentially, as it did in versions prior to 7.0. As a result, it reads data from each source in change and real-time modes sequentially. Clear the PMServer 6.X Joiner Source Order Compatibility option on the Compatibility and Database tab in the Informatica Server Setup. When you rerun the session, the Integration Service will process pipelines concurrently.
The session failed with a plug-in error:
MAPPING> SDKS_38007 Error occurred during [initializing] reader plug-in #30nnnn.
This is a generic message indicating that PWXPC encountered an error. Review the session log for other messages that identify the problem. If there are no other error messages in the session log, check the PowerExchange logs on both the Integration Service platform and the Listener platform.
I want to read all of the changes I have captured and have them be inserts into a staging area. How do I do this?
When using PowerExchange ODBC to read captured changes, INSERT is the default operation. If you want to apply the changes to the target using the same operation as done on the source (INSERT, UPDATE, or DELETE), you need to explicitly include an Update Strategy transformation in the mapping to make this happen by testing the DTL__CAPXACTION field. In the Update Strategy Expression field, you would code:
DECODE(DTL__CAPXACTION,'I',DD_INSERT,'U',DD_UPDATE,'D',DD_DELETE,DD_REJECT)
When using PWXPC, the DTL__CAPXACTION field is automatically acted upon when processing changed data. If you want to have all changes processed as INSERTs regardless of the DTL__CAPXACTION field, you must code an update strategy specifying DD_INSERT in the Update Strategy Expression field.
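The decision made by the DECODE expression, and the effect of coding DD_INSERT instead, can be sketched outside PowerCenter. The following Python fragment is illustrative only. The string constants are symbolic stand-ins for the PowerCenter update strategy constants, and the function name is hypothetical.

```python
# Symbolic stand-ins for the PowerCenter update strategy constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = (
    "DD_INSERT", "DD_UPDATE", "DD_DELETE", "DD_REJECT")

_BY_ACTION = {"I": DD_INSERT, "U": DD_UPDATE, "D": DD_DELETE}


def update_strategy(capx_action, force_insert=False):
    """Mirror the DECODE expression; force_insert models coding DD_INSERT."""
    if force_insert:
        return DD_INSERT
    # Any action other than I, U or D falls through to DD_REJECT.
    return _BY_ACTION.get(capx_action, DD_REJECT)


for action in ("I", "U", "D", "X"):
    print(action, update_strategy(action), update_strategy(action, force_insert=True))
```

With force_insert set, every change is routed as an insert, which corresponds to the staging-area case in the question.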
Index
A
access method CAPX 7 CAPXRT 7, 8 Application Multi-Group Source Qualifiers See source qualifiers application name restart points 117, 167
B
batch extraction mode PowerExchange Condense 7 batch mode configuring sessions 129 Before image Flexible transactions 184 Bulk Load (property) configuring 87
C
CAPX access method 7 CAPXRT access method 7, 8
CBLO See constraint-based loading CDC data group source 8, 72 CDC data map extraction map 163 See also extraction map source definitions CDC sessions adding source 171 recovery example 174 removing source 171 restart 117, 151, 165 restart token file 162 stopping 116, 117, 170 CDEP restart 167 change data capture See change mode See also real-time mode Change Indicator Flexible transactions 184 change mode configuring connections 117 configuring sessions 135 code pages See also PowerCenter Installation and Configuration Guide configuring 245 supported code pages 245
compression configuring 108 Condense UOW Cleanser 7, 8 configuring code pages 245 compression 108 connections 83 encryption 108 pacing size 109 sessions 128, 212 workflows 128, 212 connections configuring 83 list by source type 80 list by target type 81 constraint-based loading description 136 FullCBLOSupport 136 continuous extraction mode PowerExchange Condense 7 creating DB2 source definitions 25 DB2 target definitions 25 IMS source definitions 43 Oracle source definitions 33 source qualifiers 75 Sybase source definitions 37 VSAM source definitions 43 custom property FullCBLOSupport 136
DB2/390 change mode application connections 89 configuring bulk load properties 87 connection types 80, 81 datatypes 243 real-time mode application connections 89 DB2/400 change mode application connections 89 connection types 80, 81 datatypes 243 real-time mode application connections 89 default restart points 157 DTL__CAPXACTION in CDC sessions 42, 54 in extraction maps 62 DTL__CAPXRESTART1 restart value 163 DTL__CAPXRESTART2 restart value 163 DTLUAPPL 167 example 168 restart 163 DTLUTSK utility 116 description 170
E
editing Source Qualifier transformations 75 encryption configuring 108 enhanced restart recovery processing 173 environment SQL See also PowerCenter Workflow Administration Guide configuring extraction map CDC data map 135 extraction map source definitions editing 62 viewing 60
D
data maps non-relational source definitions 43 viewing in the source definition 52 Datacom batch mode application connections 92 change mode application connections 93 datatypes DB2/390 243 DB2/400 243 overview 240 PowerExchange 241 transformation 241 transformation datatypes in source qualifiers 75 DB2 creating source definitions 25 creating target definitions 25
F
filelist description 129 flexible key transformations group source 187
G
group source CDC data 8, 72, 187 description 71 flexible key transformations 187 multiple records 6, 71 sequential 6, 71 VSAM 6, 71
viewing metadata extensions 61 non-relational sources configuring batch mode sessions 129 non-relational target definitions editing 54 editing metadata extensions 54
O
Oracle connection types 80 creating source definitions 33 real-time application connections 99, 104
I
idle time description 115 Idle Time (property) configuring for a PWXPC session 115 IDMS batch mode application connections 92 change mode application connections 93 IMS batch mode application connections 92 change mode application connections 93 connection types 80 datatypes 241 real-time mode application connections 93
P
pacing size configuring 109 pipeline partitioning See also PowerCenter Workflow Administration Guide batch mode 212 description 128, 212 loading to targets 212 $PMRootDir Cache 90, 94, 100, 105 Restart 89, 93, 99, 104, 118, 154 PowerExchange performance 109 PowerExchange Change Data Capture Flexible transformations 182 PowerExchange Condense batch extraction mode 7 continuous extraction mode 7 PowerExchange Configuration File dbmover.cfg 18
L
loading constraints 136 logger token restart value 163
M
metadata extensions editing 54 viewing 53 viewing for non-relational source definitions 61
R
reader time limit description for PWXPC 90 Reader Time Limit (property) configuring for a PWXPC session 90, 94 real-time flush latency description for PWXPC 119 Real-time Flush Latency (property) configuring for a PWXPC session 119 real-time mode configuring sessions 135 recovery creating the tables 161
N
non-relational source definitions editing 54 editing metadata extensions 54 viewing data map details 52
enhanced restart 173 example 174 PM_REC_STATE table 152 PM_RECOVERY table 151 PM_TGT_RUN_ID table 151 state file 153 tables 151 relational source definitions editing 41 relational sources configuring batch mode sessions 132 relational target definitions editing 41 relational targets configuring sessions 142, 144 restart $PMRootDir/Cache 90, 94, 100, 105 $PMRootDir/Restart 89, 93, 99, 104, 118, 154 application name 117, 167 CDC sessions 117, 165 CDEP 167 DTL__CAPXRESTART1 163 DTL__CAPXRESTART2 163 DTLUAPPL 163, 167 DTLUAPPL example 168 earliest points 157, 158 logger token 163 null restart tokens 158 operation 165 overview 151 PM_REC_STATE table 152 restart token file 118, 154, 162 restart token file folder 118 RESTART1 163 RESTART2 163 sequence token 163 state file 153 tokens 152, 153 restart points defaults 158 earliest 158 null 158 restart token file archiving 175 comment 162 configuring 162 example 164 explicit override 163 special override 163 syntax 162
S
sequence token restart value 163 sequential data sets group source 71 sessions overview 128, 212 source definitions DB2 25 editing metadata extensions 54 editing, extraction maps 62 editing, non-relational 54 editing, relational 41 IMS 43 viewing metadata extensions 53 viewing, extraction maps 60 VSAM 43 working with non-relational source definitions 43 Source Qualifier transformations See source qualifiers source qualifiers transformation datatypes 75 STOPTASK command CDC sessions, stopping 116, 170 Sybase creating source definitions 37
T
target definitions DB2 25 editing metadata extensions 54 editing non-relational 54 editing relational 41 viewing metadata extensions 53 terminating conditions PWPXC idle time 115 PWPXC real-time flush latency 119 PWXPC reader time limit 90 PWXPC UOW count 118 transformations affecting row ID 136 update strategy 42, 54, 62
U
UOW Cleanser Condense 7, 8
UOW count description for PWXPC 118 UOW Count (property) configuring for a PWXPC session 118 update strategy 42, 54, 62
V
VSAM batch mode application connections 92 change mode application connections 93 connection types 80 datatypes 241 extracting data from multiple files 129 group source 71 real-time mode application connections 93
W
workflows overview 128, 212