SCDAdvanced
A Talend custom component for slowly changing dimensions
Index
1 Introduction
2 Using guide
2.1 Base Parameters
2.2 Advanced Parameters
2.3 Job example
3 A look at the code
3.1 Class view
1 Introduction
This component manages a slowly changing dimension table in a data warehouse environment, starting from the source table and the changes to its rows.
2 Using guide
2.1 Base Parameters
Descriptions of the typical parameters (connections, schema, etc.) are omitted. Please refer to the official documentation for these.
The following parameters are shared by the whole component:
Parameter name
Type
Description
Table
String
List of values
Versioning rule
List of values
Ver. 0.1
String
String
Check
Check
If true, use a specific rule for rows that are versioned for the first time, chosen from a list of values (see Versioning rule for the description).
Versioning counter
Check
If true, use a column to store the versioning counter. The column name is specified in the text box and must be in the output schema.
Check
If true, use a column to store the active row flag. The column name is specified in the text box and must be in the output schema.
Check
If true, the source key column may be null and is still handled correctly by the component.
Check
If true, the component looks for rows in the target that are missing in the source and closes them.
Type
Description
Surrogate key
Check
Source key
Check
Change rule
List of values
Ignore column: the column is always ignored (no value is passed to the DB).
Versioning: close the current row and insert a new one with the new value.
Last value only: update the column with the new value.
Keep previous value: store the current value in a previous-value column (see the Previous value column name parameter), then update this column with the new value.
History correction: update all rows (matched by source key) with the new value.
Audit column: no rule is applied; the value from the source is stored only if other columns require an insert or update (see Audit rule).
String
Audit rule
List of values
String
If Close target missing rows is true, this is the audit value used when a missing row is found.
(1) Every date is given in string format and parsed to java.util.Date using the format defined in the advanced parameters.
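As an illustration of note (1), a date parameter supplied as text could be parsed along the following lines. This is a minimal sketch, not the component's code: the class name DateParameterSketch and the pattern yyyy-MM-dd used in the usage note are assumptions, and the real pattern is whatever is configured in the advanced parameters.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper for illustration; it is not part of scdAdvanced.jar.
class DateParameterSketch {

    // Parses a date given as text, using the configured pattern.
    static Date parse(String value, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false); // reject impossible dates such as 2014-02-30
        try {
            return format.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                "Value '" + value + "' does not match pattern '" + pattern + "'", e);
        }
    }
}
```

For example, DateParameterSketch.parse("2014-05-08", "yyyy-MM-dd") yields a java.util.Date for that day, while a value that does not fit the pattern raises an IllegalArgumentException.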
2.2 Advanced Parameters
Type
Description
Check
If true, all target rows are preloaded from the DB; the lookup for each source row is then performed in memory.
Check
Check
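The in-memory search enabled by the preload option can be pictured as a map keyed by the source key values. The sketch below only illustrates that general technique and is not the component's code: PreloadSketch, TargetRow and the key values are hypothetical, and it assumes a single current row per source key.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of preloading target rows for in-memory lookup.
class PreloadSketch {

    // Minimal stand-in for a row read from the dimension table.
    static final class TargetRow {
        final long surrogateKey;
        final List<Object> sourceKey; // values of the source key columns

        TargetRow(long surrogateKey, Object... sourceKey) {
            this.surrogateKey = surrogateKey;
            this.sourceKey = Arrays.asList(sourceKey);
        }
    }

    // Builds the index once, after all target rows have been read from the DB.
    // Assumption: one current row per source key; a later row replaces an earlier one.
    static Map<List<Object>, TargetRow> index(List<TargetRow> preloaded) {
        Map<List<Object>, TargetRow> bySourceKey = new HashMap<>();
        for (TargetRow row : preloaded) {
            bySourceKey.put(row.sourceKey, row);
        }
        return bySourceKey;
    }

    // Looks up the target row for one source row; returns null if none exists.
    static TargetRow find(Map<List<Object>, TargetRow> index, Object... sourceKey) {
        return index.get(Arrays.asList(sourceKey));
    }
}
```

The trade-off is the usual one: a single full read and more memory, in exchange for avoiding one query per source row.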
last update 2014-05-08
2.3 Job example
This is an example of a job that uses the component.
The schema used:
3 A look at the code
tMSSqlSCDAdvanced_begin.javajet: initializes the environment, creates the runtime-class source and gets the initial data (preloaded rows, etc.).
tMSSqlSCDAdvanced_main.javajet: processes every source row, applying the change rules.
tMSSqlSCDAdvanced_end.javajet: finds missing target rows and closes them.
tMSSqlSCDAdvanced.skeleton: contains common code for the environment, runtime-class definitions, etc.
tMSSqlConnection.javajet: code for managing the DB connection; it is a copy of the one in the official component.
scdAdvanced.jar: class library.
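The step attributed above to tMSSqlSCDAdvanced_end.javajet, finding target rows that are missing in the source, amounts to a set difference between the keys present in the target and the keys seen while processing the source. The following is a minimal sketch under that assumption; MissingRowsSketch and the use of plain strings as keys are hypothetical simplifications, not the component's code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration of detecting target rows that are missing in the source.
class MissingRowsSketch {

    // Returns the keys of active target rows that never appeared in the source.
    // Neither input set is modified.
    static Set<String> findMissing(Set<String> activeTargetKeys, Set<String> seenSourceKeys) {
        Set<String> missing = new LinkedHashSet<>(activeTargetKeys);
        missing.removeAll(seenSourceKeys);
        return missing; // each of these rows would then be closed
    }
}
```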
At compile time, the Javajet templates generate some classes that depend on the input and output schema definitions:
SCD_sourceKeyRowStruct: class that represents the source key columns; it implements the IStructureClass interface.
SCD_sourceRowStruct: class that represents the input schema columns; it implements the IStructureClass interface and also offers methods to check which columns have changed.
SCD_targetRowStruct: class that represents the output schema columns; it implements the ITargetStructureClass interface and is used to get existing rows from the DB.
SCD_auditColumnForInsertRowStruct: class that contains the audit data to apply when the component executes an insert operation.
SCD_auditColumnForUpdateRowStruct: class that contains the audit data to apply when the component executes an update operation.
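Once the changed columns of a source row are known, the change rules from section 2.1 determine what happens to the target row. The sketch below is one possible reading of those rules, not the component's code: ChangeRuleSketch, RowAction and decide are hypothetical names, and giving Versioning priority over the update-style rules when both kinds of column change is an assumption.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of choosing a row-level action from per-column change rules.
class ChangeRuleSketch {

    enum ChangeRule {
        IGNORE_COLUMN, VERSIONING, LAST_VALUE_ONLY,
        KEEP_PREVIOUS_VALUE, HISTORY_CORRECTION, AUDIT_COLUMN
    }

    enum RowAction { NONE, UPDATE, NEW_VERSION }

    // rules: change rule configured for each column.
    // changedColumns: columns whose source value differs from the target value.
    static RowAction decide(Map<String, ChangeRule> rules, Set<String> changedColumns) {
        RowAction action = RowAction.NONE;
        for (String column : changedColumns) {
            ChangeRule rule = rules.get(column);
            if (rule == null
                    || rule == ChangeRule.IGNORE_COLUMN
                    || rule == ChangeRule.AUDIT_COLUMN) {
                continue; // ignored and audit columns never trigger a write on their own
            }
            if (rule == ChangeRule.VERSIONING) {
                return RowAction.NEW_VERSION; // assumption: versioning wins over updates
            }
            action = RowAction.UPDATE; // last value, keep previous, history correction
        }
        return action;
    }
}
```

With this reading, a row in which only an audit column differs produces no write at all, which matches the description of the Audit column rule.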
IStructureClass
This interface contains the methods used to move data from and to the DB. The mergeWithDBGetter method takes a ResultSet as input and copies its data into the class attributes using type-dependent getters. The mergeWithDBSetter method takes a PreparedStatement and a collection containing column names and parameter indexes as input, and binds the attributes using type-dependent setters.
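To make the two methods more concrete, here is a minimal sketch of what such an interface and one implementation might look like. The exact signatures are not given in this document, so everything below is an assumption: the interface name IStructureClassSketch, the use of a Map from column name to parameter index, and the CustomerKeySketch class with its two columns are all hypothetical.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Map;

// Hypothetical shape of the interface; the real signatures may differ.
interface IStructureClassSketch {
    void mergeWithDBGetter(ResultSet rs) throws SQLException;
    void mergeWithDBSetter(PreparedStatement ps, Map<String, Integer> parameterIndexes)
            throws SQLException;
}

// Hypothetical structure with one string column and one integer column.
class CustomerKeySketch implements IStructureClassSketch {
    String customerCode;
    int regionId;

    // Copies the current ResultSet row into the attributes, one typed getter per column.
    @Override
    public void mergeWithDBGetter(ResultSet rs) throws SQLException {
        customerCode = rs.getString("customer_code");
        regionId = rs.getInt("region_id");
    }

    // Binds each attribute whose column appears in the index map, one typed setter per column.
    @Override
    public void mergeWithDBSetter(PreparedStatement ps, Map<String, Integer> parameterIndexes)
            throws SQLException {
        Integer index = parameterIndexes.get("customer_code");
        if (index != null) {
            ps.setString(index, customerCode);
        }
        index = parameterIndexes.get("region_id");
        if (index != null) {
            ps.setInt(index, regionId);
        }
    }
}
```

Passing the parameter indexes separately lets the same structure be bound into different statements, each of which uses only a subset of the columns.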
ITargetStructureClass
It extends IStructureClass, adding getter methods for the surrogate key and for the versioning id.
SCDFactory
This abstract class holds all configuration-dependent data and creates the SQL statement definitions.
SCDManager
It provides the operational methods that read data from and write data to the DB.
StatementAttribute
This class holds an SQL statement string together with its collections of parameter indexes. There are three different collections:
StructureColumn
This class holds all the column information to be passed to SCDFactory. The Javajet template creates a collection containing one such definition for every output column and passes it to the SCDFactory constructor.
Utility
It contains some commonly used helpers (enums, type mapper, etc.).
MSSqlSCDFactory
This is the concrete implementation of the abstract SCDFactory for Microsoft SQL Server. It implements some DB-specific attributes (separator, true/false, etc.).
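The split between an abstract factory and a DB-specific subclass can be sketched as follows. The real SCDFactory API is not shown in this document, so all names here (DialectSketch, MSSqlDialectSketch, quoteIdentifier, trueLiteral, falseLiteral, activeRowsQuery) are hypothetical, and reading "separator" as the identifier delimiter is an assumption. The SQL Server details themselves, square-bracket delimiters and 1/0 for bit values, are standard.

```java
// Hypothetical illustration of keeping DB-specific details behind abstract hooks.
abstract class DialectSketch {

    abstract String quoteIdentifier(String name);
    abstract String trueLiteral();
    abstract String falseLiteral();

    // Shared code builds SQL text using only the abstract hooks.
    String activeRowsQuery(String table, String activeFlagColumn) {
        return "SELECT * FROM " + quoteIdentifier(table)
            + " WHERE " + quoteIdentifier(activeFlagColumn) + " = " + trueLiteral();
    }
}

// SQL Server flavour: bracket-delimited identifiers, bit values written as 1 and 0.
class MSSqlDialectSketch extends DialectSketch {

    @Override
    String quoteIdentifier(String name) {
        // A closing bracket inside a name is escaped by doubling it.
        return "[" + name.replace("]", "]]") + "]";
    }

    @Override
    String trueLiteral() {
        return "1";
    }

    @Override
    String falseLiteral() {
        return "0";
    }
}
```

Supporting another database would then mean adding one more subclass, leaving the shared statement-building code untouched.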