Course Objective
At the completion of this course you should be able to understand:
An overview of the processes followed in a standard development project.
Course Content
Module 1: DataStage Low Level Design
Module 2: DataStage Coding Standards
Module 3: DataStage Best Practices Tips & Tricks
Module 4: Version Control
Module 1: DataStage Low Level Design
IBM Global Business Services
Module Objectives
[Process flow diagram (LLD_Template): the end-to-end development flow, from Functional Specification (Completed FS, Functional Spec Review) through Estimation and Delivery Plan (Estimation OK?, Signoff by Team Lead), Design (Macro & Micro), Detailed Technical Design (Technical Design Peer Review Workshop, QA Technical Design, Technical Design Approval), Offshore Knowledge Transfer, Build & Unit Test (Coding and Unit Testing by Developer, Peer Review, Coding OK?, Rework Required), Send for Onsite Acceptance, Onsite Acceptance Testing (UAT/System Test/Integration Test) with TPR/SCR Issue Logging and Issue Resolution, to Sign Off, Deployment and Development Complete. Activities are marked as Onsite, Onsite/Offshore, Offshore, or QA Checkpoints.]
Key Points
Step Overview: shows the key elements of the step, e.g. the inputs, outputs and key activities involved, along with the artifacts produced.
Key Activities:
Inputs:
Roles: Developer
Outputs: Technical Specification
Module 2: DataStage Coding Standards
IBM Global Business Services
Module Objectives
At the completion of this chapter you should be able to:
Know the job-level naming conventions used in DataStage.
Know the parameter naming conventions used in DataStage.
Know the proper documentation and commenting standards within a job.
Know the proper use of environment and generic parameters as a standard practice.
Identify the key coding standard principles.
Coding standard
What is a coding standard?
A set of rules or guidelines that tells developers how they must write their code. Instead of each developer coding in their own preferred style, all code is written to the ETL standards, ensuring the consistency of the designed ETL application throughout the project.
Benefits
Coding standards
Repository structure:
The repository is the central storage place for build-related components and a key part of the software when developing jobs in DataStage Designer.
Data Elements - A specification that describes the type of data in a column and how the data is converted (server jobs only).
Jobs - Folder for jobs that are built, compiled and run.
Routines - The BASIC language can be used to write custom routines that can be called from within server jobs; routines can be re-used by several server jobs.
Stage Types - Any stage used in a project; this can be a data source, data transformation, or data warehouse stage.
Table Definitions - A definition describing the data you want, including information about the data table and the columns associated with it. Also referred to as metadata.
Transforms - Similar to routines; these take one value and compute another value from it.
Coding standards
ETL coding standard guidelines:
Jobs are grouped by architectural area; for each of these groups a Jobs and a Sequences folder is created, so each group contributes two separate folders under the top-level Jobs folder. These groups can in turn be divided into subgroups (and thus subfolders).
Templates are stored in a separate Templates folder directly under the Jobs folder. It is expected that a small number of templates will suffice to create jobs at all levels, so there is no need to create specific folders for templates at every level.
Thoughtful naming of jobs and categories will help the developer understand the structure; an illustrative layout is shown below.
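As an illustration (group names hypothetical), the resulting repository layout might look like:

    Jobs/
        Templates/
        Extract/
            Jobs/
            Sequences/
        Load/
            Jobs/
            Sequences/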
Coding standards
Job Templates :
Each project should contain job templates to ensure that jobs are created with the proper set of job parameters and the correct job parameter names. These job templates are stored in a separate Templates folder directly under the Jobs folder.
Jobs and Sequences
Jobs can be grouped into folders based on a common feature, notably the
architectural area they belong to. Thus, for each group a separate folder
is created under the Jobs folder. These groups in turn can be divided into
subgroups (and thus subfolders).
Table Definitions
The Table Definitions section contains metadata which can be imported
from a number of sources, e.g. Oracle tables, or flat files. The folders that
this metadata is stored in must represent the physical origin or destination
of a table or file. The recommended naming standard (and the default for
ODBC) is:
1st subfolder: database type (ODBC, Universe, DSDB2, ORAOCI9)
2nd subfolder: database name.
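For example (database name hypothetical), a table imported over ODBC from a database called SALESDB would be stored under ODBC > SALESDB.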
Coding standards
Hash Files :
Hash files can be stored either in Universe, or in the file system of the
operating system.
Sequential Files :
A DataStage project will potentially use source, target, and intermediate
files. These can be placed in separate directories. This will:
Simplify maintenance.
Allow data volumes to be spread evenly across multiple disks.
Allow for closer monitoring of the file system.
Allow for closer monitoring of data flow.
Aid housekeeping processes.
Naming Conventions
What is a 'naming convention'?
An industry-accepted way to name various objects.
A variety of factors are considered when assessing the success of a project, and naming standards are an important but often overlooked component. An appropriate naming convention establishes consistency in the repository and provides a developer-friendly environment.
Benefits:
Facilitates smooth migrations and improves readability for anyone
reviewing or carrying out maintenance on the repository objects.
It helps in understanding the processes being affected, thereby saving significant time.
Naming Conventions
The following pages suggest naming conventions for various repository components. Whatever convention is chosen, it is important to make the selection very early in the development cycle and to communicate the convention to project staff working on the repository. The policy can be enforced by peer review, and at test phases by adding convention checks to both test plans and test execution documents.
Suggested Naming Conventions
Project Name
Suggested Naming Conventions
Job
The job names used are very much dependent on the project. Usually job names contain a subject area (the target table), and possibly a job function (load, transform, clear, update, etc.). Job names have to be unique across all folders. For projects, the standard chosen is:
<job function>_<target_table>
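For example (names hypothetical): a job that loads the CUSTOMER_DIM table would be named load_CUSTOMER_DIM, and one that updates it upd_CUSTOMER_DIM.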
Suggested Naming Conventions
Conventions are likewise defined for the following components/parameters: Sequential File, Hash file, XML file, Oracle database, DB2 database, ODBC source, Siebel DA, Dataset.
Suggested Naming Conventions
Component/Parameter    Suggested name
Command                Cmd_<functional_name>
Aggregator             Agg_<functional_name>
Folder                 Fld_<functional_name>
Filter                 Fltr_<functional_name>
Inter Process          Ipc_<functional_name>
Link Partitioner       Lpr_<functional_name>
Lookup                 Lkp_<functional_name>
Merge                  Mrg_<functional_name>
Sort                   Srt_<functional_name>
Transformer            Xfm_<functional_name>
Change Capture         Cdc_<functional_name>
Funnel                 Fnl_<functional_name> (or Club_<functional_name>)
Join                   Join_<functional_name>
Surrogate Key          SKey_<functional_name>
Remove Duplicates      Ddup_<functional_name>
Copy                   Cpy_<functional_name>
Examples of functional names: enrichedCustomer, sortedOrders.
Local Containers
The names of local containers start with Lcn_, followed by a meaningful name describing the container's function.
Stage Variables:
A stage variable is an intermediate processing variable that retains its value during a read but does not pass its value to a target column. Stage variable names start with stg_ and reflect their usage. A standard must be set so that common stage variables are named consistently.
Suggested parameter names:
General (database DSN): P_DB_<logical db name>_DSN
User identification: P_DB_<logical db name>_USERID, P_DB_<logical db name>_PASSWORD
Directories: P_DIR_TEMP, P_DIR_ERRORS, P_DIR_REF
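In a job design these parameters are then referenced with the usual #parameter# notation, e.g. an output file path of #P_DIR_TEMP#/customer_extract.dat (file name illustrative).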
Suggested Coding Principles
Avoid clutter comments, such as an entire line of asterisks. Instead, use white space to separate comments from code.
Avoid surrounding a block comment with a typographical frame. It may look attractive, but it is difficult to maintain.
Use complete sentences when writing comments. Comments should clarify the code, not add ambiguity.
Comment as you code, because you will not likely have time to do it later. Also, should you get a chance to revisit code you have written, what is obvious today probably will not be obvious six weeks from now.
Comment anything that is not readily obvious in the code.
To prevent recurring problems, always use comments on bug fixes and work-around code, especially in a team
environment.
Use comments on code that consists of loops and logic branches. These are key areas that will assist source code
readers.
Establish a standard size for an indent, such as three spaces, and use it consistently; align sections of code using the prescribed indentation, as the short example below illustrates.
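A short, hypothetical Python fragment illustrating these principles: complete-sentence comments, no decorative frames, a noted work-around, and consistent indentation.

    # Derive the load batch id from the run date. The upstream scheduler
    # supplies dates as YYYYMMDD strings, so we parse rather than assume a type.
    def batch_id(run_date: str) -> int:
        # Work-around for a hypothetical feed defect: some feeds send a
        # trailing space, which int() rejects, so we strip it first.
        return int(run_date.strip())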
Use of parameters
Definition
Job parameters allow you to design flexible, reusable jobs by making a job independent of its source and target environments.
If, for example, we want to process data using a certain userid and password, we can include these settings as part of the job design. However, when we want to use the job again in a different environment, we would then most likely have to edit the design and recompile the job.
Instead of entering such constants as part of the job design, you can set up parameters which represent the processing variables, as sketched below.
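A minimal Python sketch of the idea (not DataStage itself; parameter names are illustrative): constants are replaced by named parameters resolved at run time, so the same job design works against any environment without recompilation.

    import os

    # Hypothetical parameter set resolved at run time rather than compiled in.
    params = {
        "P_DB_STAGE_DSN": os.environ.get("P_DB_STAGE_DSN", "DEV_DSN"),
        "P_DB_STAGE_USERID": os.environ.get("P_DB_STAGE_USERID", "dev_user"),
    }

    def connect(p):
        # The job reads its connection details from parameters, so pointing
        # it at a different environment needs no edit or recompile.
        print(f"connecting to {p['P_DB_STAGE_DSN']} as {p['P_DB_STAGE_USERID']}")

    connect(params)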
Use of parameters
Creating Project-Specific Environment Variables:
Here are the standard steps to follow:
Step 5: Type in all the required job parameters that are going to be shared between jobs.
43
Use of parameters
Using Environment Variables as Job Parameters:
Use of parameters
Points to note:
We set the default value of the new parameter to $PROJDEF to ensure it is dynamically set each time the job is run.
When the job parameter is first created it has a default value the same as the Value entered in the Administrator. By changing this value to $PROJDEF you instruct DataStage to retrieve the latest Value for this variable at job run time.
Set the value of encrypted job parameters to $PROJDEF as well; it needs to be typed twice into the password entry box.
The "View Data" button will not work in server or parallel jobs that use environment variables set to $PROJDEF or $ENV; this is a defect in DataStage. It may be preferable to use environment variables in sequence jobs and pass them to child jobs as normal job parameters, e.g. in a sequence job $DW_DB_PASSWORD is passed to a parallel job with the parameter DW_DB_PASSWORD.
Application examples
Environment:
Database names and access details can vary between environments or change over time. By parameterising these at project level, any change can be applied quickly without updating or recompiling all jobs.
All file names and locations were specific to each run; the file names themselves were therefore hard-coded, but the file batch and run reference and the related location were parameterised.
Application examples
Process Flow:
Parameters can be entered manually at runtime; however, to avoid data-entry errors and speed up turnaround, parameter files were pre-generated and loaded within DataStage with minimal manual input.
Generic Parameters
Often a number of parameters apply across the whole project. These relate either to the environment or to specific business rules within the mappings. For example:
TARGETSYSTEM - set to the name of the test environment due to be loaded with data from this run.
Module 3: DataStage Best Practices Tips & Tricks
IBM Global Business Services
Module Objectives
7. Maintenance Activity
7.1 Backup and version control Activity
7.2 Version Control in ClearCase
7.3 DS Auditing Activity
7.4 Retrieving Job Statistics
Assuring Naming Conventions of components, jobs and categories
7.5 Performance Tuning of DS Jobs
8. Preparing UTP - Guidelines
1. Getting Started
2. Prerequisites
4. Estimating a Conversion
An overview of the load job designs needs to be chalked out:
1. The number of lookups to be performed in the load job. The design of the lookup jobs should be explored (is there scope for a Join stage, or can the lookup be performed using custom SQL in the source Oracle stage?).
2. The complexity of the transformer in the load job needs to be determined. In the case of multiple lookups or a large number of validations the complexity should be rated high, and the contingency factor in the estimation model can be increased.
3. The existence of mandatory fields (which must be loaded into the target) should be examined. Such records can be rejected at the first opportunity (after the source DB stage) and sent to the log without any further validation. For non-mandatory fields the records cannot be rejected, and all the validations on the other columns need to be performed.
5. Preparing a DS environment
The DataStage installation should be in place, along with the other database installations.
Project-level environment variables have to be created to hold the connectivity values of the staging databases and the file locations for input, output and temporary storage.
6. Designing Jobs
1. Use the table read method for selecting records from the source, and provide a select list and where clause for better performance.
2. Pull the metadata into the appropriate staging folders under Table Definitions > Oracle. Always use the Orchdb utility to import metadata; it also imports the description part, which helps keep track of the original metadata in case it is modified in the job flow.
3. Avoid passing the table name as a parameter in Oracle stages.
4. For access-restricted APPS tables, the open command section of the Oracle stage should be used with the relevant query to access the data.
5. Native API stages always perform better than the ODBC stage, so the Oracle stage should be used.
6.5.1 Performing Lookups
[Flow diagram: a job flow through a transformer (TX), lookups lkp1 and lkp2, and another transformer (TX).]
1. The safest and easiest way to solve this problem is to run the Sort stage in sequential mode. This can be done by selecting the Sequential option on the Advanced tab of the Stage page.
To select distinct values from the input dataset and also catch the duplicates in a separate file, a combination of a Sort stage and a Transformer can be used. On the Properties page of the Sort stage, the Create Key Change Column option is set to True. This creates an extra column in the result dataset which contains 1 for distinct values of the sort key and 0 for the duplicate values. This column can then be used in the Transformer to separate the distinct and duplicate values.
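A Python sketch of the same logic (illustrative only, not DataStage code): a key-change flag is derived on sorted input, and a downstream split sends distinct rows one way and duplicates the other.

    rows = sorted([("A", 1), ("A", 2), ("B", 3)], key=lambda r: r[0])

    distinct, duplicates = [], []
    prev_key = None
    for row in rows:
        # The extra column the Sort stage creates: 1 on a key change, else 0.
        key_change = 1 if row[0] != prev_key else 0
        (distinct if key_change else duplicates).append(row)
        prev_key = row[0]

    # distinct   -> [("A", 1), ("B", 3)]
    # duplicates -> [("A", 2)]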
The approach can be to check for null using the IsNull function, or to check for zero length after trimming the column, and then explicitly set the value to null using the SetNull function; the sketch below mirrors this logic.
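The equivalent logic as a small Python sketch (field values assumed to be strings; purely illustrative):

    def normalize(value):
        # Mirror IsNull(...) or Len(Trim(...)) = 0, then SetNull().
        if value is None or len(value.strip()) == 0:
            return None          # explicit null, like SetNull()
        return value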
Suppose we are generating a key message from more than one field coming from the source. We need to be very careful here: when we concatenate a field into the key message and that field contains a null, the record may get dropped, especially if more fields are concatenated after it. Suppose this is our code to generate a key message, where the field BANK_NUM is nullable:

If Len(VarFndBnkNum) <> 0 Then
    'Customer ID: ' : validateCustSiteUses.ID :
    ', BANK_ACCOUNT_NUM: ' : validateCustSiteUses.BANK_ACCOUNT_NUM :
    ', BANK_NUM: ' : validateCustSiteUses.BANK_NUM :
    ', ORG_ID: ' : validateCustSiteUses.ORG_ID_LK
Else ''
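The usual guard is to replace the null before concatenating (in a parallel transformer, a function such as NullToValue can serve this purpose). The same idea as a Python sketch, with hypothetical field names:

    def key_message(id_, bank_account_num, bank_num, org_id):
        def nz(v):
            # Like replacing a null with '' before concatenation.
            return "" if v is None else str(v)
        # Without the guard, one null operand would null the whole
        # expression and the record could be dropped.
        return ("Customer ID: " + nz(id_) +
                ", BANK_ACCOUNT_NUM: " + nz(bank_account_num) +
                ", BANK_NUM: " + nz(bank_num) +
                ", ORG_ID: " + nz(org_id))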
In most cases the task of node configuration and partitioning has been left to DataStage (the default, Auto), and it partitions the input dataset based on the number of nodes (two in our case, so two partitions).
Customization is required when a join is performed (pre-sort the data before the join) or when a Sort stage is used (the typical cases found to date).
In some cases a stage may need to be restricted to one node so that it creates only one process working on the entire dataset, e.g. if we need to know the number of rows and write a stage variable such as:
svRowCount = svRowCount + 1
Here, if the stage runs on two nodes it creates two processes, each running on its own partition, so any single counter would hold only about half of the dataset. The same applies to the logic of vertical pivoting in a Transformer using stage variables; a sketch of the effect follows.
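A Python sketch of why the per-partition counter misleads (partitioning is simulated): each partition keeps its own svRowCount, so a single counter sees only its share, and only the sum is the true total.

    rows = list(range(10))
    partitions = [rows[0::2], rows[1::2]]      # two nodes -> two partitions

    counts = []
    for part in partitions:
        sv_row_count = 0                       # one counter per process/partition
        for _ in part:
            sv_row_count += 1
        counts.append(sv_row_count)

    print(counts)       # [5, 5] - each partition sees only half
    print(sum(counts))  # 10    - the true count needs aggregation (or one node)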
The easiest way is to enable the 'Automatically handle activities that fail' option on the Job Properties tab of a sequence job. This allows DataStage to send an abort request to a calling sequence if a subordinate job aborts. DataStage also provides job control stages, e.g. the Terminator Activity stage, to further customize restartability in your jobs.
7. Troubleshooting
7.1 Troubleshooting: Some debugging Techniques
7.2 Oracle Error Codes in DataStage
7.3 Common Errors and Resolution
7.4 Tips: Message Handler
7.5 Local runtime Message Handling in Director
7.6 Tips: Job Level and Project Level Message Handling
7.7 Using Job Level Message Handler
OSH_DUMP shows the OSH code for your job, revealing whether any unexpected settings were set by the GUI.
Oracle Error Codes in DS
9. Maintenance Activities
9.1 Backup and version control Activity
Taking whole project backup
Taking Job level Export
Taking folder level Export
Version Control in ClearCase
9.2 DS Auditing Activity
Tracking the list of modified jobs during a period
Retrieving Job Statistics
Getting the row counts of different jobs
9.3 Performance Tuning of DS Jobs
Analysing a flow
Measuring Performance
Designing for good performance
Improving performance
During the test phase, the jobs enhanced each week are identified at the weekend and backed up as part of the version control activity.
Back up activity
Taking Job level Export: A Job Repository table has been created in stage 1. A sequence job runs to refresh this repository; the sequence calls a routine which extracts the job names and the associated category paths into a sequential file, and the subsequent load job loads the data into the repository.
If specific categories or jobs have to be exported, the relevant SQL file has to be modified with the required query in the where clause to select the jobs to be exported.
If the requirement is version control, the repository of modified jobs has to be refreshed, and then the main batch can be run directly to perform the export. It will create job-level .dsx files, and one report file will be generated.
If a job is locked by any user, the utility will not proceed further unless the user chooses the skip/abort option, so it is better to restart the server before the export is started. The job-level .dsx files will be created with the same folder structure as on the server.
Back up activity
Taking folder level Export:
Once the job-level backup is complete, those files can be concatenated to create folder-level .dsx files.
If specific categories have to be exported, the relevant SQL file has to be modified with the required query in the where clause to select the jobs to be exported.
If the requirement is version control, the repository of modified jobs has to be refreshed, and then the main batch can be run directly to perform the export. It will concatenate the job-level .dsx files created earlier to create folder-wise .dsx files.
If a lock file exists, the batch will abort; unlock the job on the server and run the export batch again to export that job. If the export program was successful, folder-level .dsx files will be generated along with a report file.
Version Control
To upload the .dsx into the respective folder in ClearCase (CC):
Connect to the ClearCase web client and go to the proper path.
Create the activity indicating the reason for the change (defect number).
Check out the respective folder (folder > basic > check out).
Put the .dsx file into the CCRC path on your local machine.
Check in the folder and click Tools > update resources with the selected activity. Add the .dsx file to source control (right-click the file in the right-hand pane > basic > add to source control; a blue background will come up). Uncheck the option for checking out after adding to source control.
Right-click the file in the right-hand pane > Tools > show version tree. The version tree will be displayed.
To further apply any change to the code:
Import the .dsx file to the local machine and make modifications as per the requirement.
Compile and run the job, and upload the new .dsx as discussed above.
Analysing a flow
Measuring Performance
Designing for good performance
Improving performance
1. A score dump of the job helps in understanding the flow. We can obtain one by setting the APT_DUMP_SCORE environment variable to True and running the job (APT_DUMP_SCORE can be set in the Administrator client, under the Parallel > Reporting branch). This causes a report to be produced which shows the operators, processes and datasets in the job. The report includes information about:
Where and how data is repartitioned.
Whether DataStage inserted extra operators in the flow.
The degree of parallelism each operator runs with, and on which nodes.
Where data is buffered.
Check for any Aggregator stage in your jobs. This is part of the transformation bottleneck but needs special attention: an Aggregator stage in the middle of a big job slows the entire job down, since all the records have to pass through the aggregator and cannot be processed in parallel.
To catch partitioning problems, run your job with a single-node configuration file and compare the output with your multi-node run. You can just look at the file size, or sort the data for a more detailed comparison, as sketched below.
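A sketch of the comparison idea in Python (file names hypothetical): sort both outputs before diffing, so that partition-induced row order does not mask or fake a difference.

    def same_content(file_a, file_b):
        # Compare the outputs of a single-node and a multi-node run.
        with open(file_a) as a, open(file_b) as b:
            return sorted(a.readlines()) == sorted(b.readlines())

    # e.g. same_content("out_1node.txt", "out_2node.txt") should be True
    # if the job's results are independent of partitioning.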
Advanced steps:
Run jobs which handle small volumes of data on a single node instead of multiple nodes. This avoids spawning multiple processes and partitions when there is no need; it can be done by adding the environment variable $APT_CONFIG_FILE and setting it to use a single-node configuration file.
When writing intermediate results that will only be shared between parallel jobs, always write to persistent data sets (using Data Set stages). Ensure that the data is partitioned, and that the partitions and sort order are retained at every stage. Avoid format conversion and serial I/O.
Options:
Create a BASIC routine and use it as a before/after-job subroutine or via a Routine Activity stage.
Create a C++ routine and use it inside a PX transformer.
Create custom operators and use them as a stage: this allows knowledgeable Orchestrate users to specify an Orchestrate operator as a DataStage stage, which is then available for use in DataStage parallel jobs.
Module 4: Version Control
IBM Global Business Services
Module Objectives
At the completion of this chapter we should be able to:
Manage and track all DataStage component code changes and releases.
Maintain an audit trail of changes made to DataStage project components, recording a history of when and where changes are made.
Store different versions of DataStage jobs.
Run different versions of the same job.
Revert to a previous version of a job.
Store all changes in one centralized place.
Discipline.
Basic Principle/Approach.
Different Projects.
Filtering Components.
Different Methods.
Versioning Methodology
In a typical enterprise environment there may be many developers working on jobs, all at different stages of their development cycle. Without version control, effective management of these jobs can become very time consuming, and the jobs can be difficult to maintain.
This module gives an overview of the methodology used in Version Control and highlights some of its benefits; it is not intended as a comprehensive guide to version control management theory.
Benefits:
Central code repository - all coding changes are contained in one central managed
repository, regardless of project or server locations.
DataStage integration - Components are stored within the VERSION project, which can
be opened directly in DataStage from Version Control. Alternatively, Version Control can be
opened directly from within any DataStage client.
Team coordination - Components are marked as read-only as they are processed through
Version Control, ensuring that they cannot be modified in any way after being released.
Versioning Methodology
Discipline:
To gain the maximum benefit from using Version Control we must exercise a disciplined approach. If we build in that discipline from the start, we will quickly realize the benefits as the project grows.
Always ensure that components pass through Version Control before sending them to their next stage of development. This will make project development far easier to track, especially for complex projects containing a large number of jobs.
Basic Principle/Approach:
Most DataStage job developers adopt a three-stage approach to developing their DataStage jobs, which has become the de facto standard. These stages are development, test, and production.
Versioning Methodology
Basic Principle/Approach:
In this model, jobs are coded in the development environment, sent for test, redeveloped until testing is complete, and then passed to production.
Versioning Methodology
Different Projects:
The Version Control Project - Version Control uses a special DataStage project as a repository to store all projects and their associated components. This project is usually called VERSION, although we may create a project with any name; whatever name we choose, the principle remains the same. The Version Control repository contains the archive of all components initialized into it, and therefore stores every level of each code release for each component.
Other Projects - If we adopt the three-stage approach, we would typically have three other projects:
Development - where jobs are coded and reworked.
Test - where completed jobs are tested.
Production - the final destination from where the finished jobs are actually run.
These projects can reside on different DataStage servers if required. Once
a development cycle is complete, components are initialized from the
Development project into the Version Control repository. From there they
are promoted to the Test project. When testing is complete (which may
include more development-test cycles), components are promoted from
the Version Control repository to the Production project.
Initializing Components
Initialization is the process of selecting components from a source
project and moving them into Version Control for processing and
promoting to a target project.
Initializing Components
Version Control Numbering:
The full version number of a DataStage component is broken down as follows:
Release Number.Minor Number
where the Release Number is allocated when we initialize components in Version Control. If required, we can specify a release number in the Initialize Options dialog box; by default, Version Control sets it to the highest release number currently used by objects in its repository.
Initializing Components
Filtering Components:
We can filter a long list of components to show only those that we are interested in promoting.
For example, we may want to select components associated with Sales or Accounting. Rather than search through the entire list, we can filter the list and select the subset for promotion.
Initializing Components
To filter components:
1. Click the Filter button in the Display toolbar so that a text entry field appears.
2. In the text entry field, type the text we want to filter by. We can type letters or whole words; separating letters or words with a comma results in an OR operation. For example, typing accounting, sales results in a list showing components that have accounting or sales in their names. Click the arrow next to the Filter button to specify whether the filter is case sensitive. (A sketch of this matching rule follows the list.)
3. When we are happy with the filter text, click the Filter execute button, press Return, or click in the tree view of the Version Control window.
4. To return to the default view, click the Filter button again.
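The comma-as-OR behaviour, sketched in Python (purely illustrative of the matching rule, not Version Control's implementation):

    def matches(name, filter_text, case_sensitive=False):
        terms = [t.strip() for t in filter_text.split(",")]
        if not case_sensitive:
            name, terms = name.lower(), [t.lower() for t in terms]
        return any(t in name for t in terms)   # comma-separated terms OR together

    # matches("sales_load_job", "accounting, sales") -> True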
Promoting Components
We can promote components after they have been initialized into Version Control.
In a typical environment, components are initialized from a development project and
promoted to a test or production project.
Component selection for promotion:
We can select components for promotion in the following ways:
By individual selection
By batch
By user
By server
By project
By release
By date
Promoting Components
The different ways of selecting components for promotion are as follows:
Promoting Components
By user: We can select components that have been initialized by a particular
user. Select the required user from the menu. All the components that have
been initialized by that user are selected ready for promotion.
By server: Select the required server from the menu. All the components
that have been initialized from that server are selected ready for promotion.
By project: We can select components that have been initialized from a
particular project. Select the required project from the menu. All the
components that have been initialized from that project are selected ready
for promotion.
By release: We can select components that belong to a particular release.
All the components that belong to that release are selected ready for
promotion.
By date: We can select components that were initialized on a particular date. All the components that were initialized on that date are selected ready for promotion.
Best Practice
Using Custom Folders in Version Control:
Many development projects which use DataStage for extraction, transformation and loading (ETL) also incorporate other project-related files which are not part of the DataStage repository.
These files may contain DDL scripts or other resource data; Version Control can process these ASCII files in the same way as it processes DataStage components.
If we choose to add custom folders, they are created automatically by Version Control; there is no need to create them manually. Every time Version Control subsequently connects to a project, either for initialization or for promotion, it checks whether the custom folder exists, and creates it if it does not.
After Version Control has created a custom folder, it can be populated with the relevant items. The only requirement for using custom folders in Version Control is that the components must be stored within a folder in the project itself.
Best Practice
Starting Version Control from the DataStage Designer:
We can run Version Control directly from within DataStage Designer, Director or Manager by adding a link to the DataStage client tools menu.
We can also add options which allow Version Control to start without displaying the login dialog. If we want Version Control to start with the login details already filled in, we can enter appropriate command-line arguments. These are entered in the Arguments field and have the following syntax:
/H=hostname /U=username /P=password
where:
hostname is the DataStage server hosting the project,
username is the DataStage username, and
password is the DataStage password.
For example, with a hostname of ds_server, a username of vc_user, and a password of control, we would type:
/H=ds_server /U=vc_user /P=control
Version Control can now be started from the DataStage Client.