Configuration File Final

Configuration File
Prepared by Himanshu Thakkar
2009 Wipro Ltd - Confidential
Contents
1. Features of Configuration File
2. Create/Edit Configuration File
3. Set Configuration File at Project Level
4. Set Configuration File at Job Level
5. Configuration File Structure
6. Configuration File for Simple SMP
7. Configuration File for MPP/Cluster/Grid
Features of Configuration File

The DataStage configuration file is the master control file for DataStage
parallel jobs. It describes the resources and architecture of the parallel system.
The configuration file describes the hardware for architectures such as
SMP (a single machine with multiple CPUs, shared memory and shared disk)
and Grid, Cluster or MPP (multiple nodes, each with its own CPUs and
dedicated memory).
The main benefit of the configuration file is that it separates software
and hardware configuration from job design. Its main use is to change the
number of nodes and control processes at run time.
A job can run on different hardware architectures without being recompiled.
The configuration file to use is specified at run time through the
$APT_CONFIG_FILE environment variable.
Create/Edit Configuration File

To create or change a configuration file:
Go to the Designer client.
Open the Tools menu.
Select Configurations.
Select the configuration file to edit, or create a new one, then save and check it.
The easiest way to validate a configuration file from the command line is to
export the APT_CONFIG_FILE variable pointing to the newly created file
and then run the following command: orchadmin check
After creating a configuration file, we can set it at two levels: project
level and job level.
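The command-line validation described above (export APT_CONFIG_FILE, then run orchadmin check) can be sketched in Python as follows. This is only an illustration, not part of the original deck: the helper names build_check and validate are hypothetical, and the sketch assumes orchadmin is on the PATH of a machine where the DataStage engine environment has already been set up.

```python
import os
import shutil
import subprocess


def build_check(config_path, base_env=None):
    """Return the (command, environment) pair used to validate config_path."""
    env = dict(os.environ if base_env is None else base_env)
    # Same effect as `export APT_CONFIG_FILE=<config_path>` in a shell.
    env["APT_CONFIG_FILE"] = config_path
    return ["orchadmin", "check"], env


def validate(config_path):
    """Run `orchadmin check`; return its exit code, or None if orchadmin is not on PATH."""
    command, env = build_check(config_path)
    if shutil.which(command[0]) is None:
        return None  # DataStage engine tools are not available on this machine
    return subprocess.run(command, env=env).returncode
```

For example, validate("/path/to/new_config.apt") (a placeholder path) would run the check against that file and leave the caller's own environment unchanged.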
Set Configuration File at Project Level

To set the configuration file at project level:
Go to the Administrator client.
Select the project for which the configuration file is to be set.
Go to General Properties.
Select Parallel Mode.
Set the value of the APT_CONFIG_FILE parameter to the path of the
new or existing configuration file.
Once this parameter is set, DataStage uses that path by default as the
configuration file for all jobs in the project.
Set Configuration File at Job Level

To set the configuration file at job level:
Go to the Designer client.
Open the job for which the configuration file is to be set.
Go to Job Properties.
Select Parameters.
Choose Add Environment Variable.
Select the $APT_CONFIG_FILE parameter and set its value to the path of the
new or existing configuration file.
Configuration File for a Simple SMP
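As a rough illustration of what a configuration file for a simple SMP can look like, the Python sketch below renders one with several logical nodes that all share the same fastname. The function names, host name and directory paths are hypothetical and chosen only for the example; the layout follows the usual node / fastname / pools / resource disk / resource scratchdisk structure.

```python
def render_node(name, fastname, disk, scratch, pools=("",)):
    """Render one node entry of a parallel configuration file."""
    pool_list = " ".join(f'"{p}"' for p in pools)
    return "\n".join([
        f'  node "{name}"',
        "  {",
        f'    fastname "{fastname}"',
        f"    pools {pool_list}",
        f'    resource disk "{disk}" {{pools ""}}',
        f'    resource scratchdisk "{scratch}" {{pools ""}}',
        "  }",
    ])


def render_smp_config(fastname, node_count, disk, scratch):
    """Render an SMP configuration: every logical node uses the same fastname."""
    nodes = [
        render_node(f"node{i}", fastname, disk, scratch)
        for i in range(1, node_count + 1)
    ]
    return "{\n" + "\n".join(nodes) + "\n}\n"


# Hypothetical host name and paths, for illustration only.
print(render_smp_config("etlhost", 2, "/data/datasets", "/data/scratch"))
```

Changing node_count changes the degree of parallelism without touching any job design, which is the separation the configuration file is meant to provide.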
Configuration File Structure

Following are the different components in any Configuration File:
Node
Pool
Fastname
Resource Disk
Resource Scratch Disk
Node
A node is a logical processing unit.
Each node in a configuration file is distinguished by a virtual name and
represents a set of resources: number and speed of CPUs, available memory,
page and swap space, network connectivity, and so on.
Within a configuration file, the number of processing nodes defines the
degree of parallelism and the resources that a particular job uses to run.
A configuration file with more nodes generates more processes, which use
more memory (and perhaps more disk activity) than a configuration file
with fewer nodes.
The DataStage documentation suggests creating half as many nodes as there
are physical CPUs. This is a conservative starting point; the right number
depends heavily on system configuration, resource availability, job design
and other applications sharing the server hardware.
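The "half the number of physical CPUs" starting point can be written as a small calculation. This is a minimal sketch, and the function name is hypothetical. Note that Python's os.cpu_count() reports logical CPUs, which may differ from the physical CPU count the guideline refers to, so the physical count is passed in explicitly.

```python
def suggested_node_count(physical_cpus):
    """Conservative starting point: half the physical CPUs, but at least one node."""
    if physical_cpus < 1:
        raise ValueError("physical_cpus must be at least 1")
    return max(1, physical_cpus // 2)


# Example: a server with 8 physical CPUs starts with a 4-node configuration.
print(suggested_node_count(8))
```

The result is only an initial value to tune from, since the right node count depends on the other factors listed above.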
Fastname
The fastname is the physical node name that stages use to open
connections for high-volume data transfers.
Typically, you can get this name with the Unix command uname -n.
On an SMP system, all nodes use the same fastname, which is the name of
the single host.
Pool
Based on the characteristics of the processing nodes, you can group
nodes into sets of pools.
A pool can be associated with many nodes, and a node can be part of
many pools.
A node belongs to the default pool unless you explicitly specify a pools
list for it and omit the default pool name ("") from that list.
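The pool membership rule above can be modelled in a few lines of Python. This is only a sketch of the rule as stated, not DataStage code. The node names and the "sort" pool are made up for the example, and a declared pools list of None stands for a node with no explicit pools list.

```python
DEFAULT_POOL = ""  # the default pool is named by the empty string


def pools_for_node(declared=None):
    """Return the set of pools a node belongs to."""
    if declared is None:
        # No explicit pools list: the node is in the default pool.
        return {DEFAULT_POOL}
    # Explicit list: the node is in exactly the pools listed.
    return set(declared)


def nodes_in_pool(nodes, pool):
    """Return the names of the nodes that belong to the given pool."""
    return [name for name, declared in nodes.items()
            if pool in pools_for_node(declared)]


# Hypothetical nodes: node3 omits "" and so leaves the default pool.
nodes = {"node1": None, "node2": ["", "sort"], "node3": ["sort"]}
print(nodes_in_pool(nodes, DEFAULT_POOL))
print(nodes_in_pool(nodes, "sort"))
```

Here node2 shows that a node can be part of many pools, and the "sort" pool shows that a pool can contain many nodes.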
Resource disk:
A disk path is defined here. The data files of the datasets accessible to
each node are stored on the resource disk.
Resource scratch disk:
A folder path is also defined here. Parallel job stages use this path to
buffer data while the parallel job runs.
Configuration File for an MPP
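In an MPP, cluster or grid setup, the nodes typically run on different physical hosts, so their fastname values differ. The sketch below embeds a small sample configuration with two nodes on two hosts and extracts the fastnames to check that it spans more than one machine. The host names, paths and helper names are hypothetical and serve only as an illustration.

```python
import re

# Hypothetical two-host configuration, for illustration only.
MPP_CONFIG = '''{
  node "node1"
  {
    fastname "hostA"
    pools ""
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
  node "node2"
  {
    fastname "hostB"
    pools ""
    resource disk "/data/datasets" {pools ""}
    resource scratchdisk "/data/scratch" {pools ""}
  }
}
'''


def fastnames(config_text):
    """Return the fastname of every node, in file order."""
    return re.findall(r'fastname\s+"([^"]+)"', config_text)


def spans_multiple_hosts(config_text):
    """True when the nodes are spread over more than one physical host."""
    return len(set(fastnames(config_text))) > 1


print(fastnames(MPP_CONFIG))
print(spans_multiple_hosts(MPP_CONFIG))
```

An SMP configuration would give the opposite result, because all of its nodes share one fastname.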
Thank You
