NOTE: Don't move the mouse cursor too often, and don't open Internet
Explorer, as doing so slows the services down.
Go to Run.
Type services.msc.
Press Enter.
Check whether the IBM WebSphere service is started.
Run Cleanup.exe.
Click the Cleanup button.
Wait until all the temporary files are cleared.
Close.
Step 1:
File->New->Parallel Job.
Step 3:
Setting oltpsrc properties
Double-click the oltpsrc file in the work area.
Set the properties as follows.
Warnings
No limit: Runs the process even if n warnings are present.
Abort job after: Aborts the process after encountering the specified number
of warnings.
Note:
Before clicking Run, close your source file and target file.
Predicates:
1st WHERE clause condition, for the link DSLink12: sal<=10000
sequential_file_1 will contain the rows that satisfy the above constraint.
2nd WHERE clause condition, for the link DSLink11: sal>10000 and
sal<=20000
sequential_file_2 will contain the rows that satisfy the above constraint.
Options:
Set Output Rejects=True for DSLink10, then right-click DSLink10 and select
Convert to Stream.
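The routing that the two WHERE clauses and the reject link perform can be sketched in plain Python; the sample rows and column names here are assumptions, not data from the exercise.

```python
# Hypothetical sketch of the Filter stage's routing: two WHERE-clause
# links plus a reject link that catches everything else.
rows = [
    {"eno": 101, "ename": "gokul", "sal": 8000},
    {"eno": 102, "ename": "gopal", "sal": 15000},
    {"eno": 103, "ename": "kumar", "sal": 25000},
]

dslink12 = [r for r in rows if r["sal"] <= 10000]           # -> sequential_file_1
dslink11 = [r for r in rows if 10000 < r["sal"] <= 20000]   # -> sequential_file_2
dslink10 = [r for r in rows                                 # reject link
            if not (r["sal"] <= 10000 or 10000 < r["sal"] <= 20000)]
```

Each source row lands on exactly one link, which is why the reject link is simply the complement of the two predicates.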
Mapping Columns:
1. Select the output link from the combo box.
2. Drag and drop the columns from the left side to the right side.
3. Repeat the above steps for all the output links.
Step 9: Set sequential_file_1, sequential_file_2, sequential_file_3 properties
same as in exercise 1.
Step 10: Compile
Step 11: Run the project and observe the output.
Exercise-3: Load the target file from multiple src files using Funnel
stage
Properties settings
The target file is loaded with the rows of all the source files in sorted
order, based on the sort key value and sort order.
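A Sort Funnel merge can be sketched with Python's `heapq.merge`, assuming each input file is already sorted on the key; the sample rows are assumptions.

```python
import heapq

# Sketch of a Sort Funnel: both inputs are pre-sorted on the key (eno),
# and the funnel merges them into a single sorted output stream.
src1 = [(101, "gokul"), (103, "kumar")]
src2 = [(102, "gopal"), (104, "ravi")]

target = list(heapq.merge(src1, src2, key=lambda r: r[0]))
```

`heapq.merge` only compares the head of each input, which mirrors how a sort funnel can merge many large pre-sorted files without re-sorting everything.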
Output settings:
Step 8: Compile
Step 9: Run the project
Output:
Source files:
Target File on
1. Funnel Type=Continuous Funnel
Exercise-4: Load the target file from the source file in sorted
order using the SORT stage
Step1: Create a new parallel project
Step 2: Save the project with a name.
Step 3: Drag and Drop two sequential files into the work area.
Step 4: Drag and Drop sort from processing option on the palette into the
work area.
Output setting:
Target File:
Sort can also be performed on a link directed from a Funnel stage.
The above case won't work, because the Funnel link must be directed
directly into the Sort stage.
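The Sort stage's effect reduces to ordering the source rows on a key before writing the target; the sample rows below are assumptions.

```python
# Minimal sketch of the Sort stage: Sort Key = eno, Order = Ascending.
rows = [(103, "kumar"), (101, "gokul"), (102, "gopal")]

target = sorted(rows, key=lambda r: r[0])  # rows written to the target file
```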
Exercise -5: Load the target file after removing duplicate rows from
the src file using Remove Duplicates stage.
Row Duplicates:
Eno, ename, salary
101,gokul,10000
102,gopal,20000
101,gokul,15000
101,gokul,25000
103,kumar,20000
The record (101, gokul) appears three times with different salary values.
We need the latest updated row, so we use the Remove Duplicates stage: it
removes the duplicate rows while retaining either the last or the first one.
The duplicate search is made using the key, eno in our case.
We can choose which duplicate is retained by setting Duplicate to
Retain=Last | First.
Output Settings:
TARGET FILE:
Exercise 6: Join the rows in two src files and load them into the
target using JOIN stage
Step1: Create a new parallel project
Step 2: Save the project with a name.
Step 3: Drag and Drop three sequential files into the work area.
Step 4: Drag and Drop Join from processing option on the palette into the
work area.
Key= eno
Join Type= Inner|Left outer|Right outer|Full Outer
Output Settings:
Note:
While joining, keep the smaller table as the left table and the larger
table as the right table for better performance.
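The performance advice above matches how a hash join works: build a lookup table on the smaller input, then stream the larger one past it. A sketch with assumed sample tables:

```python
# Inner-join sketch on Key = eno. The hash table is built on the small
# (left) side, so memory use stays proportional to the smaller input.
left  = [(101, "gokul"), (102, "gopal")]            # small table
right = [(101, 10000), (102, 20000), (103, 15000)]  # big table

by_eno = {eno: name for eno, name in left}          # build on the small side
joined = [(eno, by_eno[eno], sal)
          for eno, sal in right if eno in by_eno]   # probe with the big side
```

Swapping the inputs would still produce the same rows, but the hash table would then be built on the larger side.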
Step 7: set sequential_file_2 properties as same as in exercise 1.
Step 8: Compile and Run the project.
OUTPUT:
Source File 1 and 2:
Output Settings:
The length value for a char column is its fixed length (all values of a
char column have a fixed number of characters).
The length value for integer and varchar columns is their upper limit,
i.e., the maximum number of digits for an integer and the maximum number of
characters for a varchar.
Step 6: Set sequential_file_1 properties as same as in exercise 1.
Output:
Target File:
Exercise 8: Load data from a flat source file to a target Oracle database
using the Oracle Connector stage.
Step1: Create a new parallel project
Step 2: Save the project with a name.
Step 3: Drag and Drop a sequential file into the work area.
Step 4: Drag and Drop oracle connector from Database option on the palette
into the work area.
You can also view the data that has been imported, using the View Data
button under Usage.
Output Settings:
Output:
Source File:
Target:
Username: Scott/tiger@orcl
Step 6: Import a table. (This takes a snapshot of the original table; the
snapshot is used for further processing with better performance, because
reading each and every record from the Oracle database over an Oracle
connection incurs more overhead.)
Since the imported table is only a snapshot, you have to re-import it every
time the table changes.
The changes you make to the table must be committed before importing it
into DataStage, especially with Oracle.
Username : scott
Password : tiger
Column Settings:
OUTPUT:
Target File:
Username: tduser
Password: tduser
You can also view the data that has been imported, using the View Data
button under Usage.
Column Settings:
The procedure is the same as in exercise 9.
Specifying the length and scale values is important here (when loading from
any database to a database, or from a file to any database).
Sal=12000.00 (length=7 and scale=2) // all values of a decimal column are
generated with the same number of digits.
The length value for a char column is its fixed length (all values of a
char column have a fixed number of characters).
The length value for integer and varchar columns is their upper limit,
i.e., the maximum number of digits for an integer and the maximum number of
characters for a varchar.
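The length/scale rule for Sal=12000.00 can be checked with Python's `decimal` module: length counts all digits and scale counts the digits after the decimal point.

```python
from decimal import Decimal

# For Sal = 12000.00: seven digits in total, two after the decimal point,
# i.e. length=7 and scale=2 as stated above.
sal = Decimal("12000.00")

length = len(sal.as_tuple().digits)   # total number of digits -> 7
scale = -sal.as_tuple().exponent      # digits after the point  -> 2
```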
Step 7: Set Oracle Connector properties as same as in exercise 8.
Step 8: Compile and run the project.
OUTPUT:
Target:
Username: Scott/tiger@orcl
Exercise 16: Perform some aggregations on the src flat file and load
them into a target flat file using Aggregator stage.
Step1: Create a new parallel project
Step 2: Save the project with a name.
Step 3: Drag and Drop two sequential files into the work area.
Step 4: Drag and Drop Aggregator from processing option on the palette into
the work area.
Column Mapping:
Column Settings
By default, the data type of every aggregation output is Double, so reset
the type as desired.
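The Aggregator's group-and-sum behavior, including the Double default and the final type cast, can be sketched as follows; the grouping column, aggregate, and sample rows are assumptions.

```python
# Aggregator sketch: group on deptno, sum sal.
rows = [
    {"deptno": 10, "sal": 10000},
    {"deptno": 10, "sal": 15000},
    {"deptno": 20, "sal": 20000},
]

totals = {}
for r in rows:
    # like the stage's default, the raw aggregate is a float (Double)
    totals[r["deptno"]] = totals.get(r["deptno"], 0.0) + r["sal"]

# cast the Double result to the desired type, as the note above suggests
totals = {dept: int(total) for dept, total in totals.items()}
```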
Step 7: Set Sequential_File_1 properties as same as in exercise 1.
Exercise 17: Load from src flat file to a target flat file with some
derived columns using Transformer stage.
Step1: Create a new parallel project
Step 2: Save the project with a name.
Step 3: Drag and Drop two sequential files into the work area.
Step 4: Drag and Drop Transformer from processing option on the palette
into the work area.
Drag and Drop the columns on which derivations have to be performed from
left to right (Column Mapping).
On the right-hand side, right-click each column and select Function -> any
desired function; the function prototype is then loaded into the column.
Edit the column as per the prototype (for example, on selecting UpCase,
UpCase(%string%) is loaded; edit the parameter value to DSLink5.ename).
Derive the Grade column from the sal column using If Else, with the same
procedure as above.
At the bottom right, rename the columns if you want (here we rename ename
to Emp_Name and sal to Annual_salary). The changes will be reflected in the
DSLink6 table.
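The two derivations described above, the UpCase call and the If-Else grade, amount to a per-row function like the sketch below; the grade threshold is an assumption, since the exercise does not specify one.

```python
# Transformer sketch: UpCase(DSLink5.ename) plus an If-Else derivation,
# with ename renamed to Emp_Name and sal renamed to Annual_salary.
def derive(row):
    emp_name = row["ename"].upper()              # UpCase(DSLink5.ename)
    grade = "A" if row["sal"] > 20000 else "B"   # If sal > 20000 Then "A" Else "B"
    return {"Emp_Name": emp_name,
            "Annual_salary": row["sal"],
            "grade": grade}
```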
Target File:
Exercise 18: Compare two tables (DWH and OLTP) and Capture the
changes in OLTP table with respect to DWH table then load the
changes to a flat file using Change Capture stage.
Step1: Create a new parallel project
Step 2: Save the project with a name.
Step 3: Drag and Drop oracle connectors from database option on the palette
into the work area.
Step 4: Drag and Drop a sequential file from file option on the palette into
the work area.
Step 5: Drag and Drop change capture from processing option on the
palette into the work area.
Step 6: Create two tables, student and dupstudent, with the structure
(rollno, name, age, deptid) and insert the same records into both. Then
make some changes in the dupstudent table (a new insert, a delete, an
update).
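What Change Capture computes between the before table (student) and the after table (dupstudent) can be sketched as a keyed diff. The sample rows are assumptions, and the change_code values (insert=1, delete=2, edit=3) follow DataStage's usual convention but should be treated as an assumption here.

```python
# Change Capture sketch, keyed on rollno: compare the before image with
# the after image and tag each difference with a change code.
before = {1: ("anu", 20), 2: ("bala", 21), 3: ("chitra", 22)}
after  = {1: ("anu", 20), 2: ("bala", 25), 4: ("dev", 23)}

changes = []
for key, row in after.items():
    if key not in before:
        changes.append((key, row, 1))        # insert: new in after image
    elif before[key] != row:
        changes.append((key, row, 3))        # edit: same key, new values
for key, row in before.items():
    if key not in after:
        changes.append((key, row, 2))        # delete: gone from after image
```

Rows that are identical in both images produce no output, which is why the target file holds only the captured changes.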
Step 7: set oracle connector properties as same as in exercise 9.
Step 8: set change capture properties as follows.
Setting Properties
Column Mappings:
OUTPUT:
Source tables:
Target File:
Exercise 19: Look up for the existence of records in DWH table with
respect to OLTP table and join the records using Look Up Stage
Step1: Create a new parallel project
Step 2: Save the project with a name.
Step 3: Drag and Drop three sequential files into the work area.
Step 4: Drag and Drop Look Up from processing option on the palette into
the work area.
Create a link on dno from oltp_link to dwh_link; it acts as the key for
comparison.
Drag and Drop the desired columns from oltp_link and dwh_link to
target_link.
Step 7: set target file properties as same as in exercise 1.
Step 8: Compile and run the project.
OUTPUT:
Source Files (DWH and OLTP):
Target File:
Inference:
If Lookup finds all the related records in the DWH table with respect to
the OLTP table using a key (here dno), it joins those records; the join
type is a natural join with a USING clause.
So Lookup can act as a Join, subject to the above restriction.
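The Lookup behavior, joining on dno but aborting on a missing reference key, can be sketched as follows; the sample tables are assumptions.

```python
# Lookup sketch: probe each OLTP row against the DWH reference table on
# dno. With Lookup Failure = Fail, a missing key aborts the job.
dwh  = {10: "SALES", 20: "HR"}           # reference table, keyed on dno
oltp = [(101, 10), (102, 20)]            # stream input: (eno, dno)

target = []
for eno, dno in oltp:
    if dno not in dwh:
        raise KeyError(f"lookup failed for dno={dno}")  # job aborts here
    target.append((eno, dno, dwh[dno]))
```

An OLTP row with dno=6 would hit the `raise` branch, which is the failure case discussed in the inference.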
Result:
Inference:
Since a record with the key dno=6 in the OLTP table does not exist in the
DWH table, an error occurred.
Step 6: Create a table oltp with the following description and insert some
records then commit.
Derivation and Expire can be set via double-click -> right-click ->
Function -> desired function on the respective columns.
Purpose Settings:
Business Key: primary key
Surrogate key: to locate changes (for system reference)
Type 1: Non-changeable values that are not a business key (e.g., date of
birth).
Type 2: Changeable values.
Effective Date: Entry date of the record
Expiration Date: Entry date of the immediately following version of the
record (so initially set it to null).
Current Indicator: Indicates the active record
Active-1
Inactive-0
OUTPUT:
The deptdwh table is inserted with the records from the oltp table, with
stdate as the current date, expdate as null, and CID as 1 (active record).
The dname value of the row with deptno=10 is changed from C to JAVA.
The old record gets an expiration date equal to the start date of the newly
updated record.
The current indicator (cid) of the old record becomes 0; for the new
record, cid=1.
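The Type-2 update described above, expiring the old row and inserting a new active one, can be sketched as follows. The column layout (deptno, dname, stdate, expdate, cid) and the dates are assumptions.

```python
from datetime import date

# SCD Type-2 sketch: when dname changes for an active row, expire the old
# row (expdate = today, cid = 0) and insert a new active row (cid = 1).
def apply_change(dwh, deptno, new_dname, today):
    out = []
    for row in dwh:
        if row[0] == deptno and row[4] == 1 and row[1] != new_dname:
            out.append((row[0], row[1], row[2], today, 0))   # expire old row
            out.append((deptno, new_dname, today, None, 1))  # new active row
        else:
            out.append(row)
    return out

dwh = [(10, "C", date(2023, 1, 1), None, 1)]        # initial load from oltp
dwh = apply_change(dwh, 10, "JAVA", date(2024, 1, 15))
```

After the change, the table holds the full history for deptno=10: the expired C row and the active JAVA row.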
Input settings:
Output Settings:
OUTPUT:
Source File:
Target File:
NOTE: The datatype of all the horizontal columns except the primary key
column in the source table should be the same. In our case the q1, q2, q3
columns in the source table are integers, so all of them can fit into the
single column q with integer datatype in the target table.
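The horizontal-to-vertical pivot in the note folds q1, q2, q3 into one column q, producing one output row per source column; the sample rows are assumptions.

```python
# Pivot sketch: each source row with columns q1, q2, q3 becomes three
# target rows with a single integer column q.
source = [
    {"sno": 1, "q1": 10, "q2": 20, "q3": 30},
    {"sno": 2, "q1": 40, "q2": 50, "q3": 60},
]

target = [
    {"sno": row["sno"], "q": row[col]}
    for row in source
    for col in ("q1", "q2", "q3")
]
```

The shared integer datatype matters because every q1/q2/q3 value must be representable in the single target column q.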
Exercise 22: Run the jobs in sequential manner (one after other)
using Sequence Job
A Sequence Job is mainly used for executing jobs one after another.
Executing jobs in a particular sequence is essential when one job depends
on the finished execution state of another.
For example consider the following query,
Select e.eno,e.ename,e.deptno,d.deptname from emp e join dept d
on(e.deptno=d.deptno) where e.deptno in(10,20,30) order by 2;
The above query corresponds to three jobs (1. Join, 2. Filter, 3. Sort)
that must be executed in sequence.
Step1: Create a new sequence project
Step 3: Drag and Drop the jobs you want to execute sequentially from
repository into the work area.
Step 6: Open the run directory and observe the logs for successful execution
of all the jobs.