Ab Initio Software:
Part 1
24 August 2007
Course Structure
Moving Data
Move small and large volumes of data efficiently
Deal with the complexity associated with business data
High Performance
scalable solutions
Better productivity
Ab Initio Software
Data transformation.
User Applications
Development Environments
Ab Initio
GDE Shell
Components
Datasets
Flows
The Graph Model: Some Details
Ports
Record format metadata
Expression metadata
Components
A dataset is made up of records; a record consists of fields.
Analogous database terms are rows and columns.
Example records (one per line; each is divided into fields):
0345John Smith
0212Sam Spade
0322Elvis Jones
0492Sue West
0121Mary Forth
0221Bill Black
Sources of Record Format Metadata
$AI_RUN: run directory
$AI_DML: record format files
$AI_XFR: transform files
$AI_MP: graphs
$AI_DB: database config files
Double-click a component to bring up its Properties page.
Viewing Port Properties
0345John Smith
0212Sam Spade
0322Elvis Jones
0492Sue West
0121Mary Forth
0221Bill Black
Editing Types in GDE
record
decimal(4) id;
string(6) first_name;
string(6) last_name;
string(5) newfield;
end
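To make the fixed-width layout concrete, here is a minimal Python sketch (not DML) that slices a record by the widths declared above. It assumes the sample rows were originally padded to the full field widths, which the extracted text no longer shows, and it leaves out newfield because the sample data does not contain it. The names FIELDS and parse_fixed are illustrative only.

```python
# Field widths taken from the record format: decimal(4), string(6), string(6).
FIELDS = [("id", 4), ("first_name", 6), ("last_name", 6)]

def parse_fixed(line):
    """Slice one fixed-width record into a dict of field values."""
    record = {}
    pos = 0
    for name, width in FIELDS:
        record[name] = line[pos:pos + width]
        pos += width
    # decimal(4) stores digits as text; convert for numeric use.
    record["id"] = int(record["id"])
    return record

# Sample rows re-padded to full field width (an assumption about the original file).
rows = ["0345John  Smith ", "0212Sam   Spade "]
parsed = [parse_fixed(r) for r in rows]
print(parsed[0])
```

Note that string fields keep their trailing padding, which is why trimming comes up later in the course.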
Field Names
There are several built-in types available via the drop-down menu. This
course uses three types: string, decimal (for all numbers), and date.
record
decimal(4) id;
string(6) first_name;
string(6) last_name;
date("YYYY-DD-MM") newfield;
end;
Expressions in DML
Type in an expression...
Expression text
Exercise 1: Writing DML
Open mp/ex1.mp
The data file ex1.dat contains these lines:
Smith,John,1992.02.23,2400
Jones,Jane,1993.10.29,320
Warren,Jake,1994.11.02,9045
Use the Record Format Editor (New) to create a description of this data:
lastname, firstname, pur_date, and amt. Then use View Data to verify
the description is correct.
Hint: Newline delimiters are written: \n
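As a way to check your understanding of the layout, the following Python sketch (not the DML answer) reads the same lines as comma-delimited fields with a newline ending the last field. The field names come from the exercise. The date pattern and the integer type for amt are assumptions read off the sample data.

```python
from datetime import datetime

FIELD_NAMES = ["lastname", "firstname", "pur_date", "amt"]

def parse_line(line):
    """Split one comma-delimited line; the newline terminates the last field."""
    values = line.rstrip("\n").split(",")
    record = dict(zip(FIELD_NAMES, values))
    # The sample dates look like YYYY.MM.DD.
    record["pur_date"] = datetime.strptime(record["pur_date"], "%Y.%m.%d").date()
    record["amt"] = int(record["amt"])
    return record

data = (
    "Smith,John,1992.02.23,2400\n"
    "Jones,Jane,1993.10.29,320\n"
    "Warren,Jake,1994.11.02,9045\n"
)
records = [parse_line(line) for line in data.splitlines()]
for rec in records:
    print(rec)
```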
Simple Components
The Sort component reads records from its input port, sorts them by key, and
writes the result to its output port.
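The behaviour can be sketched in Python as follows. This only illustrates sorting by a key made of one or more fields. It says nothing about how the component itself is implemented, and sort_records is a made-up name.

```python
records = [
    {"id": "0345", "last_name": "Smith"},
    {"id": "0212", "last_name": "Spade"},
    {"id": "0322", "last_name": "Jones"},
    {"id": "0492", "last_name": "West"},
    {"id": "0121", "last_name": "Forth"},
    {"id": "0221", "last_name": "Black"},
]

def sort_records(records, key_fields):
    """Return the records ordered by the listed key fields, in priority order."""
    return sorted(records, key=lambda r: tuple(r[f] for f in key_fields))

by_id = sort_records(records, ["id"])
print([r["id"] for r in by_id])
```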
Sorting (mp/figure-03.mp)
Sorting - The Key Specifier Editor
Exercise 3: Sorting
id+1000000
Input record fields: a b c
Output record fields: x y z
A record arrives at the input port:
9 45 QF
out :: trans(in) =
begin
out.x :: in.b - 1;
out.y :: in.a;
out.z :: fn(in.c);
end;
The record is read into the component.
The transformation function is evaluated.
Since every rule within the transform function is successful, a result record is issued:
44 9 RG
The result record is written to the output port of the component.
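The same walk-through can be mimicked in Python. This is only a sketch under stated assumptions: the slides do not define fn, so the version here (shift each letter forward by one) is a hypothetical stand-in chosen because it maps QF to RG. Collecting failed records in a rejects list is likewise an assumption about what happens when a rule does not succeed.

```python
def fn(c):
    """Hypothetical stand-in for fn: shift each character forward by one."""
    return "".join(chr(ord(ch) + 1) for ch in c)

def trans(rec):
    """Mirror of the three rules: x = b - 1, y = a, z = fn(c)."""
    return {"x": rec["b"] - 1, "y": rec["a"], "z": fn(rec["c"])}

def reformat(records):
    """Apply trans to each record; issue a result only if every rule succeeds."""
    out, rejects = [], []
    for rec in records:
        try:
            out.append(trans(rec))
        except (KeyError, TypeError):
            rejects.append(rec)  # assumed handling of a failed rule
    return out, rejects

good = {"a": 9, "b": 45, "c": "QF"}
bad = {"a": 1, "b": None, "c": "AB"}  # b - 1 cannot be evaluated
out, rejects = reformat([good, bad])
print(out)
```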
Exercise 4: Reformat Data
Then modify the transform to trim the spaces from the first name before
concatenating it with the last name, to get "John Smith" rather than
"John   Smith".
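The effect of trimming can be previewed outside the GDE with a short Python sketch. It assumes string(6) name fields padded with spaces and a single space as separator. Neither detail is stated in the exercise text, and the function names are made up for illustration.

```python
def full_name_untrimmed(first_name, last_name):
    """Concatenate the padded fields as they are."""
    return first_name + " " + last_name

def full_name_trimmed(first_name, last_name):
    """Strip trailing padding from the first name before concatenating."""
    return first_name.rstrip() + " " + last_name

untrimmed = full_name_untrimmed("John  ", "Smith ")
trimmed = full_name_trimmed("John  ", "Smith ")
print(repr(untrimmed))
print(repr(trimmed))
```

Only the first name is trimmed here, as in the exercise, so the last name keeps its own padding.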
Data Aggregation
Input records:
0345Smith Bristol 56
0121Forth Bristol 7
0322Jones Compton 12
0212Spade London 8
0492West London 23
0221Black New York 42
Output records (one total per city):
Bristol 63
Compton 12
London 31
New York 42
The Rollup Component (mp/figure-05.mp)
Aggregation functions: avg, count, first, last, max, min, product, sum
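The aggregation shown in the Data Aggregation example (a sum of amounts per city) can be sketched in Python. It assumes the input is already grouped by the key, as in the sample, and uses sum as the aggregation function. The name rollup_sum is illustrative.

```python
from itertools import groupby

records = [
    ("0345", "Smith", "Bristol", 56),
    ("0121", "Forth", "Bristol", 7),
    ("0322", "Jones", "Compton", 12),
    ("0212", "Spade", "London", 8),
    ("0492", "West", "London", 23),
    ("0221", "Black", "New York", 42),
]

def rollup_sum(records):
    """Emit one (city, total) pair per run of records sharing a city."""
    # groupby merges only adjacent records, so the input must be grouped by key.
    return [
        (city, sum(rec[3] for rec in group))
        for city, group in groupby(records, key=lambda rec: rec[2])
    ]

totals = rollup_sum(records)
print(totals)
```

Swapping sum for max, min, or len over the group would give the corresponding max, min, or count aggregations.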
Rollup Wizard
0345Bristol 561997/09/24
0212London 81900/01/01
0322Compton 121997/04/02
0492London 231997/11/23
0121Bristol 71996/12/11
0221New York 421900/01/01
Joining Sorted Data on the id field
0121Bristol 71996/12/11
0212London 81900/01/01
...
Building the Output Record
in0:
record
decimal(4) id;
string(6) name;
string(8) city;
decimal(3) amount;
end
in1:
record
decimal(4) id;
date("YYMMDD") dt;
decimal(9.2) cost;
end
out:
record
decimal(4) id;
string(8) city;
decimal(3) amount;
date("YYYY/MM/DD") dt;
end
What if the in1 record is missing?
With the same in0, in1, and out record formats as above, in1.dt is unavailable
when the in1 record is missing, so out.dt has no source value.
Prioritized Assignment
in0 fields: a b c
in1 fields: a q r
out fields: a x q
Records arrive at the inputs of the Join:
in0: G 234 42
in1: G NY 4
Align inputs by a: both keys are G, so an output record is built from the two inputs:
G 24 NY
New records arrive at the inputs of the Join:
in0: H 79 23
in1: K IL 8
Align inputs by a: H has no matching in1 record, so the K record is held back
and the output record for H takes a lower-priority value for q:
H 89 XX
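The alignment step can be sketched in Python as a merge over two inputs sorted by key. This sketch uses the id, city, amount, and dt formats from the Building the Output Record slide rather than the a/b/c fields, because the rule that computes x is not shown. It assumes every in0 record is kept, that in1 has at most one record per id, and that a missing in1 record yields the default date 1900/01/01. The in1 dates are back-derived from the joined sample rows for illustration.

```python
from datetime import datetime

DEFAULT_DT = "1900/01/01"  # assumed fallback when in1 has no match

def join_sorted(in0, in1):
    """Merge-join two lists sorted by id, keeping every in0 record."""
    out = []
    j = 0
    for rec in in0:
        # Advance past in1 records whose key is smaller than the in0 key.
        while j < len(in1) and in1[j]["id"] < rec["id"]:
            j += 1
        if j < len(in1) and in1[j]["id"] == rec["id"]:
            # Convert YYMMDD to YYYY/MM/DD for the output format.
            dt = datetime.strptime(in1[j]["dt"], "%y%m%d").strftime("%Y/%m/%d")
        else:
            dt = DEFAULT_DT
        out.append({"id": rec["id"], "city": rec["city"],
                    "amount": rec["amount"], "dt": dt})
    return out

in0 = [
    {"id": "0121", "city": "Bristol", "amount": 7},
    {"id": "0212", "city": "London", "amount": 8},
    {"id": "0221", "city": "New York", "amount": 42},
    {"id": "0322", "city": "Compton", "amount": 12},
    {"id": "0345", "city": "Bristol", "amount": 56},
    {"id": "0492", "city": "London", "amount": 23},
]
in1 = [
    {"id": "0121", "dt": "961211"},
    {"id": "0322", "dt": "970402"},
    {"id": "0345", "dt": "970924"},
    {"id": "0492", "dt": "971123"},
]
joined = join_sorted(in0, in1)
for row in joined:
    print(row)
```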
Exercise 7: Join Data
Using Last-Visits
as a lookup file
Configuring a Lookup File
Transform function:
out :: lookup_info(in) =
begin
out.id :: in.id;
out.city :: in.city;
out.amount :: in.amount;
out.dt :1: lookup("Last-Visits", in.id).dt;
out.dt :2: "1900/01/01";
end;
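The prioritized rules for out.dt can be illustrated in Python: the first rule that yields a value wins, and the next one is tried otherwise. This is a sketch, not Ab Initio's implementation. LAST_VISITS is a hypothetical in-memory stand-in for the Last-Visits lookup file, with dates assumed to be stored in the output format already, and prioritized is a made-up helper.

```python
# Hypothetical stand-in for the Last-Visits lookup file, keyed by id.
LAST_VISITS = {"0121": "1996/12/11", "0345": "1997/09/24"}
DEFAULT_DT = "1900/01/01"

def prioritized(*rules):
    """Return the value of the first rule that succeeds with a non-None value."""
    for rule in rules:
        try:
            value = rule()
        except (KeyError, TypeError):
            continue  # this rule failed; fall through to the next priority
        if value is not None:
            return value
    raise ValueError("no rule succeeded")

def lookup_info(rec):
    """Copy id, city, amount; take dt from the lookup, else the default."""
    return {
        "id": rec["id"],
        "city": rec["city"],
        "amount": rec["amount"],
        "dt": prioritized(
            lambda: LAST_VISITS[rec["id"]],  # priority 1: lookup by id
            lambda: DEFAULT_DT,              # priority 2: constant default
        ),
    }

found = lookup_info({"id": "0121", "city": "Bristol", "amount": 7})
missing = lookup_info({"id": "0212", "city": "London", "amount": 8})
print(found["dt"], missing["dt"])
```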
Exercise 9 (if time): Lookup
Note that if the Watcher files do not exist, the GDE builds them during the first run only
and uses the existing Watchers on successive runs.
Q&A
Any questions?
Capgemini
WORLDWIDE HEADQUARTERS 6400 SHAFER COURT ROSEMONT, ILLINOIS USA 60018
Tel. 847.384.6100 Fax 847.384.0500 WWW.Capgemini.COM