
Join, LookUp, and Merge stages

These three stages combine two or more input links according to the values of user-designated key column(s).

1. Join Stage
The Join stage is an active stage. It performs join operations on two data sets input to the stage and then outputs the resulting data set. The input data sets are notionally identified as the left set and the right set. You can specify which is which.
It has any number of input links and a single output link.

The stage can perform one of four join operations:


Inner: transfers records from the input data sets whose key columns contain equal values to the output data set. Records whose key columns do not contain equal values are dropped.

Left outer: transfers all values from the left data set, but transfers values from the right data set only where the key columns match. The operator drops the key column from the right data set.

Right outer: transfers all values from the right data set, and transfers values from the left data set only where the key columns match. The operator drops the key column from the left data set.

Full outer: transfers records in which the contents of the key columns are equal from the left and right input data sets to the output data set. It also transfers records whose key columns contain unequal values from both input data sets to the output data set.
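The four join operations can be illustrated with a minimal Python sketch. This is not DataStage code: the function name `join`, the `how` parameter, and the representation of records as dicts are all assumptions made for illustration. Because both sides share the key column name, the merged record carries the key only once, which mirrors dropping the key column from one side.

```python
from collections import defaultdict


def join(left, right, key, how="inner"):
    """Join two lists of dict records on one key column (illustrative only).

    how: "inner", "left", "right", or "full".
    Columns from an unmatched side are simply absent from the output record.
    """
    if how not in ("inner", "left", "right", "full"):
        raise ValueError(f"unknown join type: {how}")

    # Index the right records by key value so each left record can find its matches.
    right_by_key = defaultdict(list)
    for rec in right:
        right_by_key[rec[key]].append(rec)

    out = []
    matched_keys = set()
    for lrec in left:
        matches = right_by_key.get(lrec[key], [])
        if matches:
            matched_keys.add(lrec[key])
            # Duplicate keys on both sides yield a cross product of the matches.
            for rrec in matches:
                out.append({**rrec, **lrec})
        elif how in ("left", "full"):
            # Unmatched left record is kept by left outer and full outer joins.
            out.append(dict(lrec))

    if how in ("right", "full"):
        # Unmatched right records are kept by right outer and full outer joins.
        for rrec in right:
            if rrec[key] not in matched_keys:
                out.append(dict(rrec))
    return out
```

Calling `join(left, right, "id", how="left")` with a left set holding ids 1 and 2 and a right set holding ids 2 and 3 would return the record for id 1 unchanged and the record for id 2 extended with the right-hand columns.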

Join Stage Ex


2. Lookup Stage
The Lookup stage is an active stage. It is used to perform lookup operations on a lookup table contained in a Lookup File Set stage or provided by one of the database stages that support reference output links. Lookup tables should be small enough to fit into physical memory; otherwise performance suffers because of paging.
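The in-memory nature of a lookup can be sketched in Python as a dict built from the lookup table and probed once per source record. The function name `lookup` and the `on_unmatched` parameter are hypothetical; its four values follow the fail, continue, drop, and reject options listed in the synopsis. Keeping the first entry when the lookup table contains duplicate keys is a choice made for this sketch, not documented stage behaviour.

```python
def lookup(source, lookup_table, key, on_unmatched="fail"):
    """Enrich source records from an in-memory lookup table (illustrative only).

    on_unmatched: "fail", "continue", "drop", or "reject".
    Returns (output_records, rejected_records).
    """
    if on_unmatched not in ("fail", "continue", "drop", "reject"):
        raise ValueError(f"unknown option: {on_unmatched}")

    # The whole lookup table is held in memory, which is why it must be small.
    # Assumption of this sketch: on duplicate keys the first entry is kept.
    table = {}
    for rec in lookup_table:
        table.setdefault(rec[key], rec)

    out, rejects = [], []
    for src in source:
        ref = table.get(src[key])
        if ref is not None:
            # Table entries are not consumed, so they can match many source records.
            out.append({**ref, **src})
        elif on_unmatched == "fail":
            raise KeyError(f"no lookup entry for key {src[key]!r}")
        elif on_unmatched == "continue":
            out.append(dict(src))      # pass the record through without lookup columns
        elif on_unmatched == "reject":
            rejects.append(dict(src))  # send the record to the reject output
        # "drop": the record is silently discarded
    return out, rejects
```

The dict probe is constant time per source record, so neither input needs to be sorted; the cost is the memory taken by `table`.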

Lookup stage Ex

Lookup stage Editor

3. Merge Stage
The Merge stage is an active stage. It can have any number of input links, a single output link, and the same number of reject links as there are update input links. The Merge stage combines a sorted master data set with one or more sorted update data sets. The columns from the records in the master and update data sets are merged so that the output record contains all the columns from the master record plus any additional columns from each update record.
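A master/update merge over sorted inputs can be sketched in Python as a single forward pass with one cursor per update set. The function name `merge` and the `unmatched_master` parameter are hypothetical, and the keep and drop values follow the options listed in the synopsis. The sketch assumes every input is already sorted on the key and does not model the stage's handling of duplicate keys.

```python
def merge(master, updates, key, unmatched_master="keep"):
    """Merge a sorted master with sorted update sets (illustrative only).

    master: list of dict records sorted by key.
    updates: list of update sets, each a list of dict records sorted by key.
    Returns (output_records, rejects), where rejects[i] holds the records of
    updates[i] that matched no master record.
    """
    if unmatched_master not in ("keep", "drop"):
        raise ValueError(f"unknown option: {unmatched_master}")

    positions = [0] * len(updates)    # one read cursor per update set
    rejects = [[] for _ in updates]   # one reject set per update set
    out = []

    for mrec in master:
        merged = dict(mrec)
        matched = False
        for i, upd in enumerate(updates):
            pos = positions[i]
            # Update records with a smaller key can no longer match any master.
            while pos < len(upd) and upd[pos][key] < mrec[key]:
                rejects[i].append(upd[pos])
                pos += 1
            if pos < len(upd) and upd[pos][key] == mrec[key]:
                # Add only the columns the output record does not already have.
                for col, val in upd[pos].items():
                    merged.setdefault(col, val)
                pos += 1              # the update record is consumed by the match
                matched = True
            positions[i] = pos
        if matched or unmatched_master == "keep":
            out.append(merged)

    # Update records left over after the last master record are unmatched too.
    for i, upd in enumerate(updates):
        rejects[i].extend(upd[positions[i]:])
    return out, rejects
```

Because each cursor only moves forward, no input has to be held in memory as a whole, which is the reason the inputs must be sorted.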


Synopsis: Joins, Lookup and Merge


                                  Joins                       Lookup                             Merge
Model                             RDBMS-style relational      Source - in RAM LU Table           Master - Update(s)
Memory usage                      light                       heavy                              light
# and names of Inputs             exactly 2: 1 left, 1 right  1 Source, N LU Tables              1 Master, N Update(s)
Mandatory Input Sort              both inputs                 no                                 all inputs
Duplicates in primary input       OK (x-product)              OK                                 Warning!
Duplicates in secondary input(s)  OK (x-product)              Warning!                           OK only when N = 1
Options on unmatched primary      NONE                        [fail] | continue | drop | reject  [keep] | drop
Options on unmatched secondary    NONE                        NONE                               capture in reject set(s)
On match, secondary entries are   reusable                    reusable                           consumed
# Outputs                         1                           1 out, (1 reject)                  1 out, (N rejects)
Captured in reject set(s)         Nothing (N/A)               unmatched primary entries          unmatched secondary entries
