Welcome to DataStage/390! This sample job introduces you to the features and functionality of DataStage's mainframe edition. It extracts data from two complex flat files, performs a series of processing steps, and writes the data to a delimited flat file. The processing steps include stages to transform data, perform a table lookup, join data from intermediate fixed-width flat files, and finally aggregate data before loading the target data warehouse. As a final step, the target file is prepared for transfer to a Unix system.
The sample job is divided into four parts:
Part 1 extracts data from a complex flat file called PRODUCT_MASTER, performs simple data transformations in a Transformer stage, performs a lookup to a relational table using a Lookup stage, and loads data to a fixed-width flat file called PRODUCT.
Part 2 extracts data from a complex flat file called SALES_ORDER_ENTRY, performs complex transformations in a Transformer stage, and loads data to a fixed-width flat file called ORDER_LINE_ITEMS.
Part 3 joins the data from PRODUCT and ORDER_LINE_ITEMS and loads a fixed-width flat file called PRODUCT_SALES_ANALYSIS.
Part 4 aggregates the data from PRODUCT_SALES_ANALYSIS and loads a delimited flat file called MONTHLY_PRODUCT_SALES. A final FTP stage collects the information needed to transfer the target file to the host machine.
This document takes you on a tour of the sample job, showing you each stage and how it is configured. You'll become familiar with the different stage types in mainframe jobs and their unique characteristics.
DataStage 4.1
Page 1 of 11
03/30/2013
In the center of the window is Part 3, a Join stage labeled Join_Products_Orders. Part 4 is displayed on the right side of the window. Data from the Join stage flows into a Fixed-Width Flat File stage labeled PRODUCT_SALES_ANALYSIS. The data is output to an Aggregator stage labeled Sales_Aggregation, and then loaded into a Delimited Flat File stage called MONTHLY_PRODUCT_SALES. The last stage in the job design is an FTP stage labeled UNIX_FTP.
1. Double-click the PRODUCT_MASTER Complex Flat File stage. The Complex Flat File Stage dialog box appears, displaying the stage General page. Notice that this page specifies the name of the file from which data is extracted, XDV4.PRODUCT.MASTER. It also specifies the DD name of the file, the access type, and the starting and ending rows.
2. Click the Columns tab. This page displays the column definitions of the data being read by the stage. The columns were loaded from the SALESORD.CFD table definition (see page 8 for detailed definitions of the tables and columns used in the sample job). Right-click the LAST_UPDATE_DATE field and select Edit row... from the shortcut menu. The Edit Column Meta Data dialog box appears. Notice that the Date format field specifies a date format of MMDDCCYY. Click the Next button to display the meta data for the EFF_START_DATE field. Its date format is also MMDDCCYY.
3. Click the File view tab. This page displays the COBOL PICTURE clauses for the columns and the exact storage layout in the file.
4. Now click the Outputs tab. The Constraint page is displayed by default. The constraint specifies that records without a date in the EFF_END_DATE field are not to be output from the stage.
5. Click the Selection tab. Notice that a subset of columns appears in the Selected columns list.
6. Click OK to close the Complex Flat File Stage dialog box.
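The date format and output constraint described in the steps above can be sketched outside DataStage. The following Python fragment is only an illustration, not the code the job generates: the function names and the sample records are hypothetical, and it assumes that a record "without a date" carries a blank EFF_END_DATE field.

```python
from datetime import date, datetime
from typing import Optional


def parse_mmddccyy(text: str) -> Optional[date]:
    """Parse an 8-character MMDDCCYY field; a blank field yields None."""
    text = text.strip()
    if not text:
        return None
    # %m = month, %d = day, %Y = four-digit year (century plus year).
    return datetime.strptime(text, "%m%d%Y").date()


def passes_constraint(record: dict) -> bool:
    """Keep only records that carry a date in EFF_END_DATE."""
    return parse_mmddccyy(record["EFF_END_DATE"]) is not None


# Hypothetical sample records, for illustration only.
records = [
    {"PRODUCT_ID": "TOOL00001", "EFF_END_DATE": "12312001"},
    {"PRODUCT_ID": "TOOL00002", "EFF_END_DATE": "        "},
]
kept = [r["PRODUCT_ID"] for r in records if passes_constraint(r)]
```

Here only the first record would be passed to the output link, because the second has no end date.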
Transformer Stage
Next, let's look at the Transformer stage. It specifies the transformations to be applied to the data before it is sent to the Lookup stage.
1. Double-click the Prod_Mstr_Transform Transformer stage. The Transformer Editor appears. The upper part of the Transformer Editor shows the columns on the input and output links, and the lower part displays the column meta data for each link.
2. Most of the output columns are derived from their corresponding input columns, as indicated by the relationship lines between them. Notice that the derivation for the EXTRACT_DATE output column is CURRENT_DATE. Double-click the Derivation cell to open the Expression Editor. Click Constants in the Item type list and notice that CURRENT_DATE is displayed in the Item properties list. This expression uses the constant to specify that EXTRACT_DATE is derived from the current date at the time of execution.
3. Click OK to close the Transformer Editor.
Lookup Stage
Now let's look at the Lookup stage, which is designed to match rows from the two input links based on unit-of-measure codes and return unit-of-measure descriptions.
1. Double-click Prod_Uom_Lookup to open the Lookup Stage dialog box. The General page is displayed by default. It indicates that a Singleton Lookup is to be performed using the Auto lookup technique. Skip Row is the action specified if the lookup fails.
2. Click the Inputs tab and select each of the two input links from the Input name field, noticing the column definitions displayed in the Columns grid.
3. Click the Outputs tab. The Lookup Condition page is displayed by default and contains the key expression for performing the lookup. The lookup will be performed when the UOMCD column from the reference link equals the UOM_CODE column from the primary link.
4. Click the Mapping tab. The left pane displays the columns from the reference and primary links. The right pane shows the output column derivations. Notice that the UOM_DESC column from the UOM Relational stage is mapped to the UNIT_OF_MEASURE output column. Most of the other output columns are derived from the primary link input columns.
5. Click OK to close the Lookup Stage dialog box.
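To make the lookup behavior concrete, here is a minimal Python sketch of a keyed lookup that skips rows when no match is found. It is an analogy only: the function name and the sample rows are hypothetical, it assumes that at most one reference row is used per key, and it says nothing about the technique DataStage actually chooses under Auto.

```python
def singleton_lookup(primary_rows, reference_rows):
    """Match UOM_CODE (primary link) to UOMCD (reference link)."""
    # Build the reference table once; the first row seen for a code wins.
    uom_desc_by_code = {}
    for ref in reference_rows:
        uom_desc_by_code.setdefault(ref["UOMCD"], ref["UOM_DESC"])

    output = []
    for row in primary_rows:
        desc = uom_desc_by_code.get(row["UOM_CODE"])
        if desc is None:
            continue  # "Skip Row": a failed lookup produces no output row
        output.append({**row, "UNIT_OF_MEASURE": desc})
    return output


# Hypothetical sample data, for illustration only.
uom_table = [{"UOMCD": "E", "UOM_DESC": "EACH"}, {"UOMCD": "B", "UOM_DESC": "BOX"}]
products = [
    {"PRODUCT_ID": "TOOL00001", "UOM_CODE": "E"},
    {"PRODUCT_ID": "TOOL00002", "UOM_CODE": "Z"},  # no match, so it is skipped
]
looked_up = singleton_lookup(products, uom_table)
```

The second product is dropped because its unit-of-measure code has no entry in the reference table.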
2. Click Outputs and notice that all columns are being passed through the stage and no constraint has been specified.
Transformer Stage
This stage defines the field mappings and transformations of data flowing from the SALES_ORDER_ENTRY source to the ORDER_LINE_ITEMS target.
1. Open the Order_Transform Transformer stage. Click the Show/Hide Stage Variables button on the Transformer Editor toolbar to display the stage variables. Notice the link lines joining the input columns with the stage variables, and the link lines along the right side of the table connecting the variables to the output columns that use them. Four stage variables have been defined:
a) TempColorDesc defines field conversions to convert COLOR_CODE input values before they are moved to the COLOR_DESC output column. Double-click the Derivation cell to open the Expression Editor, and examine the IF-ELSE statements used to build the expression. When you are done, click OK.
b) WxGrossDisc is a working storage variable that stores intermediate results in the calculation of the WxDiscAmt variable. Right-click and choose Stage Variable Properties from the shortcut menu to display the Transformer Stage Properties dialog box. Notice the properties for this and the other variables.
c) WxReturnDisc is also working storage for calculating the WxDiscAmt variable. Open the Expression Editor and look at the expression used to define this variable.
d) WxDiscAmt is working storage used in the arithmetic specifications that calculate the values of the DISC_AMT and LINE_ITEM_SALES_AMT output columns. Open the Expression Editor and notice that the expression is based on the WxGrossDisc and WxReturnDisc variables.
2. Now look at the Derivation cells for the COLOR_DESC, DISC_AMT, and LINE_ITEM_SALES_AMT output columns. Notice how the stage variables are used in the expressions.
3. Look at the ORDER_DATE column derivation, which is the concatenation of the ORDER_YY, ORDER_MM, and ORDER_DD input values.
4. Finally, look at the QUANTITY_SOLD column derivation, which subtracts RETURN_QUANTITY from QUANTITY_ORDERED input values.
5. Click OK to close the Transformer Editor.
Now that you are done with Part 2, let's move on to Part 3, where a Join stage combines data from the two input streams.
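The derivations in this Transformer stage that are fully described above can be approximated in a few lines of Python. This is a sketch under stated assumptions: the color-code table is hypothetical (the real conversions live in the job's IF-ELSE expression), the discount variables are left out because their expressions are not reproduced in this document, and the actual ORDER_DATE derivation may add separators or padding that are not shown here.

```python
# Hypothetical color-code table, for illustration only.
COLOR_NAMES = {"RD": "RED", "BL": "BLUE"}


def order_transform(row: dict) -> dict:
    """Apply the three derivations described in the walkthrough."""
    # Plays the role of a stage variable: computed once, then reused.
    temp_color_desc = COLOR_NAMES.get(row["COLOR_CODE"], "UNKNOWN")
    return {
        "PRODUCT_ID": row["PRODUCT_ID"],
        "COLOR_DESC": temp_color_desc,
        # Concatenation of the year, month, and day input values.
        "ORDER_DATE": row["ORDER_YY"] + row["ORDER_MM"] + row["ORDER_DD"],
        # Units actually sold: ordered minus returned.
        "QUANTITY_SOLD": row["QUANTITY_ORDERED"] - row["RETURN_QUANTITY"],
    }


# Hypothetical input row, for illustration only.
sample = {
    "PRODUCT_ID": "TOOL00001",
    "COLOR_CODE": "RD",
    "ORDER_YY": "01",
    "ORDER_MM": "03",
    "ORDER_DD": "15",
    "QUANTITY_ORDERED": 10,
    "RETURN_QUANTITY": 2,
}
result = order_transform(sample)
```

Computing the color description in a local variable first mirrors how a stage variable holds an intermediate value that output column derivations then reference.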
Join Stage
1. Double-click the Join_Products_Orders Join stage. On the General page, notice that the join type is an inner join, which returns only those rows that have matching values in both input tables. The join technique is AUTO, which means DataStage will choose the technique based on the information specified in the stage.
2. Click Inputs and look at the column definitions being passed from the two input links.
3. Click Outputs. The Join Condition page is displayed by default. The join will be performed where PRODUCT_ID of the ORDER_LINE_ITEMS input table equals PRODUCT_ID of the PRODUCT input table.
4. Click Mapping and examine the mappings between input columns and output columns.
5. Click OK to close the Join Stage dialog box.
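An inner join on PRODUCT_ID can be illustrated with a small hash join in Python. The hash technique is chosen here only for brevity; this document does not say which technique DataStage selects under AUTO. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def inner_join(order_rows, product_rows, key="PRODUCT_ID"):
    """Return one combined row per matching order/product pair."""
    # Index the product rows by join key.
    products_by_key = defaultdict(list)
    for product in product_rows:
        products_by_key[product[key]].append(product)

    joined = []
    for order in order_rows:
        # Orders without a matching product contribute nothing (inner join).
        for product in products_by_key.get(order[key], []):
            joined.append({**product, **order})
    return joined


# Hypothetical sample rows, for illustration only.
product_rows = [{"PRODUCT_ID": "TOOL00001", "PROD_DESC": "HAMMER"}]
order_rows = [
    {"PRODUCT_ID": "TOOL00001", "ORDER_NUMBER": "A000000001"},
    {"PRODUCT_ID": "TOOL00009", "ORDER_NUMBER": "A000000002"},  # no product row
]
joined_rows = inner_join(order_rows, product_rows)
```

Only the first order appears in the result, since the second has no matching PRODUCT_ID in the product input.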
Aggregator Stage
The Aggregator stage groups data from the input link, performs aggregation functions, and outputs the data on a single output link.
1. Double-click the Sales_Aggregation stage. The Outputs page is active by default. Control break aggregation is selected, meaning that the input rows will not be sorted before aggregation occurs.
2. Click Aggregation and examine the settings. Where are first and last values returned? Which columns are summarized and which are averaged? Also notice that Group By is checked for the rest of the columns. Every output column from an Aggregator stage must be either aggregated or grouped by.
3. Click Mapping to look at the input-to-output column mappings. Input column names are appended with a tag indicating the aggregation function being performed. Output column derivations also display these tags.
4. Click OK to close the Aggregator Stage dialog box.
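Control break aggregation can be pictured with Python's itertools.groupby, which also starts a new group whenever the key changes between consecutive rows and never sorts its input. The sketch below is an analogy, not the stage's implementation. The choice of grouping columns and of which columns are averaged or summed is an assumption made for illustration, and the sample rows are hypothetical.

```python
from decimal import Decimal
from itertools import groupby


def control_break_aggregate(rows):
    """Aggregate consecutive rows that share the same grouping key."""
    def group_key(row):
        return (row["PRODUCT_ID"], row["ORDER_YY"], row["ORDER_MM"])

    results = []
    # groupby does not sort: a key change ("control break") closes the group.
    for (product_id, yy, mm), group in groupby(rows, key=group_key):
        members = list(group)
        results.append({
            "PRODUCT_ID": product_id,
            "ORDER_YY": yy,
            "ORDER_MM": mm,
            "AVG_QTY_SOLD": sum(m["QUANTITY_SOLD"] for m in members) / len(members),
            "ACT_SALES_AMT": sum(m["ITEM_SALES_AMT"] for m in members),
        })
    return results


def make_row(product_id, qty, amount):
    """Build a hypothetical input row for the demonstration."""
    return {"PRODUCT_ID": product_id, "ORDER_YY": "01", "ORDER_MM": "03",
            "QUANTITY_SOLD": qty, "ITEM_SALES_AMT": Decimal(amount)}


# The last row repeats product A after B, so it forms a separate group.
rows = [make_row("A", 2, "10.00"), make_row("A", 4, "20.50"),
        make_row("B", 1, "5.00"), make_row("A", 3, "7.00")]
aggregated = control_break_aggregate(rows)
```

Because the rows are not sorted, product A yields two output rows in this example, which shows why input order matters for this style of aggregation.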
Summary
This sample job introduced you to the capabilities of DataStage/390. It featured most of the source, target, and processing stage types that are available in mainframe jobs. You saw how to configure the individual stages and link them together in manageable steps, resulting in an effective design for building a data warehouse. For more information about DataStage/390, refer to DataStage/390 Job Developer's Guide and DataStage/390 Tutorial.
PRODMSTR.CFD
01  PRODUCT-MASTER.
    05  PRODUCT-ID.
        10  PRODUCT-LINE          PIC X(04).
        10  PRODUCT-MODEL         PIC X(05).
    05  LAST-UPDATE-DATE          PIC X(08).
    05  EFF-START-DATE            PIC X(08).
    05  EFF-END-DATE              PIC X(08).
    05  ORDER-LEAD-TIME           PIC X(02).
    05  STOCK-INVENTORY           PIC X.
    05  UOM-CODE                  PIC X.
    05  UNIT-PRICE                PIC S9(5)V99 COMP-3.
    05  WARRANTY-TYPE             PIC XX.
    05  WARRANTY-PERIOD           PIC S9(3) COMP-3.
    05  PRODUCT-DESC              PIC X(20).
    05  AVAILABLE-COLORS OCCURS 10 TIMES.
        10  COLOR-CODE            PIC X(04).
        10  COLOR-DESC            PIC X(15).
    05  PROD-DISCOUNTS OCCURS 5 TIMES.
        10  DISC-FROM-DATE        PIC X(08).
        10  DISC-END-DATE         PIC X(08).
        10  DISC-PCT              PIC SV9(3) COMP-3.
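Several fields in this layout are COMP-3 (packed decimal) fields, which store two decimal digits per byte with the sign in the final half-byte. As a rough illustration of how such a field is read, the Python sketch below decodes the raw bytes. The function name is hypothetical and the sample byte values are made up; the number of decimal places comes from the V position in the PICTURE clause and is passed in as scale.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field; scale is the digit count after V."""
    if not raw:
        raise ValueError("empty COMP-3 field")
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)    # high half-byte
        nibbles.append(byte & 0x0F)  # low half-byte
    sign_nibble = nibbles.pop()      # the last half-byte holds the sign
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid digit in COMP-3 field")
    value = int("".join(str(n) for n in nibbles))
    if sign_nibble in (0x0B, 0x0D):  # B and D mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)


# Made-up sample bytes, for illustration only.
unit_price = unpack_comp3(bytes([0x12, 0x34, 0x56, 0x7C]), scale=2)  # PIC S9(5)V99
warranty_period = unpack_comp3(bytes([0x01, 0x2D]))                  # PIC S9(3)
disc_pct = unpack_comp3(bytes([0x15, 0x0C]), scale=3)                # PIC SV9(3)
```

A PIC S9(5)V99 COMP-3 field holds seven digits plus a sign, which is eight half-bytes, so it occupies four bytes in the file.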
UOMTBLE.DFD
EXEC SQL DECLARE XDV4.UOM TABLE
    ( UOMCD      CHAR(1)  NOT NULL,
      UOM_DESC   CHAR(5)  NOT NULL
    )
END-EXEC.
PRODTBL.CFD
01  PRODUCT-TABLE.
    10  PRODUCT-ID                PIC X(09).
    10  EXTRACT-DATE              PIC X(08).
    10  EFF_START_DATE            PIC X(08).
    10  LAST_UPDATE_DATE          PIC X(08).
    10  PROD_DESC                 PIC X(20).
    10  UNIT_OF_MEASURE           PIC X(05).
    10  WARRANTY_TYPE             PIC X(02).
    10  WARRANTY_PERIOD           PIC S9(3) COMP-3.
SALESORD.CFD
01  SALES-ORDER-INFO.
    05  ORDER-NUMBER              PIC X(10).
    05  LINE-ITEM-NO              PIC 9(05).
    05  ORDER-STATUS              PIC X.
    05  ORDER-DATE.
        10  ORDER-YY              PIC X(02).
        10  FILLER                PIC X.
        10  ORDER-MM              PIC X(02).
        10  FILLER                PIC X.
        10  ORDER-DD              PIC X(02).
    05  SHIPMENT-DATE.
        10  SHIPMENT-YY           PIC X(02).
        10  FILLER                PIC X.
        10  SHIPMENT-MM           PIC X(02).
        10  FILLER                PIC X.
        10  SHIPMENT-DD           PIC X(02).
    05  CUSTOMER-ID               PIC X(10).
    05  SALES-REP-ID              PIC X(08).
    05  ROUTE-CODE                PIC X(10).
    05  ORDER-TOTAL-AMT           PIC S9(7)V99 COMP-3.
    05  SHIPPING-CHARGE           PIC S9(3)V99 COMP-3.
    05  TAXES-PAID                PIC S9(5)V99 COMP-3.
    05  LINE-ITEM-STATUS          PIC X.
    05  PRODUCT-ID                PIC X(09).
    05  QUANTITY-ORDERED          PIC S9(3) COMP-3.
    05  UNIT-PRICE                PIC S9(5)V99 COMP-3.
    05  COLOR-CODE                PIC X(02).
    05  DISC-PCT                  PIC SV9(3) COMP-3.
    05  LINE-ITEM-AMOUNT          PIC S9(7)V99 COMP-3.
    05  LINE-ITEM-TAX             PIC S9(3)V99 COMP-3.
    05  ITEM-ORDER-DATE.
        10  ITEM-ORDER-YY         PIC X(02).
        10  FILLER                PIC X.
        10  ITEM-ORDER-MM         PIC X(02).
        10  FILLER                PIC X.
        10  ITEM-ORDER-DD         PIC X(02).
    05  ITEM-SHIP-DATE            PIC X(08).
    05  QUANTITY-SHIPPED          PIC S9(03)
    05  RECEIVED-DATE             PIC X(08).
    05  BACK-ORDER-QUANTITY       PIC S9(3)
    05  BACK-ORDER-DATE           PIC S9(5)
    05  BACK-ORDER-SHIP-DATE      PIC S9(5)
    05  RETURN-DATE               PIC X(08).
    05  RETURN-QUANTITY           PIC S9(3)
    05  RETURN-REASON-CODE        PIC X(02).
ORDITEMP.CFD
01  SLS-ORD-ITEM-TEMP.
    05  PRODUCT-ID                PIC X(09).
    05  ORDER-YY                  PIC X(02).
    05  ORDER-MM                  PIC X(02).
    05  ORDER-DATE                PIC X(08).
    05  ORDER-NUMBER              PIC X(10).
    05  CUSTOMER-ID               PIC X(10).
    05  SALES-REP-ID              PIC X(08).
    05  COLOR-DESC                PIC X(10).
    05  ITEM-SHIP-DATE            PIC X(08).
    05  RECEIVED-DATE             PIC X(08).
    05  LINE-ITEM-STATUS          PIC X.
    05  QUANTITY-ORDERED          PIC S9(3) COMP-3.
    05  QUANTITY-SOLD             PIC S9(3) COMP-3.
    05  UNIT-PRICE                PIC S9(5)V99 COMP-3.
    05  DISC-AMT                  PIC S9(3)V99 COMP-3.
    05  LINE-ITEM-ORDER-AMT       PIC S9(7)V99 COMP-3.
    05  LINE-ITEM-SALES-AMT       PIC S9(7)V99 COMP-3.
    05  BACK-ORDER-QUANTITY       PIC S9(03) COMP-3.
    05  BACK-ORDER-DATE           PIC S9(05) COMP-3.
    05  BACK-ORDER-SHIP-DATE      PIC S9(05) COMP-3.
    05  RETURN-DATE               PIC X(08).
    05  RETURN-REASON-CODE        PIC X(02).
PRODSALE.CFD
01  PRODUCT-SALES.
    10  PRODUCT-ID                PIC X(09).
    10  ORDER-YY                  PIC X(2).
    10  ORDER-MM                  PIC X(2).
    10  EXTRACT-DATE              PIC X(10).
    10  ORDER-STATUS              PIC X(1).
    10  ORDER-NUMBER              PIC X(10).
    10  CUSTOMER-ID               PIC X(10).
    10  SALES-REP-ID              PIC X(8).
    10  PROD-DESC                 PIC X(20).
    10  COLOR-DESC                PIC X(10).
    10  ITEM-SHIP-DATE            PIC X(8).
    10  RECEIVED-DATE             PIC X(8).
    10  QUANTITY-ORDERED          PIC S9(3) COMP-3.
    10  QUANTITY-SOLD             PIC S9(3) COMP-3.
    10  UNIT-PRICE                PIC S9(5)V99 COMP-3.
    10  DISC-AMT                  PIC S9(3)V99 COMP-3.
    10  ITEM-ORDER-AMT            PIC S9(7)V99 COMP-3.
    10  ITEM-SALES-AMT            PIC S9(5)V99 COMP-3.
    10  BACK-ORDER-QUANTITY       PIC S9(3) COMP-3.
    10  BACK-ORDER-DATE           PIC X(10).
    10  BACK-ORDER-SHIP           PIC X(10).
    10  RETURN-DATE               PIC X(8).
    10  RETURN-REASON-CODE        PIC X(2).
MTHSALES.CFD
01  MONTHLY-PRODUCT-SALES.
    10  PRODUCT-ID                PIC X(09).
    10  ORDER-YY                  PIC X(2).
    10  ORDER-MM                  PIC X(2).
    10  PROD-DESC                 PIC X(20).
    10  AVG-QTY-ORDERED           PIC S9(3) COMP-3.
    10  AVG-QTY-SOLD              PIC S9(3) COMP-3.
    10  AVG-UNIT-PRICE            PIC S9(5)V99 COMP-3.
    10  AVG-DISC-AMT              PIC S9(3)V99 COMP-3.
    10  GROSS-ORDER_AMT           PIC S9(7)V99 COMP-3.
    10  ACT-SALES_AMT             PIC S9(7)V99 COMP-3.
    10  BACK-ORDER-QUANTITY       PIC S9(3) COMP-3.