IBM WebSphere DataStage and QualityStage

Version 8 Release 1

Parallel Job Developer Guide

LC18-9891-01

Note: Before using this information and the product that it supports, read the information in “Notices” on page 645.

© Ascential Software Corporation 2001, 2005.

© Copyright International Business Machines Corporation 2006, 2008. All rights reserved. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Chapter 1. WebSphere DataStage parallel jobs . . . 1

Chapter 2. Designing parallel jobs . . . 3
Parallel processing . . . 3
  Pipeline parallelism . . . 3
  Partition parallelism . . . 4
  Combining pipeline and partition parallelism . . . 5
  Repartitioning data . . . 5
Parallel processing environments . . . 6
The configuration file . . . 7
Partitioning, repartitioning, and collecting data . . . 8
  Partitioning . . . 8
  Collecting . . . 17
  Repartitioning data . . . 22
  The mechanics of partitioning and collecting . . . 23
Sorting data . . . 25
Data sets . . . 25
Metadata . . . 26
  Runtime column propagation . . . 26
  Table definitions . . . 26
  Schema files and partial schemas . . . 27
  Data types . . . 27
  Strings and ustrings . . . 30
  Complex data types . . . 30
  Date and time formats . . . 31
Incorporating server job functionality . . . 37

Chapter 3. Parallel Jobs and NLS . . . 39
Maps and Locales in WebSphere DataStage Parallel Jobs . . . 39
Using Maps in Parallel Jobs . . . 39
  Character Data in Parallel Jobs . . . 39
  Specifying a Project Default Map . . . 40
  Specifying a Job Default Map . . . 40
  Specifying a Stage Map . . . 40
  Specifying a Column Map . . . 41
Using Locales in Parallel Jobs . . . 41
  Specifying a Project Default Locale . . . 41
  Specifying a Job Default Locale . . . 42
  Specifying a Stage Locale . . . 42
Defining Date/Time and Number Formats . . . 42
  Specifying Formats at Project Level . . . 42
  Specifying Formats at Job Level . . . 42
  Specifying Formats at Stage Level . . . 43
  Specifying Formats at Column Level . . . 43

Chapter 4. Stage editors . . . 45
Showing stage validation errors . . . 49
The stage page . . . 49
  General tab . . . 50
  Properties tab . . . 50
  Advanced tab . . . 52
  Link ordering tab . . . 52
  NLS Map tab . . . 54
  NLS Locale tab . . . 54
Inputs page . . . 55
  General tab . . . 55
  Properties tab . . . 55
  Partitioning tab . . . 55
  Format tab . . . 57
  Columns Tab . . . 58
  Advanced tab . . . 69
Output page . . . 70
  General tab . . . 70
  Properties tab . . . 70
  Format tab . . . 70
  Columns tab . . . 71
  Mapping tab . . . 72
  Advanced tab . . . 73

Chapter 5. Reading and Writing Files . . . 75
Data set stage . . . 75
  Data Set stage: fast path . . . 75
  Data Set stage: Stage page . . . 76
  Data Set stage: Input page . . . 76
  Data Set stage: Output page . . . 79
Sequential file stage . . . 79
  Example of writing a sequential file . . . 81
  Example of reading a sequential file . . . 82
  Sequential File stage: fast path . . . 83
  Sequential File stage: Stage page . . . 84
  Sequential File stage: Input page . . . 85
  Sequential File stage: Output page . . . 95
File set stage . . . 106
  File Set stage: fast path . . . 107
  File Set stage: Stage page . . . 108
  File Set stage: Input page . . . 108
  File Set stage: Output page . . . 119
Lookup file set stage . . . 128
  Lookup File Set stage: fast path . . . 130
  Lookup File Set stage: Stage page . . . 130
  Lookup File Set stage: Input page . . . 131
  Lookup File Set stage: Output page . . . 135
External source stage . . . 136
  External Source stage: fast path . . . 136
  External Source stage: Stage page . . . 137
  External Source stage: Output page . . . 138
External Target stage . . . 146
  External Target stage: fast path . . . 147
  External Target stage: Stage page . . . 148
  External Target stage: Input page . . . 148
  Using RCP with External Target stages . . . 158
Complex Flat File stage . . . 159
  Editing a Complex Flat File stage as a source . . . 159
  Editing a Complex Flat File stage as a target . . . 166
  Reject links . . . 167

Chapter 6. Processing Data . . . 169
Transformer stage . . . 169
  Transformer stage: fast path . . . 170
  Transformer editor components . . . 170
  Transformer stage basic concepts . . . 172
  Editing Transformer stages . . . 173
  The WebSphere DataStage expression editor . . . 180
  Transformer stage properties . . . 184
BASIC Transformer stages . . . 188
  BASIC Transformer stage: fast path . . . 189
  BASIC Transformer editor components . . . 189
  BASIC Transformer stage basic concepts . . . 191
  Editing BASIC transformer stages . . . 192
  The WebSphere DataStage expression editor . . . 200
  BASIC Transformer stage properties . . . 202
Aggregator stage . . . 205
  Example . . . 206
  Aggregator stage: fast path . . . 208
  Aggregator stage: Stage page . . . 209
  Aggregator stage: Input page . . . 215
  Aggregator stage: Output page . . . 216
Join stage . . . 217
  Join versus lookup . . . 219
  Example joins . . . 219
  Join stage: fast path . . . 221
  Join stage: Stage page . . . 221
  Join stage: Input page . . . 223
  Join stage: Output page . . . 225
Merge Stage . . . 225
  Example merge . . . 227
  Merge stage: fast path . . . 228
  Merge stage: Stage page . . . 228
  Merge stage: Input page . . . 230
  Merge stage: Output page . . . 232
Lookup Stage . . . 233
  Lookup Versus Join . . . 234
  Example Look Up . . . 234
  Lookup stage: fast path . . . 235
  Lookup Editor Components . . . 238
  Editing Lookup Stages . . . 240
  Lookup Stage Properties . . . 243
  Lookup Stage Conditions . . . 246
  Range lookups . . . 247
  The WebSphere DataStage Expression Editor . . . 248
Sort stage . . . 250
  Examples . . . 252
  Sort stage: fast path . . . 256
  Sort stage: Stage page . . . 256
  Sort stage: Input page . . . 260
  Sort stage: Output page . . . 261
Funnel Stage . . . 262
  Examples . . . 263
  Funnel stage: fast path . . . 266
  Funnel stage: Stage page . . . 266
  Funnel stage: Input page . . . 268
  Funnel stage: Output page . . . 270
Remove Duplicates Stage . . . 270
  Example . . . 271
  Remove Duplicates stage: fast path . . . 272
  Remove Duplicates stage: Stage page . . . 272
  Remove Duplicates stage: Input page . . . 274
  Remove Duplicates stage: Output page . . . 275
Compress stage . . . 276
  Compress stage: fast path . . . 277
  Compress stage: Stage page . . . 277
  Compress stage: Input page . . . 278
  Compress stage: Output page . . . 279
Expand Stage . . . 280
  Expand stage: fast path . . . 280
  Expand stage: Stage page . . . 280
  Expand stage: Input page . . . 281
  Expand stage: Output page . . . 282
Copy stage . . . 282
  Example . . . 283
  Copy stage: fast path . . . 288
  Copy stage: Stage page . . . 289
  Copy stage: Input page . . . 289
  Copy stage: Output page . . . 291
Modify stage . . . 292
  Examples . . . 292
  Modify stage: fast path . . . 294
  Modify stage: Stage page . . . 294
  Modify stage: Input page . . . 305
  Modify stage: Output page . . . 307
Filter Stage . . . 307
  Specifying the filter . . . 308
  Filter stage: fast path . . . 310
  Filter stage: Stage page . . . 311
  Filter stage: Input page . . . 312
  Filter stage: Output page . . . 314
External Filter stage . . . 314
  External Filter stage: fast path . . . 315
  External Filter stage: Stage page . . . 315
  External Filter stage: Input page . . . 316
  External Filter stage: Output page . . . 318
Change Capture stage . . . 318
  Example Data . . . 319
  Change Capture stage: fast path . . . 320
  Change Capture stage: Stage page . . . 321
  Change Capture stage: Input page . . . 324
  Change Capture stage: Output page . . . 326
Change Apply stage . . . 326
  Example Data . . . 328
  Change Apply stage: fast path . . . 329
  Change Apply stage: Stage page . . . 329
  Change Apply stage: Input page . . . 332
  Change Apply stage: Output page . . . 334
Difference stage . . . 334
  Example data . . . 335
  Difference stage: fast path . . . 336
  Difference stage: Stage page . . . 337
  Difference stage: Input page . . . 340
  Difference stage: Output page . . . 342
Compare stage . . . 342
  Example Data . . . 343
  Compare stage: fast path . . . 344
  Compare stage: Stage page . . . 345
  Compare stage: Input page . . . 347
  Compare stage: Output page . . . 349
Encode Stage . . . 349
  Encode stage: fast path . . . 349
  Encode stage: Stage page . . . 350
  Encode stage: Input page . . . 350

  Encode stage: Output page . . . 352
Decode stage . . . 352
  Decode stage: fast path . . . 353
  Decode stage: Stage page . . . 353
  Decode stage: Input page . . . 354
  Decode stage: Output page . . . 355
Switch stage . . . 355
  Example . . . 355
  Switch stage: fast path . . . 357
  Switch stage: Stage page . . . 357
  Switch stage: Input page . . . 360
  Switch stage: Output page . . . 361
FTP Enterprise Stage . . . 362

Generic stage . . . 372
Surrogate Key Generator stage . . . 376

Slowly Changing Dimension stage . . . 378

<