DataStage

Essentials

Hashed Files
Server Edition
Module Objectives

Upon module completion, students will be able to:

• Define what a hashed file is
• List different types of hashed files
• List various uses for hashed files
• Create hashed files
• Source hashed files
• Use caching attributes to accelerate reads and writes
• Import metadata from hashed files
• Delete hashed files
What is a Hashed File?

• A DataStage file written to the file system
• Most types use a hashing algorithm based on key column values
• Files can be cataloged in the project’s VOC file for easy retrieval
Types of Hashed Files

• 21 different types (1-19, 25, and 30)
• Types 1 and 19 do not use a hashing algorithm
• Types 2-18 are static hashed files and use a hashing algorithm
• Type 25 is static and uses a B-tree algorithm
• Type 30 is dynamic and uses a hashing algorithm
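
The type list above can be restated as a small lookup. This is only an illustrative Python sketch; the function name and category labels are made up here and are not part of DataStage.

```python
def classify_hashed_file_type(file_type: int) -> str:
    """Return the category of a hashed file type number, per the list above."""
    if file_type in (1, 19):
        return "non-hashed"          # no hashing algorithm
    if 2 <= file_type <= 18:
        return "static hashed"       # static, uses a hashing algorithm
    if file_type == 25:
        return "static B-tree"       # static, uses a B-tree algorithm
    if file_type == 30:
        return "dynamic hashed"      # dynamic, uses a hashing algorithm
    raise ValueError(f"{file_type} is not a hashed file type")


# The valid type numbers: 1-19, plus 25 and 30.
VALID_TYPES = [t for t in range(1, 31) if t <= 19 or t in (25, 30)]
print(len(VALID_TYPES))  # 21
```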
Uses of Hashed Files

• Good for locally storing tables from a remote database that will be read frequently
• Good as an intermediate file location in sequences of jobs
• Main use is as a reference lookup table
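
To make the reference lookup idea concrete, here is a minimal Python sketch in which a plain dictionary stands in for a hashed file keyed on a customer ID. All names and data are hypothetical; in a real job the lookup is configured in the job design rather than written by hand.

```python
# Hypothetical reference data, keyed the way a hashed file is keyed.
region_by_customer = {"C001": "East", "C002": "West"}


def add_region(row, lookup, default="UNKNOWN"):
    """Return a copy of the row with the looked-up region added."""
    enriched = dict(row)
    enriched["region"] = lookup.get(row["customer_id"], default)
    return enriched


orders = [
    {"order_id": 1, "customer_id": "C002"},
    {"order_id": 2, "customer_id": "C999"},  # no match in the reference data
]
enriched_orders = [add_region(order, region_by_customer) for order in orders]
print(enriched_orders)
```

Each incoming row costs one keyed access, which is why a keyed file suits this role.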
Hashing Algorithms
(Static File Types)

Static file type, by character type of the key and by the location where most variation occurs in the key column:

Character type          Right   Middle   Left   Any
Wholly numeric            2       6       10     14
Numeric & separators      3       7       11     15
ASCII                     4       8       12     16
Any                       5       9       13     17
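
The table follows a regular pattern, so the type number can be computed from the row and column positions. The Python sketch below only encodes the table above; the function and constant names are illustrative.

```python
CHARACTER_TYPES = ["wholly numeric", "numeric & separators", "ascii", "any"]
VARIATION_LOCATIONS = ["right", "middle", "left", "any"]


def static_file_type(character_type: str, variation_location: str) -> int:
    """Look up the static file type (2-17) from the table above."""
    row = CHARACTER_TYPES.index(character_type.lower())
    col = VARIATION_LOCATIONS.index(variation_location.lower())
    # Each column holds four consecutive types, starting at type 2.
    return 2 + 4 * col + row


print(static_file_type("Numeric & separators", "Middle"))  # 7
print(static_file_type("ASCII", "Left"))                   # 12
```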
Inserting Records into Hashed Files

[Diagram: the hashing algorithm assigns each record to one of Group 1 through Group 5]
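
The idea in the diagram can be sketched in Python as follows. The hash function here is a simple stand-in chosen for illustration; it is not the algorithm DataStage uses, and the names are hypothetical.

```python
def group_for_key(key: str, modulus: int) -> int:
    """Map a key to a group number in 1..modulus (illustrative hash only)."""
    hashed = 0
    for byte in key.encode("utf-8"):
        hashed = (hashed * 31 + byte) % 2**32
    return hashed % modulus + 1


# Distribute a few records across five groups, as in the diagram.
groups = {number: [] for number in range(1, 6)}
for key in ["C001", "C002", "C003", "C004", "C005", "C006"]:
    groups[group_for_key(key, 5)].append(key)
print(groups)
```

Because the same key always hashes to the same group, a later read only has to search that one group.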


Overflow Groups

• When there is not enough space remaining in a group, the group overflows

Group address            2048     4096     6144     8192     10240    12288
Contents        Header   Group 1  Group 2  Group 3  Group 4  Group 5  Overflow (Group 2)
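
A rough Python model of the layout on this slide is sketched below. It assumes 2048-byte groups (taken from the address spacing), that the header occupies the first 2048 bytes, and that an overflow block is appended at the end of the file; the class and attribute names are hypothetical.

```python
GROUP_SIZE = 2048  # bytes per group, matching the address spacing on the slide


class StaticHashedFile:
    def __init__(self, modulus: int):
        self.modulus = modulus
        # Each group starts as a chain holding one block at its own address.
        self.chains = {g: [g * GROUP_SIZE] for g in range(1, modulus + 1)}
        self.free_in_last_block = {g: GROUP_SIZE for g in range(1, modulus + 1)}
        # Assumption: overflow blocks are appended after the last primary group.
        self.next_free_address = (modulus + 1) * GROUP_SIZE

    def insert(self, group: int, record_size: int) -> None:
        if record_size > GROUP_SIZE:
            raise ValueError("this sketch does not model oversized records")
        if record_size > self.free_in_last_block[group]:
            # Not enough space remaining: the group overflows into a new block.
            self.chains[group].append(self.next_free_address)
            self.next_free_address += GROUP_SIZE
            self.free_in_last_block[group] = GROUP_SIZE
        self.free_in_last_block[group] -= record_size


hashed_file = StaticHashedFile(modulus=5)
for _ in range(5):
    hashed_file.insert(group=2, record_size=500)  # the fifth record does not fit
print(hashed_file.chains[2])  # [4096, 12288]
```

Reading a record from Group 2 may now require visiting two blocks instead of one, which is why heavy overflow slows a hashed file down.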
Creating Hashed Files

• Insert the name of the hashed file; parameters can be used
• Let the job create the hashed file
• Optionally, use the CREATE.FILE command from a DataStage command shell or program
Options for Creating Hashed Files

• Which type of file to create
• How many groups to create initially
• How large the groups will be
• At what percentage of file capacity a new group is created
• At what percentage of file capacity a group is removed
• Whether to delete an existing file first (similar to dropping a table)
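
The two percentage thresholds can be pictured with the Python sketch below. It is a simplified model, not DataStage's implementation: the parameter names, the default values, and the one-group-at-a-time adjustment are all assumptions made for illustration.

```python
def adjust_modulus(data_bytes: int, modulus: int, group_size: int = 2048,
                   split_load: int = 80, merge_load: int = 50,
                   minimum_modulus: int = 1) -> int:
    """Return the group count after comparing file load with both thresholds."""
    load = 100 * data_bytes / (modulus * group_size)  # percent of capacity used
    if load > split_load:
        return modulus + 1        # file is too full: create a new group
    if load < merge_load and modulus > minimum_modulus:
        return modulus - 1        # file is too empty: remove a group
    return modulus


print(adjust_modulus(9000, 5))  # 6 (about 88% full, above the split threshold)
print(adjust_modulus(6000, 5))  # 5 (about 59% full, between the thresholds)
print(adjust_modulus(4000, 5))  # 4 (about 39% full, below the merge threshold)
```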
Hashed File Locations

• Create and/or write records to a hashed file in a specific project; it does not have to be your own
• Create and/or write records to a hashed file in a specific directory; this can present issues
Write Caching

• Enabled from the Input page of the hashed file stage
• Records are first written to a memory buffer and then flushed to disk once
• Disk I/O is extremely expensive; write caching minimizes writes to disk
• Can present issues when the same hashed file is being read at the same time
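
The following Python sketch illustrates the behavior described above, including why a concurrent reader can miss records. Dictionaries stand in for the memory buffer and the file on disk, and every name is hypothetical.

```python
class WriteCachedFile:
    def __init__(self):
        self.disk = {}        # stands in for the hashed file on disk
        self.buffer = {}      # in-memory write cache
        self.disk_writes = 0  # number of times the disk was touched

    def write(self, key, record):
        self.buffer[key] = record  # held in memory, no disk I/O yet

    def flush(self):
        if self.buffer:
            self.disk.update(self.buffer)  # one bulk write
            self.disk_writes += 1
            self.buffer.clear()

    def read_from_disk(self, key):
        # A concurrent reader sees only what has been flushed.
        return self.disk.get(key)


cached_file = WriteCachedFile()
for key in ["A", "B", "C"]:
    cached_file.write(key, {"id": key})

before_flush = cached_file.read_from_disk("A")  # None: still in the buffer
cached_file.flush()
after_flush = cached_file.read_from_disk("A")   # now visible
print(before_flush, after_flush, cached_file.disk_writes)
```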
Importing Metadata from a Hashed File

• Choose the project from the drop-down list
• Available hashed files within the project appear; select one or more
Sourcing a Hashed File

• Insert the name of the hashed file or use the drop-down list; parameters can be used
• Enable or disable read caching; four methods are available
Read Caching

• Enabled from the Output page of the hashed file stage
• Four methods are available; be sure to choose the correct one
• Allows records to be read into a memory buffer
• Disk I/O is extremely expensive; read caching minimizes reads from disk
• Can present issues when the same hashed file is being written to at the same time
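
A minimal Python sketch of the trade-off follows. It models only a single "preload everything" style of caching and does not represent the four methods the stage offers; the class and its names are hypothetical.

```python
class HashedFileReader:
    def __init__(self, disk, preload=False):
        self.disk = disk          # stands in for the hashed file on disk
        self.disk_reads = 0
        self.cache = None
        if preload:
            self.cache = dict(disk)  # snapshot taken once, up front
            self.disk_reads += 1     # modeled as a single bulk read

    def lookup(self, key):
        if self.cache is not None:
            return self.cache.get(key)  # served from memory
        self.disk_reads += 1            # one disk access per lookup
        return self.disk.get(key)


disk = {"C001": "East"}
cached = HashedFileReader(disk, preload=True)
uncached = HashedFileReader(disk)

disk["C002"] = "West"  # another job writes to the file after the preload

stale_result = cached.lookup("C002")    # None: the snapshot predates the write
fresh_result = uncached.lookup("C002")  # "West"
print(stale_result, fresh_result)
```

The cached reader avoids repeated disk access but misses the later write, which is the kind of issue the last bullet warns about.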
Deleting Hashed Files

• Manually deleting the operating system directories and files is not advisable
• Issue the DELETE.FILE command from DataStage Administrator or a DataStage command shell
• Create a shell script that runs the delete_file.exe program and supplies the hashed file name
  – The delete_file.exe program is installed with the server and is located in the bin subdirectory of the server’s engine directory
