
DataStage Administrator and Director

Basic

C3: Protected

About the Author


Created By: Mandhagini P.S (127057)
Credential Information: An expert in DataStage with 3 years of IT experience
Version and Date: DS/PPT/1106/1.0

Copyright 2005, Cognizant Academy, All Rights Reserved

Icons Used
Questions

Hands-on Exercise

A Welcome Break

Test Your Understanding

Coding Standards

Reference

Demo

Key Contacts


DataStage Administrator and Director: Overview


Introduction:
DataStage is a widely used data warehousing (DW) tool for developing complex ETL jobs. It supports real-time integration and provides a user-friendly interface. DataStage also has many features that make back-end querying easier.
DataStage Administrator lets you set up DataStage projects and carry out general administration of DataStage.
DataStage Director lets you run, schedule, and monitor jobs, and view the job log after a job has run.


DataStage Administrator and Director: Objectives


Objective:
After completing this chapter, you will be able to:
Identify what the DataStage tool is
Define DataStage Administrator
Work with DataStage Administrator
Explain DataStage Director
Work with DataStage Director


DataStage Administrator: Logging In


Logging in to a DataStage server using the Administrator requires the host name of the server (fully qualified if necessary) or the server's IP address, and an operating system username and password.
For UNIX servers, users who log in as root, as a root-equivalent account, or as dsadm have full administrative rights.
For Windows servers, users who are members of the Local Administrators group (standalone server) or the Domain Administrators group (domain controller or servers in an Active Directory forest) have full administrative rights.


DataStage Administrator: Logging In (Contd.)


The Administrator Login Dialog Box

Enter the hostname or IP address of the server where DataStage is installed

Enter your operating system username and password


Viewing the Project List


This page lists the DataStage projects and shows the pathname of the selected project in the Project pathname field. The Projects page has the following buttons:
Add: Adds new DataStage projects. This button is enabled only if you have administrator status.
Delete: Deletes projects. This button is enabled only if you have administrator status.
Properties: Views or sets the properties of the selected project.
NLS: Lets you change project maps and locales (available only if the NLS option was installed during the server installation).
Command: Issues DataStage Engine commands directly from the selected project.


Adding Projects
Provided that you have the proper permissions, you can add as many projects to the DataStage server as necessary. In normal projects any DataStage developer can create, delete, or modify any object within the project once it has been created.

Tip: By default, projects are created under the root directory of the DataStage server installation. For example, if the server was installed to /appl/Ascential/DataStage, the projects would be installed to /appl/Ascential/DataStage/Projects/{project name}.
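The path rule in the tip can be sketched as a simple path join. This is only an illustration: the helper name default_project_path and the project name dwh_dev are hypothetical, and the function just builds a string; it does not talk to a DataStage server.

```python
import posixpath


def default_project_path(install_root: str, project_name: str) -> str:
    """Build the default project directory under a DataStage install root.

    Hypothetical helper: it mirrors the {install root}/Projects/{project name}
    layout described in the tip and does nothing else.
    """
    return posixpath.join(install_root, "Projects", project_name)


# Example install root taken from the tip; "dwh_dev" is a made-up project name.
print(default_project_path("/appl/Ascential/DataStage", "dwh_dev"))
```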

Deleting Projects
Highlight the project to be deleted

Make sure you have a current backup of your project, just in case!


General Project Options


Enable job administration in Director - enabling this feature lets the user run Cleanup Resources and Clear Status File from the Job menu of DataStage Director.
Enable Runtime Column Propagation for Parallel Jobs - if you enable this feature, stages in parallel jobs can handle undefined columns that they encounter when the job is run, and propagate these columns through to the rest of the stages in the job.
Auto-purge of job log - this setting automatically purges job log entries according to the auto-purge action setting. For example, if you choose to auto-purge up to the previous 3 job runs, only the entries for the 3 most recent job runs are kept as new job runs complete.


General Project Options (Contd.)

Auto-purge settings for job logs: not a global or retroactive setting

Create environment variables


Setting Project-wise Environment Variables


You can set project-wide defaults for general environment variables or ones specific to parallel jobs from this page. You can also specify new variables. All of these are then available to be used in jobs. In each of the categories except User Defined, only the default value can be modified. In the User Defined category, users can create new environment variables and assign default values.



Enable Server-Side Job Tracing


You can trace the activities on the server to help diagnose project problems.

Enable or disable tracing in the project

View or delete the currently highlighted file

Trace files that have been created


Validating User Account for Job Scheduling


This tab applies to Windows NT/2000 servers only. DataStage uses the Windows NT Schedule service to schedule jobs.

Select a user account with proper access to the DataStage project

Verification that the currently selected user account can schedule jobs


Performance Tuning Options


Some performance tuning options are:
Row buffering
Hashed file stage caching


Server Commands
Select a project and click Command

Enter a valid DataStage command

When you execute the command, a new window will show the response from the engine


Assigning Roles (Operator/Developer) to User Accounts


There are four roles for a DataStage user account:
DataStage Developer: Has full access to all areas of a DataStage project.
DataStage Production Manager: Has full access to all areas of a DataStage project, and can also create and manipulate protected projects.
DataStage Operator: Has permission to run and manage DataStage jobs.
<None>: Does not have permission to log on to DataStage.
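The four roles can be pictured as a lookup from role to capabilities. The sketch below is a toy model, not how DataStage stores or checks permissions: the capability names (log_on, run_jobs, edit_project, manage_protected_projects) are hypothetical labels chosen to match the role descriptions above.

```python
from enum import Enum


class Role(Enum):
    DEVELOPER = "DataStage Developer"
    PRODUCTION_MANAGER = "DataStage Production Manager"
    OPERATOR = "DataStage Operator"
    NONE = "<None>"


# Hypothetical capability labels, derived only from the role descriptions.
PERMISSIONS = {
    Role.DEVELOPER: {"log_on", "run_jobs", "edit_project"},
    Role.PRODUCTION_MANAGER: {
        "log_on",
        "run_jobs",
        "edit_project",
        "manage_protected_projects",
    },
    Role.OPERATOR: {"log_on", "run_jobs"},
    Role.NONE: set(),
}


def can(role: Role, capability: str) -> bool:
    """Return True if the role includes the given (hypothetical) capability."""
    return capability in PERMISSIONS[role]


# An operator can run jobs but cannot manipulate protected projects.
print(can(Role.OPERATOR, "run_jobs"))
print(can(Role.OPERATOR, "manage_protected_projects"))
```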


Assigning Roles (Operator/Developer) to User Accounts (Contd.)

Select the user role to assign to the selected user account.


Settings for Parallel Jobs


Enable Runtime Column Propagation for Parallel Jobs
When this feature is enabled, stages in parallel jobs can handle undefined columns that they encounter when the job is run, and propagate these columns through to the rest of the job.

Enable Remote Execution of Parallel Jobs


Select this to specify that parallel jobs in this project are to be deployed on a USS (UNIX System Services) machine. When this option is selected, the Remote tab is enabled and you can specify details about the jobs that are deployed.


Settings for Parallel Jobs (Contd.)

Enable these options.



DataStage Director: Logging In


Logging in to a DataStage server using the Director requires the host name of the server (fully qualified if necessary) or the server's IP address, and an operating system username and password.


DataStage Director: Logging In (Contd.)


The Director Login Dialog Box
Enter the hostname or IP address of the server where DataStage is installed

Enter your operating system username and password

Select the project to attach to


Viewing the Job Run Status


The Job Status view shows the status of all the jobs in the currently selected job category or, if the job category pane is hidden, in the current project. The view has the following columns:
Job name: The name of the job.
Status: The status of the job.
Started on date: The time and date the job was started. These fields are filled in only for a job with a status of Running.
Last ran on date: The time and date the job finished, was stopped, or aborted. These columns are blank for jobs that have never been run.
Description: A description of the job, if available.
To view more details about a job's status, select the job and do one of the following:
Choose View > Detail.
Right-click to display the shortcut menu and choose Detail.
Double-click the job.

Viewing the Job Run Status (Contd.)

Detailed information about a job's status


Validating a Job
You can check that a job or job invocation will run successfully by validating it. Jobs should be validated before they are run for the first time, or after any significant change to job parameters. When a server job is validated, the following checks are made without actually extracting, converting, or writing data:
Connections are made to the data sources or data warehouse.
SQL SELECT statements are prepared.
Files are opened. Intermediate files in Hashed File, UniVerse, or ODBC stages that use the local data source are created if they do not already exist.


Validating a Job (Contd.)

Click Validate when Job Run Options and parameters have been set

Running a Job

Click Run when Job Run Options, parameters and tracing options have been set

Monitoring a Job

Expand tree to see all links attached to an active stage

Optionally show CPU utilization for each active stage


Stopping a Job

Click Stop button to stop a running job


Resetting a Job
If a job has stopped or aborted, it is difficult to determine whether all the required data was written to the target data tables. When a job has a status of Stopped or Aborted, you must reset it before running it again. Resetting a job sets it back to a runnable state and, optionally, returns your target files to the state they were in before the job was run. To reset a job or job invocation:
1. Select the job or invocation you want to reset in the Job Status view.
2. Choose Job > Reset or click the Reset button on the toolbar. A message box appears.
3. Click Yes to reset the tables. All the files in the job are reinstated to the state they were in before the job was run.
The job's status is updated to Has been reset.
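The reset rule can be sketched as a tiny status check. This models only what the text states (Stopped and Aborted jobs must be reset, and a reset job shows Has been reset); the function names are hypothetical, and leaving other statuses unchanged on reset is an assumption made for the sketch, not documented Director behavior.

```python
# Statuses that, per the text, must be reset before the job can run again.
NEEDS_RESET = {"Stopped", "Aborted"}


def can_run(status: str) -> bool:
    """Encode only the rule from the text: Stopped/Aborted jobs cannot be rerun."""
    return status not in NEEDS_RESET


def reset(status: str) -> str:
    """Return the status after a reset.

    Assumption for this sketch: statuses other than Stopped/Aborted are
    returned unchanged.
    """
    if status in NEEDS_RESET:
        return "Has been reset"
    return status


status = "Aborted"
if not can_run(status):
    status = reset(status)
print(status)
```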

Resetting a Job (Contd.)

Click Reset button to return a job to a runnable state


Interpreting the Job Execution Details in Log View

Current run: black

Previous run: blue

Additional information is available for this entry ()



Log Event Detail Window


Detail information can be copied to the system clipboard and pasted into a text editor, which is useful for sending errors to support!

Additional lines of information regarding this particular event


Filtering Log Events


Where to start showing log entries

Where to stop showing log entries

What type of log entries to show

How many log entries to show


Clearing Log Entries

Immediately delete log entries or automatically purge entries

Which entries to remove immediately

Which entries to remove automatically


Clearing Log Entries (Contd.)


Options in Auto-Purge:
Up to previous (job runs): Purges old log entries, leaving the specified number of recent job run entries in the file.
Older than (days): Purges all log entries older than the specified number of days.
Specify the number of job run entries or days by clicking the arrow buttons or entering the value directly.
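The two auto-purge options differ only in what they count: job runs or days. A minimal sketch of both policies follows, assuming a hypothetical in-memory LogEntry record; it illustrates the selection logic only and does not reflect how DataStage stores or purges its logs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log record: which run wrote it, when, and the text."""
    run_id: int
    timestamp: datetime
    message: str


def purge_up_to_previous(entries, keep_runs):
    """Keep only the entries that belong to the `keep_runs` most recent runs."""
    recent_runs = sorted({e.run_id for e in entries}, reverse=True)[:keep_runs]
    keep = set(recent_runs)
    return [e for e in entries if e.run_id in keep]


def purge_older_than(entries, days, now):
    """Keep only the entries written within the last `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [e for e in entries if e.timestamp >= cutoff]


# Five runs on five consecutive days, one entry each (made-up data).
log = [LogEntry(n, datetime(2005, 11, n), f"run {n} finished") for n in range(1, 6)]

print([e.run_id for e in purge_up_to_previous(log, keep_runs=3)])
print([e.run_id for e in purge_older_than(log, days=2, now=datetime(2005, 11, 6))])
```

Passing `now` explicitly keeps the day-based policy deterministic and easy to test.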


Schedule View


Scheduling a Job Execution


You can schedule a job to run in a number of ways:
Once today at a specified time
Once tomorrow at a specified time
On a specific day and at a particular time
Daily at a particular time
On the next occurrence of a particular date and time
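As an illustration of the "Daily at a particular time" option, the sketch below computes when such a schedule would next fire. The function next_daily_run is hypothetical, and rolling over to tomorrow when today's time has already passed is an assumption about the intended behavior, not a description of how the scheduler is implemented.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time) -> datetime:
    """Return the next datetime at which a daily schedule set to `at` fires.

    Assumption: if today's slot has already passed (or is exactly now),
    the next run is tomorrow at the same time.
    """
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


now = datetime(2005, 11, 6, 10, 0)
print(next_daily_run(now, time(11, 0)))  # later today
print(next_daily_run(now, time(9, 0)))   # already passed, so tomorrow
```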


Scheduling a Job Execution (Contd.)

Select a job and click Schedule button


Rescheduling a Job Execution

Select a previously scheduled job and click Reschedule button


Un-scheduling a Job Execution

Right click on a previously scheduled job and click Unschedule


Cleaning Up Resources
If the Enable Job Administration in Director option has been set in the DataStage Administrator, certain functions are available to help you clean up the resources of a job that has hung or aborted, or to return a job to a state in which you can rerun it once the cause of the problem has been fixed. Use them with care, and only after you have tried to reset the job and are sure it has hung or aborted. The Cleanup Resources command lets you:
View and end job processes
View and release the associated locks


Cleaning Up Resources (Contd.)


Operating system's process ID number

Logout (kill) selected O/S process

Engine locks associated with processes


Clearing the Status File

Select a hung job and select Clear Status File from Job menu


Clearing the Status File (Contd.)


Before you clear a status file you should:
Try to reset the job.
Ensure that all the job's processes have ended.


Allow time for questions from participants


Test Your Understanding


What is the use of having User Defined Environment Variables?
Can a DataStage operator manipulate a protected project?
What is the default cache size for a hashed file?
When will Clear Status File be enabled in Director?
What does () in the job log mean?
Where do you see the CPU utilization of each stage in a job?


DataStage Administrator and Director: Summary


DataStage is an ETL tool widely used in data warehousing.
It has 4 components: Administrator, Director, Designer, and Manager.
Administrator can be used to:
  Create or delete projects
  Assign roles to user accounts
  Set project-specific environment variables
  Enable tracing and performance tuning
Director can be used to:
  View job statistics
  Validate, run, monitor, stop, reset, and schedule jobs
  View logs, filter log events, and clear log entries
  Clean up job resources

DataStage Administrator and Director: Source


DataStage 7.5.1 manual

Disclaimer: Parts of the content of this course are based on materials available from the websites and books listed above. Materials that can be accessed from linked sites are not maintained by Cognizant Academy, and we are not responsible for their contents. All trademarks, service marks, and trade names in this course are the marks of their respective owner(s).

You have successfully completed DataStage Administrator and Director.
