Version 8 Release 5
SC18-9894-02
Note: Before using this information and the product that it supports, read the information in "Notices and trademarks" on page 53.
Copyright IBM Corporation 1997, 2010. US Government Users Restricted Rights Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
Chapter 1. The InfoSphere DataStage and QualityStage Director client
  Starting the Director client
  The Director client window
    Repository pane
    Display area
    Menu bar
    Toolbar
    Status bar
  Job Status view
    Job states
    Job status details
  Shortcut menus
    Shortcut menus in the Job Status view
    Shortcut menus in the Job Log view
    Shortcut menus in the Job Schedule view
    Shortcut menus in the repository pane
    Shortcut menu in the Monitor window
  Filtering the Job Status or Job Schedule view
    Examples of filtering by job name
  Finding text
  Sorting columns
  Printing the current view
    What Is in the Printout?
    Changing the printer setup
  DataStage Director options
    General page
    Limits page
    View page
    Priority page
  Choosing an alternative project
    Viewing jobs in another project
    Viewing jobs on a different engine tier
  Exiting the IBM InfoSphere DataStage Director client

  Multiple job invocations
    Creating multiple job invocations
    Running a job invocation
    Viewing the job log for an invocation
  Setting tracing options
  Enabling operational metadata at the job level (parallel and server jobs)
  Disabling message handlers

Product accessibility
Accessing product documentation
Links to non-IBM Web sites
Notices and trademarks
Contacting IBM
Index
Repository pane
The left pane of the Director client window is the repository pane. The repository pane displays the repository tree, which lists folders and sub-folders that contain parallel, sequence, and server jobs. The jobs in the currently selected folder are listed in the display area. You can hide the repository pane by choosing View > Show Folders.
Display area
The display area is the main part of the Director window. There are three views:
• Job Status. The default view, which appears in the right pane of the Director window. It displays the status of all jobs in the folder currently selected in the repository tree. If you hide the repository pane, the Job Status view includes a Folder column that shows the folder name, and displays the status of all jobs in the current project, regardless of what folder they are in. See "Job Status view" for more information.
• Job Schedule. Displays a summary of scheduled jobs and batches in the currently selected folder. If the repository pane is hidden, the display area shows all scheduled jobs and batches, regardless of what folder they are in. See "Job Batches" for a description of this view. To switch to the Job Schedule view, choose View > Schedule, or click the Schedule button on the toolbar.
• Job Log. Displays the log file for a job chosen from the Job Status view or the Job Schedule view. The repository pane is always hidden in this view. See "The Job Log File" for more details. To switch to this view, choose View > Log, or click the Log button on the toolbar.
Menu bar
The menu bar has six pull-down menus that give access to all the functions of the Director.
• Project. Opens an alternative project and sets up printing.
• View. Displays or hides the toolbar, status bar, buttons, or repository pane, specifies the sorting order, changes the view, filters entries, shows further details of entries, and refreshes the screen.
• Search. Starts a text search dialog box.
• Job. Validates, runs, schedules, stops, and resets jobs, purges old entries from the job log file, deletes unwanted jobs, cleans up job resources (if the administrator has enabled this option), and allows you to set default job parameter values. You can also start the resource estimation tool.
• Tools. Monitors running jobs, manages job batches, starts the Designer client, and allows you to manage data sets.
• Help. Invokes the Help system. You can also get help from any screen or dialog box in the Director.
Toolbar
The toolbar gives quick access to the main functions of the Director. The toolbar is displayed by default, but can be hidden by choosing View > Toolbar or by changing the Director options. See "DataStage Director options" for more details. To display ToolTips, let the cursor rest on a button in the toolbar. From left to right the buttons are:
• Open project
• Print view
• Job status
• Job schedule
• Job log
• Find
• Sort - ascending
• Sort - descending
• Resource estimation
• Run a job
• Stop a job
• Reset a job
• Schedule a job
• Reschedule a job
• Help
Status bar
The status bar is displayed at the bottom of the Director window. It displays the following information:
• The name of a job (if you are displaying the Job Log view).
• The number of entries in the display. If you look at the Job Status or Job Schedule view and use the Filter Entries... command, this panel specifies the number of lines that meet the filter criteria. If you have set a filter, then (filtered) or (limited) is displayed.
• The date and time on the engine.
Note: Under certain circumstances, the number of entries in the display is replaced by the last error message issued by the engine. The message disappears when the screen is refreshed.
Description
If you hover your mouse pointer over a job icon, you can view a tooltip that shows a sketch of the job. The sketch shows the job as it appears in the Designer client.
Job states
The Status column in the Job Status view displays the current status of the job. The possible job states are given in the following table:
Table 2. Job States
Job State
    Description
Compiled
    The job has been compiled but has not been validated or run since compilation.
Not compiled
    The job is under development and has not been compiled successfully.
Running
    The job is currently being run, reset, or validated.
Finished
    The job has finished.
Finished (see log)
    The job has finished but warning messages were generated or rows were rejected. View the log file for more details.
Stopped
    The job was stopped by the operator.
Aborted
    The job finished prematurely.
Validated OK
    The job has been validated with no errors.
Validated (see log)
    The job has been validated but warning messages were generated or rows were rejected. View the log file for more details.
Failed validation
    The job has been validated, but an error was found.
Has been reset
    The job has been successfully reset.
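If you process job status values outside the Director (for example, from a printed or copied Job Status view), it can help to model the states as a small enumeration. The following is a minimal sketch, not a DataStage API: the `JobState` class and the `classify` function are hypothetical names, and the normal/abnormal grouping follows the Job status options of the Filter dialog box described later in this chapter.

```python
from enum import Enum


class JobState(Enum):
    """Job states as shown in the Status column (hypothetical model)."""
    COMPILED = "Compiled"
    NOT_COMPILED = "Not compiled"
    RUNNING = "Running"
    FINISHED = "Finished"
    FINISHED_SEE_LOG = "Finished (see log)"
    STOPPED = "Stopped"
    ABORTED = "Aborted"
    VALIDATED_OK = "Validated OK"
    VALIDATED_SEE_LOG = "Validated (see log)"
    FAILED_VALIDATION = "Failed validation"
    HAS_BEEN_RESET = "Has been reset"


# Grouping used by the "Terminated normally" filter option.
TERMINATED_NORMALLY = {
    JobState.FINISHED,
    JobState.VALIDATED_OK,
    JobState.COMPILED,
    JobState.HAS_BEEN_RESET,
}

# Grouping used by the "Terminated abnormally" filter option.
TERMINATED_ABNORMALLY = {
    JobState.ABORTED,
    JobState.STOPPED,
    JobState.FAILED_VALIDATION,
    JobState.FINISHED_SEE_LOG,
    JobState.VALIDATED_SEE_LOG,
}


def classify(state):
    """Return 'normal', 'abnormal', or 'other' for a JobState."""
    if state in TERMINATED_NORMALLY:
        return "normal"
    if state in TERMINATED_ABNORMALLY:
        return "abnormal"
    return "other"  # Running and Not compiled belong to neither group
```

For example, `classify(JobState("Finished (see log)"))` yields `"abnormal"`, because that state signals warnings or rejected rows.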
Use Copy to copy the whole window or selected text to the Clipboard for use elsewhere. Click Next or Previous to display status details for the next or previous job in the list. Click Close to close the dialog box.
Shortcut menus
The Director client has shortcut menus that are displayed when you right-click in the display area or repository pane. The menu you see depends on the view or window you are using, and what is highlighted in the window when you click the mouse.
• Use Find to search for text in the display area (Find... and Find Next)
• Filter (limit) the jobs listed in the display area (Filter...)
• Refresh the display (Refresh)
• Display details of an entry in the Job Schedule view (available only if an entry is selected) (Detail)
• Delete an entry (Delete)
Wildcard/Pattern   Description
?                  Matches any single character.
*                  Matches zero or more characters.
#                  Matches a single digit.
[charlist]         Matches any single character in charlist.
[!charlist]        Matches any single character not in charlist.
[a-z]              Matches any single character in the range a-z.

4. Choose which jobs to exclude from the view by clicking either the No jobs or Jobs matching option button in the Exclude area. If you select Jobs matching, enter a string in the Jobs matching field. Only jobs that match this string will be excluded. The string definition is the same as in step 3.
5. Specify the status of the jobs you want to display by clicking an option button in the Job status area.
   • All lists jobs that have any status.
   • All, except "Not compiled" lists jobs with any status except Not compiled.
   • Terminated normally lists jobs with a status of Finished, Validated, Compiled, or Has been reset.
   • Terminated abnormally lists jobs with a status of Aborted, Stopped, Failed validation, Finished (see log), or Validated (see log).
6. Click OK to activate the filter. The updated view displays the jobs that meet the filter criteria. The status bar indicates that the entries have been filtered.
Example 1
Table 4. Example 1
Job Names     "Include" Filter   Job View
job2input     job2*              job2input
job2output                       job2output
job3input
job3output
Example 2
Continuing Example 1, if you also specify *input as an "Exclude" filter, the Job Status view shows only job2output.
Example 3
Table 5. Example 3
Job Names     "Include" Filter   Job View
A3tires       [A-E]3*            A3tires
A3valves                         A3valves
B3tires                          B3tires
B3valves                         B3valves
F3tires
F3valves
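The three examples can be reproduced outside the Director with a short script. The following is a sketch only: `wildcard_to_regex` and `filter_jobs` are hypothetical helper names, and the code assumes that a pattern must match the whole job name and that matching is case-sensitive, neither of which is stated in this guide.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a filter pattern (?, *, #, [charlist]) into a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            parts.append(".")        # any single character
        elif ch == "*":
            parts.append(".*")       # zero or more characters
        elif ch == "#":
            parts.append("[0-9]")    # a single digit
        elif ch == "[":
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end] if end != -1 else ""
            negate = body.startswith("!")
            chars = body[1:] if negate else body
            if chars:
                # Escape characters that are special inside a regex class.
                for special in ("\\", "^", "["):
                    chars = chars.replace(special, "\\" + special)
                parts.append("[" + ("^" if negate else "") + chars + "]")
                i = end              # skip past the closing bracket
            else:
                parts.append(re.escape(ch))  # unmatched "[": treat literally
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def filter_jobs(names, include="*", exclude=None):
    """Keep names that match include and do not match exclude."""
    inc = wildcard_to_regex(include)
    exc = wildcard_to_regex(exclude) if exclude else None
    return [
        name for name in names
        if inc.fullmatch(name) and not (exc and exc.fullmatch(name))
    ]


jobs = ["job2input", "job2output", "job3input", "job3output"]
example1 = filter_jobs(jobs, include="job2*")
example2 = filter_jobs(jobs, include="job2*", exclude="*input")
example3 = filter_jobs(
    ["A3tires", "A3valves", "B3tires", "B3valves", "F3tires", "F3valves"],
    include="[A-E]3*",
)
```

Under these assumptions, `example1`, `example2`, and `example3` hold the same job names as the Job View columns of Examples 1 to 3.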
Finding text
If there are many entries in the display area, you can use Find to search for a particular job or event. You start Find in one of three ways:
• Choose Search > Find... .
• Choose Find... from the shortcut menu.
• Click the Find button on the toolbar.
The Find dialog box appears. The Look in field shows the currently selected folder. If the repository pane is hidden, and the display area lists jobs from all folders, the Look in field specifies All Folders. You cannot edit this field.
To use Find:
1. Enter text in the Find what field. This could be a date, time, status, or the name of a job.
   Note: If the text entered matches any portion of the text in any column, this constitutes a match.
2. If the displayed entry must match the case of the text you entered, select the Match Case check box. The default setting is cleared.
3. Choose the search direction by clicking the Up or Down option button. The default setting is Down.
4. Click Find Next. The display columns are searched to find the entered text. The first occurrence of the text is highlighted in the display. The text can appear in any column or row of the display area.
5. Click Find Next again to search for the next occurrence of the text.
6. Click Cancel to close the Find dialog box.
Note: You can also use Search > Find Next to search for an entry in the display. If there is a search string in the Find dialog box, Find Next acts in the same way as the Find Next button on the Find dialog box. If there is no search string in the Find dialog box, this option displays the Find dialog box where you must enter a search string.
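The matching rule used by Find (a match on any portion of the text in any column, optional case matching, and an Up or Down direction) can be sketched as follows. This is an illustration only: `find_next` is a hypothetical function, rows are modeled as tuples of column strings, and the sketch assumes the search stops at the first or last row rather than wrapping around, which this guide does not specify.

```python
def find_next(rows, text, start=-1, match_case=False, direction="down"):
    """Return the index of the next row that contains text, or None.

    rows       -- list of tuples, one string per display column
    start      -- index of the currently highlighted row; use -1 when
                  searching down from the top and len(rows) when
                  searching up from the bottom
    match_case -- mirrors the Match Case check box (cleared by default)
    direction  -- "down" (the default) or "up"
    """
    if not text:
        return None
    needle = text if match_case else text.lower()
    step = 1 if direction == "down" else -1
    i = start + step
    while 0 <= i < len(rows):
        for cell in rows[i]:
            haystack = cell if match_case else cell.lower()
            if needle in haystack:   # any portion of any column matches
                return i
        i += step
    return None


rows = [
    ("LoadSales", "Finished"),
    ("LoadStock", "Aborted"),
    ("loadsales2", "Running"),
]
first = find_next(rows, "sales")             # case ignored by default
again = find_next(rows, "sales", start=first)
```

Calling the function again with `start` set to the previous result corresponds to clicking Find Next a second time.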
Sorting columns
You can organize the entries in the display area by sorting the columns in ascending or descending order.
The column currently being used for sorting is indicated by a symbol in the column title:
• > indicates the sort is in ascending order.
• < indicates the sort is in descending order.
To sort a column, do any one of the following:
• Click the column title. This selects the column for sorting and toggles between ascending and descending order.
• Click the Ascending or Descending button on the toolbar.
• Choose View > Ascending or View > Descending.
If you choose a column that contains a date or a time, both the date and time columns are sorted together.
• For the Job Status view, the printout contains the current status for each job in the project, and the date and time the job was last run.
• For the Job Schedule view, the printout contains an entry for each scheduled job in the project specifying when the job is scheduled to run.
• For the Job Log view, the printout can include information about each event in the job log file. For more information about the job log file, see "The Job Log File."
General page
From the General page you can:
• Enable or disable automatic engine update
• Change the engine update interval
• Compare the client and engine tier host times
• Save window settings
Note that if you choose a long refresh time, the status displayed in the Director window might not represent what is happening on the engine. For example, if you start a run, the job status might not update to Running until a whole refresh interval has elapsed. Conversely, if you choose a refresh time that is too short, the Director requests information from the engine more often than is productive. You must find a value between these extremes that meets your update requirements. If you have a large project, you might wish to disable automatic refresh altogether by clearing the Enabled check box. You can refresh manually by using View > Refresh.
Limits page
The Limits page sets the maximum number of rows to process in a job run, and the maximum number of warning messages to allow before a job aborts. The row limit applies to all server jobs in the current session. You can override the settings for an individual job when it is validated, run, or scheduled.
warnings can be logged before the job terminates. It is even possible for the job to complete if the warning limit is reached just before the end of the job run.
View page
The options on the View page determine what is displayed in the Director client window. The check boxes in the Show area are selected by default:
Toolbar
    Displays the toolbar.
Status bar
    Displays the status bar.
Date and time
    Displays the date and time (of the engine tier host) on the status bar.
Icons
    Displays the icons in the views.
Specify the view to display when the Director is started by clicking the appropriate option button:
• Status of jobs (the default setting)
• Schedule
• Log for last job
Priority page
When jobs are running, the performance of the Director client might be noticeably slower. You can improve the performance by changing the priority of the Director process.
These tasks are performed from the Job Status view in the Director window. To switch to this view, choose View > Status, or click the Status button on the toolbar.
If the job's designer included help text for the job parameters, you can get help by selecting the parameter and clicking Property Help. You can also use this dialog box to set values for environment variables that affect parallel job runs. When you design the job, you can add environment variables to the list of job parameters; this dialog box then asks you to supply values for those variables for this run. Environment variables are identified by a $ sign. When setting a value for an environment variable, you can specify one of the following special values:
• $ENV. The value for the environment variable is retrieved from the environment in which the job is run.
• $PROJDEF. The current project default, as defined in the Administrator client, is used.
• $UNSET. The environment variable is unset.
Note: The dialog box displays a Parameters page only if the job has parameters.
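The effect of the three special values can be illustrated with a small lookup function. This is a sketch of the behavior described above, not of how the engine implements it: `resolve_env_value`, the `run_environment` and `project_defaults` dictionaries, and the variable name `$EXAMPLE_VAR` are all hypothetical.

```python
def resolve_env_value(name, value, run_environment, project_defaults):
    """Return the effective value of an environment variable parameter.

    name             -- parameter name, with or without the leading "$"
    value            -- the value entered for this run
    run_environment  -- variables of the environment the job runs in
    project_defaults -- project defaults (set in the Administrator client)
    Returns None when the variable ends up unset.
    """
    key = name.lstrip("$")
    if value == "$ENV":
        return run_environment.get(key)    # take it from the run environment
    if value == "$PROJDEF":
        return project_defaults.get(key)   # take the project default
    if value == "$UNSET":
        return None                        # explicitly unset
    return value                           # an ordinary literal value


env = {"EXAMPLE_VAR": "from-environment"}
defaults = {"EXAMPLE_VAR": "from-project"}
from_env = resolve_env_value("$EXAMPLE_VAR", "$ENV", env, defaults)
from_project = resolve_env_value("$EXAMPLE_VAR", "$PROJDEF", env, defaults)
```

Any value other than the three special strings is passed through unchanged in this model.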
Validating a job
You can check that a job or job invocation will run successfully by validating it. Jobs should be validated before running them for the first time, or after making any significant changes to job parameters.
When a server job is validated, the following checks are made without actually extracting, converting, or writing data:
• Connections are made to the data sources or data warehouse.
• SQL SELECT statements are prepared.
• Files are opened. Intermediate files in Hashed File, UniVerse, or ODBC stages that use the local data source are created, if they do not already exist.
When a parallel job is validated, the job is run in "check only" mode so data is not affected.
To validate a job:
1. Select the job or job invocation you want to validate in the Job Status view.
2. Choose Job > Validate... . The Job Run Options dialog box appears. See "Setting Job Options."
3. Fill in the job parameters as required.
4. Click Validate. Click OK to acknowledge the message. The job is validated and the job's status is updated to Running.
   Note: It might take some time for the job status to be updated, depending on the load on the engine tier host and the refresh rate for the client.
Once validation is complete, the job's status displays one of these status messages:
• Validated OK. You can now schedule or run the job.
• Failed validation. You need to view the job log file for details of where the validation failed. For more details, see "The Job Log File."
If you want to monitor the validation in progress, you can use a Monitor window. For more information, see "Monitoring Jobs."
Running a job
You can run a job in two ways:
• Immediately.
• By scheduling it to run at a later time or date. See "Job scheduling" for how to do this.
If you run a job immediately, you must ensure that the data sources and data warehouse are accessible, and that other users on your system will not be affected by the job run.
To run a job immediately:
1. Select the job or job invocation in the Job Status view.
2. Do one of the following:
   • Choose Job > Run Now... .
   • Click the Run button on the toolbar.
   The Job Run Options dialog box appears. See "Setting Job Options."
3. Fill in the job parameters and check warning and row limits for the job, as appropriate.
4. Optionally, click Validate to validate the job.
5. Click Run. The job is scheduled to run with the current date and time and the job's status is updated to Running.
   Note: It might take some time for the job status to be updated, depending on the load on the engine tier host and the refresh rate for the client.
All types of job in your project (other than mainframe jobs) are displayed in the Director repository pane. This can include Web services jobs, which are distinguished by having a separate icon and being grayed out. If you attempt to run a Web service-enabled job from within the Director, a message asks whether you really want to run the job from the Director client (that is, not from the Web services console).
Stopping a job
To stop a job that is currently running:
1. Select it in the Job Status view.
2. Do one of the following:
   • Choose Job > Stop.
   • Click the Stop button on the toolbar.
The job or invocation is stopped, regardless of the stage currently being processed, and the job's status is updated to Stopped.
Note: It might take some time for the job status to be updated, depending on the load on the engine tier host and the refresh rate for the client.
Resetting a job
If a job has stopped or aborted, it is difficult to determine whether all the required data was written to the target data tables.
When a job has a status of Stopped or Aborted, you must reset it before running the job again. By resetting a job, you set it back to a runnable state and, optionally (on server jobs only), return your target files to the state they were in before the job was run.
Note: You can only reinstate server job sequential files and hashed files to a pre-run state if the backup option has been chosen on the corresponding stage in the job design.
If you want to undo the updates performed during a successful job, you can also use the Reset command for jobs with a status of Finished. The Reset command is not available for jobs with a status of Not compiled or Running.
To reset a job or job invocation:
1. Select the job or invocation you want to reset in the Job Status view.
2. Choose Job > Reset or click the Reset button on the toolbar. A message box appears.
3. Click Yes to reset the tables. All the files in the job are reinstated to the state they were in before the job was run. The job's status is updated to Has been reset.
   Note: It might take some time for the job status to be updated, depending on the load on the engine tier host and the refresh rate for the client.
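The rules about when Reset applies can be summarized in a few lines. The following sketch only restates the rules in this section; `can_reset` and `needs_reset_before_run` are hypothetical helper names and take the status text exactly as it appears in the Status column.

```python
# Statuses for which the Reset command is not available.
RESET_UNAVAILABLE = {"Not compiled", "Running"}

# Statuses that must be reset before the job can be run again.
MUST_RESET_BEFORE_RUN = {"Stopped", "Aborted"}


def can_reset(status):
    """True if the Reset command is available for a job in this status."""
    return status not in RESET_UNAVAILABLE


def needs_reset_before_run(status):
    """True if the job must be reset before it is run again."""
    return status in MUST_RESET_BEFORE_RUN


checks = {
    status: (can_reset(status), needs_reset_before_run(status))
    for status in ("Aborted", "Finished", "Running")
}
```

A Finished job can be reset (to undo its updates) but does not have to be, whereas an Aborted job must be reset before its next run.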
In this case you can take one of the following actions:
• Run Job. The sequence is re-executed, using the checkpoint information to ensure that only the required components are re-executed.
• Reset Job. All the checkpoint information is cleared, ensuring that the whole job sequence will be run when you next specify run job.
Note: If, during sequence execution, the flow diverts to an error handling stage, IBM InfoSphere DataStage does not checkpoint anything more. This is to ensure that stages in the error handling path will not be skipped if the job is restarted and another error is encountered.
2. Choose Job > Set Defaults... . The Set Job Parameter Defaults dialog box appears.
3. If defaults have been set in the Designer for this job, they are displayed. Edit them to override them.
Job scheduling
You can schedule a job to run in a number of ways:
• Once today at a specified time
• Once tomorrow at a specified time
• On a specific day and at a particular time
• Daily at a particular time
• On the next occurrence of a particular date and time
Each job can be scheduled to run on any number of occasions, by using different job parameters if necessary. For example, you can schedule a job to run at different times on different days. The scheduled jobs are displayed in the Job Schedule view.
Note: Windows restricts job scheduling to administrators, so you must be logged on as a Windows administrator to use the IBM InfoSphere DataStage scheduling features. Microsoft has published a workaround for this restriction; visit http://support.microsoft.com/directory/ and look up article Q124859 for details.
W = Wednesday
Th = Thursday
F = Friday
S = Saturday
Su = Sunday
For example, Every Th&F means the job is scheduled to run every Thursday and Friday.
Every n&x
    n is a date and x is a day of the week (as above). For example, Every 10&Su means the job is scheduled to run on every 10th day of the month and every Sunday.
Today
    The job is run today at the specified time.
Tomorrow
    The job will run tomorrow at the specified time.
Next n
    n is a date (as above). For example, Next 28 means the job is run on the next 28th of the month.
Next x
    x is a day of the week. For example, Next W means the job is run the next Wednesday in the month.
Next n&x
    n is a number and x is a day of the week. For example, Next 5&12&T means the job is scheduled to run on the next 5th and 12th day of the month, and the next Tuesday.
The At time column lists the time at which the job will run. This is displayed in the system's current time format: 12-hour or 24-hour clock.
The Parameters/Description column lists the parameters required to run the job. Each job has built-in job parameters which must be entered when you schedule or run a job. The entered values are displayed in this column in the following format:
    parameter1 name = value, parameter2 name = value, ...
A brief description appears here if there is a short description defined and there are no job parameters.
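The notation used in the To be run column can be decoded mechanically. The following sketch parses the forms described above into dates and day names. `parse_to_be_run` is a hypothetical function, and the abbreviation M = Monday is an assumption (T = Tuesday is taken from the Next 5&12&T example; the remaining abbreviations are listed above).

```python
# Day abbreviations; "M" is assumed, the others appear in this section.
DAYS = {
    "M": "Monday",
    "T": "Tuesday",
    "W": "Wednesday",
    "Th": "Thursday",
    "F": "Friday",
    "S": "Saturday",
    "Su": "Sunday",
}


def parse_to_be_run(entry):
    """Split an entry such as 'Every 10&Su' into keyword, dates, and days."""
    words = entry.split()
    if not words:
        raise ValueError("empty schedule entry")
    keyword = words[0]
    result = {"keyword": keyword, "dates": [], "days": []}
    if keyword in ("Today", "Tomorrow"):
        return result
    if keyword not in ("Every", "Next") or len(words) != 2:
        raise ValueError(f"unrecognized schedule entry: {entry!r}")
    for token in words[1].split("&"):
        if token.isdigit():
            result["dates"].append(int(token))   # day of the month
        elif token in DAYS:
            result["days"].append(DAYS[token])   # day of the week
        else:
            raise ValueError(f"unrecognized token: {token!r}")
    return result


weekly = parse_to_be_run("Every Th&F")
mixed = parse_to_be_run("Next 5&12&T")
```

Here `weekly` lists Thursday and Friday with no dates, and `mixed` lists the 5th and 12th together with Tuesday, matching the examples in the text.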
This field...
    Contains this information...
Project
    The name of the project and the computer that hosts the engine tier.
Schedule #
    The schedule number the job has been assigned.
Occurrences
    The number of times the job will be run using this schedule. A value of Repeats means that the job is continuously rescheduled.
Job name
    The name of the job.
Run time
    The time the job is set to run, in 24-hour format.
Run date
    The date the job is set to run.
Job parameters
    The job parameters. Each entry in this field is in the format parameter name=value.
    Note: The parameter name displayed here is the name used internally by the job, not the descriptive parameter name you see when you enter job parameter values.
Use Copy to copy the schedule details and job parameters to the Clipboard for use elsewhere. Click Next or Previous to display schedule details for the next or previous job in the list. These buttons are only active if the next or previous job is scheduled to run. Click Close to close the window.
Scheduling a job
To schedule a job:
1. Select the job or job invocation you want to schedule in the Job Status or Job Schedule view.
   Note: You cannot schedule a job with a status of Not compiled or a Web service-enabled job.
2. Do one of the following to display the Add to schedule dialog box:
   • Choose Job > Add to Schedule... .
   • Choose Add To Schedule... from the appropriate shortcut menu.
   • Click the Schedule button on the toolbar.
3. Choose when to run the job by clicking the appropriate option button:
   Today
       Runs the job today at the specified time (in the future).
   Tomorrow
       Runs the job tomorrow at the specified time.
   Every
       Runs the job on the chosen day or date at the specified time in this month and repeats the run at the same date and time in the following months.
   Next
       Runs the job on the next occurrence of the day or date at the specified time.
   Daily
       Runs the job every day at the specified time.
4. If you selected Every or Next in step 3, choose the day to run the job by doing one of the following:
   • Choose an appropriate day or days from the Day list.
   • Choose a date from the calendar.
   Note: If you choose an invalid date, for example, 31 September, the behavior of the scheduler depends upon the operating system of the computer that hosts the engine tier, and you might not receive a warning of the invalid date. Refer to your documentation for the engine tier host for further information.
5. Choose the time to run the job. There are two time formats:
   • 12-hour clock. Click either AM or PM.
   • 24-hour clock. Click 24H Clock.
   Click the arrow buttons to increase or decrease the hours and minutes, or enter the values directly.
6. Click OK. The Add to schedule dialog box closes and the Job Run Options dialog box appears.
7. Fill in the job parameter fields and check warning and row limits, as appropriate.
8. Click Schedule. The job is scheduled to run and is added to the Job Schedule view.
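Because the scheduler might not warn you about an invalid date, it can be worth checking a date yourself before you enter it. The following sketch uses the Python standard library for that check; `is_valid_day_of_month` is a hypothetical helper and says nothing about what the scheduler on a particular engine tier host does with an invalid date.

```python
import calendar


def is_valid_day_of_month(year, month, day):
    """True if the given day exists in the given month and year."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(year, month)[1]
    return 1 <= day <= days_in_month


september_31 = is_valid_day_of_month(2010, 9, 31)   # September has 30 days
october_31 = is_valid_day_of_month(2010, 10, 31)
```

The check takes the year into account, so 29 February is accepted only in leap years.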
Unscheduling a job
If you want to prevent a job from running at the scheduled time, you must unschedule it.
To unschedule a job:
1. Select the job you want to unschedule in the Job Schedule view.
2. Do one of the following:
   • Choose Job > Unschedule.
   • Choose Unschedule from the Job shortcut menu.
If the job is not scheduled to run at another time, the job status is updated to Not scheduled in the To be run column, and is not run again until you add it to the schedule.
Rescheduling a job
If you have a job scheduled to run, but you want to change the frequency, day, or time it is run, you can reschedule it.
To reschedule a job:
1. Select the job you want to reschedule in the Job Schedule view.
2. Do one of the following to display the Add to schedule dialog box:
   • Choose Job > Reschedule... .
   • Choose Reschedule... from the Job shortcut menu.
   • Click the Reschedule button on the toolbar.
   The current settings for the job are shown in the dialog box.
3. Edit the frequency, day, or time you want the job to run.
4. Click OK. The Add to schedule dialog box closes and the Job Run Options dialog box appears.
5. Fill in the job parameters and check warning and row limits as appropriate.
6. Click Reschedule. The job is rescheduled and the To be run column in the Job Schedule view is updated.
Deleting a job
You can remove unwanted or old versions of jobs from your project as follows:
1. Select the job or job invocation in the Job Status view. You can make multiple selections.
2. Choose Job > Delete. A message asks you to confirm that you want to delete the chosen job or jobs.
3. Click Yes to delete the jobs. A message confirms they have been deleted.
4. Click OK. The job design and all the associated components used at run time are deleted, including the files and records used by the Job Log view and the Monitor window.
5. If you delete a job that is part of a batch, edit the batch to remove the deleted job to prevent the batch from failing. See "Job Batches."
Job administration
From the Administrator client, the administrator can enable job administration commands in the Director client that let you clean up the resources of a job that has hung or aborted. These commands help you return the job to a state in which you can rerun it after the cause of the problem has been fixed. You should use them with care, and only after you have tried to reset the job and you are sure it has hung or aborted. There are two job administration commands:
• Cleanup Resources
• Clear Status File
PID #
    The process identification number.
Context
    The process context. In a job with more than one active stage, a PID might be reused during a job run. In that case the context field might have entries for more than one active stage. The context will always be listed as Unavailable if the Show All option button is selected.
User Name
    The identity of the user whose job started the process.
Last Command Processed
    The command executed most recently by the process.
In the Locks area:
This column...
    Displays this information...
PID/User #
    The identification number of the process associated with the lock.
Lock Type
    The type of lock: File, Record, or Group.
Item Id
    The identity of the item (record) locked by the process. For a Group lock, this column is left blank.
If this procedure fails to end a process that you believe is causing a job to hang, try the following steps:
4. Log out of all IBM InfoSphere DataStage clients.
5. Try to end the process by using the Windows Task Manager or kill the process in UNIX.
6. Stop and restart the InfoSphere DataStage Server Engine.
7. Reset the job from the Director (see "Resetting a job").
If there is a problem with a job, you can also release locks (see the next section), or clear the job status file (see Clearing a Job Status File).
Releasing locks
On UNIX or Linux systems, you must be user dsadm or root in order to release locks.
To release locks:
1. From the Job Resources dialog box, choose the range of locks to list by doing either of the following:
   • Click the Show by job option button in the Locks area.
   • Select a process in the Processes area, then click the Show by process option button.
   Note: You cannot release locks if you have clicked the Show All option button in the Locks area.
2. Click Release All. Each of the displayed locks is unlocked and the display updates automatically. (You cannot select or release individual locks.)
You can refresh the display manually at any time by clicking Refresh.
A job invocation can be invoked regardless of the state of other invocations which are processing different data sets. If you run a job in the Director without giving an invocation Id, you cannot create any new invocations of that job until the job has finished. If you want to run several invocations of the same job at the same time, you must give an invocation Id for the first invocation. The job designer should ensure that the job is suitable to have multiple invocations run. For example, an unsuitable job might have different invocations running concurrently and writing to the same table. An unsuitable job might also adversely affect job performance. Multiple invocations of a job should not be confused with the several instances of the same job that you get when running a partitioned parallel job across several processors. In the latter case, the partitioning and collecting built into the job handle the situation where several processes want to read or write to the same data source.
If you select myjob and switch to the log view, you will see the log for myjob and both of the invocations of myjob. A column named Invocation in the log identifies the invocation that generated each log entry.
If you select myjob.invk1 or myjob.invk2 and switch to the log view, you will see only the job log for that particular invocation of the job.
Enabling operational metadata at the job level (parallel and server jobs)
If operational metadata is not enabled at the project level, you can enable it for individual jobs in the project by selecting Generate operational metadata on the General tab of the Job Run Options dialog box in the Director client. If operational metadata is enabled at the project level, you can disable it for individual jobs by clearing the selection.
Job Design level (from the Designer), or for individual job runs (from the Director). You can define a handler from the Director, or from the Designer. For more details see Message Handlers. Message handlers are disabled on the General page of the Job Run Options dialog box. To disable message handling:
- Select Disable project-level message handling to disable the handler that has been defined in the Administrator to apply to all jobs in the project.
- Select Disable compiled-in job-level message handling to disable a local handler that has been compiled into this particular job.
The message handlers are disabled only for this particular job run.
Table 6. The Monitor window: stage selected (continued)

This column...    Contains this information...
Num Rows          The number of rows of data processed so far by each stage on its primary input.
Started at        The time the processing started on the engine.
Elapsed time      The elapsed time since processing started.
Rows/sec          The number of rows processed per second.
%CP               The percentage of CPU the stage is using (you can turn the display of this column on and off from the shortcut menu - see Showing CPU Usage).
If you are monitoring a parallel job, and have not chosen to view instance information (see below), the monitor displays information for the job as follows:
- If a stage is running in parallel, then x N is appended to the stage name, where N gives how many instances are running.
- If a stage is running in parallel, then the Num Rows column shows the total number of rows processed by all instances. The Rows/sec value is derived from this total and shows the combined throughput of all instances.
- If a stage is running in parallel, then the %CP value might be more than 100 if there are multiple CPUs on the engine tier host. For example, on a machine with four CPUs, %CP could be as high as 400 where a stage is occupying 100% of each of the four processors. Alternatively, if the stage is occupying only 25% of each processor, the %CP would be 100.
To monitor instances of parallel jobs individually, choose Show Instances from the shortcut menu. The monitor then shows each instance of a stage as a sub-branch under the 'parent' stage. Only relevant information is shown for each stage instance.
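The aggregation described above can be sketched as follows. This is a minimal illustration, assuming that the values shown for a parallel stage are simple sums over its instances; the class, function, and field names are hypothetical and are not part of any InfoSphere DataStage interface.

```python
from dataclasses import dataclass


@dataclass
class InstanceStats:
    """Hypothetical per-instance figures for one stage instance."""
    rows: int            # rows processed so far by this instance
    cpu_percent: float   # share of a single CPU used by this instance


def summarize_stage(name, instances, elapsed_seconds):
    """Combine per-instance figures into the single row shown for the stage."""
    total_rows = sum(i.rows for i in instances)
    total_cpu = sum(i.cpu_percent for i in instances)
    # "x N" is appended only when the stage runs as more than one instance.
    label = f"{name} x {len(instances)}" if len(instances) > 1 else name
    rows_per_sec = total_rows / elapsed_seconds if elapsed_seconds > 0 else 0.0
    return {
        "stage": label,
        "num_rows": total_rows,
        "rows_per_sec": rows_per_sec,
        "cp": total_cpu,  # can exceed 100 on a multi-CPU host
    }


# Four instances, each saturating one CPU of a four-CPU host.
busy = summarize_stage("Transformer", [InstanceStats(1000, 100.0)] * 4, 10)
# Four instances, each using a quarter of one CPU.
light = summarize_stage("Transformer", [InstanceStats(1000, 25.0)] * 4, 10)
```

In this sketch, `busy["cp"]` is 400 and `light["cp"]` is 100, matching the four-CPU example in the text.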
Table 7. The Monitor window: job instance selected

This column...    Contains this information...
Status            The status of each stage. The possible states are:
                    Aborted     The processing terminated abnormally at this stage.
                    Finished    All data has been processed by the stage.
                    Ready       The stage is ready to process the data.
                    Running     Data is being processed by the stage.
                    Starting    The processing is starting.
                    Stopped     The processing was stopped manually at this stage.
                    Waiting     The stage is waiting to start processing.
Num Rows          The number of rows of data processed so far by each stage on its primary input.
%CP               The percentage of CPU the instance is using (you can turn the display of this column on and off from the shortcut menu - see Showing CPU Usage).
If you select a link in the tree (either under a stage or a stage instance) the following information is shown:
Table 8. The Monitor window: link selected

This column...    Contains this information...
Link Type         The type of link as follows:
                    <<Pri    primary input link
                    <Ref     reference input link
                    >Out     output link
                    >Rej     output link for rejected rows
Num Rows          The number of rows of data processed so far by each stage on its primary input.
Rows/sec          The number of rows processed per second.
The status bar at the bottom of the window displays the name and status of the job, the name of the project and the engine tier host, and the current time on the engine tier host.
If you are looking at an instance of a parallel job stage, the stage name is given as Stage.Instance, for example CTransformerSatge1xz.2. The fields in the window cannot be edited.

This field...    Displays this information...
Project          The name of the project and the computer that hosts the engine tier.
Job name         The name of the job.
Stage            The name of the stage (or stage and instance).
Status           The status of the stage. This is one of the states described in The Monitor Window.
Started at       The date and time (on the engine) the processing started at this stage.
Ended at         The date and time (on the engine) the processing ended. This is set to 00:00:00 or is left blank if processing is still taking place.
Row count        The number of rows processed.
Control          An internal number assigned by IBM InfoSphere DataStage.
Wave #           An internal number assigned by InfoSphere DataStage.
User             The name or process number of the user who ran the job.
Retrieved        The date and time (on the engine) the stage retrieved the data to process.

The same fields described earlier are displayed in this window except:
- Stage is replaced by Stage.Link (or Stage.Instance.Link). This field contains the name of the stage followed by the name of the link.
- Control is replaced by Link type. This field contains the type of link. There are four possible settings:
    Primary      A primary input link
    Reference    A reference input link
    Output       An output link
    Reject       An output link marked as "Rejected Rows"
Table 9. Job log columns (continued)

This column...    Contains this information...
Event             A message describing the event. The first line of the message is displayed. If a message has an ellipsis (...) at the end, it contains more than one line. You can view the full message in the Event Detail window.
Invocation        If the log is for a job invocation, gives the invocation ID, otherwise is blank.
Table 10. Event Detail window (continued)

This field...    Contains this information...
Message          The full event message. The text contains names of stages in the job which were processed during this event. If you are viewing an event with an event type of Fatal or Warning, this text describes the cause of the event. The message also displays data associated with the event, unless the Administrator has restricted the operator's view of the log, in which case the data is visible to users with Developer rights only.
You can use Copy to copy the event details and message or selected text to the Clipboard for use elsewhere. Click Next or Previous to display details for the next or previous event in the list. Click Close to close the window. If you are running a parallel job, and your administrator has enabled the Generated OSH Visible option in the Administrator, the Event Detail window contains the OSH that was run for the job. This appears after the 'Parallel Job Initiated' message (usually event 3). This facility is intended for expert users.
You can choose to display only those entries between particular dates and times. You can also further reduce the entries by displaying only those entries with a particular event type. To use the Filter facility:
1. Choose where to start the filter by clicking the appropriate option button:
   - Oldest displays all the entries from the oldest event in the log file.
   - Start of last run displays the entries from the start of the last run.
   - Day and Time displays all the entries from the specified date and time. Enter the date and time or click the arrow buttons. The format of the date depends on how your Windows system is set up, for example, dd/mm/yyyy or mm/dd/yyyy. The time is always in 24-hour format.
2. Choose where to end the filter by clicking the appropriate option button:
   - Newest displays entries up to the most recent date and time.
   - Day and Time displays all the entries up to the specified date and time.
3. Choose what to display from the filtered selection by clicking the appropriate option button:
   - Select all entries displays all the entries between the chosen start and end points.
   - Last displays the last n entries, where n is a number. The default setting is 100. Click the arrow buttons to increase or decrease the value, or enter a value from 1 through 9999.
4. Choose the event types you want to display by selecting the appropriate check boxes:
   - Information
   - Warning
   - Fatal
   - Reject
   - Other. All event types other than those listed above.
5. Click OK to activate the filter. The updated Job Log view displays the entries that meet the filter criteria. The status bar indicates that the entries have been filtered.
The Job Log view uses the filter settings until you change them or redisplay the whole log file. To display all the entries, choose View > Show All.
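The selection logic of the Filter facility can be pictured with a short sketch. It is an illustration only: the `LogEntry` record and `filter_log` function are hypothetical, the entries are assumed to be in chronological order, and the order in which the Last n limit and the event-type check boxes are applied is an assumption, because the steps above do not state it.

```python
from dataclasses import dataclass
from datetime import datetime

# The four named event types from the Filter dialog box; anything else is "Other".
NAMED_TYPES = frozenset({"Information", "Warning", "Fatal", "Reject"})
ALL_TYPES = NAMED_TYPES | {"Other"}


@dataclass
class LogEntry:
    """Hypothetical log record, not a DataStage structure."""
    timestamp: datetime
    event_type: str
    message: str


def filter_log(entries, start=None, end=None, last_n=None, event_types=ALL_TYPES):
    """Return the entries that a filter with these settings would display.

    start=None behaves like Oldest, end=None like Newest, and
    last_n=None like Select all entries.
    """
    if last_n is not None and not 1 <= last_n <= 9999:
        raise ValueError("last_n must be between 1 and 9999")

    # Steps 1 and 2: restrict to the chosen start and end points.
    in_range = [
        e for e in entries
        if (start is None or e.timestamp >= start)
        and (end is None or e.timestamp <= end)
    ]

    # Step 3: optionally keep only the last n entries of that range.
    if last_n is not None:
        in_range = in_range[-last_n:]

    # Step 4: keep only the selected event types.
    def category(entry):
        return entry.event_type if entry.event_type in NAMED_TYPES else "Other"

    return [e for e in in_range if category(e) in event_types]
```

For example, `filter_log(entries, start=datetime(2010, 1, 1, 9), event_types={"Warning", "Fatal"})` would correspond to choosing Day and Time as the start point, Newest as the end point, Select all entries, and only the Warning and Fatal check boxes.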
For job shows the name of the job. Oldest entry shows the date and time of the oldest entry in the log file. (The format of the date and time might look different on your screen depending on your Windows settings.)
Message handlers
When you run a parallel job, any error messages and warnings are written to the job log and can be viewed from the Director client. You can choose to handle specified errors in a different way by creating one or more message handlers. A message handler defines rules about how to handle messages generated when a parallel job is running. You can, for example, use one to specify that certain types of message should not be written to the log.
You can edit message handlers in the Designer or in the Director. The recommended way to create them is by using the Add Rule to Message Handler feature in the Director. You can specify message handler use at different levels:
- Project Level. You define a project-level message handler in the Administrator, and this applies to all parallel jobs within the specified project.
- Job Level. From the Designer you can specify that any existing handler should apply to a specific job. When you compile the job, the handler is included in the job executable as a local handler (and so can be exported to other systems if required).
You can also add rules to handlers when you run a job from the Director (regardless of whether it currently has a local handler included). This is useful, for example, where a job is generating a message for every row it is processing. You can suppress that particular message.
When the job runs, it looks in the local handler (if one exists) for each message to see if any rules exist for that message type. If a particular message is not handled locally, it looks to the project-wide handler for rules. If there are none there, it writes the message to the job log.
Note that message handlers do not deal with fatal error messages. These are always written to the job log.
Note: You cannot add message rules to jobs from an earlier release of IBM InfoSphere DataStage without first re-running those jobs.
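The lookup order described above can be sketched as follows. This is an illustration under stated assumptions: handlers are modeled as plain dictionaries that map a message ID to an action, and the function name, the action names, and the message IDs are hypothetical rather than taken from the product.

```python
def resolve_message(msg_id, msg_type, local_handler=None, project_handler=None):
    """Decide what happens to one message from a running parallel job.

    Returns "log" when the message is written to the job log unchanged,
    otherwise the action named by the first matching rule.
    """
    # Fatal messages bypass handlers and are always written to the log.
    if msg_type == "Fatal":
        return "log"

    # The local (compiled-in) handler is consulted first, then the
    # project-wide handler.
    for handler in (local_handler, project_handler):
        if handler and msg_id in handler:
            return handler[msg_id]

    # No rule anywhere: the message goes to the job log.
    return "log"


# Hypothetical handlers and message IDs, for illustration only.
local = {"EXAMPLE 00001": "suppress"}
project = {"EXAMPLE 00001": "log", "EXAMPLE 00002": "suppress"}
```

With these handlers, `resolve_message("EXAMPLE 00001", "Warning", local, project)` returns the local rule, `"EXAMPLE 00002"` falls through to the project rule, and any message of type Fatal is logged whatever rules exist.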
The Message ID, Message type and Example of message text fields are all filled in from the log entry you have currently selected. You cannot edit these.
3. Click Add Rule to add the new message rule to the chosen handler.
4. Click Previous or Next to move through the messages in the job log and add further rules to the selected handler.
If you click Edit Handler..., the Edit Message Handlers dialog box appears, enabling you to manage the message handlers.
To delete a handler:
1. Choose an option to specify whether you want to delete the local runtime handler for the currently selected job, delete the project-level message handler, or delete a specific message handler. If you want to delete a specific message handler, select the handler from the drop-down list. The settings for whichever handler you have chosen to edit appear in the grid.
2. Click the Delete button.
Chapter 4. The job log
9. Define any parameters that you want to specify for the batch. For example, a user name and password to prompt for when the batch is run.
10. When you have defined the batch, click OK. The batch is compiled and appears in the Job Status view.
Note: The Dependencies page allows you to specify the dependencies of the batch job. You only need to do this if you intend to export the batch job for deployment on another system by using the Designer. Information on dependencies and exporting jobs is in IBM InfoSphere DataStage Designer Client Guide.
Note: When you create a batch job you are, in effect, creating a server job whose function is to run other jobs (these can be other server jobs or parallel jobs). The Job Properties dialog box gives access to the same performance improvement facilities as for ordinary server jobs. The Performance page allows you to improve the performance of the job by specifying the way the system divides jobs into processes. For a full explanation of this, see Chapter 2 of InfoSphere DataStage Server Job Developer Guide.
2. Select the batch in the list and do one of the following to display the Add to schedule dialog box:
   - Choose Job > Add to Schedule... .
   - Choose Add To Schedule... from the Job shortcut menu.
   - Click the Schedule button on the toolbar.
3. Choose when to run the batch by clicking the appropriate option button:
   - Today runs the batch today at the specified time (in the future).
   - Tomorrow runs the batch tomorrow at the specified time.
   - Every runs the batch on the chosen day or date at the specified time in this month and repeats the run at the same date and time in the following months.
   - Next runs the batch on the next occurrence of the day or date at the specified time.
   - Daily runs the batch every day at the specified time.
4. If you selected Every or Next in step 3, choose the day to run the batch from the Day list or a date from the calendar.
   Note: If you choose an invalid date, for example, 31 September, the behavior of the scheduler depends upon the operating system of the computer hosting the engine tier, and you might not receive a warning of the invalid date. Refer to your computer documentation for further information.
5. In the Time area, select one option from AM, PM, or 24H Clock. Then click the arrow buttons to increase or decrease the hours and minutes, or enter the values directly.
6. Click OK. The Add to schedule dialog box closes and the Job Run Options dialog box appears.
7. Click Schedule. The batch is scheduled to run and is added to the Job Schedule view. The job parameters entered when the batch was created are used when the batch is run.
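Because the scheduler might not warn about an invalid date, it can help to check beforehand which months lack a given day of the month. The following sketch is not part of InfoSphere DataStage; it is a standalone check with a hypothetical function name that uses only the Python standard library.

```python
import calendar


def months_without_day(day, year):
    """Return the month numbers (1-12) in `year` that do not contain `day`."""
    return [
        month
        for month in range(1, 13)
        # monthrange returns (weekday of the first day, number of days in month).
        if day > calendar.monthrange(year, month)[1]
    ]


# Day 31 does not exist in February, April, June, September, or November.
missing_31 = months_without_day(31, 2010)
```

A batch scheduled with Every on day 31 would therefore have no valid run date in the months returned by this check, and what happens in those months depends on the operating system of the engine tier host, as the note above explains.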
   - Click the Reschedule button on the toolbar.
   The Add to schedule dialog box appears with the current settings for the batch.
2. Edit the frequency, day, or time you want the batch to run.
3. Click OK. The Add to schedule dialog box closes, the batch is rescheduled, and the To be run column in the Job Schedule view is updated.
Product accessibility
You can get information about the accessibility status of IBM products. The IBM InfoSphere Information Server product modules and user interfaces are not fully accessible. The installation program installs the following product modules and components:
- IBM InfoSphere Business Glossary
- IBM InfoSphere Business Glossary Anywhere
- IBM InfoSphere DataStage
- IBM InfoSphere FastTrack
- IBM InfoSphere Information Analyzer
- IBM InfoSphere Information Services Director
- IBM InfoSphere Metadata Workbench
- IBM InfoSphere QualityStage
For information about the accessibility status of IBM products, see the IBM product accessibility information at http://www.ibm.com/able/product_accessibility/index.html.
Accessible documentation
Accessible documentation for InfoSphere Information Server products is provided in an information center. The information center presents the documentation in XHTML 1.0 format, which is viewable in most Web browsers. XHTML allows you to set display preferences in your browser. It also allows you to use screen readers and other assistive technologies to access the documentation.
Notices
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing IBM Corporation North Castle Drive Armonk, NY 10504-1785 U.S.A. For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to: Intellectual Property Licensing Legal and Intellectual Property Law IBM Japan Ltd. 1623-14, Shimotsuruma, Yamato-shi Kanagawa 242-8502 Japan The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. 
Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact: IBM Corporation J46A/G4 555 Bailey Avenue San Jose, CA 95141-1003 U.S.A. Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee. The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us. Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. 
All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only. This information is for planning purposes only. The information herein is subject to change before the products described become available. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to
IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs. Each copy or any portion of these sample programs or any derivative work, must include a copyright notice as follows: (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs. Copyright IBM Corp. _enter the year or years_. All rights reserved. If you are viewing this information softcopy, the photographs and color illustrations may not appear.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at www.ibm.com/legal/copytrade.shtml. The following terms are trademarks or registered trademarks of other companies: Adobe is a registered trademark of Adobe Systems Incorporated in the United States, and/or other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. The United States Postal Service owns the following trademarks: CASS, CASS Certified, DPV, LACSLink, ZIP, ZIP + 4, ZIP Code, Post Office, Postal Service, USPS and United States Postal Service. IBM Corporation is a non-exclusive DPV and LACSLink licensee of the United States Postal Service. Other company, product or service names may be trademarks or service marks of others.
Contacting IBM
You can contact IBM for customer support, software services, product information, and general information. You also can provide feedback to IBM about products and documentation. The following table lists resources for customer support, software services, training, and product and solutions information.
Table 11. IBM resources

Resource                      Description and location
IBM Support Portal            You can customize support information by choosing the products and the topics that interest you at www.ibm.com/support/entry/portal/Software/Information_Management/InfoSphere_Information_Server
Software services             You can find information about software, IT, and business consulting services, on the solutions site at www.ibm.com/businesssolutions/
My IBM                        You can manage links to IBM Web sites and information that meet your specific technical support needs by creating an account on the My IBM site at www.ibm.com/account/
Training and certification    You can learn about technical training and education services designed for individuals, companies, and public organizations to acquire, maintain, and optimize their IT skills at http://www.ibm.com/software/swtraining/
IBM representatives           You can contact an IBM representative to learn about solutions at www.ibm.com/connect/ibm/us/en/
Providing feedback
The following table describes how to provide feedback to IBM about products and product documentation.
Table 12. Providing feedback to IBM

Type of feedback          Action
Product feedback          You can provide general product feedback through the Consumability Survey at www.ibm.com/software/data/info/consumability-survey
Documentation feedback    To comment on the information center, click the Feedback link on the top right side of any topic in the information center. You can also send comments about PDF file books, the information center, or any other documentation in the following ways:
                          - Online reader comment form: www.ibm.com/software/data/rcf/
                          - E-mail: comments@us.ibm.com