
Datastage - Logging and Retaining Job Statistics

Authors et al.: IBM
IP.com number: IPCOM000175773D
Original Publication Date: October 24, 2008
IP.com Electronic Publication: October 24, 2008

IP.com, Inc. is the world's leader in defensive publications. The largest and most innovative companies publish their technical disclosures into the IP.com Prior Art Database. Disclosures can be published in any language, and they are searchable in those languages online. Unique identifiers indicate documents containing chemical structures as well as publications open for comment in the IP Discussion Forum. Original disclosures which are published online also appear in The IP.com Journal. The IP.com Prior Art Database is freely available to search by patent examiners throughout the world. Client may copy any content obtained through the site for Client's individual, noncommercial internal use only. Client agrees not to otherwise copy, change, upload, transmit, sell, publish, commercially exploit, modify, create derivative works or distribute any content available through the site. Note: This is a pdf rendering of the actual disclosure. To access the disclosure package containing an exact copy of the publication in its original format as well as any attached files, please download the full document from the IP.com Prior Art Database at: http://www.ip.com/pubview/IPCOM000175773D

www.ip.com
Copyright IP.com, Inc. All rights reserved.

While job statistics in Datastage are visible in the Designer client during development, these values are, by default, neither presented nor stored in a production environment. In addition, job statuses (successes, failures, and reasons for failure) are stored in a proprietary format and are visible only in the Director client. Error reasons are often difficult to locate, and log entries are cleared periodically by the system to preserve space.

The solution presented here uses the built-in DSJobReport subroutine, which generates an XML output of job statistics, together with a Datastage job that reads and processes these statistics and pulls error reasons from the logs for failures identified in the XML. Each job to be monitored should include a call to DSJobReport in its After-job subroutine on the General tab of the Job Parameters screen. Two separate Datastage jobs, along with a Korn shell script, are then used to read and process the XML reports and store the results in a relational database (Fig. 1).
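The read-and-classify step described above can be sketched outside Datastage in a few lines of Python. This is only an illustration under stated assumptions: the XML layout, the element and attribute names (`Job`, `Stage`, `Link`, `Status`, `PID`, `RowCount`), and the status string "Finished OK" are hypothetical stand-ins, not the documented DSJobReport output format, and the disclosure itself performs this step with Datastage jobs and a Korn shell script.

```python
import xml.etree.ElementTree as ET

# Hypothetical report layout. Element and attribute names are assumptions
# made for illustration; they are not the real DSJobReport schema.
SAMPLE_REPORT = """\
<Job Name="LoadCustomers" Status="Finished OK"
     StartDateTime="2008-10-24 01:00:00" EndDateTime="2008-10-24 01:05:30">
  <Stage Name="xfm_clean" Type="Transformer">
    <Link Name="lnk_out" PID="1234" RowCount="500"/>
    <Link Name="lnk_out" PID="1235" RowCount="520"/>
  </Stage>
</Job>
"""

# Assumed set of status strings that count as success.
SUCCESS_STATUSES = {"Finished OK"}


def parse_report(xml_text):
    """Return (job summary, per-link rows) from one XML report."""
    root = ET.fromstring(xml_text)
    job = {
        "job_name": root.get("Name"),
        "status": root.get("Status"),
        "start": root.get("StartDateTime"),
        "end": root.get("EndDateTime"),
    }
    links = []
    for stage in root.findall("Stage"):
        for link in stage.findall("Link"):
            # One row per stage, link, and parallel process ID.
            links.append({
                "job_name": job["job_name"],
                "stage_name": stage.get("Name"),
                "link_name": link.get("Name"),
                "parallel_pid": int(link.get("PID")),
                "row_count": int(link.get("RowCount")),
            })
    return job, links


def is_failed(job):
    """A job is treated as failed when its status is not a known success."""
    return job["status"] not in SUCCESS_STATUSES


job, links = parse_report(SAMPLE_REPORT)
print(job["job_name"], "failed" if is_failed(job) else "succeeded")
for row in links:
    print(row)
```

Rows for successful jobs would go to the statistics table, while failed jobs would trigger the log lookup for error reasons.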

[Flow diagram; box labels in reading order]

Start
XML Reports
Identify Successful and Failed Jobs
Data Counts and Execution Times for Successful Jobs
Datastage Hash Mapping File: Datastage Job Name to ID
Failed Job Log ID + Execution Time
Datastage Hash Log Files
Called once for each job
Filter Only Error Messages
Error Reasons for Failed Jobs
End

Figure 1

The data is then used to analyze data volumes and job performance over time and, ultimately, to determine the ROI (Return On Investment) of the ETL (Extract, Transform, Load) process. Because Datastage stores data counts for Parallel jobs separately for each parallel thread, database views provide the aggregates needed to report totals for those jobs (Fig. 2).
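The per-thread aggregation can be illustrated with an in-memory SQLite database. This is a minimal sketch, not the original view definition: the table and column names are taken from Fig. 2, but the column types, the use of SUM over ROW_COUNT, the GROUP BY columns, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Raw statistics: one row per job run, stage, link, and parallel PID.
CREATE TABLE FRS_DS_ASCA_LOG (
    JOB_NAME            TEXT,
    JOB_START_DATE_TIME TEXT,
    STAGE_NAME          TEXT,
    LINK_NAME           TEXT,
    PARALLEL_PID        INTEGER,
    JOB_END_DATE_TIME   TEXT,
    STAGE_ELAPSED_TIME  INTEGER,
    ROW_COUNT           INTEGER,
    PRIMARY KEY (JOB_NAME, JOB_START_DATE_TIME, STAGE_NAME,
                 LINK_NAME, PARALLEL_PID)
);

-- Assumed definition: collapse the parallel threads into one row per link.
CREATE VIEW FRS_ASCA_LOG_VIEW_ALL AS
SELECT JOB_NAME, JOB_START_DATE_TIME, STAGE_NAME, LINK_NAME,
       SUM(ROW_COUNT) AS ROW_COUNT
FROM FRS_DS_ASCA_LOG
GROUP BY JOB_NAME, JOB_START_DATE_TIME, STAGE_NAME, LINK_NAME;
""")

# Sample data (invented): two threads on lnk_out, one on lnk_rej.
start, end = "2008-10-24 01:00:00", "2008-10-24 01:05:30"
conn.executemany(
    "INSERT INTO FRS_DS_ASCA_LOG VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("LoadCustomers", start, "xfm_clean", "lnk_out", 1234, end, 250, 500),
        ("LoadCustomers", start, "xfm_clean", "lnk_out", 1235, end, 250, 520),
        ("LoadCustomers", start, "xfm_clean", "lnk_rej", 1234, end, 250, 3),
    ],
)

totals = conn.execute(
    "SELECT LINK_NAME, ROW_COUNT FROM FRS_ASCA_LOG_VIEW_ALL ORDER BY LINK_NAME"
).fetchall()
print(totals)
```

Keeping the raw table at thread granularity and aggregating in a view means the loader never has to know how many threads a job ran with.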

[Data-flow diagram; labels in reading order]

1a. DS jobs generate XML output
DS job loads XML data to database
2. Raw data generated automatically by DataStage jobs

Tables (PK = primary key column)

FRS_DS_ASCA_LOG
  PK  JOB_NAME
  PK  JOB_START_DATE_TIME
  PK  STAGE_NAME
  PK  LINK_NAME
  PK  PARALLEL_PID
      JOB_END_DATE_TIME
      STAGE_ELAPSED_TIME
      ROW_COUNT

FRS_DS_FAILURE_LOG
  PK  JOB_NAME
  PK  JOB_START_DATE_TIME
      ERROR_REASON

FRS_DS_JOB_DETAIL (filter table maintained manually by development)
  PK  JOB_NAME
  PK  STAGE_NAME
  PK  LINK_NAME
      STAGE_TYPE
      JOB_DESCR
      STAGE_DESCR
      LINK_DESCR
      JOB_GROUPING

Views

FRS_DS_JOB_STATUS: union of successful jobs (FRS_DS_ASCA_LOG) and failures with reason (FRS_DS_FAILURE_LOG)
      JOB_GROUPING, JOB_NAME, JOB_DESCR, JOB_START_DATE_TIME, JOB_STATUS, ERROR_REASON

FRS_ASCA_LOG_VIEW_ALL: aggregate parallel threads by PID
      JOB_NAME, JOB_START_DATE_TIME, STAGE_NAME, LINK_NAME, ROW_COUNT

FRS_ASCA_LOG_VIEW: aggregate parallel threads by PID and join to filter data down to what is specifically needed
      JOB_GROUPING, JOB_NAME, JOB_DESCR, JOB_START_DATE_TIME, STAGE_NAME, STAGE_DESCR, STAGE_TYPE, LINK_NAME, LINK_DESCR, ROW_COUNT

View Reports

Figure 2

Datastage is a registered trademark of International Business Machines Corporation in the United States, other countries, or both.
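The job status view in Fig. 2, a union of successful jobs and failures with their reasons, might look roughly like the following. This is a simplified sketch under assumptions: the status literals 'SUCCESS' and 'FAILURE', the column types, and the sample rows (including the error text) are invented for illustration, and the join to FRS_DS_JOB_DETAIL that would supply JOB_GROUPING and JOB_DESCR is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced versions of the Fig. 2 tables; column types are assumed.
CREATE TABLE FRS_DS_ASCA_LOG (
    JOB_NAME TEXT, JOB_START_DATE_TIME TEXT, STAGE_NAME TEXT,
    LINK_NAME TEXT, PARALLEL_PID INTEGER, ROW_COUNT INTEGER
);
CREATE TABLE FRS_DS_FAILURE_LOG (
    JOB_NAME TEXT, JOB_START_DATE_TIME TEXT, ERROR_REASON TEXT,
    PRIMARY KEY (JOB_NAME, JOB_START_DATE_TIME)
);

-- UNION (not UNION ALL) removes duplicates, so the many per-thread
-- statistics rows of a successful run collapse to one status row.
CREATE VIEW FRS_DS_JOB_STATUS AS
SELECT JOB_NAME, JOB_START_DATE_TIME,
       'SUCCESS' AS JOB_STATUS, NULL AS ERROR_REASON
FROM FRS_DS_ASCA_LOG
UNION
SELECT JOB_NAME, JOB_START_DATE_TIME, 'FAILURE', ERROR_REASON
FROM FRS_DS_FAILURE_LOG;
""")

# Invented sample data: one successful run (two threads), one failed run.
conn.executemany(
    "INSERT INTO FRS_DS_ASCA_LOG VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("LoadCustomers", "2008-10-24 01:00:00", "xfm_clean", "lnk_out", 1234, 500),
        ("LoadCustomers", "2008-10-24 01:00:00", "xfm_clean", "lnk_out", 1235, 520),
    ],
)
conn.execute(
    "INSERT INTO FRS_DS_FAILURE_LOG VALUES (?, ?, ?)",
    ("LoadOrders", "2008-10-24 02:00:00", "Lookup table not found"),
)

statuses = conn.execute(
    "SELECT JOB_NAME, JOB_STATUS, ERROR_REASON "
    "FROM FRS_DS_JOB_STATUS ORDER BY JOB_NAME"
).fetchall()
print(statuses)
```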
