DataStage Custom Stages

Custom Stages
Agenda
Introduction
Types of Stages
How to build Custom Stages
2002. Infosys Technologies Ltd.
Introduction
DataStage provides a large number of built-in Stages to extract and transform data. In addition to these existing Stages, it also provides the capability to build custom Stages.
Types of Stages
There are three different types of Stages that can be built.
Custom: uses an existing Orchestrate operator as a Stage in parallel jobs.
Build: lets you create your own operators and use them as a Stage.
Wrapped: lets you specify a UNIX command as a Stage and use it.
Custom Stage
Custom Stages use already existing Orchestrate operators. Steps in defining a Custom Stage:
1. Select the category from the repository.
2. Select File -> New Parallel Stage -> Custom.
3. On the General page, specify the name of the operator to be used.
4. On the Links page, specify the maximum and minimum number of input and output links.
5. On the Properties page, specify the properties.
Wrapped Stages
Wrapped Stages use UNIX commands. When defining a Wrapped Stage you provide the following information:
Details of the UNIX command that the Stage will execute.
Description of the data that will be input to the Stage.
Description of the data that will be output from the Stage.
Definition of the environment in which the command will execute.
The UNIX command can be any command, such as sort or grep, or a script.
Build Stages
Enable you to create your own operators.
Written in C++.
Give you the control of a programming language.
Build Stages
Buildop provides a simple means of creating your own operator. It does not use an existing operator or executable. Reasons to use Buildop include:
Functionality of multiple Stages can be combined into a single Stage.
Complex business logic that cannot easily be implemented using existing Stages.
Lookups across a range of values.
Surrogate key generation.
Better performance, as there is no unwanted functionality.
Buildop is reusable: it can be used within a project as well as exported and used in other projects.
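Two of the reasons above, range lookups and surrogate key generation, can be illustrated in plain C++. The following is only a sketch of the kind of logic one might place in a Buildop; lookupBand, SurrogateKey and the band table are hypothetical names that do not come from DataStage.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical range table: each key is the inclusive lower bound of a band.
// A value belongs to the band with the greatest lower bound <= value.
std::string lookupBand(const std::map<long long, std::string>& bands,
                       long long value) {
    auto it = bands.upper_bound(value);  // first band that starts above value
    if (it == bands.begin()) {
        return "UNKNOWN";                // value lies below every band
    }
    --it;                                // step back to the containing band
    return it->second;
}

// Hypothetical surrogate key generator: hands out consecutive keys.
class SurrogateKey {
public:
    explicit SurrogateKey(long long start) : next_(start) {
        assert(start >= 0);  // this sketch assumes non-negative keys
    }
    long long next() { return next_++; }

private:
    long long next_;
};
```

In a parallel job each partition would presumably need its own non-overlapping key range; that is not handled in this sketch.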
Build Stage
The interface is similar to that of a Wrapped Stage. When defining a Build Stage one needs to provide:
Input interface/schema.
Output interface/schema.
Transfer type; if Auto Transfer is selected, all the input columns are output.
Header files and definitions.
Code to be executed before the Stage processes any records.
Code to be executed for each input record.
Code to be executed after the Stage has processed all records.
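The order in which the three code sections run can be mimicked in ordinary C++. This is a rough sketch only: Record and runStage are hypothetical names, and the real operator is generated by DataStage from the Build Stage definition rather than written like this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type standing in for an input interface.
struct Record {
    long long id;
    std::string name;
};

// Mimics the order in which a Build Stage runs its three code sections.
// Returns one output line per record plus a trailer line from Post-Loop.
std::vector<std::string> runStage(const std::vector<Record>& input) {
    std::vector<std::string> output;

    // Pre-Loop: runs once, before any record is processed.
    int recordCount = 0;

    // Per-Record: runs once for each input record.
    for (const Record& rec : input) {
        ++recordCount;
        output.push_back(std::to_string(rec.id) + ":" + rec.name);
    }

    // Post-Loop: runs once, after the last record.
    output.push_back("records=" + std::to_string(recordCount));

    assert(output.size() == input.size() + 1);
    return output;
}
```

Calling runStage with two records yields two data lines followed by the trailer "records=2".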
Build Stage
Steps
Steps for defining a Build Stage:
1. Select the Stage Types category in which the Stage is to be created.
2. Choose File -> New Parallel Stage -> Build from the main menu, or New Parallel Stage -> Build from the shortcut menu.
The General tab has the Stage Type, Category, Operator (by default the Build Stage name) and Class name (by default the Build Stage name).
The Creator tab has generic information about the version of the Build Stage, the author name and copyright information.
On the Properties page, all the options to be passed to the Build Stage as run-time options are defined.
The Build page contains three tabs:
Interfaces: contains the input and output interfaces/schemas defined.
Logic: contains three sections, Pre-Loop, Per-Record and Post-Loop.
Advanced
Build Stage Macros

There are a number of macros you can use when specifying Pre-Loop, Per-Record, and Post-Loop code. They fall into four categories:
Informational
Flow-control
Input and output
Transfer
This slide shows the Interfaces tab on the Build page. This tab contains the input and output interfaces that have been defined.
Build Stage Macros

Informational Macros
These macros are used to determine the number of inputs, outputs, and transfers.
inputs() - returns the number of inputs to the stage.
outputs() - returns the number of outputs from the stage.
transfers() - returns the number of transfers in the stage.
Flow-Control Macros
These macros are used to override the default behavior of the Per-Record loop in the stage definition.
endLoop() - stops looping after completion of the current loop and after writing any auto outputs for this loop.
nextLoop() - immediately moves control to the start of the next loop.
failStep() - returns a failed status and terminates the job.
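The effect of the flow-control macros can be imitated with ordinary C++ control flow. The sketch below is not Buildop code: runLoop and the trigger values are hypothetical, and it assumes that nextLoop() skips the auto output for the current record, which the description above does not state explicitly.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the Per-Record loop with auto output enabled.
// Negative values are skipped (like nextLoop), 0 ends the loop after the
// current record is written (like endLoop), and 999 aborts (like failStep).
std::vector<int> runLoop(const std::vector<int>& input) {
    std::vector<int> output;
    for (int value : input) {
        bool stopAfterThis = false;

        if (value == 999) {
            // failStep(): report failure and terminate.
            throw std::runtime_error("failStep: job terminated");
        }
        if (value < 0) {
            continue;              // nextLoop(): go straight to the next record
        }
        if (value == 0) {
            stopAfterThis = true;  // endLoop(): finish this loop, then stop
        }

        output.push_back(value);   // auto output at the end of the loop
        if (stopAfterThis) {
            break;
        }
    }
    return output;
}
```

With the input 1, -2, 3, 0, 4 the sketch writes 1, 3 and 0: the negative value is skipped and nothing after the 0 is processed.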
Build Stage Macros

Input and Output Macros
The following macros are available:
readRecord(input) - reads the next record from input, if there is one. If there is no record, the next call to inputDone() will return true.
writeRecord(output) - writes a record to output.
inputDone(input) - returns true if the last call to readRecord() for the specified input failed to read a new record because the input has no more records.
holdRecord(input) - suspends auto input for the current record, so that the operator does not automatically read a new record at the start of the next loop.
discardRecord(output) - suspends auto output for the current record, so that the operator does not output the record at the end of the current loop.
discardTransfer(index) - suspends auto transfer, so that the operator does not perform the transfer at the end of the current loop.
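The relationship between readRecord() and inputDone() is easiest to see in an explicit read loop. The following plain C++ sketch imitates that behavior; InputPort, current() and sumAll are hypothetical names used only for illustration and are not part of the Buildop macro set.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical input port that mimics readRecord()/inputDone() semantics.
class InputPort {
public:
    explicit InputPort(std::vector<int> records)
        : records_(std::move(records)) {}

    // Reads the next record if there is one; otherwise marks the input done.
    void readRecord() {
        if (pos_ < records_.size()) {
            current_ = records_[pos_++];
            done_ = false;
        } else {
            done_ = true;
        }
    }

    // True only if the last readRecord() found no more records.
    bool inputDone() const { return done_; }

    // Value of the record most recently read.
    int current() const { return current_; }

private:
    std::vector<int> records_;
    std::size_t pos_ = 0;
    int current_ = 0;
    bool done_ = false;
};

// Sums all records using an explicit read loop, as one would with auto read off.
int sumAll(InputPort& in) {
    int total = 0;
    in.readRecord();
    while (!in.inputDone()) {
        total += in.current();
        in.readRecord();
    }
    return total;
}
```

The loop reads first and tests afterwards, because inputDone() only reports on the most recent read attempt.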
Build Stage Macros

Transfer Macros
The following macros are available:
doTransfer(index) - performs the transfer specified by index.
doTransfersFrom(input) - performs all transfers from the specified input.
doTransfersTo(output) - performs all transfers to the specified output.
transferAndWriteRecord(output) - performs all transfers and writes a record for the specified output. Calling this macro is equivalent to calling the macros doTransfersTo() and writeRecord().
Build Stage
This page contains all header file information and definitions
Example
The Definitions tab contains header files and definitions:
#include "apt_util/string.h"
#include "apt_util/ints.h"

int iHold = 0;
int iVar = 0;
int iCounter = 0;

struct extract_type {
    long long gst_i;
    long long mail_addr_i;
    char surname[32];
    long long acct_cd_seq_i;
    long long dummy_grp_seq_i;
    char grp_end_d[10];
};

struct extract_type extract_rec[100];
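Because the buffer holds at most 100 records and surname is a fixed 32-byte field, code that fills them may want to guard against overflow. The sketch below shows one possible way in standalone C++; kMaxRecs, setSurname and appendRec are hypothetical helpers and are not part of the original example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Same layout as the structure on the slide.
struct extract_type {
    long long gst_i;
    long long mail_addr_i;
    char surname[32];
    long long acct_cd_seq_i;
    long long dummy_grp_seq_i;
    char grp_end_d[10];
};

// Hypothetical capacity constant matching extract_rec[100].
const std::size_t kMaxRecs = 100;

// Copies a name into the 32-byte surname field and always NUL-terminates it.
void setSurname(extract_type& rec, const char* name) {
    assert(name != nullptr);
    std::strncpy(rec.surname, name, sizeof(rec.surname) - 1);
    rec.surname[sizeof(rec.surname) - 1] = '\0';
}

// Appends rec to buf if there is room; returns false when the buffer is full.
bool appendRec(extract_type* buf, std::size_t& count, const extract_type& rec) {
    if (count >= kMaxRecs) {
        return false;
    }
    buf[count++] = rec;
    return true;
}
```

A name longer than 31 characters is truncated rather than written past the end of the field, and the 101st append is refused instead of overrunning the array.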
Example
The Pre-Loop section contains code to be executed before the input is processed.
The Per-Record section contains the logic to be implemented for each record. An excerpt:
if (input.MAIL_ADDR_I != tempMail) {
    // Reading first record
    extract_rec[i].gst_i = input.GST_I;
    extract_rec[i].mail_addr_i = input.MAIL_ADDR_I;
    extract_rec[i].acct_cd_seq_i = input.ACCT_CD_SEQ_I;
    // Begin of grouping logic
    ...
}
Each input column is accessed as input.Column, where input is the name of the input interface.
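The key-change test in the excerpt (comparing MAIL_ADDR_I with the previously seen value) is a common way to group consecutive records. The standalone sketch below shows the idea under the assumption that the input arrives sorted, or at least clustered, on MAIL_ADDR_I; InputRec and groupByMailAddr are hypothetical names, and the slide's actual grouping logic is not shown in the source.

```cpp
#include <vector>

// Hypothetical input record with the columns used on the slide.
struct InputRec {
    long long GST_I;
    long long MAIL_ADDR_I;
    long long ACCT_CD_SEQ_I;
};

// Groups consecutive records that share the same MAIL_ADDR_I.
// Assumes the input is already sorted (or clustered) on that key.
std::vector<std::vector<InputRec>> groupByMailAddr(
        const std::vector<InputRec>& input) {
    std::vector<std::vector<InputRec>> groups;
    long long tempMail = 0;   // key of the group currently being built
    bool haveGroup = false;   // false until the first record arrives

    for (const InputRec& rec : input) {
        if (!haveGroup || rec.MAIL_ADDR_I != tempMail) {
            // Key changed (or first record): start a new group.
            groups.emplace_back();
            tempMail = rec.MAIL_ADDR_I;
            haveGroup = true;
        }
        groups.back().push_back(rec);
    }
    return groups;
}
```

Using a separate haveGroup flag avoids relying on a sentinel key value that a real record might also carry.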
The Per-Record section contains the code to be executed for each input record. This page shows that code.
Example
The code is written in C++, like any C++ program but without a main function.
// Write output to the output interface
for (m = 0; m < i; m++) {
    output.GST_I = extract_rec[m].gst_i;
    output.MAIL_ADDR_I = extract_rec[m].mail_addr_i;
    output.ACCT_CD_SEQ_I = extract_rec[m].acct_cd_seq_i;
    output.PRIM_LAST_NAME = extract_rec[m].surname;
    // Writing the record to output
    writeRecord(output.portid_);
}
Data is transferred to the output interface by assigning the computed values to it using output.Column, where output is the interface name.
Output is written by calling the writeRecord(output) macro, which transfers the data to the output interface.
Example
The Post-Loop section contains code to be executed after processing. It is written in the same way as the Pre-Loop and Per-Record sections, but it is executed once the Per-Record section has completed for all records.
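One plausible use of Post-Loop code, assumed here rather than taken from the slides, is flushing whatever the Per-Record code still holds when the input ends. The plain C++ sketch below counts records per key; GroupCounter, onRecord and onEnd are hypothetical names standing in for the Per-Record and Post-Loop sections.

```cpp
#include <vector>

// Hypothetical streaming grouper: Per-Record code would call onRecord(),
// Post-Loop code would call onEnd().
class GroupCounter {
public:
    // Per-Record: emit the previous group's size when the key changes.
    void onRecord(long long key) {
        if (count_ > 0 && key != currentKey_) {
            emitted_.push_back(count_);
            count_ = 0;
        }
        currentKey_ = key;
        ++count_;
    }

    // Post-Loop: the last group has not been emitted yet, so flush it.
    void onEnd() {
        if (count_ > 0) {
            emitted_.push_back(count_);
            count_ = 0;
        }
    }

    // Group sizes emitted so far, in input order.
    const std::vector<int>& emitted() const { return emitted_; }

private:
    long long currentKey_ = 0;
    int count_ = 0;
    std::vector<int> emitted_;
};
```

Without the onEnd() call the final group would never reach the output, because no later key change triggers its emission.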