
DATASTAGE OPERATORS

Understand Them, Then Roll Your Own!

2012 Data Migrators Pty. Ltd.

Audience
- Who has DataStage experience?
- Who understands the difference between Parallel and Server jobs?
- Who knows what an Operator is?
- Who knows what OSH is?


Introduction
- What happens when you click Run?
- What's an operator?
- What's OSH?
- Creating your own operators
- Summary


Introduction
- We'll cover Parallel jobs
  - Operators are a Parallel concept
  - Rich functionality and connectivity
  - Linear scalability
- How many people have built production Server jobs?
  - Skills issue!?

How does a job Run?


- Compilation converts the graphical job into a shell script
  - Script runs in the Orchestrate Shell (OSH)
  - Each Job becomes one OSH script
  - Each Stage becomes one or more executable Operators
  - Operators are (sort of) equivalent to commands in a Unix shell script, but
    - have multiple inputs and outputs,
    - and hence a slightly different syntax to Unix shell scripts

Stage To Operator Mapping


Stage               Operator
-----------------   ------------------------------------------------------------
Sequential File     import, export
External Source     import
External Target     export
Transformer         transform
Aggregator          group
Join                innerjoin, leftouterjoin, rightouterjoin, fullouterjoin
Merge               merge
Lookup              lookup, oralookup, db2lookup, sybaselookup, sqlsrvrlookup
Funnel              sortfunnel, sequence
Sort                psort, tsort
Remove Duplicates   remdup
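When reading generated OSH it is often the reverse question that matters: which stage produced this operator? A minimal Python sketch of that lookup, using only the mapping listed above. The names `STAGE_OPERATORS` and `stages_for_operator` are made up for this illustration and are not part of DataStage.

```python
# Stage-to-operator mapping copied from the slide above.
STAGE_OPERATORS = {
    "Sequential File": ["import", "export"],
    "External Source": ["import"],
    "External Target": ["export"],
    "Transformer": ["transform"],
    "Aggregator": ["group"],
    "Join": ["innerjoin", "leftouterjoin", "rightouterjoin", "fullouterjoin"],
    "Merge": ["merge"],
    "Lookup": ["lookup", "oralookup", "db2lookup", "sybaselookup", "sqlsrvrlookup"],
    "Funnel": ["sortfunnel", "sequence"],
    "Sort": ["psort", "tsort"],
    "Remove Duplicates": ["remdup"],
}


def stages_for_operator(operator):
    """Return the stage types that can produce the given operator, sorted by name."""
    return sorted(
        stage
        for stage, operators in STAGE_OPERATORS.items()
        if operator in operators
    )


if __name__ == "__main__":
    print(stages_for_operator("import"))
    print(stages_for_operator("tsort"))
```

An operator name that is not in the mapping simply yields an empty list.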


OSH Operator Syntax


    #### STAGE: MyInput
    ## Operator
    import
    ## Operator options
    -schema record
      {final_delim=end, delim='|'}
      (
        inRecord:int32;
        inAddr:string[];
        telNum:nullable string[];
      )
    -file 'C:\\Users\\isuser\\Documents\\MyData.csv'
    -rejects continue
    -reportProgress yes
    -firstLineColumnNames
    ## General options
    [ident('MyInput'); jobmon_ident('MyInput')]
    ## Outputs
    0> [] 'Input:AddressIn.v'
    ;

Parts of the operator invocation:
- Operator Type: import
- Operator Parameters: the options from -schema to -firstLineColumnNames
- Identification: the [ident(...); jobmon_ident(...)] line
- Interface(s): the numbered links, here 0> [] 'Input:AddressIn.v'
- End: the terminating ;
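The anatomy above (operator type, parameters, identification, interfaces, terminator) can be made concrete with a small Python sketch that assembles one operator invocation from its parts. The function `render_operator` is hypothetical, and the `## Inputs` header is assumed by analogy with the `## Outputs` header in the listing.

```python
def render_operator(stage, operator, options, outputs=(), inputs=()):
    """Render one operator invocation in the layout of the listing above."""
    lines = [f"#### STAGE: {stage}", "## Operator", operator]
    if options:
        lines.append("## Operator options")
        lines.extend(options)
    # Identification: ties the operator back to the stage name.
    lines.append("## General options")
    lines.append(f"[ident('{stage}'); jobmon_ident('{stage}')]")
    # Interfaces: numbered input (<) and output (>) links.
    if inputs:
        lines.append("## Inputs")  # assumed header, mirrors "## Outputs"
        lines.extend(f"{i}< [] '{name}'" for i, name in enumerate(inputs))
    if outputs:
        lines.append("## Outputs")
        lines.extend(f"{i}> [] '{name}'" for i, name in enumerate(outputs))
    lines.append(";")  # end of the operator invocation
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        render_operator(
            "MyInput",
            "import",
            [
                r"-file 'C:\\Users\\isuser\\Documents\\MyData.csv'",
                "-rejects continue",
            ],
            outputs=["Input:AddressIn.v"],
        )
    )
```

The sketch only builds text; it does not check that the options are valid for the chosen operator.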


An OSH Script

    InputOperator
    {parameters}
    0> [] FirstLink.v
    ;

    ProcessOperator
    {parameters}
    0< [] FirstLink.v
    0> [] SecondLink.v
    ;

    OutputOperator
    {parameters}
    0< [] SecondLink.v
    ;

You're Building OSH


- You can create and execute your own OSH scripts
  - No DataStage Designer necessary!
  - [demo]
- Programmatically generate DataStage jobs
  - Generate hundreds of jobs from Ruby, C, Python, etc.
  - Generate a bespoke job in response to a Web Page submission, etc.
- No compilation necessary
  - Although transformers are a bit special
- Start with a DataStage job in Designer
  - Use the generated OSH as a template
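As a rough illustration of generating many jobs from one template, the Python sketch below fills a small OSH template once per source file. The import options are the ones shown on the operator syntax slide; feeding the link into a `peek` operator, the `${path}` and `${link}` placeholders, and the `generate_jobs` helper are all assumptions made for this example, not the presenters' actual generator.

```python
from string import Template

# Template modelled on the import operator shown earlier; ${path} and ${link}
# are placeholders introduced for this sketch.
JOB_TEMPLATE = Template("""\
import
-schema record
  {final_delim=end, delim='|'}
  (
    inRecord:int32;
    inAddr:string[];
    telNum:nullable string[];
  )
-file '${path}'
-rejects continue
0> [] '${link}.v'
;

peek
0< [] '${link}.v'
;
""")


def generate_jobs(sources):
    """Return {job name: OSH script text} for a {job name: file path} mapping."""
    return {
        name: JOB_TEMPLATE.substitute(path=path, link=f"{name}_in")
        for name, path in sources.items()
    }


if __name__ == "__main__":
    scripts = generate_jobs(
        {
            "customers": "/data/customers.csv",
            "suppliers": "/data/suppliers.csv",
        }
    )
    print(scripts["customers"])
```

In practice the template would be the OSH generated from a job built in Designer, with the parts that vary per job replaced by placeholders.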

Visualise OSH
- Writing stand-alone OSH, or diagnosing generated OSH, can be very cumbersome.
- Use a tool to visualise your OSH
  - http://gosh.datamigrators.com
  - [demo]
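To give a feel for what visualising OSH involves, here is a minimal Python sketch that recovers the data flow from a script by matching each output link (`N>`) to the input link (`N<`) that reads the same virtual dataset, then emits Graphviz DOT text. It is not how the tool above works; it assumes one item per line, a lone `;` ending each operator, and no nested brackets in the link options. `parse_links` and `to_dot` are names invented for this sketch.

```python
import re

# Matches link lines such as: 0> [] FirstLink.v   or   0< [] 'Input:AddressIn.v'
LINK = re.compile(r"^(\d+)([<>])\s*(?:\[[^\]]*\])?\s*'?([^'\s]+)'?$")

SAMPLE = """\
InputOperator
{parameters}
0> [] FirstLink.v
;

ProcessOperator
{parameters}
0< [] FirstLink.v
0> [] SecondLink.v
;

OutputOperator
{parameters}
0< [] SecondLink.v
;
"""


def parse_links(script):
    """Return (operators, edges); each edge is (producer, consumer, dataset)."""
    producers, consumers, operators = {}, {}, []
    current = None  # operator whose block we are currently inside
    for raw in script.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line == ";":
            current = None  # end of the current operator
            continue
        match = LINK.match(line)
        if match and current is not None:
            _, direction, dataset = match.groups()
            table = producers if direction == ">" else consumers
            table.setdefault(dataset, []).append(current)
        elif current is None:
            current = line.split()[0]  # first line of a block names the operator
            operators.append(current)
    edges = [
        (src, dst, dataset)
        for dataset, sources in producers.items()
        for src in sources
        for dst in consumers.get(dataset, [])
    ]
    return operators, edges


def to_dot(edges):
    """Render the edges as Graphviz DOT text."""
    body = "\n".join(
        f'  "{src}" -> "{dst}" [label="{dataset}"];' for src, dst, dataset in edges
    )
    return "digraph osh {\n" + body + "\n}"


if __name__ == "__main__":
    _, sample_edges = parse_links(SAMPLE)
    print(to_dot(sample_edges))
```

Real generated OSH carries more syntax than this handles (for example adapters inside the link brackets), so treat it as a starting point only.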


OSH At Runtime
- A Node Configuration file tells DataStage
  - How to execute multiple parallel instances of your job
  - How to map operators to O/S processes
- DataStage may combine operators
  - Good for performance, bad for debugging
  - Can disable this with $APT_DISABLE_COMBINATION
- DataStage may add additional operators to your job
  - E.g. Sort or Partition to ensure correct operation of Join Operators
  - Can disable this with $APT_NO_SORT_INSERTION
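For a debugging run, the two variables above can be set in the environment of the process that launches the script. The Python sketch below only builds the command and environment and does not run anything. The helper name `osh_debug_invocation` is made up; the value "1", the `osh -f <script>` invocation, and the use of `APT_CONFIG_FILE` to point at the node configuration file are assumptions that should be checked against your installation.

```python
import os


def osh_debug_invocation(script_path, config_file, base_env=None):
    """Build (command, environment) for an osh run with both optimisations off."""
    env = dict(os.environ if base_env is None else base_env)
    env["APT_CONFIG_FILE"] = config_file  # node configuration file (assumed variable)
    env["APT_DISABLE_COMBINATION"] = "1"  # stop operators being combined
    env["APT_NO_SORT_INSERTION"] = "1"    # stop automatic sort insertion
    return ["osh", "-f", script_path], env


if __name__ == "__main__":
    command, env = osh_debug_invocation("job.osh", "/opt/config/4node.apt", base_env={})
    print(command)
    print(sorted(env))
```

The returned pair can be passed to `subprocess.run(command, env=env)` on a machine where the parallel engine environment is already set up.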


The Orchestra


4 Ways to Integrate Custom Functionality


- Transformer functions
  - Built using any language that can compile into a shared library
  - Called once per data item
  - Integrated into the transform operator
- Wrapped Stages
  - Pipe rows through operating system commands
  - Slowest performance

4 Ways to Integrate Custom Functionality


- Build Stages
  - A GUI custom operator constructor
  - Built in C/C++ with helper macros
    - E.g. readRecord(), writeRecord(), doTransfer()
  - Some restrictions
    - E.g. minimum 1 input, 1 output
- Custom Stages
  - Built in C/C++
  - Fewer restrictions than a Build Stage. E.g.
    - Can create data sources and data targets
    - Can create combinable operators
  - Documented in the Custom Operator Reference
- Both of these
  - Create a native OSH operator
  - Offer high performance
  - Custom icon
  - DataStage native interface


Example
- Experian QAS Batch
  - Postal address cleaning solution
  - A bespoke database
  - A C/C++ API which provides
    - Start(), Open(), Clean(), Close(), Shutdown()
  - That's it!
- We integrated QAS Batch so it runs as an operator
  - Scales performance of QAS Batch linearly
  - QAS Batch is now grid-enabled
  - [demo]
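The real integration is a C/C++ operator and its code is not shown here. Purely to illustrate how a five-call lifecycle like the one above might sit around a stream of records, here is a Python sketch with a stand-in cleaner. `FakeCleaner`, its whitespace-and-uppercase "cleaning", and `run_partition` are hypothetical, and the mapping of the calls onto one parallel instance is an assumption.

```python
class FakeCleaner:
    """Hypothetical stand-in exposing the five calls named on the slide."""

    def __init__(self):
        self.calls = []  # records call order so the lifecycle can be inspected

    def Start(self):
        self.calls.append("Start")

    def Open(self):
        self.calls.append("Open")

    def Clean(self, address):
        self.calls.append("Clean")
        # Placeholder "cleaning": collapse whitespace and upper-case.
        return " ".join(address.split()).upper()

    def Close(self):
        self.calls.append("Close")

    def Shutdown(self):
        self.calls.append("Shutdown")


def run_partition(cleaner, records):
    """Process one partition: set up once, clean every record, always tear down."""
    cleaned = []
    cleaner.Start()
    try:
        cleaner.Open()
        try:
            for record in records:
                cleaned.append(cleaner.Clean(record))
        finally:
            cleaner.Close()
    finally:
        cleaner.Shutdown()
    return cleaned


if __name__ == "__main__":
    demo = FakeCleaner()
    print(run_partition(demo, ["  1 main   st ", "2 high st"]))
    print(demo.calls)
```

Under this assumption each parallel instance of the operator would run the same loop over its own partition of the data, which is where the linear scaling would come from.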

Summary
- Don't fear the OSH!
  - It represents your real DataStage job
  - It tells you what's really happening under the hood
  - Understanding them can help performance diagnosis
  - OSH scripts can be auto-generated
- Build an operator
  - They're fast
  - They're reusable
  - They can be used to integrate virtually anything, seamlessly
  - If you can do it in C/C++, then you can build an operator for it
  - They open new possibilities

Fin

