TABLE OF CONTENTS
Abstract
Content overview
1. Lookup - Performance considerations
1.1. Unwanted columns
1.2. Size of the source versus size of lookup
1.3. JOIN instead of Lookup
1.4. Conditional call of lookup
1.5. SQL query
1.6. Increase cache
1.7. Cachefile file-system
1.8. Useful cache utilities
2. Workflow performance - basic considerations
2.1. SQL tuning
3. Pre/Post-Session command - Uses
4. Sequence generator - design considerations
5. FTP Connection object - platform independence
Abstract
This article explains a few important development best practices, covering lookups, workflow performance and related topics.
Content overview
Lookup - Performance considerations
Workflow performance - basic considerations
Pre/Post-Session commands - Uses
Sequence generator - design considerations
FTP Connection object - platform independence
If the same lookup SQL is being used by another lookup, then a shared cache or a reusable lookup should be used. Also, if you have a table where the data does not change often, you can use the persistent cache option to build the cache once and reuse it many times in consecutive flows.
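The idea behind a persistent cache can be illustrated outside PowerCenter. The following is a minimal Python sketch, not PowerCenter's own mechanism: the `customer` table, the cache file name and the helper `load_lookup_cache` are all hypothetical names chosen for illustration. The lookup query runs only on the first call; later calls read the saved file.

```python
import pickle
import sqlite3
import tempfile
from pathlib import Path


def load_lookup_cache(conn, cache_file):
    """Return (cache, rebuilt). The lookup query runs only when no cache file exists."""
    cache_file = Path(cache_file)
    if cache_file.exists():
        # Later runs: reuse the cache persisted by an earlier run.
        with cache_file.open("rb") as fh:
            return pickle.load(fh), False
    # First run: query the lookup table once and persist the result.
    rows = conn.execute("SELECT customer_id, name FROM customer").fetchall()
    cache = dict(rows)
    with cache_file.open("wb") as fh:
        pickle.dump(cache, fh)
    return cache, True


def demo():
    """Build a tiny hypothetical lookup table and load the cache twice."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
    with tempfile.TemporaryDirectory() as tmp:
        cache_file = Path(tmp) / "customer_lookup.cache"
        first, rebuilt_first = load_lookup_cache(conn, cache_file)
        second, rebuilt_second = load_lookup_cache(conn, cache_file)
    conn.close()
    return first, rebuilt_first, second, rebuilt_second
```

As with the persistent cache option, this only pays off when the lookup data rarely changes; a stale cache file has to be deleted so that it is rebuilt.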
2. Workflow performance - basic considerations
1. I would always suggest thinking twice before using an Update Strategy, even though it adds a certain level of flexibility to the mapping. If you have a straight-through mapping that takes data from the source and directly inserts all the records into the target, you don't need an Update Strategy.
2. Use a pre-SQL delete statement if you wish to delete specific rows from the target before loading into it. Use the truncate option in the session properties if you wish to clean the table before loading. I would avoid a separate pipeline in the mapping that runs before the load with an Update Strategy transformation.
3. Suppose you have 3 sources and 3 targets with one-to-one mapping. If the loads are independent according to the business requirement, I would create 3 different mappings and 3 different session instances and run them all in parallel in my workflow after the "Start" task. I've observed that the workflow runtime comes down to between 30% and 60% of the serial processing time.
4. PowerCenter is built to work on high volumes of data, so let the server be completely busy. Induce parallelism as far as possible into the mapping/workflow.
5. If you use a transformation like a Joiner or an Aggregator, sort the data on the join keys or group-by columns before these transformations to decrease the processing time.
6. Filtering should be done at the database level instead of within the mapping. The database engine is much more efficient at filtering than PowerCenter.
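The parallel-session idea from the list above can be sketched in Python. This is only an analogy, assuming the loads are independent and mostly I/O-bound; `run_session` is a hypothetical stand-in for one source-to-target session, and the session names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_session(name, seconds=0.1):
    """Hypothetical stand-in for one independent source-to-target session."""
    time.sleep(seconds)  # simulates I/O-bound load work
    return f"{name}: loaded"


def run_serial(names):
    """Run the sessions one after another."""
    return [run_session(n) for n in names]


def run_parallel(names):
    """Start all sessions together, as independent sessions placed after the Start task would."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map returns results in the order of the input names.
        return list(pool.map(run_session, names))
```

Both functions return the same results; the parallel version finishes in roughly the time of the slowest session rather than the sum of all of them, provided the server has spare capacity.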
The above examples are just some things to consider when tuning a mapping.
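To make the point about filtering at the database level concrete, here is a small sketch using Python's built-in `sqlite3` module as a stand-in for the source database. The `orders` table and its columns are hypothetical. Both functions return the same rows, but the second one lets the database discard unwanted rows before they ever reach the client.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical source table with 1000 rows, 100 of them OPEN."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (status, amount) VALUES (?, ?)",
        [("OPEN" if i % 10 == 0 else "CLOSED", float(i)) for i in range(1000)],
    )
    return conn


def filter_in_mapping(conn):
    """Fetch every row, then filter on the client side (like a filter inside the mapping)."""
    rows = conn.execute(
        "SELECT order_id, status, amount FROM orders ORDER BY order_id"
    ).fetchall()
    kept = [r for r in rows if r[1] == "OPEN"]
    return kept, len(rows)  # rows kept, rows transferred


def filter_in_database(conn):
    """Let the database filter (like a WHERE clause in the source query)."""
    rows = conn.execute(
        "SELECT order_id, status, amount FROM orders WHERE status = ? ORDER BY order_id",
        ("OPEN",),
    ).fetchall()
    return rows, len(rows)  # rows kept, rows transferred
```

In this toy data set the first approach transfers all 1000 rows to keep 100, while the second transfers only the 100 that are needed.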
2.1. SQL tuning
Using the execution plan to tune a query is the best way to gain an understanding of how the database will process the data. Some things to keep in mind when reading the execution plan include: "Full Table Scans are not evil", "Indexes are not always fast", and "Indexes can be slow too". Analyse the table data to see whether picking up 20 records out of 20 million is best done using an index or a table scan. Likewise, consider whether fetching 10 records out of 15 is faster through an index or easier with a full table scan.
Many times the indexes on a relational target create performance problems when loading records into that target. If the indexes are needed for other purposes, it is suggested to drop the indexes at the time of loading and then rebuild them in post-SQL. When dropping indexes on a target, you should consider integrity constraints and the time it takes to rebuild the index after the load versus the actual load time.
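Every database exposes its execution plan differently. As a small, self-contained illustration, the sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module; the `orders` table, the index name and the helper `query_plan` are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the plan text


def build_demo_db():
    """Hypothetical table with an index on customer_id but none on amount."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(i % 50, float(i)) for i in range(500)],
    )
    return conn


def demo():
    conn = build_demo_db()
    # Equality on the indexed column: the plan reports an index search.
    indexed = query_plan(conn, "SELECT * FROM orders WHERE customer_id = ?", (7,))
    # Range on a column without an index: the plan reports a full scan.
    scanned = query_plan(conn, "SELECT * FROM orders WHERE amount > ?", (100.0,))
    conn.close()
    return indexed, scanned
```

Reading the plan before and after adding or dropping an index shows which access path the optimizer actually chooses, which is more reliable than assuming an index is always used or always faster.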
3. Pre/Post-Session command - Uses
It is a very good practice to email the success or failure status of a task once it is done. In the same way, when a business requirement calls for it, make use of the Post-Session Success and Failure emails for proper communication. The built-in feature offers more flexibility, with session logs as attachments, and also provides other run-time data such as the workflow run instance ID.
Any archiving activities around the source and target flat files can be easily managed within the session using the session properties for flat file command support, which is new in PowerCenter v8.6. For example, after writing the flat file target, you can set up a command to zip the file to save space.
If the target flat files need any editing that your mapping couldn't accommodate, write a shell/batch command or script and call it in the Post-Session command task. I prefer to weigh the trade-offs between PowerCenter capabilities and OS capabilities in these scenarios.
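A post-session step that compresses the target file could be written in any scripting language available on the server. The following is a minimal Python sketch of such a script, assuming the session has finished writing the file; the function name `compress_target_file` and the file names are hypothetical, and wiring the script into the session is left to the session properties.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def compress_target_file(path):
    """Gzip a finished flat file and remove the uncompressed original."""
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream the data, so large files are fine
    src.unlink()
    return dst


def demo():
    """Write a small hypothetical target file, compress it, and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "orders_target.csv"
        target.write_text("order_id,amount\n1,10.5\n2,20.0\n")
        archived = compress_target_file(target)
        with gzip.open(archived, "rt") as fh:
            restored = fh.read()
        return archived.name, target.exists(), restored
```

The same pattern extends to other clean-up steps, such as moving the archive to a dated folder, without adding anything to the mapping itself.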
4. Sequence generator - design considerations
Fewer PowerCenter objects will be present in a mapping, which reduces development time and maintenance effort. ID generation is independent of PowerCenter if a different application is used in future to populate the target. Migration between environments is simplified because there is no additional overhead of considering the persistent values of the sequence generator from the repository database.
In all of the above cases, a sequence created in the target database would make life a lot easier for the table data maintenance and also for the PowerCenter development. In fact, databases have specific, focused mechanisms to deal with sequences, so you can implement manual push-down optimization in your PowerCenter mapping design yourself. DBAs will always complain about triggers on the databases, but I would still insist on using the sequence-trigger combination, even for huge volumes of data.
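The sequence-trigger combination depends on the target database. As a rough, runnable illustration only, the sketch below emulates it in SQLite through Python's `sqlite3` module. SQLite has no `CREATE SEQUENCE`, so a one-row `key_sequence` table stands in for the database sequence (an assumption of this sketch), and all table, column and trigger names are hypothetical.

```python
import sqlite3

# A one-row table emulates a database sequence; the trigger assigns the next
# value to every inserted row that arrives without a key.
SCHEMA = """
CREATE TABLE customer_dim (customer_key INTEGER, name TEXT NOT NULL);
CREATE TABLE key_sequence (next_val INTEGER NOT NULL);
INSERT INTO key_sequence (next_val) VALUES (0);
CREATE TRIGGER customer_dim_assign_key
AFTER INSERT ON customer_dim
WHEN NEW.customer_key IS NULL
BEGIN
    UPDATE key_sequence SET next_val = next_val + 1;
    UPDATE customer_dim
       SET customer_key = (SELECT next_val FROM key_sequence)
     WHERE rowid = NEW.rowid;
END;
"""


def load_names(names):
    """Insert rows without keys, as a loader with no sequence generator would."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO customer_dim (name) VALUES (?)", [(n,) for n in names]
    )
    rows = conn.execute(
        "SELECT customer_key, name FROM customer_dim ORDER BY customer_key"
    ).fetchall()
    conn.close()
    return rows
```

The loading application only supplies the business columns; the keys come from the database, so any other tool that populates the table in future gets the same ID generation without extra work.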