Sie sind auf Seite 1von 78

Filter Transformation

Drops rows conditionally

Active Transformation
Connected

Ports
• All input / output

Specify a Filter condition

Usage
• Filter rows from
flat file sources
• Single pass source(s)
into multiple targets
1
Aggregator Transformation

Performs aggregate calculations

Active Transformation
Connected

Ports
• Mixed
• Variables allowed
• Group By allowed

Create expressions in
output or variable ports

Usage
• Standard aggregations
2
Informatica Functions

Aggregate Functions
AVG  Return summary values for non-null data
COUNT in selected ports
FIRST
LAST
 Use only in Aggregator transformations
MAX
 Use in output ports only
MEDIAN
MIN  Calculate a single value (and row) for all
PERCENTILE
records in a group
STDDEV
SUM  Only one aggregate function can be
VARIANCE nested within an aggregate function
 Conditional statements can be used with
these functions
3
Aggregate Expressions

Aggregate
functions are
supported
only
in the
Aggregator
Transformation

Conditional
Aggregate
expressions
are supported Conditional SUM format: SUM(value, condition)
4
Aggregator Properties

Sorted Input Property

Instructs the
Aggregator to
expect the data
to be sorted

Set Aggregator
cache sizes (on
Informatica Server
machine)

5
Sorted Data

 The Aggregator can handle sorted or unsorted data


• Sorted data can be aggregated more efficiently, decreasing
total processing time

 The Server will cache data from each group and


release the cached data -- upon reaching the first
record of the next group
 Data must be sorted according to the order of the
Aggregator “Group By” ports
 Performance gain will depend upon varying factors

6
Incremental Aggregation
MTD
Trigger in calculation
Session Properties,
Performance
Tab

 Cache is saved into $PMCacheDir: aggregatorname.DAT


aggregatorname.IDX
 Upon next run, files are overwritten with new cache information

Example: When triggered, PowerCenter Server will save


new MTD totals. Upon next run (new totals), Server will
subtract old totals; difference will be passed forward

Best Practice is to copy these files in case a rerun of data is ever required.
Reinitialize when no longer needed, e.g. – at the beginning new month processing 7
Joiner Transformation

By the end of this section you will be familiar with:


 When to use a Joiner Transformation
 Homogeneous Joins
 Heterogeneous Joins
 Joiner properties
 Joiner Conditions
 Nested joins

8
Homogeneous Joins
Joins that can be performed with a SQL SELECT statement:
 Source Qualifier contains a SQL join

 Tables on same database server (or are synonyms)


 Database server does the join “work”
 Multiple homogenous tables can be joined

9
Heterogeneous Joins

Joins that cannot be done with a SQL statement:


 An Oracle table and a Sybase table

 Two Informix tables on different database servers


 Two flat files
 A flat file
and a
database
table

10
Joiner Transformation

Performs heterogeneous joins on records from


different databases or flat file sources

Active Transformation
Connected
Ports
• All input or input / output
• “M” denotes port comes
from master source
Specify the Join condition
Usage
• Join two flat files
• Join two tables from
different databases
• Join a flat file with a
relational table 11
Joiner Conditions

Multiple
join
conditions
are supported

12
Joiner Properties

Join types:
• “Normal”
(inner)
• Master outer
• Detail outer
• Full outer

Set
Joiner Cache

Joiner can accept sorted data (configure the join condition to


use the sort origin ports) 13
Mid-Mapping Join

The Joiner does not accept input in the following situations:


 Both input pipelines begin with the same Source Qualifier
 Both input pipelines begin with the same Normalizer
 Both input pipelines begin with the same Joiner
 Either input pipeline contains an Update Strategy

14
Sorter Transformation

 Can sort data from relational tables or flat files


 Sort takes place on the Informatica Server machine
 Multiple sort keys are supported
 The Sorter transformation is often more efficient than
a sort performed on a database with an ORDER BY
clause

15
Lookup Transformation

By the end of this section you will be familiar with:


 Lookup principles
 Lookup properties
 Lookup conditions
 Lookup techniques
 Caching considerations

16
How a Lookup Transformation Works
 For each Mapping row, one or more port values are looked
up in a database table
 If a match is found, one or more table values are returned
to the Mapping. If no match is found, NULL is returned

Lookup Lookup transformation


value(s)

Return value(s)

17
Lookup Transformation
Looks up values in a database table and provides
data to other components in a Mapping

Passive Transformation
Connected / Unconnected
Ports
• Mixed
• “L” denotes Lookup port
• “R” denotes port used as a
return value (unconnected
Lookup only)
Specify the Lookup Condition
Usage
• Get related values
• Verify if records exists or
if data has changed 18
Lookup Properties

Override
Lookup SQL
option

Toggle
caching

Native
Database
Connection
Object name

19
Additional Lookup Properties

Set cache
directory

Make cache
persistent

Set
Lookup
cache sizes

20
Lookup Conditions
Multiple conditions are supported

21
To Cache or not to Cache?
Caching can significantly impact performance
 Cached
• Lookup table data is cached locally on the Server
• Mapping rows are looked up against the cache
• Only one SQL SELECT is needed

 Uncached
• Each Mapping row needs one SQL SELECT
 Rule Of Thumb: Cache if the number (and size) of
records in the Lookup table is small relative to the
number of mapping rows requiring lookup
22
Target Options

By the end of this section you will be familiar with:


 Row type indicators
 Row operations at load time
 Constraint-based loading considerations
 Rejected row handling options

23
Target Properties

Session Task

Select target
instance

Row loading
operations

Error handling

Properties Tab
24
Constraint-based Loading
Maintains referential integrity in the Targets
pk1
Example 1
fk1, pk2 With only One Active source, rows
for Targets 1-3 will be loaded
properly and maintain referential
fk2 integrity

pk1 Example 2
With Two Active sources, it is not
fk1, pk2 possible to control whether rows for
Target 3 will be loaded before or
fk2 after those for Target 2

The following transformations are ‘Active sources’: Advanced External Procedure,


Source Qualifier, Normalizer, Aggregator, Sorter, Joiner, Rank, Mapplet (containing
any of the previous transformations)
25
Update Strategy Transformation

By the end of this section you will be familiar with:


 Update Strategy functionality
 Update Strategy expressions
 Refresh strategies
 Smart aggregation

26
Update Strategy Transformation
Used to specify how each individual row will be used to
update target tables (insert, update, delete, reject)

Active Transformation
Connected

Ports
• All input / output

Specify the Update


Strategy Expression

Usage
• Updating Slowly
Changing Dimensions
• IIF or DECODE logic
determines how to
handle the record
27
Target Refresh Strategies

 Single snapshot: Target truncated, new records


inserted
 Sequential snapshot: new records inserted

 Incremental: Only new records are inserted.


Records already present in the target are ignored
 Incremental with Update: Only new records are
inserted. Records already present in the target are
updated

28
Router Transformation

Rows sent to multiple filter conditions

Active Transformation
Connected
Ports
• All input/output
• Specify filter conditions
for each Group
Usage
• Link source data in one
pass to multiple filter
conditions

29
Router Transformation in a Mapping

30
Parameters and Variables

By the end of this section you will understand:


 System Variables
 Creating Parameters and Variables
 Features and advantages
 Establishing values for Parameters and Variables

31
System Variables
SYSDATE  Provides current datetime on the
Informatica Server machine
• Not a static value

$$$SessStartTime  Returns the system date value as a


string. Uses system clock on machine
hosting Informatica Server
• format of the string is database type
dependent
• Used in SQL override
• Has a constant value

SESSSTARTTIME  Returns the system date value on the


Informatica Server
• Used with any function that accepts
transformation date/time data types
• Not to be used in a SQL override
• Has a constant value 32
Mapping Parameters and Variables

 Apply to all transformations within one Mapping


 Represent declared values
 Variables can change in value during run-time
 Parameters remain constant during run-time
 Provide increased development flexibility
 Defined in Mapping menu
 Format is $$VariableName or $$ParameterName

33
Mapping Parameters and Variables
Sample declarations

Set the
User- appropriate
defined aggregation
names type

Set optional
Initial Value

Declare Variables and Parameters in the Designer Mappings menu


34
Functions to Set Mapping Variables

 SetCountVariable -- Counts the number of


evaluated rows and increments or decrements a
mapping variable for each row
 SetMaxVariable -- Evaluates the value of a mapping
variable to the higher of two values
 SetMinVariable -- Evaluates the value of a mapping
variable to the lower of two values
 SetVariable -- Sets the value of a mapping variable
to a specified value

35
Unconnected Lookup
 Will be physically “unconnected” from other transformations
• There can be NO data flow arrows leading to or from an
unconnected Lookup
Lookup function can be set within any
transformation that supports expressions

Lookup data is
called from the
point in the
Mapping that
needs it

Function in the
Aggregator calls the
unconnected Lookup
36
Conditional Lookup Technique
Two requirements:
 Must be Unconnected (or “function mode”) Lookup
 Lookup function used within a conditional statement
Row keys
Condition (passed to Lookup)

IIF ( ISNULL(customer_id),:lkp.MYLOOKUP(order_no))

Lookup function

 Conditional statement is evaluated for each row


 Lookup function is called only under the pre-defined
condition 37
Conditional Lookup Advantage

 Data lookup is performed only for those rows which


require it. Substantial performance can be gained

EXAMPLE: A Mapping will process 500,000 rows. For two


percent of those rows (10,000) the item_id value is NULL.
Item_ID can be derived from the SKU_NUMB.

IIF ( ISNULL(item_id), :lkp.MYLOOKUP (sku_numb))

Condition Lookup
(true for 2 percent of all (called only when condition is
rows) true)
Net savings = 490,000 lookups
38
Connected vs. Unconnected Lookups

CONNECTED LOOKUP UNCONNECTED LOOKUP

Part of the mapping data flow Separate from the mapping data
flow
Returns multiple values (by linking Returns one value (by checking the
output ports to another Return (R) port option for the output
transformation) port that provides the return value)
Executed for every record passing Only executed when the lookup
through the transformation function is called
More visible, shows where the Less visible, as the lookup is called
lookup values are used from an expression within another
transformation
Default values are used Default values are ignored

39
Heterogeneous Targets

By the end of this section you will be familiar with:


 Heterogeneous target types
 Heterogeneous target limitations
 Target conversions

40
Definition: Heterogeneous Targets

Supported target definition types:


 Relational database
 Flat file
 XML
 ERP (SAP BW, PeopleSoft, etc.)

A heterogeneous target is where the target types are


different or the target database connections are different
within a single Session Task

41
Step One: Identify Different Target Types

Oracle table

Oracle table
Tables are EITHER in two
different databases, or
require different (schema-
specific) connect strings
Flat file
One target is a flatfile load

42
Step Two: Different Database Connections

The two database


connections WILL
differ

Flatfile requires
separate location
information

43
Target Type Override (Conversion)

Example: Mapping has SQL Server target definitions.


Session Task can be set to load Oracle tables instead,
using an Oracle database connection.

Only the following overrides are supported:


 Relational target to flat file target
 Relational target to any other relational database type
 SAP BW target to a flat file target

CAUTION: If target definition datatypes are not compatible with datatypes in newly
selected database type, modify the target definition 44
Mapplet Designer

Mapplet Designer Tool Mapplet


Transformation
Icons

Mapplet Output
Transformation

45
Mapplet Advantages

 Useful for repetitive tasks / logic


 Represents a set of transformations
 Mapplets are reusable
 Use an ‘instance’ of a Mapplet in a Mapping
 Changes to a Mapplet are inherited by all instances
 Server expands the Mapplet at runtime

46
Active and Passive Mapplets

 Passive Mapplets contain only passive transformations


 Active Mapplets contain one or more active
transformations

CAUTION: changing a passive Mapplet into an active


Mapplet may invalidate Mappings which use that
Mapplet
• Do an impact analysis in Repository Manager first

47
Using Active and Passive Mapplets

Multiple Passive
Mapplets can populate
Passive the same target
instance

Multiple Active Mapplets


or Active and Passive
Mapplets cannot
Active populate the same
target instance

48
Reusable Transformations

By the end of this section you will be familiar with:


 Reusable transformation advantages
 Reusable transformation rules
 Promoting transformations to reusable
 Copying reusable transformations

49
Reusable Transformations
 Define once - reuse many times
 Reusable Transformations
• Can be a copy or a shortcut
• Edit Ports only in Transformation Developer
• Can edit Properties in the mapping
• Instances dynamically inherit changes
• Be careful: It is possible to invalidate mappings by
changing reusable transformations
 Transformations that cannot be made reusable
• Source Qualifier
• ERP Source Qualifier
• Normalizer used to read a Cobol data source

50
Promoting a Transformation to Reusable

Place a
check in the
“Make
reusable” box

This action
is not
reversible

51
Sequence Generator Transformation

Generates unique keys for any port on a row

Passive Transformation
Connected

Ports
• Two predefined
output ports,
NEXTVAL and
CURRVAL
• No input ports allowed

Usage
• Generate sequence numbers
• Shareable across mappings
52
Sequence Generator Properties

Number
of
Cached
Values

53
Dynamic Lookup

By the end of this section you will be familiar with:


 Dynamic lookup theory
 Dynamic lookup advantages
 Dynamic lookup rules

54
Additional Lookup Cache Options

Make cache
persistent

Cache File Name Prefix


• Reuse cache by
name for another
similar business
purpose

Recache from Database


• Overrides other
Dynamic Lookup Cache
settings and Lookup
data is refreshed • Allows a row to know about the
handling of a previous row
55
Persistent Caches

 By default, Lookup caches are not persistent


 When Session completes, cache is erased
 Cache can be made persistent with the Lookup
properties
 When Session completes, the persistent cache is
stored on server hard disk files
 The next time Session runs, cached data is loaded
fully or partially into RAM and reused
 Can improve performance, but “stale” data may pose
a problem

56
Dynamic Lookup Cache Advantages

 When the target table is also the Lookup table,


cache is changed dynamically as the target load
rows are processed in the mapping
 New rows to be inserted into the target or for
update to the target will affect the dynamic Lookup
cache as they are processed
 Subsequent rows will know the handling of
previous rows
 Dynamic Lookup cache and target load rows
remain synchronized throughout the Session run

57
Update Dynamic Lookup Cache
 NewLookupRow port values
• 0 – static lookup, cache is not changed
• 1 – insert row to Lookup cache
• 2 – update row in Lookup cache
 Does NOT change row type
 Use the Update Strategy transformation before or after
Lookup, to flag rows for insert or update to the target
 Ignore NULL Property
• Per port
• Ignore NULL values from input row and update the cache
using only with non-NULL values from input

58
Example: Dynamic Lookup Configuration

Router Group Filter Condition should be:


NewLookupRow = 1

This allows isolation of insert rows from update rows


59
Concurrent and Sequential Workflows

By the end of this section you will be familiar with:


 Concurrent Workflows
 Sequential Workflows
 Scheduling Workflows
 Stopping, aborting, and suspending Tasks and
Workflows

60
Multi-Task Workflows - Sequential

Tasks can be run sequentially:

Tasks shows are all Sessions, but they can also be


other Tasks, such as Commands, Timer or Email Tasks

61
Multi-Task Workflows - Concurrent

Tasks can be run concurrently:

Tasks shows are all Sessions, but they can also be


other Tasks such as Commands, Timer or Email Tasks.

62
Multi-Task Workflows - Combined

 Tasks can be run in a combination concurrent and


sequential pattern within one Workflow:

 Tasks shows are all Sessions, but they can also be


other Tasks such as Commands, Timer or Email Tasks
63
Additional Transformations

By the end of this section you will be familiar with:


 The Rank transformation
 The Normalizer transformation
 The Stored Procedure transformation
 The External Procedure transformation
 The Advanced External Procedure transformation

64
Rank Transformation
Filters the top or bottom range of records

Active Transformation
Connected
Ports
• Mixed
• One pre-defined
output port
RANKINDEX
• Variables allowed
• Group By allowed

Usage
• Select top/bottom
• Number of records
65
Normalizer Transformation

Normalizes records from relational or VSAM sources

Active Transformation
Connected

Ports
• Input / output or output

Usage
• Required for VSAM
Source definitions
• Normalize flat file or
relational source
definitions
• Generate multiple
records from one record
66
Normalizer Transformation
Turn one row
YEAR,ACCOUNT,MONTH1,MONTH2,MONTH3, … MONTH12
1997,Salaries,21000,21000,22000,19000,23000,26000,29000,29000,34000,34000,40000,4500
0
1997,Benefits,4200,4200,4400,3800,4600,5200,5800,5800,6800,6800,8000,9000
1997,Expenses,10500,4000,5000,6500,3000,7000,9000,4500,7500,8000,8500,8250

Into multiple rows

67
Stored Procedure Transformation

Calls a database stored procedure

Passive Transformation
Connected/Unconnected

Ports
• Mixed
• “R” denotes port will
return a value from the
stored function to the
next transformation

Usage
• Perform transformation
logic outside PowerMart /
PowerCenter
68
External Procedure Transformation (TX)
Calls a passive procedure defined in a dynamic linked
library (DLL) or shared library

Passive Transformation
Connected/Unconnected

Ports
• Mixed
• “R” designates return
value port of an
unconnected
transformation
Usage
• Perform transformation
logic outside PowerMart /
PowerCenter
Option to allow partitioning 69
Advanced TX Transformation
Calls an active procedure defined in a dynamic linked
library (DLL) or shared library

Active Transformation
Connected Mode only

Ports
• Mixed

Usage
• Perform
transformation logic
outside PowerMart /
PowerCenter
• Sorting, Aggregation

Option to allow partitioning 70


Transaction Control Transformation
Allows custom commit types (source- or target-
based) and user-defined conditional commits

Passive Transformation
Connected Mode Only

Ports
• Input and Output

Properties
• Continue
• Commit Before
• Commit After
• Rollback Before
• Rollback After

71
Transaction Control Functionality
 Commit Types
• Target Based Commit -
Commit Based on “approximate” number of records
written to target
• Source Based Commit –
Ensures that a source record is committed in all
targets
• User Defined Commit –
Uses Transaction Control Transform to specify
commits and rollbacks in the mapping based on
conditions
 Set the Commit Type (and other specifications) in the
Transaction Control Condition
72
Versioning

 View Object Version Properties


 Track Changes to an Object
 Check objects “in” and “out”
 Delete or Purge Object version
 Apply Labels and Run queries
 Deployment Groups

73
Informatica Business Analytics Suite

Modular
Plug-&-Play
Approach

Packaged
Analytic Solutions Custom Built
Analytic Solutions
74
Informatica Warehouses / Marts

Informatica Warehouse™
Customer Finance Human Supply Chain
Relationship Resources
Sales G/L Compensation Planning
Marketing Receivables Scorecard Sourcing
Service Payables Inventory
Web Profitability Quality

Common Dimensions
Customer Product Supplier Geography
Organization Time Employee

75
Inside the Informatica Warehouse
Business Intelligence  Business Adapters™ (Extract)
• Data Source Connectivity with Minimal
Load
Informatica Warehouse™ • Structural/Functional Knowledge of
Sources
Analytic Advanced
Data Model Calculation  Analytic Bus™ (Transform)
Engine • Transaction consolidation and
standardization
• Source independent interface
Load

Warehouse Loader™  Warehouse Loader (Load)


• Type I, II slowly changing dimensions
Transform

• History and changed record tracking


Analytic Bus™  Analytic Data Model
• Industry Best Practice Metrics
Extract

• Process-centric model & conformed


Business Adapters™ dimensions
 Advanced Calculation Engine
• Pre-aggregations for rapid query
SAP ORCL i2 SEBL PSFT Custom
response
• Complex calculation metrics (e.g. 76
statistical)
PowerConnect Products

Family of enterprise software products that allow


companies to directly source and integrate ERP, CRM,
real-time message queue, mainframe, AS/400, remote
data and metadata with other enterprise data
 PowerConnect for MQSeries (real time)
 PowerConnect for TIBCO (real time)
 PowerConnect for PeopleSoft
 PowerConnect for SAP R/3
 PowerConnect for SAP BW
 PowerConnect for Siebel
 PowerConnect for Mainframe
 PowerConnect for AS/400
 PowerConnect for Remote Data
 PowerConnect SDK
77
78

Das könnte Ihnen auch gefallen