
Informatica

Agenda

Overview & Components


Informatica Server & Data Movement
Repository Server & Repository Manager
Designer
Transformations used in Informatica
Re-usable Transformations & Mapplets
Workflow Manager & Workflow Monitor
Performance Tuning & Troubleshooting

Center Of Excellence

Overview & Components

Informatica PowerCenter Architecture


PowerCenter 8x Architecture

[Architecture diagram; the recoverable labels are listed below]

Client Tools
Designer
Workflow Manager
Repository Manager
Workflow Monitor

Administration Console

Sources and Targets (PowerExchange, PowerCenter Connects)
Standards, Messaging, Web Services
Packaged Applications
Relational/Flat Files
Mainframe/Midrange

Application Services
Integration Service(s)
Repository Service(s)
Web Services Hub
SAP BW Service

Repository Database

Core Services
Domain/Gateway Service
Log Service
Authentication
Configuration
Service management


What is a Domain?
A unified, single point of administration and configuration for:

Integration Service (Informatica Server)
Repository Service (Repository Server)
Web Services Hub Service (WSH)
BW Integration Service (BW Integration)

Domain Consists of

Set of Nodes
Set of Services
Zero or more Grids
Set of Resources


Gateway (Domain Controller) Node

Purpose of the Gateway node

Starts up and manages services running on the domain


Manages Configuration Metadata
Provides Service lookup for clients
Checks for service availability via heartbeats
Coordinates failover of services

HA for Gateway Node


One or more nodes can be designated as Gateway nodes
Only one Master gateway node active at a time
Election process determines new Master


Services
Application Service
Service that is configured by the end user and represents a key
visible component (Integration Service, Repository Service etc)
External clients directly interact with these services

Core Service
Infrastructure (internal) service (Gateway Service, Logging
Service etc)


HA Setup
One primary node and a list of backup nodes (active/passive mode)
Application Services
Core Services

Automatic failover from Primary to Backup


No automatic fail-back
Manual fail-back

Integration Service operates in Active-Active mode
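
The primary/backup behaviour on this slide can be sketched as a small state model. This is an illustrative Python sketch only, not PowerCenter code: the class `ServiceHA` and its method names are hypothetical, and the rule that the first healthy node in the list takes over is an assumption.

```python
class ServiceHA:
    """Toy model of an active/passive service: one primary, ordered backups."""

    def __init__(self, primary, backups):
        self.primary = primary
        self.candidates = [primary] + list(backups)
        self.active = primary
        self.down = set()

    def node_failed(self, node):
        """Mark a node as down; fail over automatically if it was the active one."""
        self.down.add(node)
        if node == self.active:
            healthy = [n for n in self.candidates if n not in self.down]
            # Assumption: the first healthy node in the list takes over.
            self.active = healthy[0] if healthy else None

    def node_recovered(self, node):
        """A recovered node is available again, but there is no automatic fail-back."""
        self.down.discard(node)
        if self.active is None:
            self.active = node

    def fail_back(self):
        """Manual fail-back: an administrator moves the service back to the primary."""
        if self.primary not in self.down:
            self.active = self.primary
```

After `node_failed("node1")` the service moves to the first backup, and it stays there when `node1` recovers until `fail_back()` is called explicitly.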


Overview - Informatica Repository

Stores the metadata created using the Informatica Client tools


Repository Manager creates the metadata tables in the
database
Tasks in the Informatica Client application, such as creating
users, analyzing sources, developing mappings or mapplets,
or creating sessions, create metadata
Informatica Server reads metadata created in the Client
application when a session runs
Global and local repositories can be created to share
metadata


Grid
Collection of nodes
Integration Service can be assigned to a Grid
Service runs on all nodes in the grid

Grid is leveraged for workflow distribution and session distribution (Session on Grid, SonG)
Scalability
Availability

Advanced Load Balancer


Resource map


PowerCenter Architecture: Data Flow

[Data flow diagram; the recoverable labels are listed below]

Client Tools (Windows 95, 98, NT 4.0 or 2000): Designer, Repository Manager, Workflow Manager, Workflow Monitor
Client connections: ODBC and TCP/IP
Repository Service: connects to the object repository over Native/ODBC (Oracle, MS SQL Server, Sybase, DB2 UDB)
Heterogeneous Sources (Native, ODBC, PowerConnect, remote files): Oracle, MS SQL Server, Sybase, Informix, DB2 UDB, ODBC, Flat File, XML, VSAM/COBOL Copybook, SAP R/3 & BW, PeopleSoft, Siebel, MQ Series, TIBCO, JMS
PowerCenter Server Engine (UNIX: AIX, HP-UX, Solaris, Tru64; Windows NT 4.0, 2000): Reader, DTM, Writer, Buffers
Heterogeneous Targets (Native, ODBC, PowerConnect, remote files): Oracle (API, SQL*Loader), MS SQL Server (BCP), Sybase (IQ Load), Informix, DB2 UDB (Autoloader), Teradata (FLOAD, TPUMP, MPUMP), ODBC, Flat File, XML, SAP R/3 & BW, PeopleSoft, Siebel, MQ Series, TIBCO, JMS
Key: Data, Metadata


Overview - Informatica Client Tools
Repository Manager
To create and administer the metadata repository
To create repository users and groups, assign privileges and
permissions
Manage folders and locks
Designer
To add source and target definitions to the repository
To create mappings that contain data transformation
instructions

Workflow Manager & Workflow Monitor


To create, schedule, execute, and monitor sessions


Overview - Informatica Server

The Informatica Server reads mapping and session information from the repository
It extracts data from the mapping sources and stores the data in memory while it applies the transformation rules in the mapping
The Informatica Server loads the transformed data into the mapping targets
Platforms
Windows NT/2000
UNIX


Overview - Sources

Relational - Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and Teradata
File - Fixed and delimited flat file, COBOL file, and XML
Extended
PowerConnect products for PeopleSoft, SAP R/3, Siebel, and
IBM MQSeries
Mainframe
PowerConnect for IBM DB2 on MVS
Other - Microsoft Excel and Access


Overview - Targets

Relational - Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL Server, and Teradata
File - Fixed and delimited flat files and XML
Extended
Integration server to load data into SAP BW.
PowerConnect for IBM MQSeries to load data into IBM
MQSeries message queues
Other - Microsoft Access
ODBC or native drivers, FTP, or external loaders


Questions


Informatica Server & Data Movement

Informatica Server and Data Movement
The Informatica Server moves data from sources to targets based on mapping and session metadata stored in a repository database
A session is a set of instructions that describes how and when to move data from sources to targets
Workflow Manager creates, manages, and executes sessions, worklets, and workflows
Workflow Monitor is used to monitor sessions and to debug them when an error occurs


Informatica Server

When a session starts, the Informatica Server retrieves mapping and session metadata from the repository database through the Repository Server, which initiates a Repository Agent

The Informatica Server runs as a daemon on UNIX and as a service on Windows NT/2000

The Informatica Server uses the following processes to run a session:


The Load Manager process - Starts the session, creates the DTM
process, and sends post-session email when the session completes
The DTM process - Creates threads to initialize the session, read,
write, and transform data, and handle pre- and post-session
operations


The Load Manager Process


The Load Manager performs the following tasks:
Manages session, worklet, and workflow scheduling
Locks the session and reads session properties
Reads the parameter file
Expands the server and session variables and parameters
Verifies permissions and privileges
Validates source and target code pages
Creates the session log file
Creates the Data Transformation Manager (DTM) process,
which executes the session


The Load Manager Process

The Load Manager and repository communicate with each other using Unicode

To prevent loss of information during data transfer, the Informatica Server and repository require compatible code pages

It communicates with the repository in the following situations:


When you start the Informatica Server
When you configure a session
When a session starts


Data Transformation Manager Process

The DTM process is the second process associated with a session run
The primary purpose of the DTM process is to create and
manage threads that carry out the session tasks
The DTM allocates process memory for the session and divides
it into buffers. This is also known as buffer memory
It creates the main thread, which is called the master thread
The master thread creates and manages all other threads
If you partition a session, the DTM creates a set of threads for
each partition to allow concurrent processing
When the Informatica Server writes messages to the session
log, it includes the thread type and thread ID



DTM Threads

For example, a pipeline contains one source and one target. You configure two
partitions in the session properties. The DTM creates the following threads to
process the pipeline:
Two reader threads - One for each partition.
Two writer threads - One for each partition
When the pipeline contains an Aggregator or Rank transformation, the DTM
creates one additional set of threads for each Aggregator or Rank transformation
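
The thread counts in this example can be written out as a small helper. This is a rough Python sketch, not Informatica code: the function name and thread labels are hypothetical, and it assumes that "one additional set of threads" means one extra thread per partition for each Aggregator or Rank transformation.

```python
def dtm_pipeline_threads(partitions, aggregator_or_rank_count=0):
    """List the threads the slide describes for one source-to-target pipeline."""
    threads = []
    for p in range(1, partitions + 1):
        threads.append(f"reader-{p}")  # one reader thread per partition
        # Assumption: each Aggregator or Rank adds one thread per partition.
        for t in range(1, aggregator_or_rank_count + 1):
            threads.append(f"transformation-{t}-{p}")
        threads.append(f"writer-{p}")  # one writer thread per partition
    return threads
```

With two partitions and no Aggregator or Rank, the helper returns two reader and two writer threads, matching the slide.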


DTM Threads

When the Informatica Server processes a mapping with a Joiner transformation, it first reads the master source and builds caches based on the master rows
The Informatica Server then reads the detail source and processes the transformation based on the detail source data and the cache data
The pipeline for the master source ends at the Joiner transformation and may not have any targets
You cannot partition the master source for a Joiner transformation
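
The two phases described above (cache the master rows, then stream the detail rows against the cache) resemble a hash join. Below is a minimal Python sketch of that idea, assuming rows are plain dictionaries and a single join key; the function name `joiner` is illustrative and this is not how the server is actually implemented.

```python
from collections import defaultdict


def joiner(master_rows, detail_rows, key):
    """Yield joined rows: cache the master source first, then stream the detail source."""
    # Phase 1: read the entire master source and build the cache.
    cache = defaultdict(list)
    for m in master_rows:
        cache[m[key]].append(m)
    # Phase 2: read detail rows one at a time and probe the cache.
    for d in detail_rows:
        for m in cache.get(d[key], []):
            yield {**m, **d}
```

Because only the master side is held in memory, a sketch like this suggests why the smaller source is usually the better choice for the master.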


Questions


Repository Server

Repository Server

Informatica client applications and the Informatica Server access the repository database tables through the Repository Server.

The Informatica client connects to the Repository Server using its host name or IP address and its port number

The Repository Server can manage multiple repositories on different machines on the network.


Repository Server (Contd..)

For each repository database registered with the Repository Server, it configures and manages a Repository Agent process.

The Repository Agent is a multi-threaded process that performs the actions needed to retrieve, insert, and update metadata in the repository database tables.


Questions


Repository Manager

Repository Manager Window


Repository

The Informatica repository tables have an open architecture

Metadata can include information such as


mappings describing how to transform source data
sessions indicating when you want the Informatica Server to
perform the transformations
connect strings for sources and targets

The repository also stores administrative information such as


usernames and passwords
permissions and privileges


Repository
You can create and store the following types of metadata in the repository:
Database connections
Global objects
Mappings
Mapplets
Multi-dimensional metadata
Reusable transformations
Sessions and batches
Shortcuts
Source definitions
Target definitions
Transformations


Repository Types

There are three different types of repositories:


Standalone repository
Global repository
Local repository


Repository Manager Tasks

Perform repository functions:


Create, back up, copy, restore, upgrade, and delete repositories
Promote a repository to a global repository
Register and unregister local repositories with a global repository

Implement repository security:


Create, edit, and delete repository users and user groups
Assign and revoke repository privileges and folder
permissions
View locks and unlock objects


Repository Manager Tasks

Perform folder functions:


Create, edit, and delete folders
Copy a folder within the repository or to another repository
Compare folders within a repository or in different
repositories
Add and remove repository reports
Import and export repository connection information in the
registry
Analyze source/target, mapping, and shortcut dependencies


Dependency Window
The Dependency window can display the following types of
dependencies:

Source-target dependencies - lists all sources or targets related to the selected object and relevant information

Mapping dependencies - lists all mappings containing the selected object as well as relevant information

Shortcut dependencies - lists all shortcuts to the selected object and relevant details


Copying and Backing Up a Repository

You can copy a repository from one database to another
If the database into which the repository is copied already contains a repository, the Repository Manager deletes the existing repository
Backing up a repository saves the entire repository in a file
You can save this file in any local directory
You can recover data from a repository backup file


Crystal Reports
The Repository Manager includes four Crystal Reports that provide
views of your metadata:

Mapping report (map.rpt) - Lists source column and transformation details for each mapping

Source and target dependencies report (S2t_dep.rpt)

Target table report (Trg_tbl.rpt) - Provides target field transformation expressions, descriptions, comments for each target table

Executed session report (sessions.rpt) - Provides information about executed sessions


Repository Security

Can plan and implement security using the following features:


User groups
Repository users
Repository privileges
Folder permissions
Locking

Can assign users to multiple groups

Privileges are assigned to groups

Can assign privileges to individual usernames, but must assign each user to at least one user group


Repository Security - Viewing Locks

Can view existing locks in the repository in the Repository Manager


The Repository Manager provides two ways to view locks:
Show locks


Types of Locks
There are five kinds of locks on repository objects:
Read lock - Created when you open a repository object in a folder
for which you do not have write permission

Write lock - Created when you create or edit a repository object

Execute lock - Created when you start a session or batch

Fetch lock - Created when the repository reads information about repository objects from the database

Save lock


Folders

Folders provide a way to organize and store all metadata in the repository, including mappings and sessions

They are used to store sources, transformations, cubes, dimensions, mapplets, business components, targets, mappings, sessions and batches

Can copy objects from one folder to another

Can copy objects across repositories

The Designer allows you to create multiple versions within a folder


Folders

When a new version is created, the Designer creates a copy of all existing mapping metadata in the folder and places it into the new version

You can copy a session within a folder, but you cannot copy an individual session to a different folder

To copy all sessions within a folder to a different location, you can copy the entire folder


Folders

Any mapping in a folder can use only those source and target
definitions or reusable transformations that are stored:
in the same folder
in a shared folder and accessed through a shortcut

The configurable folder properties are:


Folder permissions
Folder owner
Owner's group
Shared or not shared


Folders

Folders have the following permission types:


Read permission
Write permission
Execute permission

Shared folders allow users to create shortcuts to objects in the folder

Shortcuts inherit changes to their shared object

Once you make a folder shared, you cannot reverse it


Copying Folders

Each time you copy a folder, the Repository Manager copies the following:
Sources, transformations, mapplets, targets, mappings, and business components
Sessions and batches
Folder versions


Copying Folders
When you copy a folder, the Repository Manager lets you:
Re-establish shortcuts
Choose an Informatica Server
Copy connections
Copy persisted values
Compare folders
Replace folders


Comparing Folders

The Compare Folders Wizard lets you perform the following comparisons:
Compare objects between two folders in the same repository
Compare objects between two folders in different repositories
Compare objects between two folder versions in the same folder

Each comparison also lets you specify the following comparison criteria:
Versions to compare
Object types to compare
Direction of comparison


Comparing Folders

Whether or not the Repository Manager notes a similarity or difference between two folders depends on the direction of the comparison
One-way comparisons check the selected objects of Folder1 against the objects in Folder2
Two-way comparisons check objects in Folder1 against those in Folder2 and also check objects in Folder2 against those in Folder1


Comparing Folders

The comparison wizard displays the following user-customized information:
Similarities between objects
Differences between objects
Outdated objects
Can edit and save the result of the comparison
The Repository Manager does not compare the field attributes of the objects in the folders when performing the comparison
A two-way comparison can sometimes reveal information a one-way comparison cannot
A one-way comparison does not note a difference if an object is present in the target folder but not in the source folder
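
The effect of comparison direction can be illustrated with set differences. This Python sketch is a simplification under stated assumptions: folders are modelled as sets of object names, and `compare_folders` and the object names in the usage note are made up for illustration.

```python
def compare_folders(folder1, folder2, two_way=False):
    """Report objects present in one folder but missing from the other."""
    # One-way: check Folder1's objects against Folder2 only.
    diffs = {"missing_in_folder2": sorted(folder1 - folder2)}
    if two_way:
        # Two-way: also check Folder2's objects against Folder1.
        diffs["missing_in_folder1"] = sorted(folder2 - folder1)
    return diffs
```

For `folder1 = {"m_load", "s_orders"}` and `folder2 = {"m_load", "m_extra"}`, the one-way result never mentions `m_extra`; only the two-way comparison reports it.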


Folder Versions

Maintaining different versions lets you revert to earlier work when needed

When you save a version, you save all metadata at a particular point in development

Later versions contain new or modified metadata, reflecting work that you have completed since the last version



Exporting and Importing Objects

In the Designer and Workflow Manager, you can export repository objects to an XML file and then import repository objects from the XML file

Can export the following repository objects:


Sources
Targets
Transformations
Mapplets
Mappings
Sessions

Can share objects by exporting and importing objects between repositories with the same version


Questions


Designer

Screen Shot of Designer


Designer Workspace

Navigator
Workspace
Status bar
Output

Designer Tools

Source Analyzer
To import or create source definitions for flat file, XML,
COBOL, ERP, and relational sources
Warehouse Designer
To import or create target definitions
Transformation Developer
To create reusable transformations
Mapplet Designer
To create mapplets
Mapping Designer
To create mappings


Source Analyzer

The following types of source definitions can be imported, created, or modified in the Source Analyzer:
Relational Sources - Tables, Views, Synonyms
Files - Fixed-Width or Delimited Flat Files, COBOL Files
Microsoft Excel Sources
XML Sources - XML Files, DTD Files, XML Schema Files
Data models using MX Data Model PowerPlug
SAP R/3, SAP BW, Siebel, IBM MQ Series by using PowerConnect


Source Analyzer - Importing Relational Source Definitions

After importing a relational source definition, business names for the table and columns can be entered


Source Analyzer - Importing Relational Source Definitions

The source definition appears in the Source Analyzer. In the Navigator, the new
source definition appears in the Sources node of the active repository folder,
under the source database name


Source Analyzer - Flat File Sources
Supports Delimited & Fixed length files
Flat File Wizard prompts for the following file properties
File name and location
File code page
File type
Column names and data types
Number of header rows in the file
Column size and null characters for fixed-width files
Delimiter type, quote character, and escape character for
delimited files


Source Analyzer - Flat File Sources

Flat file properties in the Source Analyzer:

Table name, business purpose, owner, and description
File type
Null characters for fixed-width files
Delimiter type, quote character, and escape character for delimited files
Column names and datatypes
Comments
HTML links to business documentation


Warehouse Designer

To create target definitions for file and relational sources


Import the definition for an existing target - Import the target
definition from a relational target
Create a target definition based on a source definition:
Relational source definition
Flat file source definition
Manually create a target definition or design several related
targets at the same time


Warehouse Designer - Tasks


Edit target definitions
Change in the target definitions gets propagated to the
mappings using that target
Create relational tables in the target database
If the target tables do not exist in the target database,
generate and execute the necessary SQL code to create
the target tables
Preview relational target data


Warehouse Designer - Create/Edit Target Definitions

Can edit Business Names, Constraints, Creation Options, Description, Keywords on the Table tab of the target definition
Can edit Column Name, Datatype, Precision and Scale, Not Null, Key Type, Business Name on the Columns tab of the target definition


Mapping

Mappings represent the data flow between sources and targets

When the Informatica Server runs a session, it uses the instructions configured in the mapping to read, transform, and write data

Every mapping must contain the following components:


Source definition
Transformation
Connectors

A mapping can also contain one or more mapplets


Mapping

Sample Mapping


Mapping - Invalidation

On editing a mapping, the Designer invalidates sessions under the following circumstances:
Add or remove sources or targets
Remove mapplets or transformations
Replace a source, target, mapplet, or transformation while
importing or copying objects
Add or remove Source Qualifiers or COBOL Normalizers, or
change the list of associated sources for these transformations
Add or remove a Joiner or Update Strategy transformation.
Add or remove transformations from a mapplet in the mapping
Change the database type for a source


Mapping - Components

Every mapping requires at least one transformation object that determines how the Informatica Server reads the source data:
Source Qualifier transformation
Normalizer transformation
ERP Source Qualifier transformation
XML Source Qualifier transformation

Transformations can be created for one-time use in a mapping, or as reusable transformations for use in multiple mappings


Mapping - Updates

By default, the Informatica Server updates targets based on key values
The default UPDATE statement for each target in a mapping can be overridden
For a mapping without an Update Strategy transformation, configure the session to mark source records as update


Mapping - Validation
The Designer marks a mapping valid when it passes the following checks:
Connection validation - Required ports are connected and all connections are valid
Expression validation - All expressions are valid
Object validation - The independent object definition matches
the instance in the mapping
The Designer performs connection validation each time you
connect ports in a mapping and each time you validate or save a
mapping
You can validate an expression in a transformation while you are
developing a mapping



Questions


Transformations used in Informatica

Transformations

A transformation is a repository object that generates, modifies, or passes data

The Designer provides a set of transformations that perform specific functions

Transformations in a mapping represent the operations the Informatica Server performs on data

Data passes into and out of transformations through ports that you connect in a mapping or mapplet

Transformations can be active or passive


Transformation Types

An active transformation can change the number of rows that pass through it

A passive transformation does not change the number of rows that pass through it

Transformations can be connected to the data flow, or they can be unconnected

An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation, and returns a value to that transformation
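
The active/passive distinction comes down to whether the row count can change. The following Python sketch contrasts the two, with a filter standing in for an active transformation and an expression for a passive one; both function names and the dictionary row format are illustrative assumptions.

```python
def filter_rows(rows, condition):
    """Active: rows that fail the condition are dropped, so the row count can change."""
    return [r for r in rows if condition(r)]


def expression(rows, compute):
    """Passive: exactly one output row per input row, with computed ports added."""
    return [{**r, **compute(r)} for r in rows]
```

Applied to two input rows, `filter_rows` may return fewer rows, while `expression` always returns two.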


Active Transformation Nodes

Advanced External Procedure - Calls a procedure in a shared library or in the COM layer of Windows NT
Aggregator - Performs aggregate calculations
ERP Source Qualifier - Represents the rows that the Informatica
Server reads from an ERP source when it runs a session
Filter - Filters records
Joiner - Joins records from different databases or flat file systems
Rank - Limits records to a top or bottom range
Router - Routes data into multiple transformations based on a
group expression
Source Qualifier - Represents the rows that the Informatica Server
reads from a relational or flat file source when it runs a session
Update Strategy - Determines whether to insert, delete, update, or
reject records


Passive Transformation nodes

Expression - Calculates a value


External Procedure - Calls a procedure in a shared library or
in the COM layer of Windows NT
Input - Defines mapplet input rows. Available only in the
Mapplet Designer
Lookup - Looks up values
Output - Defines mapplet output rows. Available only in the
Mapplet Designer
Sequence Generator - Generates primary keys
Stored Procedure - Calls a stored procedure
XML Source Qualifier - Represents the rows that the
Informatica Server reads from an XML source when it runs
a session


Transformations - Properties

Port Name
Copied ports will inherit the name of contributing port
Copied ports with the same name will be appended with a
number

Data types
Transformations use internal data types
Data types of input ports must be compatible with data types
of the feeding output port

Port Default values - can be set to handle nulls and errors

Description - can enter port comments


Aggregator Transformation

Performs aggregate calculations

Components of the Aggregator Transformation
Aggregate expression
Group by port
Sorted Input option
Aggregate cache

The Aggregator is an active and connected transformation


Aggregator Transformation

The following aggregate functions can be used within an Aggregator transformation:
AVG, COUNT, FIRST, LAST, MAX, MEDIAN
MIN, PERCENTILE, STDDEV, SUM, VARIANCE
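
As a rough illustration of group-by aggregation, the Python sketch below collects values per group and then applies a few of the functions listed above. It is a simplified model, not the server's algorithm: the `groups` dictionary merely plays the role of the aggregate cache, and the function and parameter names are hypothetical.

```python
from collections import defaultdict


def aggregate(rows, group_by, value):
    """Group rows by one port and compute SUM, COUNT, MIN, MAX, and AVG per group."""
    groups = defaultdict(list)  # stands in for the aggregate cache
    for r in rows:
        groups[r[group_by]].append(r[value])
    return {
        key: {
            "SUM": sum(vals),
            "COUNT": len(vals),
            "MIN": min(vals),
            "MAX": max(vals),
            "AVG": sum(vals) / len(vals),
        }
        for key, vals in groups.items()
    }
```

The output has one entry per group rather than one per input row, which is why the Aggregator counts as an active transformation.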


Expression Transformation

Can use the Expression transformation to
perform any non-aggregate calculations
calculate values in a single row
test conditional statements before you output the results to target tables or other transformations

Ports that must be included in an Expression Transformation:


Input or input/output ports for each value used in the calculation
Output port for the expression


Expression Transformation

Can enter multiple expressions in a single Expression transformation
Can enter only one expression for each output port
Can create any number of output ports in the transformation


Filter Transformation

It provides the means for filtering rows in a mapping


All ports in a Filter transformation are input/output
Only rows that meet the condition pass through it
Cannot concatenate ports from more than one transformation into the Filter
transformation
To maximize session performance, include the Filter transformation as close to the
sources in the mapping as possible
Does not allow setting output default values


Joiner Transformation

Joins two related heterogeneous sources residing in different locations or file systems
Can be used to join
Two relational tables existing in separate databases
Two flat files in potentially different file systems
Two different ODBC sources
Two instances of the same XML source
A relational table and a flat file source
A relational table and an XML source


Joiner Transformation
Use the Joiner transformation to join two sources with at
least one matching port
It uses a condition that matches one or more pairs of ports
between the two sources
Requires two input transformations from two separate data
flows
It supports the following join types
Normal (Default)
Master Outer
Detail Outer
Full Outer
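
The four join types differ in which unmatched rows they keep. The Python sketch below encodes the usual PowerCenter reading, stated here as an assumption rather than taken from the slide: Master Outer keeps all detail rows, Detail Outer keeps all master rows, and Full Outer keeps both. The function name and join-type strings are hypothetical, and unmatched rows simply lack the other side's columns instead of carrying NULLs.

```python
def join(master, detail, key, join_type="normal"):
    """Join two lists of dict rows on one key using the given join type."""
    master_keys = {m[key] for m in master}
    detail_keys = {d[key] for d in detail}
    # Normal join: only rows whose key appears on both sides.
    rows = [{**m, **d} for d in detail for m in master if m[key] == d[key]]
    if join_type in ("master_outer", "full_outer"):
        # Keep detail rows that have no matching master row.
        rows += [dict(d) for d in detail if d[key] not in master_keys]
    if join_type in ("detail_outer", "full_outer"):
        # Keep master rows that have no matching detail row.
        rows += [dict(m) for m in master if m[key] not in detail_keys]
    return rows
```

Running the same two inputs through all four join types is a quick way to see which side's unmatched rows survive.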


Lookup Transformation
Used to look up data in a relational table, view, or synonym
The Informatica Server queries the lookup table based on the lookup
ports in the transformation
It compares Lookup transformation port values to lookup table column
values based on the lookup condition
Can use the Lookup transformation to perform many tasks, including:
Get a related value
Perform a calculation
Update slowly changing dimension tables


Connected & Unconnected Lookup


Connected Lookup Transformation

Receives input values directly from another transformation in the pipeline

For each input row, the Informatica Server queries the lookup table or
cache based on the lookup ports and the condition in the
transformation

Passes return values from the query to the next transformation


Unconnected Lookup Transformation

Receives input values from an expression that calls the lookup with the :LKP reference qualifier, :LKP.lookup_transformation_name(argument, argument, ...), and returns one value.

Some common uses for unconnected lookups include:


Testing the results of a lookup in an expression
Filtering records based on the lookup results
Marking records for update based on the result of a lookup (for
example, updating slowly changing dimension tables)
Calling the same lookup multiple times in one mapping


Lookup Transformation

With unconnected Lookups, you can pass multiple input values into the
transformation, but only one column of data out of the transformation
Use the return port to specify the return value in an unconnected lookup
transformation


Lookup Caching
Session performance can be improved by caching the lookup table
Caching can be static or dynamic
By default, the lookup cache remains static and does not change during the session
Caching can be persistent, in which case the cache is used across sessions
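
A static lookup cache can be pictured as a dictionary that is built once and then only read. This is a minimal Python sketch of that idea, assuming a single condition column and a single return column; the function names are illustrative, and when several lookup rows share a key this sketch simply keeps the last one.

```python
def build_lookup_cache(lookup_rows, condition_col, return_col):
    """Read the lookup table once and index the return value by the condition column."""
    # If several rows share a key, the last one wins in this sketch.
    return {r[condition_col]: r[return_col] for r in lookup_rows}


def lookup(cache, value, default=None):
    """Probe the cache; the cache itself is never modified (static cache)."""
    return cache.get(value, default)
```

Each input row then costs a dictionary probe instead of a query against the lookup table, which is where the performance gain comes from.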

Router Transformation
A Router transformation tests data for one or more conditions and gives the option to route rows of data that do not meet any of the conditions to a default output group
It has the following types of groups:
Input
Output
There are two types of output groups:
User-defined groups
Default group
Create one user-defined group for each condition that you want to specify
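
The grouping logic can be sketched in a few lines of Python. The function `route`, the group names, and the `DEFAULT` label are hypothetical; the sketch also assumes that a row satisfying several conditions is sent to every matching group.

```python
def route(rows, groups):
    """Send each row to every user-defined group whose condition it meets."""
    out = {name: [] for name in groups}
    out["DEFAULT"] = []
    for r in rows:
        matched = False
        for name, condition in groups.items():
            if condition(r):
                out[name].append(r)
                matched = True
        if not matched:
            # Rows that meet none of the conditions go to the default group.
            out["DEFAULT"].append(r)
    return out
```

Unlike a filter, which drops rows that fail its single condition, this keeps every row available in some output group.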

Comparing Router & Filter Transformations


Sequence Generator Transformation
Generates numeric values

It can be used to
create unique primary key values
replace missing primary keys
cycle through a sequential range of numbers
It provides two output ports: NEXTVAL and CURRVAL
These ports cannot be edited or deleted
Ports cannot be added to the Sequence Generator transformation
When NEXTVAL is connected to the input port of another transformation, the Informatica Server generates a sequence of numbers

Sequence Generator Transformation

Connect the NEXTVAL port to a downstream transformation to
generate the sequence based on the Current Value and Increment By
properties
Connect the CURRVAL port only when the NEXTVAL port is
already connected to a downstream transformation
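
As a minimal sketch of how NEXTVAL could be derived from the Current Value and Increment By properties, consider the Python generator below. The parameter names end_value and cycle are assumptions made for illustration. Restarting from the initial current value when cycling is a simplification, and the sketch simply stops at the end of the range when cycling is off.

```python
from itertools import islice

def nextval(current_value=1, increment_by=1, end_value=None, cycle=False):
    """Yield a NEXTVAL-like sequence (end_value and cycle are assumed names)."""
    start = current_value
    value = current_value
    while True:
        if end_value is not None and value > end_value:
            if not cycle:
                return  # simplification: the sketch just stops here
            value = start  # simplification: restart from the initial value
        yield value
        value += increment_by

keys = list(islice(nextval(current_value=1, increment_by=1), 5))
cycled = list(islice(nextval(1, 1, end_value=3, cycle=True), 7))
```

The first call models unique key generation; the second models cycling through a sequential range of numbers.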


Source Qualifier Transformation
The Source Qualifier represents the records that the
Informatica Server reads when it runs a session
Can use the Source Qualifier to perform the following tasks:
Join data originating from the same source database
Filter records when the Informatica Server reads source
data
Specify an outer join rather than the default inner join
Specify sorted ports
Select only distinct values from the source
Create a custom query to issue a special SELECT
statement for the Informatica Server to read source data


Source Qualifier Transformation

For relational sources, the Informatica Server generates a query for each Source
Qualifier when it runs a session
The default query is a SELECT statement that selects each source column used in the
mapping
The Informatica Server reads only those columns in Source Qualifier that are
connected to another transformation


Update Strategy Transformation


It determines whether to insert, update, delete
or reject records
Constants for each database operation:

Operation    Constant     Numeric Value
Insert       DD_INSERT    0
Update       DD_UPDATE    1
Delete       DD_DELETE    2
Reject       DD_REJECT    3
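
To make the flagging idea concrete, here is a small Python sketch that returns one of the four constants for each row. The constant values follow the table above; the decision rules (reject rows without a key, delete rows marked deleted, update existing keys, insert the rest) and the field names are hypothetical examples, not rules from the slides.

```python
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def update_strategy(row, existing_keys):
    """Return the flag for one row; the rules here are illustrative only."""
    if row["key"] is None:
        return DD_REJECT
    if row.get("deleted"):
        return DD_DELETE
    return DD_UPDATE if row["key"] in existing_keys else DD_INSERT

existing = {1, 2}
rows = [
    {"key": 1},
    {"key": 7},
    {"key": 2, "deleted": True},
    {"key": None},
]
flags = [update_strategy(r, existing) for r in rows]
```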

Rank Transformation
Lets you select only the top or bottom rank of data, not just
one value
Can use it to return
the largest or smallest numeric value in a port or group
the strings at the top or the bottom of a session sort order
During the session, the Informatica Server caches input
data until it can perform the rank calculations
Can select only one port to define a rank


Rank Transformation

When you create a Rank transformation, you can configure the following properties:
Enter a cache directory
Select the top or bottom rank
Select the input/output port that contains values used to determine the rank. You
can select only one port to define a rank
Select the number of rows falling within a rank
Define groups for ranks
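
The properties above (top or bottom, one rank port, number of rows, groups) map naturally onto a small Python sketch. The function and parameter names are hypothetical. Like the transformation, the sketch collects all input rows before it ranks them.

```python
import heapq
from collections import defaultdict

def rank_rows(rows, rank_port, group_by, top=True, n=1):
    """Return the top (or bottom) n rows per group, ranked on a single port."""
    groups = defaultdict(list)
    for row in rows:  # cache all input first, then rank
        groups[row[group_by]].append(row)
    pick = heapq.nlargest if top else heapq.nsmallest
    return {
        group: pick(n, members, key=lambda r: r[rank_port])
        for group, members in groups.items()
    }

sales = [
    {"region": "E", "sales": 10},
    {"region": "E", "sales": 30},
    {"region": "W", "sales": 20},
]
top_per_region = rank_rows(sales, rank_port="sales", group_by="region", top=True, n=1)
bottom_per_region = rank_rows(sales, rank_port="sales", group_by="region", top=False, n=1)
```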


Rank Transformation
Ports:
Variable port - Can use to store values or calculations to use in an expression
Rank port - Use to designate the column for which you want to rank values


Stored Procedure Transformation
A Stored Procedure transformation is
an important tool for populating and maintaining databases
used to call a stored procedure
A stored procedure is a precompiled collection of Transact-SQL statements and
optional flow control statements, similar to an executable
script
The stored procedure must exist in the database before
creating a Stored Procedure transformation
One of the most useful features of stored procedures is the
ability to send data to the stored procedure, and receive data
from the stored procedure


Stored Procedure Transformation
There are three types of data that pass between the
Informatica Server and the stored procedure:
Input/Output parameters - For many stored
procedures, you provide a value and receive a
value in return
Return values - Most databases provide a return
value after running a stored procedure
Status codes - Status codes provide error handling
for the Informatica Server during a session

Stored Procedure Transformation

The following list describes the options for running a Stored
Procedure transformation:
Normal - During a session, the stored procedure runs
where the transformation exists in the mapping on a row-by-row basis
Pre-load of the Source - Before the session retrieves data
from the source, the stored procedure runs
Post-load of the Source - After the session retrieves data
from the source, the stored procedure runs
Pre-load of the Target - Before the session sends data to
the target, the stored procedure runs
Post-load of the Target - After the session sends data to
the target, the stored procedure runs


Stored Procedure
Transformation

Can set up the Stored Procedure transformation in one of two modes, either
connected or unconnected
The flow of data through a mapping in connected mode also passes through the
Stored Procedure transformation
Cannot run the same instance of a Stored Procedure transformation in both
connected and unconnected mode in a mapping. You must create different
instances of the transformation


Stored Procedure Transformation

The unconnected Stored Procedure transformation is not connected
directly to the flow of the mapping
It either runs before or after the session, or is called by an
expression in another transformation in the mapping


Dynamic Lookup Transformation

A Lookup transformation using a dynamic cache has the following properties
that a Lookup transformation using a static cache does not have:
NewLookupRow
Associated Port


Dynamic Lookup Transformation

You might want to configure the transformation to use a dynamic cache
when the target table is also the lookup table. When you use a
dynamic cache, the Informatica Server inserts rows into the cache as it
passes rows to the target.
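
The difference from a static cache can be sketched in Python: the cache is updated while rows flow through, so later rows see earlier ones. The field names id and name are hypothetical, and the NewLookupRow values used here (0 = no change, 1 = inserted, 2 = updated) are an assumption about how the flag is encoded.

```python
# Assumed NewLookupRow encoding; treat these values as an illustration.
NO_CHANGE, INSERTED, UPDATED = 0, 1, 2

def dynamic_lookup(rows, cache):
    """Yield (row, new_lookup_row) and keep the cache in step with the target."""
    for row in rows:
        key, value = row["id"], row["name"]
        if key not in cache:
            cache[key] = value  # row is new: insert it into the cache
            yield row, INSERTED
        elif cache[key] != value:
            cache[key] = value  # row changed: update the cached copy
            yield row, UPDATED
        else:
            yield row, NO_CHANGE

cache = {1: "a"}
incoming = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
]
flags = [flag for _, flag in dynamic_lookup(incoming, cache)]
```

The second row with id 2 is flagged as unchanged only because the first one was already inserted into the cache, which is the behaviour a static cache cannot provide.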


Transformation Language
The designer provides a transformation language to help you write
expressions to transform source data
With the transformation language, you can create a transformation
expression that takes the data from a port and changes it
Can write expressions in the following transformations:
Aggregator
Expression
Filter
Rank
Router
Update Strategy


Transformation Language
Expressions can consist of any combination of the
following components:
Ports (input, input/output, variable)
String literals, numeric literals
Constants
Functions
Local and system variables
Mapping parameters and mapping variables
Operators
Return values


Transformation Language
The functions available in PowerCenter are
Aggregate Functions e.g. AVG, MIN, MAX
Character Functions e.g. CONCAT, LENGTH
Conversion Functions e.g. TO_CHAR, TO_DATE
Date Functions e.g. DATE_DIFF, LAST_DAY
Numeric Functions e.g. ABS, CEIL, LOG
Scientific Functions e.g. COS, SINH
Special Functions e.g. DECODE, IIF, ABORT
Test Functions e.g. ISNULL, IS_DATE
Variable Functions e.g. SETMAXVARIABLE

Questions


Reusable Transformations and Mapplets

Reusable Transformation
A Transformation is said to be in reusable mode
when multiple instances of the same
transformation can be created.
Reusable transformations can be used in multiple
mappings.
Creating Reusable transformations:
Design it in the Transformation Developer
Promote a standard transformation from the Mapping
Designer.

Mapplet
A mapplet is a reusable object that represents a set of
transformations
It lets you reuse transformation logic and can contain as
many transformations as needed
Mapplets can:
Include source definitions
Accept data from sources in a mapping
Include multiple transformations
Pass data to multiple pipelines
Contain unused ports

Sample Mapplet in a Mapping


Mapplet - Components
Each mapplet must include the following:
One Input transformation and/or Source Qualifier
transformation
At least one Output transformation
A Mapplet should contain exactly one of the following:
Input transformation with at least one port
connected to a transformation in the mapplet
Source Qualifier transformation with at least one
port connected to a source definition

Mapplet


Expanded Mapplet

For example, in the figure, the mapplet uses the Input transformation
IN_CustID_FirstLastName to define mapplet input ports. The Input
transformation is connected to one transformation, EXP_WorkaroundLookup,
which passes data to two separate transformations


Questions


Workflow Manager & Workflow Monitor

Workflow Manager & Workflow Monitor
Server Manager

Workflow Manager
1. Task Developer
2. Workflow Designer
3. Worklet Designer

Workflow Monitor
1. Gantt Chart
2. Task View

Workflow Manager
The Workflow Manager replaces the Server Manager of version
5.0. Instead of running sessions, you now create a process called
a workflow in the Workflow Manager.
A workflow is a set of instructions on how to execute tasks such as
sessions, emails, and shell commands.
A session is now one of the many tasks you can execute in the
Workflow Manager.
The Workflow Manager provides other tasks such as Assignment,
Decision, and Events. You can also create branches with
conditional links. In addition, you can batch workflows by creating
worklets in the Workflow Manager.


Workflow Manager Screenshot


Workflow Manager Tools


Task Developer
Use the Task Developer to create tasks you want to
execute in the workflow.

Workflow Designer
Use the Workflow Designer to create a workflow by
connecting tasks with links. You can also create tasks in
the Workflow Designer as you develop the workflow.

Worklet Designer
Use the Worklet Designer to create a worklet.


Workflow Tasks

Command. Specifies a shell command run during the workflow.
Control. Stops or aborts the workflow.
Decision. Specifies a condition to evaluate.
Email. Sends email during the workflow.
Event-Raise. Notifies the Event-Wait task that an event has
occurred.
Event-Wait. Waits for an event to occur before executing the
next task.
Session. Runs a mapping you create in the Designer.
Assignment. Assigns a value to a workflow variable.
Timer. Waits for a timed event to trigger.


Create Task


Workflow Monitor
PowerCenter 6.0 provides a new tool, the Workflow
Monitor, to monitor workflows, worklets, and tasks.
The Workflow Monitor displays information about
workflows in two views:
1. Gantt Chart view
2. Task view

You can monitor workflows in online and offline mode.

Workflow Monitor Gantt Chart


Workflow Monitor Task View


Questions


Performance Tuning

Performance Tuning
First step in performance tuning is to identify the
performance bottleneck in the following order :
Target
Source
Mapping
Session
System
The most common performance bottleneck occurs
when the Informatica Server writes to a target database.


Target Bottlenecks

Identifying
A target bottleneck can be identified by
configuring the session to write to a flat file
target.

Optimizing
Dropping Indexes and Key Constraints before
loading.
Increasing commit intervals.
Use of Bulk Loading / External Loading.

Source Bottlenecks

Identifying
Add a Filter transformation after the Source Qualifier and set its condition to
false so that no data is processed past the Filter transformation.
If the time it takes to run the new session remains
about the same, then there is a source bottleneck.
In a test mapping remove all the transformations and if
the performance is similar, then there is a source
bottleneck.

Optimizing
Optimize the query by using hints.
Use Informatica conditional filters if the source
system lacks indexes.

Mapping Bottlenecks

Identifying
If there is no source bottleneck, add a Filter
transformation in the mapping before each target
definition. Set the filter condition to false so that no
data is loaded into the target tables. If the time it
takes to run the new session is the same as the
original session, there is a mapping bottleneck.

Optimizing
Configure for Single-Pass reading
Avoid unnecessary data type conversions.
Avoid database reject errors.
Use Shared Cache / Persistent Cache

Session Bottlenecks

Identifying
If there is no source, target, or mapping bottleneck, then there
may be a session bottleneck.
Use Collect Performance Details. Any value other than zero in
the readfromdisk and writetodisk counters for Aggregator,
Joiner, or Rank transformations indicates a session bottleneck.
Low (0-20%) BufferInput_efficiency and BufferOutput_efficiency
counter values also indicate a session bottleneck.
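
The two counter rules can be expressed as a short Python check. The dictionary layout and the counter key names such as AGG_sales_readfromdisk are hypothetical; real performance details output has its own format, so this is only a sketch of the decision logic.

```python
def session_bottleneck_signs(counters, low_efficiency=20):
    """Return the counter names that point to a session bottleneck."""
    signs = []
    for name, value in counters.items():
        lowered = name.lower()
        if lowered.endswith(("readfromdisk", "writetodisk")):
            if value != 0:  # any disk access by a cache is a warning sign
                signs.append(name)
        elif lowered.endswith(("bufferinput_efficiency", "bufferoutput_efficiency")):
            if value <= low_efficiency:  # 0-20% counts as low
                signs.append(name)
    return signs

counters = {
    "AGG_sales_readfromdisk": 0,
    "AGG_sales_writetodisk": 12,
    "SQ_orders_BufferInput_efficiency": 15,
    "SQ_orders_BufferOutput_efficiency": 85,
}
signs = session_bottleneck_signs(counters)
```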

Optimizing
Increase the number of partitions.
Tune session parameters.

DTM Buffer Size (6M - 128M)
Buffer Block Size (4K - 128K)
Data (2M - 24M) / Index (1M - 12M) Cache Size

Use incremental Aggregation if possible.


Session Bottlenecks - Memory

Configure the index and data cache memory for the Aggregator, Rank, and
Joiner transformations in the Configuration Parameters dialog box
The amount of memory you configure depends on partitioning, the
transformation that requires the largest cache, and how much memory cache
and disk cache you want to use


Session Bottlenecks - Cache


When you cache the Lookup transformation, the Informatica Server
builds a cache in memory when it processes the first row of data in the
transformation
The Informatica Server builds the cache and queries it for each row that
enters the transformation
It processes data according to the way you configure the Lookup
transformation


Incremental Aggregation
First Run creates idx and dat files.
Second Run performs the following actions:
For each input record, the Server checks historical information in the index
file for a corresponding group, then:
If it finds a corresponding group, it performs the aggregate operation
incrementally, using the aggregate data for that group, and saves the
incremental change
If it does not find a corresponding group, it creates a new group and
saves the record data
When writing to the target, the Informatica Server:
Updates modified aggregate groups in the target
Inserts new aggregate data
Deletes removed aggregate data
Ignores unchanged aggregate data
Saves modified aggregate data in the index and data files
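
The second-run logic can be illustrated with a Python sketch that uses a SUM aggregate. The in-memory dictionary stands in for the saved index and data files, and the action labels are hypothetical. Deleting removed aggregate data is left out of this simplified version.

```python
def incremental_sum(saved_groups, new_records):
    """Merge new (group, amount) records into saved sums; return target actions."""
    actions = {}
    for group, amount in new_records:
        if group in saved_groups:
            saved_groups[group] += amount  # aggregate incrementally
            actions.setdefault(group, "update")
        else:
            saved_groups[group] = amount  # create a new group
            actions[group] = "insert"
    return actions  # groups not listed here are unchanged and ignored

saved = {"east": 100, "west": 50}  # state left behind by the first run
actions = incremental_sum(saved, [("east", 10), ("north", 5), ("north", 7)])
```

Only the groups touched by new records produce an action, so the target write is proportional to the change and not to the full history.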


Incremental Aggregation

You can find options for incremental aggregation on the Transformations tab in the
session properties
The Server Manager displays a warning indicating the Informatica Server overwrites
the existing cache and a reminder to clear this option after running the session


System Bottlenecks

Identifying
If there is no source, target, mapping, or session bottleneck,
then there may be a system bottleneck.
Use system tools to monitor CPU usage, memory usage, and
paging.
On Windows: Task Manager
On Unix systems: tools like sar and iostat. For example: sar -u
(% usage on user, idle time, I/O waiting time)

Optimizing
Improve network speed.
Improve CPU performance
Check hard disks on related machines
Reduce Paging


PMCMD
Can use the command line program pmcmd to communicate with the
Informatica Server
Can perform the following actions with pmcmd:
Determine if the Informatica Server is running
Start sessions and batches
Stop sessions and batches
Recover sessions
Stop the Informatica Server
Can configure repository usernames and passwords as
environment variables with pmcmd
Can also customize the way pmcmd displays the date and time on
the machine running the Informatica Server
pmcmd returns zero on success and non-zero on failure
You can use pmcmd with operating system scheduling tools like cron
to schedule sessions, and you can embed pmcmd into shell scripts
or Perl programs to run or schedule sessions
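
Because pmcmd reports success through its exit code, a wrapper script only needs to check for zero. The Python sketch below shows that check. It deliberately uses stand-in commands and no pmcmd arguments, since the exact pmcmd syntax depends on the PowerCenter version and is not given here; run_and_check is a hypothetical helper name.

```python
import subprocess
import sys

def run_and_check(argv):
    """Run a command such as pmcmd and return True when its exit code is 0."""
    completed = subprocess.run(argv)
    return completed.returncode == 0

# Stand-in commands so the sketch runs without PowerCenter installed.
ok = run_and_check([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_and_check([sys.executable, "-c", "raise SystemExit(3)"])
```

In a scheduled job, the argument list would hold the real pmcmd command line, and a False result would trigger an alert or a retry.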


PMCMD
Need the following information to use pmcmd:
Repository username
Repository password
Connection type - The type of connection from the client
machine to the Informatica Server
Port or connection - The TCP/IP port number or IPX/SPX
connection (Windows NT/2000 only) to the Informatica
Server
Host name - The machine hosting the Informatica Server
Session or batch name - The names of any sessions or
batches you want to start or stop
Folder name - The folder names for those sessions or
batches
Parameter file

Commit Points
A commit interval is the interval at which the Informatica
Server commits data to relational targets during a session
The commit point can be a factor of the commit interval, the
commit interval type, and the size of the buffer blocks
The commit interval is the number of rows you want to use
as a basis for the commit point
The commit interval type is the type of rows that you want
to use as a basis for the commit point
Can choose between the following types of commit interval
Target-based commit
Source-based commit
During a source-based commit session, the Informatica
Server commits data to the target based on the number of
rows from an active source in a single pipeline

Commit Points
During a target-based
commit session, the
Informatica Server continues
to fill the writer buffer after it
reaches the commit interval
When the buffer block is
filled, the Informatica Server
issues a commit command
As a result, the amount of
data committed at the
commit point generally
exceeds the commit interval
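
A simplified model of target-based commit makes the effect visible. The Python sketch below assumes fixed-size buffer blocks, a commit whenever a block fills after the next multiple of the commit interval has been reached, and a final commit at the end of the session. The function name and these rules are assumptions for illustration, not a description of the server's internals.

```python
def target_based_commits(total_rows, commit_interval, rows_per_block):
    """Return the row counts at which commits are issued in this simplified model."""
    commits = []
    next_threshold = commit_interval
    filled = 0
    while filled + rows_per_block <= total_rows:
        filled += rows_per_block  # one more buffer block is full
        if filled >= next_threshold:  # interval already reached: commit now
            commits.append(filled)
            next_threshold = (filled // commit_interval + 1) * commit_interval
    if not commits or commits[-1] != total_rows:
        commits.append(total_rows)  # final commit at end of session
    return commits

commit_points = target_based_commits(
    total_rows=40000, commit_interval=10000, rows_per_block=7500
)
```

With a commit interval of 10,000 rows and blocks of 7,500 rows, the first commit in this model happens at 15,000 rows, which is more than the commit interval.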


Commit Points
During a source-based commit session, the Informatica Server
commits data to the target based on the number of rows from an
active source in a single pipeline
These rows are referred to as source rows
A pipeline consists of a source qualifier and all the transformations
and targets that receive data from the source qualifier
An active source can be any of the following active
transformations:
Advanced External Procedure
Source Qualifier
Normalizer
Aggregator
Joiner
Rank
Mapplet, if it contains one of the above transformations


Commit Points

When the Informatica Server runs a source-based commit session, it identifies the
active source for each pipeline in the mapping
The Informatica Server generates a commit row from the active source at every
commit interval
When each target in the pipeline receives the commit row, the Informatica Server
performs the commit


Commit Points


Multiple Servers
You can register multiple PowerCenter Servers with a
PowerCenter repository
Can run these servers at the same time
Can distribute the repository session load across available
servers to improve overall performance
Can use the Server Manager to administer and monitor
multiple servers
With multiple Informatica Servers, you need to decide which
server you want to run each session and batch
You can register and run only one PowerMart Server in a
local repository
Cannot start a PowerMart Server if it is registered in a local
repository that has multiple servers registered to it

Multiple Servers

When attached to multiple servers, you can only view, or monitor, one Informatica
Server at a time, but you have access to all the servers in the repository


Questions


Debugger

Debugger

Can debug a valid mapping to gain troubleshooting
information about data and error conditions
To debug a mapping, you configure and run the Debugger
from within the Mapping Designer
When you run the Debugger, it pauses at breakpoints and
allows you to view and edit transformation output data
After you save a mapping, you can run some initial tests
with a debug session before you configure and run a
session in the Server Manager

Debugger


Debugger

Can use the following process to debug a mapping:
Create breakpoints
Configure the Debugger
Run the Debugger
Monitor the Debugger
Debug log
Session log
Target window
Instance window
Modify data and breakpoints

A breakpoint can consist of an instance name, a breakpoint
type, and a condition

Debugger

After you set the instance name, breakpoint type, and optional data condition,
you can view each parameter in the Breakpoints section of the Breakpoint Editor


Questions
