
Informatica

PowerMart / PowerCenter 8.6

Introduction
PowerMart and PowerCenter provide an environment that allows you to load data into a centralized location, such as a data mart, data warehouse, or operational data store (ODS).
You can extract data from multiple sources, transform the data according to business
logic you build in the client application, and load the transformed data into file and
relational targets.
Informatica provides the following integrated components:

Informatica repository. The Informatica repository is at the center of the Informatica suite. You create a set of metadata tables within the repository database that the Informatica applications and tools access. The Informatica Client and Server access the repository to save and retrieve metadata.

Informatica Client. Use the Informatica Client to manage users, define sources
and targets, build mappings and mapplets with the transformation logic, and create
sessions to run the mapping logic. The Informatica Client has three client
applications: Repository Manager, Designer, and Server Manager.

Informatica Server. The Informatica Server extracts the source data, performs the
data transformation, and loads the transformed data into the targets.

PowerMart/PowerCenter Architecture

Sources
PowerMart and PowerCenter access the following sources:

Relational. Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and Teradata.

File. Fixed and delimited flat file, COBOL file, and XML.

Extended. If you use PowerCenter, you can purchase additional PowerConnect products to access business sources such as PeopleSoft, SAP R/3, Siebel, and IBM MQSeries.

Mainframe. If you use PowerCenter, you can purchase PowerConnect for IBM DB2 for faster access to IBM DB2 on MVS.

Other. Microsoft Excel and Access.



Targets
PowerMart and PowerCenter can load data into the following targets:

Relational. Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL
Server, and Teradata.

File. Fixed and delimited flat files and XML.

Extended. If you use PowerCenter, you can purchase an integration server to load data into SAP BW. You can also purchase PowerConnect for IBM MQSeries to load data into IBM MQSeries message queues.

Other. Microsoft Access.

You can load data into targets using ODBC or native drivers, FTP, or
external loaders.

Repository
The Informatica repository is a set of tables that stores the metadata you create using the
Informatica Client tools. You create a database for the repository, and then use the
Repository Manager to create the metadata tables in the database.
You add metadata to the repository tables when you perform tasks in the Informatica Client
application such as creating users, analyzing sources, developing mappings or mapplets,
or creating sessions. The Informatica Server reads metadata created in the Client
application when you run a session. The Informatica Server also creates metadata such as
start and finish times of a session or session status.
When you use PowerCenter, you can develop global and local repositories to share metadata:

Global repository. The global repository is the hub of the domain. Use the global
repository to store common objects that multiple developers can use through
shortcuts. These objects may include operational or application source definitions,
reusable transformations, mapplets, and mappings.

Local repositories. A local repository is within a domain that is not the global
repository. Use local repositories for development. From a local repository, you can
create shortcuts to objects in shared folders in the global repository. These objects
typically include source definitions, common dimensions and lookups, and enterprise
standard transformations. You can also create copies of objects in non-shared
folders.

Informatica Client
The Informatica Client comprises three applications that you use to manage the repository, design mappings and mapplets, and create sessions to load the data.
Repository Manager. Use the Repository Manager to create and administer the metadata
repository. You can create repository users and groups, assign privileges and permissions,
manage folders and locks, and print Crystal Reports containing repository data.
Designer. Use the Designer to create mappings that contain transformation instructions for
the Informatica Server. Before you can create mappings, you must add source and target
definitions to the repository. The Designer has five tools that you use to analyze sources,
design target schemas, and build source-to-target mappings:

Source Analyzer. Import or create source definitions.
Warehouse Designer. Import or create target definitions.
Transformation Developer. Develop reusable transformations to use in mappings.
Mapplet Designer. Create sets of transformations to use in mappings.
Mapping Designer. Create mappings that the Informatica Server uses to extract, transform, and load data.

Server Manager. Use the Server Manager to create, schedule, execute, and monitor
sessions. You create a session based on a mapping in the repository and schedule it to run
against an Informatica Server. You can view scheduled and running sessions for each
Informatica Server in the domain. You can also access details about those sessions.

Informatica Server
The Informatica Server reads mapping and session information from the
repository. It extracts data from the mapping sources and stores the data
in memory while it applies the transformation rules that you configure in
the mapping. The Informatica Server loads the transformed data into the
mapping targets.
You can install the Informatica Server on a Windows NT/2000 or UNIX
server machine.
You can communicate with the Informatica Server using pmcmd, a
command line program.
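For example, assuming an Integration Service named IntSvc in a domain named Dom, and a workflow wf_load_emp in folder DW (all hypothetical names), a start command in the 8.x pmcmd syntax might look like the sketch below; older releases used a positional, session-based syntax, so check the pmcmd reference for your release:

    pmcmd startworkflow -sv IntSvc -d Dom -u Administrator -p MyPassword -f DW wf_load_emp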

Connectivity
PowerMart and PowerCenter use the following types of connectivity:

Network Protocol
Native Drivers
ODBC

The Informatica Client uses ODBC and native drivers to connect to source, target, and repository databases. The Server Manager and the Informatica Server use TCP/IP or IPX/SPX to communicate with each other. The Informatica Server uses native drivers to connect to the databases to move data; you can optionally use ODBC instead.

Connectivity Overview


Metadata Reporter
The Metadata Reporter is a web-based application that allows you to run reports against repository metadata.


Using Repository Manager


Use the Repository Manager to administer your repositories. The Repository Manager
allows you to navigate through multiple folders and repositories, and perform the following
tasks:

Perform repository maintenance. You can create, copy, restore, upgrade, back up, and delete repositories. With a global repository, you can register and unregister local repositories. You can import and export repository connection information in the registry and edit repository connection information.

Implement repository security. You can create, edit, and delete repository users
and user groups. You can assign and revoke repository privileges and folder
permissions.

Perform folder functions. You can create, edit, copy, and delete folders. All the work
you perform in the Designer is stored in folders. If you want to share metadata, you
can configure a folder to be shared.

View metadata. You can analyze sources, targets, mappings, and shortcut
dependencies, search by keyword, and view the properties of repository objects.

Customize the Repository Manager. You can add, edit, and remove repositories in the Navigator, and view or hide windows.

Run repository reports. You can run repository reports such as the Source to Target
Dependency report or the Session report. You can also add and remove customized
reports.

Repository Manager Windows


The Repository Manager can display the following windows:

Navigator. Displays all objects that you create in the Repository Manager,
the Designer, and the Server Manager. It is organized first by repository,
then by folder and folder version. Viewable objects include sources, targets,
dimensions, cubes, mappings, mapplets, transformations, sessions, and
batches. You can also view folder versions and business components.

Main. Provides properties of the object selected in the Navigator window. The columns in this window change depending on the object selected in the Navigator window.

Dependency. Shows dependencies on sources, targets, mappings, and shortcuts for objects selected in either the Navigator or Main window.

Output. Provides the output of tasks executed within the Repository Manager, such as creating a repository.

Repository Manager Windows


Repository Manager Navigator


Repository Objects
You create repository objects using the Repository Manager, Designer, and Server
Manager client tools. You can view the following objects in the Navigator window of the
Repository Manager:

Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.

Target definitions. Definitions of database objects or files that contain the target
data.

Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.

Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.

Reusable transformations. Transformations that you can use in multiple mappings.

Mapplets. A set of transformations that you can use in multiple mappings.

Sessions and batches. Sessions and batches store information about how and when the Informatica Server moves data. Each session corresponds to a single mapping. You can group several sessions together in a batch.

Design Process
The goal of the design process is to create mappings that depict the flow of
data between sources and targets, including changes made to the data before it
reaches the targets. However, before you can create a mapping, you must first
create or import source and target definitions. You might also want to create
reusable objects such as reusable transformations or mapplets.
Perform the following design tasks in the Designer:
1. Import source definitions. Use the Source Analyzer to connect to the sources and import the source definitions.

2. Create or import target definitions. Use the Warehouse Designer to define relational, flat file, or XML targets to receive data from sources. You can import target definitions from a relational database, or you can manually create a target definition.

3. Create the target tables. If you add a target definition to the repository that does not exist in a relational database, you need to create target tables in your target database. You do this by generating and executing the necessary SQL code within the Warehouse Designer, as sketched below.
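For example, for the hypothetical T_EMPLOYEES target used later in this guide, the SQL the Warehouse Designer generates and executes would resemble the following (column names and datatypes are illustrative, assuming an Oracle target):

    -- DDL generated and executed by the Warehouse Designer (illustrative)
    CREATE TABLE T_EMPLOYEES (
        EMPLOYEE_ID NUMBER NOT NULL,
        LAST_NAME   VARCHAR2(30),
        FIRST_NAME  VARCHAR2(30),
        SALARY      NUMBER(10,2),
        PRIMARY KEY (EMPLOYEE_ID)
    );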

Design Process
4. Design mappings. Once you have source and target definitions in the
repository, you can create mappings in the Mapping Designer. A mapping is a
set of source and target definitions linked by transformation objects that define
the rules for data transformation. A transformation is an object that performs a
specific function in a mapping, such as looking up data or performing
aggregation.
5. Create mapping objects. Optionally, you can create reusable objects for use in multiple mappings. Use the Transformation Developer to create reusable transformations. Use the Mapplet Designer to create mapplets. A mapplet is a reusable set of transformations that may also contain sources.
6. Debug mappings. Use the Mapping Designer to debug a valid mapping to
gain troubleshooting information about data and error conditions.
7. Import and export repository objects. You can import and export
repository objects, such as sources, targets, transformations, mapplets, and
mappings to archive or share metadata.

Designer Windows
You can display the following windows in the Designer:

Navigator. Connect to repositories, and open folders within the Navigator. You can also copy objects and create shortcuts within the Navigator.
Workspace. Open different tools in this window to create and edit
repository objects such as sources, targets, mapplets, transformations, and
mappings.
Output. View details about tasks you perform, such as saving your work or
validating a mapping.
Status bar. Displays the status of the operation you perform.
Overview. An optional window to simplify viewing a workspace that
contains a large mapping or multiple objects.
Instance data. View transformation data while you run the Debugger to
debug a mapping.
Target data. View target data while you run the Debugger to debug a
mapping.

Designer Windows


Debugger Window


Server Manager
Use the Server Manager to create, schedule, monitor, edit, copy, and abort
sessions. You can group multiple sessions to run as a single unit, known as a
batch. When you create a session, you select a valid mapping and configure
other settings such as connections, error handling, and scheduling. You may
also be able to override some transformation properties.
When you monitor sessions, the Server Manager displays status such as
scheduled, completed, and failed sessions. It also displays some errors
encountered while running the session. You can find a complete log of errors in
the session log and server log files. Before you create a session, you must
configure the following connection information:

Informatica Server connection. Register the Informatica Server with the repository before you can start it or create a session to run against it.

Database connections. Create connections to source and target systems.

Other connections. If you want to use external loaders or FTP, you configure access within the Server Manager.


Session Properties
You can set the following properties when you create a session:

Informatica Server. If you use PowerCenter, you can select an Informatica Server to run a session.

Source and target location. Select a connection or specify a path for the
source and target data.

Scheduling information. Schedule the session to run on demand or on a repeating schedule.

Error handling. Configure error handling parameters that determine how the Informatica Server behaves when it encounters errors.

Post-session email. Send post-session email depending on the success or failure of the session.

Pre- and post-session scripts. Run shell commands before or after the session.

Server Manager Windows


The Server Manager displays the following windows:

Navigator. View and select configured sessions.

Configure. Create and edit sessions.

Monitor. View information about running and completed sessions.

Output. View messages from the Informatica Server.

Status. Displays the status of the operation you perform.


Server Manager Windows


Creating a Repository
To create a repository:
1. Launch the Repository Manager by choosing Programs-PowerCenter (or PowerMart) Client-Repository Manager from the Start Menu.
2. In the Repository Manager, choose Repository-Create Repository.
Note: You must be running the Repository Manager in Administrator mode to see the Create Repository option on the menu. Administrator mode is the default when you install the program.
3. In the Create Repository dialog box, specify the name of the new repository, as well as the parameters needed to connect to the repository database through ODBC.


Creating a Repository


Creating Repository Users & Groups


You can create a repository user profile for everyone working in the repository,
each with a separate username and password. You can also create user
groups and assign each user to one or more groups. Then, grant repository
privileges to each group, so users in the group can perform tasks within the
repository (such as use the Designer or create sessions).
The repository user profile is not the same as the database user profile. While a
particular user might not have access to a database as a database user, that
same person can have privileges to a repository in the database as a repository
user.
Informatica tools include two basic types of security:

Privileges. Repository-wide security that controls which task or set of tasks a single user or group of users can access.

Permissions. Security assigned to individual folders within the repository. You can perform various tasks for each privilege.


Repository Privileges
Use Designer. Can edit metadata, and import and export objects in the Designer, with read and write permission at the folder level.

Browse Repository. Can browse repository content through the Repository Manager, add and remove reports, import, export, or remove the registry, and change user password.

Create Sessions and Batches. Can create, import, export, modify, start, stop, and delete sessions and batches through the Server Manager with folder-level read, write, and execute permissions. Can configure some connections used by the Informatica Server.

Session Operator. Can use the command line program (pmcmd) to start sessions and batches. Can start, view, monitor, and stop sessions or batches with folder-level read permission and the Create Sessions and Batches privilege, using the Server Manager.

Repository Privileges
Administer Repository. Can create, upgrade, back up, delete, and restore repositories. Can create and modify folders, create and modify users and groups, and assign privileges to users and groups.

Administer Server. Can configure connections to the Informatica Server through the Server Manager and pmcmd.

Super User. Can perform all tasks across all folders in the repository, including unlocking locks and managing global object permissions.

Folders
Folders provide a way to organize and store all metadata in the repository,
including mappings, schemas, and sessions. Folders are designed to be
flexible, to help you organize your data warehouse logically. Each folder has a
set of properties you can configure to define how users access the folder. For
example, you can create a folder that allows all repository users to see objects
within the folder, but not to edit them. Or you can create a folder that allows
users to share objects within the folder.
Shared Folders
When you create a folder, you can configure it as a shared folder. Shared
folders allow users to create shortcuts to objects in the folder. If you have a reusable transformation that you want to use in several mappings or across multiple folders, you can place the object in a shared folder.
For example, you may have a reusable Expression transformation that
calculates sales commissions. You can then use the object in other folders by
creating a shortcut to the object.


Folder Permissions
Permissions allow repository users to perform tasks within a folder. With folder
permissions, you can control user access to the folder, and the tasks you
permit them to perform.
Folder permissions work closely with repository privileges. Privileges grant
access to specific tasks while permissions grant access to specific folders with
read, write, and execute qualifiers.
However, any user with the Super User privilege can perform all tasks across
all folders in the repository. Folders have the following types of permissions:

Read permission. Allows you to view the folder as well as objects in the
folder.

Write permission. Allows you to create or edit objects in the folder.

Execute permission. Allows you to execute or schedule a session or batch in the folder.


Folder Permission Levels


You can grant folder permissions on the following levels of security:
Owner. The owner of the folder.
Owner's Group. Each user in the owner's repository group. If the owner belongs to more than one group, you must select one of those groups for the owner's group.
Repository. All groups and users in the repository.
Each permission level includes the permissions of the level above it.

Creating Folders
To Create a New Folder:
Choose Folder-Create


Importing Sources
Use the Source Analyzer to import or create source definitions for flat file, XML, COBOL, ERP, and relational sources.


Import from Database


To import source definitions from a database, create an ODBC connection and select the tables from the database.


Import from File


To import a file into the Source Analyzer, select the file from the local disk.


Viewing Source Definitions

Double-click the title bar of the source definition for the table.
The Edit Tables dialog box opens and displays all the properties of this source definition. The Table tab shows the name of the table, business name, owner name, and the database type. You can add a comment in the Description section.
Note: To change the source table name, click Rename.
Click the Columns tab.
The Columns tab displays the column descriptions for the source. You can modify the source definition, and change or delete columns. Any changes you make in this dialog box affect the source definition, not the source.

Viewing Source Definitions


Creating Targets
You can create target definitions in the Warehouse Designer for file and relational
sources. Create definitions in the following ways:

Import the definition for an existing target. Import the target definition
from a relational target.

Create a target definition based on a source definition. Drag one of the following existing source definitions into the Warehouse Designer to make a target definition:
o Relational source definition
o Flat file source definition
o COBOL source definition

Manually create a target definition. Create and design a target definition in the Warehouse Designer.

Design several related targets. Create several related target definitions at the same time. You can create the overall relationship, called a schema, as well as the target definitions, through wizards in the Designer. The Cubes and Dimensions Wizards follow common principles of data warehouse design to simplify the process of designing related targets.

Creating Targets


Creating a Pass-Through Mapping


The next step is to create a mapping to depict the flow of data between sources
and targets. To create and edit mappings, you use the Mapping Designer tool in the
Designer. The mapping interface in the Designer is component-based, meaning
that it shows you every step in the process of moving data between sources and
targets. In addition, transformations depict how the Informatica Server modifies
data before it loads a target.


Creating Simple Mapping

Switch to the Mapping Designer.
Choose Mappings-Create.
While the workspace may appear blank, in fact it contains a new mapping without any sources, targets, or transformations.
In the Mapping Name dialog box, enter <Mapping Name> as the name of the new mapping and click OK.
The naming convention for mappings is m_MappingName.
In the Navigator, under the <Repository Name> repository and <Folder Name> folder, click the Sources node to view source definitions added to the repository.


Creating Simple Mapping

Click the icon representing the EMPLOYEES source and drag it into the workspace.


Creating Simple Mapping


The source definition appears in the workspace. The Designer automatically
connects a Source Qualifier transformation to the source definition. After you add
the target definition, you connect the Source Qualifier to the target.
Click the Targets icon in the Navigator to open the list of all target definitions.
Click and drag the icon for the T_EMPLOYEES target into the workspace.
The target definition appears. The final step is connecting the Source Qualifier to this target definition.


Creating Simple Mapping


To connect the Source Qualifier to the target definition:
Click once in the middle of the <Column Name> in the Source Qualifier. Hold down the mouse button, and drag the cursor to the <Column Name> in the target. Then release the mouse button.
An arrow (called a connector) now appears between the two columns.


Transformations
A transformation is any part of a mapping that generates or modifies data. Every mapping
includes a Source Qualifier transformation, representing all the columns of information read
from a source and temporarily stored by the Informatica Server. In addition, you can add transformations that calculate a sum, look up a value, or generate a unique ID, modifying information before it reaches the target.
When you build a mapping, you add transformations and configure them to handle data
according to your business purpose. Perform the following tasks to incorporate a
transformation into a mapping:
Create the transformation. Create it in the Mapping Designer as part of a mapping, in the Mapplet Designer as part of a mapplet, or in the Transformation Developer as a reusable transformation.
Configure the transformation. Each type of transformation has a unique set of options that you can configure.
Connect the transformation to other transformations and target definitions. Drag one port to another to connect them in the mapping or mapplet.

Transformation Descriptions
Advanced External Procedure (Active/Connected). Calls a procedure in a shared library or in the COM layer of Windows NT.

Aggregator (Active/Connected). Performs aggregate calculations.

ERP Source Qualifier (Active/Connected). Represents the rows that the Informatica Server reads from an ERP source when it runs a session.

Expression (Passive/Connected). Calculates a value.

External Procedure (Passive/Connected or Unconnected). Calls a procedure in a shared library or in the COM layer of Windows NT.

Filter (Active/Connected). Filters records.

Transformation Descriptions
Input (Passive/Connected). Defines mapplet input rows. Available only in the Mapplet Designer.

Joiner (Active/Connected). Joins records from different databases or flat file systems.

Lookup (Passive/Connected or Unconnected). Looks up values.

Normalizer (Active/Connected). Normalizes records, including those read from COBOL sources.

Output (Passive/Connected). Defines mapplet output rows. Available only in the Mapplet Designer.

Rank (Active/Connected). Limits records to a top or bottom range.

Sequence Generator (Passive/Connected). Generates primary keys.

Transformation Descriptions
Source Qualifier (Active/Connected). Represents the rows that the Informatica Server reads from a relational or flat file source when it runs a session.

Router (Active/Connected). Routes data into multiple transformations based on a group expression.

Stored Procedure (Passive/Connected or Unconnected). Calls a stored procedure.

Update Strategy (Active/Connected). Determines whether to insert, delete, update, or reject records.

XML Source Qualifier (Passive/Connected). Represents the rows that the Informatica Server reads from an XML source when it runs a session.

Transformations Toolbar


Aggregator Transformation
The Aggregator transformation allows you to perform aggregate
calculations, such as averages and sums. The Aggregator transformation
is unlike the Expression transformation, in that you can use the Aggregator
transformation to perform calculations on groups. The Expression
transformation permits you to perform calculations on a row-by-row basis
only.
When using the transformation language to create aggregate expressions, you can use conditional clauses to filter records, providing more flexibility than the SQL language.
The Informatica Server performs aggregate calculations as it reads, and stores the necessary group and row data in an aggregate cache.
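For example, a conditional clause inside an aggregate function filters the rows that contribute to the calculation; SALES and QUANTITY here are hypothetical ports:

    -- entered in an output port: sums SALES only for multi-item rows
    SUM( SALES, QUANTITY > 1 )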


Ports in Aggregator Transformation


To configure ports in the Aggregator transformation, you can:
Enter an aggregate expression in any output port, using conditional clauses or non-aggregate functions in the port.
Create multiple aggregate output ports.
Configure any input, input/output, output, or variable port as a Group By port, and use non-aggregate expressions in the port.
Improve performance by connecting only the necessary input/output ports to subsequent transformations, reducing the size of the data cache.
Use variable ports for local variables.
Create connections to other transformations as you enter an expression.

Components of Aggregator Transformation


The Aggregator is an active transformation, changing the number of rows in the data flow. It must be connected to the data flow. The Aggregator transformation has several components and options:
Aggregate expression. Entered in an output port. Can include non-aggregate expressions and conditional clauses.
Group by port. Indicates how to create groups. Can be any input, input/output, output, or variable port. When grouping data, the Aggregator transformation outputs the last row of each group unless otherwise specified.
Sorted Input option. Use to improve session performance. To use Sorted Input, you must pass data to the Aggregator transformation sorted by group by port, in ascending or descending order.
Aggregate cache. The Aggregator stores data in the aggregate cache until it completes aggregate calculations. It stores group values in an index cache and row data in the data cache.

Aggregate Cache
When you run a session that uses an Aggregator transformation, the Informatica
Server creates index and data caches in memory to process the transformation. If
the Informatica Server requires more space, it stores overflow values in cache files.
You configure the cache parameters in the session properties.


Creating an Aggregator Transformation


To use an Aggregator transformation in a mapping, you add the Aggregator transformation to
the mapping, then configure the transformation with an aggregate expression and group by
ports, if desired.
To create an Aggregator transformation:
1. In the Mapping Designer, choose Transformation-Create. Select the Aggregator transformation. The naming convention for Aggregator transformations is AGG_TransformationName. Enter a description for the transformation. This description appears in the Repository Manager, making it easier for you or others to understand what the transformation does.
2. Enter a name for the Aggregator, click Create. Then click Done. The Designer creates the Aggregator transformation.
3. Drag the desired ports from the source transformation to the Aggregator transformation. The Designer creates input/output ports for each port you include.
4. Double-click the title bar of the transformation to open the Edit Transformations dialog box.
5. Select the Ports tab.

Creating an Aggregator Transformation


6. Click the Group By option for each column you want the Aggregator to use in creating groups. You can optionally enter a default value to replace null groups.
7. If you want to use a non-aggregate expression to modify groups, click the Add button and enter a name and datatype for the port. Make the port an output port by clearing Input (I). Click in the right corner of the Expression field, enter the non-aggregate expression using one of the input ports, then click OK. Select Group By.
8. Click Add and enter a name and datatype for the aggregate expression port. Make the port an output port by clearing Input (I). Click in the right corner of the Expression field to open the Expression Editor. Enter the aggregate expression, click Validate, then click OK. Make sure the expression validates before closing the Expression Editor.
9. Add default values for specific ports as necessary. If certain ports are likely to contain null values, you might specify a default value if the target database does not handle null values.
10. Select the Properties tab.
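As a sketch of steps 6 through 8, an Aggregator grouped by a hypothetical STORE_ID port might define one output port, TOTAL_SALES, carrying this aggregate expression:

    -- aggregate expression in output port TOTAL_SALES (grouped by STORE_ID)
    SUM( QUANTITY * PRICE )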

Creating an Aggregator Transformation


Creating an Aggregator Transformation


Cache Directory. Local directory where the Informatica Server creates the index and data caches and, if necessary, index and data files. By default, the Informatica Server uses the directory entered in the Server Manager for the server variable $PMCacheDir. If you enter a new directory, make sure the directory exists and contains enough memory/disk space for the aggregate caches.

Tracing Level. Amount of detail displayed in the session log for this transformation.

Sorted Input. Indicates input data is presorted by groups. Select this option only if the mapping passes data to the Aggregator that is sorted by the Aggregator group by ports and by the same sort order configured for the session. Note: Use the Source Qualifier Number of Sorted Ports option to sort relational sources.

Expression Transformation
You can use the Expression transformation to calculate values in a single row before you write to the target.
For example, you might need to adjust employee salaries, concatenate first and last
names, or convert strings to numbers.
You can use the Expression transformation to perform any non-aggregate calculations.
You can also use the Expression transformation to test conditional statements before
you output the results to target tables or other transformations.


Expression Transformation
Calculating Values
To use the Expression transformation to calculate values for a single row, you must include the following ports:
Input or input/output ports for each value used in the calculation. For example, when calculating the total price for an order, determined by multiplying the unit price by the quantity ordered, one port provides the unit price and the other provides the quantity ordered.
Output port for the expression. You enter the expression as a configuration option for the output port. The return value for the output port needs to match the return value of the expression.
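For instance, an output port (called OUT_TOTAL_PRICE here for illustration) would carry an expression such as:

    -- output port expression; UNIT_PRICE and QUANTITY are the input ports
    UNIT_PRICE * QUANTITY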

Expression Transformation
Adding Multiple Calculations
You can enter multiple expressions in a single Expression transformation. As long as you
enter only one expression for each output port, you can create any number of output ports in
the transformation. In this way, you can use one Expression transformation rather than
creating separate transformations for each calculation that requires the same set of data.
For example, you might want to calculate several types of withholding taxes from each
employee paycheck, such as local and federal income tax, Social Security and Medicare.
Since all of these calculations require the employee salary, the withholding category, and/or
the corresponding tax rate, you can create one Expression transformation with the salary and
withholding category as input/output ports and a separate output port for each necessary
calculation.
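As a sketch, a single Expression transformation could define one output port per calculation over the same inputs (the port names and rate ports below are hypothetical):

    -- output port OUT_LOCAL_TAX
    SALARY * LOCAL_TAX_RATE
    -- output port OUT_FEDERAL_TAX
    SALARY * FEDERAL_TAX_RATE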


Creating an Expression Transformation


To create an Expression transformation:
1. In the Mapping Designer, choose Transformation-Create. Select the Expression transformation and add it to the mapping. Enter a name for it (the convention is EXP_TransformationName) and click OK.
2. Create the input ports. If you have the input transformation available, you can select Link Columns from the Layout menu and then click and drag each port used in the calculation into the Expression transformation. With this method, the Designer copies the port into the new transformation and creates a connection between the two ports. Or, you can open the Edit dialog box and create each port manually.
Note: If you want to make this transformation reusable, you must create each port manually within the transformation.
3. Repeat the previous step for each input port you want to add to the expression.
4. Create the output ports (O) you need, making sure to assign a port datatype that matches the expression return value. The naming convention for output ports is OUT_PORTNAME.

Creating an Expression Transformation


5. Click the small button that appears in the Expression section of the dialog box and enter the expression in the Expression Editor. To prevent typographic errors, use the listed port names and functions where possible.
6. If you select a port name that is not connected to the transformation, the Designer copies the port into the new transformation and creates a connection between the two ports.
7. Port names used as part of an expression in an Expression transformation follow stricter rules than port names in other types of transformations:
A port name must begin with a single- or double-byte letter or single- or double-byte underscore (_).
It can contain any of the following single- or double-byte characters: a letter, number, underscore (_), $, #, or @.
8. Check the expression syntax by clicking Validate. If necessary, make corrections to the expression and check the syntax again. Then save the expression and exit the Expression Editor.
9. Connect the output ports to the next transformation or target.
10. Select a tracing level on the Properties tab to determine the amount of transaction detail reported in the session log file.
11. Choose Repository-Save.

Lookup Transformation
Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. Import a lookup definition from any relational database to which both the Informatica Client and Server can connect. You can use multiple Lookup transformations in a mapping.
The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup table column values based on the lookup condition. Use the result of the lookup to pass to other transformations and the target.
You can use the Lookup transformation to perform many tasks, including:
Get a related value. For example, if your source table includes employee ID, but you want to include the employee name in your target table to make your summary data easier to read.
Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
Update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target.

Lookup Transformation
You can configure the Lookup transformation to perform different types of lookups. You can configure the transformation to be connected or unconnected, cached or uncached:
Connected or unconnected. Connected and unconnected transformations receive input and send output in different ways.
Cached or uncached. Sometimes you can improve session performance by caching the lookup table. If you cache the lookup table, you can choose to use a dynamic or static cache. By default, the lookup cache remains static and does not change during the session. With a dynamic cache, the Informatica Server inserts rows into the cache during the session. Informatica recommends that you cache the target table as the lookup. This enables you to look up values in the target and insert them if they do not exist.

Connected Lookup Transformation


The following steps describe the way the Informatica Server processes a connected Lookup transformation:
1. A connected Lookup transformation receives input values directly from another transformation in the pipeline.
2. For each input row, the Informatica Server queries the lookup table or cache based on the lookup ports and the condition in the transformation.
3. If the transformation is uncached or uses a static cache, the Informatica Server returns values from the lookup query. If the transformation uses a dynamic cache, the Informatica Server inserts the row into the cache when the lookup query does not find the row in the cache. It flags the row as new or existing, based on the result of the lookup query.
4. The Lookup transformation passes return values from the query to the next transformation. If the transformation uses a dynamic cache, you can pass rows to a Filter or Router transformation to filter new rows to the target.

Unconnected Lookup Transformation


An unconnected Lookup transformation receives input values from the result of a :LKP expression in another transformation. You can call the Lookup transformation more than once in a mapping.
A common use for unconnected Lookup transformations is to update slowly changing dimension tables. The following steps describe the way the Informatica Server processes an unconnected Lookup transformation:
1. An unconnected Lookup transformation receives input values from the result of a :LKP expression in another transformation, such as an Update Strategy transformation.
2. The Informatica Server queries the lookup table or cache based on the lookup ports and condition in the transformation.
3. The Informatica Server returns one value into the return port of the Lookup transformation.
4. The Lookup transformation passes the return value into the :LKP expression.
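For example, an expression in an Update Strategy or Expression transformation calls the unconnected lookup by name and passes it input values; the lookup and port names below are hypothetical:

    -- returns the value of the lookup's return port for the matching row
    :LKP.LKP_GET_EMP_NAME( EMP_ID )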

Differences between Connected and Unconnected Lookup

Connected Lookup: Receives input values directly from the pipeline.
Unconnected Lookup: Receives input values from the result of a :LKP expression in another transformation.

Connected Lookup: You can use a dynamic or static cache.
Unconnected Lookup: You can use a static cache.

Connected Lookup: Cache includes all lookup columns used in the mapping (that is, lookup table columns included in the lookup condition and lookup table columns linked as output ports to other transformations).
Unconnected Lookup: Cache includes all lookup/output ports in the lookup condition and the lookup/return port.

Connected Lookup: Can return multiple columns from the same row or insert into the dynamic lookup cache.
Unconnected Lookup: Designate one return port (R). Returns one column from each row.

Differences between Connected and Unconnected Lookup

Connected Lookup: If there is no match for the lookup condition, the Informatica Server returns the default value for all output ports. If you configure dynamic caching, the Informatica Server inserts rows into the cache.
Unconnected Lookup: If there is no match for the lookup condition, the Informatica Server returns NULL.

Connected Lookup: Pass multiple output values to another transformation. Link lookup/output ports to another transformation.
Unconnected Lookup: Pass one output value to another transformation. The lookup/output/return port passes the value to the transformation calling the :LKP expression.

Connected Lookup: Supports user-defined default values.
Unconnected Lookup: Does not support user-defined default values.

Lookup Components
When you configure a Lookup transformation in a mapping, you define the
following components:
Lookup table
Ports
Properties
Condition


Lookup Table
You can import a lookup table from the mapping source or target database, or
you can import a lookup table from any database that both the Informatica
Server and Client machine can connect to. If your mapping includes
heterogeneous joins, you can use any of the mapping sources or mapping
targets as the lookup table.
The lookup table can be a single table, or you can join multiple tables in the
same database using a lookup query override. The Informatica Server queries
the lookup table or an in-memory cache of the table for all incoming rows into
the Lookup transformation.
Connect to the database to import the lookup table definition. The Informatica Server can connect to a lookup table using a native database driver or an ODBC driver. However, native database drivers improve session performance.


Lookup Table
Indexes and a Lookup Table
If you have privileges to modify the database containing a lookup table, you can
improve lookup initialization time by adding an index to the lookup table. This is
important for very large lookup tables. Since the Informatica Server needs to
query, sort, and compare values in these columns, the index needs to include
every column used in a lookup condition.
You can improve performance by adding indexes for the following lookups:
Cached lookups. You can improve performance by indexing the columns in the lookup ORDER BY. The session log contains the ORDER BY statement.
Uncached lookups. Because the Informatica Server issues a SELECT statement for each row passing into the Lookup transformation, you can improve performance by indexing the columns in the lookup condition.
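As a sketch, if a lookup condition compares the hypothetical EMPLOYEE_ID and DEPT_ID columns, an index covering those columns could be added like this:

    -- index on the columns used in the lookup condition (illustrative)
    CREATE INDEX IDX_EMP_LOOKUP ON EMPLOYEE (EMPLOYEE_ID, DEPT_ID);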

Lookup Ports
The Ports tab contains options similar to other transformations, such as port
name, datatype, and scale. In addition to input and output ports, the Lookup
transformation includes a lookup port type that represents columns of data in
the lookup table. An unconnected Lookup transformation also includes a return
port type that represents the return value.


Lookup Ports
Input port (Connected or Unconnected; minimum of 1). Create an input port for each lookup port you want to use in the lookup condition. You must have at least one input or input/output port in each Lookup transformation.

Output port (Connected or Unconnected; minimum of 1). Create an output port for each lookup port you want to link to another transformation. You can designate both input and lookup ports as output ports. For connected lookups, you must have at least one output port. For unconnected lookups, use the return port (R) to designate a return value.

Lookup port (Connected or Unconnected; minimum of 1). The Designer automatically designates each column in the lookup table as a lookup (L) and output port (O).

Return port (Unconnected; 1 only). Use only in unconnected Lookup transformations. Designates the column of data you want to return based on the lookup condition. You can designate one lookup/output port as the return port.

Lookup Transformation Properties


Properties for the Lookup transformation identify the database source, how the
Informatica Server processes the transformation, and how it handles caching
and multiple matches.
On the Properties tab, you can configure properties such as a SQL override for
the lookup, the lookup table name, and tracing level for the transformation.
Most of the options on this tab allow you to configure caching properties.


Lookup Transformation Properties


Lookup SQL Override. Overrides the default SQL statement to query the lookup table. Specifies the SQL statement you want the Informatica Server to use for querying lookup values. Use only with the lookup cache enabled. Enter only the SELECT, FROM, and WHERE clauses when you enter the SQL override. Do not enter the ORDER BY clause.

Lookup Table Name. Specifies the name of the table from which the transformation looks up and caches values. You can import a table, view, or synonym from another database by selecting the Import button on the dialog box that displays when you first create a Lookup transformation. If you enter a lookup SQL override, you do not need to add an entry for this option.

Lookup Caching Enabled. Indicates whether the Lookup transformation caches lookup values during the session. When lookup caching is enabled, the Informatica Server queries the lookup table once, caches the values, and looks up values in the cache during the session. This can improve session performance. When you disable caching, each time a row passes into the transformation, the Informatica Server issues a SELECT statement to the lookup table for lookup values.
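For example, a lookup SQL override can restrict what the Informatica Server caches; the table and column names below are hypothetical, and each selected column is aliased to match its lookup port:

    -- override that caches only active employees (illustrative)
    SELECT EMPLOYEE.EMPLOYEE_ID AS EMPLOYEE_ID, EMPLOYEE.NAME AS NAME
    FROM EMPLOYEE
    WHERE EMPLOYEE.STATUS = 'ACTIVE'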

Lookup Transformation Properties


Lookup Policy on Multiple Match. Available for Lookup transformations that are uncached or use a static cache. Determines what happens when the Lookup transformation finds multiple rows that match the lookup condition. You can select the first or last record returned from the cache or lookup table, or report an error. The Informatica Server fails a session when it encounters a multiple match while processing a Lookup transformation with a dynamic cache.

Lookup Condition. Displays the lookup condition you set in the Condition tab.

Location Information. Specifies the database containing the lookup table. You can select the exact database or you can use the $Source or $Target variable. If you use one of these variables, the lookup table must reside in the source or target database you specify when you configure the session. When you have more than one relational source in the mapping, the session fails if you use $Source.

Source Type. Indicates that the Lookup transformation reads values from a relational database.

Lookup Transformation Properties


Recache if Stale. The Recache from Database option replaces the Recache if Stale and Lookup Cache Initialize options.

Tracing Level. Sets the amount of detail included in the session log when you run a session containing this transformation.

Lookup Cache Directory Name. Specifies the directory used to build the lookup cache files when the Lookup transformation is configured to cache the lookup table. Also used to save the persistent lookup cache files when the Lookup Persistent option is selected. By default, the Informatica Server uses the $PMCacheDir directory configured for the Informatica Server.

Lookup Cache Initialize. The Recache from Database option replaces the Lookup Cache Initialize and Recache if Stale options.

Lookup Cache Persistent. Indicates whether the Informatica Server uses a persistent lookup cache, which consists of at least two cache files. If a Lookup transformation is configured for a persistent lookup cache and persistent lookup cache files do not exist, the Informatica Server creates the files during the session. You can use this only when you enable lookup caching.

Lookup Transformation Properties


Lookup Data Cache Size. Indicates the maximum size the Informatica Server allocates to the data cache in memory. If the Informatica Server cannot allocate the configured amount of memory when initializing the session, it fails the session. When the Informatica Server cannot store all the data cache data in memory, it pages to disk as necessary. The Lookup Data Cache Size is 2,000,000 bytes by default. The minimum size is 1,024 bytes. Use only with the lookup cache enabled.

Lookup Index Cache Size. Indicates the maximum size the Informatica Server allocates to the index cache in memory. If the Informatica Server cannot allocate the configured amount of memory when initializing the session, it fails the session. When the Informatica Server cannot store all the index cache data in memory, it pages to disk as necessary. The Lookup Index Cache Size is 1,000,000 bytes by default. The minimum size is 1,024 bytes. Use only with the lookup cache enabled.

Dynamic Lookup Cache. Indicates to use a dynamic lookup cache. Inserts new rows into the lookup cache as it passes rows to the target table. You can use this only when you enable lookup caching.

Cache File Name Prefix. Specifies the file name prefix to use with persistent lookup cache files. The Informatica Server uses the file name prefix as the file name for the persistent cache files it saves to disk. Only enter the prefix. Do not enter .idx or .dat. If the named persistent cache files exist, the Informatica Server builds the memory cache from the files. If the named persistent cache files do not exist, the Informatica Server rebuilds the persistent cache files. Use only with persistent lookup cache.

Lookup Condition
The Informatica Server uses the lookup condition to test incoming values. It is similar to the
WHERE clause in an SQL query. When you configure a lookup condition for the
transformation, you compare transformation input values with values in the lookup table or
cache, represented by lookup ports. When you run a session, the Informatica Server queries
the lookup table or cache for all incoming values based on the condition.
You must enter a lookup condition in all Lookup transformations. Some guidelines for the
lookup condition apply for all Lookup transformations, and some guidelines vary depending
on how you configure the transformation.


Lookup Condition
Use the following guidelines when you enter a condition for any Lookup transformation:
The datatypes in a condition must match.
Use one input port for each lookup port used in the condition. You can use the same input port in more than one condition in a transformation.
When you enter multiple conditions, the Informatica Server evaluates each condition as an AND, not an OR. The Informatica Server returns only rows that match all the conditions you specify.
The Informatica Server matches null values. For example, if an input lookup condition column is NULL, the Informatica Server evaluates the NULL equal to a NULL in the lookup table.
The lookup condition guidelines and the way the Informatica Server processes matches vary depending on whether you configure the transformation for a dynamic cache or for an uncached or static cache.
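For instance, a two-part condition on the Condition tab pairs each lookup port with an input port (the names below are hypothetical); the Informatica Server evaluates the parts together as an AND:

    EMPLOYEE_ID = IN_EMPLOYEE_ID
    DEPT_ID = IN_DEPT_ID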

Lookup Condition Uncached or Static Cache


Uncached or Static Cache
Use the following guidelines when you configure a Lookup transformation without a cache or to use a static cache:
If you configure a Lookup transformation to use a static cache, or not to cache, you can use the following operators when you create the lookup condition: =, >, <, >=, <=, !=
If you include more than one lookup condition, place the conditions with an equal sign first to optimize lookup performance.
The input value must meet all conditions for the lookup to return a value.
The condition can match equivalent values or supply a threshold condition. For example, you might look for customers who do not live in California, or employees whose salary is greater than $30,000. Depending on the nature of the source and condition, the Lookup might return multiple values.

Lookup Condition Uncached or Static Cache


Handling Multiple Matches
Lookups find a value based on the conditions you set in the Lookup transformation. If the lookup condition is not based on a unique key, or if the lookup table is denormalized, the Informatica Server might find multiple matches in the lookup table or cache.
You can configure the static Lookup transformation to handle multiple matches in the following ways:
Return the first matching value, or return the last matching value. You can configure the transformation either to return the first matching value or the last matching value. The first and last values are the first values and last values found in the lookup cache that match the lookup condition. When you cache the lookup table, the Informatica Server determines which record is first and which is last by generating an ORDER BY clause for each column in the lookup cache. The Informatica Server then sorts each lookup source column in the lookup condition in ascending order. The Informatica Server sorts numeric columns in ascending numeric order (such as 0 to 10), date/time columns from January to December and from the first of the month to the end of the month, and string columns based on the sort order configured for the session.
Return an error. The Informatica Server returns the default value for the output ports.
Note: The Informatica Server fails the session when it encounters multiple keys for a Lookup transformation configured to use a dynamic cache.
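So, for a cached lookup whose condition uses hypothetical NAME and DEPT_ID columns, the cache-build query recorded in the session log would resemble:

    -- generated query; this ordering defines the first and last match (illustrative)
    SELECT NAME, DEPT_ID, EMPLOYEE_ID FROM EMPLOYEE ORDER BY NAME, DEPT_ID, EMPLOYEE_ID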

Lookup Condition Dynamic Cache


Dynamic Cache
If you configure a Lookup transformation to use a dynamic cache, you can only use the
equality operator (=) in the lookup condition.
Handling Multiple Matches
You cannot configure handling for multiple matches in a Lookup transformation configured to
use a dynamic cache. The Informatica Server fails the session when it encounters multiple
matches either while caching the lookup table or looking up values in the cache that contain
duplicate keys.


Lookup Caches
The Informatica Server builds a cache in memory when it processes the first row of data in a
cached Lookup transformation. It allocates memory for the cache based on the amount you
configure in the transformation or session properties. The Informatica Server stores condition
values in the index cache and output values in the data cache. The Informatica Server
queries the cache for each row that enters the transformation.
The Informatica Server also creates cache files by default in the $PMCacheDir. If the data
does not fit in the memory cache, the Informatica Server stores the overflow values in the
cache files. When the session completes, the Informatica Server releases cache memory and
deletes the cache files unless you configure the Lookup transformation to use a persistent
cache.
When configuring a lookup cache, you can specify any of the following options:

Persistent cache. You can save the lookup cache files and reuse them the next time the
Informatica Server processes a Lookup transformation configured to use the cache.

Recache from Database. If the persistent cache is not synchronized with the lookup
table, you can configure the Lookup transformation to rebuild the lookup cache.
Static cache. You can configure a static, or read-only, cache for any lookup table. By
default, the Informatica Server creates a static cache. It caches the lookup table and
looks up values in the cache for each row that comes into the transformation. When the
lookup condition is true, the Informatica Server returns a value from the lookup cache.
The Informatica Server does not update the cache while it processes the Lookup
transformation.

Dynamic cache. If you want to cache the target table and insert new rows into the cache
and the target, you can create a Lookup transformation to use a dynamic cache. The
Informatica Server dynamically inserts data into the lookup cache and passes data to the
target table.

Shared cache. You can share the lookup cache between multiple transformations. You can
share an unnamed cache between transformations in the same mapping. You can share a named
cache between transformations in the same or different mappings.


Creating Lookup Transformation


To create a Lookup transformation:

1. In the Mapping Designer, choose Transformation-Create. Select the Lookup
transformation. Enter a name for the lookup. The naming convention for Lookup
transformations is LKP_TransformationName. Click OK.

2. In the Select Lookup Table dialog box, you can choose the lookup table. Click the
Import button if the lookup table is not in the source or target database.


3. If you want to manually define the lookup transformation, click the Skip button.

4. Define input ports for each lookup condition you want to define.

5. For an unconnected Lookup transformation, create a return port for the value you want
to return from the lookup.

6. Define output ports for the values you want to pass to another transformation.

7. For Lookup transformations that use a dynamic lookup cache, associate an input port or
sequence ID with each lookup port.

8. Add the lookup conditions. If you include more than one condition, place the
conditions using equal signs first to optimize lookup performance.

9. On the Properties tab, set the properties for the lookup.

10. Click OK.

11. For unconnected Lookup transformations, write an expression in another transformation
using :LKP to call the unconnected Lookup transformation.
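For step 11, a minimal sketch of such a call, assuming an unconnected lookup named
LKP_GET_CUSTOMER_NAME with a single input port (the names are illustrative):

    :LKP.LKP_GET_CUSTOMER_NAME(CUSTOMER_ID)

The lookup runs once for the row and hands its return port value back to the calling
expression.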


Sequence Generator Transformation


The Sequence Generator transformation generates numeric values. You can use the
Sequence Generator to create unique primary key values, replace missing primary keys, or
cycle through a sequential range of numbers.
The Sequence Generator transformation is a connected transformation. It contains two
output ports that you can connect to one or more transformations. The Informatica Server
generates a value each time a row enters a connected transformation, even if that value is
not used. When NEXTVAL is connected to the input port of another transformation, the
Informatica Server generates a sequence of numbers. When CURRVAL is connected to the
input port of another transformation, the Informatica Server generates the NEXTVAL value
plus one.
You can make a Sequence Generator reusable, and use it in multiple mappings. You might
reuse a Sequence Generator when you perform multiple loads to a single target.
For example, if you have a large input file that you separate into three sessions running in
parallel, you can use a Sequence Generator to generate primary key values. If you use
different Sequence Generators, the Informatica Server might accidentally generate
duplicate key values. Instead, you can use the same reusable Sequence Generator for all
three sessions to provide a unique value for each target row.

Creating Sequence Generator Transformation


To create a Sequence Generator transformation:

1. In the Mapping Designer, select Transformation-Create. Select the Sequence Generator
transformation. The naming convention for Sequence Generator transformations is
SEQ_TransformationName.

2. Enter a name for the Sequence Generator, and click Create. Then click Done. The
Designer creates the Sequence Generator transformation.

3. Double-click the title bar of the transformation to open the Edit Transformations
dialog box.


4. Enter a description for the transformation. This description appears in the Repository
Manager, making it easier for you or others to understand what the transformation does.

5. Select the Properties tab. Enter settings as necessary.


Stored Procedure Transformation


A Stored Procedure transformation is an important tool for populating and
maintaining databases. Database administrators create stored procedures to
automate time-consuming tasks that are too complicated for standard SQL
statements.
Not all databases support stored procedures, and database implementations
vary widely in their syntax. You might use stored procedures to:

Drop and recreate indexes.

Check the status of a target database before moving records into it.

Determine if enough space exists in a database.

Perform a specialized calculation.


The stored procedure must exist in the database before creating a Stored
Procedure transformation, and the stored procedure can exist in a source, target,
or any database with a valid connection to the Informatica Server.
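As a sketch of the kind of procedure you might call, here is a hypothetical Oracle PL/SQL
procedure that checks whether a minimum amount of free space exists and reports the
result through an output parameter; the procedure name, parameters, and logic are
illustrative, not part of the product:

    CREATE OR REPLACE PROCEDURE SP_CHECK_SPACE (
        MIN_MB    IN  NUMBER,
        HAS_ROOM  OUT NUMBER
    ) AS
    BEGIN
        -- Set HAS_ROOM to 1 when at least MIN_MB megabytes are free, else 0.
        SELECT CASE WHEN NVL(SUM(BYTES), 0) / 1048576 >= MIN_MB THEN 1 ELSE 0 END
          INTO HAS_ROOM
          FROM USER_FREE_SPACE;
    END;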


Creating Stored Procedure Transformation


There are two ways to configure the Stored Procedure transformation:

Use the Import Stored Procedure dialog box to automatically configure the ports used by
the stored procedure.

Configure the transformation manually, creating the appropriate ports for any input or
output parameters.

Stored Procedure transformations are created as Normal type by default, which means that
they run during the mapping, not before or after the session.
New Stored Procedure transformations are not created as reusable transformations. To
create a reusable transformation, click Make Reusable in the Transformation properties
after creating the transformation.


Import Stored Procedure


When you import a stored procedure, the Designer creates ports based on the
stored procedure input and output parameters. You should import the stored
procedure whenever possible.

There are three ways to import a stored procedure in the Mapping Designer:

Select the stored procedure icon and add a Stored Procedure transformation.

Select Transformation-Import Stored Procedure.

Select Transformation-Create, and then select Stored Procedure.


Modes of Stored Procedure Transformation


The Stored Procedure transformation runs in one of two modes:

Connected

Unconnected

The mode you use depends on what your stored procedure does, and how often the stored
procedure should run in a mapping.


Connected
The flow of data through a mapping in connected mode also passes through the Stored
Procedure transformation. All data entering the transformation through the input ports
affects the stored procedure. You should use a connected stored procedure when you
need data from an input port sent as an input parameter to the stored procedure, or the
results of a stored procedure sent as an output parameter to another transformation.


Configuring connected Stored Procedure Transformation

To configure a connected Stored Procedure transformation:

Create the Stored Procedure transformation in your mapping.

Drag ports from upstream transformations to connect to any available input ports.

Drag ports from the output ports of the Stored Procedure to other transformations or
targets.

Double-click the transformation, and select the Properties tab. Select the appropriate
database in the Connection Information item if you did not select it when creating the
transformation.

Select the Tracing level for the transformation. If you are testing the mapping, select
the Verbose Initialization option to provide the most information in the event that the
transformation fails. Click OK.

Choose Repository-Save to save changes to the mapping.


Unconnected
The unconnected Stored Procedure transformation is not connected directly to
the flow of the mapping. It either runs before or after the session, or is called by
an expression in another transformation in the mapping.


Configuring Unconnected Stored Procedure Transformation

An unconnected Stored Procedure transformation is not directly connected to the flow of
data through the mapping. Instead, the stored procedure runs either:

From an expression. Called from an expression written in the Expression Editor within
another transformation in the mapping.

Pre- or post-session. Runs before or after a session.
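For the expression case, you call the transformation with the :SP reference qualifier and
capture the output value with the PROC_RESULT variable. A minimal sketch, assuming the
illustrative SP_CHECK_SPACE procedure shown earlier:

    :SP.SP_CHECK_SPACE(500, PROC_RESULT)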


Source Qualifier Transformation


When you add a relational or a flat file source definition to a mapping, you need to connect
it to a Source Qualifier transformation. The Source Qualifier represents the records that the
Informatica Server reads when it runs a session.
You can use the Source Qualifier to perform the following tasks:

Join data originating from the same source database. You can join two or more tables
with primary-foreign key relationships by linking the sources to one Source Qualifier.

Filter records when the Informatica Server reads source data. If you include a filter
condition, the Informatica Server adds a WHERE clause to the default query.

Specify an outer join rather than the default inner join. If you include a user-defined
join, the Informatica Server replaces the join information specified by the metadata in the
SQL query.

Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds
an ORDER BY clause to the default SQL query.

Select only distinct values from the source. If you choose Select Distinct, the
Informatica Server adds a SELECT DISTINCT statement to the default SQL query.

Create a custom query to issue a special SELECT statement for the Informatica
Server to read source data. For example, you might use a custom query to perform
aggregate calculations or execute a stored procedure.
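To make these options concrete, here is a sketch of the SQL the Informatica Server might
generate for a Source Qualifier that joins two tables, applies a source filter, and has
one sorted port; the table and column names are assumptions for illustration:

    SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.STATE, ORDERS.ORDER_ID, ORDERS.TOTAL
    FROM CUSTOMERS, ORDERS
    WHERE CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID
      AND CUSTOMERS.STATE <> 'CA'
    ORDER BY CUSTOMERS.CUSTOMER_ID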

Configuring Source Qualifier Transformation


To configure a Source Qualifier:

In the Designer, open a mapping.

Double-click the title bar of the Source Qualifier.

In the Edit Transformations dialog box, click Rename, enter a descriptive name for the
transformation, and click OK. The naming convention for Source Qualifier transformations
is SQ_TransformationName.

Click the Properties tab.



Option: Description

SQL Query: Defines a custom query that replaces the default query the Informatica Server
uses to read data from sources represented in this Source Qualifier.

User-Defined Join: Specifies the condition used to join data from multiple sources
represented in the same Source Qualifier transformation.

Source Filter: Specifies the filter condition the Informatica Server applies when
querying records.

Number of Sorted Ports: Indicates the number of columns used when sorting records queried
from relational sources. If you select this option, the Informatica Server adds an ORDER
BY to the default query when it reads source records. The ORDER BY includes the number of
ports specified, starting from the top of the Source Qualifier. When selected, the
database sort order must match the session sort order.

Tracing Level: Sets the amount of detail included in the session log when you run a
session containing this transformation.

Select Distinct: Specifies if you want to select only unique records. The Informatica
Server includes a SELECT DISTINCT statement if you choose this option.


Filter Transformation
The Filter transformation provides the means for filtering rows in a mapping. You
pass all the rows from a source transformation through the Filter transformation,
and then enter a filter condition for the transformation. All ports in a Filter
transformation are input/output, and only rows that meet the condition pass
through the Filter transformation.
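A filter condition is any expression that evaluates to TRUE or FALSE for each row. For
example, a condition that passes only rows for well-paid employees outside California
might look like this (port names are illustrative):

    SALARY > 30000 AND STATE != 'CA'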


Creating a Filter Transformation


To create a Filter transformation:

In the Designer, switch to the Mapping Designer and open a mapping.

Choose Transformation-Create. Select Filter transformation, and enter the name of the new
transformation. The naming convention for the Filter transformation is
FIL_TransformationName. Click Create, and then click Done.

Select and drag all the desired ports from a source qualifier or other transformation to
add them to the Filter transformation. After you select and drag ports, copies of these
ports appear in the Filter transformation. Each column has both an input and an output
port.

Double-click the title bar of the new transformation.

Click the Properties tab. A default condition appears in the list of conditions. The
default condition is TRUE (a constant with a numeric value of 1).


Joiner Transformation
While a Source Qualifier transformation can join data originating from a common source
database, the Joiner transformation joins two related heterogeneous sources residing in
different locations or file systems. The combination of sources can be varied. You can use
the following sources:

Two relational tables existing in separate databases

Two flat files in potentially different file systems

Two different ODBC sources

Two instances of the same XML source

A relational table and a flat file source

A relational table and an XML source

If two relational sources contain keys, then a Source Qualifier transformation can easily join
the sources on those keys. Joiner transformations typically combine information from two
different sources that do not have matching keys, such as flat file sources.
The Joiner transformation allows you to join sources that contain binary data.


Creating a Joiner Transformation


To create a Joiner transformation:

In the Mapping Designer, choose Transformation-Create. Select the Joiner transformation.
Enter a name for the Joiner. Click OK. The naming convention for Joiner transformations
is JNR_TransformationName. Enter a description for the transformation. This description
appears in the Repository Manager, making it easier for you or others to understand or
remember what the transformation does.

The Designer creates the Joiner transformation. Keep in mind that you cannot use a
Sequence Generator or Update Strategy transformation as a source to a Joiner
transformation.

Drag all the desired input/output ports from the first source into the Joiner
transformation. The Designer creates input/output ports for the source fields in the
Joiner as detail fields by default. You can edit this property later.

Select and drag all the desired input/output ports from the second source into the Joiner
transformation. The Designer configures the second set of source fields as master fields
by default.

Double-click the title bar of the Joiner transformation to open the Edit Transformations
dialog box.

Select the Ports tab.

Click any box in the M column to switch the master/detail relationship for the sources.
Change the master/detail relationship if necessary by selecting the master source in the
M column.


Select the Condition tab and set the condition.
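The join condition pairs a master port with the corresponding detail port using the
equality operator. A sketch, assuming both sources carry a CUSTOMER_ID column (when port
names collide, the Designer appends a number to one of them):

    CUSTOMER_ID1 = CUSTOMER_ID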



Select the Properties tab and enter any additional settings for the transformation.
Joiner Setting: Description

Case-Sensitive String Comparison: If selected, the Informatica Server uses case-sensitive
string comparisons when performing joins on string columns.

Cache Directory: Specifies the directory used to cache master records and the index to
these records. By default, the caches are created in a directory specified by the server
variable $PMCacheDir. If you override the directory, be sure there is enough disk space
on the file system. The directory can be a mapped or mounted drive.

Join Type: Specifies the type of join: Normal, Master Outer, Detail Outer, or Full Outer.

Null Ordering in Master: Not applicable for this transformation type.

Null Ordering in Detail: Not applicable for this transformation type.

Tracing Level: Amount of detail displayed in the session log for this transformation. The
options are Terse, Normal, Verbose Data, and Verbose Initialization.


Rank Transformation
The Rank transformation allows you to select only the top or bottom rank of data. You can
use a Rank transformation to return the largest or smallest numeric value in a port or
group. You can also use a Rank transformation to return the strings at the top or the bottom
of a session sort order. During the session, the Informatica Server caches input data until it
can perform the rank calculations.
The Rank transformation differs from the transformation functions MAX and MIN, in that it
allows you to select a group of top or bottom values, not just one value. For example, you
can use Rank to select the top 10 salespersons in a given territory. Or, to generate a
financial report, you might also use a Rank transformation to identify the three departments
with the lowest expenses in salaries and overhead. While the SQL language provides
many functions designed to handle groups of data, identifying top or bottom strata within a
set of rows is not possible using standard SQL functions.


Creating a Rank Transformation


To create a Rank transformation:

In the Mapping Designer, choose Transformation-Create. Select the Rank transformation.
Enter a name for the Rank. The naming convention for Rank transformations is
RNK_TransformationName. Enter a description for the transformation. This description
appears in the Repository Manager.

Click OK, and then click Done. The Designer creates the Rank transformation.

Link columns from an input transformation to the Rank transformation.

Click the Ports tab, and then select the Rank (R) option for the port used to measure
ranks.



Click the Properties tab and select whether you want the top or bottom rank.

Setting: Description

Cache directory: Local directory where the Informatica Server creates the index and data
caches and, if necessary, index and data files. By default, the Informatica Server uses
the directory entered in the Server Manager for the server variable $PMCacheDir. If you
enter a new directory, make sure the directory exists and contains enough disk space for
the rank caches.

Top/Bottom: Specifies whether you want the top or bottom ranking for a column.

Number of Ranks: The number of rows you want to rank.

Case-Sensitive String Comparison: When running in Unicode mode, the Informatica Server
ranks strings based on the sort order selected for the session. If the session sort order
is case-sensitive, select this option to enable case-sensitive string comparisons, and
clear this option to have the Informatica Server ignore case for strings. If the sort
order is not case-sensitive, the Informatica Server ignores this setting. By default,
this option is selected.

Tracing level: Determines the amount of information the Informatica Server writes to the
session log about data passing through this transformation during a session.


Router Transformation
A Router transformation is similar to a Filter transformation because both transformations
allow you to use a condition to test data. A Filter transformation tests data for one condition
and drops the rows of data that do not meet the condition. However, a Router
transformation tests data for one or more conditions and gives you the option to route rows
of data that do not meet any of the conditions to a default output group.
If you need to test the same input data based on multiple conditions, use a Router
Transformation in a mapping instead of creating multiple Filter transformations to perform
the same task. The Router transformation is more efficient when you design a mapping and
when you run a session. For example, to test data based on three conditions, you only
need one Router transformation instead of three Filter transformations to perform this task.
Likewise, when you use a Router transformation in a mapping, the Informatica Server
processes the incoming data only once. When you use multiple Filter transformations in a
mapping, the Informatica Server processes the incoming data for each transformation.


Router Transformation Components


A Router transformation consists of input and output groups, input and output ports, group filter
conditions, and properties that you configure in the Designer.


Working with Groups


A Router transformation has the following types of groups:

Input

Output

Input Group
The Designer copies property information from the input ports of the input group to
create a set of output ports for each output group.

Output Groups
There are two types of output groups:

User-defined groups

Default group

You cannot modify or delete output ports or their properties.


Creating Group Filter Conditions
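A group filter condition is an expression that returns TRUE or FALSE for each row. As an
illustrative sketch, three user-defined groups that route orders by amount might use
conditions like these (the port name is an assumption):

    ORDER_TOTAL >= 10000
    ORDER_TOTAL >= 1000 AND ORDER_TOTAL < 10000
    ORDER_TOTAL < 1000

Rows that satisfy none of the group filter conditions pass to the default group.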


Creating a Router Transformation


To create a Router transformation:

1. In the Mapping Designer, open a mapping.

2. Choose Transformation-Create. Select Router transformation, and enter the name of the
new transformation. The naming convention for the Router transformation is
RTR_TransformationName. Click Create, and then click Done.

3. Select and drag all the desired ports from a transformation to add them to the Router
transformation, or you can manually create input ports on the Ports tab.

4. Double-click the title bar of the Router transformation to edit transformation
properties.

5. Click the Transformation tab and configure transformation properties as desired.

6. Click the Properties tab and configure tracing levels as desired.

7. Click the Groups tab, and then click the Add button to create a user-defined group.
The Designer creates the default group when you create the first user-defined group.

8. Click the Group Filter Condition field to open the Expression Editor.

9. Enter a group filter condition.

10. Click Validate to check the syntax of the conditions you entered.


Update Strategy Transformation


When you design your data warehouse, you need to decide what type of information to store in targets.
As part of your target table design, you need to determine whether to maintain all the historic data or
just the most recent changes.
For example, you might have a target table, T_CUSTOMERS, that contains customer data. When a
customer address changes, you may want to save the original address in the table, instead of updating
that portion of the customer record. In this case, you would create a new record containing the updated
address, and preserve the original record with the old customer address. This illustrates how you might
store historical information in a target table. However, if you want the T_CUSTOMERS table to be a
snapshot of current customer data, you would update the existing customer record and lose the original
address.
The model you choose constitutes your update strategy, how to handle changes to existing records. In
PowerMart and PowerCenter, you set your update strategy at two different levels:

Within a session. When you configure a session, you can instruct the Informatica Server
to either treat all records in the same way (for example, treat all records as inserts), or use
instructions coded into the session mapping to flag records for different database
operations.

Within a mapping. Within a mapping, you use the Update Strategy transformation to flag
records for insert, delete, update, or reject.


Setting up Update Strategy for a Session


Specifying an Option for all Rows


During session configuration, you can select a single database operation for all records.
For the Treat Rows As setting, you have the following options:

Setting: Description

Insert: Treat all records as inserts. If inserting the record violates a primary or
foreign key constraint in the database, the Informatica Server rejects the record.

Delete: Treat all records as deletes. For each record, if the Informatica Server finds a
corresponding record in the target table (based on the primary key value), the
Informatica Server deletes it. Note that the primary key constraint must exist in the
target definition in the repository.

Update: Treat all records as updates. For each record, the Informatica Server looks for a
matching primary key value in the target table. If it exists, the Informatica Server
updates the record. Again, the primary key constraint must exist in the target
definition.

Data Driven: The Informatica Server follows instructions coded into Update Strategy
transformations within the session mapping to determine how to flag records for insert,
delete, update, or reject. If the mapping for the session contains an Update Strategy
transformation, this field is marked Data Driven by default. If you do not choose the
Data Driven setting, the Informatica Server ignores all Update Strategy transformations
in the mapping.


Update Strategy Settings


The setting you choose depends on your update strategy and the status of data in target
tables:

Setting: Use To

Insert: Populate the target tables for the first time, or maintain a historical data
warehouse. In the latter case, you must set this strategy for the entire data warehouse,
not just a select group of target tables.

Delete: Clear target tables.

Update: Update target tables. You might choose this setting whether your data warehouse
contains historical data or a snapshot. Later, when you configure how to update
individual target tables, you can determine whether to insert updated records as new
records or use the updated information to modify existing records in the target.

Data Driven: Exert finer control over how you flag records for insert, delete, update,
or reject. Choose this setting if records destined for the same table need to be flagged
on occasion for one operation (for example, update) or for a different operation (for
example, reject). In addition, this setting provides the only way you can flag records
for reject.


Specifying Options for individual Target Tables


Once you determine how to treat all rows in the session (insert, delete, update, or data
driven), you also need to set update strategy options for individual targets. You set the
following options in the Targets section of the Session Wizard:


Insert. Select this option to insert a row into a target table.

Delete. Select this option to delete a record from a table.

Truncate table. Select this option to truncate the target table before loading data.

Update. You have three different options in this situation:

Option: Description

Update as update: Update each record flagged for update if it exists in the target table.

Update as insert: Insert each record flagged for update.

Update else insert: Update the record if it exists. Otherwise, insert it.


Update Strategy Within a Mapping


For the greatest degree of control over your update strategy, you add Update Strategy
transformations to a mapping. The most important feature of this transformation is its
update strategy expression, used to flag individual records for insert, delete, update, or
reject.
Operation   Constant    Numeric Value
Insert      DD_INSERT   0
Update      DD_UPDATE   1
Delete      DD_DELETE   2
Reject      DD_REJECT   3
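For example, an update strategy expression tests each row and returns one of these
constants. A sketch with illustrative port names:

    IIF(ISNULL(CUSTOMER_ID), DD_REJECT,
        IIF(CHANGED_FLAG = 1, DD_UPDATE, DD_INSERT))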


Creating Update Strategy Transformation


To create an Update Strategy transformation:

1. In the Mapping Designer, open or create a mapping.

2. Click the Update Strategy button on the Transformations toolbar.

3. Click and drag across the area where you want the transformation to appear. When you
release the mouse button, a new Update Strategy transformation appears.

4. Choose Layout-Link Columns.

5. Click and drag all the ports from another transformation representing data you want to
pass through the Update Strategy transformation. In the Update Strategy transformation,
the Designer creates a copy of each port you click and drag. The Designer also connects
the new port to the original port. Each port in the Update Strategy transformation is a
combination input/output port. Normally, you would select all of the columns destined for
a particular target. After they pass through the Update Strategy transformation, this
information is flagged for update, insert, delete, or reject.

6. Double-click the transformation title bar.

7. Click Rename, enter a descriptive name, and click OK. The naming convention for Update
Strategy transformations is UPD_TransformationName.



8. Click the Properties tab.

9. Click the button in the Update Strategy Expression field. The Expression Editor
appears.

10. Enter an update strategy expression to flag records as inserts, deletes, updates, or
rejects.

11. Validate the expression and click OK to close the Expression Editor.

12. Click OK to return to the Designer.

13. Connect the ports in the Update Strategy transformation to another transformation or
a target instance.

14. Choose Repository-Save.


XML Source Qualifier Transformation


When you add an XML source definition to a mapping, you need to connect it to an XML Source
Qualifier transformation. The XML Source Qualifier represents the data elements that the Informatica
Server reads when it runs a session with XML sources.
You can use the XML Source Qualifier only with an XML source definition. You can link only one XML
Source Qualifier to one XML source definition. An XML Source Qualifier always has one input/output
port for every column in the XML source. When you create an XML Source Qualifier for a source
definition, the Designer automatically links each port in the XML source definition to a port in the XML
Source Qualifier. You cannot remove or edit any of the links. If you remove an XML source definition
from a mapping, the Designer also removes the corresponding XML Source Qualifier.
You can link ports of one group to ports in different transformations to form separate data flows.
However, you cannot link ports from more than one group in an XML Source Qualifier to ports in the
same target transformation.
If you drag columns of more than one group in an XML Source Qualifier to one transformation, the
Designer copies the columns of all the groups to the transformation. However, it links only the ports of
the first group to the corresponding ports of the columns created in the transformation.
A group in an XML Source Qualifier can link to one group in an XML target definition. You can link more
than one group in an XML Source Qualifier to an XML target definition.
You cannot use an XML Source Qualifier in a mapplet.


Normalizer Transformation
The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to
organize the data according to your own needs. A Normalizer transformation can appear anywhere in a
data flow when you normalize a relational source. Use a Normalizer transformation instead of the
Source Qualifier transformation when you normalize a COBOL source. When you drag a COBOL
source into the Mapping Designer workspace, the Normalizer transformation automatically appears,
creating input and output ports for every column in the source.
You primarily use the Normalizer transformation with COBOL sources, which are often stored in a
denormalized format. The OCCURS statement in a COBOL file nests multiple records of information in
a single record. Using the Normalizer transformation, you break out repeated data within a record into
separate records. For each new record it creates, the Normalizer transformation generates a unique
identifier. You can use this key value to join the normalized records.
You can also use the Normalizer transformation with relational sources to create multiple rows from a
single row of data.
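As an illustration of the denormalized layout the OCCURS statement produces, consider
this hypothetical COBOL record fragment; the Normalizer would break the three repeated
phone fields into three output rows, each carrying the generated key:

    01  CUSTOMER-REC.
        05  CUSTOMER-ID    PIC 9(5).
        05  CUSTOMER-NAME  PIC X(30).
        05  PHONE-NUMBER   OCCURS 3 TIMES PIC X(12).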


External Procedure Transformation


External Procedure transformations operate in conjunction with procedures you create outside of the
Designer interface to extend PowerMart/PowerCenter functionality.
Although the standard transformations provide you with a wide range of options, there are occasions
when you might want to extend the functionality provided with PowerMart and PowerCenter. For
example, the range of standard transformations (Expression, Stored Procedure, Filter, and so forth)
may not provide the exact functionality you need. If you are an experienced programmer, you may want
to develop complex functions within a dynamic link library (DLL) or UNIX shared library, instead of
creating the necessary Expression transformations in a mapping.
To obtain this kind of extensibility, you can use the Transformation Exchange (TX) dynamic invocation
interface built into PowerMart and PowerCenter. Using TX, you can create an Informatica External
Procedure transformation and bind it to an external procedure that you have developed. You can bind
External Procedure transformations to two kinds of external procedures:

COM external procedures (available on Windows NT/2000 only)

Informatica external procedures (available on Windows NT/2000 and Solaris, HP-UX, and
AIX)

To use TX, you must be an experienced C, C++, or Visual Basic programmer.

You can use multi-threaded code in both external procedures and advanced external
procedures.


Advanced External Procedure Transformation


Use the Advanced External Procedure transformation to create external transformation applications,
such as sorting and aggregation, which require all input rows to be processed before emitting any
output rows. To support this process, the input and output functions occur separately in Advanced
External Procedure transformation. The advanced external procedure specified in the transformation is
an input function, and is passed only through the input ports. The output function is a separate callback
function provided by Informatica that can be called from the Advanced External Procedure library. The
output callback function is used to pass all the output port values from the Advanced External
Procedure library to the Informatica Server. In contrast, in the External Procedure transformation, an
external procedure function does both input and output, and its parameters consist of all the ports of the
transformation.
Advanced External Procedure transformations are connected transformations. You cannot reference an
Advanced External Procedure transformation in an expression.


Differences Between External and Advanced External Procedures

External Procedure: Single return value; one row in, one row out. Each input has one or
zero outputs.
Advanced External Procedure: Multiple outputs; multiple rows in, multiple rows out.

External Procedure: Supports COM and Informatica procedures.
Advanced External Procedure: Supports Informatica procedures only.

External Procedure: Passive; allowed in concatenation data flows.
Advanced External Procedure: Active; not allowed in concatenation data flows.

External Procedure: Connected or unconnected; can be called from an expression.
Advanced External Procedure: Connected only; cannot be called from an expression.


Mapplets
A mapplet is a reusable object that represents a set of transformations. It allows you to reuse
transformation logic and can contain as many transformations as you need. You create mapplets in the
Mapplet Designer.
Create a mapplet when you want to use a standardized set of transformation logic in several mappings.
For example, if you have several fact tables that require a series of dimension keys, you can create a
mapplet containing a series of Lookup transformations to find each dimension key. You can then use
the mapplet in each fact table mapping, rather than recreate the same lookup logic in each mapping.
To create a mapplet, you add, connect, and configure transformations to complete the desired
transformation logic.
After you save a mapplet, you can use it in a mapping to represent the transformations within the
mapplet. When you use a mapplet in a mapping, you use an instance of the mapplet. Like a reusable
transformation, any changes made to the mapplet are automatically inherited by all instances of the
mapplet.


Mapplet Input
Data passing through a mapplet comes from a source. Source data for a mapplet can
originate from one of two places:

Sources within the mapplet. Mapplet input can originate from within the mapplet if you
include one or more source definitions in the mapplet. When you use more than one
source definition in a mapplet, you must connect the sources to a single Source Qualifier or
ERP Source Qualifier transformation. When you use the mapplet in a mapping, the
mapplet provides source data for the mapping.

Sources outside the mapplet. Mapplet input can originate from outside a mapplet if you
include an Input transformation to define mapplet input ports. When you use the mapplet in
a mapping, data passes through the mapplet as part of the mapping pipeline.


Mapplet Input Using Sources within Mapplet


You can use one or more source definitions in a mapplet to provide source data for the
mapplet. Source definitions can represent either file, relational, or ERP data. When you
include source definitions in a mapplet, you can connect them to one of the following
transformations:

Source Qualifier

ERP Source Qualifier

You cannot connect sources to a Normalizer transformation. You cannot use COBOL, MQ,
or XML source definitions in a mapplet.


Mapplet Input Using Sources Outside Mapplet


You can connect a mapplet to sources in a mapping by creating mapplet input ports. To create mapplet
input ports, you add an Input transformation to the mapplet. Each port in the Input transformation
connected to another transformation in the mapplet becomes a mapplet input port.
When you use an Input transformation in a mapplet, you must connect at least one port in the Input
transformation to another transformation in the mapplet. You cannot connect ports in an Input
transformation directly to an Output transformation.
You can connect an Input transformation to multiple transformations in a mapplet. However, you can
connect each port in the Input transformation to only one transformation in the mapplet. For example,
you can connect one port in an Input transformation to a Lookup transformation and a different port to
an Expression transformation. You cannot connect the same port to both the Lookup transformation and
the Expression transformation.
When you use the mapplet in a mapping, the Designer displays all available input ports below the Input
transformation name. You do not have to use all mapplet input ports in each mapping, but you must use
at least one.


Mapplet Output
To pass data out of a mapplet, you create mapplet output ports. To create mapplet output ports, you add
Output transformations to the mapplet. Each port in an Output transformation connected to another
transformation in the mapplet becomes a mapplet output port. Each mapplet must contain at least one
Output transformation, and at least one port in the Output transformation must be connected within the
mapplet.
Each Output transformation in a mapplet represents a group of mapplet output ports, or output group.
Each output group can pass data to a single pipeline in the mapping. To pass data from a mapplet to
more than one pipeline, create an Output transformation for each pipeline.
When you use a mapplet in a mapping, you connect ports in each output group to different pipelines.
You do not have to use all mapplet output ports in a mapping, but you must use at least one.


Creating Cubes and Dimensions


The Warehouse Designer provides an interface to let you create and edit cubes
and dimensions.
Multi-dimensional metadata refers to the logical organization of data used for
analysis in OLAP applications. This logical organization is generally specialized
for the most efficient data representation and access by end users of the OLAP
application.


Creating a Dimension
Before you can create a cube, you need to create dimensions. Complete each of the
following steps to create a dimension:
1. Enter a dimension description.

2. Add levels to the dimension.

3. Add hierarchies to the dimension.

4. Add level instances to the hierarchies.


Step 1: Creating a Dimension

1. In the Warehouse Designer, choose Targets-Create/Edit Dimension. The Dimension Editor
displays.

2. Select Add Dimension.

3. Enter the following information:

Name. Dimension names must be unique in a folder.

Description.

Database type. The database type of a dimension must match the database type of the
cube. Note: You cannot change the database type once you create the dimension.

4. Click OK.


Step 2: Add Levels to the Dimension

After you create the dimension, add as many levels as needed. Levels hold the properties necessary to
create target tables.
1. In the Dimension Editor, select Levels and click Add Level.

3. Click Level Properties.

4. Click the Import from Source Fields button.


5. Select a source table from which you want to copy columns to the level. The columns
display in the Source Fields section.


Step 3: Add Hierarchies to the Dimension

1. In the Dimension Editor, select Hierarchies.

2. Click Add Hierarchy.


3. Enter a hierarchy name, description, and select normalized or non-normalized.

Normalized cubes restrict redundant data.

Non-normalized cubes allow for redundant data, which increases speed for retrieving data.


Step 4: Add Levels to Hierarchy


After you create a hierarchy, you add levels to it. You can have only one root level in a
hierarchy.
To add a level to a hierarchy:

1. From the Dimension Editor, drill down to view the levels in the dimension.

2. Drag the level you want to define as the root level in the hierarchy.


3. Enter a target table name and description of the target table.


Creating a Cube: Step 1

After you create dimensions, you can create a cube.
To create a cube:

1. From the Warehouse Designer, choose Targets-Create Cube.


2. Enter the following information:

Cube name. The cube name must be unique in a folder.

Cube type: Normalized or Non-normalized. Normalized dimensions must have a normalized
cube. Likewise, non-normalized dimensions must have a non-normalized cube.

Database type. The database type for the cube must match the database type for the
dimensions in the cube.

3. Click Next.


Creating a Cube: Step 2


4. Specify the dimensions and hierarchies to include in the cube.


Creating a Cube: Step 3

Add measures to the cube.


Creating a Cube: Step 4

Add a name for the fact table.


Viewing Metadata for Cubes and Dimensions

You can view the metadata for cubes and dimensions in the Repository Manager.
To view cube or dimension metadata:

In the Repository Manager, open a folder.

Drill down to the cube or dimension you want to analyze.

The Repository Manager displays the metadata for each object.


Mapping Wizards

The Designer provides two mapping wizards to help you create mappings quickly and easily. Both
wizards are designed to create mappings for loading and maintaining star schemas, a series of
dimensions related to a central fact table. You can, however, use the generated mappings to load
other types of targets.
You choose a different wizard and different options in each wizard based on the type of target you
want to load and the way you want to handle historical data in the target:

Getting Started Wizard. Creates mappings to load static fact and dimension tables, as
well as slowly growing dimension tables.

Slowly Changing Dimensions Wizard. Creates mappings to load slowly changing dimension
tables based on the amount of historical dimension data you want to keep and the method
you choose to handle historical dimension data.

After using a mapping wizard, you can edit the generated mapping to further customize it.


Using Getting Started Wizards

The Getting Started Wizard creates mappings to load static fact and dimension tables, as well as
slowly growing dimension tables.
The Getting Started Wizard can create two types of mappings:

Simple Pass Through. Loads a static fact or dimension table by inserting all rows. Use
this mapping when you want to drop all existing data from your table before loading new
data.

Slowly Growing Target. Loads a slowly growing fact or dimension table by inserting new
rows. Use this mapping to load new data when existing data does not require updates.


Using Slowly Changing Dimension Wizards


The Slowly Changing Dimensions Wizard creates mappings to load slowly changing dimension
tables:

Type 1 Dimension mapping. Loads a slowly changing dimension table by inserting new
dimensions and overwriting existing dimensions. Use this mapping when you do not want a
history of previous dimension data.

Type 2 Dimension/Version Data mapping. Loads a slowly changing dimension table by
inserting new and changed dimensions using a version number and incremented primary key
to track changes. Use this mapping when you want to keep a full history of dimension data
and to track the progression of changes.

Type 2 Dimension/Flag Current mapping. Loads a slowly changing dimension table by
inserting new and changed dimensions using a flag to mark current dimension data and an
incremented primary key to track changes. Use this mapping when you want to keep a full
history of dimension data, tracking the progression of changes while flagging only the
current dimension.

Type 2 Dimension/Effective Date Range mapping. Loads a slowly changing dimension table by
inserting new and changed dimensions using a date range to define current dimension data.
Use this mapping when you want to keep a full history of dimension data, tracking changes
with an exact effective date range.

Type 3 Dimension mapping. Loads a slowly changing dimension table by inserting new
dimensions and updating values in existing dimensions. Use this mapping when you want to
keep the current and previous dimension values in your dimension table.


Steps for Creating Slowly Growing Target Mapping


To create a Slowly Growing Target mapping:

1. In the Mapping Designer, choose Mappings-Wizards-Getting Started.

2. Enter a mapping name and select Slowly Growing Target, and click Next. The naming
convention for mapping names is mMappingName.



Select a source definition to be used in the mapping.
All available source definitions appear in the Select Source Table list. This list
includes shortcuts, flat file, relational, and ERP sources.


Enter a name for the mapping target table. Click Next. The naming convention for target
definitions is T_TARGET_NAME.

Select the column or columns from the Target Table Fields list that you want the
Informatica Server to use to look up data in the target table. Click Add.
The wizard adds selected columns to the Logical Key Fields list.
Tip: The column or columns you select should be a key in the source.
When you run the session, the Informatica Server performs a lookup on existing target
data. The Informatica Server returns target data when Logical Key Fields columns match
corresponding target columns.
To remove a column from Logical Key Fields, select the column and click Remove.
Note: The Fields to Compare for Changes field is disabled for the Slowly Growing
Targets mapping.


Configuring a Slowly Growing Target Session

The Slowly Growing Target mapping flags new source rows, and then inserts them to the target
with a new primary key. The mapping uses an Update Strategy transformation to indicate new
rows must be inserted. Therefore, when you create a session for the mapping, configure the
session as follows:

For the source, set Treat Rows As to Data Driven.

To ensure rows are inserted into the target properly, click the Target Options button to
access the Targets dialog box and select Insert.


Understanding the Transformations


Transformation Name (Transformation Type): Description

SQ_SourceName (Source Qualifier or ERP Source Qualifier): Selects all rows from the
source you choose in the Mapping Wizard.

LKP_GetData (Lookup): Caches the existing target table. Compares a logical key column in
the source against the corresponding key column in the target.

EXP_DetectChanges (Expression): Uses the following expression to flag source rows that
have no matching key in the target (indicating they are new):
IIF(ISNULL(PM_PRIMARYKEY),TRUE,FALSE)
Populates the NewFlag field with the results. Passes all rows to FIL_InsertNewRecord.

FIL_InsertNewRecord (Filter): Uses the following filter condition to filter out any rows
from EXP_DetectChanges that are not marked new (TRUE): NewFlag. Passes new rows to
UPD_ForceInserts.

UPD_ForceInserts (Update Strategy): Uses DD_INSERT to insert rows to the target.

SEQ_GenerateKeys (Sequence Generator): Generates a value for each new row written to the
target, incrementing values by 1. Passes values to the target to populate the
PM_PRIMARYKEY column.

T_TargetName (Target Definition): Instance of the target definition for new rows to be
inserted into the target.


Creating a Type 1 Dimension Mapping

The Type 1 Dimension mapping filters source rows based on user-defined comparisons and
inserts only those found to be new dimensions to the target. Rows containing changes to
existing dimensions are updated in the target by overwriting the existing dimension. In the
Type 1 Dimension mapping, all rows contain current dimension data.
Use the Type 1 Dimension mapping to update a slowly changing dimension table when you
do not need to keep any previous versions of dimensions in the table.


Understanding the Mapping

The Type 1 Dimension mapping performs the following:

Selects all rows

Caches the existing target as a lookup table

Compares logical key columns in the source against corresponding columns in the target
lookup table

Compares source columns against corresponding target columns if key columns match

Flags new rows and changed rows

Creates two data flows: one for new rows, one for changed rows

Generates a primary key for new rows

Inserts new rows to the target

Updates changed rows in the target, overwriting existing rows


Steps for Creating Type 1 Dimension Mapping


To create a Type 1 Dimension mapping:

In the Mapping Designer, choose Mappings-Wizards-Slowly Changing Dimension.

Enter a mapping name and select Type 1 Dimension, and click Next. The naming convention
for mappings is mMappingName.


Select a source definition to be used by the mapping. All available source definitions
appear in the Select Source Table list. This list includes shortcuts, flat file,
relational, and ERP sources.


Enter a name for the mapping target table. Click Next. The naming convention for target
definitions is T_TARGET_NAME.

Select the column or columns you want to use as a lookup condition from the Target Table
Fields list and click Add.
The wizard adds selected columns to the Logical Key Fields list.
Tip: The column or columns you select should be a key in the source.
When you run the session, the Informatica Server performs a lookup on existing target data.
The Informatica Server returns target data when Logical Key Fields columns match
corresponding target columns.
To remove a column from Logical Key Fields, select the column and click Remove.


Configuring a Type 1 Dimension Session

The Type 1 Dimension mapping inserts new rows with a new primary key and updates
existing rows. When you create a session for the mapping, configure the session as
follows:

For the source, set Treat Rows As to Data Driven and select the source
database.

Select the target database. Then to ensure the Informatica Server loads
rows to the target properly, click the Target Options button. Select Insert and
Update (as Update).


Creating a Type 2 Dimension Mapping / Version Data Mapping

The Type 2 Dimension/Version Data mapping filters source rows based on user-defined
comparisons and inserts both new and changed dimensions into the target. Changes are
tracked in the target table by versioning the primary key and creating a version number for
each dimension in the table. In the Type 2 Dimension/Version Data target, the current
version of a dimension has the highest version number and the highest incremented primary
key of the dimension.
Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension
table when you want to keep a full history of dimension data in the table. Version numbers
and versioned primary keys track the order of changes to each dimension.
When you use this option, the Designer creates two additional fields in the target:

PM_PRIMARYKEY. The Informatica Server generates a primary key for each row
written to the target.

PM_VERSION_NUMBER. The Informatica Server generates a version number for each row written
to the target.


Handling Keys

In a Type 2 Dimension/Version Data mapping, the Informatica Server generates a new
primary key value for each new dimension it inserts into the target. An Expression
transformation increments key values by 1,000 for new dimensions.
When updating an existing dimension, the Informatica Server increments the existing
primary key by 1.
For example, the Informatica Server inserts the following new row with a key value of
65,000 since this is the sixty-fifth dimension in the table.

PM_PRIMARYKEY   ITEM     STYLES
65000           Sandal   5

The next time you run the session, the same item has a different number of styles. The
Informatica Server creates a new row with updated style information and increases the
existing key by 1 to create a new key of 65,001. Both rows exist in the target, but the
row with the higher key version contains current dimension data.

PM_PRIMARYKEY   ITEM     STYLES
65000           Sandal   5
65001           Sandal   14

Numbering Versions

In addition to versioning the primary key, the Informatica Server generates a matching
version number for each row inserted into the target. Version numbers correspond to the
final digit in the primary key. New dimensions have a version number of 0.
For example, in the data below, the versions are 0, 1, and 2. The highest version number
contains the current dimension data.

PM_PRIMARYKEY   ITEM     STYLES   PM_VERSION_NUMBER
65000           Sandal   5        0
65001           Sandal   14       1
65002           Sandal   17       2


Understanding the Mapping

The Type 2 Dimension/Version Data mapping performs the following:

Selects all rows

Caches the existing target as a lookup table

Compares logical key columns in the source against corresponding columns in the target
lookup table

Compares source columns against corresponding target columns if key columns match

Flags new rows and changed rows

Creates two data flows: one for new rows, one for changed rows

Generates a primary key and version number for new rows

Inserts new rows to the target

Increments the primary key and version number for changed rows

Inserts changed rows in the target


Steps for Creating a Type 2 Dimension Mapping / Version Data Mapping

To create a Type 2 Dimension/Version Data mapping:

In the Mapping Designer, choose Mappings-Wizards-Slowly Changing Dimensions.

Enter a mapping name and select Type 2 Dimension. Click Next. The naming convention for
mappings is mMappingName.


Select a source definition to be used by the mapping. All available source definitions
appear in the Select Source Table list. This list includes shortcuts, flat file,
relational, and ERP sources.

176

Steps for Creating a Type 2 Dimension/Version Data Mapping

• Select the column or columns you want to use as a lookup condition from the Target Table Fields list and click Add.

177

Steps for Creating a Type 2 Dimension/Version Data Mapping

• Click Next. Select Keep Version Number in Separate Column.

178

Configuring a Type 2 Dimension/Version Data Mapping

The Type 2 Dimension/Version Data mapping inserts both new and updated rows with a unique primary key. When you create a session for the mapping, configure the session as follows:
• For the source, set Treat Rows As to Data Driven and select the source database.
• To ensure rows are inserted into the target properly, click the Target Options button to access the Targets dialog box and select Insert.

179

Creating a Type 2 Dimension Mapping / Flag Current Mapping

The Type 2 Dimension/Flag Current mapping filters source rows based on user-defined comparisons and inserts both new and changed dimensions into the target. Changes are tracked in the target table by flagging the current version of each dimension and versioning the primary key. In the Type 2 Dimension/Flag Current target, the current version of a dimension has a current flag set to 1 and the highest incremented primary key.
Use the Type 2 Dimension/Flag Current mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table, with the most current data flagged. Versioned primary keys track the order of changes to each dimension.
When you use this option, the Designer creates two additional fields in the target:
• PM_CURRENT_FLAG. The Informatica Server flags the current row 1 and all previous versions 0.
• PM_PRIMARYKEY. The Informatica Server generates a primary key for each row written to the target.

180

Understanding the Mapping


The Type 2 Dimension/Flag Current mapping performs the following:
• Selects all rows
• Caches the existing target as a lookup table
• Compares logical key columns in the source against corresponding columns in the target lookup table
• Compares source columns against corresponding target columns if key columns match
• Flags new rows and changed rows
• Creates two data flows: one for new rows, one for changed rows
• Generates a primary key and current flag for new rows
• Inserts new rows to the target
• Increments the existing primary key and sets the current flag for changed rows
• Inserts changed rows in the target
• Updates existing versions of the changed rows in the target, resetting the current flag to indicate the row is no longer current
181

Steps for Creating a Type 2 Dimension/Flag Current Mapping

• Select Mark the Current Dimension Record with a Flag.

182

Creating a Type 2 Dimension Mapping / Effective Date Range Mapping

The Type 2 Dimension/Effective Date Range mapping filters source rows based on user-defined
comparisons and inserts both new and changed dimensions into the target. Changes are tracked
in the target table by maintaining an effective date range for each version of each dimension in the
target. In the Type 2 Dimension/Effective Date Range target, the current version of a dimension
has a begin date with no corresponding end date.
Use the Type 2 Dimension/Effective Date Range mapping to update a slowly changing dimension
table when you want to keep a full history of dimension data in the table. An effective date range
tracks the chronological history of changes for each dimension.
When you use this option, the Designer creates three additional fields in the target:
• PM_BEGIN_DATE. For each new and changed dimension written to the target, the Informatica Server uses the system date to indicate the start of the effective date range for the dimension.
• PM_END_DATE. For each dimension being updated, the Informatica Server uses the system date to indicate the end of the effective date range for the dimension.
• PM_PRIMARYKEY. The Informatica Server generates a primary key for each row written to the target.

183

Understanding the Mapping

The Type 2 Dimension/Effective Date Range mapping performs the following:
• Selects all rows
• Caches the existing target as a lookup table
• Compares logical key columns in the source against corresponding columns in the target lookup table
• Compares source columns against corresponding target columns if key columns match
• Flags new rows and changed rows
• Creates three data flows: one for new rows, one for changed rows, one for updating existing rows
• Generates a primary key and beginning of the effective date range for new rows
• Inserts new rows to the target
• Generates a primary key and beginning of the effective date range for changed rows
• Inserts changed rows in the target
• Updates existing versions of the changed rows in the target, generating the end of the effective date range to indicate the row is no longer current

184

Steps for Creating a Type 2 Dimension/Effective Date Range Mapping

• Select Mark the Dimension Records with their Effective Date Range.

185

Creating a Type 3 Dimension Mapping

The Type 3 Dimension mapping filters source rows based on user-defined comparisons and
inserts only those found to be new dimensions to the target. Rows containing changes to existing
dimensions are updated in the target. When updating an existing dimension, the Informatica
Server saves existing data in different columns of the same row and replaces the existing data
with the updates. The Informatica Server optionally enters the system date as a timestamp for
each row it inserts or updates. In the Type 3 Dimension target, each dimension contains current
dimension data.
Use the Type 3 Dimension mapping to update a slowly changing dimension table when you want
to keep only current and previous versions of column data in the table. Both versions of the
specified column or columns are saved in the same row.
When you use this option, the Designer creates additional fields in the target:
• PM_PREV_ColumnName. The Designer generates a previous column corresponding to each column for which you want historical data. The Informatica Server keeps the previous version of dimension data in these columns.
• PM_PRIMARYKEY. The Informatica Server generates a primary key for each row written to the target.
• PM_EFFECT_DATE. An optional field. The Informatica Server uses the system date to indicate when it creates or updates a dimension.
186

Understanding the Mapping


The Type 3 Dimension mapping performs the following:
• Selects all rows
• Caches the existing target as a lookup table
• Compares logical key columns in the source against corresponding columns in the target lookup table
• Compares source columns against corresponding target columns if key columns match
• Flags new rows and changed rows
• Creates two data flows: one for new rows, one for updating changed rows
• Generates a primary key and optionally notes the effective date for new rows
• Inserts new rows to the target
• Writes previous values for each changed row into previous columns and replaces previous values with updated values
• Optionally uses the system date to note the effective date for inserted and updated values
• Updates changed rows in the target


187

Steps for Creating a Type 3 Dimension Mapping

• If you want the Informatica Server to timestamp new and changed rows, select Effective Date. The wizard displays the columns the Informatica Server compares and the name of the column to hold historic values.

188

Server Architecture

The Informatica Server moves data from sources to targets based on mapping and session
metadata stored in a repository.
Session Process
The Informatica Server uses both process memory and system shared memory to perform these
tasks. It runs as a daemon on UNIX and as a service on Windows NT/2000. The Informatica
Server uses the following processes to run a session:
• The Load Manager process. Starts the session, creates the DTM process, and sends post-session email when the session completes.
• The DTM process. Creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.

189

Load Manager Process

The Load Manager is the primary Informatica Server process. It performs the following tasks:
• Manages session and batch scheduling.
• Locks the session and reads session properties.
• Reads the parameter file.
• Expands the server and session variables and parameters.
• Verifies permissions and privileges.
• Validates source and target code pages.
• Creates the session log file.
• Creates the Data Transformation Manager (DTM) process, which executes the session.

190

Data Transformation Manager (DTM) Process

The DTM process is the second process associated with a session run. The primary purpose of
the DTM process is to create and manage threads that carry out the session tasks.
The DTM allocates process memory for the session and divides it into buffers. This is also known
as buffer memory. The default memory allocation is 12,000,000 bytes. It creates the main thread,
which is called the master thread. The master thread creates and manages all other threads.
Thread Type                     Description
Master Thread                   Main thread of the DTM process. Creates and manages all other threads. Handles stop and abort requests from the Load Manager.
Mapping Thread                  One thread for each session. Fetches session and mapping information. Compiles the mapping. Cleans up after session execution.
Pre- and Post-Session Threads   One thread each to perform pre- and post-session operations.
Reader Thread                   One thread for each partition for each source pipeline. Reads sources. Relational sources use relational threads, and file sources use file threads.
Writer Thread                   One thread for each partition, if a target exists in the source pipeline. Writes to targets.
Transformation Thread           One or more transformation threads for each partition.

191

Running a Session

When the Informatica Server runs a session, it performs the following tasks as configured in the session properties:
1. Load Manager locks the session and reads session properties.
2. Load Manager reads the parameter file.
3. Load Manager expands the server and session variables and parameters.
4. Load Manager verifies permissions and privileges.
5. Load Manager validates source and target code pages.
6. Load Manager creates the session log file.
7. Load Manager creates the DTM process.
8. DTM process allocates DTM process memory.
9. DTM initializes the session and fetches the mapping.
10. DTM executes pre-session commands and procedures.
11. DTM creates reader, transformation, and writer threads for each source pipeline. If the pipeline is partitioned, it creates a set of threads for each partition.
12. DTM executes post-session commands and procedures.
13. DTM writes historical incremental aggregation and lookup data to disk, and it writes persisted sequence values and mapping variables to the repository.
14. Load Manager sends post-session email.

192

System Resources

The Informatica Server uses the following system resources:
• CPU
• Shared memory
• Buffer memory

Cache Memory
The DTM process creates in-memory index and data caches to temporarily store data used by the following transformations:
• Aggregator transformation (without sorted input)
• Rank transformation
• Joiner transformation
• Lookup transformation (with caching enabled)

193

Output Files and Caches

The Informatica Server creates the following output files:
• Informatica Server Log
• Session Log File
• Session Details File
• Performance Detail File
• Reject Files
• Control File
• Post-Session Email
• Output File
• Cache Files

194

Cache Files

The Informatica Server writes to the index and data cache files during the session in the following cases:
• The mapping contains one or more Aggregator transformations, and the session is configured for incremental aggregation.
• The mapping contains a Lookup transformation that is configured to use a persistent lookup cache, and the Informatica Server runs the session for the first time.
• The mapping contains a Lookup transformation that is configured to initialize the persistent lookup cache.
• The DTM runs out of cache memory and pages to the local cache files. The DTM may create multiple files when processing large amounts of data. The session fails if the local directory runs out of disk space.

After the session completes, the DTM generally deletes the overflow index and data files. It does not delete the cache files under the following circumstances:
• The session is configured to perform incremental aggregation.
• The session is configured with a persistent lookup cache.

195

Configuring Server Manager

You can configure the following settings in the Server Manager:
• Configure Server Manager display options. You can configure display options such as grouping sessions or docking and undocking windows.
• Register Informatica Servers. Before you can start an Informatica Server, you must register it with the repository.
• Create source and target database connections. Create connections to each source and target database. You must create connections to a database before you can create a session that accesses the database.
• Create FTP connections. After you create FTP connections, you can configure a session to use FTP to access source or target files.
• Create external loader connections. Create connections to Oracle, Sybase IQ, and Teradata external loaders. You must create these connections before you can configure a session to use an external loader.

196

Server Manager Window

197

Server Variables

Server Variable     Required/Optional   Description
$PMRootDir          Required            A root directory to be used by any or all other server variables. Informatica recommends you use the Server installation directory as the root directory.
$PMSessionLogDir    Required            Default directory for session logs. Defaults to $PMRootDir/SessLogs.
$PMBadFileDir       Required            Default directory for reject files. Defaults to $PMRootDir/BadFiles.
$PMCacheDir         Required            Default directory for the lookup cache, index and data caches, and index and data files. Defaults to $PMRootDir/Cache. To avoid performance problems, always use a drive local to the Informatica Server for the cache directory. Do not use a mapped or mounted drive for cache files.
$PMTargetFileDir    Required            Default directory for target files. Defaults to $PMRootDir/TgtFiles.
$PMSourceFileDir    Required            Default directory for source files. Defaults to $PMRootDir/SrcFiles.
$PMExtProcDir       Required            Default directory for external procedures. Defaults to $PMRootDir/ExtProc.
$PMTempDir          Required            Default directory for temporary files. Defaults to $PMRootDir/Temp.

198

Server Variables

Server Variable             Required/Optional   Description
$PMSuccessEmailUser         Optional            Email address to receive post-session email when the session completes successfully. Use to address post-session email.
$PMFailureEmailUser         Optional            Email address to receive post-session email when the session fails. Use to address post-session email. The default value is an empty string.
$PMSessionLogCount          Optional            Number of session logs the Informatica Server archives for the session. Defaults to 0. Use to archive session logs.
$PMSessionErrorThreshold    Optional            Number of errors the Informatica Server allows before failing the session. Defaults to 0. Use to configure the Stop On option in the session property sheet.

199

Working with Sessions

A session is a set of instructions that tells the Informatica Server how and when to move data from sources to targets.
• You create and maintain sessions in the Server Manager.
• When you create a session, you enter general information such as the session name, session schedule, and the Informatica Server to run the session.
• You can also select options to execute pre-session shell commands, send post-session email, and FTP source and target files.
• Using session properties, you can also override parameters established in the mapping, such as source and target location, source and target type, error tracing levels, and transformation attributes.

200

Creating a Session

To create a new session, you must enter the following information:
• Mapping used for the session
• Session name, which must be unique among all sessions in a given folder
• Source type
• Update strategy for writing to targets
• Target type
• Schedule for the session to run
• Server on which you want the session to run

201

Using Session Wizard

Use the Session Wizard to create a session for a valid mapping. The Session Wizard has the following pages, and each of those pages has multiple dialog boxes where you enter session properties:
• General page. Enter source and target information and performance configuration.
• Sources page. Enter source information for heterogeneous sessions.
• Time page. Schedule the session.
• Log Files page. Enter log file and error handling information.
• Transformation page. Override transformation properties.
• Partition page. Configure the session for partitioning.

202

Starting a Session

You can start a session using:
• Server Manager
• pmcmd command

203

Monitoring a Session

The Server Manager allows you to monitor sessions on an Informatica Server. When monitoring a
session, you can use information provided through the Server Manager to troubleshoot sessions
and improve session performance.
When you poll the Informatica Server, it indicates the following types of session status in the Monitor window:
• Initializing. Indicates that the session is initializing.
• Scheduled. Indicates that the session is scheduled.
• Running. Indicates that the session is running.
• Completed. Indicates that the session completed successfully.
• Failed. Indicates that the session has failed.

204

Monitor Window Contents


Column Name       Description
Session Name      Session name.
Server Name       Server running the session.
Top Level Batch   Top or outermost batch containing the session. If the session is not a part of a nested batch, this field displays the folder name.
Batch             Batch containing the session, if the session is batched. If the session is a standalone session, this field displays the folder name.
Status            Session status.
Start Time        Time the session started.
Completion Time   Time the session completed.
First Error       First error that occurred in the session.
Mapping Name      Mapping used in the session.
Session Run Mode  Indicates whether the session is a regular or debug session.
User Name         User who started the session.

205

Monitoring Session Details

When you run a session, the Server Manager creates session details that provide load statistics
for each target in the mapping. You can view session details during the session or after the
session completes.
Session Detail    Description
Table Name        Name of target table. If you have multiple instances of a target, this field shows both the target instance name and the table name. The target instance display format is Table Name:Instance Name.
Loaded            Number of rows written to the target.
Failed            Number of rows rejected by the target.
Read Throughput   Rate at which the Informatica Server read rows from the source (bytes/sec).
Write Throughput  Rate at which the Informatica Server wrote data into the target (rows/sec).
Current Message   The most recent error message written to the session log. If you view details after the session completes, this field displays the last error message.

206

Stopping a Session

To stop a session in the Server Manager:
• In the Server Manager Navigator, select the session you want to stop.
• To stop a session running against the Informatica Server configured in the session properties, choose Server Requests-Stop or use the Stop button on the toolbar.
• To stop a session running against an Informatica Server other than the one configured in the session properties, use the Stop button on the toolbar to select the Informatica Server running the session.

To abort a session in the Server Manager:
• In the Server Manager Navigator, select the session you want to abort.
• To abort a session running against the Informatica Server configured in the session properties, choose Server Requests-Abort.

207

Managing Batches

Batches provide a way to group sessions for either serial or parallel execution by the Informatica Server. There are two types of batches:
• Sequential. Runs sessions one after the other.
• Concurrent. Runs sessions at the same time.

Nesting Batches
Each batch can contain any number of sessions or other batches. You can nest batches several
levels deep, defining batches within batches. Nested batches are useful when you want to control
a complex series of sessions that must run sequentially or concurrently.
Scheduling
When you place sessions in a batch, the batch schedule overrides the session schedule by
default. However, you can configure a batched session to run on its own schedule by selecting the
Use Absolute Time session option.

208

Recovering a Batch

When a session or sessions in a batch fail, you can perform recovery to complete the batch. The steps you take vary depending on the type of batch:
• Sequential batch. If the batch is sequential, you can recover data from the session that failed and run the remaining sessions in the batch.
• Concurrent batch. If a session within a concurrent batch fails, but the rest of the sessions complete successfully, you can recover data from the failed session targets to complete the batch. However, if all sessions in a concurrent batch fail, you might want to truncate all targets and run the batch again.

209

Using PMCMD

You can use the command line program pmcmd to communicate with the Informatica Server. This
does not replace the Server Manager, since there are many tasks that you can perform only with
the Server Manager.
You can perform the following actions with pmcmd:
• Determine if the Informatica Server is running.
• Start sessions and batches.
• Stop sessions and batches.
• Recover sessions.
• Stop the Informatica Server.

pmcmd returns zero on success and non-zero on failure.
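Because pmcmd reports status through its exit code, you can gate later steps of a shell script on it. A minimal sketch using the ping syntax shown later in this section; the user name, host, and port are placeholders, and the password is read from a hypothetical PMPASSWORD environment variable via the %password_env_var form:

pmcmd ping admin %PMPASSWORD infa_host:4001
if [ $? -ne 0 ]; then
    echo "Informatica Server is not reachable" >&2
    exit 1
fi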

210

Parameters for PMCMD command

You need the following information to use pmcmd:
• Repository username. This can be configured optionally as an environment variable.
• Repository password. This can be configured optionally as an environment variable.
• Connection type. The type of connection from the client machine to the Informatica Server (TCP/IP or IPX/SPX).
• Port or connection. The TCP/IP port number or IPX/SPX connection (Windows NT/2000 only) to the Informatica Server.
• Host name. The machine hosting the Informatica Server (if running pmcmd from a remote machine through a TCP/IP connection).
• Session or batch name. The names of any sessions or batches you want to start or stop.
• Folder name. The folder names for those sessions or batches (if their names are not unique in the repository).
• Parameter file. The directory and name of the parameter file you want the Informatica Server to use with the session or batch.

211

Pinging Informatica Server

To determine if the Informatica Server is running:

Use the following syntax to ping the Informatica Server on a Windows NT/2000 system:
pmcmd ping [{user_name | %user_env_var} {password | %password_env_var}]
{[TCP/IP:][hostname:]portno | IPX/SPX:ipx/spx_address}

Use the following syntax to ping the Informatica Server on a UNIX system:
pmcmd ping [{user_name | %user_env_var} {password | %password_env_var}]
[hostname:]portno

212

Ping Return Values

pmcmd ping returns 0 when the Informatica Server is running, and a non-zero code for the following failure conditions:
• The Informatica Server is down, or pmcmd cannot connect to the Informatica Server. The TCP/IP host name or port number, or IPX/SPX address (if applicable) may be incorrect, or a network problem occurred.
• An internal pmcmd error occurred. Contact Informatica Technical Support.
• The Informatica Server timed out while waiting for the request. Try sending it again.

213

Starting Sessions / Batches Using PMCMD

Use the following syntax to start a session or batch on a Windows NT/2000 system:
pmcmd start {user_name | %user_env_var} {password | %password_env_var}
{[TCP/IP:][hostname:]portno | IPX/SPX:ipx/spx_address} [folder_name:]{session_name |
batch_name} [:pf=param_file] session_flag wait_flag

Use the following syntax to start a session or batch on a UNIX system:
pmcmd start {user_name | %user_env_var} {password | %password_env_var}
[hostname:]portno [folder_name:]{session_name | batch_name}
[:pf=param_file] session_flag wait_flag
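As a concrete illustration, the following hypothetical UNIX invocation starts the session s_load_orders in folder Sales with a parameter file. The user name, host, port, folder, session, and file names are placeholders, and the example assumes the trailing flags mark the name as a session rather than a batch (session_flag = 1) and run pmcmd in wait mode (wait_flag = 1):

pmcmd start admin %PMPASSWORD infa_host:4001 Sales:s_load_orders :pf=orders.par 1 1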

214

PMCMD Return Values


pmcmd start returns 0 on success:
• If pmcmd was called in wait mode (wait_flag = 1), 0 indicates the session or batch ran successfully.
• If pmcmd was not called in wait mode (wait_flag = 0), 0 indicates the request to start the session was successfully transmitted to the Informatica Server, and it acknowledged the request.

Non-zero codes report the following failure conditions:
• The Informatica Server is down, or pmcmd cannot connect to the Informatica Server. The TCP/IP host name or port number, or IPX/SPX address (if applicable) may be incorrect, or a network problem occurred.
• The specified session or batch name does not exist. Or, if you specified a folder name, the folder does not contain the specified session or batch.
• An error occurred in starting or running the session or batch. This return value may appear if you run pmcmd in non-wait mode (wait_flag = 0) and the Informatica Server returns a negative acknowledgment. If this is not the case, look for more information in the server error log.
• Usage error: the wrong parameters were passed to pmcmd.
• An internal pmcmd error occurred. Contact Informatica Technical Support.
• You used an invalid username or password.
• You do not have the appropriate permissions or privileges to perform this action.
• The Informatica Server timed out while waiting for the request. Try sending it again.

215

PMCMD Return Values

Return Value   Description
10   pmcmd successfully started the session or batch in wait mode (wait_flag = 1). However, while checking the status of the running session or batch, pmcmd attempted to communicate with the Informatica Server 100 consecutive times, and the Informatica Server timed out trying to receive the request. (This error occurs only under extremely rare conditions.) If pmcmd returns from wait mode with this error code, the session or batch may or may not be running. If you use pmcmd in wait mode to see if the session or batch completed successfully, use the Server Manager to check its status or open the server error log file. If you use pmcmd in wait mode as a series of related commands, you may need to work around this return value.
13   The username environment variable [variable name] is not defined.
14   The password environment variable [variable name] is not defined.
15   The username environment variable is missing.
16   The password environment variable is missing.
17   Parameter file does not exist.
18   The Informatica Server found the parameter file, but experienced errors expanding the start values for the session parameters. The parameter file may not have the start values for the session parameters.

216

Stopping Sessions / Batches Using PMCMD

Use the following syntax to stop a session or batch on a Windows NT/2000 system:
pmcmd stop {user_name | %user_env_var} {password | %password_env_var}
{[TCP/IP:][hostname:]portno | IPX/SPX:ipx/spx_address} [folder_name:]{session_name |
batch_name} [:pf=param_file] session_flag wait_flag

Use the following syntax to stop a session or batch on a UNIX system:
pmcmd stop {user_name | %user_env_var} {password | %password_env_var}
[hostname:]portno [folder_name:]{session_name | batch_name}
[:pf=param_file] session_flag wait_flag
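Mirroring the hypothetical start example earlier, a placeholder invocation that stops the same session in wait mode:

pmcmd stop admin %PMPASSWORD infa_host:4001 Sales:s_load_orders 1 1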

217

Recovering Sessions Using PMCMD

Use pmcmd to recover a standalone session. You cannot use pmcmd to recover a session in a batch.
Use the following syntax to recover a standalone session on a Windows NT/2000 system:
pmcmd startrecovery {user_name | %user_env_var}
{password | %password_env_var}
{[TCP/IP:][hostname:]portno | IPX/SPX:ipx/spx_address}
[folder_name:]session_name [:pf=param_file] wait_flag

Use the following syntax to recover a standalone session on a UNIX system:
pmcmd startrecovery {user_name | %user_env_var}
{password | %password_env_var} [hostname:]portno
[folder_name:]session_name [:pf=param_file] wait_flag
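For example, a hypothetical UNIX invocation that recovers the standalone session s_load_orders in folder Sales in wait mode (all names, host, and port are placeholders):

pmcmd startrecovery admin %PMPASSWORD infa_host:4001 Sales:s_load_orders 1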

218

Stopping the Server Using PMCMD

Use the following syntax to stop the Informatica Server on a Windows NT/2000 system:
pmcmd stopserver {user_name | %user_env_var}
{password | %password_env_var}
{[TCP/IP:][hostname:]portno | IPX/SPX:ipx/spx_address}

Use the following syntax to stop the Informatica Server on a UNIX system:
pmcmd stopserver {user_name | %user_env_var}
{password | %password_env_var} [hostname:]portno
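For example, against the same hypothetical host and port used in the earlier examples:

pmcmd stopserver admin %PMPASSWORD infa_host:4001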

219

PMCMD Stop Server Return Values


pmcmd stopserver returns 0 when the Informatica Server successfully stopped, and a non-zero code for the following failure conditions:
• The Informatica Server is down, or pmcmd cannot connect to the Informatica Server. The TCP/IP host name or port number, or IPX/SPX address (if applicable) may be incorrect, or a network problem occurred.
• An internal pmcmd error occurred. Contact Informatica Technical Support.
• An error occurred while stopping the Informatica Server. Contact Informatica Technical Support.
• You used an invalid username or password.
• You do not have the appropriate permissions or privileges to perform this action.
• The Informatica Server timed out while waiting for the request. Try sending it again.

The following codes report environment variable problems:
Return Value   Description
13   The username environment variable [variable name] is not defined.
14   The password environment variable [variable name] is not defined.
15   The username environment variable is missing.
16   The password environment variable is missing.

220

Reject Loading

During a session, the Informatica Server creates a reject file for each target instance in the
mapping. If the writer or the target rejects data, the Informatica Server writes the rejected row into
the reject file.
The reject file and session log contain information that helps you determine the cause of the
reject.
You can correct reject files and load them to relational targets using the Informatica reject loader utility. The reject loader also creates another reject file for the data that the writer or target rejects during the reject loading.
Each time you run a session, the Informatica Server appends rejected data to the reject file.
Complete the following tasks to load reject data into the target:
• Locate the reject file.
• Correct bad data.
• Run the reject loader utility.

221

Reading Reject Files

Reject files contain rows of data rejected by the writer or the target database. Though the Informatica Server writes the entire row in the reject file, the problem generally centers on one column within the row. To help you determine which column caused the row to be rejected, the Informatica Server adds row and column indicators to give you more information about each column:
• Row indicator. The first column in each row of the reject file is the row indicator. The numeric indicator tells whether the row was marked for insert, update, delete, or reject.
• Column indicator. Column indicators appear after every column of data. The alphabetical character indicators tell whether the data was valid, overflow, null, or truncated.
The following sample reject file shows the row and column indicators:
3,D,1,D,,D,0,D,1094945255,D,0.00,D,-0.00,D
0,D,1,D,April,D,1997,D,1,D,-1364.22,D,-1364.22,D
0,D,1,D,April,D,2000,D,1,D,2560974.96,D,2560974.96,D
3,D,1,D,April,D,2000,D,0,D,0.00,D,0.00,D
0,D,1,D,August,D,1997,D,2,D,2283.76,D,4567.53,D
0,D,3,D,December,D,1999,D,1,D,273825.03,D,273825.03,D
0,D,1,D,September,D,1997,D,1,D,0.00,D,0.00,D

222

Reading Reject Files

Row Indicators
The first column in the reject file is the row indicator. The number listed as the row indicator tells the writer what to do with the row of data.

Row Indicator   Meaning   Rejected By
0               Insert    Writer or target
1               Update    Writer or target
2               Delete    Writer or target
3               Reject    Writer
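As a quick illustration outside the reject loader itself, the comma-delimited layout shown in the sample above lets you isolate the rows the writer marked as rejects with standard UNIX tools; the reject file name here is a placeholder:

awk -F, '$1 == 3' t_orders.bad    # print only rows whose row indicator is 3 (Reject)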
223

Reading Reject Files

Column Indicators
After the row indicator is a column indicator, followed by the first column of data, and another column indicator. Column indicators appear after every column of data and define the type of the data preceding it.

Column Indicator   Type of Data   Writer Treats As
D   Valid data.   Good data. The writer passes it to the target database. The target accepts it unless a database error occurs, such as finding a duplicate key.
O   Overflow. Numeric data exceeded the specified precision or scale for the column.   Bad data, if you configured the mapping target to reject overflow or truncated data.
N   Null. The column contains a null value.   Good data. The writer passes it to the target, which rejects it if the target database does not accept null values.
T   Truncated. String data exceeded a specified precision for the column, so the Informatica Server truncated it.   Bad data, if you configured the mapping target to reject overflow or truncated data.
224

Running Reject Loading Session

After you correct the reject file and rename it to reject_file.in, you can use the reject loader to send
those files through the writer to the target database.
Use the reject loader utility from the command line to load rejected files into target tables. The
syntax for reject loading differs on UNIX and Windows NT/2000 platforms.
Use the following syntax for UNIX:
pmrejldr pmserver.cfg [folder_name:]session_name
Use the following syntax for Windows NT/2000:
pmrejldr [folder_name:]session_name
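For example, a hypothetical UNIX invocation that reloads the corrected rejects for session s_load_orders in folder Sales (pmserver.cfg is the server configuration file named in the syntax above):

pmrejldr pmserver.cfg Sales:s_load_orders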

225

Commit Points

A commit interval is the interval at which the Informatica Server commits data to relational targets during a session. You can choose between the following types of commit interval:
• Target-based commit. The Informatica Server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.
• Source-based commit. The Informatica Server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.

226

Target Based Commit


During a target-based commit session, the Informatica Server continues to fill the writer buffer
after it reaches the commit interval. When the buffer block is filled, the Informatica Server issues a
commit command. As a result, the amount of data committed at the commit point generally
exceeds the commit interval.
For example, a session is configured with a target-based commit interval of 10,000. The writer buffers fill every 7,500 rows. When the Informatica Server reaches the commit interval of 10,000, it continues processing data until the writer buffer is filled. The second buffer fills at 15,000 rows, and the Informatica Server issues a commit to the target. If the session completes successfully, the Informatica Server issues commits after 15,000, 22,500, 30,000, and 40,000 rows.

227

Source Based Commit


During a source-based commit session, the Informatica Server commits data to the target based on the number of rows from an active source in a single pipeline. These rows are referred to as source rows. An active source can be any of the following active transformations:
• Advanced External Procedure
• Source Qualifier
• Normalizer
• Aggregator
• Joiner
• Rank
• Mapplet, if it contains one of the above transformations

Note: Although the Filter, Router, and Update Strategy transformations are active transformations,
the Informatica Server does not use them as active sources in a source-based commit session.
The Informatica Server generates a commit row from the active source at every commit interval.
When each target in the pipeline receives the commit row, the Informatica Server performs the
commit.
The number of rows held in the writer buffers does not affect the commit point for a source-based
commit session.
228

Performance Tuning
The most common performance bottleneck occurs when the Informatica Server writes to a target database. You can identify performance bottlenecks by the following methods:
• Running test sessions. You can configure a test session to read from a flat file source or to write to a flat file target to identify source and target bottlenecks.
• Studying performance details. You can create a set of information called performance details to identify session bottlenecks. Performance details provide information such as buffer input and output efficiency.
• Monitoring system performance. You can use system monitoring tools to view percent CPU usage, I/O waits, and paging to identify system bottlenecks.

229

Identifying Performance Bottleneck


Performance bottlenecks can occur in the source and target databases, the mapping,
the session, and the system. Generally, you should look for performance bottlenecks
in the following order:
1.

Target

2.

Source

3.

Mapping

4.

Session

5.

System

230

Identifying Target Bottleneck


The most common performance bottleneck occurs when the Informatica Server writes to a target
database. You can identify target bottlenecks by configuring the session to write to a flat file target.
If the session performance increases significantly when you write to a flat file, you have a target
bottleneck.
Optimizing the Target Database
If your session writes to a flat file target, you can optimize session performance by writing to a flat file target that is local to the Informatica Server. If your session writes to a relational target, consider performing the following tasks to increase performance:
• Drop indexes and key constraints.
• Increase checkpoint intervals.
• Use bulk loading.
• Use external loading.
• Turn off recovery.
• Increase database network packet size.
• Optimize Oracle target databases.

231

Identifying Source Bottleneck


Performance bottlenecks can occur when the Informatica Server reads from a source database. If
your session reads from a flat file source, you probably do not have a source bottleneck. You can
improve session performance by setting the number of bytes the Informatica Server reads per line
if you read from a flat file source.
If the session reads from a relational source, you can use the following to identify the bottleneck:
• Filter transformation
• Read test mapping
• Database query

Optimizing the Source Database
If your session reads from a relational source, review the following suggestions for improving performance:
• Optimize the query.
• Create tempdb as an in-memory database.
• Use conditional filters.
• Increase database network packet size.
• Connect to Oracle databases using the IPC protocol.

232

Identifying Mapping Bottleneck


You can identify mapping bottlenecks by using a Filter transformation in the mapping.
If you determine that you do not have a source bottleneck, you can add a Filter transformation in
the mapping before each target definition. Set the filter condition to false so that no data is loaded
into the target tables. If the time it takes to run the new session is the same as the original
session, you have a mapping bottleneck.
You can also identify mapping bottlenecks by using performance details. High errorrows and
rowsinlookupcache counters indicate a mapping bottleneck.
High Rowsinlookupcache Counters
Multiple lookups can slow down the session. You might improve session performance by locating
the largest lookup tables and tuning those lookup expressions.
High Errorrows Counters
Transformation errors impact session performance. If a session has large numbers in any of the
Transformation_errorrows counters, you might improve performance by eliminating the errors.

233

Optimizing a Mapping
Generally, you reduce the number of transformations in the mapping and delete unnecessary links
between transformations to optimize the mapping. You should configure the mapping with the least
number of transformations and expressions to do the most amount of work possible. You should
minimize the amount of data moved by deleting unnecessary links between transformations.
For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup
transformations), limit connected input/output or output ports. Limiting the number of connected
input/output or output ports reduces the amount of data the transformations store in the data cache.
You can also perform the following tasks to optimize the mapping:
• Configure single-pass reading.
• Optimize datatype conversions.
• Eliminate transformation errors.
• Optimize transformations.
• Optimize expressions.

234

Identifying Session Bottleneck


You can identify a session bottleneck by using the performance details. The Informatica Server
creates performance details when you enable Collect Performance Data on the General tab of the
session properties.
Performance details display information about each Source Qualifier, target definition, and
individual transformation. All transformations have some basic counters that indicate the number
of input rows, output rows, and error rows.
Any value other than zero in the readfromdisk and writetodisk counters for Aggregator, Joiner, or Rank transformations indicates a session bottleneck. Low BufferInput_efficiency and BufferOutput_efficiency counter values also indicate a session bottleneck.
Small cache size, low buffer memory, and small commit intervals can cause session bottlenecks.

235

Optimizing a Session
You can perform the following tasks to improve overall performance:
• Run concurrent batches.
• Partition sessions.
• Reduce error tracing.
• Remove staging areas.
• Tune session parameters.

236

Identifying System Bottleneck


You can identify system bottlenecks by using system tools to monitor CPU usage, memory usage,
and paging.
The Informatica Server uses system resources for transformation processing, session execution, and reading and writing data. The Informatica Server also uses system memory for other data such as aggregate, joiner, rank, and cached lookup tables. You can use system performance monitoring tools to monitor the amount of system resources the Informatica Server uses and identify system bottlenecks.
On Windows NT/2000, you can use system tools in the Task Manager or Administrative Tools.
On UNIX systems, you can use system tools such as vmstat and iostat to monitor system performance.

237

Identifying System Bottleneck on Windows NT / 2000


Use the Windows NT/2000 Performance Monitor to create a chart that provides the following information:
• Percent processor time. If you have several CPUs, monitor each CPU for percent processor time. If the processors are utilized at more than 80%, you may consider adding more processors.
• Pages/second. If pages/second is greater than five, you may have excessive memory pressure (thrashing). You may consider adding more physical memory.
• Physical disks percent time. This is the percent time that the physical disk is busy performing read or write requests. You may consider adding another disk device or upgrading the disk device.
• Physical disks queue length. This is the number of users waiting for access to the same disk device. If the physical disk queue length is greater than two, you may consider adding another disk device or upgrading the disk device.
• Server total bytes per second. This is the number of bytes the server has sent to and received from the network. You can use this information to improve network bandwidth.

238

Identifying System Bottleneck on UNIX


You can use UNIX tools to monitor user background processes, system swapping actions, CPU loading, and I/O load operations. When you tune UNIX systems, tune the server for a major database system. Use the following UNIX tools to identify system bottlenecks on the UNIX system, as shown in the sample commands after this list:
• lsattr -E -l sys0. Use this tool to view current system settings. This tool shows maxuproc, the maximum level of user background processes. You may consider reducing the amount of background processes on your system.
• iostat. Use this tool to monitor the loading operation for every disk attached to the database server. iostat displays the percentage of time that the disk was physically active. High disk utilization suggests that you may need to add more disks. If you use disk arrays, use the utilities provided with the disk arrays instead of iostat.
• vmstat or sar -w. Use this tool to monitor disk swapping actions. Swapping should not occur during the session. If swapping does occur, you may consider increasing your physical memory or reducing the number of memory-intensive applications on the disk.
• sar -u. Use this tool to monitor CPU loading. This tool provides percent usage on user, system, idle time, and waiting time. If the percent time spent waiting on I/O (%wio) is high, you may consider using other under-utilized disks. For example, if your source data, target data, lookup, rank, and aggregate cache files are all on the same disk, consider putting them on different disks.
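For example, to sample these counters while a session runs (the intervals and counts here are arbitrary choices, not Informatica settings):

sar -u 5 12     # CPU usage: 12 samples at 5-second intervals
sar -w 5 12     # swapping activity over the same window
iostat 5 12     # per-disk activity, 12 samples at 5-second intervals
vmstat 5        # memory and paging, refreshed every 5 seconds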

239
