
Informatica PowerCenter

PowerCenter provides an environment that allows you to load data into a
centralized location, such as a datamart, data warehouse, or operational data store
(ODS). You can extract data from multiple sources, transform the data according to
business logic you build in the client application, and load the transformed data into
file and relational targets. PowerCenter provides the following integrated
components:

PowerCenter repository. The PowerCenter repository is at the center of
the PowerCenter suite. You create a set of metadata tables within the
repository database that the PowerCenter applications and tools access. The
PowerCenter Client and Server access the repository to save and retrieve
metadata.

PowerCenter Repository Server. The PowerCenter Repository Server
manages connections to the repository from client applications. It inserts,
updates, and fetches objects from the repository database tables. It also
maintains object consistency.

PowerCenter Client. Use the PowerCenter Client to manage users, define
sources and targets, build mappings and mapplets with the transformation
logic, and create workflows to run the mapping logic. The PowerCenter Client
has the following client applications: Repository Manager, Repository Server
Administration Console, Designer, Workflow Manager, and Workflow Monitor.

PowerCenter Server. The PowerCenter Server extracts the source data,
performs the data transformation, and loads the transformed data into the
targets.

Sources
PowerCenter accesses the following sources:

Relational. Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and
Teradata.

File. Fixed and delimited flat file, COBOL file, and XML.

Application. You can purchase additional PowerConnect products to access
business sources, such as PeopleSoft, SAP R/3, Siebel, IBM MQSeries, and
TIBCO.

Mainframe. You can purchase PowerConnect for Mainframe for faster access
to IBM DB2 on MVS.

Other. Microsoft Excel and Access.

Note: The Designer imports relational sources, such as Microsoft Excel, Microsoft
Access, and Teradata using ODBC and native drivers.
For more information about sources, see Working with Sources in the Designer
Guide.
Targets
PowerCenter can load data into the following targets:

Relational. Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL
Server, and Teradata.

File. Fixed and delimited flat file and XML.

Application. You can purchase additional PowerConnect products to load
data into SAP BW. You can also load data into IBM MQSeries message queues
and TIBCO.

Other. Microsoft Access.

You can load data into targets using ODBC or native drivers, FTP, or external
loaders.
For more information about targets, see Working with Targets in the Designer
Guide.
PowerCenter Repository
The PowerCenter repository resides on a relational database. The repository
database tables contain the instructions required to extract, transform, and load
data. PowerCenter Client applications access the repository database tables through
the Repository Server.
You add metadata to the repository tables when you perform tasks in the
PowerCenter Client application, such as creating users, analyzing sources,
developing mappings or mapplets, or creating workflows. The PowerCenter Server
reads metadata created in the Client application when you run a workflow. The
PowerCenter Server also creates metadata, such as start and finish times of a
session or session status.
You can develop global and local repositories to share metadata:

Global repository. The global repository is the hub of the domain. Use the
global repository to store common objects that multiple developers can use
through shortcuts. These objects may include operational or Application
source definitions, reusable transformations, mapplets, and mappings.

Local repositories. A local repository is within a domain that is not the
global repository. Use local repositories for development. From a local
repository, you can create shortcuts to objects in shared folders in the global
repository. These objects typically include source definitions, common
dimensions and lookups, and enterprise standard transformations. You can
also create copies of objects in non-shared folders.

Version control. A versioned repository can store multiple copies, or
versions, of an object. Each version is a separate object with unique
properties. PowerCenter version control features allow you to efficiently
develop, test, and deploy metadata into production.

You can connect to a repository, back up, delete, or restore repositories using
pmrep, a command line program. For more information on pmrep, see Using
pmrep.
Repository Server
The Repository Server manages repository connection requests from
client applications. For each repository database registered with the Repository
Server, it configures and manages a Repository Agent process. The Repository
Server also monitors the status of running Repository Agents, and sends repository
object notification messages to client applications.
The Repository Agent is a separate, multi-threaded process that
retrieves, inserts, and updates metadata in the repository database tables. The
Repository Agent ensures the consistency of metadata in the repository by
employing object locking.
PowerCenter Client
The PowerCenter Client consists of the following applications that you use to
manage the repository, design mappings and mapplets, and create sessions to load
the data:

Repository Server Administration Console. Use the Repository Server
Administration Console to administer the Repository Servers and repositories.

Repository Manager. Use the Repository Manager to administer the
metadata repository. You can create repository users and groups, assign
privileges and permissions, and manage folders and locks.

Designer. Use the Designer to create mappings that contain transformation
instructions for the PowerCenter Server. Before you can create mappings, you
must add source and target definitions to the repository. The Designer has
five tools that you use to analyze sources, design target schemas, and build
source-to-target mappings:

    Source Analyzer. Import or create source definitions.

    Warehouse Designer. Import or create target definitions.

    Transformation Developer. Develop reusable transformations to use
    in mappings.

    Mapplet Designer. Create sets of transformations to use in mappings.

    Mapping Designer. Create mappings that the PowerCenter Server
    uses to extract, transform, and load data.

Workflow Manager. Use the Workflow Manager to create, schedule, and run
workflows. A workflow is a set of instructions that describes how and when to
run tasks related to extracting, transforming, and loading data. The
PowerCenter Server runs workflow tasks according to the links connecting the
tasks. You can run a task by placing it in a workflow.

Workflow Monitor. Use the Workflow Monitor to monitor scheduled and
running workflows for each PowerCenter Server. You can choose a Gantt
Chart or Task view. You can also access details about those workflow runs.

Install the client tools on a Microsoft Windows machine. For more information about
installation requirements, see Minimum System Requirements.
PowerCenter Server
The PowerCenter Server reads mapping and session information from the
repository. It extracts data from the mapping sources and stores the data in memory
while it applies the transformation rules that you configure in the mapping. The
PowerCenter Server loads the transformed data into the mapping targets.
The PowerCenter Server can achieve high performance using symmetric
multiprocessing systems. It can start and run multiple workflows
concurrently. It can also concurrently process partitions within a single session.
When you create multiple partitions within a session, the PowerCenter Server
creates multiple database connections to a single source and extracts a separate
range of data for each connection, according to the properties you configure.
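The partitioning behavior described above can be pictured with a minimal Python sketch. It is not PowerCenter code: the function names, the integer key `id`, and the even split of the key range are assumptions chosen for illustration. In PowerCenter the ranges come from the session properties you configure.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_ranges(low, high, partitions):
    """Split the half-open key range [low, high) into contiguous sub-ranges."""
    size, extra = divmod(high - low, partitions)
    ranges, start = [], low
    for i in range(partitions):
        # The first `extra` partitions take one additional key each.
        end = start + size + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def extract(rows, key_range):
    """Stand-in for one connection that reads only its own range of keys."""
    lo, hi = key_range
    return [row for row in rows if lo <= row["id"] < hi]


def extract_partitioned(rows, low, high, partitions):
    """Run one extract per partition concurrently and collect the chunks."""
    ranges = partition_ranges(low, high, partitions)
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        return list(pool.map(lambda r: extract(rows, r), ranges))
```

Each chunk plays the role of the data read over one of the database connections. The ranges do not overlap, so every source row is extracted once.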
Database Connections
The Repository Server maintains a pool of reusable database connections for
serving client applications. The server generates a Repository Agent process for
each database. The Repository Agent creates new database connections only if all
the current connections are in use.
For example, if 10 clients send requests to the Repository Agent one at a time, the
agent requires only one connection. It reuses the same database connection for all
the requests. If the 10 clients send requests simultaneously, the Repository Agent
opens 10 connections. You can set the maximum number of open connections using
the DatabasePoolSize parameter in the repository configuration file.
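The example with 10 clients can be reproduced with a toy pool in Python. This is only a sketch of the reuse rule, not the Repository Agent implementation. The class and function names are hypothetical, `max_size` stands in for DatabasePoolSize, and raising an error when the limit is reached is an assumption, because the text does not say what the agent does in that case.

```python
class ConnectionPool:
    """Toy pool: reuse idle connections, open new ones only when all are busy."""

    def __init__(self, max_size):
        self.max_size = max_size  # plays the role of DatabasePoolSize
        self.idle = []
        self.opened = 0

    def acquire(self):
        if self.idle:
            return self.idle.pop()  # reuse a connection that is not in use
        if self.opened >= self.max_size:
            # Assumption: the real behavior at the limit is not documented here.
            raise RuntimeError("connection pool exhausted")
        self.opened += 1
        return f"conn-{self.opened}"

    def release(self, conn):
        self.idle.append(conn)


def serve_sequential(pool, requests):
    """Each request finishes before the next one starts."""
    for _ in range(requests):
        conn = pool.acquire()
        pool.release(conn)


def serve_simultaneous(pool, requests):
    """All requests hold a connection at the same time."""
    held = [pool.acquire() for _ in range(requests)]
    for conn in held:
        pool.release(conn)
```

With a limit of 20, `serve_sequential(pool, 10)` leaves `pool.opened` at 1, while `serve_simultaneous(pool, 10)` on a fresh pool leaves it at 10.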
For a session, a reader object holds the connection for as long as it needs to read
the data from the source tables. A writer object holds a connection for as long as it
needs to write data to the target tables.
The PowerCenter Server maintains a database connection pool for stored procedure
or lookup databases in a workflow. By default, the PowerCenter Server allows an
unlimited number of connections to lookup or stored procedure databases. To limit
the connections, you can optionally set the MaxLookupSPDBConnections parameter
when you configure the PowerCenter service. If a database user does not
have permission for the number of connections a session requires, the session fails.
For pre-session, post-session, and load stored procedures, consecutive stored
procedures reuse a connection if they have identical connection attributes.
Otherwise, the connection for one stored procedure closes and a new connection
begins for the next stored procedure.
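The reuse rule for consecutive stored procedures can be expressed as a short Python sketch. The function name and the representation of connection attributes as tuples are assumptions made for illustration.

```python
def count_connections(procedures):
    """Count the connections opened for a sequence of stored procedures.

    Each procedure is represented by a tuple of its connection attributes.
    A connection is reused only when the next procedure has identical
    attributes. Otherwise it is closed and a new connection is opened.
    """
    opened = 0
    current = None
    for attrs in procedures:
        if attrs != current:
            opened += 1
            current = attrs
    return opened


# Hypothetical connection attributes: (database connection, user name).
warehouse = ("ora_dw", "etl_user")
staging = ("ora_stage", "etl_user")
```

For the sequence `[warehouse, warehouse, staging, warehouse]` the sketch counts three connections: the first two procedures share one, and each later change of attributes opens a new one.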
PowerCenter Metadata Reporter
PowerCenter provides PowerCenter Metadata Reporter, a web-based application
that allows you to run reports against PowerCenter repository metadata. It gives you
insight into your repository, which enhances your ability to analyze and manage
your repository efficiently.
The Metadata Reporter provides a number of reports, including reports on
transformations, mapplets, mappings, sources, targets, sessions, worklets, and
workflows.
Using the Repository Server Administration Console
Use the Repository Server Administration Console to administer your Repository
Servers and repositories. A Repository Server can manage multiple repositories. You
use the Repository Server Administration Console to create and administer the
repository through the Repository Server.
You can use the Administration Console to perform the following tasks:

Add, edit, and remove repository configurations.

Export and import repository configurations.

Create a repository.

Promote a local repository to a global repository.

Copy a repository.

Delete a repository from the database.

Back up and restore a repository.

Start, stop, enable, and disable repositories.

Send repository notification messages.

Register and unregister a repository.

Propagate domain connection information for a repository.

View repository connections and locks.

Close repository connections.

Register and remove repository plug-ins.

Upgrade a repository.

Repository Objects
You create repository objects using the Repository Manager, Designer, and Workflow
Manager client tools. You can view the following objects in the Navigator window of
the Repository Manager:

Source definitions. Definitions of database objects (tables, views,
synonyms) or files that provide source data.

Target definitions. Definitions of database objects or files that contain the
target data.

Multi-dimensional metadata. Target definitions that are configured as
cubes and dimensions.

Mappings. A set of source and target definitions along with transformations
containing business logic that you build into the transformation. These are
the instructions that the PowerCenter Server uses to transform and move
data.

Reusable transformations. Transformations that you can use in multiple
mappings.

Mapplets. A set of transformations that you can use in multiple mappings.

Sessions and workflows. Sessions and workflows store information about
how and when the PowerCenter Server moves data. A workflow is a set of
instructions that describes how and when to run tasks related to extracting,
transforming, and loading data. A session is a type of task that you can put in
a workflow. Each session corresponds to a single mapping.

The Design Process
The goal of the design process is to create mappings that depict the flow of data
between sources and targets, including changes made to the data before it reaches
the targets. However, before you can create a mapping, you must first create or
import source and target definitions. You might also want to create reusable objects,
such as reusable transformations or mapplets. For a list of objects you create in the
design process, see Repository Objects.
Perform the following design tasks in the Designer:
1. Import source definitions. Use the Source Analyzer to connect to the
sources and import the source definitions.
2. Create or import target definitions. Use the Warehouse Designer to
define relational, flat file, or XML targets to receive data from sources. You
can import target definitions from a relational database or a flat file, or you
can manually create a target definition.
3. Create the target tables. If you add a target definition to the repository
that does not exist in a relational database, you need to create target tables
in your target database. You do this by generating and executing the
necessary SQL code within the Warehouse Designer.
4. Design mappings. Once you have source and target definitions in the
repository, you can create mappings in the Mapping Designer. A mapping is a
set of source and target definitions linked by transformation objects that
define the rules for data transformation. A transformation is an object that
performs a specific function in a mapping, such as looking up data or
performing aggregation.
5. Create mapping objects. Optionally, you can create reusable objects for
use in multiple mappings. Use the Transformation Developer to create
reusable transformations. Use the Mapplet Designer to create mapplets. A
mapplet is a set of transformations that may contain sources and
transformations.
6. Debug mappings. Use the Mapping Designer to debug a valid mapping to
gain troubleshooting information about data and error conditions.
7. Import and export repository objects. You can import and export
repository objects, such as sources, targets, transformations, mapplets, and
mappings to archive or share metadata.
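The idea of a mapping as source rows passed through a chain of transformations can be sketched in Python. This is a conceptual illustration only: the function names, the sample data, and the use of plain dictionaries for rows are assumptions, and the Designer builds mappings graphically rather than in code.

```python
from collections import defaultdict


def lookup(rows, table, key, field):
    """Lookup transformation: add `field` to each row from a reference table."""
    return [{**row, field: table.get(row[key])} for row in rows]


def aggregate(rows, group_by, value):
    """Aggregator transformation: sum `value` for each `group_by` key."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_by]] += row[value]
    return [{group_by: k, value: v} for k, v in totals.items()]


def run_mapping(source_rows, transformations):
    """Pass the source rows through each transformation in order."""
    rows = source_rows
    for transform in transformations:
        rows = transform(rows)
    return rows  # rows ready to load into the target


# Hypothetical source data and lookup table.
orders = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 2, "amount": 5},
    {"customer_id": 1, "amount": 7},
]
regions = {1: "EMEA", 2: "APAC"}

target_rows = run_mapping(
    orders,
    [
        lambda rows: lookup(rows, regions, "customer_id", "region"),
        lambda rows: aggregate(rows, "region", "amount"),
    ],
)
```

Here `target_rows` holds one total per region. Because each transformation is a separate function, the same `lookup` or `aggregate` can be reused in another mapping, which is the role reusable transformations and mapplets play in the Designer.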
Workflow Manager
The Workflow Manager consists of three tools to help you develop a workflow:

Task Developer. Create tasks you want to accomplish in the workflow in the
Task Developer.

Workflow Designer. Create a workflow by connecting tasks with links in the
Workflow Designer. You can also create tasks in the Workflow Designer as you
develop the workflow.

Worklet Designer. Create a worklet in the Worklet Designer. A worklet is an
object that groups a set of tasks. A worklet is similar to a workflow, but
without scheduling information. You can nest multiple worklets inside a
workflow.

Before you create a workflow, you must configure the following connection
information:

PowerCenter Server connection. Register the PowerCenter Server with
the repository before you can start it or create a session to run against it.

Database connections. Create connections to source and target systems.

Other connections. If you want to use external loaders or FTP, you
configure these connections in the Workflow Manager.
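Because a workflow connects tasks with links and the PowerCenter Server runs tasks according to those links, a workflow can be viewed as a directed graph. The following Python sketch derives one valid run order from a list of links. The task names and the `run_order` helper are hypothetical, and a topological sort is used here as a stand-in for the server's scheduling, which this document does not describe in detail.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_order(links):
    """Return an order in which tasks can run, given (from_task, to_task) links."""
    sorter = TopologicalSorter()
    for upstream, downstream in links:
        sorter.add(downstream, upstream)  # downstream runs after upstream
    return list(sorter.static_order())


# Hypothetical workflow: two load sessions feed one summary session.
links = [
    ("Start", "s_load_customers"),
    ("Start", "s_load_orders"),
    ("s_load_customers", "s_build_summary"),
    ("s_load_orders", "s_build_summary"),
]
order = run_order(links)
```

In `order`, `Start` comes first and `s_build_summary` comes last. The two load sessions have no link between them, so either may run first, or both may run at the same time.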

Workflow Monitor
After you create a workflow, you run the workflow in the Workflow Manager and
monitor it in the Workflow Monitor. The Workflow Monitor is a tool that displays
details about workflow runs in two views, Gantt Chart view and Task view. You can
monitor workflows in online and offline modes.
The Workflow Monitor consists of the following windows:

Navigator window. Displays monitored repositories, servers, and
repository objects.

Output window. Displays messages from the PowerCenter Server.

Time window. Displays progress of workflow runs.

Gantt Chart view. Displays details about workflow runs in chronological
format.

Task view. Displays details about workflow runs in a report format.

Getting Started
Before you can begin using PowerCenter, you must create the environment and
perform the following administration tasks to allow access to the repository and the
PowerCenter Server:
1. Configure the sources. If you extract data from relational sources, ask the
database administrator to create user profiles with read access. These user
profiles allow you to import source definitions into the repository and access
the sources at runtime.
   If you extract data from file sources, the files must be accessible to the
   PowerCenter Server and Client machines.
2. Configure the targets. Ask the database administrator to create user
profiles with read and write access. These user profiles allow you to import
target definitions into the repository and write to the targets at runtime.
   If the target database does not exist, create it using the database
   administration tools included with your RDBMS. After you create the target
   database, you can use the Designer to design and create target tables.
   For flat file targets, you need a target directory large enough to process the
   resulting files.
3. Choose globalization settings and data movement modes. The data
movement mode you use depends on whether you want the PowerCenter
Server to process single-byte data or multibyte character data. You select
code pages for the repository, PowerCenter Client and PowerCenter Server.
4. Create repository database. Create a database for the repository. Users
accessing the repository database need full rights in that database. If you
upgrade the repository to a new version, you need database rights to drop or
modify the repository tables.
5. Install the PowerCenter Client. Install the client software on a machine
that accesses the sources, targets, and repository databases, as well as the
PowerCenter Server.
6. Install and configure the Repository Server. Install and configure the
Repository Server on a machine that accesses the repository database, the
PowerCenter Client, and the PowerCenter Server.
7. Install and configure the PowerCenter Server. Install the PowerCenter
Server on a Windows or UNIX system that accesses the sources, targets, and
the repository database.
8. Configure connectivity. Configure network, native, and ODBC connectivity.
Create ODBC data sources to connect the PowerCenter Clients to the
sources and targets. You must also have network connections between all
databases and PowerCenter Servers.
9. Create the repository. After you configure connectivity between source,
target, and repository databases, you can create the metadata repository.
Connect to the Repository Server from within the Repository Server
Administration Console to create the metadata repository. The Repository
Server connects to the repository database and runs the SQL to create the
repository tables. All the objects you create with PowerCenter are stored as
metadata in the repository.
10. Create repository users and groups. Create groups and user profiles,
then assign privileges and permissions that determine tasks that users can
perform.
11. Register the PowerCenter Server. Before you can start the PowerCenter
Server, you must register the PowerCenter Server so the Workflow Manager
can direct the PowerCenter Server to the repository.

Changing Data Movement Modes
You can change the PowerCenter Server data movement mode in the
PowerCenter Server configuration parameters. After you change the data
movement mode, the PowerCenter Server runs in the new data movement
mode the next time you start the PowerCenter Server. When the data
movement mode changes, the PowerCenter Server handles character data
differently. To avoid creating data inconsistencies in your target tables, the
PowerCenter Server performs additional checks for sessions that reuse
session caches and files.
Table 2-1 describes how the PowerCenter Server handles session files and
caches after you change the data movement mode:

Table 2-1. Session and File Cache Handling After Data Movement Mode Change

| Session File or Cache | Time of Creation or Use | PowerCenter Server Behavior After Data Movement Mode Change |
|---|---|---|
| Session Log File (*.log) | Each session. | No change in behavior. Creates a new session log for each session using the PowerCenter Server code page. |
| Workflow Log | Each workflow. | No change in behavior. Creates a new workflow log file for each workflow using the PowerCenter Server code page. |
| Reject File (*.bad) | Each session. | No change in behavior. Appends rejected data to the existing reject file using the PowerCenter Server code page. |
| Output File (*.out) | Sessions writing to flat file. | No change in behavior for delimited flat files. Creates a new output file for each session using the target code page. |
| Indicator File (*.in) | Sessions writing to flat file. | No change in behavior. Creates a new indicator file for each session. |
| Incremental Aggregation Files (*.idx, *.dat) | Sessions with Incremental Aggregation enabled. | When files are removed or deleted, the PowerCenter Server creates new files. When files are not removed or deleted, the PowerCenter Server fails the session with the following error message: TE_7038 Aggregate Error: ServerMode: [server data movement mode] and CachedMode: [data movement mode that created the files] mismatch. You should also remove or delete files created using a different code page. |
| Unnamed Persistent Lookup Files (*.idx, *.dat) | Sessions with a Lookup transformation configured for an unnamed persistent lookup cache. | Rebuilds the persistent lookup cache. |
| Named Persistent Lookup Files (*.idx, *.dat) | Sessions with a Lookup transformation configured for a named persistent lookup cache. | Fails the session. |
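The incremental aggregation row of Table 2-1 amounts to a small decision rule, sketched below in Python. The function name, the `files_exist` flag, and the return values are hypothetical. Returning "reuse" when the modes match is an assumption, since the table only describes what happens after the mode changes. The mode names ASCII and Unicode are used as example values.

```python
def check_aggregation_cache(server_mode, cached_mode, files_exist):
    """Mimic the incremental aggregation row of Table 2-1.

    server_mode: current data movement mode of the server.
    cached_mode: data movement mode that created the cache files.
    files_exist: False if the *.idx and *.dat files were removed or deleted.
    """
    if not files_exist:
        return "create"  # files were removed, so new files are created
    if server_mode != cached_mode:
        # Existing files from the other mode cause the session to fail.
        raise RuntimeError(
            "TE_7038 Aggregate Error: ServerMode: [%s] and CachedMode: [%s] "
            "mismatch." % (server_mode, cached_mode)
        )
    return "reuse"  # assumption: matching modes keep using the existing files
```

Calling `check_aggregation_cache("Unicode", "ASCII", files_exist=True)` raises the TE_7038 error, which is why the table recommends removing the old files after a mode change.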

Code Page Overview
A code page contains the encoding to specify characters in a set of one or
more languages. An encoding is the assignment of a number to a character in the
character set. You use code pages to identify data that might be in different
languages. For example, if you are importing Japanese data into a mapping, you
must select a Japanese code page for the source data.
When you choose a code page, the program or application for which you
set the code page refers to a specific set of data that describes the characters the
application recognizes. This influences the way that application stores, receives, and
sends character data.
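The effect of choosing a code page can be seen with Python's built-in codecs. This sketch only illustrates the general idea of an encoding. The codec names `shift_jis`, `utf-8`, and `latin-1` are Python's names, not PowerCenter code page names, and the sample text is an assumption.

```python
text = "データ"  # Japanese for "data"

# An encoding assigns numbers (byte values) to characters.
shift_jis_bytes = text.encode("shift_jis")
utf8_bytes = text.encode("utf-8")

# Reading the bytes back with the same code page restores the text.
round_trip = shift_jis_bytes.decode("shift_jis")

# A Western European code page has no numbers assigned to these characters.
try:
    text.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False
```

The same three characters occupy different byte sequences in the two Japanese-capable encodings and cannot be stored at all in the Western European one, which is why the source code page must match the language of the data.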

Table 2-2. Code Page Compatibility

| Component Code Page | Code Page Compatibility |
|---|---|
| Source (including relational, flat file, and XML file) | Subset of target. Subset of PowerCenter Server. |
| Target (including relational, XML files, and flat files) | Superset of source. Superset of PowerCenter Server. PowerCenter Server creates external loader data and control files using the target flat file code page. |
| Lookup and Stored Procedures | Compatible with PowerCenter Server and repository. |
| PowerCenter Server | Superset of source. Subset of target. Identical to PowerCenter Server operating system and machine hosting pmcmd. Compatible with repository and PowerCenter Client. Compatible with database connection code page used by Lookup and Stored Procedure transformations. |
| Repository Server | Compatible with repository. Compatible with PowerCenter Client and PowerCenter Server. |
| Global Repository | Compatible with local repository. Can also be a subset of local repository. Compatible with PowerCenter Client and Server. |
| Local Repository | Compatible with global repository. Can also be a superset of global repository. Compatible with PowerCenter Client and Server. |
| Standalone Repository | Compatible with PowerCenter Client and Server. |
| PowerCenter Client | Compatible with PowerCenter Server and repository. |
| Machine hosting pmcmd | Identical to PowerCenter Server. |
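The subset and superset relationships in Table 2-2 can be illustrated with a simple check in Python. This is a rough sketch, not Informatica's compatibility rule: the `is_subset` helper is hypothetical, it uses Python codec names, and it only examines single-byte values, so it is meaningful for single-byte code pages only.

```python
def is_subset(source_cp, target_cp):
    """Return True if every character of the single-byte code page
    `source_cp` can also be encoded in `target_cp`."""
    for value in range(256):
        try:
            char = bytes([value]).decode(source_cp)
        except UnicodeDecodeError:
            continue  # this byte value is unassigned in the source code page
        try:
            char.encode(target_cp)
        except UnicodeEncodeError:
            return False  # the target has no number for this character
    return True
```

For example, `is_subset("ascii", "latin-1")` is true, so data could move from an ASCII source to a Latin-1 target without loss, while `is_subset("latin-1", "ascii")` is false because accented characters have no ASCII equivalent.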

PowerCenter Server Variable Directories
The installation program creates the following directories under the installation
directory to store session files and caches associated with each PowerCenter Server:

BadFiles

Cache

ExtProc

LkpFiles

SessLogs

SrcFiles

Temp

TgtFiles

WorkflowLogs

All workflows use these directories by default.


Server Variables
You can define server variables for each PowerCenter Server you register. Server
variables define the path and directories for session and workflow output files and
caches. You can also use server variables to define workflow properties, such as the
number of workflow logs to archive.
The installation process creates default directories in the location where you install
the PowerCenter Server. By default, the PowerCenter Server writes output files in
these directories when you run a workflow. To use these directories as the default
location for the session and workflow output files, you must configure the server
variable $PMRootDir to define the path to the directories.
Sessions and workflows are configured to use server directories by default. You can
override the default by entering different directories in the session or workflow
properties.
For example, you might have a PowerCenter Server running all workflows in a
repository. If you define the server variable for workflow logs directory as
c:\pmserver\workflowlog, the PowerCenter Server saves the workflow log for each
workflow in c:\pmserver\workflowlog by default.
If you change the default server directories, make sure the designated directories
exist before running a workflow. If the PowerCenter Server cannot resolve a
directory during the workflow, it cannot run the workflow.
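A pre-flight check along these lines can catch a missing directory before a workflow runs. The sketch below is a generic Python helper, not a PowerCenter utility, and the function name is hypothetical.

```python
import os


def missing_directories(directories):
    """Return the directories that do not exist on this machine."""
    return [d for d in directories if not os.path.isdir(d)]
```

For instance, `missing_directories([r"c:\pmserver\workflowlog"])` returns an empty list when the workflow log directory from the example above exists, and a list containing that path when it does not.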
By using server variables instead of hard-coding directories and parameters, you
simplify the process of changing the PowerCenter Server that runs a workflow. If
each workflow in a development folder uses server variables, then when you copy
the folder to a production repository, the production server can run the workflow as
configured. When the production server runs the workflow, it uses the directories
configured for its server variables. If, instead, you change the workflows to use
hard-coded directories, the workflows fail if those directories do not exist on the
production server.
Table 11-1 lists the server variables you configure when you register a PowerCenter
Server:

Table 11-1. Server Variables

| Server Variable | Required/Optional | Description |
|---|---|---|
| $PMRootDir | Required | A root directory to be used by any or all other server variables. Informatica recommends you use the PowerCenter Server installation directory as the root directory. |
| $PMSessionLogDir | Required | Default directory for session logs. Defaults to $PMRootDir/SessLogs. |
| $PMBadFileDir | Required | Default directory for reject files. Defaults to $PMRootDir/BadFiles. |
| $PMCacheDir | Required | Default directory for the lookup cache, index and data caches, and index and data files. To avoid performance problems, always use a drive local to the PowerCenter Server for the cache directory. Do not use a mapped or mounted drive for cache files. Defaults to $PMRootDir/Cache. |
| $PMTargetFileDir | Required | Default directory for target files. Defaults to $PMRootDir/TgtFiles. |
| $PMSourceFileDir | Required | Default directory for source files. Defaults to $PMRootDir/SrcFiles. |
| $PMExtProcDir | Required | Default directory for external procedures. Defaults to $PMRootDir/ExtProc. |
| $PMTempDir | Required | Default directory for temporary files. Defaults to $PMRootDir/Temp. |
| $PMSuccessEmailUser | Optional | Email address to receive post-session email when the session completes successfully. Use to address post-session email. |
| $PMFailureEmailUser | Optional | Email address to receive post-session email when the session fails. Use to address post-session email. Default is an empty string. For details, see Sending Emails in the Workflow Administration Guide. |
| $PMSessionLogCount | Optional | Number of session logs the PowerCenter Server archives for the session. Defaults to 0. Use to archive session logs. For details, see Log Files in the Workflow Administration Guide. |
| $PMSessionErrorThreshold | Optional | Number of non-fatal errors the PowerCenter Server allows before failing the session. Non-fatal errors include reader, writer, and DTM errors. If you want to stop the session on errors, enter the number of non-fatal errors you want to allow before stopping the session. The PowerCenter Server maintains an independent error count for each source, target, and transformation. Use to configure the Stop On option in the session properties. Defaults to 0. If you use the default setting, non-fatal errors do not cause the session to stop. |
| $PMWorkflowLogDir | Required | Default directory for workflow logs. Defaults to $PMRootDir/WorkflowLogs. |
| $PMWorkflowLogCount | Optional | Number of workflow logs the PowerCenter Server archives for the workflow. Use to archive workflow logs. Defaults to 0. |
| $PMLookupFileDir | Optional | Default directory for lookup files. Defaults to $PMRootDir/LkpFiles. |
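How the directory variables in Table 11-1 depend on $PMRootDir can be shown with a small Python sketch. The defaults are copied from the table. The `resolve` function, the plain string substitution, and the example root directory `/opt/pmserver` are assumptions for illustration and do not describe how the PowerCenter Server expands variables internally.

```python
# Default values of the directory variables, as listed in Table 11-1.
DEFAULTS = {
    "$PMSessionLogDir": "$PMRootDir/SessLogs",
    "$PMBadFileDir": "$PMRootDir/BadFiles",
    "$PMCacheDir": "$PMRootDir/Cache",
    "$PMTargetFileDir": "$PMRootDir/TgtFiles",
    "$PMSourceFileDir": "$PMRootDir/SrcFiles",
    "$PMExtProcDir": "$PMRootDir/ExtProc",
    "$PMTempDir": "$PMRootDir/Temp",
    "$PMWorkflowLogDir": "$PMRootDir/WorkflowLogs",
    "$PMLookupFileDir": "$PMRootDir/LkpFiles",
}


def resolve(root_dir, overrides=None):
    """Expand $PMRootDir in every directory variable, applying overrides first."""
    values = {**DEFAULTS, **(overrides or {})}
    return {
        name: value.replace("$PMRootDir", root_dir)
        for name, value in values.items()
    }


# Hypothetical root directory on a UNIX server.
resolved = resolve("/opt/pmserver")
```

With this root, `resolved["$PMSessionLogDir"]` is `/opt/pmserver/SessLogs`. Moving a workflow to another server then only requires a different root directory, which is the portability benefit that server variables provide over hard-coded paths.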
