Sources
PowerCenter accesses the following sources:
Relational. Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and
Teradata.
File. Fixed and delimited flat file, COBOL file, and XML.
Mainframe. You can purchase PowerConnect for Mainframe for faster access
to IBM DB2 on MVS.
Note: The Designer imports relational sources, such as Microsoft Excel, Microsoft
Access, and Teradata, using ODBC and native drivers.
For more information about sources, see Working with Sources in the Designer
Guide.
Targets
PowerCenter can load data into the following targets:
Relational. Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL
Server, and Teradata.
You can load data into targets using ODBC or native drivers, FTP, or external
loaders.
For more information about targets, see Working with Targets in the Designer
Guide.
PowerCenter Repository
The PowerCenter repository resides on a relational database. The repository
database tables contain the instructions required to extract, transform, and load
data. PowerCenter Client applications access the repository database tables through
the Repository Server.
You add metadata to the repository tables when you perform tasks in the
PowerCenter Client application, such as creating users, analyzing sources,
developing mappings or mapplets, or creating workflows. The PowerCenter Server
reads metadata created in the Client application when you run a workflow. The
PowerCenter Server also creates metadata, such as start and finish times of a
session or session status.
You can develop global and local repositories to share metadata:
Global repository. The global repository is the hub of the domain. Use the
global repository to store common objects that multiple developers can use
through shortcuts. These objects may include operational or Application
source definitions, reusable transformations, mapplets, and mappings.
You can connect to a repository, back up, delete, or restore repositories using
pmrep, a command line program. For more information on pmrep, see Using
pmrep.
Repository Server
The Repository Server manages repository connection requests from
client applications. For each repository database registered with the Repository
Server, it configures and manages a Repository Agent process. The Repository
Server also monitors the status of running Repository Agents, and sends repository
object notification messages to client applications.
The Repository Agent is a separate, multi-threaded process that
retrieves, inserts, and updates metadata in the repository database tables. The
Repository Agent ensures the consistency of metadata in the repository by
employing object locking.
PowerCenter Client
The PowerCenter Client consists of the following applications that you use to
manage the repository, design mappings and mapplets, and create sessions to load
the data:
Workflow Manager. Use the Workflow Manager to create, schedule, and run
workflows. A workflow is a set of instructions that describes how and when to
run tasks related to extracting, transforming, and loading data. The
PowerCenter Server runs workflow tasks according to the links connecting the
tasks. You can run a task by placing it in a workflow.
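As a rough illustration of running tasks "according to the links connecting the tasks", the sketch below treats a workflow as a dependency graph and derives one valid run order. It is not how the PowerCenter Server is implemented; the task names and the `run_order` helper are hypothetical, and the ordering uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def run_order(links):
    """Return one valid run order for the tasks in a workflow.

    `links` maps each task name to the set of tasks that must finish
    before it starts (hypothetical representation of workflow links).
    """
    return list(TopologicalSorter(links).static_order())


# Hypothetical workflow: Start -> s_load_orders -> e_notify
links = {
    "s_load_orders": {"Start"},
    "e_notify": {"s_load_orders"},
}
print(run_order(links))
```

A task with no incoming link, such as the hypothetical Start task here, runs first; every other task waits for the tasks linked into it.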
Install the client tools on a Microsoft Windows machine. For more information about
installation requirements, see Minimum System Requirements.
PowerCenter Server
The PowerCenter Server reads mapping and session information from the
repository. It extracts data from the mapping sources and stores the data in memory
while it applies the transformation rules that you configure in the mapping. The
PowerCenter Server loads the transformed data into the mapping targets.
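The extract, transform, and load sequence described above can be pictured with a small sketch. This is only an illustration under simplifying assumptions: rows are plain dictionaries, the target is an in-memory list, and `run_mapping`, `filter_active`, and `uppercase_name` are hypothetical names that do not correspond to PowerCenter objects.

```python
def run_mapping(source_rows, transformations, target):
    """Extract rows, apply each transformation in order, then load the result."""
    rows = list(source_rows)            # extract: hold the source data in memory
    for transform in transformations:   # transform: apply the configured rules
        rows = transform(rows)
    target.extend(rows)                 # load: write the transformed rows
    return target


def filter_active(rows):
    """Keep only rows flagged as active (hypothetical rule)."""
    return [row for row in rows if row["active"]]


def uppercase_name(rows):
    """Return copies of the rows with the name column in upper case."""
    return [{**row, "name": row["name"].upper()} for row in rows]


source = [
    {"name": "ann", "active": True},
    {"name": "bob", "active": False},
]
print(run_mapping(source, [filter_active, uppercase_name], []))
```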
The PowerCenter Server can achieve high performance using symmetric
multiprocessing systems. The PowerCenter Server can start and run multiple workflows
concurrently. It can also concurrently process partitions within a single session.
When you create multiple partitions within a session, the PowerCenter Server
creates multiple database connections to a single source and extracts a separate
range of data for each connection, according to the properties you configure.
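To make the idea of "a separate range of data for each connection" concrete, here is a minimal sketch that splits a key range into contiguous sub-ranges and reads them concurrently. The even split is an assumption made for illustration; in PowerCenter the ranges follow the properties you configure. The function names are hypothetical, and `extract` merely stands in for a query issued on its own connection.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_ranges(low, high, partitions):
    """Split the inclusive key range [low, high] into contiguous sub-ranges."""
    total = high - low + 1
    base, extra = divmod(total, partitions)
    ranges = []
    start = low
    for i in range(partitions):
        size = base + (1 if i < extra else 0)   # spread the remainder over the first ranges
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def extract(bounds):
    """Stand-in for reading one key range over a dedicated connection."""
    start, end = bounds
    return list(range(start, end + 1))


def extract_partitioned(low, high, partitions):
    """Read every sub-range concurrently and return the rows in key order."""
    with ThreadPoolExecutor(max_workers=partitions) as executor:
        chunks = list(executor.map(extract, partition_ranges(low, high, partitions)))
    return [row for chunk in chunks for row in chunk]


print(partition_ranges(1, 100, 3))
```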
Database Connections
The Repository Server maintains a pool of reusable database connections for
serving client applications. The server generates a Repository Agent process for
each database. The Repository Agent creates new database connections only if all
the current connections are in use.
For example, if 10 clients send requests to the Repository Agent one at a time, the
agent requires only one connection. It reuses the same database connection for all
the requests. If the 10 clients send requests simultaneously, the Repository Agent
opens 10 connections. You can set the maximum number of open connections using
the DatabasePoolSize parameter in the repository configuration file.
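The reuse behavior in this example can be sketched with a toy pool. This is a simplified model, not the Repository Agent's implementation: the `ConnectionPool` class and helper functions are hypothetical, connections are represented by strings, and raising an error when the limit is reached is an assumption (the source does not say what happens at the limit).

```python
class ConnectionPool:
    """Toy pool: reuse an idle connection, open a new one only when none is idle."""

    def __init__(self, database_pool_size):
        # Plays the role of DatabasePoolSize: the cap on open connections.
        self.max_size = database_pool_size
        self.idle = []      # connections waiting to be reused
        self.opened = 0     # total connections opened so far

    def acquire(self):
        if self.idle:
            return self.idle.pop()
        if self.opened >= self.max_size:
            # Assumption: the real behavior at the limit is not documented here.
            raise RuntimeError("connection limit reached")
        self.opened += 1
        return f"conn-{self.opened}"   # stand-in for a real database connection

    def release(self, conn):
        self.idle.append(conn)


def serve_sequentially(pool, requests):
    """Each request releases its connection before the next request begins."""
    for _ in range(requests):
        conn = pool.acquire()
        pool.release(conn)
    return pool.opened


def serve_simultaneously(pool, requests):
    """All requests hold a connection at the same time."""
    held = [pool.acquire() for _ in range(requests)]
    for conn in held:
        pool.release(conn)
    return pool.opened


print(serve_sequentially(ConnectionPool(10), 10))
print(serve_simultaneously(ConnectionPool(10), 10))
```

Ten sequential requests open a single connection, while ten simultaneous requests open ten, which matches the example in the text.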
For a session, a reader object holds the connection for as long as it needs to read
the data from the source tables. A writer object holds a connection for as long as it
needs to write data to the target tables.
The PowerCenter Server maintains a database connection pool for stored procedure
or lookup databases in a workflow. By default, the PowerCenter Server allows an
unlimited number of connections to lookup or stored procedure databases. To limit
these connections, you can optionally set the MaxLookupSPDBConnections parameter
when you configure the PowerCenter service. If a database user does not
have permission for the number of connections a session requires, the session fails.
For pre-session, post-session, and load stored procedures, consecutive stored
procedures reuse a connection if they have identical connection attributes.
Otherwise, the connection for one stored procedure closes and a new connection
begins for the next stored procedure.
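The reuse rule for consecutive stored procedures can be expressed as a short sketch. It assumes, purely for illustration, that the connection attributes of each procedure are given as a tuple (here a hypothetical database name and user name); `count_connections` is not a PowerCenter function.

```python
def count_connections(procedures):
    """Count the connections opened for a sequence of stored procedures.

    Each entry is a tuple of connection attributes. A new connection opens
    only when the attributes differ from those of the previous procedure.
    """
    opened = 0
    current = None
    for attributes in procedures:
        if attributes != current:
            opened += 1          # previous connection closes, a new one begins
            current = attributes
    return opened


# Hypothetical run: the first two procedures share one connection.
procedures = [("dw", "etl"), ("dw", "etl"), ("stage", "etl"), ("dw", "etl")]
print(count_connections(procedures))
```

Note that the last procedure opens a new connection even though its attributes match the first two, because only consecutive procedures reuse a connection.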
PowerCenter Metadata Reporter
PowerCenter provides PowerCenter Metadata Reporter, a web-based application
that allows you to run reports against PowerCenter repository metadata. It gives you
insight into your repository, which enhances your ability to analyze and manage
your repository efficiently.
The Metadata Reporter provides a number of reports, including reports on
transformations, mapplets, mappings, sources, targets, sessions, worklets, and
workflows.
Using the Repository Server Administration Console
Use the Repository Server Administration Console to administer your Repository
Servers and repositories. A Repository Server can manage multiple repositories. You
use the Repository Server Administration Console to create and administer the
repository through the Repository Server.
You can use the Administration Console to perform the following tasks:
Create a repository.
Copy a repository.
Upgrade a repository.
Repository Objects
You create repository objects using the Repository Manager, Designer, and Workflow
Manager client tools. You can view the following objects in the Navigator window of
the Repository Manager:
import source and target definitions. You might also want to create reusable objects,
such as reusable transformations or mapplets. For a list of objects you create in the
Design process, see Repository Objects.
Perform the following design tasks in the Designer:
1. Import source definitions. Use the Source Analyzer to connect to the
sources and import the source definitions.
2. Create or import target definitions. Use the Warehouse Designer to
define relational, flat file, or XML targets to receive data from sources. You
can import target definitions from a relational database or a flat file, or you
can manually create a target definition.
3. Create the target tables. If you add a target definition to the repository
that does not exist in a relational database, you need to create target tables
in your target database. You do this by generating and executing the
necessary SQL code within the Warehouse Designer.
4. Design mappings. Once you have source and target definitions in the
repository, you can create mappings in the Mapping Designer. A mapping is a
set of source and target definitions linked by transformation objects that
define the rules for data transformation. A transformation is an object that
performs a specific function in a mapping, such as looking up data or
performing aggregation.
5. Create mapping objects. Optionally, you can create reusable objects for
use in multiple mappings. Use the Transformation Developer to create
reusable transformations. Use the Mapplet Designer to create mapplets. A
mapplet is a set of transformations that may contain sources and
transformations.
6. Debug mappings. Use the Mapping Designer to debug a valid mapping to
gain troubleshooting information about data and error conditions.
7. Import and export repository objects. You can import and export
repository objects, such as sources, targets, transformations, mapplets, and
mappings to archive or share metadata.
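Step 3 above mentions generating SQL code to create target tables. The sketch below shows the general idea of turning a target definition into a CREATE TABLE statement. It does not reproduce the SQL that the Warehouse Designer generates; the function, table, and column names are hypothetical, and the datatypes are passed through unchanged.

```python
def create_table_sql(table, columns):
    """Build a CREATE TABLE statement from a simple target definition.

    `columns` is a list of (name, datatype, nullable) tuples.
    """
    lines = []
    for name, datatype, nullable in columns:
        null_clause = "" if nullable else " NOT NULL"
        lines.append(f"    {name} {datatype}{null_clause}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n)"


# Hypothetical target definition.
print(create_table_sql(
    "T_CUSTOMER",
    [("CUSTOMER_ID", "INTEGER", False), ("NAME", "VARCHAR(50)", True)],
))
```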
Workflow Manager
The Workflow Manager consists of three tools to help you develop a workflow:
Task Developer. Create tasks you want to accomplish in the workflow in the
Task Developer.
Before you create a workflow, you must configure the following connection
information:
Workflow Monitor
After you create a workflow, you run the workflow in the Workflow Manager and
monitor it in the Workflow Monitor. The Workflow Monitor is a tool that displays
details about workflow runs in two views, Gantt Chart view and Task view. You can
monitor workflows in online and offline modes.
The Workflow Monitor consists of the following windows:
Getting Started
Before you can begin using PowerCenter, you must create the environment and
perform the following administration tasks to allow access to the repository and the
PowerCenter Server:
1. Configure the sources. If you extract data from relational sources, ask the
database administrator to create user profiles with read access. These user
profiles allow you to import source definitions into the repository and access
the sources at runtime.
If you extract data from file sources, the files must be accessible to the PowerCenter
Server and Client machines.
2. Configure the targets. Ask the database administrator to create user
profiles with read and write access. These user profiles allow you to import
target definitions into the repository and write to the targets at runtime.
If the target database does not exist, create it using the database administration
tools included with your RDBMS. After you create the target database, you can use
the Designer to design and create target tables.
For flat file targets, you need a target directory large enough to process the
resulting files.
3. Choose globalization settings and data movement modes. The data
movement mode you use depends on whether you want the PowerCenter
Server to process single-byte data or multibyte character data. You select
code pages for the repository, PowerCenter Client and PowerCenter Server.
4. Create repository database. Create a database for the repository. Users
accessing the repository database need full rights in that database. If you
upgrade the repository to a new version, you need database rights to drop or
modify these tables.
5. Install the PowerCenter Client. Install the client software on a machine
that accesses the sources, targets, and repository databases, as well as the
PowerCenter Server.
6. Install and configure the Repository Server. Install and configure the
Repository Server on a machine that accesses the repository database, the
PowerCenter Client, and the PowerCenter Server.
7. Install and configure the PowerCenter Server. Install the PowerCenter
Server on a Windows or UNIX system that accesses the sources, targets, and
the repository database.
8. Configure connectivity. Configure network, native, and ODBC connectivity.
Create ODBC data sources to connect the PowerCenter Clients to the
sources and targets. You must also have network connections between all
databases and PowerCenter Servers.
9. Create the repository. After you configure connectivity between source,
target, and repository databases, you can create the metadata repository.
Connect to the Repository Server from within the Repository Server
Administration Console to create the metadata repository. The Repository
Server connects to the repository database and runs the SQL to create the
repository tables. All the objects you create with PowerCenter are stored as
metadata in the repository.
10. Create repository users and groups. Create groups and user profiles,
then assign privileges and permissions that determine tasks that users can
perform.
11. Register the PowerCenter Server. Before you can start the PowerCenter
Server, you must register the PowerCenter Server so the Workflow Manager
can direct the PowerCenter Server to the repository.
Changing Data Movement Modes
You can change the PowerCenter Server data movement mode in the
PowerCenter Server configuration parameters. After you change the data
movement mode, the PowerCenter Server runs in the new data movement
mode the next time you start the PowerCenter Server. When the data
movement mode changes, the PowerCenter Server handles character data
differently. To avoid creating data inconsistencies in your target tables, the
PowerCenter Server performs additional checks for sessions that reuse
session caches and files.
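One way to see why character data is handled differently between modes is that the same character can occupy a different number of bytes depending on the code page. The sketch below uses Python codec names (`ascii`, `latin-1`, `utf-8`) as stand-ins; they are illustrative assumptions and not PowerCenter code page settings, and `bytes_per_character` is a hypothetical helper.

```python
def bytes_per_character(text, code_page):
    """Return how many bytes each character occupies in the given code page."""
    return [len(character.encode(code_page)) for character in text]


print(bytes_per_character("abc", "ascii"))               # single-byte characters
print(bytes_per_character("\u65e5\u672c", "utf-8"))      # multibyte characters
```

A cache or file written under single-byte assumptions may therefore not line up with data read under multibyte assumptions, which is consistent with the additional checks described above.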
Table 2-1 describes how the PowerCenter Server handles session files and
caches after you change the data movement mode:
Table 2-1. Session and File Cache Handling After Data Movement Mode Change
Session File or Cache
Each session.
Workflow Log
Each workflow.
Reject File (*.bad)
Each session.
Output File (*.out)
file.
Indicator File (*.in)
file.
error message:
Sessions with Incremental Aggregation enabled.
Unnamed Persistent Lookup Files (*.idx, *.dat)
transformation configured
cache.
lookup cache.
Named Persistent Lookup Files (*.idx, *.dat)
transformation configured
cache.
PowerCenter Server
Repository Server
Global Repository
Local Repository
Standalone Repository
PowerCenter Client
BadFiles
Cache
ExtProc
LkpFiles
SessLogs
SrcFiles
Temp
TgtFiles
WorkflowLogs
Required/Optional Description
Required
$PMSessionLogDir Required
$PMBadFileDir Required
$PMCacheDir Required
$PMTargetFileDir Required
$PMSourceFileDir Required
$PMExtProcDir Required procedures. Defaults to $PMRootDir/ExtProc.
$PMTempDir Required
$PMSuccessEmailUser Optional
$PMFailureEmailUser Optional
Optional
$PMSessionErrorThreshold Optional
$PMWorkflowLogDir Required
$PMWorkflowLogCount Optional
$PMLookupFileDir Optional
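The table notes that $PMExtProcDir defaults to $PMRootDir/ExtProc. As a small illustration of how such a default resolves to a concrete path, the sketch below substitutes a configured root directory into the default value. The `expand` helper and the root path `/opt/pc` are hypothetical, and this is not how the PowerCenter Server itself resolves server variables.

```python
def expand(value, variables):
    """Replace the first $PMRootDir reference in `value` with its configured value."""
    root = variables["$PMRootDir"]
    return value.replace("$PMRootDir", root, 1)


# Hypothetical configuration: only the root directory is set explicitly.
variables = {"$PMRootDir": "/opt/pc"}
print(expand("$PMRootDir/ExtProc", variables))
```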