
1. Informatica Product Overview

1.1 Introduction
1.2 Sources and Targets
1.3 PowerCenter 8 Domain
1.4 PowerCenter 8 Repository
1.5 PowerCenter 8 Administration Console
1.6 PowerCenter 8 Client
1.7 Repository Manager
1.8 Repository Objects
1.9 Workflow Manager
1.10 Workflow Monitor
1.11 Repository Services
1.12 Integration Services
1.13 Web Services Hub
1.14 Data Analyzer
1.15 Metadata Manager
1.16 PowerCenter Repository Reports

2. Installation Steps

3. Repository Manager
3.1 What is Repository?
3.2 Repository Connectivity
3.3 Repository Server
3.4 Repository Objects
3.5 Repository Metadata
3.6 Using Repository

4. PowerCenter Designer
4.1 Designer Overview
4.2 About Transformation
4.3 Lookup Transformation
4.4 Expression Transformation
4.5 Router Transformation
4.6 Filter Transformation
4.7 Joiner Transformation
4.8 Sequence Generator Transformation
4.9 Source Qualifier Transformation
4.10 Aggregator Transformation
4.11 Update Strategy
4.12 Stored Procedure Transformation
4.13 Rank Transformation
4.14 Java Transformation
4.15 User Defined Functions
4.16 Data Profiling
4.17 Profile Manager
4.18 Debugger Overview

5. PowerCenter Workflow Manager


5.1 Workflow Manager
5.2 Workflow Manager Tools
5.3 Workflow Structure
5.4 Workflow Tasks
5.5 Task Developer
5.6 Session Task
5.7 Event Task
5.8 E-Mail Task
5.9 Worklet
5.10 Workflow Scheduler
5.11 Server Connections
5.12 Relational Connections (Native)
5.13 FTP Connection
5.14 Workflows Design
5.15 Workflow Monitor

6. Transformations Overview

Informatica Product Overview

1.1 Introduction

Informatica PowerCenter is a single, unified enterprise data integration platform that allows companies and organizations of all sizes to access, discover, and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise at any speed. PowerCenter helps organizations derive business value from all their data so that they can:
Reduce IT costs and complexity
Streamline business operations and processes
Drive revenue growth

1.2 Sources and Targets

Informatica PowerCenter 8 can access data from a wide range of sources and load it into a wide range of targets.

Informatica PowerCenter 8 provides an environment that allows us to load data into a centralized location, such as a data warehouse or operational data store (ODS). We can extract data from multiple sources, transform the data according to business logic, and load the transformed data into file and relational targets. PowerCenter 8 also provides the ability to view and analyze business information and to browse and analyze metadata from disparate metadata repositories.

PowerCenter 8 Components
PowerCenter domain
PowerCenter repository
Administration Console
PowerCenter Client
Repository Service
Integration Service
Web Services Hub
Data Analyzer
Metadata Manager
PowerCenter Repository Reports

1.3 PowerCenter 8 Domain

PowerCenter has a service-oriented architecture that provides the ability to scale services and share resources across multiple machines. It provides the PowerCenter domain to support the administration of the PowerCenter services. The domain is the primary unit for management and administration of services in PowerCenter. It has the following components:

One or more nodes. A node is a logical representation of a machine in a domain, and a domain may contain more than one node. The node that hosts the domain is the master gateway for the domain. We can add other machines as nodes in the domain and configure them to run Integration Services and Repository Services. All service requests from other nodes in the domain go through the master gateway.

Service Manager. The Service Manager is built in to the domain to support the domain and the application services. It runs on each node in the domain and performs the following functions:

Authentication
Authorization
Configuration
Node configuration
Licensing
Logging

Application services. Application services are a group of services that represent PowerCenter server-based functionality. The application services that run on each node in the domain depend on the way you configure the node and the application service. The following services are installed when you install PowerCenter services:
Repository Service
Integration Service
Web Services Hub
SAP BW Service

1.4 PowerCenter 8 Repository

The PowerCenter repository resides in a relational database. The repository database tables contain the instructions required to extract, transform, and load data, and PowerCenter Client applications access them through the Repository Service. The repository consists of database tables that store metadata. Metadata describes different types of objects, such as mappings or transformations, that we can create or modify using the Client tools. The Integration Service uses repository objects to extract, transform, and load data. The repository also stores administrative information such as user names, passwords, permissions, and privileges. We add metadata to the repository tables when we perform tasks in the PowerCenter Client application, such as creating users, analyzing sources, developing mappings or mapplets, or creating workflows. The Integration Service reads metadata created through the Client application when we run a workflow; it also creates metadata, such as the start and finish times of a session or the session status. We can administer the repository using the Repository Manager Client tool, and we can develop global and local repositories to share metadata.

Global repository

The global repository is the hub of the repository domain. Use the global repository to store common objects that multiple developers can use through shortcuts. These objects may include operational or application source definitions, reusable transformations, mapplets, and mappings.

Local repositories

Local repository is any repository within the domain that is not the global repository. Use local repositories for development. From a local repository, we can create shortcuts to objects in shared folders in the global repository. These objects include source definitions, common dimensions and lookups, and enterprise standard transformations. We can also create copies of objects in non-shared folders.

1.5 PowerCenter 8 Administration Console

The Administration Console is a web application that we use to manage a PowerCenter domain. If we have a user login to the domain, we can access the Administration Console to perform administrative tasks such as managing logs, user accounts, and domain objects. Domain objects include services, nodes, and licenses. The Administration Console performs the following tasks in the domain:

Manage application services
Configure nodes
Manage domain objects
View and edit domain object properties
View log events

1.6 PowerCenter 8 Client

The PowerCenter Client consists of the following applications that we use to manage the repository, design mappings and mapplets, and create sessions to load the data.

Designer


The Designer is used to create mappings that contain transformation instructions for the Integration Service. The Designer has the following tools that we use to analyze sources, design target schemas, and build source-to-target mappings:

Source Analyzer. Imports or creates source definitions.
Target Designer. Imports or creates target definitions.
Transformation Developer. Develops transformations to use in mappings. We can also develop user-defined functions to use in expressions.
Mapplet Designer. Creates sets of transformations to use in mappings.
Mapping Designer. Creates mappings that the Integration Service uses to extract, transform, and load data.

The Designer displays the following windows:

Navigator. Connect to repositories and open folders within the Navigator. We can also copy objects and create shortcuts within the Navigator.
Workspace. Open different tools in this window to create and edit repository objects, such as sources, targets, mapplets, transformations, and mappings.
Output. View details about tasks you perform, such as saving your work or validating a mapping.
Status bar. Displays the status of the operation you perform.
Overview. An optional window to simplify viewing a workspace that contains a large mapping or multiple objects. It outlines the visible area in the workspace and highlights selected objects in color.
Instance data. View transformation data while you run the Debugger to debug a mapping.
Target data. View target data while you run the Debugger to debug a mapping.


1.7 Repository Manager

The Repository Manager is used to create repository users and groups, assign privileges and permissions, and manage folders and locks. We can navigate through multiple folders and repositories and complete the following tasks:

Manage users and groups. Create, edit, and delete repository users and user groups. We can assign and revoke repository privileges and folder permissions.
Perform folder functions. Create, edit, copy, and delete folders. Work performed in the Designer and Workflow Manager is stored in folders. If you want to share metadata, you can configure a folder to be shared.
View metadata. Analyze sources, targets, mappings, and shortcut dependencies, search by keyword, and view the properties of repository objects.

The Repository Manager displays the following windows:

Navigator. Displays all objects that you create in the Repository Manager, the Designer, and the Workflow Manager. It is organized first by repository, then by folder and folder version.
Main. Provides properties of the object selected in the Navigator window. The columns in this window change depending on the object selected.
Output. Provides the output of tasks executed within the Repository Manager, such as creating a repository.


1.8 Repository Objects

We can create repository objects using the Designer and Workflow Manager Client tools, and we can view the following objects in the Navigator window of the Repository Manager.

Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
Target definitions. Definitions of database objects or files that contain the target data.
Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Integration Service uses to transform and move data.
Reusable transformations. Transformations that you use in multiple mappings.
Mapplets. A set of transformations that you use in multiple mappings.
Sessions and workflows. Sessions and workflows store information about how and when the Integration Service moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.

1.9 Workflow Manager

In the Workflow Manager, we can define a set of instructions to execute tasks, such as sessions, emails, and shell commands. This set of instructions is called a workflow. The Workflow Manager has the following tools to help you develop a workflow:

Task Developer. Create the tasks we want to accomplish in the workflow.

Worklet Designer. Create a worklet in the Worklet Designer. A worklet is an object that groups a set of tasks; it is similar to a workflow, but without scheduling information. We can nest worklets inside a workflow.
Workflow Designer. Create a workflow by connecting tasks with links in the Workflow Designer. We can also create tasks in the Workflow Designer as we develop the workflow.


1.10 Workflow Monitor

We can monitor workflows and tasks in the Workflow Monitor, viewing details about a workflow or task in Gantt Chart view or Task view. We can run, stop, abort, and resume workflows from the Workflow Monitor, and view session and workflow log events in the Workflow Monitor Log Viewer. The Workflow Monitor displays workflows that have run at least once. It continuously receives information from the Integration Service and Repository Service, and it also fetches information from the repository to display historic information.

The Workflow Monitor consists of the following windows:
Navigator window. Displays monitored repositories, servers, and repository objects.
Output window. Displays messages from the Integration Service and Repository Service.
Time window. Displays the progress of workflow runs.
Task view. Displays details about workflow runs in a report format.
Gantt Chart view. Displays details about workflow runs in chronological format.


1.11 Repository Service

The Repository Service manages connections to the PowerCenter repository from client applications. It is a separate, multi-threaded process that retrieves, inserts, and updates metadata in the repository database tables, and it ensures the consistency of metadata in the repository. It accepts connection requests from the following PowerCenter applications:

PowerCenter Client. Use the Designer and Workflow Manager to create and store mapping metadata and connection object information in the repository. Use the Workflow Monitor to retrieve workflow run status information and session logs written by the Integration Service. Use the Repository Manager to organize and secure metadata by creating folders, users, and groups.


Command line programs. Use command line programs to perform repository metadata administration tasks and service-related functions.
Integration Service. When you start the Integration Service, it connects to the repository to schedule workflows. When you run a workflow, the Integration Service retrieves workflow task and mapping metadata from the repository and writes workflow status to the repository.
Web Services Hub. When you start the Web Services Hub, it connects to the repository to access web-enabled workflows. The Web Services Hub retrieves workflow task and mapping metadata from the repository and writes workflow status to the repository.

1.12 Integration Services

The Integration Service reads mapping and session information from the repository. It extracts data from the mapping sources and stores the data in memory while it applies the transformation rules that we configure in the mapping, then loads the transformed data into the mapping targets. The Integration Service can combine data from different platforms and source types, and it can load data to different platforms and target types. The Integration Service connects to the repository through the Repository Service to fetch metadata.

1.13 Web Services Hub

The Web Services Hub is a web service gateway for external clients. It processes SOAP requests from web service clients that want to access PowerCenter functionality through web services. Web service clients access the Integration Service and Repository Service through the Web Services Hub. The Web Services Hub hosts the following web services:

Batch web services. Run and monitor web-enabled workflows.
Real-time web services. Create service workflows that allow you to read and write messages to a web service client through the Web Services Hub.

1.14 Data Analyzer

PowerCenter Data Analyzer provides a framework to perform business analytics on corporate data. With Data Analyzer, we can extract, filter, format, and analyze corporate information from data stored in a data warehouse, operational data store, or other data storage models. Data Analyzer uses a web browser interface to view and analyze business information at any level. It extracts, filters, and presents information in easy-to-understand reports. We can use Data Analyzer to design, develop, and deploy reports, and to set up dashboards and alerts that provide the latest information to users at the time and in the manner most useful to them. It works with a database repository to keep track of information about enterprise metrics, reports, and report delivery. Once we install Data Analyzer, users can connect to it from any computer that has a web browser and access to the Data Analyzer host. Data Analyzer can access information from databases, web services, or XML documents. We can set up reports to analyze information from multiple data sources, and also to analyze real-time data from message streams.

Data Analyzer Components

With Data Analyzer, we can read data from a data source, create reports, and view the results in a web browser. Data Analyzer contains the following components:

Data Analyzer repository. The repository stores the metadata necessary for Data Analyzer to track the objects and processes it requires to handle user requests. The metadata includes information on schemas, user profiles, personalization, reports and report delivery, and other objects and processes. We can use the metadata in the repository to create reports based on schemas without accessing the data warehouse directly. Data Analyzer connects to the repository through Java Database Connectivity (JDBC) drivers. The Data Analyzer repository is separate from the PowerCenter repository.


Application server. Data Analyzer uses a third-party Java application server to manage processes. The Java application server provides services such as database access and server load balancing to Data Analyzer, and it also provides an environment that uses Java technology to manage application, network, and system resources.
Web server. Data Analyzer uses an HTTP server to fetch and transmit Data Analyzer pages to web browsers.
Data source. For analytic and operational schemas, Data Analyzer reads data from a relational database and connects through JDBC drivers. For hierarchical schemas, Data Analyzer reads data from an XML document. The XML document may reside on a web server or be generated by a web service operation. Data Analyzer connects to the XML document or web service through an HTTP connection.


1.15 Metadata Manager

PowerCenter Metadata Manager is a metadata management tool that we can use to browse and analyze metadata from disparate metadata repositories. Metadata Manager helps you understand and manage how information and processes are derived, the fundamental relationships between them, and how they are used. It provides the following tools:

Metadata Manager Console. Set up, configure, and run XConnects, which load the source repository metadata into the Metadata Manager Warehouse. We can also use the Metadata Manager Console to set up connections to source repositories and other Metadata Manager components.
Metadata Manager Custom Metadata Configurator. Create XConnects to load metadata from source repositories for which Metadata Manager does not package XConnects.
Metadata Manager Interface. Browse source repository metadata and run reports to analyze the metadata. Also use it to configure metamodels, set up source repositories, configure the reporting schema, and set up access and privileges for users and groups.

Metadata Manager Components

Application server. Helps the Metadata Manager Server manage its processes efficiently.

Metadata Manager Server. Manages the source repository metadata stored in the Metadata Manager Warehouse.
Metadata Manager Warehouse. Stores the Metadata Manager metadata, such as the Metadata Manager reporting schema, user profiles, and reports. It also stores source repository metadata and metamodels.
PowerCenter repository. Stores the workflows, which are XConnect components that extract source metadata and load it into the Metadata Manager Warehouse.
Web server. Fetches and transmits Metadata Manager pages to web browsers. Each supported application server contains an integrated web server.


1.16 PowerCenter Repository Reports Use PowerCenter Repository Reports to browse and analyze PowerCenter metadata. PowerCenter Repository Reports provide the following types of reports to help us administer our PowerCenter environment.

Configuration Management With Configuration Management reports, we can analyze deployment groups and PowerCenter repository object labels.

Operations. With Operations reports, we can analyze operational statistics for workflows, worklets, and sessions.
PowerCenter Objects. With PowerCenter Object reports, we can identify PowerCenter objects, their properties, and their interdependencies with other repository objects.
Security. With the Security report, we can analyze users, groups, and their associations within the repository.

Informatica PowerCenter 8 has the following features, which make it more powerful and easier to use and manage than previous versions:

Service-oriented architecture
Access to structured, unstructured, and semi-structured data
Support for grid computing
High availability
Pushdown optimization
Dynamic partitioning
Metadata exchange enhancements
Team-based development

Global web-based Administration Console
New transformations
23 new functions
User-defined functions
Custom transformation enhancements
Flat file enhancements
New Data Federation option
Enterprise grid


2 Installation Steps
1. Verify that your environment meets the minimum system requirements and complete the pre-installation tasks.
2. Log on to the machine with the user account you want to use to install PowerCenter.
3. Close all other applications.
4. To begin the installation on Windows from a DVD, insert the DVD into the DVD drive and run install.bat from the DVD root directory. To begin the installation on Windows from a hard disk, run install.bat from the root directory in the location where you copied the installer. To begin the installation on UNIX, use a shell command line to run install.sh from the DVD root directory or from the root directory in the location where you downloaded the installer.
5. On UNIX, select the option for GUI mode installation.
6. Select the language to use during installation and click OK. The Welcome window introduces the PowerCenter installation.
7. Click Next. On UNIX, the Configure Environment Variables window appears. Verify that you have configured the required environment variables. The PowerCenter installer gives you the option to stop the installation and modify the environment variables.

8. Click Next. The Choose Installation Type window appears. Choose Install PowerCenter 8.6.
9. Click Next. The PowerCenter License Key window appears.
10. Enter the location and file name of the PowerCenter license key, or click Browse to locate the license key file.
11. Click Next. The Installation Prerequisites window displays the platforms and databases you can use and the disk space requirements. Verify that all PowerCenter installation requirements are met before you continue the installation.
12. Click Next. The Installation Directory window appears.
13. Enter an absolute path for the installation directory. Click Browse to find a directory or use the default directory. On Windows, the default directory is C:\Informatica\PowerCenter8.6. On UNIX, the default directory is $HOME/Informatica/PowerCenter8.6. Note: On Windows, the installation directory path must be on the current machine. On UNIX, HOME is the user home directory. The name of the installation directory cannot contain spaces.
14. Click Next. The HTTPS Configuration window appears.

15. To use an HTTP connection between the Administration Console and the Service Manager, clear Enable HTTPS and skip to step 18. To set up a secure connection between the Administration Console and the Service Manager, select Enable HTTPS and continue to the next step.
17. Select the type of keystore to use and enter the following information based on your selection:

Use a Keystore Generated by the Installer. Select this option to use a self-signed keystore file generated by the PowerCenter installer. Specify the port number to use.
Use an Existing Keystore File. Select this option to use a keystore file you specify. The keystore file can be self-signed or signed by a certification authority. Specify the port number and the location and password of the keystore.
HTTPS Port Number. Port used by the node to communicate between the Administration Console and the Service Manager.
Keystore Password. A plain-text password for the keystore file. Disabled when you use a keystore generated by the installer.
Keystore File Location. Path and file name of the keystore file. You can use a self-signed certificate or a certificate signed by a certification authority. Disabled when you use a keystore generated by the installer.

If you use a generated keystore, the installer creates the keystore file in the following directory: <PowerCenterInstallationDirectory>\server\tomcat\conf

18. Click Next.
19. On the Pre-Installation Summary window, review the installation information, and click Install to continue. The installer copies the files to the installation directory. When the file copy process completes, the Create or Join Domain window appears.
20. Choose to create a domain if you are installing PowerCenter for the first time or you are installing PowerCenter on a single machine, and continue to the next step. Or choose to join a domain if you have created a PowerCenter domain on another machine and you want to add the current machine as a node in that domain; in that case, skip to step 27 on Windows or to step 31 on UNIX.

For more information about the available domain options, click the Help Me Select link.
21. Click Next. The Configure Domain Database window appears. PowerCenter stores the PowerCenter domain configuration in a relational database. The domain configuration must be accessible to all gateway nodes in the domain.

22. Enter the following information:

Database Type. Database for the domain configuration. Select Oracle, Microsoft SQL Server, Sybase ASE, or IBM DB2.
Database URL. Host name and port number for the database instance, in the format <host name>:<port number>.
Database User ID. Domain configuration database user account.
Database User Password. Password for the domain configuration database user account.

Database Service Name. Service name for Oracle and IBM DB2 databases, or database name for Microsoft SQL Server and Sybase ASE databases. Use the following guidelines:
- If you want to use an Oracle SID, use the Custom String option.
- If you want to use a database other than the default Sybase ASE database for the user account, use the Custom String option.
Custom String. JDBC connect string.
- To use an Oracle SID instead of an Oracle service name, use the following JDBC connect string: jdbc:informatica:oracle://host_name:port;SID=sid
- To specify a non-default Sybase database, add the following to the JDBC connect string: DatabaseName=<name of database>
Tablespace (optional). Name of the tablespace in which to create the repository tables. If blank, the installation creates the repository tables in the default tablespace. Define the repository database in a single-node tablespace to optimize performance. Enabled if you select IBM DB2.

23. Click Test Connection to verify that you can connect to the domain configuration database.
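For illustration only, a completed Oracle SID connect string built from the template above might look like the following; the host name, port, and SID here are placeholders, not values from this guide:

    jdbc:informatica:oracle://dbserver01:1521;SID=orcl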

24. Click Next. The Configure Domain window appears.

25. Enter the following information:

Domain Name. Name of the PowerCenter domain to create. The domain name must be in 7-bit ASCII and fewer than 79 characters, and it cannot contain spaces or the following characters: \ / : * ? > < " |
Note: If you are upgrading from PowerCenter 7.x, the name of the PowerCenter domain cannot be the same as the name of the PowerCenter Server in 7.x.

Domain Host Name. Host name of the machine on which to create the PowerCenter domain. If you create a domain on a machine with a single network name, use the default host name. If you create a domain on a machine with multiple network names, you can modify the default host name to use an alternate network name. Optionally, you can use the IP address of the machine on which to create the domain.

Node Name. Node name for the current machine. This is the name of the gateway node for the domain, not the host name of the machine.

Domain Port No. Port number for the current machine. The installer displays a default port number of 6001. If that port number is not available on the machine, the installer displays the next available port number.

Domain User Name. User name of the domain administrator. Use this name to log in to the PowerCenter Administration Console. The user name must be fewer than 79 alphanumeric characters and cannot contain special characters. Do not use Administrator or administrator as the domain user name. Default is admin.

Domain Password. Password for the domain administrator. The password must be between 3 and 16 characters.
Confirm Password. Enter the password again.

To set the range of port numbers for PowerCenter on the node, click Advanced Configuration.
26. Enter the range of port numbers that the PowerCenter installer can use for PowerCenter on the node, and click OK. The default range is 6005-6105.

Skip to step 29.

27. Click Next. The Configure Domain window appears. Verify that the gateway node for the domain you want to join is available before you continue.
28. Enter the following information:

Domain Name. Name of the domain you want to join.
Domain Host Name. Host name or IP address of the gateway node for the domain.
Domain Port No. Port number for the gateway node.
Domain User Name. User name for a domain administrator in the PowerCenter domain you want to join.
Domain Password. Password for the domain administrator user account.

29. Click Next. On Windows, the Configure Informatica Services window appears. Informatica Services is the Windows service that runs PowerCenter. You can specify a different user account to run the service.
30. Enter the following information:

Run Informatica Services With a Different User Account. Indicates whether a user account other than the current Windows user account performing the installation runs Informatica Services. If selected, enter the user name and password of the user account that will run Informatica Services.

Use a different account to run Informatica Services if PowerCenter needs to access a network location not available to the current Windows user account. You must also use a different account to run Informatica Services to use a trusted connection for authentication with the PowerCenter repository database.

If not selected, the current user account that installs Informatica Services also runs Informatica Services.

User Name. User account used to run the Informatica Services service. Enter the Windows domain and user account in the format <DomainName>\<UserAccount>. This user account must have the Act as part of the operating system permission.

Password. Password for the Windows user account that runs Informatica Services.

31. Click Next. The PowerCenter Post-Installation Summary window indicates whether the installation completed successfully. It also shows the configured properties for PowerCenter components and the status of installed components.

32. Click Done. You can view the log files generated by the installer to get more information about the installation tasks performed by the installer and to view configuration properties for the installed components.


3 Repository Manager
3.1 What is Repository?
The Informatica repository is a relational database managed by the Repository Server. The repository stores information, or metadata, used by the Informatica Server and Client tools.

3.2 Repository Connectivity


Repository client applications access the repository database tables through the Repository Server, which protects the metadata. The Repository Server notifies you when objects you are working with are modified or deleted by another user. The Repository Server uses native drivers to communicate with the repository database, while the Informatica Client tools and Server communicate with the Repository Server over TCP/IP. To manage the repository database, the Repository Server uses a process called the Repository Agent; to manage multiple repositories on different machines, it uses multiple Repository Agents.

3.3 Repository Server


Each repository has an independent architecture for the management of its physical repository tables. Components: one Repository Server, and one Repository Agent per repository.


The Repository Server starts the Repository Agent process for the repository database. The client application sends a repository connection request to the Repository Server. The Repository Server verifies connectivity information for the target repository.

3.4 Repository Objects


Folders organize and store metadata in the repository and share that metadata with other repository users. You must create a folder in the repository before you can connect to the repository through the Designer or Workflow Manager.

Folder Creation


Folder Comparison


Folder Sharing


A shared folder is available to all other folders in the same repository. Once you make a folder shared, you cannot reverse it. A shared folder in a global repository can be used by any folder in the domain.


The Repository Manager has four types of windows:

Navigator window contains:
Repositories
Folders
Folder versions
Nodes
Repository objects

The Main window displays details about the objects.


The Dependency window appears when you configure the Repository Manager to display dependencies.

The Output window displays detailed information.



Folder versions store a copy of metadata in development. We can revert to a previous folder version during the development process, and we can create folder versions at any time in the Designer.

Users: Repository users have a username and password that allow access to the repository. Each repository user belongs to at least one user group.

Create User


User Groups: User groups organize individual repository users. Individual users inherit all privileges assigned to the user group.

3.5 Repository Metadata
The repository stores metadata that describes how to extract, transform, and load source and target data.


Repository objects:
Source definitions
Target definitions
Transformations
Reusable transformations
Mappings
Mapplets
Multi-dimensional metadata
Shortcuts
Database connections
Connection objects
Sessions
Workflows
Workflow tasks
Worklets
Folders
Folder versions
Users
User groups

The Repository Server Administration Console contains two types of windows:
Console tree
Main window

The console tree contains the following nodes:
Informatica Repository Server
Repository Server name
Repositories
Repository name
Connections
Locks
Backups

The Main window displays details of the node you select in the console tree.


3.6 Using Repository


Registering Repository


Copying a Repository
When moving a repository from development to production.
Provides a metadata copy as a basis for a new repository.
To preserve the original repository before upgrading.

Repository Security
Features to implement security:
User groups
Repository users
Repository privileges
Folder permissions
User connections
Locking

User Groups.

By default there are two user groups:
Administrators
Public

The Administrators group contains two default users:
Administrator
The database username used to create the repository

To administer user groups, we must have one of the following privileges:
Administer Repository
Super User

You can create custom groups and assign specific privileges to those groups. Using user groups, you can manage users and repository privileges efficiently.

Stop the repository before Upgrading


Change Password


Managing privileges

Repository Locks

The repository uses locks to prevent users from duplicating or overriding each other's work.

Types of locks:
In-use lock
Write-intent lock
Execute lock


In-Use Lock
An in-use lock is created when:
Viewing an object in a folder for which you do not have write permission.
Viewing an object that is already write-locked.
Exporting an object.

We can create an unlimited number of in-use locks per object.

Write-Intent Lock
A write-intent lock is created when:
Editing a repository object in a folder for which you have write permission.
Viewing an object in a folder for which you have write permission.
Importing an object.

We can create only one write-intent lock per object.

Execute Lock
An execute lock is created when starting a workflow that is already running. It prevents the Informatica Server from loading duplicate or inaccurate data. We can create only one execute lock per object.

4 PowerCenter Designer
4.1 Designer Overview
Designer is used to create mappings that contain transformation instructions for the Integration Service. The Designer has the following tools that we use to analyze sources, design target schemas, and build source-to-target mappings.

Source Analyzer. Imports or creates source definitions.

Target Designer. Imports or creates target definitions.

Transformation Developer. Develops transformations to use in mappings. We can also develop user-defined functions to use in expressions.

Mapplet Designer. Creates sets of transformations to use in mappings.

Mapping Designer. Creates mappings that the Integration Service uses to extract, transform, and load data.

4.2 About Transformation


A transformation is a repository object that generates, modifies, or passes data. We configure logic in a transformation that the Integration Service uses to transform data. The Designer provides a set of transformations that perform specific functions. Transformations in a mapping represent the operations the Integration Service performs on the data. Data passes into and out of transformations through ports that we link in a mapping or mapplet.

Transformations can be active or passive. An active transformation can change the number of rows that pass through it; a passive transformation does not change the number of rows that pass through it.

Transformations can be connected to the data flow, or unconnected. An unconnected transformation is not connected to other transformations in the mapping; it is called within another transformation and returns a value to that transformation.

Tasks to incorporate a transformation into a mapping:
Create the transformation.
Configure the transformation.
Link the transformation to other transformations and target definitions.

Transformations can be created in the Mapping Designer, the Transformation Developer, and the Mapplet Designer.

Designer Transformations
Aggregator - to do things like "group by".
Expression - to use various expressions.
Filter - to filter data with a single condition.
Joiner - to make joins between separate databases, files, or ODBC sources.
Lookup - to create a local copy of the data.
Normalizer - to transform denormalized data into normalized data.
Rank - to select only top (or bottom) ranked data.
Sequence Generator - to generate unique IDs for target tables.
Source Qualifier - to filter sources (SQL, select distinct, join, etc.).
Stored Procedure - to run stored procedures in the database and capture their returned values.
Update Strategy - to flag records in the target for insert, delete, or update (defined inside a mapping).
Router - same as Filter, but with multiple conditions.

Active Vs Passive Transformation


4.3 Lookup Transformation


Lookup Transformation Overview
A Lookup transformation is a passive transformation. Use a Lookup transformation in a mapping to look up data in a flat file or a relational table, view, or synonym. We can import a lookup definition from any flat file or relational database to which both the PowerCenter Client and Integration Service can connect, and we can use multiple Lookup transformations in a mapping. The Integration Service queries the lookup source based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup source column values based on the lookup condition.
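To make this concrete, for a relational lookup on a hypothetical ITEMS_DIM table with the condition ITEM_ID = IN_ITEM_ID, the lookup query the Integration Service issues is conceptually similar to the following sketch (table and column names are illustrative, not from this guide):

    -- select the lookup ports; each input row's IN_ITEM_ID is then
    -- matched against ITEM_ID according to the lookup condition
    SELECT ITEM_NAME, PRICE, ITEM_ID
    FROM ITEMS_DIM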


Tasks of Lookup Transformation

Get a related value.
Perform a calculation.
Update slowly changing dimension tables.

A Lookup transformation can be connected or unconnected, and cached or uncached.

Lookup Components


We have to define the following components when we configure a Lookup transformation in a mapping.

Lookup source
Ports
Properties
Condition
Metadata extensions

Creating a Lookup Transformation


In the Mapping Designer, click Transformation > Create. Select the Lookup transformation. Enter a name for the transformation and click OK. The naming convention for Lookup transformations is LKP_TransformationName. In the Select Lookup Table dialog box, we can choose the following options:
Choose an existing table or file definition.
Choose to import a definition from a relational table or file.
Skip to create a manual definition.
If we want to manually define the lookup transformation, click the Skip button. Define input ports for each lookup condition we want to define.


For Lookup transformations that use a dynamic lookup cache, associate an input port or sequence ID with each lookup port. On the Properties tab, set the properties for the lookup. Click OK.

Configuring Unconnected Lookup Transformations
An unconnected Lookup transformation is separate from the pipeline in the mapping. We write an expression using the :LKP reference qualifier to call the lookup within another transformation:
Add input ports.
Add the lookup condition, for example:
ITEM_ID = IN_ITEM_ID
PRICE <= IN_PRICE
Designate a return value.
Call the lookup through an expression:
:LKP.lookup_transformation_name(argument, argument, ...)

Double-click the Lookup transformation and the Edit Transformations dialog box opens.


Setting the properties on the Ports tab and Properties tab


Ports Tab

Lookup Transformation Tips
Add an index to the columns used in a lookup condition.
Place conditions with an equality operator (=) first.
Cache small lookup tables.
Join tables in the database.
Use a persistent lookup cache for static lookups.
Call unconnected Lookup transformations with the :LKP reference qualifier.

Properties Tab

Lookup Caches
The Integration Service builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount we configure in the transformation or session properties. The Integration Service stores condition values in the index cache and output values in the data cache, and it queries the cache for each row that enters the transformation. The Integration Service also creates cache files, by default in $PMCacheDir.

Types of lookup caches:
Persistent cache
Recache from database
Static cache
Dynamic cache
Shared cache

4.4 Expression Transformation


Expression Transformation
We can use the Expression transformation to calculate values in a single row before we write to the target.
We can use the Expression transformation to test conditional statements.
We can use the Expression transformation to perform any non-aggregate calculations.
To perform calculations involving multiple rows, such as sums or averages, use the Aggregator transformation instead.
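For intuition, a single-row Expression calculation behaves like computed columns in a SQL SELECT; this is only an analogy, and the table and column names below are hypothetical:

    -- each output value is derived from the same input row; no grouping is involved
    SELECT ORDER_ID,
           QUANTITY * UNIT_PRICE AS TOTAL_PRICE,
           UPPER(CUSTOMER_NAME)  AS CUSTOMER_NAME_UPPER
    FROM ORDER_ITEMS;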

Setting an Expression in the Expression Transformation
Enter the expression in the Expression Editor; for an output-only port, we have to disable the input (I) checkbox on the port. Check the expression syntax by clicking Validate.


Connect to the Next Transformation
Connect the output ports to the next transformation or target.

Select a Tracing Level on the Properties Tab
Select a tracing level on the Properties tab to determine the amount of transaction detail reported in the session log file. Choose Repository > Save.


4.5 Router Transformation

A Router transformation is an active transformation. A Router transformation is similar to a Filter transformation because both transformations allow us to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition, whereas a Router transformation tests data for one or more conditions and gives us the option to route rows of data that do not meet any of the conditions to a default output group. If we need to test the same input data based on multiple conditions, use a Router transformation in a mapping instead of creating multiple Filter transformations to perform the same task.
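As a loose SQL analogy (table and group names hypothetical), a Router evaluates all group conditions in a single read of the source, whereas separate Filter transformations behave like separate filtered loads:

    -- with Filters: one read of ORDERS per condition
    INSERT INTO HIGH_VALUE_ORDERS SELECT * FROM ORDERS WHERE ORDER_AMOUNT >= 1000;
    INSERT INTO LOW_VALUE_ORDERS  SELECT * FROM ORDERS WHERE ORDER_AMOUNT < 1000;
    -- with a Router: one read, one output group per condition,
    -- and rows matching no condition go to the default group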

Creating a Router Transformation
In the Mapping Designer, click Transformation > Create. Select the Router transformation. Enter a name for the transformation and click OK. The naming convention for Router transformations is RTR_TransformationName.

Input Values in the Router Transformation
Select and drag all the desired ports from a transformation to add them to the Router transformation. Double-click the title bar of the Router transformation to edit transformation properties.


Setting the properties on the Ports tab and Properties tab

Ports tab

Properties tab


Group Tab in the Router Transformation
Click the Group Filter Condition field to open the Expression Editor. Enter a group filter condition. Click Validate to check the syntax of the conditions we entered. Click OK. Connect group output ports to transformations or targets. Choose Repository > Save.

A Router transformation has the following types of groups:
Input
Output

There are two types of output groups:
User-defined groups
Default group

Router Transformation Components

Working with Ports
A Router transformation has input ports and output ports. Input ports reside in the input group, and output ports reside in the output groups. We can create input ports by copying them from another transformation or by manually creating them on the Ports tab.

Port tab in Router Transformation


Connecting Router Transformations in a Mapping
When we connect transformations to a Router transformation in a mapping, consider the following rules:

We can connect one group to one transformation or target.

We can connect one output port in a group to multiple transformations or targets.

We can connect multiple output ports in one group to multiple transformations or targets.


4.6 Filter Transformation


A Filter transformation is an active transformation. We can filter rows in a mapping with a Filter transformation: we pass all the rows from a source transformation through the Filter transformation, then enter a filter condition for the transformation. All ports in a Filter transformation are input/output, and only rows that meet the condition pass through the Filter transformation.

Creating a Filter Transformation
In the Mapping Designer, click Transformation > Create. Select the Filter transformation. Enter a name, and click OK. The naming convention for Filter transformations is FIL_TransformationName. Select and drag all the ports from a source qualifier or other transformation to add them to the Filter transformation. After we select and drag ports, copies of these ports appear in the Filter transformation; each column has both an input and an output port. Double-click the title bar of the Filter transformation to edit transformation properties.


Click the Value section of the condition, and then click the Open button. The Expression Editor appears. Enter the filter condition we want to apply. Use values from one of the input ports in the transformation as part of this condition; however, we can also use values from output ports in other transformations. We may have to fix syntax errors before continuing. Click OK. Select the tracing level, and click OK to return to the Mapping Designer. Choose Repository > Save.
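For example, a simple filter condition might look like the following (port names hypothetical); only rows for which the condition evaluates to TRUE pass through the transformation:

    SALARY > 30000 AND DEPTNO = 10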


Filter Transformation Tips
Use the Filter transformation early in the mapping.
Use the Source Qualifier transformation to filter.

4.7 Joiner Transformation


A Joiner transformation is an active transformation. The Joiner transformation is used to join source data from two related heterogeneous sources residing in different locations or file systems; we can also join data from the same source. The Joiner transformation joins sources with at least one matching column, using a condition that matches one or more pairs of columns between the two sources.

We can use the following sources:
Two relational tables existing in separate databases.
Two flat files in potentially different file systems.
Two different ODBC sources.
A relational table and an XML source.
A relational table and a flat file source.
Two instances of the same XML source.

Creating a Joiner Transformation
In the Mapping Designer, click Transformation > Create. Select the Joiner transformation. Enter a name, and click OK. The naming convention for Joiner transformations is JNR_TransformationName. Drag all the input/output ports from the first source into the Joiner transformation.

The Designer creates input/output ports for the source fields in the Joiner transformation as detail fields by default; we can edit this property later. Select and drag all the input/output ports from the second source into the Joiner transformation. The Designer configures the second set of source fields as master fields by default.

Edit Transformation
Double-click the title bar of the Joiner transformation to open the Edit Transformations dialog box. Select the Ports tab. Add default values for specific ports as necessary.


Setting the Condition
Select the Condition tab and set the condition. Click the Add button to add a condition. Click the Properties tab and configure properties for the transformation. Click OK.

Defining the Join Type
A join is a relational operator that combines data from multiple tables into a single result set. We define the join type on the Properties tab of the transformation. The Joiner transformation supports the following types of joins:
Normal
Master Outer
Detail Outer
Full Outer
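As a rough SQL analogy, with CUSTOMERS designated the master source and ORDERS the detail source (both tables and the join key are hypothetical), the four join types correspond approximately to:

    -- Normal: only rows that satisfy the join condition
    SELECT * FROM ORDERS O JOIN CUSTOMERS C ON O.CUST_ID = C.CUST_ID;
    -- Master Outer: all detail rows plus matching master rows
    SELECT * FROM ORDERS O LEFT OUTER JOIN CUSTOMERS C ON O.CUST_ID = C.CUST_ID;
    -- Detail Outer: all master rows plus matching detail rows
    SELECT * FROM ORDERS O RIGHT OUTER JOIN CUSTOMERS C ON O.CUST_ID = C.CUST_ID;
    -- Full Outer: all rows from both sources
    SELECT * FROM ORDERS O FULL OUTER JOIN CUSTOMERS C ON O.CUST_ID = C.CUST_ID;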

Joiner Transformation Tips
Perform joins in a database when possible.
Join sorted data when possible.

For an unsorted Joiner transformation, designate the source with fewer rows as the master source. For a sorted Joiner transformation, designate the source with fewer duplicate key values as the master source.

4.8 Sequence Generator Transformation


A Sequence Generator transformation is a passive transformation. The Sequence Generator transformation generates numeric values. We can use the Sequence Generator to create unique primary key values or to cycle through a sequential range of numbers.

The Sequence Generator transformation is a connected transformation. The Integration Service generates a value each time a row enters a connected transformation, even if that value is not used. When NEXTVAL is connected to the input port of another transformation, the Integration Service generates a sequence of numbers. When CURRVAL is connected to the input port of another transformation, the Integration Service generates the NEXTVAL value plus one.

We can make a Sequence Generator reusable and use it in multiple mappings. We might reuse a Sequence Generator when we perform multiple loads to a single target. For example, if we have a large input file that we separate into three sessions running in parallel, we can use a Sequence Generator to generate primary key values. If we used different Sequence Generators, the Integration Service might accidentally generate duplicate key values; instead, we use one reusable Sequence Generator for all three sessions to provide a unique value for each target row.
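Conceptually, NEXTVAL and CURRVAL behave much like a database sequence; a hedged, Oracle-style sketch with a hypothetical sequence name:

    CREATE SEQUENCE ORDER_KEY_SEQ START WITH 1 INCREMENT BY 1;
    -- NEXTVAL advances the sequence for each row, like the transformation's NEXTVAL port
    SELECT ORDER_KEY_SEQ.NEXTVAL FROM DUAL;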

Tasks with a Sequence Generator Transformation:
Create keys.
Replace missing values.
Cycle through a sequential range of numbers.

Creating a Sequence Generator Transformation

In the Mapping Designer, click Transformation > Create. Select the Sequence Generator transformation. The naming convention for Sequence Generator transformations is SEQ_TransformationName. Enter a name for the Sequence Generator, and click Create. Click Done. The Designer creates the Sequence Generator transformation.

Edit Transformation


Double-click the title bar of the transformation to open the Edit Transformations dialog box.

Properties Tab
Select the Properties tab. Enter settings as necessary. Click OK. To generate new sequences during a session, connect the NEXTVAL port to at least one transformation in the mapping. Choose Repository > Save.


Sequence Generator Ports
The Sequence Generator provides two output ports: NEXTVAL and CURRVAL. Use the NEXTVAL port to generate a sequence of numbers by connecting it to a transformation or target. We connect the NEXTVAL port to a downstream transformation to generate the sequence based on the Current Value and Increment By properties. Connect NEXTVAL to multiple transformations to generate unique values for each row in each transformation. For example, we might connect NEXTVAL to two target tables in a mapping to generate unique primary key values.


NEXTVAL to Two Target Tables in a Mapping
We configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When we run the workflow, the Integration Service generates the following primary key values for the T_ORDERS_PRIMARY and T_ORDERS_FOREIGN target tables.

T_ORDERS_PRIMARY (PRIMARY KEY)    T_ORDERS_FOREIGN (PRIMARY KEY)
1                                 2
3                                 4
5                                 6
7                                 8
9                                 10

Sequence Generator and Expression Transformation
We configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1.

Output key values for the T_ORDERS_PRIMARY and T_ORDERS_FOREIGN target tables


T_ORDERS_PRIMARY (PRIMARY KEY)    T_ORDERS_FOREIGN (PRIMARY KEY)
1                                 1
2                                 2
3                                 3
4                                 4
5                                 5

CURRVAL is the NEXTVAL value plus one or NEXTVAL plus the Increment By value. We typically only connect the CURRVAL port when the NEXTVAL port is already connected to a downstream transformation. When a row enters the transformation connected to the CURRVAL port, the Informatica Server passes the last-created NEXTVAL value plus one.

Connecting CURRVAL and NEXTVAL Ports to a Target
We configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When we run the workflow, the Integration Service generates the following values for NEXTVAL and CURRVAL.


Output: If we connect the CURRVAL port without connecting the NEXTVAL port, the Integration Service passes a constant value for each row.

NEXTVAL    CURRVAL
1          2
2          3
3          4
4          5
5          6


Connecting Only the CURRVAL Port to a Target
For example, we configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When we run the workflow, the Integration Service generates the following constant values for CURRVAL:

CURRVAL
1
1
1
1
1

4.9 Source Qualifier Transformation


A Source Qualifier is an active transformation. The Source Qualifier represents the rows that the Integration Service reads when it executes a session. When we add a relational or flat file source definition to a mapping, a Source Qualifier transformation is added automatically.

Tasks of the Source Qualifier Transformation
We can use the Source Qualifier to perform the following tasks:
Join data originating from the same source database.
Filter records when the Integration Service reads source data.
Specify an outer join rather than the default inner join.
Specify sorted ports.
Select only distinct values from the source.
Create a custom query to issue a special SELECT statement for the Integration Service to read source data.

Default Query of Source Qualifier For relational sources, the Integration Service generates a query for each Source Qualifier when it runs a session. The default query is a SELECT statement for each source column used in the mapping.
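For illustration, a hypothetical CUSTOMERS source in which three columns are connected downstream might produce a default query like the following (table and column names are illustrative, not from this document):

SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME
FROM CUSTOMERS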

To view the default query: from the Properties tab, select SQL Query. Click Generate SQL. Click Cancel to exit.


Example of a Source Qualifier Transformation. Suppose we want to see all the orders for the month, including order number, order amount, and customer name. The ORDERS table includes the order number and amount of each order, but not the customer name. To include the customer name, we need to join the ORDERS and CUSTOMERS tables.
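A sketch of the join the Source Qualifier could issue for this scenario, assuming CUSTOMER_ID is the common key (column names are illustrative):

SELECT ORDERS.ORDER_ID, ORDERS.ORDER_AMOUNT, CUSTOMERS.CUSTOMER_NAME
FROM ORDERS, CUSTOMERS
WHERE ORDERS.CUSTOMER_ID = CUSTOMERS.CUSTOMER_ID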

Setting the Source Qualifier Properties. Double-click the title bar of the transformation to open the Edit Transformations dialog box. Select the Properties tab. Enter settings as necessary.

SQL Query. We can enter a custom query in the Source Qualifier transformation. From the Properties tab, select SQL Query; the SQL Editor displays. Click Generate SQL and modify the statement as needed.

Joining Source Data. We can use one Source Qualifier transformation to join data from multiple relational tables. These tables must be accessible from the same instance or database server.

Use the Joiner transformation for heterogeneous sources and to join flat files.

Sorted Ports. In the Mapping Designer, open a Source Qualifier transformation, and click the Properties tab. Click in Number of Sorted Ports and enter the number of ports we want to sort. The Integration Service adds the configured number of columns to an ORDER BY clause, starting from the top of the Source Qualifier transformation. The source database sort order must correspond to the session sort order.
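For example, if Number of Sorted Ports is set to 2 and the two ports at the top of a hypothetical ITEMS Source Qualifier are ITEM_NO and PRICE, the generated query might read:

SELECT ITEMS.ITEM_NO, ITEMS.PRICE, ITEMS.ITEM_NAME
FROM ITEMS
ORDER BY ITEMS.ITEM_NO, ITEMS.PRICE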


4.10 Aggregator Transformation
The Aggregator is an active transformation. The Aggregator transformation allows us to perform aggregate calculations, such as averages and sums. Unlike the Expression transformation, which permits calculations on a row-by-row basis only, the Aggregator transformation can perform calculations on groups. We can use conditional clauses to filter rows, providing more flexibility than the SQL language. The Integration Service performs aggregate calculations as it reads, and stores group and row data in an aggregate cache.
Components of the Aggregator Transformation

Aggregate expression
Group by port
Sorted input
Aggregate cache

Aggregate Expression

An aggregate expression can include conditional clauses and non-aggregate functions. It can also include one aggregate function nested within another aggregate function, such as MAX( COUNT( ITEM )).

Aggregate Functions

The following aggregate functions can be used within an Aggregator transformation. You can nest one aggregate function within another aggregate function.
AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE

Conditional Clauses We use conditional clauses in the aggregate expression to reduce the number of rows used in the aggregation. The conditional clause can be any clause that evaluates to TRUE or FALSE.
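For example, the following expression sums commissions only for rows that exceed a quota; the port names COMMISSION and QUOTA are illustrative:

SUM( COMMISSION, COMMISSION > QUOTA )

Rows for which the filter condition evaluates to FALSE or NULL are excluded from the aggregation.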

Null Values in Aggregate Functions When we configure the Integration Service, we can choose how we want the Integration Service to handle null values in aggregate functions. We can choose to treat null values in aggregate functions as NULL or zero. By default, the Integration Service treats null values as NULL in aggregate functions.

Creating the Aggregator Transformation. In the Mapping Designer, click Transformation > Create. Select the Aggregator transformation. Enter a name for the Aggregator and click Create. Then click Done.

The Designer creates the Aggregator transformation. Drag the ports to the Aggregator transformation. The Designer creates input/output ports for each port we include.

Double-click the title bar of the transformation to open the Edit Transformations dialog box . Select the Ports tab. Click the group by option for each column you want the Aggregator to use in creating groups. Click Add and enter a name and data type for the aggregate expression port. Make the port an output port by clearing Input (I). Click in the right corner of the Expression field to open the Expression Editor. Enter the aggregate expression, click Validate, and click OK.
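For instance, if STORE_ID is marked as a group by port, a hypothetical output port TOTAL_SALES could use the following aggregate expression to total sales per store (port names are illustrative):

SUM( QTY * PRICE )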


Add default values for specific ports. Select the Properties tab. Enter settings as necessary.

Click OK. Click Repository > Save.

4.11 Update Strategy

An Update Strategy is an active transformation. When we design a data warehouse, we need to decide what type of information to store in targets. As part of the target table design, we need to determine whether to maintain all the historic data or just the most recent changes. The model we choose determines how we handle changes to existing rows. In PowerCenter, we set the update strategy at two different levels: within a session and within a mapping.

Setting the Update Strategy
We use the following steps to define an update strategy:
To control how rows are flagged for insert, update, delete, or reject within a mapping, add an Update Strategy transformation to the mapping. Update Strategy transformations are essential if we want to flag rows destined for the same target for different database operations, or if we want to reject rows.
Define how to flag rows when we configure a session. We can flag all rows for insert, delete, or update, or we can select the Data Driven option, where the Integration Service follows instructions coded into Update Strategy transformations within the session mapping.
Define insert, update, and delete options for each target when we configure a session. On a target-by-target basis, we can allow or disallow inserts and deletes.
Creating an Update Strategy Transformation
In the Mapping Designer, select Transformation > Create. Select the Update Strategy transformation. The naming convention for Update Strategy transformations is UPD_TransformationName. Enter a name for the transformation, and click Create. Click Done. The Designer creates the Update Strategy transformation.

Drag all ports from another transformation representing data we want to pass through the Update Strategy transformation.

In the Update Strategy transformation, the Designer creates a copy of each port we drag. The Designer also connects the new port to the original port. Each port in the Update Strategy transformation is a combined input/output port. Normally, we would select all of the columns destined for a particular target. After they pass through the Update Strategy transformation, this information is flagged for update, insert, delete, or reject. Double-click the title bar of the transformation to open the Edit Transformations dialog box. Click the Properties tab.

Click the button in the Update Strategy Expression field. The Expression Editor appears. Enter an update strategy expression to flag rows as inserts, deletes, updates, or rejects. Validate the expression and click OK. Click OK to save the changes. Connect the ports in the Update Strategy transformation to another transformation or a target instance. Click Repository > Save
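As a minimal sketch, an update strategy expression usually wraps the constants DD_INSERT, DD_UPDATE, DD_DELETE, and DD_REJECT in an IIF or DECODE function. For example, with a hypothetical CUSTOMER_KEY port:

IIF( ISNULL( CUSTOMER_KEY ), DD_INSERT, DD_UPDATE )

This flags rows without an existing key for insert and all other rows for update.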


Setting the Update Strategy for a Session When we configure a session, we have several options for handling specific database operations, including updates.

Specifying an Operation for All Rows
When we configure a session, we can select a single database operation for all rows using the Treat Source Rows As setting. The Treat Source Rows As property displays the following options: Insert, Delete, Update, and Data Driven.

Specifying Operations for Individual Target Tables
Once we determine how to treat all rows in the session, we also need to set update strategy options for individual targets. Define the update strategy options in the Transformations view on the Mapping tab of the session properties. We can set the following update strategy options for individual target tables:
Insert. Select this option to insert a row into a target table.
Delete. Select this option to delete a row from a table.
Update. You have the following options in this situation:
Update as Update. Update each row flagged for update if it exists in the target table.
Update as Insert. Insert each row flagged for update.
Update else Insert. Update the row if it exists. Otherwise, insert it.
Truncate table. Select this option to truncate the target table before loading data.


4.12 Stored Procedure Transformation

A Stored Procedure is a passive transformation. A Stored Procedure transformation is an important tool for populating and maintaining databases. Database administrators create stored procedures to automate tasks that are too complicated for standard SQL statements. Stored procedures run in either connected or unconnected mode. The mode we use depends on what the stored procedure does and how we plan to use it in a session. We can configure connected and unconnected Stored Procedure transformations in a mapping.
Connected: The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. We should use a connected Stored Procedure transformation when we need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.
Unconnected: The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.

Creating a Stored Procedure Transformation. After we configure and test a stored procedure in the database, we must create the Stored Procedure transformation in the Mapping Designer.

To import a stored procedure: In the Mapping Designer, click Transformation > Import Stored Procedure. Select the database that contains the stored procedure from the list of ODBC sources. Enter the user name, owner name, and password to connect to the database and click Connect.


Select the procedure to import and click OK. The Stored Procedure transformation appears in the mapping. The Stored Procedure transformation name is the same as the stored procedure we selected. Open the transformation, and click the Properties tab. Select the database where the stored procedure exists from the Connection Information row. If we changed the name of the Stored Procedure transformation to something other than the name of the stored procedure, enter the Stored Procedure Name. Click OK. Click Repository > Save to save changes to the mapping.


4.13 Rank Transformation
The Rank transformation is an active transformation. The Rank transformation allows us to select only the top or bottom rank of data. The Rank transformation differs from the MAX and MIN functions in that it selects a group of top or bottom values, not just one value.


Creating the Rank Transformation. In the Mapping Designer, click Transformation > Create. Select the Rank transformation. Enter a name for the Rank transformation. The naming convention for Rank transformations is RNK_TransformationName. Enter a description for the transformation. This description appears in the Repository Manager. Click Create, and then click Done. The Designer creates the Rank transformation. Link columns from an input transformation to the Rank transformation. Click the Ports tab, and then select the Rank (R) option for the port used to measure ranks. If we want to create groups for ranked rows, select Group By for the port that defines the group.


Click the Properties tab and select whether we want the top or bottom rank. For the Number of Ranks option, enter the number of rows we want to select for the rank. Change the other Rank transformation properties, if necessary. Click OK. Click Repository > Save.

Properties Tab


4.14 Java Transformation

The Java transformation is an active or passive connected transformation that provides a simple native programming interface to define transformation functionality with the Java programming language. You create Java transformations by writing Java code snippets that define transformation logic. The PowerCenter Client uses the Java Development Kit (JDK) to compile the Java code and generate byte code for the transformation. The Integration Service uses the Java Runtime Environment (JRE) to execute the generated byte code at run time.
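As a minimal illustrative sketch (not taken from this document), a snippet on the On Input Row code entry tab could read an input port and populate an output port; the ports in_price and out_price_with_tax are hypothetical:

// On Input Row tab: executes once for each input row.
// Input and output ports are available as Java variables.
out_price_with_tax = in_price * 1.15; // hypothetical markup calculation
generateRow(); // generate an output row (used in active Java transformations)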

Steps To Define a Java Transformation
Create the transformation in the Transformation Developer or Mapping Designer.

Configure input and output ports and groups for the transformation. Use port names as variables in Java code snippets. Configure the transformation properties. Use the code entry tabs in the transformation to write and compile the Java code for the transformation. Locate and fix compilation errors in the Java code for the transformation. Enter the ports and use those ports as identifiers in the Java code. Go to the Java code, enter the code, click Compile, and check the output in the Output window. Create a session and workflow and run the session.

Functions
Some functions used in the Designer:
AVG
Syntax: AVG( numeric_value [, filter_condition ] )
MAX
Syntax: MAX( value [, filter_condition ] )
MIN
Syntax: MIN( value [, filter_condition ] )
INSTR
Syntax: INSTR( string, search_value [, start [, occurrence ] ] )
SUBSTR
Syntax: SUBSTR( string, start [, length ] )
IS_DATE
Syntax: IS_DATE( value )
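A few illustrative calls of the string functions (values are examples only):

INSTR( 'PowerCenter', 'e' ) returns 4
SUBSTR( 'PowerCenter', 1, 5 ) returns 'Power'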


4.15 User Defined Functions


We can create user-defined functions using the PowerCenter transformation language. Create user-defined functions to reuse expression logic and build complex expressions. User-defined functions are available to other users in a repository. Once we create user-defined functions, we can manage them from the User-Defined Function Browser dialog box. We can also use them as functions in the Expression Editor; they appear on the User-Defined Functions tab of the Expression Editor. We create a user-defined function in the Transformation Developer. Configure the following information when we create a user-defined function: Name, Type, Description, Arguments, Syntax.
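For example, a hypothetical public user-defined function named RemoveSpaces with one string argument TEXT could wrap the expression:

REPLACECHR( 0, TEXT, ' ', NULL )

In the Expression Editor it could then be called as :UDF.REMOVESPACES( CUST_NAME ), where CUST_NAME is an illustrative port name.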


Steps to Create User-Defined Functions
In the Transformation Developer, click Tools > User-Defined Functions. Click New. The Edit User-Defined Function dialog box appears.
Enter a function name and select a function type. If we create a public user-defined function, we cannot change the function to private when we edit it.
Optionally, enter a description of the user-defined function. We can enter up to 2,000 characters.
Create arguments for the user-defined function. When we create arguments, configure the argument name, data type, precision, and scale. We can select transformation data types.
Click Launch Editor to create an expression that contains the arguments we defined. Click OK. The Designer assigns the data type of the data the expression returns, with the precision and scale of transformation data types.
Click OK. The expression displays in the User-Defined Function Browser dialog box.

4.16 Data Profiling
Data profiling is a technique used to analyze source data. PowerCenter Data Profiling can help us evaluate source data and detect patterns and exceptions. We can profile source data to suggest candidate keys, detect data patterns, and evaluate join criteria. Use Data Profiling to analyze source data in the following situations: during mapping development, and during production to maintain data quality. To profile source data, we create a data profile. We can create a data profile based on a source or mapplet in the repository. Data profiles contain functions that perform calculations on the source data. The repository stores the data profile as an object. We can apply profile functions to a column within a source, to a single source, or to multiple sources. We can create the following types of data profiles.
Auto profile. Contains a predefined set of functions for profiling source data. Use an auto profile during mapping development.

Custom profile. Use a custom profile during mapping development to validate documented business rules about the source data. We can also use a custom profile to monitor data quality or validate the results of BI reports.

Steps To Create Auto Profile When we create an auto profile, we can profile groups or columns in the source. Or, we can profile the entire source.


To create an auto profile:
Select the source definition in the Source Analyzer, or the mapplet in the Mapplet Designer, that you want to profile. Launch the Profile Wizard from the following Designer tools:
Source Analyzer. Click Sources > Profiling > Create Auto Profile.
Mapplet Designer. Click Mapplets > Profiling > Create Auto Profile.
The Auto Profile Column Selection dialog box opens if you set the default data profile options to open it when you create an auto profile, or if the source definition contains 25 or more columns.
Optionally, click Description to add a description for the data profile. Enter a description up to 200 characters and click OK.
Optionally, select the groups or columns in the source that you want to profile. By default, all columns or groups are selected.
Select Load Verbose Data if you want the Integration Service to write verbose data to the Data Profiling warehouse during the profile session. By default, the Load Verbose Data option is disabled.
Click Next. Select additional functions to include in the auto profile. We can also clear functions we do not want to include.


Optionally, click Save As Default to create new default functions based on the functions selected here. Optionally, click Profile Settings to enter settings for domain inference and structure inference tuning. Optionally, modify the default profile settings and click OK. Click Configure Session to configure the session properties after you create the data profile. Click Next if you selected Configure Session, or click Finish if you disabled Configure Session. The Designer generates a data profile and profile mapping based on the profile functions. Configure the Profile Run options and click Next. Configure the Session Setup options. Click Finish.

We can create a custom profile from the following Designer tools. Source Analyzer. Click Sources > Profiling > Create Custom Profile.


Mapplet Designer. Click Mapplets > Profiling > Create Custom Profile. Profile Manager. Click Profile > Create Custom. To create a custom profile, complete the following. Enter a data profile name and optionally add a description. Add sources to the data profile.

Add, edit, or delete a profile function and enable session configuration. Configure profile functions.

Configure the profile session if we enable session configuration.


4.17 Profile Manager

Profile Manager is a tool that helps to manage data profiles. It is used to set default data profile options, work with data profiles in the repository, run profile sessions, view profile results, and view sources and mapplets with at least one profile defined for them. When we launch the Profile Manager, we can access profile information for the open folders in the repository.

There are two views in the Profile Manager.
Profile View: The Profile View tab displays the data profiles in the open folders in the repository.

Source View: The Source View tab displays the source definitions in the open folders in the repository for which we have defined data profiles.



4.18 Debugger Overview
We can debug a valid mapping to gain troubleshooting information about data and error conditions. The Debugger is used in the following situations.
Before we run a session. After we save a mapping, we can run some initial tests with a debug session before we create and configure a session in the Workflow Manager.
After we run a session. If a session fails or if we receive unexpected results in the target, we can run the Debugger against the session. We might also run the Debugger against a session if we want to debug the mapping using the configured session properties.
Create breakpoints. Create breakpoints in a mapping where we want the Integration Service to evaluate data and error conditions.
Configure the Debugger. Use the Debugger Wizard to configure the Debugger for the mapping. Select the session type the Integration Service uses when it runs the Debugger.


Run the Debugger. Run the Debugger from within the Mapping Designer. When we run the Debugger, the Designer connects to the Integration Service. The Integration Service initializes the Debugger and runs the debugging session and workflow.
Monitor the Debugger. While we run the Debugger, we can monitor the target data, transformation and mapplet output data, the debug log, and the session log.
Modify data and breakpoints. When the Debugger pauses, we can modify data and see the effect on transformations, mapplets, and targets as the data moves through the pipeline. We can also modify breakpoint information.

Create Breakpoints: Go to Mappings > Debugger > Edit Breakpoints. Choose the instance name and breakpoint type, and then click Add to add the breakpoints. For a data breakpoint, enter the condition. Enter the number of errors to skip before the Debugger stops.
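For example, a hypothetical data breakpoint on an Expression transformation instance might use the condition SALARY > 100000, so the Debugger pauses whenever a row satisfies it; the port name is illustrative.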

Run the Debugger: Go to Mappings > Debugger > Start Debugger.

Click Next, and then choose Create a debug session instance, or choose an existing session. Click Next.

Choose the source and target connections and click Next. Click Next.
Debug Indicators


PowerCenter Workflow Manager


5.1 Workflow Manager
In the Workflow Manager, we can define a set of instructions to execute tasks, such as sessions, emails, and shell commands. This set of instructions is called a workflow. The Workflow Manager has the following tools to help us develop a workflow.

Task Developer
Use the Task Developer to create the tasks we want to accomplish in the workflow.

Worklet Designer
Use the Worklet Designer to create a worklet. A worklet is an object that groups a set of tasks. A worklet is similar to a workflow, but without scheduling information. We can nest worklets inside a workflow.

Workflow Designer
Use the Workflow Designer to create a workflow by connecting tasks with links. We can also create tasks in the Workflow Designer as we develop the workflow.

5.2 Workflow Manager Tools


Workflow Designer: Maps the execution order and dependencies of Sessions, Tasks, and Worklets for the Informatica Server.

Task Developer: Create Session, Shell Command, and Email tasks. Tasks created in the Task Developer are reusable.

Worklet Designer: Creates objects that represent a set of tasks. Worklet objects are reusable.


5.3 Workflow Structure


A Workflow is a set of instructions for the Informatica Server to perform data transformation and load. It combines the logic of Session Tasks, other types of Tasks, and Worklets. The simplest Workflow is composed of a Start Task, a Link, and one other Task.

5.4 Workflow Tasks


Workflow tasks are instructions the Integration Service executes when running a workflow. Workflow tasks perform functions supplementary to extracting, transforming, and loading data. Workflow tasks include commands, decisions, timers, and email notification. You can create the following types of tasks in the Workflow Manager:

Assignment. Assigns a value to a workflow variable. Command. Specifies a shell command to run during the workflow. Control. Stops or aborts the workflow. Decision. Specifies a condition to evaluate. Email. Sends email during the workflow. Event-Raise. Notifies the Event-Wait task that an event has occurred. Event-Wait. Waits for an event to occur before executing the next task. Session. Runs a mapping you create in the Designer. Timer. Waits for a timed event to trigger.

5.5 Task Developer


You can create the following three types of tasks in the Task Developer. Command Session Email


Creating a Task in the Task Developer In the Task Developer, click Tasks > Create. The Create Task dialog box appears. Select the task type you want to create, Command, Session, or Email.

Enter a name for the task. For session tasks, select the mapping you want to associate with the session. Click Create.


The Task Developer creates the workflow task. Click Done to close the Create Task dialog box.

5.6 Session Task


Server instructions to run the logic of ONE specific mapping, e.g. source and target data location specifications, memory allocation, optional mapping overrides, and scheduling, processing, and load instructions.

Becomes a component of a Workflow (or Worklet). If configured in the Task Developer, the Session Task is reusable (optional).


Session Task (continued). Double-click the Session object. Valid mappings are displayed in the dialog box.

Session Task tabs General Properties Config Object Mapping Components Metadata Extensions


Session Task-General


Session Task-Properties

Session Task-Config Object

Session Task Properties

Enable Test Load: With a test load, the Integration Service reads and transforms data without writing to targets.
$Source Connection Value: Enter the database connection you want the Integration Service to use for the $Source variable. Select a relational or application database connection. You can also choose a $DBConnection parameter. Use the $Source variable in Lookup and Stored Procedure transformations to specify the database location for the lookup table or stored procedure.
$Target Connection Value: Enter the database connection you want the Integration Service to use for the $Target variable. Select a relational or application database connection. You can also choose a $DBConnection parameter. Use the $Target variable in Lookup and Stored Procedure transformations to specify the database location for the lookup table or stored procedure.
Treat Source Rows As: If the mapping for the session contains an Update Strategy transformation or a Custom transformation configured to set the update strategy, the default option is Data Driven.
Commit Interval: In conjunction with the selected commit interval type, indicates the number of rows. By default, the Integration Service uses a commit interval of 10,000 rows. This option is not available for user-defined commit.

Session Task-Properties


Session Task-Config Object
Line Sequential Buffer Length: Affects the way the Integration Service reads flat files. Increase this setting from the default of 1024 bytes per line only if source flat file records are larger than 1024 bytes.
On Stored Procedure Error: Required if the session uses pre- or post-session stored procedures. If you select Stop Session, the Integration Service stops the session on errors executing a pre-session or post-session stored procedure. If you select Continue Session, the Integration Service continues the session regardless of errors executing pre-session or post-session stored procedures.

On Pre-Post SQL Error: Required if the session uses pre- or post-session SQL. If you select Stop Session, the Integration Service stops the session on errors executing pre-session or post-session SQL. If you select Continue, the Integration Service continues the session regardless of errors executing pre-session or post-session SQL.


Session Task-Config Object

Session Task-Mapping Connection
Before the Integration Service can access a source or target database in a session, you must configure the database connections in the Workflow Manager. When you create or modify a session that reads from or writes to a relational database, you can select configured source and target database connections. When you create a connection, you must have the following information available:
Database name: Name for the connection.
Database type: Type of the source or target database.

Database user name: Name of a user who has the appropriate database permissions to read from and write to the database. To use an SQL override with pushdown optimization, the user must also have permission to create views on the source or target database.
Password: Database password (7-bit ASCII only).
Connect string: Connect string used to communicate with the database.
Database code page: Code page associated with the database.

Partition Points
Partition points mark the boundaries between threads in a pipeline. The Integration Service redistributes rows of data at partition points. You can add partition points to increase the number of transformation threads and increase session performance.
Add Partition Point: Click to add a new partition point. When you add a partition point, the transformation name appears under the Partition Points node.
Delete Partition Point: Click to delete the selected partition point. You cannot delete certain partition points.
Edit Partition Point: Click to edit the selected partition point. This opens the Edit Partition Point dialog box.

$$PushdownConfig Mapping Parameter
Depending on the database workload, you may want to use source-side, target-side, or full pushdown optimization at different times. For example, you might want to use partial pushdown optimization during the peak hours of the day, but use full pushdown optimization from midnight until 2 a.m. when activity is low. To use different pushdown optimization configurations at different times, use the $$PushdownConfig mapping parameter. The parameter lets you run the same session using the different types of pushdown optimization. Complete the following steps to configure the mapping parameter:
Create $$PushdownConfig in the Mapping Designer.

When you add the $$PushdownConfig mapping parameter in the Mapping Designer, use the following values:
Name: $$PushdownConfig
Type: Parameter
Datatype: String
Precision or Scale: 10
Aggregation: n/a
Initial Value: None
Description: Optional
When you configure the session, choose $$PushdownConfig for the Pushdown Optimization attribute. Define the parameter in the parameter file. Enter one of the following values for $$PushdownConfig in the parameter file:
None. The Integration Service processes all transformation logic for the session.
Source. The Integration Service pushes part of the transformation logic to the source database.
Source with View. The Integration Service creates a view to represent the SQL override value, and it runs an SQL statement against this view to push part of the transformation logic to the source database.


Target. The Integration Service pushes part of the transformation logic to the target database.
Full. The Integration Service pushes all transformation logic to the database.
Full with View. The Integration Service creates a view to represent the SQL override value, and it runs an SQL statement against this view to push part of the transformation logic to the source database. The Integration Service pushes any remaining transformation logic to the target database.
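A sketch of the corresponding parameter file entry, with hypothetical folder, workflow, and session names:

[MyFolder.WF:wf_load_orders.ST:s_m_load_orders]
$$PushdownConfig=Source with View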

Session Task-Transformations Allows overrides of some transformation properties Does not change the properties in the Mapping

Session Task-Components
Pre-Session Command: Shell commands that the Integration Service performs at the beginning of a session.
Post-Session Success Command: Shell commands that the Integration Service performs after the session completes successfully.
Post-Session Failure Command: Shell commands that the Integration Service performs if the session fails.

5.7 Event-Task
Event-Raise task The Event-Raise task represents the location of a user-defined event. A user-defined event is the sequence of tasks in the branch from the Start task to the Event-Raise task. When the Integration Service runs the Event-Raise task, the Event-Raise task triggers the user-defined event.

To use an Event-Raise task


In the Workflow Designer workspace, create an Event-Raise task and place it in the workflow to represent the user-defined event you want to trigger. A user-defined event is the sequence of tasks in the branch from the Start task to the Event-Raise task. Double-click the Event-Raise task to open it. Click the Open button in the Value field on the Properties tab to open the Events Browser for user-defined events. Choose an event in the Events Browser. Click OK twice to return to the workspace.

Event-Wait task The Event-Wait task waits for a predefined event or a user-defined event. A predefined event is a file-watch event. When you use the Event-Wait task to wait for a predefined event, you specify an indicator file for the Integration Service to watch. The Integration Service waits for the indicator file to appear. Once the indicator file appears, the Integration Service continues running tasks after the Event-Wait task.


To wait for a user-defined event: In the workflow, create an Event-Wait task and double-click the Event-Wait task to open it. In the Events tab of the task, select User-Defined. Click the Event button to open the Events Browser dialog box. Select a user-defined event for the Integration Service to wait for. Click OK twice.


5.8 E-Mail Task


You can send email during a workflow using the Email task in the Workflow Manager. You can create reusable Email tasks in the Task Developer for any type of email. Or, you can create non-reusable Email tasks in the Workflow and Worklet Designer. Use Email tasks in any of the following locations:
Session properties. You can configure the session to send email when the session completes or fails.
Workflow properties. You can configure the workflow to send email when the workflow is interrupted.
Workflows or worklets. You can include an Email task anywhere in the workflow or worklet to send email based on a condition you define.

You can create Email tasks in the Task Developer, Worklet Designer, and Workflow Designer. To create an Email task in the Task Developer: In the Task Developer, click Tasks > Create. The Create Task dialog box appears.

Select an Email task and enter a name for the task. Click Create. The Workflow Manager creates an Email task in the workspace. Click Done. Double-click the Email task in the workspace. The Edit Tasks dialog box appears. Click Rename to enter a name for the task. Enter a description for the task in the Description field. Click the Properties tab. Enter the fully qualified email address of the mail recipient in the Email User Name field. Enter the subject of the email in the Email Subject field. Or, you can leave this field blank. Click the Open button in the Email Text field to open the Email Editor. Enter the text of the email message in the Email Editor. You can leave the Email Text field blank. Note: You can incorporate format tags and email variables in a post-session email. However, you cannot add them to an Email task outside the context of a session. Click OK twice to save the changes.


5.9 Worklets
A worklet is an object that represents a set of tasks that you create in the Worklet Designer. Create a worklet when you want to reuse a set of workflow logic in more than one workflow. To run a worklet, include the worklet in a workflow. The workflow that contains the worklet is called the parent workflow. When the Integration Service runs a worklet, it expands the worklet to run tasks and evaluate links within the worklet. It writes information about worklet execution in the workflow log.
To create a reusable worklet: In the Worklet Designer, click Worklet > Create. The Create Worklet dialog box appears. Enter a name for the worklet. Click OK. The Worklet Designer creates a Start task in the worklet.

To create a non-reusable worklet: In the Workflow Designer, open a workflow. Click Tasks > Create. For the Task type, select Worklet. Enter a name for the task. Click Create. The Workflow Designer creates the worklet and adds it to the workspace. Click Done.


5.10 Workflow Scheduler
Workflow Scheduler Objects: Set up reusable schedules to associate with multiple Workflows. Used in Workflows and Session Tasks.


5.11 Server Connections
Configure Server data access connections. Used in Session Tasks.


5.12 Relational Connections (Native)


Creating a relational (database) connection:
Instructions to the Server to locate relational tables.
Used in Session Tasks.


Relational Connections Properties: Define a native relational (database) connection.

5.13 FTP Connection
Creating an FTP connection

Instructions to the Server to FTP flat files. Used in Session Tasks.


5.14 Workflows Design

Sample Workflow

Developing Workflow


Building Workflow Components
Add Sessions and other Tasks to the Workflow. Connect all Workflow components with Links. Save the Workflow. Start the Workflow.
NOTE: Sessions in a workflow can be executed independently.


5.15 Workflow Monitor

We can monitor workflows and tasks in the Workflow Monitor. View details about a workflow or task in Gantt Chart view or Task view. We can run, stop, abort, and resume workflows from the Workflow Monitor. We can view sessions and workflow log events in the Workflow Monitor Log Viewer. The Workflow Monitor displays workflows that have run at least once. The Workflow Monitor continuously receives information from the Integration Service and Repository Service. It also fetches information from the repository to display historic information. The Workflow Monitor consists of the following windows.

Navigator window. It displays monitored repositories, servers, and repository objects.

Output window. It displays messages from the Integration Service and Repository Service.

Time window It displays progress of workflow runs.

Task view It displays details about workflow runs in a report format.

Gantt Chart view It displays details about workflow runs in chronological format.


Workflow Monitor Windows

The Workflow Monitor displays Workflows that have been run at least once. We can monitor a Server in two modes: online or offline.
Online mode: the Workflow Monitor continuously receives information from the Informatica Server and the Repository Server.
Offline mode: the Workflow Monitor displays historic information about past Workflow runs by fetching information from the Repository.
Monitoring Workflow
Perform the following operations in the Workflow Monitor:
Restart -- restart a Task, Workflow, or Worklet
Stop -- stop a Task, Workflow, or Worklet
Abort -- abort a Task, Workflow, or Worklet

Resume -- resume a suspended Workflow after a failed task is corrected
View Session and Workflow logs
Abort has a 60-second timeout. If the Server has not completed processing and committing data during the timeout period, the threads and processes associated with the Session are killed.

Monitoring Workflow


Monitoring Window Filtering


6 Transformations Overview
Aggregator (Active/Connected): Performs aggregate calculations.
Application Source Qualifier (Active/Connected): Represents the rows that the Integration Service reads from an application, such as an ERP source, when it runs a session.
Custom (Active or Passive/Connected): Calls a procedure in a shared library or DLL.
Expression (Passive/Connected): Calculates a value.
External Procedure (Passive/Connected or Unconnected): Calls a procedure in a shared library or in the COM layer of Windows.
Filter (Active/Connected): Filters data.
HTTP (Passive/Connected): Connects to an HTTP server to read or update data.
Input (Passive/Connected): Defines mapplet input rows. Available in the Mapplet Designer.
Java (Active or Passive/Connected): Executes user logic coded in Java. The byte code for the user logic is stored in the repository.
Joiner (Active/Connected): Joins data from different databases or flat file systems.
Lookup (Passive/Connected or Unconnected): Looks up values.
Normalizer (Active/Connected): Source qualifier for COBOL sources. Can also use in the pipeline to normalize data from relational or flat file sources.
Output (Passive/Connected): Defines mapplet output rows. Available in the Mapplet Designer.
Rank (Active/Connected): Limits records to a top or bottom range.
Router (Active/Connected): Routes data into multiple transformations based on group conditions.
Sequence Generator (Passive/Connected): Generates primary keys.
Sorter (Active/Connected): Sorts data based on a sort key.
Source Qualifier (Active/Connected): Represents the rows that the Integration Service reads from a relational or flat file source when it runs a session.
SQL (Active or Passive/Connected): Executes SQL queries against a database.
Stored Procedure (Passive/Connected or Unconnected): Calls a stored procedure.
Transaction Control (Active/Connected): Defines commit and rollback transactions.
Union (Active/Connected): Merges data from different databases or flat file systems.
Unstructured Data (Active or Passive/Connected): Transforms data in unstructured and semi-structured formats.
Update Strategy (Active/Connected): Determines whether to insert, delete, update, or reject rows.
XML Generator (Active/Connected): Reads data from one or more input ports and outputs XML through a single output port.
XML Parser (Active/Connected): Reads XML from one input port and outputs data to one or more output ports.
XML Source Qualifier (Active/Connected): Represents the rows that the Integration Service reads from an XML source when it runs a session.
