
An Introduction to Informatica

Chandrashekar P

Abstract

Informatica is an ETL product, known as Informatica PowerCenter.

It is an easy-to-use tool that supports all the steps of the Extraction,
Transformation and Load process. It can communicate
with all major data sources (mainframe, RDBMS, flat
files, XML, VSAM, SAP, etc.) and can move and transform data between them.
It can move huge volumes of data very effectively.

It can effectively join data from two distinct data sources.

This document gives you an introduction to Informatica.

Table of Contents

1. An Overview of DWH
2. Informatica Architecture
2.1. Informatica PowerCenter Client Tools
2.2. Application Services
3. Informatica Transformations
3.1. Source Qualifier Transformation
3.2. Expression Transformation
3.3. Aggregate Transformation
3.4. Filter Transformation
3.5. Router Transformation
3.6. Sorter Transformation
3.7. Joiner Transformation
3.8. Lookup Transformation
3.9. Union Transformation
4. Workflow Creation
5. Summary

1. An Overview of DWH
A data warehouse is a relational database that is designed for query and analysis
rather than for transaction processing. It usually contains historical data derived
from transaction data, but it can include data from other sources. In addition to a
relational database, a data warehouse environment includes an extraction,
transportation, transformation, and loading (ETL) solution.

ETL technology (shown below with arrows) is an important component of the
Data Warehousing Architecture. It is used to copy data from Operational
Applications to the Data Warehouse Staging Area, from the DW Staging Area into
the Data Warehouse and finally from the Data Warehouse into a set of conformed
Data Marts that are accessible by decision makers.
Several ETL tools are available for data warehousing, including Informatica,
Ab Initio, DataStage and Oracle Data Integrator.

Let's take a detailed look at Informatica.


2. Informatica Architecture

Tool_View:

2.1. Informatica PowerCenter Client Tools

These are the development tools installed on the developer's machine. They enable a
developer to:

Define the transformation process, known as a mapping (Designer)

Click on Designer:

The Repository Navigator window is displayed. Right-click the required
repository and select Connect. Provide the credentials to log in, then
select the corresponding folder and click Open to access it.


The Designer window is displayed.

Define run-time properties for a mapping, known as a session (Workflow
Manager)


Monitor execution of sessions (Workflow Monitor)

Manage the repository, which is useful for administrators (Repository Manager)

2.2. Application Services

Application services are a group of services that represent PowerCenter
server-based functionality. When you configure an application service, you
designate the node where it runs.

Types of Application Services:


Repository Service: The Repository Service is an application service that manages
the repository. It retrieves, inserts, and updates metadata in the repository database
tables.

Integration Service: The Integration Service is an application service that runs
data integration sessions and workflows.

SAP BW Service: The SAP BW Service is an application service that listens for RFC
requests from SAP BW and initiates workflows to extract from or load to SAP BW.

Web Services Hub: The Web Services Hub is a web service gateway for external
clients. It processes SOAP requests from web service clients that want to access
PowerCenter functionality through web services. Web service clients access the
Integration Service and Repository Service through the Web Services Hub.

Core Services:

The PowerCenter architecture has a new set of Core Services, which comprises the
Log Service, Gateway Service, Administration Service, Configuration Service,
Authentication Service and Domain Service.

3. Informatica Transformations

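A transformation is a repository object that generates, modifies, or passes data. The
Designer provides a set of transformations that perform specific functions.

A transformation can be active or passive, and connected or unconnected. An
active transformation can change the number of rows that pass through it, while
a passive one cannot. A connected transformation is linked to other
transformations in the data flow, while an unconnected one is called from
within another transformation.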


Types of Transformations:

Note: To view all the available types of transformations, click on Transformations
in the toolbar.

3.1. Source Qualifier Transformation

1. Active and Connected transformation.

2. It is used to join data originating from the same
source database.

3. It filters rows when the Integration Service reads source data.

4. It can specify an outer join rather than the default inner join.

5. It can specify sorted ports.

6. It can select only distinct values from the source.

Hands_on:

Select the type of source (the source can be a database, an XML file or a flat file).

In this example the source is a database. Click on Import from Database.

Select the data source and the source table.


The table is imported as shown below and is stored in the Source folder.

Now navigate to the Mapping Designer.

Click on Mapping. A popup window is displayed; provide the mapping name
(e.g. m_*****).


Now open the Source folder and drag the required table into the Mapping Designer
window.

Each source definition gets its own Source Qualifier (SQ) as soon as it is dragged
in. Double-click the SQ to open a popup window. In the Ports tab, provide the needed
ports in the order in which your query returns the results.

Note: Not all fields may be required; in that case, delete the unwanted ports.
While linking the fields, make sure the datatypes match.


In the Properties tab, the SQL query can be generated as per the requirement.
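The effect of combining the Source Qualifier options listed above (join within one database, filter, distinct, sorted output) can be sketched outside Informatica. The example below is only an illustration using Python's built-in sqlite3 module; the tables emp and dept, their columns and the sample rows are hypothetical and are not taken from the hands-on example in this document.

```python
import sqlite3

# Hypothetical source tables; names, columns and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
    CREATE TABLE emp  (emp_id INTEGER, emp_name TEXT, dept_id INTEGER);
    INSERT INTO dept VALUES (10, 'Sales'), (20, 'Finance');
    INSERT INTO emp  VALUES (1, 'Asha', 10), (2, 'Ravi', 20),
                            (3, 'Meena', 10), (3, 'Meena', 10);
""")

# One query that joins two tables from the same database, filters rows,
# keeps only distinct rows and sorts the result.
SQ_QUERY = """
    SELECT DISTINCT e.emp_id, e.emp_name, d.dept_name
    FROM emp e
    JOIN dept d ON d.dept_id = e.dept_id
    WHERE e.dept_id = 10
    ORDER BY e.emp_id
"""

sq_rows = conn.execute(SQ_QUERY).fetchall()
print(sq_rows)
```

The duplicate row for employee 3 is returned only once because of DISTINCT, and the employee in department 20 is removed by the WHERE clause before the data leaves the database.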

3.2. Expression Transformation

Passive and Connected Transformation.

It lets you perform calculations on a row-by-row basis only.

Example: Discount on each product, concatenation of names
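The row-by-row behaviour can be sketched in plain Python. This is not Informatica expression syntax; the port names (first_name, last_name, price) and the 10% discount rate are assumptions made up for the illustration.

```python
def expression_row(row, discount_rate=0.10):
    """Return a copy of the row with two derived output ports added.

    Exactly one output row is produced per input row, which is what
    makes the transformation passive.
    """
    out = dict(row)
    out["full_name"] = f"{row['first_name']} {row['last_name']}"
    out["discount"] = round(row["price"] * discount_rate, 2)
    return out


# Hypothetical source rows.
source_rows = [
    {"first_name": "Asha", "last_name": "Rao", "price": 200.0},
    {"first_name": "Ravi", "last_name": "Kumar", "price": 50.0},
]
expr_rows = [expression_row(r) for r in source_rows]
print(expr_rows)
```

Each calculation sees only the current row, so a value that depends on several rows (such as a total) cannot be produced here and needs an aggregate instead.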

Click on the Expression Transformation icon and drag it into the Designer window.
Double-click the dragged Exp_Trns. The Ports tab holds the details of the
ports and their type (input, output or variable).


In the above example we have variables named Name and Annual_Income. Click as
shown below and write the required expressions. The Functions tab lists the
tool's built-in functions along with their syntax.

Import Target:

Navigate to the Target Designer, select Create and provide the target table name.


Add the required columns as per the output, with the appropriate datatype and
precision.

Now navigate to the Mapping Designer and drag the created target into the Designer
window. After linking all the fields, save the mapping; the output window shows the
status of the mapping.

Source Value:


Output:

To get the output, run the corresponding workflow. (Please refer to Section 4,
Workflow Creation, for details.)

3.3. Aggregate Transformation

1. Active & Connected transformation.

2. It performs aggregate calculations, such as average, sum and count, on
multiple rows or groups. Aggregate functions:
AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM,
VARIANCE.
3. Here you can perform calculations on groups.

Example:
To calculate total of daily sales / To calculate average of monthly / yearly sales.
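As a rough illustration of the daily sales example, the sketch below groups rows by a date column and computes a sum and an average per group in plain Python. The column names sale_date and amount and the sample values are hypothetical.

```python
from collections import defaultdict


def aggregate_sales(rows):
    """Group rows by sale_date and return the total and average amount per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["sale_date"]].append(row["amount"])
    return {
        day: {"total": sum(amounts), "average": sum(amounts) / len(amounts)}
        for day, amounts in groups.items()
    }


# Hypothetical source rows.
sales = [
    {"sale_date": "2024-01-01", "amount": 100.0},
    {"sale_date": "2024-01-01", "amount": 300.0},
    {"sale_date": "2024-01-02", "amount": 50.0},
]
daily = aggregate_sales(sales)
print(daily)
```

Three input rows produce two output rows, one per group. This change in row count is why the transformation is classed as active.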

Click on the Aggregate Transformation icon and drag it into the Designer window.
Double-click the dragged Aggr_Trns. The Ports tab holds the details of the
ports and their type (input, output or variable). Select the column on which the
grouping needs to be done.


Source Value:

Output:

3.4. Filter Transformation

Active and Connected transformation.

It is used to filter out rows in a mapping that do not meet the condition.

Example:
Employees who are working in Department 10
Products that fall in the price range of $500 to $1000
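The two examples can be sketched in plain Python as follows. The row layouts and values are hypothetical, and the conditions are written as Python lambdas rather than in Informatica's expression language.

```python
def filter_rows(rows, condition):
    """Keep only the rows for which condition(row) is true; all others are dropped."""
    return [row for row in rows if condition(row)]


# Hypothetical source rows.
employees = [
    {"name": "Asha", "dept": 10},
    {"name": "Ravi", "dept": 20},
    {"name": "Meena", "dept": 10},
]
products = [
    {"item": "Phone", "price": 750},
    {"item": "Cable", "price": 15},
    {"item": "Laptop", "price": 1400},
]

dept_10 = filter_rows(employees, lambda r: r["dept"] == 10)
mid_range = filter_rows(products, lambda r: 500 <= r["price"] <= 1000)
print(dept_10, mid_range)
```

Rows that fail the condition are simply discarded; nothing downstream can recover them.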

Click on the Filter Transformation icon and drag it into the Designer window.
Double-click the dragged Filtr_Trns. The Properties tab holds the filter condition.


Output:

3.5. Router Transformation

Active & Connected Transformation.

It is similar to the Filter transformation because both allow you to apply a
condition to test data.
The only difference is that the Filter transformation drops the data that does not
meet the condition, whereas the Router has an option to capture the data that does
not meet the condition and route it to a default output group.
The Router transformation is more efficient than several Filter transformations
when the same input must be tested against multiple conditions, because the
input rows are processed only once.

Example: One group for State = Michigan, one for State = California, and the
default group for all other states.
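A minimal Python sketch of the routing logic for this example is shown below. The group names NewGroup1 and NewGroup2 follow the screenshots' labels, while the function name, the row layout and the sample customers are assumptions. A row is sent to every group whose condition it meets, and to the default group only when it meets none.

```python
def route_rows(rows, groups):
    """Send each row to every group whose condition it meets, else to DEFAULT."""
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed


# Hypothetical source rows.
customers = [
    {"name": "A", "state": "Michigan"},
    {"name": "B", "state": "California"},
    {"name": "C", "state": "Ohio"},
]
routed = route_rows(customers, {
    "NewGroup1": lambda r: r["state"] == "Michigan",
    "NewGroup2": lambda r: r["state"] == "California",
})
print(routed)
```

The input list is walked once, however many groups are defined, which mirrors the reason a single Router is preferred over several Filters reading the same data.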


NewGroup1:


NewGroup2:

Default:

3.6. Sorter Transformation

Active & Connected transformation.

It is used to sort data in either ascending or descending order according to a
specified sort key.
When you create a Sorter transformation in a mapping, you specify one or
more ports as a sort key and configure each sort key port to sort in ascending
or descending order.
It can also be configured for case-sensitive sorting and to specify whether the
output rows should be distinct.
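The options described above (several sort keys, a direction per key, case sensitivity and distinct output) can be sketched in Python as follows. The function and parameter names are hypothetical, and Python's string ordering may differ from the ordering the Integration Service applies, so treat this only as an illustration of the idea.

```python
def sorter(rows, sort_keys, case_sensitive=True, distinct=False):
    """Sort dict rows by a list of (port, ascending) pairs.

    When distinct is true, rows that are identical in every port are
    reduced to one before sorting.
    """
    if distinct:
        seen, unique = set(), []
        for row in rows:
            fingerprint = tuple(sorted(row.items()))
            if fingerprint not in seen:
                seen.add(fingerprint)
                unique.append(row)
        rows = unique

    def normalise(value):
        # Fold case only for strings and only when case is to be ignored.
        if isinstance(value, str) and not case_sensitive:
            return value.lower()
        return value

    result = list(rows)
    # Python's sort is stable, so the least significant key is applied first.
    for port, ascending in reversed(sort_keys):
        result.sort(key=lambda r: normalise(r[port]), reverse=not ascending)
    return result


# Hypothetical source rows.
staff = [
    {"dept": 20, "name": "ravi"},
    {"dept": 10, "name": "Asha"},
    {"dept": 10, "name": "meena"},
    {"dept": 10, "name": "Asha"},
]
sorted_rows = sorter(staff, [("dept", True), ("name", False)],
                     case_sensitive=False, distinct=True)
print(sorted_rows)
```

Here dept is sorted ascending and name descending within each dept, and the repeated row for Asha is dropped because distinct output is requested.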

Fetching from the staging table without a query (Sorter)


3.7. Joiner Transformation:

Active & Connected

It is used to join data from two related heterogeneous sources residing in
different locations.
It can also join data from the same source.

Note: In order to join two sources, there must be at least one pair of
matching columns between the sources, and you must specify one source as
master and the other as detail.
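The master/detail idea can be sketched in Python as an inner join on one pair of matching columns. The function name, the column names (designation, hike_pct, emp_name) and the sample rows are hypothetical. Indexing the master rows first is a choice made for this sketch, not a statement about how the Integration Service is implemented.

```python
def normal_join(master_rows, detail_rows, master_key, detail_key):
    """Inner-join detail rows to master rows on one pair of matching columns."""
    # Index the master rows by their join key so each detail row is matched quickly.
    index = {}
    for m in master_rows:
        index.setdefault(m[master_key], []).append(m)

    joined = []
    for d in detail_rows:
        for m in index.get(d[detail_key], []):
            joined.append({**m, **d})
    return joined


# Hypothetical sources: a small master source and a larger detail source.
hike_by_designation = [
    {"designation": "Developer", "hike_pct": 10},
    {"designation": "Tester", "hike_pct": 8},
]
staff_list = [
    {"emp_name": "Asha", "designation": "Developer"},
    {"emp_name": "Ravi", "designation": "Project Manager"},
    {"emp_name": "Meena", "designation": "Tester"},
]
joined = normal_join(hike_by_designation, staff_list, "designation", "designation")
print(joined)
```

Detail rows without a matching master row are dropped, so the employee whose designation has no entry in the master source does not appear in the output.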

Click on the Joiner Transformation icon and drag it into the Designer window.
Double-click the dragged Jnr_Trns. The Condition tab holds the join conditions and
the Properties tab holds the join type.


Because the joined table, salary_hiked_details, does not hold information for the
Project Manager, the output does not include the PM information.


3.8. Lookup Transformation

Passive & Connected or Unconnected.

It is used to look up data in a flat file, relational table, view, or synonym.
It compares lookup transformation ports (input ports) to the source column
values based on the lookup condition. The returned values can then be passed to
other transformations.
You can create a lookup definition from a source qualifier and can also use
multiple Lookup transformations in a mapping.

For example: Emp_details holds all the existing
employee-level information.

To create emp_ids for New_Joinees, a Lookup can be used to verify the existing IDs
and generate new IDs. (A Joiner will not work here, as there is
no common key.)

Unconnected: works like a function and returns one port.

Connected: works like a procedure and can return multiple ports.
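The difference between the two modes can be sketched in Python with a dictionary standing in for the lookup table. The table contents, the lookup condition on emp_name and both function names are hypothetical; returning None on a miss stands in for a NULL result.

```python
# Hypothetical lookup table, keyed by the column used in the lookup condition.
emp_details = [
    {"emp_id": 101, "emp_name": "Asha", "designation": "Developer"},
    {"emp_id": 102, "emp_name": "Ravi", "designation": "Tester"},
]
lookup_cache = {row["emp_name"]: row for row in emp_details}


def unconnected_lookup(emp_name):
    """Called like a function from another expression; returns a single value."""
    match = lookup_cache.get(emp_name)
    return match["emp_id"] if match else None


def connected_lookup(row):
    """Sits in the data flow and adds several looked-up ports to the row."""
    match = lookup_cache.get(row["emp_name"], {})
    return {
        **row,
        "emp_id": match.get("emp_id"),
        "designation": match.get("designation"),
    }


existing_id = unconnected_lookup("Asha")
missing_id = unconnected_lookup("Zoya")
enriched = connected_lookup({"emp_name": "Ravi"})
print(existing_id, missing_id, enriched)
```

A missing ID signals a new joinee for whom an ID still has to be generated, which is the use case described above.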

Select the Lookup icon and select the lookup table from Source / Target; otherwise,
import it.

Drag the needed values from the Source Qualifier to the Lookup. Double-click the
Lookup transformation and name the fields from the Source Qualifier as In_****.


In the Condition tab, provide the required conditions.

Link the fields appropriately and save the mapping.


Source_Table_Values:

Output: (The MSI percentage has been assigned to each employee depending on
Performance and Designation)

Unconnected Lookup


Output:

3.9. Union Transformation

Active & Connected.


The Union transformation is a multiple input group transformation that you
use to merge data from multiple pipelines or pipeline branches into one
pipeline branch.
It merges data from multiple sources similar to the UNION ALL SQL statement
to combine the results from two or more SQL statements.


Similar to the UNION ALL statement, the Union transformation does not
remove duplicate rows.
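The UNION ALL behaviour can be sketched in Python by concatenating the rows of several input groups. The function name and the sample rows are hypothetical, and the output order shown here is a property of this sketch only.

```python
from itertools import chain


def union_all(*pipelines):
    """Concatenate rows from several input groups without removing duplicates."""
    return list(chain.from_iterable(pipelines))


# Hypothetical rows arriving from two pipeline branches.
branch_a = [{"id": 1}, {"id": 2}]
branch_b = [{"id": 2}, {"id": 3}]
merged = union_all(branch_a, branch_b)
print(merged)
```

The row with id 2 appears twice in the merged output. If duplicates must be removed, a later step such as a Sorter with distinct output would be needed.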

4. Workflow Creation

Navigate to Workflow Manager, click on Workflow Designer, select Workflow, then
Create, and provide the new workflow name (wf_*****).

Click on Task, then Create, provide the new session name and select the
appropriate mapping for the workflow.


Click Tasks, then Link Task, and link the session.

Double-click the session. On the General tab, select Fail Parent if the task fails.

On the Properties tab, provide the log file name and path details.


On the Mapping tab, select the DB connection details for the source and provide
the target file type and location for the output file.

Note: The SQL query can be modified at session level. If the query in the SQ and
the session differ, the job uses the session-level query.
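The precedence rule in the note can be expressed as a small Python sketch. The function name and the sample queries are hypothetical; treating a blank session-level query as "not set" is an assumption of this sketch.

```python
def effective_query(sq_query, session_query=None):
    """Return the query that would run: the session-level query wins when it is set."""
    if session_query and session_query.strip():
        return session_query
    return sq_query


# Hypothetical queries.
sq_sql = "SELECT emp_id, emp_name FROM emp"
session_sql = "SELECT emp_id, emp_name FROM emp WHERE dept_id = 10"

print(effective_query(sq_sql))
print(effective_query(sq_sql, session_sql))
```

This is worth keeping in mind when debugging: a change made to the query in the mapping has no visible effect while a different query is still set on the session.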


After making all the modifications, save the workflow.

To run the workflow, right-click the appropriate workflow and select Start Workflow.


Once the workflow is running, the Monitor window opens.

To view the reason for a failure, right-click the session and select Get Session Log.


After making the required changes, refresh the mapping and then run the
workflow again.
