
Informatica Interview Questions

What is Enterprise Data Warehousing?

When an organization's data is created at a single point of access, it is called enterprise data warehousing. The data can be provided with a global view to the server via a single source store, and one can do periodic analysis on that same source. It gives better results, but the time required is high.

What is the difference between a database, a data warehouse and a data mart?

A database includes a set of sensibly affiliated data, which is normally small in size compared to a data warehouse. A data warehouse, by contrast, contains assortments of all sorts of data, and data is extracted only according to the customer's needs. A data mart, in turn, is a set of data designed to cater to the needs of a particular domain; for instance, an organization may keep a different chunk of data for each of its departments, i.e. sales, finance, marketing, etc.

What is domain in Informatica?

When all related relationships and nodes are covered by a sole organizational point, it is called a domain. Through this, data management can be improved.

What is the difference between a repository server and a powerhouse?

The repository server controls the complete repository, which includes tables, charts, various procedures, etc. Its main function is to assure the integrity and consistency of the repository. A powerhouse server, on the other hand, governs the implementation of various processes among the factors of the server's database repository.

How many repositories can be created in Informatica?

Any number of repositories can be created in Informatica, but eventually it depends on the number of ports.

What is the benefit of partitioning a session?

Partitioning a session means separate implementation sequences within the session. Its main purpose is to improve the server's operation and efficiency. Transformations, extractions and other outputs of single partitions are carried out in parallel.

How are indexes created after completing the load process?

For the purpose of creating indexes after the load process, command tasks at the session level can be used. Index-creating scripts can be brought in line with the session's workflow or the post-session implementation sequence. Moreover, this type of index creation cannot be controlled after the load process at the transformation level.

Explain sessions. Explain how batches are used to combine executions?

A set of instructions that needs to be implemented to convert data from a source to a target is called a session. A session can be carried out using the Session Manager or the pmcmd command. Batch execution can be used to combine session executions, either in a serial or a parallel manner; batches can have different sessions carried forward in parallel or serially.

How many number of sessions can one group in batches?

One can group any number of sessions, but migration is easier when the number of sessions in a batch is smaller.

Explain the difference between mapping parameter and mapping variable?

Values that change during the session's execution are called mapping variables. Upon completion, the Informatica server stores the end value of a variable, and it is reused when the session restarts. Values that do not change during the session's execution are called mapping parameters. The mapping procedure explains mapping parameters and their usage; values are allocated to these parameters before starting the session.

What is complex mapping?

Following are the features of complex mapping.

1. Difficult requirements

2. Many numbers of transformations

3. Complex business logic

How can one identify whether a mapping is correct or not without connecting a session?

One can find out whether a mapping is correct or not, without connecting a session, with the help of the debugging option.

Can one use mapping parameter or variables created in one mapping into any other
reusable transformation?

Yes, one can, because a reusable transformation does not contain any mapplet or mapping.

Explain the use of aggregator cache file?

Aggregator transformations are handled in chunks of instructions during each run. The transformation stores transitional values, which are kept in local buffer memory. Aggregators provide extra cache files for storing the transformation values if extra memory is required.

Briefly describe lookup transformation?

Lookup transformations are those transformations which have access rights to an RDBMS-based data set. The server makes access faster by using the lookup tables to look at explicit table data or the database. The final data is obtained by matching the lookup condition for all lookup ports delivered during the transformation.

What does role playing dimension mean?

The dimensions that are utilized for playing diversified roles while remaining in the same
database domain are called role playing dimensions.

How can repository reports be accessed without SQL or other transformations?

Repository reports are established by the metadata reporter. There is no need for SQL or other transformations, since it is a web app.

What are the types of metadata that stores in repository?

The types of metadata include source definition, target definition, mappings, mapplets and transformations.

Explain the code page compatibility?

When data moves from one code page to another, provided that both code pages have the same character sets, data loss cannot occur. All the characters of the source page must be available in the target page. If all the characters of the source page are not present in the target page, it would be a subset, and data loss will definitely occur during transformation, due to the fact that the two code pages are not compatible.

How can you validate all mappings in the repository simultaneously?

All the mappings cannot be validated simultaneously, because only one mapping can be validated at a time.

Briefly explain the Aggregator transformation?

It allows one to do aggregate calculations such as sums, averages, etc. It differs from the Expression transformation in that one can do calculations on groups.

Describe Expression transformation?

Values can be calculated in a single row before writing to the target in this form of transformation. It can be used to perform non-aggregate calculations. Conditional statements can also be tested before output results go to target tables.

What do you mean by filter transformation?

It is a medium of filtering rows in a mapping. Data needs to be transformed through the filter transformation, and then the filter condition is applied. The filter transformation contains all input/output ports, and only the rows which meet the condition can pass through the filter.

What is Joiner transformation?

The Joiner transformation combines two affiliated heterogeneous sources residing in different locations, while a Source Qualifier transformation can combine data emerging from a common source.

What is Lookup transformation?

It is used for looking up data in a relational table through a mapping. The lookup definition is imported from any relational database source to which both client and server can connect. One can use multiple Lookup transformations in a mapping.

How Union Transformation is used?

It is a multiple-input-group transformation which can be used to combine data from different sources. It works like the UNION ALL statement in SQL, which is used to combine the result sets of two SELECT statements.
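As a minimal sketch (not Informatica itself), the UNION ALL semantics described above can be illustrated with plain Python lists standing in for the two SELECT result sets; note how UNION ALL keeps the duplicate row while UNION would remove it:

```python
# Two hypothetical result sets, as (name, amount) rows.
select_a = [("Smith", 100), ("Jones", 200)]
select_b = [("Smith", 100), ("Brown", 300)]

# UNION ALL: simple concatenation, duplicates preserved (what the
# Union transformation does).
union_all = select_a + select_b

# UNION: duplicates removed (NOT what the Union transformation does).
union = list(dict.fromkeys(union_all))

# union_all has 4 rows; union has only 3.
```

The Union transformation behaves like the first form: it merges the input groups row by row and performs no duplicate elimination.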

What do you mean Incremental Aggregation?

The incremental aggregation option is enabled whenever a session is created for a mapping that aggregates. PowerCenter then passes source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally.

What is the difference between a connected look up and unconnected look up?

When the inputs are taken directly from other transformations in the pipeline, it is called a connected lookup. An unconnected lookup doesn't take inputs directly from other transformations, but it can be used in any transformation and can be raised as a function using the :LKP expression. So it can be said that an unconnected lookup can be called multiple times in a mapping.
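The "called as a function" behavior of an unconnected lookup can be sketched in Python (an illustrative analogy only; `RATE_LOOKUP` and `lkp_rate` are made-up names, not Informatica objects):

```python
# Stand-in for the cached lookup table.
RATE_LOOKUP = {"USD": 1.0, "EUR": 1.1}

def lkp_rate(currency):
    """Analogue of :LKP.lkp_rate(currency): one input, one return value."""
    return RATE_LOOKUP.get(currency)  # None when no match (the default value)

row = {"buy_ccy": "EUR", "sell_ccy": "USD"}

# Like :LKP, the lookup can be invoked more than once for the same row,
# from different expressions.
buy_rate = lkp_rate(row["buy_ccy"])
sell_rate = lkp_rate(row["sell_ccy"])
```

A connected lookup, in contrast, would sit in the pipeline itself and run exactly once per input row.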

What is a mapplet?

A reusable object that is created using the Mapplet Designer is called a mapplet. It permits one to reuse transformation logic in a multitude of mappings; moreover, it contains a set of transformations.

Briefly define reusable transformation?


A reusable transformation is used numerous times in mappings. It is stored as metadata, separate from the mappings that use it. Whenever any change is made to the reusable transformation, the mappings that use it are invalidated.

What does update strategy mean, and what are the different option of it?

Informatica processes data row by row. By default, every row is marked for insert into the target table. The Update Strategy is used whenever a row has to be updated or inserted based on some condition; the condition must be specified in the Update Strategy for the processed row to be marked as updated or inserted.

What is the scenario which compels informatica server to reject files?

This happens when it faces DD_Reject in the Update Strategy transformation, or when a row violates a database constraint.

What is a surrogate key?

A surrogate key is a replacement for the natural primary key. It is a unique identifier for each row in the table. It is very beneficial because the natural primary key can change, which eventually makes updates more difficult. Surrogate keys are always in the form of a digit or integer.

What are the prerequisite tasks to achieve the session partition?

In order to perform session partitioning, one needs to configure the session to partition source data and then install the Informatica server on a machine with multiple CPUs.

Which files are created during session runs by the Informatica server?

During session runs, the files created are namely the errors log, bad file, workflow log and session log.

What is a Session task?

It is a chunk of instructions that guides the PowerCenter server about how and when to transfer data from sources to targets.

What does Command task mean?

This specific task permits one or more shell commands in UNIX, or DOS commands in Windows, to run during the workflow.

What is a Standalone Command task?

This task can be used anywhere in the workflow to run shell commands.

What is meant by Pre- and Post-session shell command?

A Command task can be called as the pre- or post-session shell command for a Session task. One can run it as a pre-session command, a post-session success command or a post-session failure command.

What is a Predefined event?

It is a file-watch event. It waits for a specific file to arrive at a specific location.

What is a user defined event?

A user-defined event can be described as a flow of tasks in the workflow. Events can be created and then raised as the need arises.

What is an Informatica workflow?

A workflow is a bunch of instructions that communicates to the server how to implement tasks.

What are the different tools in Informatica workflow manager?

Following are the different tools in the Workflow Manager, namely

1. Task Developer

2. Worklet Designer

3. Workflow Designer

Tell me any other tools for scheduling purpose other than workflow manager pmcmd?

The tool for scheduling purposes other than the workflow manager can be a third-party tool like CONTROL-M.

What is OLAP (On-Line Analytical Processing)?

A method by which multi-dimensional analysis occurs.

What are the different types of OLAP? Give an example?

ROLAP (e.g. BO), MOLAP (e.g. Cognos), HOLAP and DOLAP.

What is a Worklet?

When workflow tasks are grouped in a set, it is called a worklet. Workflow tasks include timer, decision, command, event wait, mail, session, link, assignment, control, etc.

What is the use of target designer?

The target definition is created with the help of the Target Designer.

Where can we find the throughput option in informatica?

The throughput option can be found in the Workflow Monitor. In the Workflow Monitor, right-click on the session, then click on Get Run Properties, and under Source/Target Statistics we can find the throughput option.

What is a target load order?

Target load order is specified on the basis of the source qualifiers in a mapping. If there are multiple source qualifiers linked to different targets, one can specify the order in which the Informatica server loads data into the targets.

Aggregator Transformation in Informatica

Aggregator Transformation
This is an active transformation which allows you to calculate summaries for a group of records. An Aggregator transformation is created with the following components.

Group by

This component defines the group for a specific port(s) which participates in aggregation.

Aggregate Expression

Use aggregate functions to derive the aggregate expression, which can be developed either in variable ports or in output ports only.

Sorted input

Group-by ports are sorted using a Sorter transformation, and the Aggregator receives the sorted data as input to improve the performance of data aggregation.

Place the Sorter transformation before the Aggregator transformation to perform sorting on the group-by ports.

Aggregate Cache

The Integration Service creates an aggregate cache for unsorted inputs.

The aggregate cache contains the group-by ports, the non-group-by input ports and the output ports which contain aggregate expressions.

This transformation offers even more functionality than SQL's GROUP BY statement, since one can apply conditional logic to groups within the Aggregator transformation. Many different aggregate functions can be applied to individual output ports within the transformation, and one can also code nested aggregate functions. Below is a list of these aggregate functions:

AVG
COUNT
FIRST
LAST
MAX
MEDIAN
MIN
PERCENTILE
STDDEV
SUM
VARIANCE
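What the Aggregator does for, say, SUM with a group-by port can be sketched in Python (an analogy, not Informatica code; the region/amount data is made up):

```python
from collections import defaultdict

# Source rows: (group-by port, aggregated port).
sales = [("east", 10), ("west", 5), ("east", 20)]

# Running totals, one entry per group -- roughly what the aggregate
# (data) cache holds while rows stream through.
totals = defaultdict(int)
for region, amount in sales:
    totals[region] += amount

# One output row per group: {"east": 30, "west": 5}
```

With sorted input the service could emit each group as soon as its last row arrives instead of caching all groups, which is why the Sorted Input option improves performance.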

Creating an Aggregator Transformation in Informatica


Go to the Mapping Designer, click on transformation in the toolbar -> create.
Select the Aggregator transformation, enter the name and click create. Then click Done. This
will create an aggregator transformation without ports.

To create ports, you can either drag the ports into the Aggregator transformation or create them in the Ports tab of the Aggregator.

Configuring the aggregator transformation

You can configure the following components in aggregator transformation in Informatica.

Aggregate Cache: The integration service stores the group values in the index cache and row
data in the data cache.

Aggregate Expression: You can enter expressions in the output port or variable port.

Group by Port: This tells the integration service how to create groups. You can configure input,
input/output or variable ports for the group.

Sorted Input: This option can be used to improve the session performance. You can use this option only when the input to the aggregator transformation is sorted on the group-by ports.

Informatica Nested Aggregate Functions


You can nest one aggregate function within another aggregate function. You can either use
single-level aggregate functions or multiple nested functions in an aggregate transformation. You
cannot use both single-level and nested aggregate functions in an aggregator transformation. The
Mapping designer marks the mapping as invalid if an aggregator transformation contains both
single-level and nested aggregate functions. If you want to create both single-level and nested
aggregate functions, create separate aggregate transformations.

Example: MAX(SUM(sales))

Conditional clauses

You can reduce the number of rows processed in the aggregation by specifying a conditional
clause.

Example: SUM(salary, salary > 1000)

This will include only the salaries which are greater than 1000 in the SUM calculation.
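The conditional clause above can be read as a filtered sum; an equivalent sketch in Python (illustrative data):

```python
# Salaries flowing into the Aggregator.
salaries = [500, 1500, 2500]

# SUM(salary, salary > 1000): only rows meeting the condition
# participate in the aggregation.
total = sum(s for s in salaries if s > 1000)
# total -> 4000 (the 500 row is excluded)
```

The condition filters rows out of the aggregate calculation only; the rows themselves still pass through the transformation.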

Non-aggregate functions

You can also use non-aggregate functions in the Aggregator transformation.

Example: IIF( SUM(sales) <20000, SUM(sales),0)


Note: By default, the Integration Service treats null values as NULL in aggregate functions. You
can change this by configuring the integration service.

Incremental Aggregation in Informatica


After you create a session that includes an Aggregator transformation, you can enable the session
option, Incremental Aggregation. When the Integration Service performs incremental
aggregation, it passes source data through the mapping and uses historical cache data to perform
aggregation calculations incrementally.

Rank Transformation in Informatica

What is Rank Transformation


This is an active transformation which allows you to calculate the top and bottom performers. The Rank transformation is created with the following types of ports.

Input port
Output port
Variable port (V)
Rank Port (R)

Rank Port

The port which participates in the rank calculation is known as the rank port.

Variable Port

A port which allows you to develop an expression to store data temporarily for the rank calculation is known as a variable port. Variable ports support writing expressions which are required for the rank calculation.

Set the following properties to calculate the ranks:

Top or bottom
Number of Ranks

What is Rank Index?


The Developer tool creates a RANKINDEX port for each Rank transformation. The Data Integration
Service uses the Rank Index port to store the ranking position for each row in a group. After the Rank
transformation identifies all rows that belong to a top or bottom rank, it then assigns rank index values.
If two rank values match, they receive the same value in the rank index and the transformation skips the
next value.
The RANKINDEX is an output port only. You can pass the rank index to another transformation in the
mapping or directly to a target.
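The tie-handling rule described above (equal rank values share an index and the next index is skipped, like SQL's RANK()) can be sketched in Python with made-up values:

```python
# Rank values already sorted for a "top" ranking.
values = [900, 800, 800, 700]

rank_index = []
for i, v in enumerate(values):
    if i > 0 and v == values[i - 1]:
        # Tie: reuse the previous rank index.
        rank_index.append(rank_index[-1])
    else:
        # Position-based index, so an index is skipped after each tie.
        rank_index.append(i + 1)

# rank_index -> [1, 2, 2, 4]  (index 3 is skipped)
```

This matches the RANKINDEX behavior: the two tied rows both get 2, and the following row gets 4, not 3.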

Advanced Properties for Rank Transformation

Top/Bottom - Specifies whether you want the top or bottom ranking for a column.

Number of Ranks - Number of rows to include in the top or bottom ranking.

Case-Sensitive String Comparison - Specifies whether the Data Integration Service uses case-sensitive
string comparisons when it ranks strings. Clear this option to have the Data Integration Service ignore
case for strings.

Cache Directory - Local directory where the Data Integration Service creates the index cache files and
data cache files. Default is the CacheDir system parameter.

Rank Data Cache Size - Data cache size for the transformation. Default is Auto.

Rank Index Cache Size - Index cache size for the transformation. Default is Auto.

Tracing Level - Amount of detail that appears in the log for this transformation. You can choose terse,
normal, verbose initialization, or verbose data. Default is normal.

Defining Groups

Like the Aggregator transformation, the Rank transformation lets you group information.

Example: If you want to select the 10 most expensive items by manufacturer, you would first define a
group for each manufacturer.

Rank Transformation in Informatica with Examples


Procedure to create and configure Rank Transformation

In the Mapping Designer, open a Mapping.

Click Transformation > Create. Select Rank transformation.

Enter a name and click Done.

You will see that one port, RANKINDEX, is already there. This port stores the ranking of each
record and can be used to populate the target as well.

Add all additional ports from the source input which are going to be used in the following transformation.

Open the Ports tab and first check the Group By option for the desired column (for example, deptno in our case).
Also check the Rank (R) option for the port on which you want to do the ranking, for example salary in our case.

We can set the Group By indicator for multiple ports, but ranking can be done on a single
port only.

Go to the properties tab, select the Top/Bottom value as Top and the Number of Ranks property
as per need.
Click OK.

Connect the output ports of the Rank transformation to another transformation or target.

Lookup Transformation in Informatica

Lookup Transformation
The Lookup transformation is used to look up a source, source qualifier, or target to get the
relevant data. You can look up flat file and relational tables. The Lookup transformation in
Informatica works on similar lines as the joiner, with a few differences. For example, lookup
does not require two sources. Lookup transformations can be connected and unconnected. They
extract the data from the lookup table or file based on the lookup condition.

This is a passive transformation which allows you to perform a lookup on relational tables, flat files, synonyms and views.

When the mapping contains a Lookup transformation, the Integration Service queries the lookup data and compares it with the lookup input port values.

The Lookup transformation is created with the following types of ports:


1. Input port (I)
2. Output port (O)
3. Look up Ports (L)
4. Return Port (R)

The Lookup transformation supports horizontal merging, such as equijoin and non-equijoin.

Informatica Lookup Transformation Uses


The Lookup transformation is used to perform the following tasks:

get a related value

update slowly changing dimensions

Connected Lookups in Informatica


In a connected Lookup transformation, we connect the input and output ports to other transformations
in the mapping.

For each input row, the Integration Service queries the lookup source or cache based on the
lookup ports and the condition in the transformation.

If the transformation is uncached or uses a static cache, the Integration Service returns values
from the lookup query.

If the transformation uses a dynamic cache, the Integration Service inserts the row into the cache
when it does not find the row in the cache. When the Integration Service finds the row in the
cache, it updates the row in the cache or leaves it unchanged. It flags the row as insert, update, or
no change.

The Integration Service passes return values from the query to the next transformation. If the
transformation uses a dynamic cache, you can pass rows to a Filter or Router transformation to
filter new rows to the target.

If there is no match for the lookup condition, the Integration Service returns the default value for
all output ports.
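The dynamic-cache behavior described above (insert when the key is absent, update when present with different values, otherwise no change) can be sketched in Python; `cache` and `flag_row` are illustrative names, not Informatica objects:

```python
# Stand-in for the dynamic lookup cache, keyed on the lookup condition.
cache = {1: "Alice"}

def flag_row(key, value):
    """Return the flag the Integration Service would assign to the row."""
    if key not in cache:
        cache[key] = value          # row not found: insert into the cache
        return "insert"
    if cache[key] != value:
        cache[key] = value          # row found but changed: update the cache
        return "update"
    return "no change"              # row found and identical

flags = [flag_row(2, "Bob"), flag_row(1, "Alicia"), flag_row(1, "Alicia")]
# flags -> ['insert', 'update', 'no change']
```

Downstream, these flags are what a Filter or Router transformation would use to pass only new rows to the target.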

Informatica UnConnected Lookups


An unconnected lookup is not part of the mapping data flow.

It is neither connected to the source nor connected to the target.

An unconnected Lookup transformation can receive multiple input ports but returns a single
output port, which is designated by the return port (R).

An unconnected Lookup transformation is commonly used when the lookup is not needed for
every record.

An unconnected Lookup transformation supports writing expressions; the lookup function is
used in conditional statements.

Mapping Naming Standards in Informatica

Mapping Naming Standards:


Stage to star mappings

1. M_Account_Dim (stage to star)


2. M_Date_Dim (stage to star)
3. M_Employee_Dim (stage to star)
4. M_Market_Dim (stage to star)
5. M_Product_Dim (stage to star)
6. M_Transaction_Details_Fact (stage to star)

Date Dimension Loading:

Create transformations of type Expression and Sequence Generator.

In the Expression transformation, create the following expressions:

Port Name               Expression

Day_of_week          -> TO_DECIMAL(TO_CHAR(Full_Date, 'D'))

Date_Number_In_Month -> TO_DECIMAL(TO_CHAR(Full_Date, 'DD'))

Date_Number_In_Year  -> TO_DECIMAL(TO_CHAR(Full_Date, 'DDD'))

Week_Number_In_Month -> TO_DECIMAL(TO_CHAR(Full_Date, 'W'))

Week_Number_In_Year  -> TO_DECIMAL(TO_CHAR(Full_Date, 'WW'))

Month_Number         -> TO_DECIMAL(TO_CHAR(Full_Date, 'MM'))

Month_Name           -> TO_CHAR(Full_Date, 'MON')

Quarter_Number       -> TO_DECIMAL(TO_CHAR(Full_Date, 'Q'))

Year                 -> TO_DECIMAL(TO_CHAR(Full_Date, 'YYYY'))

Run_Week_Number      -> TO_DECIMAL(TO_CHAR(Full_Date, 'W'))
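For orientation, rough Python analogues of a few of the TO_CHAR format codes above (illustrative only; Informatica's 'D'/'W' week conventions may differ from strftime's, and the sample date is made up):

```python
from datetime import date

full_date = date(2024, 3, 15)

month_number = int(full_date.strftime("%m"))  # like TO_CHAR(Full_Date, 'MM')
day_in_year = int(full_date.strftime("%j"))   # like TO_CHAR(Full_Date, 'DDD')
year = int(full_date.strftime("%Y"))          # like TO_CHAR(Full_Date, 'YYYY')
quarter = (full_date.month - 1) // 3 + 1      # like TO_CHAR(Full_Date, 'Q')
# -> 3, 75, 2024, 1
```

Each date-dimension port is just such a derived attribute of the single Full_Date input.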

Market Dimension SCD Type 1:

In the Expression transformation, create an expression for null handling for all columns except
INS_UPD_User.

Port Name        Expression

Flag_Exception   IIF(ISNULL( ) OR

Double-click on the Router T/R and click on the Groups tab.

Port Name

Exp_Correct

Exp_Exception

In this Router T/R, copy the Exp_Exception group ports to an Expression T/R, and connect that Expression T/R's ports to the Excp_Target.

From the Router T/R, copy the Exp_Correct group ports to the Expression T/R.
From Transformation, select the Lookup transformation.

From Transformation, select the Expression.

Port Name   Expression

Insert      IIF(ISNULL( ), TRUE, FALSE)

Update      IIF(NOT ISNULL( ), TRUE, FALSE)

From Transformation, select the Router T/R.

Double-click on the Router T/R and click on the Groups tab.

Port Name

New Flag

Update Flag

From Transformation, select the Expression and Sequence Generator.

> Copy the New Flag ports to the Expression.

> Copy the Update Flag ports to the Expression and connect to the target.

Product Dimension SCD Type 1:

Note: Write an expression for NULL handling for all the ports except the following ports:

1. INS_Upd_Date
2. INS_Upd_User

Employee Dimension - SCD Type 2:

Create a mapping with the name m_Employee_Dim.

Drop the source definition T_Employee.

Drop the target definition as three instances:

Dm_T_Employee_Dim (new record to insert)

Dm_T_Employee_Dim1 (updated record to insert)

Dm_T_Employee_Dim2 (updated record to update the end date)

Drop the Exception table.

Create a transformation of type Expression to handle the null values. Create a T/R of type
Router to pass the correct data to one Expression and the exception data to another Expression T/R.

NOTE: Define the null handling on all the ports except the following ports:

Emp_Insert_Date

Emp_Update_Date

INS_Upd_User

From expression transformation expression

LOADING ACCOUNT DIMENSION: SCD TYPE 2

Note: In the Expression T/R, write an expression to handle the nulls on all ports except
RP_Details_Last_Modified_Date and RP_Details_Last_Modified_By.

From the Lookup transformation, click on Condition.

From the Properties tab: BSR.

From the Expression T/R:

New_Flag     IIF(ISNULL(Account_Key), TRUE, FALSE)

Update_Flag  IIF(NOT ISNULL(Account_Key) AND
             ISNULL(End_Date) AND
             (Client_Name != Src_Client), TRUE, FALSE)

FACT table loading:

There are three sources to load the data into fact table

1. Client order
2. Client allocation
3. Client execution

Creation of the source definition:

From the Tools menu, select Source Analyzer; from the Source menu, select Create and enter the name
stg_transaction_detail_fact.

Select database type Oracle, click Create and then Done.

Double-click the source definition and select the Columns tab.

Column name          Data type  Precision  Scale  Not null  Key type
Branch               Varchar2   4
Account_Number       Varchar2   10
Deport               Varchar2   2
Client_Flag          Varchar2   3
Counterparty_Flag    Varchar2   3
Full_Date            Date
Emp_Web_SSO_ID       Varchar2   200
Market               Varchar2   50
Product_ISIN         Varchar2   200
Client_Order_Amount  Number     20
Allocation_Amount    Number     20
Execution_Amount     Number     20
Pending_Amount       Number     20

Create a mapping with the name m_transaction_detail_fact.

Drop the source and target definitions.
Create the transformations.
Double-click the SQ transformation and select the Properties tab.

Transformation attribute   Value

SQL Query                  (enter the query to join the sources)

Click Apply, click OK.

Double-click on the Expression T/R, select the Ports tab and uncheck the output ports (except client
order amount, allocation amount and execution amount).

Create output ports to perform the trim operation:

Branch (I)

O_Branch (O)    LTRIM(RTRIM(Branch))

Similarly for all ports of this type.

Full_Date (O)   TO_CHAR(Full_Date)

Create a transformation of type Expression named exp_date_conversion.

From the Expression transformation, copy the ports to this Expression transformation.

Double-click on the Expression T/R and select the Ports tab.

Uncheck the output port for the port named O_Full_Date.

Create an output port with the name Full_Date with the following expression:

TO_DATE(TO_CHAR(Full_Date, 'MM-DD-YYYY'), 'MM-DD-YYYY')

Click Apply and click OK.

Create a Lookup T/R which performs a lookup on Date_Dim, Market_Dim, Account_Dim, Product_Dim
and Employee_Dim.

From the Lookup transformations, copy the following ports to the Expression transformation:

Date_Key, Product_Key, Account_Key, Market_Key, Employee_Key

From the Expression transformation (from the source), copy the 13 ports to the Expression T/R.

Create a transformation of type Lookup which performs a lookup on the target table
(Transaction_Detail_Fact).

From the Expression T/R, copy the following ports to the Lookup transformation:
(Branch, Account_No, Deport, Client_Flag, Counterparty_Flag, Full_Date, Market_Code, Product_ISIN,
Emp_Web_SSO_ID)

Create the transformation type expression

From source copy the entire source to expression transformation

From look up T/R copy transaction-sequence-ID

Double-click on the Expression T/R and select the Ports tab:

New_Flag (O)     IIF(ISNULL(Transaction_Sequence_ID), TRUE, FALSE)

Update_Flag (O)  IIF(NOT ISNULL(Transaction_Sequence_ID), TRUE, FALSE)

Click Apply, click OK.


Create a transformation of type Router.

From the Expression T/R, copy all ports to the Router T/R.

Double-click on the Router and select the Groups tab:

New_Flag      New_Flag = TRUE

Update_Flag   Update_Flag = TRUE

Click Apply and click OK.

Define the new-record flow:

Create transformations of type Expression, Update Strategy and Sequence Generator.

Define the update flow:

Create T/Rs of type Expression and Update Strategy.

SQL Transformation:

Create a target table with the below structure:

SQL> create table EMP_SQL (
       Empno    Number(4),
       Ename    Varchar2(15),
       Job      Varchar2(15),
       Deptno   Number(4),
       Hiredate Date,
       Sal      Number(7),
       Comm     Number(5),
       MGR      Number(4)
     );

Go to the Tools menu, click on Target Designer and import the target table EMP_SQL.

Create a mapping with the name m_emp_SQL.

Drag and drop the source EMP and the target (EMP_SQL) into the Mapping Designer.
Go to the Transformation menu, click on Create and select the T/R type SQL T/R.

Click on Create, select Query mode, DB type Oracle and click OK.

Double-click on the SQL transformation and click on SQL Ports.

Port name   Data type   P   S   I   O

INPUT PORT

Empno       Number

OUTPUT PORTS

Empno       Number
Ename       Varchar2
Job         Varchar2
Deptno      Number
Hiredate    Date
Sal         Number
Comm        Number
MGR         Number

Click on SQL Query and write the SQL query in that window, as below:

SELECT Empno, Ename, Job, Deptno, Hiredate, Sal, Comm, MGR
FROM EMP
WHERE Empno = ?Empno?

(?Empno? is the input parameter.)

Click OK.

Click Apply and click OK.

Select Empno from the SQ and link it to the SQL T/R input port.

Select all the output columns from the SQL T/R and link them to the target.

At the session level, provide the relational connection information (BSR-Reader) to the SQL T/R.

EVENT WAIT AND EVENT RAISE in Informatica

EVENT WAIT AND EVENT RAISE:

Create a workflow with the name w_event_wait_raise.
Go to the Task menu and click on Create.
Select the task type Event Wait, enter the name event_source and click Create.
Drag the sessions s10, s20, s30 to the Workflow Designer.
Double-click on the Event Wait task.
Click on the Events tab.
Enter the name of the file to watch, D:/path/filename.txt, and click OK.
Tasks and types of tasks:

Event raise task:

The Integration Service raises the user-defined events during the workflow run.

The user-defined events are declared in the workflow properties Events tab.
The Event Raise task is always used in conjunction with an Event Wait task.
The Event Raise task sends a signal to the Event Wait task that a particular set of predetermined
events has occurred.
A user-defined event is defined as the completion of tasks from the Start task to the Event Raise task.

Event wait task:

An Event Wait task waits for a specific event to occur before starting the next task in the workflow.

It pauses the processing of the pipeline until a specified event occurs.

Two types of events can be defined:

1. File-watch event (predefined event)

2. User-defined event
Procedure:

From the Tools menu, select Workflow Designer.

From the Workflow menu, select Create.
Enter the workflow name w_star_load.
Select the Events tab.
From the toolbar, click on New.
Enter the event name dim-load-complete and click OK.
Drop the sessions beside the workflow.
From the Workflow Designer, select Task, then select Event Raise and Event Wait.
Create the task types Event Raise and Event Wait.

Make links between the tasks as described in the workflow diagram.

Double-click on the Event Raise task and select the Properties tab:

Attribute            Value
User defined event   dim-load-complete

From the workflow designer select tasks, then command and event wait
Double click on the command task and select the commands tab

From tool bar click on add a new command

Name Command
Success Copy e:/result.txt to c:/ batch 4pm

Click on apply and click ok

Double click on the event wait task, select the events tab

Select the predefined event

Enter the name of the file to watch

C:\batch 4pm \result.txt

Click on apply and click on ok

Assignment task:

It allows you to assign values (or) expressions to the user defined workflow
variables

The user defined workflow variables are declared in the workflow properties variables tab

Decision task:

The decision task allows you to define a condition; the condition is evaluated by the
integration service and returns true (or) false.

The decision task has a predefined condition variable called $decision-task-name.condition,


which is assigned the result of the decision condition

Use the decision task instead of using multiple link conditions
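The condition used in the procedure below, MOD($$wkf-runs, 7) = 0, can be sketched in Python (illustrative only; the function name is an assumption):

```python
def decision_condition(wkf_runs):
    """Evaluate MOD($$wkf_runs, 7) = 0, the predefined condition the
    integration service computes and stores in $Decision.condition."""
    return wkf_runs % 7 == 0

# The link condition $Decision.condition = true then routes the flow:
# every 7th run takes the weekly branch, the other runs skip it.
print([decision_condition(n) for n in range(1, 8)])
```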


Procedure:

From the tools menu select workflow designer; from the workflow menu select create
Enter the workflow name w-daily-weekly-wad
Select the variables tab; from the tool bar click on add a new variable

Name Data type Persistent


$$wkf-runs Integer

Enter the default value = 0

Click ok

Drop the session beside the workflow


Create the task types assignment and decision
Make links between the tasks as described above
Define the link condition between S10 and the assignment task: $S10.status
= succeeded
Double click on the assignment task, select the expressions tab
From the tool bar click on add a new expression

USER DEFINED VARIABLE OPERATOR EXPRESSION


$$wkf-runs = $$wkf-runs + 1

Click on apply and click ok


Define the link condition between assignment and decision:

$assign.status = succeeded

Double click on the decision task, select the properties tab

Attribute Value

Decision name MOD($$wkf-runs, 7) = 0

Click on apply and click ok

Define the link condition between the decision task and S20:

$Decision.condition = true

To view the persistent value, right click on the workflow and select View Persistent Values

ETL Project Architecture in Informatica

ETL Project Architecture:


There are two stages defined in the current project architecture

1. ETL Stage1
2. ETL Stage2

Data Profiling: (ODS)

It is the process of studying and analyzing the source data. We can detect records with Null values,
duplicate records, inconsistent data and data definitions.
Sequence Generator Key Points:

1) Start Value First value

2) Increment By Amount to increment at each iteration

3) End Value Highest value to use

4) Cycle If checked, the sequence generator returns to the start value when the end value is reached;
otherwise it stops.

5) Number of Cached Values Enables storing multiple values when the same sequence
generator is used in multiple sessions simultaneously

6) Reset If checked, each session returns to the start value; otherwise each new session continues
from the last stored value.

7) Tracing Level Level of detail to be written to session logs
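How Start Value, Increment By, End Value and Cycle interact can be sketched as a generator (a conceptual illustration, not the transformation's implementation):

```python
def sequence_generator(start=1, increment=1, end=10, cycle=False):
    """Yield surrogate-key values the way the Sequence Generator's
    Start Value / Increment By / End Value / Cycle properties interact."""
    value = start
    while True:
        yield value
        value += increment
        if value > end:
            if cycle:
                value = start   # Cycle checked: wrap back to the start value
            else:
                return          # Cycle unchecked: stop past the end value

gen = sequence_generator(start=1, increment=1, end=3, cycle=True)
print([next(gen) for _ in range(5)])   # [1, 2, 3, 1, 2]
```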


Design a Mapping to perform Round robin Loading:
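Round robin loading simply deals rows out to partitions in turn, ignoring the row contents. A minimal sketch (the function name and row values are illustrative):

```python
def round_robin(rows, n_partitions):
    """Distribute rows evenly across partitions in turn, which is what
    round-robin partitioning does regardless of each row's contents."""
    partitions = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        partitions[i % n_partitions].append(row)
    return partitions

print(round_robin(list(range(7)), 3))   # [[0, 3, 6], [1, 4], [2, 5]]
```

Because every partition receives nearly the same number of rows, this scheme balances load without needing a partition key.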

Scenario 2:

Workflow Scenario:

A workflow has 5 sessions; run the first 4 sessions in parallel. If all four sessions are successful,
then execute the 5th session.

Hint: Use a link condition

Dynamic LOOKUP Cache:

The Integration Service inserts records into and updates records in the cache. Use a dynamic
LOOKUP cache for eliminating duplicates (or) for implementing slowly changing dimensions
Type 1.

The dynamic LOOKUP cache is used when you perform a LOOKUP on the target table.

The dynamic LOOKUP transformation allows for the synchronization of the dynamic target
LOOKUP table image in memory with its physical table in the database.

NewLookupRow:

NewLookupRow Description

0 The integration service does not update (or) insert the row in the cache

1 The integration service inserts the row into the cache


2 The integration service updates the row in the cache

Key Points:

The LOOKUP transformation's associated port matches a LOOKUP input port with the
corresponding port in the LOOKUP cache.

The Ignore NULL inputs for updates option should be checked for any port where NULLs should be ignored.

The Ignore in comparison option should be checked for any port that is not to be compared.

The flag NewLookupRow indicates the type of row manipulation of the cache. If an input
row creates an insert in the LOOKUP cache, the flag is set to 1.

If an input row creates an update of the LOOKUP cache, the flag is set to
2. If no change is detected, the flag is set to 0. A filter or router T/R can be used with an
update strategy to set the proper row flag to update a target table.
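The NewLookupRow semantics above can be sketched as follows (a conceptual model only; the cache structure and function name are assumptions, not PowerCenter internals):

```python
def process_row(cache, row, key, compare_cols):
    """Return the NewLookupRow flag (0/1/2) and maintain the cache the
    way a dynamic lookup with 'insert else update' behaves."""
    k = row[key]
    if k not in cache:
        cache[k] = dict(row)
        return 1                      # row inserted into the cache
    cached = cache[k]
    if any(cached[c] != row[c] for c in compare_cols):
        cached.update(row)
        return 2                      # row updated in the cache
    return 0                          # no change detected

cache = {}
print(process_row(cache, {"empno": 1, "sal": 800}, "empno", ["sal"]))   # 1
print(process_row(cache, {"empno": 1, "sal": 800}, "empno", ["sal"]))   # 0
print(process_row(cache, {"empno": 1, "sal": 900}, "empno", ["sal"]))   # 2
```

A downstream router then sends flag 1 rows through a DD_INSERT strategy and flag 2 rows through a DD_UPDATE strategy, as the procedure below does.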

Procedure:

Source definition emp

Create a target definition with the name emp_DC

Empkey + all 8 columns


Create a mapping with the name m-employee-dynamic-cache

Drop the source definition emp


Drop the target definition as two instances
Create the transformation type LOOKUP which performs a LOOKUP on the target table (emp_DC)

From SQ-emp copy all the ports to the LOOKUP transformation

Double click on the LOOKUP transformation select the condition tab

LOOKUP table Column Operator Transformation port

emp no = EmpNO

Select the properties tab

Transformation attribute Value

Connection information BSR Writer

Dynamic look up cache

Insert else update

Select the ports tab

For a port Name empkey select the data type Integer

Port Name Associated Port

Ename Ename1

Job Job1

MGR MGR1

Hiredate Hiredate1

sal sal1

comm comm1
Deptno Deptno1

Click apply and click ok

Create the transformation type router

From the LOOKUP transformation copy the following ports to the router transformation (NewLookupRow, empkey, empno, ..deptno)

Double click the router transformation select the groups tab

Group Name Group Filter Condition

New NewLookupRow = 1

Update NewLookupRow = 2

Click apply and click Ok

New Record Flow:

Create the transformation type update strategy

From the New output group copy the ports [except NewLookupRow] to the update strategy and develop
the expression DD_INSERT.

From the update strategy transformation connect the ports to the target.

Update Flow:

Create the transformation type update strategy

From the Update output group copy the ports to the update strategy transformation and develop the
following expression: DD_UPDATE

From the update strategy transformation connect the ports to the target

UnConnected Stored Procedure Drop and Create Index:

Create the following two stored procedures in the target database account.

Create or Replace procedure Emp_Create_Index
(
V_Table_Name in varchar2,
V_Index_Col_Name in varchar2,
V_Index_Name in varchar2
)
as
begin

Execute Immediate
'Create Index ' || V_Index_Name || ' on ' || V_Table_Name ||
'(' || V_Index_Col_Name || ')';

end;
/

Procedure 2:

Create or Replace procedure Emp_Drop_Index
(
V_Index_Name in varchar2
)
as
begin

Execute Immediate
'Drop Index ' || V_Index_Name;

end;
/

Set Up Target Table Definition With Index:

SQL> Create table emp_TGT as select * from scott.emp where 1=2;

Table created

SQL> Select INDEX_NAME from USER_INDEXES where TABLE_NAME = 'EMP_TGT';

SQL> Create Index EMP_IDX on EMP_TGT(EMPNO);

Mapping Design:

Create a source definition with the name emp

Create a target definition with the name emp-TGT [using the target designer tool]

Create a mapping with the name m_drop_create_Index

Drop the source and target definitions

From the source qualifier connect the ports to the target

Create two stored procedure transformations with the names

SP_Drop_Index and SP_Create_Index:

Double click on the stored procedure transformation with the name

SP_Drop_Index

Select the properties tab

Transformation Attribute Value

Stored procedure type Target Pre-Load


Call text emp_drop_index('EMP_IDX')

Connection information BSR_Writer

Click Apply and click Ok

Double click on the stored procedure transformation with the name SP_Create_Index

Select the properties tab

Transformation Attribute Value

Stored procedure type Target Post-Load

Call text emp_create_index('EMP_TGT', 'EMPNO', 'EMP_IDX')

Connection information BSR_Writer

Click Apply and Click Ok

UnConnected Stored Procedure Returning Multiple Output Ports:

Execute the following stored procedure in the source database account (username scott)

SQL> Create or Replace procedure Emp_Proc_Var
(
V_EmpNo IN Number,
TOTALSAL OUT Number,
TAX OUT Number,
HRA OUT Number
)
as

begin

SELECT SAL + NVL(COMM, 0), SAL * 0.1, SAL * 0.4

INTO

TOTALSAL, TAX, HRA

FROM EMP WHERE EMPNO = V_EMPNO;

END;
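The three OUT values the procedure computes can be checked with a quick sketch (the function name is an assumption; comm may be None, mirroring NVL(COMM, 0)):

```python
def emp_proc_var(sal, comm):
    """Compute the three OUT parameters: total salary, tax (10% of sal)
    and HRA (40% of sal); a missing commission counts as 0."""
    totalsal = sal + (comm or 0)   # NVL(COMM, 0)
    tax = sal * 0.1
    hra = sal * 0.4
    return totalsal, tax, hra

print(emp_proc_var(1000, None))   # (1000, 100.0, 400.0)
print(emp_proc_var(1600, 300))    # (1900, 160.0, 640.0)
```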

Create a source definition with the name emp

Create a target definition with the name emp_prc_var

EmpNo, EName, Job, Sal, TOTALSAL, TAX, HRA, Deptno

Create a mapping with the name m_prc_var

Drop the source and target definitions

Create the transformation types stored procedure and expression

From the source qualifier copy the required ports to the expression transformation

Double click on the expression transformation, select the ports tab

Port Name Data type Precision Scale I O V Expression

V_TAX Decimal 7 2 V

V_HRA Decimal 7 2 V

TOTALSAL Decimal 7 2 O :SP.PROC_VAR(EmpNo, PROC_RESULT, V_TAX, V_HRA)

TAX Decimal 7 2 O V_TAX

HRA Decimal 7 2 O V_HRA

Click apply and click ok


From the expression transformation connect the ports to the targets.

From the repository menu click on save

SHORTCUTS:

> A shortcut is a reusable component

> To create a shortcut, first we have to create a sharable folder

> Open the destination folder

> Drag the object (source, target, mapping)

> from the shared folder into the destination folder

i.e., source analyzer and mapping designer

> You can create a shortcut to a shared folder in the same repository

> When you create a shortcut in the same repository it is known as a local shortcut.

Creation of Sharable Folder:

Open the client repository manager; from the folder menu select create; enter the folder name
AXEDW (any name)

Select allow shortcut and click ok

Open the client power center designer

Activate the sharable folder (AXEDW)

Create a source definition with the name emp in the source analyzer tool

Activate the destination folder where we want the shortcut

From the sharable folder drag the source definition emp, drop it on the source analyzer workspace which
belongs to the destination folder, and click on Yes
Version Control:

> By using version control we maintain the history of the metadata objects.

> A versioned repository stores multiple versions of an object

> Each version is a separate object with a unique number.

> You can perform the following change management tasks to create and manage multiple
versions of objects in the repository.

1. Check in
2. Check out

Check in:

You must save an object before you can check it in.


When you check in an object the repository creates a new version of the object and assigns it a
version number.
The repository increments the version number when you check in an object.

Check out:

To edit an object you must check out the object.


When you check out an object the repository obtains a write-intent lock on the object.
No other users can edit the object while you have it checked out.

Check in:

Select the mapping m_customer_dimension, right click on that mapping, select versioning
and click on check in

Name:Date:Goal

Click on Apply All
Check out:

Select the mapping m_customer_dimension, right click on that mapping, select versioning
and click on check out

Comments

Click on Apply to All

From the versions menu select the two versions of the mapping, right click on compare, select the
selected versions

Click on save file, enter the file name 4pm_compare, click on save

Partition Points Overview:

> Partition points mark the boundaries between threads in a pipeline; the integration service
redistributes rows of data at partition points.

> You can edit partition points to increase the number of transformation threads and increase
session performance.

Types of Partition:

1. Key range partition


2. Pass through partition
3. Round robin partition
4. Hash partition
5. Database partition

1) Key Range Partition:

> With key range partitioning, the integration service distributes rows of data based on a port that
you define as the partition key

> For each port you define a range of values.


> The integration service uses the key and ranges to send rows to the appropriate partition.
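Key range routing can be sketched as follows (a conceptual illustration; the SAL boundaries and function name are assumptions, matching the SAL-keyed procedure below):

```python
def key_range_partition(rows, key, ranges):
    """Send each row to the partition whose [start, end) range contains
    its partition-key value."""
    partitions = [[] for _ in ranges]
    for row in rows:
        for i, (start, end) in enumerate(ranges):
            if start <= row[key] < end:
                partitions[i].append(row)
                break   # a row lands in exactly one partition
    return partitions

rows = [{"sal": 800}, {"sal": 1600}, {"sal": 3000}]
print(key_range_partition(rows, "sal",
                          [(0, 1000), (1000, 2000), (2000, 5000)]))
```

Unlike round robin, the distribution here depends entirely on the data: skewed key values produce skewed partitions.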

(1) How many repositories are created in real time?

Ans: minimum 3, maximum 5

1. Development
2. Testing (QA)
3. Pre-Production
4. Production

Procedure:

> Create a mapping with the name m_emp_partition

> Drag and drop the source and target emp_partition to the mapping designer

> Copy all the columns to SQ_emp and connect to the target

> Double click on the session, click on the mapping tab

> From the left pane click on the partitions tab (below)

> Select the source qualifier (SQ_emp), click on edit partition point

> Select the partition type key range

Name Destination

Partition 1

Partition 2

Partition 3

Click on Ok

Click on edit keys, select the SAL column, click on Add, click Ok

Click on the target emp_partition, click on edit partition point

Select the partition type key range, click on Ok

Select the SAL column, click on Add


Click Ok, save to the repository

Business Requirements:

Scenario 1:

Client Name BSR(customer Name)

MARKET HYD

Product IBMshare

Client order 5000

Account RAM(A person from Bank of America)

Client allocation 500

Client execution 500

Pending Amount 0

Scenario 2:

Client Name

MARKET HYD

Product IBMshare

Client order 1000

Account RAM(A person from Bank of America)

Client allocation 1000

Client execution 700

Pending Amount 300


Scenario 3:

Multiple client order and single market order

Scenario 4:

Designing Database from Requirements:

The datamart is designed with the following dimension tables and fact tables.

DM_T_DATE_DIM * ERwin is a data modeling tool

DM_T_MARKET_DIM * Toad is a database tool

DM_T_PRODUCT_DIM

DM_T_ACCOUNT_DIM

DM_T_EMPLOYEE_DIM

TRANSACTION_DETAIL_FACT

ETLStage1 Implementation:

The source is defined with flatfiles; the following are the delimited flatfiles which provide
the data for extraction.

Accounts.Txt

Market.Txt

Product.Txt
Client_Allocation.Txt

Client_Execution.Txt

Client_Order.Txt

Employee.Txt

Defining Staging Database:

Define the staging database as a target to perform the data profiling.

Create the following tables in the staging database account

List of Table:

T_Product

T_Account

T_Employee

T_Market

Client_Order

Client_Allocation

Client_Execution

T_date

ETL Stage 1 Mappings:

Design the simple pass mappings which migrates the data from source to staging.

MT_stg_Account_Flatfile_Ora(source to stage)

MT_stg_Client_Allocation_Flatfile_Ora(source to stage)
MT_stg_Client_Execution_Flatfile_Ora(source to stage)

MT_stg_Client_Order_Flatfile_Ora(source to stage)

MT_stg_Employee_Flatfile_Ora(source to stage)

MT_stg_Market_Flatfile_Ora(source to stage)

MT_stg_Product_Flatfile_Ora(source to stage)

Defining Source Metadata:

Log on to oracle with the user account system

> Create a user account with the name SRC

> Execute the following source table structures

Product_SRC

Account_SRC

Employee_SRC

Market_SRC

Client_Order_SRC

Client_Allocation_SRC

Client_Execution_SRC

Create an ODBC connection with the name SRC_odbc

Import the source table definitions from the user account SRC

Convert the table definitions to flatfiles

Defining Staging Metadata:

Log on to oracle with the user account system


Create a user account stg

Execute the following table structures

T_Product

T_Account

T_Employee

T_Market

Client_Order

Client_Allocation

Client_Execution

T_date

Create the following stored procedure to populate the data into the stage table T_date

SQL> Create or Replace procedure date_stg_proc

as

v_start_date date := to_date('01-01-2007', 'dd-mm-yyyy');

begin

for i in 1 .. 2000

Loop

insert into t_date values (v_start_date);

v_start_date := v_start_date + 1;

end Loop;

end;
/

Execute the above stored procedure in the staging database

SQL> Exec date_stg_proc;


SQL> commit;

ETL Stage2 Implementation:

Source System:

Define the staging database source system with the following tables

T_Account

T_Product

T_Market

T_Employee

Client_Order

Client_Execution

T_Date

Defining The Target System:

Log on to oracle with the user account system

SQL> Create user CDM identified by CDM

SQL> grant DBA to CDM

SQL> connect CDM/CDM

Execute the following tables

DM_T_Account_DIM

DM_T_Account_DIM_EXCEP

DM_T_DATE_DIM

DM_T_EMPLOYEE_DIM
DM_T_EMPLOYEE_DIM_EXCEP

DM_T_Product_DIM

DM_T_Product_DIM_EXCEP

DM_T_MARKET_DIM

DM_T_MARKET_DIM_EXCEP

TRANSACTION_DETAIL_FACT

Create the ODBC connection with the name CDM_ODBC

Import the target table definitions from the CDM user account

> For the product and exception product tables, make the change to the product key and excep key
and generate the SQL.

Designing a mapping with SCD implementation in Informatica

Design a mapping with SCD Type 1 implementation:

A Type 1 dimension keeps only current data in the target. It doesn't keep any history.
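The Type 1 overwrite behavior can be sketched before walking through the mapping (a conceptual model only; the dict-based dimension and function name are assumptions):

```python
def scd_type1_load(dimension, source_rows, key):
    """Overwrite-in-place load: insert rows whose key is new, overwrite
    rows whose key already exists -- no history is kept."""
    for row in source_rows:
        dimension[row[key]] = dict(row)   # insert and update both overwrite
    return dimension

dim = {}
scd_type1_load(dim, [{"empno": 10, "sal": 5000}], "empno")
scd_type1_load(dim, [{"empno": 10, "sal": 16000}], "empno")
print(dim)   # only the current sal survives; the 5000 value is gone
```

The mapping below implements exactly this split: a lookup decides new vs existing, then DD_INSERT and DD_UPDATE strategies apply the two branches.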

Procedure:

Create a source definition with the name EMP

Create a target definition with the name EMP_DIM_Type1 (Emp key, Emp no, ename, job, sal)

Note:

Emp key should be the primary key

Create a mapping with the name M_EMPLOYEE_DIM_Type1 and drop the source definition
Emp

Drop the target definition as 2 instances


1. EMP_DIM_Type1 (insert)
2. EMP_DIM_Type1 (update)

From the transformation menu select create

Select the transformation type lookup

Enter the name LKP_TRG, click on create

From the targets select the table EMP_DIM_Type1, click ok

Click done

From the source qualifier (SQ_EMP) copy the port empno to the lookup transformation.

Double click on the lookup transformation, select the condition tab, and from the tool bar click on add a
new condition.

Lookup table column Operator Transformation Port

Empno = Empno1

Click apply, click ok

Create the transformation type expression (Exp_src_lookup)

From SQ_EMP copy the following ports to the expression transformation: empno, ename, job, sal

From the lookup transformation copy the port Emp key to the expression transformation.

Double click on the expression transformation, select the ports tab

Port Name Data type Precision IOV Expression

New_Flag String 10 O IIF(ISNULL(Emp key), 'True', 'False')

Update_Flag String 10 O IIF(NOT ISNULL(Emp key), 'True', 'False')

Create the transformation type router


From the expression transformation copy the ports to the router transformation, double click on the
router transformation, select the groups tab

Group Name Group Filter Condition

New_Record New_Flag = 'True'

Update_Record Update_Flag = 'True'

Click apply and click ok

Defining new records data flow:

Create the transformation types expression, update strategy and sequence generator

From the router transformation, from the new record output group, copy the ports to the expression
transformation (emp no, ename, job, sal); from the expression transformation copy the ports to the
update strategy transformation.

Double click on the update strategy transformation, select the properties tab

Transformation Attribute Value

Update strategy expression DD_INSERT (or) 0

Click apply and click ok

From the update strategy transformation connect the ports to the target; from the sequence generator
transformation connect the NEXTVAL port to the EMP key of the target table.

Defining update record data flow:

Create the transformation types expression and update strategy; from the router transformation, from the
update record output group, copy the following ports to the expression transformation (Emp key,
emp no, ename, job, sal)
From the expression transformation copy the ports to the update strategy transformation.

Double click on the update strategy transformation, select the properties tab

Transformation Attribute Value

Update strategy expression DD_UPDATE

From the update strategy transformation connect the ports to the target; from the repository menu click
on save.

Create a session with the name S_M_EMPLOYEE_DIM_Type1, double click the session and select
the properties tab

Attribute Value

$Source connection value Batch 4pm source reader

$Target connection value Batch 4pm target writer

Select the mapping tab, set the reader and writer connections, click apply and click ok

SQL> update emp set sal = sal + 11000 where empno = 10;

SQL> commit;

Design a mapping with SCD Type 2 implementation:

A Type 2 dimension stores complete history in the target; for each update it inserts a new record in
the target.

There are 3 different methods to maintain the history

1. Keep the version number in a separate column


2. Mark the current dimension record with a flag
3. Use start date and end date [date range]
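The version-number method (method 1, which the procedure below implements) can be sketched as follows (a conceptual model only; the list-based dimension and function name are assumptions):

```python
def scd_type2_load(dimension, source_rows, key):
    """Version-number method: each change inserts a new row with the
    version incremented; old rows are kept, so full history survives."""
    for row in source_rows:
        existing = [r for r in dimension if r[key] == row[key]]
        version = max((r["version"] for r in existing), default=0) + 1
        dimension.append(dict(row, version=version))
    return dimension

dim = []
scd_type2_load(dim, [{"empno": 10, "sal": 5000}], "empno")
scd_type2_load(dim, [{"empno": 10, "sal": 16000}], "empno")
print(dim)   # both rows survive, with version 1 and version 2
```

Contrast this with Type 1 above, where the second load would have destroyed the sal = 5000 row.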
Procedure:

Create a source definition with the name EMP

Create a target definition with the name EMP_type2 (emp key, emp no, ename, sal, job, deptno,
version)

Create a mapping with the name M_EMP_DIM_TYPE2

Drop the source definition

Drop the target definition as 2 instances

1. EMP_type2 (new record to insert)


2. EMP_type2 (update record as insert)

From the transformation menu select create, select the transformation type lookup, enter the name
LKP_TRG, click on create

From the targets select the table EMP_type2, click ok, click on done

From SQ_EMP copy the port EMPNO to the lookup transformation

Double click on the lookup transformation, select the condition tab, and from the tool bar click on add
a new condition.

Lookup table column Operator Transformation port

Empno = Empno1

Click apply and click ok

Create the transformation type expression; from the source qualifier copy the following ports to the
expression transformation

From the lookup transformation copy the following ports to the expression transformation (Emp


key, sal, version)

Double click on the expression T/R, select the ports tab


Port Name Data type Precision Scale IOV Expression

New_Flag String 10 O IIF(ISNULL(Emp key), 'True', 'False')

Update_Flag String 10 O IIF(NOT ISNULL(Emp key) AND Sal != Sal1, 'True', 'False')

Click apply and click ok

Create the transformation type router; from the expression T/R copy the following ports to the router
transformation (EMPno, Ename, job, sal, deptno, version, New_Flag, Update_Flag)

Double click the router T/R, select the groups tab

Group Name Group filter condition

New record New_Flag = 'True'

Update record Update_Flag = 'True'

Click apply and click ok

Defining new record dataflow:

Create the transformation types expression, update strategy and sequence generator

From the router T/R, from the new record output group, copy the following ports to the expression
transformation: emp no, ename, job, sal, deptno

Double click on the expression T/R, select the ports tab

Port Name Data type Precision IOV Expression

Version Decimal 5 O 0

Click apply, click ok


From the expression T/R copy the ports to the update strategy T/R

Double click on the update strategy T/R, select properties

Transformation Attribute Value

Update strategy expression 0 (or) DD_INSERT

Click apply, click ok

From the update strategy T/R connect the ports to the target

From the sequence generator T/R connect the NEXTVAL port to the EMP key of the target table

Defining the update records as insert:

Create the transformation types expression and update strategy; from the router T/R, from the update
record output group, copy the following ports to the expression T/R

(EMPno, Ename, job, sal, deptno, version)

Double click on the expression T/R, select the ports tab

Uncheck the output port for the port named version

From the tool bar click on add a new port

Port Name Data type Precision IOV Expression

Version Decimal 5 O Version + 1

Click apply and click ok

From the expression T/R copy the ports to the update strategy T/R and develop the following update
strategy expression: DD_INSERT

From the update strategy T/R connect the ports to the target


From the sequence generator T/R connect the NEXTVAL port to the EMP key of the target table.

Lookup SQL Override:

Double click on the lookup T/R, select the properties tab

T/R Attribute Value

Lookup SQL Override:

Select

Out.Empkey as Empkey,

Out.Ename as Ename,

Out.Job as Job,

Out.Sal as Sal,

Out.Deptno as Deptno,

Out.Version as Version,

Out.Empno as Empno

From Emp_type2 Out

Where Out.Empkey = (select max(Inn.Empkey)

From Emp_type2 Inn

Where Inn.Empno = Out.Empno)
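The intent of that correlated subquery is to make the lookup see only the latest version of each employee. A sketch of the same selection (the function name and row values are illustrative):

```python
def latest_per_empno(rows):
    """Keep only the row with the highest Empkey for each empno, matching
    the correlated MAX(Empkey) subquery in the lookup SQL override."""
    best = {}
    for row in rows:
        current = best.get(row["empno"])
        if current is None or row["empkey"] > current["empkey"]:
            best[row["empno"]] = row
    return list(best.values())

rows = [{"empno": 10, "empkey": 1, "sal": 5000},
        {"empno": 10, "empkey": 4, "sal": 16000}]
print(latest_per_empno(rows))   # only the empkey = 4 row survives
```

Without this override, the lookup could match an old historical row and misclassify a change as already applied.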