
TCS

How do you pull records on a daily basis into your ETL server?
Answer: By running incremental/delta/CDC (change data capture) loads. There are several ways to implement this; some of them are: 1) using a mapping-level variable (e.g. a last-run timestamp) 2) using a control table, and so on.
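A minimal sketch of the control-table approach, assuming a hypothetical ETL_CONTROL table and a LAST_UPDATED column on the source:

-- Pull only rows changed since the last successful run (names are illustrative).
SELECT o.*
FROM   orders o
WHERE  o.last_updated > (SELECT c.last_extract_ts
                         FROM   etl_control c
                         WHERE  c.source_name = 'ORDERS');

-- After a successful load, advance the watermark.
UPDATE etl_control
SET    last_extract_ts = SYSDATE
WHERE  source_name = 'ORDERS';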
How do you join tables if the source has 15 tables and the target is one?
Answer: If the sources are flat files with the same structure, you can go for the indirect file type. If the sources are relational tables, you can use a Source Qualifier SQL override or join condition, or Joiner transformations (n-1 joiners for n sources).
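A minimal sketch of the Source Qualifier SQL-override approach, assuming hypothetical EMP/DEPT/LOC tables that share keys; with 15 tables the override simply grows more joins:

SELECT e.empno, e.ename, d.dname, l.city
FROM   emp  e
JOIN   dept d ON d.deptno = e.deptno
JOIN   loc  l ON l.locid  = d.locid;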
A flat file has 1 lakh records. What happens if I want to convert it to an Excel file? (An Excel 2003 sheet holds only 65,536 rows, but the flat file has one lakh rows.) How do I get one lakh rows into an Excel sheet?
Answer: If you want to move flat-file data toward Excel, save it in .CSV (comma separated values) format. CSV itself has no fixed row limit, and newer Excel versions (.xlsx) support 1,048,576 rows (about 10 lakh) per sheet.
MINDTREE
I have a workflow that I want to run 3 times in a day, every 3 hours. How can you schedule that?
In the scheduler we have an option called Customized Repeat. Select the Day(s) option under "Repeat every"; you will then see the Daily Frequency options. Select the "Run every" option and give the gap in hours between runs of the workflow, then select "End after (3 runs)" from the End options in the main Scheduler tab.
What is a Mapplet? What is logic?
We can create any number of mapplets for one mapping; there is no limit on mapplets. Every mapplet can contain one or more pieces of reusable transformation logic, and there is no limit on the logic either.
KPIT
When we develop a project, what performance issues can arise?
KPIT 2:
If a table has an INDEX and a CONSTRAINT, why do they raise a performance issue?
(Because when we drop the index and disable the constraint, the load performs better.)
KPIT 3: Which Unix commands are frequently used in Informatica?
Performance issues in Informatica:
1) In any project the final goal is to load the data into the target table efficiently, in as little time as possible. So: tune the mapping, use fewer active transformations, use the best loading option, partition the data, and keep the target table free of indexes during the load.
2) Yes, dropping the index and disabling the constraints makes the load perform better, because the database then writes only to the table itself; otherwise it must also maintain the index and validate parent/child (foreign key) relationships and constraint conditions for every row. After loading the data you can rebuild the indexes and re-enable the constraints on the table.
3) sed commands, awk commands, directory commands, file commands, copy commands.
By seeing the parameter file, how do you identify whether a parameter is a workflow parameter or a mapping parameter?
A mapping parameter starts with $$ and a workflow parameter starts with $.
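A minimal parameter-file sketch under those conventions (the folder, workflow, session and parameter names are hypothetical; $PMSessionLogDir is a built-in server variable):

[MyFolder.WF:wf_daily_load.ST:s_m_load_orders]
$$LastRunDate=2010-01-01
$PMSessionLogDir=/infa/logs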
What is the query to find the nth highest salary? What is the use of cursors?
There are three ways to find the nth highest salary in a given table (e.g. emp):
1) select distinct sal from emp e1 where &n = (select count(distinct sal) from emp e2 where e1.sal <= e2.sal);
2) select * from (select empno, ename, sal, deptno, rank() over (order by sal desc) as ra from emp) where ra = &n;
3) select * from (select empno, ename, sal, deptno, dense_rank() over (order by sal desc) as ra from emp) where ra = &n;
(Note that the analytic alias ra must be filtered in an outer query; it cannot be referenced in the WHERE clause of the same query block.)
What is a cursor? When a query is executed in Oracle, a result set is produced and stored in memory. Oracle allows the programmer to access this result set in memory through cursors.
Why use a cursor? Many times, when a query returns more than one row as a result, we might want to go through each row and process the data differently for it. A cursor is handy here.
Types of cursors: Oracle PL/SQL declares a cursor implicitly for all queries and DML statements (including queries that return only one row), but in most cases we don't use these implicit cursors for queries that return one row; we declare explicit cursors instead.
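A minimal explicit-cursor sketch over the emp table used above (the per-row processing is a placeholder):

DECLARE
  CURSOR c_emp IS
    SELECT ename, sal FROM emp;
BEGIN
  FOR r IN c_emp LOOP
    -- process each row differently as needed
    DBMS_OUTPUT.PUT_LINE(r.ename || ': ' || r.sal);
  END LOOP;
END;
/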
What are the types of indexes you generally use in Informatica?
We use B-tree and bitmap indexes. A B-tree index suits high-cardinality columns, while a bitmap index suits low-cardinality columns; on such columns the bitmap index gives better performance.
Why do we select the table with the minimum number of records as the master table in a Joiner?
The Integration Service (IS) reads data from the master table and builds the data cache and index cache. After that it reads data from the detail table and performs the join. Because the master has very few records compared with the detail table, building the cache is cheap; finally we save time and increase performance.
What is the difference between the Joiner and the Lookup transformation?
If you are not a performance geek, you can do somewhat similar basic things with both the Joiner and Lookup t/fs. However, a Joiner operates on sources, whereas a Lookup can operate on a source as well as a target. There are also limitations when working with heterogeneous sources in both cases (can a lookup work on an XML file?). A Joiner doesn't support non-equi joins, whereas a Lookup supports non-equi conditions: the Joiner supports only the equality operator in its join condition, while the Lookup condition can use <=, >=, = and !=. Also, the Joiner transformation does not match null values, whereas the Lookup transformation matches null values.
L & T
How can you display only hidden files in UNIX?
ls -a | grep "^\."
Tell me one complex query in Oracle.
Select users.user_id, users.email, max(classified_ads.posted) from users, classified_ads where users.user_id = classified_ads.user_id group by users.user_id, users.email order by upper(users.email);
Data is passed from one active transformation and one passive transformation into a passive transformation. Is the mapping valid or invalid?
The answer is: the mapping is invalid. We can never connect data coming from an active and a passive transformation in different pipelines to a passive transformation (that is how we should take the question). In other words: can you connect the output of a Joiner and an Expression from different pipelines to an Expression transformation? No.
I have one source with 52 million records and I want only 7 records in the target. How will you do it; what logic would you implement?
If you want to load 7 records from source to target, use a Sequence Generator t/r (current value = 0), drag the NEXTVAL port to an Expression t/r, and then give a Filter the condition NEXTVAL <= 7 so only the first 7 records reach the target.
I have ten flat files with the same structure. I want to load them into a single target, and the mapping should show only one source. What are the steps to achieve this?
Create a list file (e.g. with a .txt extension) and copy all 10 flat-file paths into it. Then at session level set Source filetype: Indirect, File directory: the path of the list file, and Source filename: the name of the list file (.txt).
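A sketch of the list file's contents, one path per line, ten lines in all (paths are hypothetical):

/data/src/sales_01.dat
/data/src/sales_02.dat
/data/src/sales_03.dat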
When we use a lookup, if it has 5 records and I don't need the 1st and 2nd records, what is the procedure to achieve this using the lookup?
Use a lookup SQL override such as: select * from emp minus select * from emp where rownum <= 2
Note: we can't use > directly with ROWNUM.
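Because ROWNUM is assigned before the filter is applied, "ROWNUM > 2" returns no rows; the usual workaround is to materialize ROWNUM in an inline view first:

SELECT *
FROM  (SELECT e.*, ROWNUM rn FROM emp e)
WHERE rn > 2;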
Can we see the default group when we use a Router? If yes, how?
Yes, we can see the default group. When we add a Router there are no groups at first; when we add any group, a default group is also added automatically. We can add conditions to our groups according to the requirement; all records that do not meet any group condition go to the default group. Just connect it to any other transformation and use it according to your requirement.
What is a left outer join?
For example: Select M.*, D.* from Master M LEFT OUTER JOIN Details D on M.ID = D.ID
In the above query we get all the records from the Master table and only the matching records from the Details table. Simply put, it takes all rows from the table on the left of the join and the matching records from the table on the other side of the join.
When we use a dynamic lookup, if the condition matches, what will be the output?
It updates the row in the cache or leaves it unchanged. This lookup property is recommended only for mappings where the source has duplicate records.
What is the difference between a view and a materialized view?
A view is just a stored query and has no physical storage. Once a view is instantiated, performance can be quite good, until it is aged out of the cache. A materialized view has a physical table associated with it, so it does not have to resolve the query each time it is queried. Depending on how large the result is and how complex the query, a materialized view should perform better.
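A minimal materialized-view sketch (the table and refresh options are illustrative):

CREATE MATERIALIZED VIEW mv_dept_sal
REFRESH COMPLETE ON DEMAND
AS
SELECT deptno, SUM(sal) AS total_sal
FROM   emp
GROUP  BY deptno;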
How do you import an Oracle sequence into Informatica?
With the help of stored procedures, or through a SQL override in an unconnected lookup, we can bring an Oracle sequence into Informatica.
Why can't we put a Sequence Generator or Update Strategy transformation before a Joiner transformation?
A Joiner is for joining two different sources. If you use an Update Strategy t/f before it with the DD_DELETE or DD_REJECT option, some of the data will be deleted or rejected and you can't see it at the Joiner output.
What is metadata?
Metadata is data about data. The repository contains the metadata, i.e. all the information about the mappings, tasks, etc.
We have a table with columns c1 a, c2 b, c3 c, c4 x, c5 y, and I need output like "a b c x" in a single row and "a b c y" in a single row. How do you do this?
You can override the Source Qualifier with the following query: Select c1, c2, c3, c4 from table1 union all select c1, c2, c3, c5 from table1
We have to use ORDER BY, WHERE and HAVING: how do we implement the SQL query?
In one query we can use all three, but using WHERE and HAVING for the same filter doesn't make much sense...

If we use GROUP BY, the WHERE clause is applied before the grouping, and the HAVING clause comes after the GROUP BY clause. The ORDER BY clause is generally written at the end of the query.
Where is the cache stored in Informatica?
Cache is stored in the cache directory. For Aggregator, Joiner and Lookup transformations the cache values are stored in the cache directory; for the Sorter transformation the cache values are stored in the temp directory.
How can we add a header and footer to flat files?
Go to the session editor >> Mapping tab >> select the flat-file target >> type the commands in the Header Command option and the Footer Command option.
What are data merging, data cleansing and sampling?
Data merging: multiple detail values are summarized into a single summarized value. Data cleansing: eliminating inconsistent data. Sampling: the process of arbitrarily reading data from a group of records.
I have a thousand records in my source (flat file). I want to load 990 records; I don't want to load the first 5 records and the last 5 records, at Informatica level.
Pass the records from the Source Qualifier to an Expression. Create a count variable and two parameters with the values 5 and 995. In the Expression create an output port (number datatype) and in the expression editor write Sequence = SETCOUNTVARIABLE(the variable you created). Now attach it to a Router, create a group with the condition sequence > 5 AND sequence <= 995, and connect this group to the target.
How will the data be loaded while designing the schema? Which one first (e.g. dimensions and facts)?
Dimension tables are loaded first, and then the fact tables. Since all primary keys of the dimension tables are linked to the foreign keys of the fact table, we need to load the dimension tables first for proper lookups.
What is a SQL override? What is the use of it?
It is the process of writing user-defined SQL queries to define SQL joins, source filters, sorting of input data and elimination of duplicates.
How do you get the first row without using a Rank t/r?
Step 1: if the first row is not null, we can use the FIRST() function in an Aggregator t/f. Step 2: use a Sequence Generator t/f and a Filter t/f with the condition NEXTVAL = 1. Or use a Source Qualifier SQL override, e.g. for table DEPT1:
SELECT DEPT1.DEPTNO, DEPT1.DNAME, DEPT1.LOC FROM DEPT1 WHERE ROWNUM = 1
I have a flat file where sal holds the value 10,000, and I want to load the data in the same format, with sal as 10,000.
In the target/session options we have a thousands-separator setting, or in SQL: to_char(10000, '99,999.00') from dual;
Tell me some dimension table names in a banking-domain Informatica project (don't say "depends on the project"; tell me the names of the dimension and fact tables in your project).
Fact tables: 1) Bank 2) Advances. Dimension tables: 1) Customer 2) Accounts 3) Transaction (a dirty dimension) 4) Time.
Write a SQL query to pivot the subjects from the following source table sub:
subject marks
maths 30
science 20
social 80
Required output:
maths science social
30 20 80
select (select marks from sub where subject = 'maths') maths, (select marks from sub where subject = 'science') science, (select marks from sub where subject = 'social') social from dual;
or: select max(decode(subject, 'maths', marks)) maths, max(decode(subject, 'science', marks)) science, max(decode(subject, 'social', marks)) social from sub;
Yesterday my session ran in ten minutes; today it runs in 30 minutes. What is the reason, and if there are issues, how do I solve them?
Reasons for a session delay: (1) the amount of source data may be huge; (2) the database connection may have slowed down, which is why the data transfer is slow; (3) if you are using cache-based t/rs, there may be a performance issue in the cache partition.
I want to load data into two targets: one is a dimension table and the other is a fact table. How can I load both at a time?
Generally, in a data warehouse environment, we load data first into the dimension table and then into the fact table, because the fact table contains the primary keys of the dimension tables along with the measures. So first check whether the fact table you are going to load has a foreign-key relationship with the dimension table. If yes, use a pipeline mapping: load the dimension data in the first pipeline, and in the second pipeline load the fact table by taking a Lookup transformation on the already-loaded dimension table, return the key value from the lookup, calculate the measures using an Aggregator (with "group by" on the dimension keys), and map to the target (fact) ports as required. Most importantly, specify the "Target Load Plan" with the dimension target first and the fact table target second.
SOURCE: 1 a, 1 b, 1 c, 2 a, 2 b, 2 c — TARGET: 1 ABC, 2 ABC. How to achieve this at Oracle and Informatica level?
In Informatica: first sort on empid with a Sorter t/r, then in an Expression t/r build the string with variable ports (evaluated in this order):
v_concat = IIF(empid = v_prev_id, v_concat || ename, ename)
v_prev_id = empid
o_concat = v_concat
then keep the last row per empid (e.g. an Aggregator with group by on empid), and use another Expression with UPPER() to convert lower case to upper case.
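At the Oracle level, on 11g and later, LISTAGG does the same string aggregation directly (a sketch assuming a table t with columns empid and ename):

SELECT empid,
       UPPER(LISTAGG(ename) WITHIN GROUP (ORDER BY ename)) AS names
FROM   t
GROUP  BY empid;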
Tell me how many tables were used in your project, and how many fact tables and dimension tables.
In my project we have more than 100 tables, but in my module we use only 10 to 15 tables: 5 to 6 dimension tables and 1 or 2 fact tables.
What is the command to get the list of files in a directory in Unix?
ls
How can I explain my project architecture in an interview?
1. Source systems: e.g. Mainframe, Oracle, PeopleSoft, DB2.
2. Landing tables: tables that act like the source; used for easy access, for backup, and as reusable input for other mappings.
3. Staging tables: from the landing tables we extract the data into staging tables after all validations are done on the data.
4. Dimensions/Facts: the tables used for analysis and for making decisions by analyzing the data.
5. Aggregation tables: tables holding summarized data, useful for managers who want to view month-wise sales, year-wise sales, etc.
6. Reporting layer: phases 4 and 5 are what reporting developers use to generate reports.
I hope this answer helps you.
What is the difference between session variables and workflow variables?
A workflow variable can be used across sessions inside that workflow, whereas a session variable is exclusive to that particular session.
What is the full process from source to target, from development through to production?
Initially data comes from a company's OLTP systems and gets loaded into a database or flat files with the help of legacy systems or other predefined methods. From there the data is transferred to the staging database, applying business logic with the help of Informatica or other ETL tools; at times stage-to-target is also loaded using Informatica mappings. These are then transferred to a QA (quality analysis) environment, e.g. as XML deployment files, and from there deployment is done onto the production environment.
I have a source file containing 1|A, 1|B, 1|C, 1|D, 2|A, 2|B, 3|A, 3|B, and in the target I should get 1|A+B+C+D, 2|A+B, 3|A+B. Which transformation should I use?
An Aggregator, with group by on the column holding the values 1, 2, 3, concatenating the second column per group.
My session has to run Monday to Saturday but not Sunday. How do I schedule that at Informatica level?
In the scheduler we have the Customized Repeat option. Select the Week(s) option under "Repeat every"; you can then tick the particular days on which your workflow should run.
What is a dynamic cache?
The dynamic cache represents the data in the target. The Integration Service uses the data in the associated port to insert or update rows in the lookup cache. 1) It is used to insert the data into the cache and the target. 2) Informatica dynamically inserts the data into the target. 3) Data is inserted only when the lookup condition is false, i.e. when no matching data is available in the target/cache table.
Get me the resultant output for this input:
1 x,y,z
2 a,b
3 c
Output:
1 x
1 y
1 z
2 a
2 b
3 c
Use the following flow: Source ---> SQ ---> Expression ---> Normalizer ---> Filter ---> Target.
In the Expression, use variable ports to form 3 columns depending on the values received in Column2; the given values are x, y, z in Column2, so create 3 ports, each carrying one value. For this use the SUBSTR and INSTR functions: SUBSTR to get part of the string and INSTR to find the position of the comma.
VARIABLE_PORT1 ---> SUBSTR(Column2, 1, 1)
VARIABLE_PORT2 ---> IIF(INSTR(Column2, ',', 1, 1) != 0, SUBSTR(Column2, INSTR(Column2, ',', 1, 1) + 1, 1), NULL)
VARIABLE_PORT3 ---> IIF(INSTR(Column2, ',', 1, 2) != 0, SUBSTR(Column2, INSTR(Column2, ',', 1, 2) + 1, 1), NULL)
Direct the variable ports to 3 output ports, and these output ports go to the Normalizer. In the Normalizer create 2 ports, Column1 and Column2, and set the number of occurrences for Column2 to 3. The output of the Normalizer is 2 ports, which are fed to the Filter. In the Filter, filter out the null values in Column2 if they exist: IIF(ISNULL(Column2), FALSE, TRUE). Direct the output of the Filter to the target.
Hey, I am new to Informatica. Can anyone explain step by step how SCD will work?
Select all rows. Cache the existing target as a lookup table. Compare logical key columns in the source against the corresponding columns in the target lookup table. Compare source columns against the corresponding target columns if the key columns match. Flag new rows and changed rows. Create two data flows: one for new rows, one for changed rows. Generate a primary key for new rows. Insert new rows into the target; update changed rows in the target, overwriting the existing rows.


How to list the top 10 salaries without using a Rank transformation?
First use a Sorter on salary, then a Sequence Generator, then a Filter transformation: Sorter (salary descending) -----> Sequence Generator ---------> Filter (seq <= 10).
Can you use a flat file as a lookup table? Why?
Yes, we can definitely use a flat file as a lookup, but we cannot use XML files as a lookup; if you want to, you have to convert the XML file to another database table or flat file first, then you are able to use it.
In my source table I want to delete the first and last records and load the in-between records into the target. How is it possible?
The flow will be like this: source ---> sq ---> aggregator ----> filter ---> target. Generate a sequence number using a Sequence Generator and connect it, along with the flow from the SQ, to the Aggregator; group by the sequence number and create two o/p ports 1) min(seq_number) 2) max(seq_number). In the Filter write the condition seq_number <> min_seq AND seq_number <> max_seq, and connect the required ports to the target.
How will the facts be loaded?
The most important thing about loading fact tables is that you first need to load the dimension tables and then, according to the specification, the fact tables. The fact table is often located in the center of a star schema, surrounded by dimension tables. It has two types of columns: those containing facts and those containing foreign keys to the dimension tables.
A typical fact table consists of:
* Measurements: additive measures can be added across all dimensions, non-additive measures cannot be added, and semi-additive measures can be added across a few dimensions.
* Metrics
* Facts: the fact table frequently has more than one measurement field, and each such field is called a fact. Usually one fact table has at least three dimension tables.
Note: found this answer at http://www.etltools.org/loading/facts.html
How are parameters defined in Informatica?
Parameters are defined in the mapping parameters/variables wizard. We can pass values to a parameter from outside the mapping without disturbing the design of the mapping, but parameters stay constant until the user changes them.
How do you get sequence numbers from an Oracle sequence in Informatica? I don't want to use a Sequence Generator transformation; how do I achieve this?
If you want an Oracle sequence, go for a SQL t/r in query mode, and in the output port write a query such as: select <sequence_name>.NEXTVAL from dual
How to run a workflow in Unix?
To run a workflow in Unix you can use the pmcmd command. Syntax: pmcmd startworkflow -sv <integration service name> -d <domain name> -u <user name> -p <password> -f <folder name> <workflow name>
How will I stop my workflow after 10 errors?
In the session properties we have the "Stop on errors" option; set it to 10.
I have a source like 1:2;3 and I want to load the target as 123.
S.D ---> S.Q ---> EXP t/r ---> TGT. In the Expression t/r create one output port and use the replace function, e.g. REPLACECHR(0, in_port, ':;', NULL). Or a SQL query: select replace(replace('1:2;3', ':'), ';') from dual; -- returns 123
What is a workflow variable?
A workflow variable is similar to a mapping variable, except that it carries workflow statistics; and if you want to configure multiple runs of a workflow, you can do that with such a variable.
Which gives more performance, a fixed-width or a delimited file, and why?
Surely fixed width gives the best performance, because the reader need not check each and every time where the delimiter occurs.

Two tables from two different databases have the same structure but different data. How do you compare these two tables?
If you want to compare the data present in the tables, go for a join and comparison. If you want to compare the metadata (properties) of the tables, go for "Compare Objects" in the Source Analyzer.


A table contains some null values. How do you get "NA" (not applicable) in place of those null values in the target?
Use an expression such as IIF(ISNULL(column_name), 'NA', column_name) (or an equivalent DECODE) in an Expression transformation.
In SCD Type 1, what is the alternative to the Lookup transformation?
Use "Update else Insert" in the properties of the session.
One flat file is comma delimited. How do you change that comma delimiter to any other at run time?
We can change it in the session properties, Mapping tab: select the flat file, and on top of that we see "Set File Properties".
Three date formats are there. How do you change these three into one format without using an Expression transformation?
Use a SQL override and apply the TO_DATE function to the date columns with the appropriate format mask.
What are the reusable tasks in Informatica?
Reusable tasks are tasks created in the Task Developer (Session, Command, Email). A task created in the Workflow Designer is a non-reusable task.
What are events in the Workflow Manager?
Events are waits which we implement on other tasks in a workflow until a specified requirement is fulfilled. They are of two types: 1. predefined (also called a file-watcher event) 2. user defined. With a predefined event we can check for a file to be present in a path we specify before we proceed with the workflow. With a user-defined event we can make any task wait until a specified task is complete; user-defined Event-Wait and Event-Raise tasks are used in combination.
I want to skip the first 5 rows when loading into the target. What will the logic be at session level?
One way to skip records for relational sources is adding a SQL query in the session properties: Select * from employee minus select * from employee where rownum <= 5. This query skips the first 5 records.
Why is a flat-file load faster compared with a table load?
A flat file doesn't contain any indexes or keys, so data is loaded into it directly, whereas for a table the database first checks indexes and keys while loading, so it is slower compared with flat-file loading. 2) Another reason is that

when we load data into a table, the Integration Service also verifies data types and parses data if needed, but for a flat file there is no need for parsing and data-type checking. 3) One more thing: while loading data into a table, the database writes the data into its logs before loading into the target, and this doesn't happen while loading a flat file. 4) One more answer: in general, flat files are kept on the server where Informatica is installed.
If I have an index defined on the target table and I set the session to bulk load, will it work?
Bulk load never supports indexes, so with an index on the target a bulk-load session will fail. Before using bulk load you have to drop the index on the target and recreate it using the target post-SQL.
Replace function: to strip special characters (for a question like the one above), use REPLACESTR(0, col, '$', '!', '@', '#', '%', '^', '&', '*', NULL) or REPLACECHR(0, col, '$!@#%^&*', NULL).
!@#%^&* , NULL) I have oracle table A and target B. I don t know how many record
s. I want to get last record in table A as first record in target table B. write
a sql query? Create table b as select * from a order by rownum desc; I have two
tables, table 1 having 2 columns and 3 rows, table2 having 3 columns and 2 rows
. What is the output if I do left outerjoin, full outer join, right ou ter join?
In table data like following Left table1 right table2 c1, c2 c3, c4, c5 1, 2 1,
1,10 4, 5 6,5,12 7, 8 matching columns c1 and c3 Left join c1, c2, c3, c4, c5 1,
2, 1,1,10. -,-, 6,5,12. Right join 1, 2, 1,1,10. 4, 5,-,-,-. 7, 8,-,-,-. Full o
uter 1, 2, 1,1,10. 4, 5,-,-,-. 7, 8,-,-,-. -,-, 6,5,12. - indicates null in abov
e table
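The same results expressed in SQL, assuming the two tables above are physical tables:

SELECT t1.c1, t1.c2, t2.c3, t2.c4, t2.c5
FROM   table1 t1
LEFT OUTER JOIN table2 t2 ON t1.c1 = t2.c3;
-- replace LEFT with RIGHT or FULL to get the other two result sets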
Which transformation should we use to get the 5th-rank member from a table in Informatica?
For this we have to use two t/rs: first a Rank t/r, then a Filter t/r with the condition RANKINDEX = 5, connected to the target. The flow is like this: src ---> sq ---> rank ---> filter ---> trg. We can also do this in SQL with the following query:
select * from (select * from emp order by sal desc) where rownum <= 5
MINUS
select * from (select * from emp order by sal desc) where rownum <= 4
How do you avoid duplicate records without using Source Qualifier, Expression, Aggregator, Sorter or Lookup transformations?


You can use a Unix command in the pre-session, such as: sort -u file1 > newfile
In a mapping, when we use an Aggregator transformation we use a group-by port. If group by is not selected, by default it returns only the last row. Why?
The Aggregator t/r performs calculations from the first record to the last record. If no group-by port is selected, the Integration Service returns the last record from all input rows.
What is the use of a Data Mart?
Loading data from an InfoProvider used as a data target into another data target; e.g. we used to load flat-file data from a DSO to a Cube. The concept used there is the data mart.
How will you get the 1st, 3rd and 5th records in a table? What is the query in Oracle?
Display odd records: Select * from EMP where (rowid, 1) in (select rowid, mod(rownum, 2) from EMP)
Display even records: Select * from EMP where (rowid, 0) in (select rowid, mod(rownum, 2) from EMP)
Have you developed documents in your project? What documents do we develop in real time?
We have to create low-level design documentation in real time, in which we specify naming conventions, source types, target types, business requirements, logic, etc.
My source contains data like: cno cname sal / 100 rama@gmail.com 1000 / 200 karuna@yahoo.com 2000. I want to load to the target: cno cname sal / 100 rama 1000 / 200 karuna 2000.
In the expression editor, derive the name portion before the '@', e.g. o_cname = SUBSTR(cname, 1, INSTR(cname, '@') - 1), and pass this port to the output.
My source is a comma-delimited flat file with eno, ename, sal as 111,sri,ram,kumar,1000, and my target should be eno = 111, ename = "sri ram kumar", sal = 1000; i.e. we need to eliminate the commas inside the data of a comma-delimited file.
When we load it into the Source Analyzer as comma separated, it shows us 5 columns. Replace the commas between the name parts with spaces: use src ---> s/q ---> expression ---> target, and in the Expression add an output port ename = column2 || ' ' || column3 || ' ' || column4.
Hi, in a mapping I have 3 targets and one fixed-width file as source, with 193 records in total. I connected one port from an Aggregator to all 3 targets; the same values are loaded into the 3 targets, but in a different order in each. Why? Shouldn't the insertion order be the same for all 3 targets?
Informatica does not guarantee the sequence of records at the time of insertion into targets. If order matters, you should use a Sequence Generator t/f, or you can use a Sorter.
Hi, in the source I have records like: no name address / 10 Manoj Mum / 10 Manoj Delhi / 20 Kumar USA / 20 Kumar Tokyo. I want records in the target like: no name addr1 addr2 / 10 Manoj Mum Delhi / 20 Kumar USA Tokyo. If it were the reverse we could do it with a Normalizer transformation by setting occurrences to 2. Somebody will say use a denormalization technique, but as far as I know there is no ready-made "renormalization" technique. Is there any concept like that? I tried this seriously but could not find an idea to implement it.
Use a dynamic lookup to check whether the record exists. If it does not, then


insert that record with No, Name and Address1. If it does exist, use that record to update the Address2 field (always); this suits a case where the client wants to keep the first address and the current address in the Address2 field.
I have a flat file as target in a mapping. When I load data a second time, the records already in the flat file get overwritten. I don't want to overwrite existing records. Note: we could do this with CDC/incremental logic if the target were relational, but this is a flat file, so even that technique would overwrite. What is the solution? Is there any option at session level for a flat-file target?
It's very simple: we have one option at session level. Double-click the session --> Mapping tab --> target properties --> Append if Exists (check this option).
What is version control in Informatica?
Version control is an option chosen while installing the Informatica software; enabled or disabled, it governs how instances of objects are kept. For example (version control enabled): if you create a mapping, an instance is created for that mapping; if you update the mapping, another instance is created for the updated mapping. For example (version control disabled): if you create a mapping, an instance is created; if you update it, an instance is created for the updated mapping and the first (initial) instance is deleted.
How do you connect two or more tables to a single Source Qualifier?
1. Drag all the tables into the Mapping Designer.
2. Delete all the Source Qualifiers associated with the tables.
3. Create one SQ transformation.
4. Drag all the columns from all the tables into that SQ transformation.
What is the procedure to use a mapping variable in a Source Qualifier transformation?
Go to the Source Qualifier (with its source and target in place), open the Properties, and select User Defined Join; the SQL editor opens. On the left-hand side select the Variables tab; the mapping variables are available there.
How do you find out whether a column is numeric, a combination of characters and numbers, or contains characters, numbers and special characters?
In an Expression create a flag output port: IIF(IS_NUMBER(id), 1, 0). Then in a Filter, the condition flag = 1 returns the numeric values and flag = 0 returns the values containing non-numeric characters. A second way is the ASCII() function: ASCII(id) > 64 indicates the first character is a letter, and ASCII(id) < 64 indicates a digit (digits are ASCII 48-57).
Input flatfile1 has 5 columns, flatfile2 has 3 columns (no common column); the output should contain the merged data (8 columns). Please let me know how to achieve this?

As per your question, the two files have the same number of records. Take the first file's 5 columns into an Expression transformation and add an output column, say A. Create a mapping variable, say countA (integer datatype), and in port A put the expression SETCOUNTVARIABLE(countA). Take the second file's 3 columns into another Expression transformation, add an output column, say B, create a mapping variable countB (integer datatype), and in port B put the expression SETCOUNTVARIABLE(countB). These two steps just create a common field with common data in the two pipelines. Now join the two pipelines in a Joiner transformation on the condition A = B, and connect the required 8 ports to the target. You get your target as per your requirement.
Input is like 1 1 1 2 2 3 and the output should be 1 2 3. How can you achieve this (e.g. with a Rank transformation)?
src ---> sq ---> aggregator ---> target. Group by empid in the Aggregator; it returns one row per distinct value, i.e. 1, 2, 3. (If instead you only wanted the values that occur exactly once, you would add a Filter with the condition count(empid) = 1.)
What is the significance of the NewLookupRow port in a dynamic lookup?
When we configure a Lookup t/r to use a dynamic cache, the Designer automatically adds a port called "NewLookupRow" to the Lookup t/r. This port indicates with a numeric value whether the Integration Service inserts the row into the cache (1), updates the row in the cache (2), or makes no change to the cache (0). To keep the lookup cache and the target table synchronized, we pass rows to the target when the NewLookupRow value is equal to 1 or 2.
I have a table with columns ID, NAME and rows (1, x), (1, y), (1, z), (2, a), (2, b), (3, c), and the requirement is to get output like ID, COUNT(*) = (1, 3), (2, 2): each repeated id and how many times it repeats, without the non-repeated ids (id 3 should not appear). Write a SQL query.
Select id, count(*) from <table name> group by id having count(*) > 1;
Every DWH must have a time dimension. What is the use of the time dimension, and how do we calculate sales for one month, half-yearly and yearly using it?
Take the time dimension into an Expression transformation, create new output ports, and write conditions in the new ports using date functions such as GET_DATE_PART on the date column.
There is a table with an EMP salary column. Write a query to get the rows whose salary is greater than the average salary of their particular department.
Select * from emp e where sal > (select avg(sal) from emp m where m.deptno = e.deptno) order by deptno;




This gives you the employees whose salary is greater than the average salary of their department.
How can we get unique records into one target table and duplicate records into another target table?
The data flow for this one is: sq --> Aggregator --> Router --> targets. In the Aggregator t/r take 2 output ports with conditions like unique_port ----> count(*) = 1 and duplicate_port ---> count(*) > 1. Connect this to a Router, take 2 groups with the conditions unique_group = unique_port and duplicate_group = duplicate_port, and connect those groups to the targets.
Can we load data without a primary key on a table? What is the target load plan?
We can load data without a primary key, but if you ever need to update the table from Informatica then you have to use a target update override; only then can you update the table.
1) Alternative to the Update Strategy transformation? 2) Out of 1000 records, the session failed after loading 200 records. How do you load the rest of the records? 3) Use of the lookup override?
1) You can update the target table by using the target update override; I think that is the alternative. 2) Consider performance recovery (resume the session from the last commit point). 3) The lookup override is nothing but overriding the default SQL generated by the lookup at run time. The default SQL contains SELECT, GROUP BY and ORDER BY clauses; the ORDER BY clause orders the data based on the ports in the lookup tab. If you want to change the order, mention two hyphens at the end of your SQL, which means the ORDER BY generated by the Informatica server is commented out.
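A sketch of that hyphen trick in a lookup SQL override (table and port names are illustrative):

SELECT emp.empno AS EMPNO, emp.sal AS SAL
FROM emp
ORDER BY emp.sal DESC --

The trailing "--" comments out the ORDER BY that the Integration Service appends after the override.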
How do you merge multiple flat files, for example 100 flat files, without using a Union t/f?
First create a new flat file and paste the locations of all the flat files into it. Now import the source definition using "Import from file" and define the mapping. In the session properties, set the source filetype to Indirect and give the location of the newly created list file.

Delete in session properties what will happen ? Yes it will perform delete as se
ssion properties overrides mapping properties. What is diff between grep and fin
d? Grep is used for finding any string in the file. Syntax - grep <String> <file
name> Example - grep compu details.txt Display the whole line, in which line com
pu string is found. Find is used to find the file or directory in given path, Sy
ntax - find <filename> Example - find compu* Display aal file name starting with
compu How can we perform incremental aggregation? Explain with example?

You have to perform the Incremental Aggregation in Session Properties only. Ex:Y
ou Target Table loaded(with Agg Calculation) 100 Records on yesterday , No w new
ly you have 50 Records(25 update,25 insert), To do the Agg calculation for Updat
ed Records ,Insert Records you need to Perform Incremental Aggregation. This is
simple way to incre ase the Performance, Reducing the time etc. If you not perfo
rm the Incremental A ggregation, you can do the same thing in another way. But i
ts lengthy What is a time dimension? Give an example? Time Dimension: Generally,
to generate dates as per the requirement we use date dimension. If youre loading
of data in fact table on the basis of time/date then we use the values of date d
imension to populate the fact. We take the last date on which the fact is popula
ted. Then check for the existen ce of dates for the data to be populated.ifnot w
e generate through some stored p rocedure or as per requirement. Eg: Daily, week
ly, financial year, calendar year, business year etc.,
What is the function of F10 and F5 in Informatica?
F10 and F5 are used in the debugging process. By pressing F10, the process moves to the next transformation from the current transformation, and the current data can be seen in the bottom panel of the window. F5 processes the full data at a stretch; in the case of F5 you can see the data in the targets at the end of the process, but you cannot see the intermediate transformation values.
What is data quality? How can a data quality solution be implemented in my Informatica transformations, even internationally?
Data quality is when you verify your data and check whether the data present in the warehouse is efficient and error-free. The data present in each column should carry meaningful information: it should not have null data where data is required, it should not have garbage data, complete and correct data must be transformed to the target, the data types must be correct as per the requirement, etc. Asking about "Informatica transformations" here means: how are the source-to-target transformations to be implemented in Informatica?
What is the architecture of any data warehousing project?
Step 01 ------> source to staging; Step 02 ------> staging to dimensions; Step 03 ------> dimensions to facts.
Overall: project planning --- requirements gathering --- product selection and installation --- dimensional modeling --- physical modeling --- deployment --- maintenance.
In easy terms, for dimensional modeling: 1. select the business process 2. identify the grain 3. design the dimension tables 4. design the fact table. Once these 4 steps are over it moves to physical modeling, where you apply the ETL process and performance techniques.


What is a causal dimension?
One of the most interesting and valuable dimensions in a data warehouse is one that explains why a fact table record exists. In most data warehouses, you build a fact table record when something happens. For example:
When the cash register rings in a retail store, a fact table record is created for each line item on the sales ticket. The obvious dimensions of this fact table record are product, store, customer, sales ticket, and time.
At a bank ATM, a fact table record is created for every customer transaction. The dimensions of this fact table record are financial service, ATM location, customer, transaction type, and time.
When the telephone rings, the phone company creates a fact table record for each "hook event." A complete call-tracking data warehouse in a telephone company records each completed call, busy signal, wrong number, and partially dialed call.
In all three of these cases, a physical event takes place, and the data warehouse responds by storing a fact table record. However, the physical events and the corresponding fact table records are more interesting than simply storing a small piece of revenue. Each event represents a conscious decision by the customer to use the product or the service. A good marketing person is fascinated by these events. Why did the customer choose to buy the product or use the service at that exact moment? If we only had a dimension called "Why Did the Customer Buy My Product Just Now?" our data warehouses could answer almost any marketing question. We call a dimension like this a "causal" dimension, because it explains what caused the event.
ion; because it explains what ca used the event. How many repositories can we cr
eate in Informatica?? In Informatica Power mart we can create any no of reposito
ries, but we cannot sh are the metadata across the repositories, In Informatica
Power center we can create any no of repositories, but we can des ignate only on
e repository as a global repository which can access or share meta data from all
other repositories. How can we run workflow with pmcmd? Connect to pmcmd, conne
ct to integration service. Pmcmd>connect -sv service_name -d domain_name -u user
_name -p password; Start workflow, Pmcmd>startworkflow -f folder_name What is th
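A concrete invocation of that syntax, with hypothetical service, folder and workflow names:

pmcmd startworkflow -sv IS_DEV -d Domain_Dev -u admin -p admin123 -f SalesFolder wf_daily_sales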
What is the exact difference between IN and EXISTS in Oracle?
Consider this query using IN:
Select ename from EMP e where mgr in (select empno from EMP where ename = 'KING');
Here is the EXPLAIN PLAN for this query:
OPERATION                          OBJECT
SELECT STATEMENT ()
  NESTED LOOPS ()
    TABLE ACCESS (FULL)            EMP
    TABLE ACCESS (BY INDEX ROWID)  EMP
      INDEX (UNIQUE SCAN)          PK_EMP
This query is virtually equivalent to this:
Select e1.ename from EMP e1, (select empno from EMP where ename = 'KING') e2 where e1.mgr = e2.empno;
You can write the same query using EXISTS by moving the outer query column into a subquery condition, like this:
Select ename from EMP e where exists (select 0 from EMP where e.mgr = empno and ename = 'KING');
When you write EXISTS in a WHERE clause, you're telling the optimizer that you want the outer query to be run first, using each value to fetch a value from the inner query (think: EXISTS = outside to inside).
In what type of scenario do we use bulk loading versus normal loading?
We use bulk loading where there is a bulk amount of data to be loaded into the target, i.e. when we want to load a large amount of data fast. Use it when you don't want session recovery and the target doesn't contain any primary keys.
How do you join two flat files if they have different structures? How do you join one relational table and one flat file?
You have two flat files with you: prepare two source instances with the structures as Oracle relational tables, load each flat file to its relational table through a simple pass-through mapping, and then join the two relational source tables using a Joiner.
How to join one relational source and one flat file? Same as above: convert the flat file to a relational table through a simple pass-through mapping, then join the two relational tables using a Joiner in the same mapping itself.
If the flat files have different structures: sd ---> sq ---> exp ---> |---> joiner ---> tgt, sd1 ---> sq1 ---> exp1 ---> |. That is, take the flat-file sources, add an Expression transformation on each in which you create a dummy port (e.g. a = 1 in exp and b = 1 in exp1), then take a Joiner on that condition and connect it to the target. A Joiner accepts different sources.
How to join two flat files in Informatica?
If the structure of the two flat files is the same we can use one SQ with the indirect file type. If there is no common field in the two flat files, create dummy columns in Expression t/rs and join them in the Joiner t/r with the condition dummy = dummy1. The flow is like this:
src1 ---> SQ ---> exp ---> |---> joiner ---> target
src2 ---> SQ ---> exp ---> |
How do you identify or filter out 0-byte files in a folder using a UNIX command?
Most files in the following command output will be lock files and placeholders created by other applications.
find ~ -empty — lists all the empty files in your home directory.
find . -maxdepth 1 -empty — lists the empty files in the current directory only.
find . -maxdepth 1 -empty -not -name ".*" — lists only the non-hidden empty files in the current directory.
Can we use an unconnected lookup as a dynamic lookup?
No. An unconnected lookup returns one port only, whereas a dynamic lookup can return more than one port and it updates and inserts the target while the session runs.
How can you avoid duplicate rows in a flat file?
Sorter (distinct), Aggregator, or dynamic lookup.
Why is the Normalizer transformation not allowed in a mapplet?
A mapplet is reusable logic that you can use across different mappings, so it must be fixed logic. The Normalizer converts rows to columns or vice versa; it is a dynamic transformation whose behavior depends on its input rather than fixed logic, so the Normalizer cannot go into a mapplet.
I want to use only one lookup t/r across mappings. How?
We can reuse the lookup in different mappings. What we need to do is: Step 1: create the lookup in the Transformation Developer; a transformation created in the Transformation Developer is reusable. Step 2: for an existing transformation, click the Transformation tab ---> change it to reusable.
How can one eliminate duplicate data without using the distinct option?
Using a GROUP BY clause removes the duplicate records.
My source table has 10 records, but how can I load 20 records into the target? I am not bothered about duplicates.
SRC ---> TGT instance 1 and ---> TGT instance 2: have two instances of the target and connect the source to both target instances.
In a Lookup transformation, can a SQL override be done with the cache disabled?
If you disable the cache you can't override the default SQL query.
What is the meaning of upgradation of a repository?
Upgradation of a repository means you can upgrade a lower version into a higher version. You can do this in the Repository Manager: right-click the repository, select the Upgrade option, and then add the license and product code.
I have a flat-file source and I want to load the maximum salary of each deptno into the target. What is the mapping flow?

We can use an Aggregator with group by on deptno and create a new port MAX(salary); load deptno and that port, and we get each unique deptno with its maximum salary.
How do you run a batch using the pmcmd command?
Using a Command task in the workflow.
What is a test load?
With a test load the PowerCenter server reads and transforms data without committing it to the targets. It generates all session files and runs pre/post SQL functions as if running a full session. For relational targets it writes the data but rolls it back when the session completes.
How are DTM buffer size and buffer block size related?
The number of buffer blocks in a session = DTM Buffer Size / Buffer Block Size. The default settings create enough buffer blocks for 83 sources and targets; if the session contains more than 83, you might need to increase the DTM Buffer Size or decrease the Default Buffer Block Size.
(total number of sources + total number of targets) * 2 = (session buffer blocks)
(session buffer blocks) = 0.9 * (DTM Buffer Size) / (Default Buffer Block Size) * (number of partitions)
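A worked example under those formulas (values illustrative): a session with 3 sources and 2 targets needs (3 + 2) * 2 = 10 buffer blocks; with the default 64 KB block size and one partition, the DTM buffer size must satisfy 0.9 * DTM / 64 KB >= 10, i.e. DTM >= 10 * 64 KB / 0.9, which is roughly 712 KB.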
What are the transformations that are not allowed in a mapplet?
1. Normalizer transformations 2. COBOL sources 3. XML Source Qualifier transformations 4. XML sources 5. Target definitions 6. Other mapplets 7. Pre- and post-session stored procedures
Define the Informatica repository.
The Informatica repository is a central metadata storage place which contains all the information necessary to build a data warehouse or a data mart: metadata such as source definitions, target definitions, business rules, sessions, mappings, workflows, mapplets, worklets, database connections, user information, shortcuts, etc.
How much memory (size) is occupied by a session at runtime?
A session contains a mapping with its sources, transformations and targets. I think the size of a session depends on the caches used by the different transformations in the mapping and on the size of the data that passes through the transformations.
What are the different options used to configure sequential batches?
Two options: run the session only if the previous session completes successfully, or always run the session.
From where do you extract the data, and how do you bring it into Informatica?
If the source is relational tables, the source data is relational (e.g. Oracle 9i/10g). If the source is flat files, they can sit on a UNIX server (the client gives the path details); in Informatica we create the structure of the file and give that path in the session properties tab.
What are your sources in the project and how do you import them into Informatica? How can I explain this?
Sources in Informatica differ from client to client and project to project, but mostly the client sends sample data through flat files, and the metadata of the sample data is imported in the Source Analyzer by clicking on the option "Import from file".
What is data modeling? What are the types of modeling, and in which situation is each one used?
Data modeling is the process of designing a data mart or data warehouse. There are three phases in data modeling:
1) Conceptual design: in this phase the database architects and managers understand the client requirements; after understanding the requirements they identify the entities and attributes (tables and columns).
2) Logical design: in this phase the dimension tables and fact tables are identified, along with the relationships between the fact and dimension tables; the schema now looks like either a star or a snowflake schema. To perform logical design we use data modeling tools like ER/Studio or ERwin.
3) Physical design: once the logical design is approved by the client, it is converted into physical existence in the database.
When do we use an unconnected versus a connected lookup? How does it affect the performance of the mapping?
If you want to perform a lookup on fewer values, go for a connected lookup. If you want to perform a lookup from more than one place in the flow, or your source has many date columns to validate, go for an unconnected lookup (it can be called multiple times via :LKP).
What is the difference between a warehouse key and a surrogate key?
Surrogate key concept: surrogate keys are generated by the system and identify a unique ENTITY — yes, entity, not record — while a primary key is used for finding a unique record.
Let me give you a simple classical example for a surrogate key: on the 1st of January 2002 employee E1 belongs to business unit BU1 (that's what would be in your employee dimension). This employee has turnover allocated to him on business unit BU1, but on the 2nd of June employee E1 is moved from business unit BU1 to business unit BU2. All the new turnover has to belong to the new business unit BU2, but the old turnover should still belong to business unit BU1.
If you used the natural business key E1 for your employee within your data warehouse, everything would be allocated to business unit BU2, even what actually belongs to BU1. If you use surrogate keys, you can create, on the 2nd of June, a new record for employee E1 in your employee dimension with a new surrogate key. This way, in your fact table, your old data (before the 2nd of June) keeps the SID of the employee for E1/BU1, and all new data (after the 2nd of June) takes the SID of the employee for E1/BU2.
You could consider a slowly changing dimension as an enlargement of your natural key: the natural key of the employee was employee code E1, but for you it becomes employee code + business unit, i.e. E1 BU1 or E1 BU2. The difference from a natural-key enlargement process is that you might not have all parts of the new key within your fact table, so you might not be able to join on the enlarged key; hence you need another id.
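A sketch of the employee dimension rows under that example (the surrogate key values are hypothetical):

SID  EMP_CODE  BUSINESS_UNIT  VALID_FROM
101  E1        BU1            01-JAN-2002
102  E1        BU2            02-JUN-2002

Facts loaded before the 2nd of June carry SID 101; facts loaded after carry SID 102.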
After we make a folder shared, can it be reversed? Why?
Folders cannot be unshared, because it has to be assumed that users have created shortcuts to objects in those folders; un-sharing them would render those shortcuts useless and could have disastrous consequences.
What is the filename which you need to configure in UNIX while installing Informatica?
pmserver.cfg
How do you know when to use a static cache versus a dynamic cache in a Lookup transformation?
A dynamic cache is generally used when you are applying a lookup on a target table and, in one flow, the same data comes twice for insertion, or once for insertion and once for update. Performance: a dynamic cache decreases performance compared with a static cache, since for each row it first checks the whole cache for whether the data was previously present and only then inserts, which takes more time; a static cache does no such check and just passes data through as many times as it comes.
Does the Sequence Generator t/r use caches? If so, what type of cache is it?
The Sequence Generator uses a cache when it is reusable; this option facilitates multiple sessions that use the same reusable Sequence Generator. The number of cached values can be set in the properties of the Sequence Generator. Not sure about the type of cache.
Explain a grouped cross tab.
A grouped cross tab is the same as a cross-tab report, only grouped on a third item. For example, with the emp and dept tables: take empno for the rows, ename for the columns, sal for the cell value, and deptno as the group item; the report then prints one small sal matrix per deptno group (10, 20, ...), e.g. employee names (raju, ramu, krishna, ...) across the top and cell values such as 7098|500 and 7023|600 inside each group.
Explain HLD and LLD.
HLD: refers to the functionality to be achieved to meet the client requirement. Precisely speaking, it is a diagrammatic representation of the client's operational systems, staging areas, DWH and data marts, and of how and at what frequency the data is extracted and loaded into the target database.
LLD: prepared for every mapping along with a unit test plan. It contains the names of the source definitions, target definitions, transformations used, column names, data types, the business logic written as a source-to-target field matrix, the session name and the mapping name.
Transformer is a __________ stage. Options: 1. Passive 2. Active 3. Dynamic 4. Static
Dynamic, more than an active stage, because it does not take space in your DB: it is initiated at run time with the session, caches data, does the transformations, and ends with the session.
How can you connect a client to your Informatica server if the server is located at a different place (not local to the client)?
You need to connect remotely to the server and access the repository. You will be given a repository user name and password; add this repository and connect to it with your credentials.
