
DWH Concepts and Informatica Q&A

Q. What are the responsibilities of a data warehouse consultant/professional?

The basic responsibility of a data warehouse consultant is to publish the right data.
Some of the other responsibilities of a data warehouse consultant are:
Understand the end users by their business area, job responsibilities, and computer tolerance
Find out the decisions the end users want to make with the help of the data warehouse
Identify the best users who will make effective decisions using the data warehouse
Find the potential new users and make them aware of the data warehouse
Determine the grain of the data
Make the end user screens and applications much simpler and more template driven
Q. What are the Data Warehouse Center administration functions?
The functions of Visual Warehouse administration are:
Creating Data Warehouse Center security groups.
Defining Data Warehouse Center privileges for that group.
Registering Data Warehouse Center users.
Adding Data Warehouse Center users to security groups.
Registering data sources.
Registering warehouses (targets).
Creating subjects.
Registering agents.
Registering Data Warehouse Center programs.
Q. How does my company get started with data warehousing?
Build one! The easiest way to get started with data warehousing is to analyze some existing
transaction processing systems and see what type of historical trends and comparisons might be
interesting to examine to support decision making. See if there is a "real" user need for
integrating the data. If there is, then IS/IT staff can develop a data model for a new schema,
load it with some current data, and start creating a decision support data store using a database
management system (DBMS). Find some software for query and reporting and build a decision
support interface that's easy to use. Although the initial data warehouse/data-driven DSS may
seem to meet only limited needs, it is a "first step". Start small and build more sophisticated
systems based upon experience and successes.

Q. What is data mining?
Data mining is the process of automated extraction of predictive information from large
databases. It predicts future trends and finds behaviour that the experts may miss as it lies
beyond their expectations. Data mining is part of a larger process called knowledge discovery;
specifically, the step in which advanced statistical analysis and modeling techniques are applied
to the data to find useful patterns and relationships.
Data mining can be defined as "a decision support process in which we search for patterns of
information in data." This search may be done just by the user, i.e. just by performing queries, in
which case it is quite hard and in most cases not comprehensive enough to reveal intricate
patterns. Data mining uses sophisticated statistical analysis and modeling techniques to uncover
such patterns and relationships hidden in organizational databases, patterns that ordinary
methods might miss. Once found, the information needs to be presented in a suitable form, with
graphs, reports, etc.
What is the filename which you need to configure in UNIX while installing Informatica?
Could anyone tell how to use the Oracle analytic functions in Informatica?
Some of the Oracle analytical functions are DENSE_RANK(), RANK(), CUME_DIST(), CUM_AVG(),
etc. One can get the functionality of some of these functions, such as RANK, by using a Rank
transformation. For other types of functions one can resort to custom
Q. What is a lookup function? What is the default transformation for the lookup function?
A lookup compares the source against a reference table, so that existing rows can be updated
and new rows inserted into the target.
Optimize Lookup transformation:
1. Caching the lookup table:
When caching is enabled, the Informatica server caches the lookup table and queries the cache
during the session. When this option is not enabled, the server queries the lookup table on a
row-by-row basis. Caches can be Static, Dynamic, Shared, Un-shared, or Persistent.
2. Optimizing the lookup condition: whenever multiple conditions are placed, the conditions with
the equality sign should take precedence (be listed first).
3. Indexing the lookup table:
The cached lookup table should be indexed on the ORDER BY columns; the session log contains
the ORDER BY statement. For the un-cached lookup, since the server issues a SELECT statement
for each row passing into the lookup transformation, it is better to index the lookup table on the
columns in the condition.
Optimize Filter transformation:
You can improve efficiency by filtering early in the data flow. Instead of using a Filter
transformation halfway through the mapping to remove a sizable amount of data, use a source
qualifier filter to remove those same rows at the source. If it is not possible to move the filter
into the SQ, move the Filter transformation as close to the source qualifier as possible to remove
unnecessary data early in the data flow.
Optimize Aggregator transformation:
1. Group by simpler columns, preferably numeric columns.
2. Use sorted input. Sorted input decreases the use of aggregate caches. The server assumes all
input data are sorted and performs aggregate calculations as it reads.
3. Use incremental aggregation in the session property sheet.
Optimize Seq. Generator transformation:
1. Try creating a reusable Seq. Generator transformation and use it in multiple mappings.
2. The Number of Cached Values property determines the number of values the Informatica
server caches at one time.
Optimize Expression transformation:
1. Factor out common logic.
2. Minimize aggregate function calls.
3. Replace common sub-expressions with local variables.
4. Use operators instead of functions.
How to look up data on multiple tables?
If the two tables are relational, then you can use the Lookup SQL Override option to join the two
tables in the lookup properties. You cannot join a flat file and a relational table.
E.g.: the default lookup query will be SELECT the lookup table column names FROM the lookup
table. You can extend this query: add the column names of the 2nd table with their qualifier, and
a WHERE clause. If you want to use an ORDER BY, then put -- at the end of the ORDER BY.
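As a sketch of such an override (the table and column names here are hypothetical, not from the original text), joining a second table and supplying our own ORDER BY:

```sql
-- Default generated lookup query (illustrative):
--   SELECT CUST_ID, CUST_NAME FROM CUST_LKP ORDER BY CUST_ID, CUST_NAME
-- Hypothetical override joining a second table and adding a WHERE clause.
-- The trailing "--" comments out the ORDER BY that Informatica appends
-- after the override text.
SELECT CUST_LKP.CUST_ID,
       CUST_LKP.CUST_NAME,
       CUST_ADDR.CITY
FROM   CUST_LKP, CUST_ADDR
WHERE  CUST_LKP.CUST_ID = CUST_ADDR.CUST_ID
ORDER BY CUST_LKP.CUST_ID --
```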
Why are dimension tables denormalized in nature?
Because in data warehousing historical data should be maintained. Maintaining historical data
means, for example, keeping an employee's details, where he previously worked and where he is
working now, all in one table. If you enforce a primary key on the employee id, the table won't
allow duplicate records with the same employee id. So to maintain historical data in data
warehousing we use surrogate keys (for example, an Oracle sequence for the critical column).
Because all the dimensions maintain historical data, they are denormalized; the "duplicate" entry
is not exactly a duplicate record: another record with the same employee number is maintained
in the table.
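A minimal sketch of such a denormalized (Type 2) dimension; the table and column names are made up for illustration. The surrogate key EMP_KEY is the primary key, so the same EMPLOYEE_ID can appear in several history rows:

```sql
-- Hypothetical Type-2 employee dimension: one row per historical version.
CREATE TABLE DIM_EMPLOYEE (
    EMP_KEY        NUMBER PRIMARY KEY,  -- surrogate key (e.g. from a sequence)
    EMPLOYEE_ID    NUMBER,              -- natural key, repeats across versions
    WORK_LOCATION  VARCHAR2(50),
    EFFECTIVE_DATE DATE,
    END_DATE       DATE                 -- NULL for the current row
);

-- Two versions of the same employee; the primary key never collides.
INSERT INTO DIM_EMPLOYEE VALUES (1, 1001, 'Chennai', DATE '2001-01-01', DATE '2003-06-30');
INSERT INTO DIM_EMPLOYEE VALUES (2, 1001, 'Mumbai',  DATE '2003-07-01', NULL);
```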
If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?
There are many ways to improve a mapping which has multiple lookups:
1) We can create an index for the lookup table, if we have permissions (staging area).
2) Divide the lookup mapping into two: (a) dedicate one mapping to inserts (source - target):
only the new rows come into the mapping, so the process will be fast; (b) dedicate the second
one to updates (source = target): only the rows which already exist come into the mapping.
3) We can increase the cache size of the lookup.
Without using Update Strategy and session options, how can we update our target table?
In session properties, there are options such as
Insert as Update
Update as Update
and the like; by using these we can easily solve it.
Can I start and stop a single session in a concurrent batch?
Yes, sure. Just right-click on the particular session and go to the recovery option, or use event
wait and event raise.
Can we run a group of sessions without using Workflow Manager?
Yes, it is possible using the pmcmd command; without using the Workflow Manager you can run
the group of sessions (as per my knowledge).
What logic will you implement to load the data into one fact from 'n' number of dimensions?
Normally everyone uses:
1) Slowly changing dimensions
2) Slowly growing dimensions
What is meant by complex mapping?
Complex mapping means one involving more logic and more business rules.
Actually, in my project the complex mapping was this: in my bank project, I was involved in
constructing a data warehouse. Many customers in my bank project, after taking loans, relocated
to another place; at that time I found it difficult to maintain both previous and current addresses.
In that sense I used SCD2.
This is a simple example of a complex mapping.
Difference between Rank and Dense Rank?
Rank:
1--1st position
2--2nd position
2--2nd position
4--4th position
The same rank is assigned to the same totals/numbers; the rank is followed by the position, so
the position after a tie is skipped. Golf games usually rank this way (a golf ranking).
Dense Rank:
1--1st position
2--2nd position
2--2nd position
3--3rd position
The same ranks are assigned to the same totals/numbers/names; the next rank follows serially,
with no gap after a tie.
IMPROVING MAPPING PERFORMANCE - TIPS
1. Aggregator Transformation
You can use the following guidelines to optimize the performance of an Aggregator
transformation:
a) Use Sorted Input to decrease the use of aggregate caches:
The Sorted Input option reduces the amount of data cached during the session and
improves session performance. Use this option with the Source Qualifier Number of Sorted Ports
option to pass sorted data to the Aggregator transformation.
b) Limit connected input/output or output ports:
Limit the number of connected input/output or output ports to reduce the amount of data
the Aggregator transformation stores in the data cache.
c) Filter before aggregating:
If you use a Filter transformation in the mapping, place the transformation before the
Aggregator transformation to reduce unnecessary aggregation.
2. Filter Transformation
The following tips can help filter performance:
a) Use the Filter transformation early in the mapping:
To maximize session performance, keep the Filter transformation as close as possible to
the sources in the mapping. Rather than passing rows that you plan to discard through the
mapping, you can filter out unwanted data early in the flow of data from sources to targets.
b) Use the Source Qualifier to filter:
The Source Qualifier transformation provides an alternate way to filter rows. Rather than
filtering rows from within a mapping, the Source Qualifier transformation filters rows when read
from a source. The main difference is that the source qualifier limits the row set extracted from a
source, while the Filter transformation limits the row set sent to a target. Since a source qualifier
reduces the number of rows used throughout the mapping, it provides better performance.
However, the source qualifier only lets you filter rows from relational sources, while the
Filter transformation filters rows from any type of source. Also, note that since it runs in the
database, you must make sure that the source qualifier filter condition only uses standard SQL.
The Filter transformation can define a condition using any statement or transformation function
that returns either a TRUE or FALSE value.
3. Joiner Transformation
The following tips can help improve session performance:
a) Perform joins in a database:
Performing a join in a database is faster than performing a join in the session. In some
cases, this is not possible, such as joining tables from two different databases or flat file systems.
If you want to perform a join in a database, you can use the following options:
Create a pre-session stored procedure to join the tables in a database before
running the mapping.
Use the Source Qualifier transformation to perform the join.

b) Designate as the master source the source with the smaller number of records:
For optimal performance and disk storage, designate the master source as the source
with the lower number of rows. With a smaller master source, the data cache is smaller, and the
search time is shorter.
4. Lookup Transformation
Use the following tips when you configure the Lookup transformation:
a) Add an index to the columns used in a lookup condition:
If you have privileges to modify the database containing a lookup table, you can improve
performance for both cached and uncached lookups. This is important for very large lookup
tables. Since the Informatica Server needs to query, sort, and compare values in these columns,
the index needs to include every column used in a lookup condition.
b) Place conditions with an equality operator (=) first:
If a Lookup transformation specifies several conditions, you can improve lookup
performance by placing all the conditions that use the equality operator first in the list of
conditions that appear under the Condition tab.
c) Cache small lookup tables:
Improve session performance by caching small lookup tables. The result of the lookup
query and processing is the same, regardless of whether you cache the lookup table or not.
d) Join tables in the database:
If the lookup table is on the same database as the source table in your mapping and
caching is not feasible, join the tables in the source database rather than using a Lookup
transformation.
e) Unselect the cache lookup option in the Lookup transformation if there is no lookup override.
This improves session performance.
MAPPING VARIABLE
1. Go to the Mappings tab, click the Parameters and Variables tab, and create a NEW port as
below:
$$LastRunTime, type Variable, datatype date/time (precision 19, scale 0), aggregation Max.
Give an initial value, for example 1/1/1900.
2. In an EXP transformation, create a variable as below:
SetLastRunTime (date/time) = SETVARIABLE ($$LastRunTime, SESSSTARTTIME)
3. Go to the SOURCE QUALIFIER transformation, click the Properties tab, and in the Source Filter
area enter the following expression:
UpdateDateTime (any date column from source) >= '$$LastRunTime'
UpdateDateTime < '$$$SessStartTime'
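Put together, the source filter produces an incremental-extract window in the generated query. A sketch of the resulting WHERE clause (the SRC_ORDERS table and the date format are assumptions, and exact quoting depends on the source database):

```sql
-- Rows changed since the last run, up to the start of the current session.
SELECT *
FROM   SRC_ORDERS          -- hypothetical source table
WHERE  UpdateDateTime >= TO_DATE ('$$LastRunTime',    'MM/DD/YYYY HH24:MI:SS')
AND    UpdateDateTime <  TO_DATE ('$$$SessStartTime', 'MM/DD/YYYY HH24:MI:SS')
```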
LOOKUP AND UPDATE STRATEGY EXPRESSION
First, declare a lookup condition in the Lookup transformation.
For example,
EMPID_IN (column coming from source) = EMPID (column in target table)
Second, drag and drop these two columns into the Update Strategy transformation.
Check the value coming from the source (EMPID_IN) against the column in the target table
(EMPID). If both are equal, this means that the record already exists in the target, so we need to
update the record (DD_UPDATE); else we will insert the record coming from the source into the
target (DD_INSERT). See below for the Update Strategy expression.
IIF ((EMPID_IN = EMPID), DD_UPDATE, DD_INSERT)
NOTE: The Update Strategy expression should always be based on the primary keys in the target
table.
Handle Nulls in DATE:
IIF (ISNULL (ServiceOrderDateValue1), TO_DATE ('1/1/1900', 'MM/DD/YYYY'), TRUNC
(ServiceOrderDateValue1))
IIF (ISNULL (NpaNxxId1) or LENGTH (RTRIM (NpaNxxId1)) = 0 or TO_NUMBER (NpaNxxId1) <=
0, 'UNK', NpaNxxId1)
IIF (ISNULL (InstallMethodId), 0, InstallMethodId)
DATE_DIFF (TRUNC (O_ServiceOrderDateValue), TRUNC (O_ServiceOrderDateValue), 'DD')
FILTER CONDITION
To pass only NOT NULL and NOT SPACES values through the transformation:
IIF (
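The expression above is cut off in the source. A sketch of such a filter condition, assuming a hypothetical input port COL1, could be:

```sql
-- Informatica filter condition (a sketch, not from the original text):
-- passes a row only when COL1 is neither NULL nor made up of spaces.
IIF (NOT ISNULL (COL1) AND LENGTH (RTRIM (COL1)) > 0, TRUE, FALSE)
```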
PERFORMANCE TIPS IN GENERAL
Most of the gains in performance derive from good database design, thorough query analysis,
and appropriate indexing. The largest performance gains can be realized by establishing a good
database design.
1. Update Table Statistics in the database.
SYBASE SYNTAX: update all statistics table_name
Adaptive Server's cost-based optimizer uses statistics about the tables, indexes, and columns
named in a query to estimate query costs. It chooses the access method that the optimizer
determines has the least cost. But this cost estimate cannot be accurate if statistics are not
accurate. Some statistics, such as the number of pages or rows in a table, are updated during
query processing. Other statistics, such as the histograms on
columns, are only updated when you run the update statistics command or when indexes are
created.
If you are having problems with a query performing slowly, and seek help from Technical Support
or a Sybase news group on the Internet, one of the first questions you are likely to be asked is
"Did you run update statistics?" You can use the optdiag command (in SYBASE) to see the time
update statistics was last run for each column on which statistics exist.
Running the update statistics commands requires system resources. Like other maintenance
tasks, it should be scheduled at times when the load on the server is light. In particular, update
statistics requires table scans or leaf-level scans of indexes, may increase I/O contention, may
use the CPU to perform sorts, and uses the data and procedure caches. Use of these resources
can adversely affect queries running on the server if you run update statistics at times when
usage is high. In addition, some update statistics commands require shared locks, which can
block updates.
* Dropping an index does not drop the statistics for the index, since the optimizer can use
column-level statistics to estimate costs, even when no index exists. If you want to remove the
statistics after dropping an index, you must explicitly delete them with delete statistics.
* Truncating a table does not delete the column-level statistics in sysstatistics. In many cases,
tables are truncated and the same data is reloaded. Since truncate table does not delete the
column-level statistics, there is no need to run update statistics after the table is reloaded, if the
data is the same. If you reload the table with data that has a different distribution of key values,
you need to run update statistics.
* You can drop and re-create indexes without affecting the index statistics, by specifying 0 for the
number of steps in the with statistics clause to create index. This create index command does not
affect the statistics in sysstatistics (in SYBASE):
create index title_id_ix on titles (title_id) with statistics using 0 values
This allows you to re-create an index without overwriting statistics that have been edited with
optdiag.
* If two users attempt to create an index on the same table, with the same columns, at the same
time, one of the commands may fail due to an attempt to enter a duplicate key value in
sysstatistics.
2. Create Indexes on KEY fields. Keep index statistics up to date.
NOTE: If data modification performance is poor, you may have too many indexes. While indexes
favor "select operations", they slow down "data modifications".
Indexes are the most important physical design element in improving database performance:
* Indexes help prevent table scans. Instead of reading hundreds of data pages, a few index pages
and data pages can satisfy many queries.
* For some queries, data can be retrieved from a nonclustered index without ever accessing the
data rows.
* Clustered indexes can randomize data inserts, avoiding insert "hot spots" on the last page of a
table.
* Indexes can help avoid sorts, if the index order matches the order of columns in an order by
clause.
In addition to their performance benefits, indexes can enforce the uniqueness of data.
Indexes are database objects that can be created for a table to speed direct access to specific
data rows. Indexes store the values of the key(s) that were named when the index was created,
and logical pointers to the data pages or to other index pages.
Adaptive Server (SYBASE) provides two types of indexes:
* Clustered indexes, where the table data is physically stored in the order of the keys on the
index:
  For allpages-locked tables, rows are stored in key order on pages, and pages are linked in key
order.
  For data-only-locked tables, indexes are used to direct the storage of data on rows and pages,
but strict key ordering is not maintained.
* Nonclustered indexes, where the storage order of data in the table is not related to index keys.
You can create only one clustered index on a table because there is only one possible physical
ordering of the data rows. You can create up to 249 nonclustered indexes per table. A table that
has no clustered index is called a "heap".
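A sketch of both index types in Sybase T-SQL syntax (the titles table is the standard pubs2 sample; the index names and the price column choice are made up):

```sql
-- One clustered index per table: data pages are kept in title_id order.
create clustered index titles_cix on titles (title_id)

-- Additional nonclustered indexes (up to 249 per table) store key values
-- plus pointers; the data rows themselves stay in clustered order.
create nonclustered index titles_price_ix on titles (price)
```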
3. Drop and re-create the indexes that hurt performance.
Drop indexes (in a pre-session step) before inserting data AND re-create indexes (in a
post-session step) after the data is inserted.
NOTE: With indexes, inserting data is slower.
Drop indexes that hurt performance. If an application performs data modifications during the day
and generates reports at night, you may want to drop some indexes in the morning and re-create
them at night. Drop indexes during periods when frequent updates occur and rebuild them before
periods when frequent selects occur.
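For example, as pre- and post-session SQL (Oracle syntax shown; the index and table names are hypothetical):

```sql
-- Pre-session SQL: remove the index so the bulk insert is not slowed down.
DROP INDEX sales_fact_ix;

-- ... the session then loads SALES_FACT ...

-- Post-session SQL: rebuild the index for the reporting queries.
CREATE INDEX sales_fact_ix ON sales_fact (order_date);
```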
4. Also you can improve performance by:
Using transaction log thresholds to automate log dumps and "avoid running out of log space".
Using thresholds for space monitoring in data segments.
Using partitions to speed loading of data.
5. To tune the SQL query
We can use "Parallel Hints" in the SELECT stmt of the SQL query. Also use the table with a large
no. of rows last when joining; in other words, use the table with the smaller no. of rows as a
MASTER source. Also, queries that contain ORDER BY or GROUP BY clauses may benefit from
creating an index on the ORDER BY or GROUP BY columns. Once you optimize the query, use the
SQL override option to take full advantage of these modifications.
6. Registering Multiple Servers
Also, performance can be increased by registering multiple servers which point to the same
repository.
Other methods to Improve Performance
Optimizing the Target Database
If your session writes to a flat file target, you can optimize session performance by writing to a
flat file target that is local to the Informatica Server.
If your session writes to a relational target, consider performing the following tasks to increase
performance:
Drop indexes and key constraints.
Increase checkpoint intervals.
Use bulk loading.
Use external loading.
Turn off recovery.
Increase database network packet size.
Optimize Oracle target databases.
Dropping Indexes and Key Constraints
When you define key constraints or indexes in target tables, you slow the loading of data to those
tables. To improve performance, drop indexes and key constraints before running your session.
You can rebuild those indexes and key constraints after the session completes.
If you decide to drop and rebuild indexes and key constraints on a regular basis, you can
create pre- and post-load stored procedures to perform these operations each time you run the
session.
Note: To optimize performance, use constraint-based loading only if necessary.
Increasing Checkpoint Intervals
The Informatica Server performance slows each time it waits for the database to perform a
checkpoint. To increase performance, consider increasing the database checkpoint interval.
When you increase the database checkpoint interval, you increase the likelihood that the
database performs checkpoints as necessary, when the size of the database log file reaches its
limit.
Bulk Loading on Sybase and Microsoft SQL Server
You can use bulk loading to improve the performance of a session that inserts a large amount of
data into a Sybase or Microsoft SQL Server database. Configure bulk loading on the Targets
dialog box in the session properties.
When bulk loading, the Informatica Server bypasses the database log, which speeds
performance. Without writing to the database log, however, the target database cannot perform
rollback. As a result, the Informatica Server cannot perform recovery of the session. Therefore,
you must weigh the importance of improved session performance against the ability to recover an
incomplete session.
If you have indexes or key constraints on your target tables and you want to enable bulk
loading, you must drop the indexes and constraints before running the session. After the session
completes, you can rebuild them. If you decide to use bulk loading with the session on a regular
basis, you can create pre- and post-load stored procedures to drop and rebuild indexes and key
constraints.
For other databases, even if you configure the bulk loading option, the Informatica Server
ignores the commit interval mentioned and commits as needed.
External Loading on Teradata, Oracle, and Sybase IQ
You can use the External Loader session option to integrate external loading with a session.
If you have a Teradata target database, you can use the Teradata external loader utility to bulk
load target files.
If your target database runs on Oracle, you can use the Oracle SQL*Loader utility to bulk
load target files. When you load data to an Oracle database using a partitioned session, you can
increase performance if you create the Oracle target table with the same number of partitions you
use for the session.
If your target database runs on Sybase IQ, you can use the Sybase IQ external loader
utility to bulk load target files. If your Sybase IQ database is local to the Informatica Server on
your UNIX system, you can increase performance by loading data to target tables directly from
named pipes. Use pmconfig to enable the SybaseIQLocalToPMServer option. When you enable
this option, the Informatica Server loads data directly from named pipes rather than writing to a
flat file for the Sybase IQ external loader.
Increasing Database Network Packet Size
You can increase the network packet size in the Informatica Server Manager to reduce target
bottlenecks. For Sybase and Microsoft SQL Server, increase the network packet size to 8K - 16K.
For Oracle, increase the network packet size in tnsnames.ora and listener.ora. If you increase the
network packet size in the Informatica Server configuration, you also need to configure the
database server network memory to accept larger packet sizes.
Optimizing Oracle Target Databases
If your target database is Oracle, you can optimize the target database by checking the storage
clause, space allocation, and rollback segments.
When you write to an Oracle database, check the storage clause for database objects.
Make sure that tables are using large initial and next values. The database should also store
table and index data in separate tablespaces, preferably on different disks.
When you write to Oracle target databases, the database uses rollback segments during
loads. Make sure that the database stores rollback segments in appropriate tablespaces,
preferably on different disks. The rollback segments should also have appropriate storage
clauses.
You can optimize the Oracle target database by tuning the Oracle redo log. The Oracle
database uses the redo log to log loading operations. Make sure that the redo log size and buffer
size are optimal. You can view redo log properties in the init.ora file.
If your Oracle instance is local to the Informatica Server, you can optimize performance
by using the IPC protocol to connect to the Oracle database. You can set up the Oracle database
connection in listener.ora and tnsnames.ora.
Improving Performance at mapping level
Optimizing Datatype Conversions
Forcing the Informatica Server to make unnecessary datatype conversions slows performance.
For example, if your mapping moves data from an Integer column to a Decimal column, then back
to an Integer column, the unnecessary datatype conversion slows performance. Where possible,
eliminate unnecessary datatype conversions from mappings.
Some datatype conversions can improve system performance. Use integer values in place of
other datatypes when performing comparisons using Lookup and Filter transformations.
For example, many databases store U.S. zip code information as a Char or Varchar datatype. If
you convert your zip code data to an Integer datatype, the lookup database stores the zip code
94303-1234 as 943031234. This helps increase the speed of lookup comparisons based on
zip code.
Optimizing Lookup Transformations
If a mapping contains a Lookup transformation, you can optimize the lookup. Some of the things
you can do to increase performance include caching the lookup table, optimizing the lookup
condition, or indexing the lookup table.
Caching Lookups
If a mapping contains Lookup transformations, you might want to enable lookup caching. In
general, you want to cache lookup tables that need less than 300MB.
When you enable caching, the Informatica Server caches the lookup table and queries the lookup
cache during the session. When this option is not enabled, the Informatica Server queries the
lookup table on a row-by-row basis. You can increase performance using a shared or persistent
cache:
Shared cache. You can share the lookup cache between multiple transformations. You can
share an unnamed cache between transformations in the same mapping. You can share a
named cache between transformations in the same or different mappings.
Persistent cache. If you want to save and reuse the cache files, you can configure the
transformation to use a persistent cache. Use this feature when you know the lookup table does
not change between session runs. Using a persistent cache can improve performance because
the Informatica Server builds the memory cache from the cache files instead of from the
database.
Reducing the Number of Cached Rows
Use the Lookup SQL Override option to add a WHERE clause to the default SQL statement. This
allows you to reduce the number of rows included in the cache.
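For instance (the table and column names are hypothetical), a WHERE clause added via the SQL override caches only the rows the session can actually match:

```sql
-- Default generated query might be:
--   SELECT CUST_ID, CUST_STATUS FROM CUST_LKP ORDER BY CUST_ID, CUST_STATUS
-- Override with a WHERE clause so only active customers are cached.
SELECT CUST_ID,
       CUST_STATUS
FROM   CUST_LKP
WHERE  CUST_STATUS = 'ACTIVE'
ORDER BY CUST_ID, CUST_STATUS
```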
Optimizing the Lookup Condition
If you include more than one lookup condition, place the conditions with an equal sign first to
optimize lookup performance.
Indexing the Lookup Table
The Informatica Server needs to query, sort, and compare values in the lookup condition
columns. The index needs to include every column used in a lookup condition. You can improve
performance for both cached and uncached lookups:
Cached lookups. You can improve performance by indexing the columns in the lookup ORDER
BY. The session log contains the ORDER BY statement.
Uncached lookups. Because the Informatica Server issues a SELECT statement for each row
passing into the Lookup transformation, you can improve performance by indexing the columns in
the lookup condition.
Improving Performance at Repository level
Tuning Repository Performance
The PowerMart and PowerCenter repository has more than 80 tables, and almost all tables use
one or more indexes to speed up queries. Most databases keep and use column distribution
statistics to determine which index to use to execute SQL queries optimally. Database servers do
not update these statistics continuously.
In frequently-used repositories, these statistics can become outdated very quickly and SQL query
optimizers may choose a less than optimal query plan. In large repositories, the impact of
choosing a sub-optimal query plan can affect performance drastically. Over time, the repository
becomes slower and slower.
To optimize SQL queries, you might update these statistics regularly. The frequency of updating
statistics depends on how heavily the repository is used. Updating statistics is done table by
table. The database administrator can create scripts to automate the task.
You can use the following information to generate scripts to update distribution statistics.
Note: All PowerMart/PowerCenter repository tables and index names begin with "OPB_".
Oracle Database
You can generate scripts to update distribution statistics for an Oracle repository.
To generate scripts for an Oracle repository:
1. Run the following queries:
select 'analyze table ', table_name, ' compute statistics;' from user_tables where
table_name like 'OPB_%'
select 'analyze index ', INDEX_NAME, ' compute statistics;' from user_indexes where
INDEX_NAME like 'OPB_%'
This produces an output like the following:
'ANALYZETABLE' TABLE_NAME 'COMPUTESTATISTICS;'
-------------- ---------------- --------------------
analyze table OPB_ANALYZE_DEP compute statistics;
analyze table OPB_ATTR compute statistics;
analyze table OPB_BATCH_OBJECT compute statistics;
2. Save the output to a file.
3. Edit the file and remove all the headers.
Headers are like the following:
'ANALYZETABLE' TABLE_NAME 'COMPUTESTATISTICS;'
-------------- ---------------- --------------------
4. Run this as an SQL script. This updates repository table statistics.
Microsoft SQL Server
You can generate scripts to update distribution statistics for a Microsoft SQL Server repository.
To generate scripts for a Microsoft SQL Server repository:
1. Run the following query:
select 'update statistics ', name from sysobjects where name like 'OPB_%'
This produces an output like the following:
------------------ ------------------
update statistics OPB_ANALYZE_DEP
update statistics OPB_ATTR
update statistics OPB_BATCH_OBJECT
2. Save the output to a file.
3. Edit the file and remove the header information.
Headers are like the following:
------------------ ------------------
4. Add a go at the end of the file.
5. Run this as a SQL script. This updates repository table statistics.
Improving Performance at Session Level
Optimizing the Session
Once you optimize your source database, target database, and mapping, you can focus on
optimizing the session. You can perform the following tasks to improve overall performance:
Run concurrent batches.
Partition sessions.
Reduce error tracing.
Remove staging areas.
Tune session parameters.
Table 10-1 lists the settings and values you can use to improve session performance:
Table 10-1. Session Tuning Parameters
Setting            Default Value              Suggested Minimum   Suggested Maximum
DTM Buffer Pool    12,000,000 bytes (12 MB)   6,000,000 bytes     128,000,000 bytes
Buffer block size  64,000 bytes               4,000 bytes         128,000 bytes
Index cache size   1,000,000 bytes            1,000,000 bytes     12,000,000 bytes
Data cache size    2,000,000 bytes            2,000,000 bytes     24,000,000 bytes
Commit interval    10,000 rows                N/A                 N/A
                   Disabled                   N/A                 N/A
Tracing Level      Normal                     Terse               N/A
How to correct and load the rejected files when the session completes
During a session, the Informatica Server creates a reject file for each target instance in the
mapping. If the writer or the target rejects data, the Informatica Server writes the rejected row into
the reject file. By default, the Informatica Server creates reject files in the $PMBadFileDir server
variable directory.
The reject file and session log contain information that helps you determine the cause of the
reject. You can correct reject files and load them to relational targets using the Informatica reject
loader utility. The reject loader also creates another reject file for the data that the writer or target
rejects during the reject loading.
Complete the following tasks to load reject data into the target:
Locate the reject file.
Correct bad data.
Run the reject loader utility.
NOTE: You cannot load rejected data into a flat file target.
After you locate a reject file, you can read it using a text editor that supports the reject file code
page.
Reject files contain rows of data rejected by the writer or the target database. Though the
Informatica Server writes the entire row in the reject file, the problem generally centers on one
column within the row. To help you determine which column caused the row to be rejected, the
Informatica Server adds row and column indicators to give you more information about each row:
Row indicator. The first column in each row of the reject file is the row indicator. The
numeric indicator tells whether the row was marked for insert, update, delete, or reject.
Column indicator. Column indicators appear after every column of data. The alphabetical
character indicators tell whether the data was valid, overflow, null, or truncated.
The following sample reject file shows the row and column indicators:
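A reject file of the form described might look like the lines below. The values are invented for illustration: the first field of each line is the row indicator, and each data value is followed by its column indicator.

```text
0,D,1921,D,Alice,D,04/01/2001,D
0,D,1922,O,Bob,D,04/01/2001,D
1,D,1923,D,,N,04/01/2001,D
3,D,1924,D,Carol,T,04/01/2001,D
```

Here the second line was rejected because the first column overflowed (O), the third contains a null (N), and the fourth was marked for reject (row indicator 3) and also holds a truncated string (T).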
Row Indicators
The first column in the reject file is the row indicator. The number listed as the row indicator tells
the writer what to do with the row of data.
Table 15-1 describes the row indicators in a reject file:
Table 15-1. Row Indicators in Reject File
Row Indicator   Meaning   Rejected By
0               Insert    Writer or target
1               Update    Writer or target
2               Delete    Writer or target
3               Reject    Writer
If a row indicator is 3, the writer rejected the row because an update strategy expression marked
it for reject.
If a row indicator is 0, 1, or 2, either the writer or the target database rejected the row. To narrow
down the reason why rows marked 0, 1, or 2 were rejected, review the column indicators and
consult the session log.
Column Indicators
After the row indicator is a column indicator, followed by the first column of data, and another
column indicator. Column indicators appear after every column of data and define the type of the
data preceding it.
Table 15-2 describes the column indicators in a reject file:
Table 15-2. Column Indicators in Reject File
D - Valid data. Good data; the writer passes it to the target database. The target accepts it
unless a database error occurs, such as finding a duplicate key.
O - Overflow. Numeric data exceeded the specified precision or scale for the column. Bad
data, if you configured the mapping target to reject overflow or truncated data.
N - Null. The column contains a null value. Good data; the writer passes it to the target,
which rejects it if the target database does not accept null values.
T - Truncated. String data exceeded a specified precision for the column, so the Informatica
Server truncated it. Bad data, if you configured the mapping target to reject overflow or
truncated data.
After you correct the target data in each of the reject files, append ".in" to each reject file you
want to load into the target database. For example, after you correct the reject file
t_AvgSales_1.bad, you can rename it t_AvgSales_1.bad.in.
After you correct the reject file and rename it, you can use the reject loader to
send those files through the writer to the target database.
Use the reject loader utility from the command line to load rejected files into target tables. The
syntax for reject loading differs on UNIX and Windows NT/2000 platforms.
Use the following syntax for UNIX:
pmrejldr pmserver.cfg [folder_name:]session_name
Recovering Sessions
If you stop a session or if an error causes a session to stop, refer to the session and error logs to
determine the cause of failure. Correct the errors, and then complete the session. The method
you use to complete the session depends on the properties of the mapping, session, and
Informatica Server configuration.
Use one of the following methods to complete the session:
Run the session again if the Informatica Server has not issued a commit.
Truncate the target tables and run the session again if the session is not recoverable.
Consider performing recovery if the Informatica Server has issued at least one commit.
When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY
table and notes the row ID of the last row committed to the target database. The Informatica
Server then reads all sources again and starts processing from the next row ID. For example, if
the Informatica Server commits 10,000 rows before the session fails, when you run recovery, the
Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001. The
commit point may be different for source- and target-based commits.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable
Recovery in the Informatica Server setup before you run a session so the Informatica Server can
create and/or write entries in the OPB_SRVR_RECOVERY table.
Causes for Session Failure
Reader errors. Errors encountered by the Informatica Server while reading the source
database or source files. Reader threshold errors can include alignment errors while
running a session in Unicode mode.
Writer errors. Errors encountered by the Informatica Server while writing to the target
database or target files. Writer threshold errors can include key constraint violations,
loading nulls into a NOT NULL field, and database trigger responses.
Transformation errors. Errors encountered by the Informatica Server while transforming
data. Transformation threshold errors can include conversion errors, and any condition
set up as an ERROR, such as null input.
Fatal Error
A fatal error occurs when the Informatica Server cannot access the source, target, or repository.
This can include loss of connection or target database errors, such as lack of database space to
load data. If the session uses a Normalizer or Sequence Generator transformation, the
Informatica Server cannot update the sequence values in the repository, and a fatal error occurs.
31. What is meant by lookup caches?
The Informatica Server builds a cache in memory when it processes the first row of data in a
cached Lookup transformation. It allocates memory for the cache based on the amount you
configure in the transformation or session properties. The Informatica Server stores condition
values in the index cache and output values in the data cache.
32. What are the types of lookup caches?
Persistent cache: You can save the lookup cache files and reuse them the next time the
Informatica Server processes a Lookup transformation configured to use the cache.
Recache from database: If the persistent cache is not synchronized with the lookup table, you can
configure the Lookup transformation to rebuild the lookup cache.
Static cache: You can configure a static, or read-only, cache for any lookup table. By default, the
Informatica Server creates a static cache. It caches the lookup table and lookup values in the
cache for each row that comes into the transformation. When the lookup condition is true, the
Informatica Server does not update the cache while it processes the Lookup transformation.
Dynamic cache: If you want to cache the target table and insert new rows into the cache and the
target, you can create a Lookup transformation that uses a dynamic cache. The Informatica Server
dynamically inserts data into the target table.
Shared cache: You can share the lookup cache between multiple transformations. You can share
an unnamed cache between transformations in the same mapping.
33. Difference between static cache and dynamic cache?
Static cache:
You cannot insert or update the cache.
The Informatica Server returns a value from the lookup table or cache when the condition is
true. When the condition is not true, the Informatica Server returns the default value for
connected transformations and NULL for unconnected transformations.
Dynamic cache:
You can insert rows into the cache as you pass them to the target.
The Informatica Server inserts rows into the cache when the condition is false. This indicates
that the row is not in the cache or target table. You can pass these rows to the target table.
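The dynamic-cache behaviour above can be sketched as a simple dictionary analogy. This is an assumed simplification for intuition only, not Informatica internals: when the lookup condition finds no match, the row is inserted into the cache (and passed to the target); when it finds a match, the cache is left alone.

```python
# Analogy only: a dynamic lookup cache keyed on the lookup condition value.
cache = {}  # condition value -> cached output value

def dynamic_lookup(key, row):
    """Return True if the row was new and inserted into the cache."""
    if key in cache:      # condition true: row already in cache/target
        return False
    cache[key] = row      # condition false: insert into cache (and target)
    return True

print(dynamic_lookup(101, "Alice"))  # new row
print(dynamic_lookup(101, "Alice"))  # already cached
```

A static cache, by contrast, would only ever do the `key in cache` check and never the insert.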
34. Which transformation should we use to normalize COBOL and relational sources?
Normalizer transformation.
When you drag a COBOL source into the Mapping Designer workspace, the Normalizer
transformation automatically appears, creating input and output ports for every column in the
source.
35. How does the Informatica Server sort the string values in a Rank transformation?
When the Informatica Server runs in the ASCII data movement mode, it sorts session data using a
binary sort order. If you configure the session to use a binary sort order, the Informatica Server
calculates the binary value of each string and returns the specified number of rows with the
highest binary values for the string.
36. What are the rank caches?
During the session, the Informatica Server compares an input row with rows in the data cache. If
the input row out-ranks a stored row, the Informatica Server replaces the stored row with the input
row. The Informatica Server stores group information in an index cache and row data in a data
cache.
37. What is the RANKINDEX in a Rank transformation?
The Designer automatically creates a RANKINDEX port for each Rank transformation. The
Informatica Server uses the Rank Index port to store the ranking position for each record in a
group. For example, if you create a Rank transformation that ranks the top 5 salespersons for
each quarter, the rank index numbers the salespeople from 1 to 5.
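The rank-index idea can be illustrated with a short sketch. The data, field names, and group sizes below are invented; the point is only that each group gets its own 1..5 numbering of the top rows, mirroring the RANKINDEX description above.

```python
# Sketch (illustrative): number the top-5 rows per group, the way a
# rank index numbers salespersons within each quarter.
from collections import defaultdict

sales = [("Q1", "a", 500), ("Q1", "b", 900), ("Q1", "c", 700),
         ("Q2", "d", 400), ("Q2", "e", 800)]

by_group = defaultdict(list)
for quarter, person, amount in sales:
    by_group[quarter].append((amount, person))

ranked = {}
for quarter, rows in by_group.items():
    top = sorted(rows, reverse=True)[:5]          # keep the top 5 per group
    for index, (amount, person) in enumerate(top, start=1):
        ranked[(quarter, person)] = index         # 1 = highest in its group

print(ranked[("Q1", "b")])
```

Person "b" leads Q1, so its rank index is 1; the numbering restarts for Q2.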
42. What are the types of data that pass between the Informatica Server and a stored procedure?
Three types of data:
Input/Output parameters
Return values
Status code.
43. What is the status code?
The status code provides error handling for the Informatica Server during the session. The stored
procedure issues a status code that notifies whether or not the stored procedure completed
successfully. This value cannot be seen by the user; it is only used by the Informatica Server to
determine whether to continue running the session or stop.
55. What are the types of mapping in the Getting Started Wizard?
Simple Pass-Through mapping:
Loads a static fact or dimension table by inserting all rows. Use this mapping when you want
to drop all existing data from your table before loading new data.
Slowly Growing Target:
Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to
load new data when existing data does not require updates.
56. What are the mappings that we use for slowly changing dimension tables?
Type 1: Rows containing changes to existing dimensions are updated in the target by overwriting
the existing dimension. In the Type 1 Dimension mapping, all rows contain current dimension
data. Use the Type 1 Dimension mapping to update a slowly changing dimension table when you
do not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the
target. Changes are tracked in the target table by versioning the primary key and creating a
version number for each dimension in the table.
Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table
when you want to keep a full history of dimension data in the table. Version numbers and
versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons
and inserts only those found to be new dimensions into the target. Rows containing changes to
existing dimensions are updated in the target. When updating an existing dimension, the
Informatica Server saves existing data in different columns of the same row and replaces the
existing data with the updates.
57. What are the different types of Type 2 dimension mappings?
Type 2 Dimension/Version Data mapping: In this mapping, an updated dimension in the
source gets inserted into the target along with a new version number, and a newly added
dimension in the source is inserted into the target with a primary key.
Type 2 Dimension/Flag Current mapping: This mapping is also used for slowly changing
dimensions. In addition, it creates a flag value for a changed or new dimension.
The flag indicates whether the dimension is new or newly updated. Recent dimensions are saved
with the current flag value 1, and updated dimensions are saved with the value 0.
Type 2 Dimension/Effective Date Range mapping: This is also one flavour of Type 2 mapping used
for slowly changing dimensions. This mapping also inserts both new and changed dimensions into
the target, and changes are tracked by the effective date range for each version of each dimension.
58. How can you recognise whether or not the newly added rows in the source get inserted into
the target?
In the Type 2 mapping we have three options to recognise the newly added rows:
Version number
Flag value
Effective date range
59. What are the two types of processes that Informatica runs for a session?
Load Manager process: Starts the session, creates the DTM process, and sends post-session
email when the session completes.
The DTM process: Creates threads to initialize the session, read, write, and transform data, and
handle pre- and post-session operations.
61. Can you generate reports in Informatica?
Yes. By using the Metadata Reporter we can generate reports in Informatica.
62. What is the Metadata Reporter?
It is a web-based application that enables you to run reports against repository metadata.
With the Metadata Reporter, you can access information about your repository without having
knowledge of SQL, the transformation language, or the underlying tables in the repository.
69. What are the tasks that the Load Manager process will do?
Manages session and batch scheduling: When you start the Informatica Server, the Load
Manager launches and queries the repository for a list of sessions configured to run on the
Informatica Server. When you configure a session, the Load Manager maintains a list of sessions
and session start times. When you start a session, the Load Manager fetches the session
information from the repository to perform validations and verifications prior to starting the DTM
process.
Locking and reading the session: When the Informatica Server starts a session, the Load Manager
locks the session in the repository. Locking prevents you from starting the session again and again.
Reading the parameter file: If the session uses a parameter file, the Load Manager reads the
parameter file and verifies that the session-level parameters are declared in the file.
Verifies permissions and privileges: When the session starts, the Load Manager checks whether or
not the user has the privileges to run the session.
Creating log files: The Load Manager creates a log file that contains the status of the session.
70. What is the DTM process?
After the Load Manager performs validations for the session, it creates the DTM process. The
DTM process creates and manages the threads that carry out the session tasks. It creates the
master thread, which creates and manages all the other threads.
71. What are the different threads in the DTM process?
Master thread: Creates and manages all other threads.
Mapping thread: One mapping thread is created for each session; it fetches session and mapping
information.
Pre- and post-session threads: These are created to perform pre- and post-session operations.
Reader thread: One thread is created for each partition of a source; it reads data from the source.
Writer thread: Created to load data to the target.
Transformation thread: Created to transform data.
72. What are the data movement modes in Informatica?
The data movement mode determines how the Informatica Server handles character data. You
choose the data movement mode in the Informatica Server configuration settings. Two types of
data movement modes are available in Informatica:
ASCII mode
Unicode mode.
73. What are the output files that the Informatica Server creates during a session run?
Informatica Server log: The Informatica Server (on UNIX) creates a log for all status and error
messages (default name: pm.server.log). It also creates an error log for error messages. These
files are created in the Informatica home directory.
Session log file: The Informatica Server creates a session log file for each session. It writes
information about the session into the log file, such as the initialization process, creation of SQL
commands for reader and writer threads, errors encountered, and the load summary. The amount
of detail in the session log file depends on the tracing level that you set.
Session detail file: This file contains load statistics for each target in the mapping. Session details
include information such as table name and the number of rows written or rejected. You can view
this file by double-clicking on the session in the Monitor window.
Performance detail file: This file contains information known as session performance details,
which helps you see where performance can be improved. To generate this file, select the
performance detail option in the session property sheet.
Reject file: This file contains the rows of data that the writer does not write to targets.
Control file: The Informatica Server creates a control file and a target file when you run a session
that uses the external loader. The control file contains information about the target flat file, such
as data format and loading instructions for the external loader.
Post-session email: Post-session email allows you to automatically communicate information
about a session run to designated recipients. You can create two different messages: one if the
session completed successfully, the other if the session fails.
Indicator file: If you use a flat file as a target, you can configure the Informatica Server to create
an indicator file. For each target row, the indicator file contains a number to indicate whether the
row was marked for insert, update, delete, or reject.
Output file: If the session writes to a target file, the Informatica Server creates the target file based
on the file properties entered in the session property sheet.
Cache files: When the Informatica Server creates a memory cache, it also creates cache files. The
Informatica Server creates index and data cache files for the following transformations:
Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation
74. In which circumstances does the Informatica Server create reject files?
When it encounters DD_REJECT in an Update Strategy transformation.
When a row violates a database constraint.
When a field in a row is truncated or overflows.
75. What is polling?
It displays updated information about the session in the Monitor window. The Monitor window
displays the status of each session when you poll the Informatica Server.
76. Can you copy a session to a different folder or repository?
Yes. By using the Copy Session Wizard you can copy a session to a different folder or repository,
but the target folder or repository must contain the mapping of that session.
If the target folder or repository does not have the mapping of the session being copied, you
have to copy that mapping first before you copy the session.
77. What is a batch, and what are the types of batches?
A grouping of sessions is known as a batch. Batches are of two types:
Sequential: Runs sessions one after the other.
Concurrent: Runs sessions at the same time.
If you have sessions with source-target dependencies, you have to go for a sequential batch to start
the sessions one after another. If you have several independent sessions, you can use concurrent
batches, which run all the sessions at the same time.
78. Can you copy the batches?
No.
79. How many sessions can you create in a batch?
Any number of sessions.
80. When does the Informatica Server mark a batch as failed?
If one of the sessions is configured to "run if previous completes" and that previous session fails.
81. What is the command used to run a batch?
pmcmd is used to start a batch.
82. What are the different options used to configure sequential batches?
Two options:
Run the session only if the previous session completes successfully.
Always run the session.
83. In a sequential batch, can you run a session if the previous session fails?
Yes, by setting the option "Always runs the session".
84. Can you start a batch within a batch?
You cannot. If you want to start a batch that resides in a batch, create a new independent batch and
copy the necessary sessions into the new batch.
85. Can you start a session inside a batch individually?
We can start our required session only in the case of a sequential batch; in the case of a concurrent
batch we cannot do this.
86. How can you stop a batch?
By using the Server Manager or pmcmd.
87. What are the session parameters?
Session parameters are like mapping parameters; they represent values you might want to change
between sessions, such as database connections or source files.
The Server Manager also allows you to create user-defined session parameters. The following are
user-defined session parameters:
Database connections
Source file name: use this parameter when you want to change the name or location of a
session source file between session runs.
Target file name: use this parameter when you want to change the name or location of a
session target file between session runs.
Reject file name: use this parameter when you want to change the name or location of
session reject files between session runs.
88. What is a parameter file?
A parameter file defines the values for parameters and variables used in a session. A parameter
file is a file created with a text editor such as WordPad or Notepad.
You can define the following values in a parameter file:
Mapping parameters
Mapping variables
Session parameters
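A minimal parameter file might look like the sketch below. The folder name, session name, and parameter names are invented for illustration, and the exact section-heading syntax varies between Informatica versions, so treat this only as the general shape: a section per session, followed by name=value pairs ($ prefix for session parameters, $$ for mapping parameters/variables).

```text
[MyFolder.s_load_sales]
$DBConnectionSource=oracle_dev
$InputFile1=/data/src/sales_01.dat
$$LastRunDate=04/01/2001
```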
89. How can you access a remote source in your session?
Relational source: To access a relational source located in a remote place, you need to
configure a database connection to the data source.
File source: To access a remote source file, you must configure the FTP connection to the
host machine before you create the session.
Heterogeneous: When your mapping contains more than one source type, the Server Manager
creates a heterogeneous session that displays source options for all types.
90. What is the difference between partitioning of relational targets and partitioning of file targets?
If you partition a session with a relational target, the Informatica Server creates multiple
connections to the target database to write target data concurrently. If you partition a session with
a file target, the Informatica Server creates one target file for each partition. You can configure
session properties to merge these target files.
91. What are the transformations that restrict the partitioning of sessions?
Advanced External Procedure transformation and External Procedure transformation: these
transformations contain a check box on the Properties tab to allow partitioning.
Aggregator transformation: if you use sorted ports, you cannot partition the associated source.
Joiner transformation: you cannot partition the master source for a Joiner transformation.
Normalizer transformation.
XML targets.
94. What are the types of metadata stored in the repository?
The following are the types of metadata stored in the repository:
Database connections
Global objects
Multidimensional metadata
Reusable transformations
Sessions and batches
Shortcuts
Source definitions
Target definitions
95. What is the PowerCenter repository?
The PowerCenter repository allows you to share metadata across repositories to create a data
mart domain. In a data mart domain, you can create a single global repository to store metadata
used across an enterprise, and a number of local repositories to share the global metadata as
needed.
96. How can you work with a remote database in Informatica? Did you work directly by using
remote connections?
To work with a remote data source, you need to connect to it with remote connections. But it is not
preferable to work with that remote source directly by using remote connections; instead, you
bring that source onto your local machine where the Informatica Server resides. If you work
directly with the remote source, session performance will decrease because less data passes across
the network in a given time.
98. What are the scheduling options to run a session?
You can schedule a session to run at a given time or interval, or you can manually run the session.
Different options of scheduling:
Run only on demand: the server runs the session only when the user starts the session.
Run once: the Informatica Server runs the session only once at a specified date and time.
Run every: the Informatica Server runs the session at regular intervals, as you configured.
Customized repeat: the Informatica Server runs the session at the dates and times specified in
the Repeat dialog box.
99. What is tracing level and what are the types of tracing levels?
Tracing level represents the amount of information that the Informatica Server writes in a log file.
Types of tracing level:
Normal
Terse
Verbose init
Verbose data
100. What is the difference between the Stored Procedure transformation and the External
Procedure transformation?
In the case of a Stored Procedure transformation, the procedure is compiled and executed in a
relational data source. You need a database connection to import the stored procedure into your
mapping. Whereas in an External Procedure transformation, the procedure or function is executed
outside of the data source, i.e. you need to make it a DLL to access it in your mapping. No
database connection is needed in the case of the External Procedure transformation.
101. Explain about recovering sessions.
If you stop a session or if an error causes a session to stop, refer to the session and error
logs to determine the cause of failure. Correct the errors, and then complete the session. The
method you use to complete the session depends on the properties of the mapping, session, and
Informatica Server configuration.
Use one of the following methods to complete the session:
Run the session again if the Informatica Server has not issued a commit.
Truncate the target tables and run the session again if the session is not recoverable.
Consider performing recovery if the Informatica Server has issued at least one commit.
102. Explain about perform recovery.
When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY
table and notes the row ID of the last row committed to the target database. The Informatica
Server then reads all sources again and starts processing from the next row ID. For example, if
the Informatica Server commits 10,000 rows before the session fails, when you run recovery, the
Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable
Recovery in the Informatica Server setup before you run a session so the Informatica Server can
create and/or write entries in the OPB_SRVR_RECOVERY table.
Re: What are the enhancements made in Informatica 7.1.1 when compared to 6.2.2?
Lookup on a flat file in 7.1; no such provision in 6.1.
Union transformation in 7.1, not in 6.1.
Transaction Control transformation not in 6.1.
File repository in 7.1.
Re: One flat file contains some data, but I do not want to load the first and last
record. Can you tell me the complete logic? Answer
# 1
Put one Expression transformation after the Source Qualifier and type this code.
For example, take this record:
output variable name = IIF(SUBSTR(columnname, 1, 3)
Similarly, put the condition for the trailer record.
After this Expression, put one Filter transformation like this:
output variable name = 'Y'
"#$%t is &ifferen!e 'et(een )ie( n& mteri*i+e& )ie(,
)iews contains 4uery whene"er e1ecute "iews it has read from base table
&here as $ "iews loading or replicated taes place only once which gi"es you better 4uery
*efresh m "iews @.on commit and B. on demand
,'omplete, ne"er, fast, force-
3. What is a bitmap index and why is it used for DWH?
A bitmap for each key value replaces a list of rowids. A bitmap index is more efficient for data
warehousing because of low cardinality and low updates, and it is very efficient for WHERE clauses.
4. Why do we need a staging area database for DWH?
A staging area is needed to clean operational data before loading it into the data warehouse.
Cleaning in the sense of merging data that comes from different sources.
5. Difference between OLTP and DWH?
An OLTP system is basically application-oriented (e.g. a purchase order is the functionality of an
application), whereas the DWH concern is subject-oriented (subject in the sense of customer,
product, item).
OLTP:
- Application oriented
- Used to run the business
- Detailed data
- Current, up to date
- Isolated data
- Repetitive access
- Clerical user
- Performance sensitive
- Few records accessed at a time (tens)
- Read/update access
- No data redundancy
- Database size 100 MB-100 GB
DWH:
- Subject oriented
- Used to analyze the business
- Summarized and refined
- Snapshot data
- Integrated data
- Ad-hoc access
- Knowledge user
- Performance relaxed
- Large volumes accessed at a time (millions)
- Mostly read (batch update)
- Redundancy present
- Database size 100 GB - few terabytes
19. You transfer 100000 rows to a target but some rows get discarded. How will you trace
them? And where do they get loaded?
Rejected records are loaded into bad files, which contain a record indicator and column indicators.
The record indicator is identified by (0-insert, 1-update, 2-delete, 3-reject) and the column
indicator is identified by (D-valid, O-overflow, N-null, T-truncated).
Normally data may get rejected for different reasons, due to the transformation logic.
24. What are shortcuts? Where can they be used? What are the advantages?
There are 2 shortcuts (local and global). Local is used in a local repository and global is used in a
global repository. The advantage is reusing an object without creating multiple objects. Say for
example you want to use a source definition in 10 mappings in 10 different folders; without creating
10 multiple sources, you create 10 shortcuts.
34. What are the tracing levels?
Normal - contains only session initialization details and transformation details: number of records
rejected, applied.
Terse - only initialization details will be there.
Verbose Initialization - normal setting information plus detailed information about the
transformation.
Verbose Data - verbose init. settings and all information about the session.
41. What are Stored Procedure transformations? What is the purpose of the SP transformation?
How did you go about using it in your project?
Connected and unconnected stored procedures.
An unconnected stored procedure is used for database-level activities such as pre- and post-load.
A connected stored procedure is used at the Informatica level, for example passing one parameter
as input and capturing the return value from the stored procedure.
Stored procedure types:
Normal - row-wise check
Pre-Load Source - (capture source incremental data for incremental aggregation)
Post-Load Source - (delete temporary tables)
Pre-Load Target - (check disk space available)
Post-Load Target - (drop and recreate index)
61. What are the commit intervals?
Source-based commit: based on the number of active source records (Source Qualifier) read.
With the commit interval set to 10000 and the Source Qualifier reading 10000 rows, if due to
transformation logic 3000 rows get rejected, then when 7000 rows reach the target the commit
will fire; the writer buffer does not matter (rows are held in the buffer).
Target-based commit: based on the rows in the buffer and the commit interval. With a target-based
commit set to 10000 but the writer buffer filling every 7500 rows, the buffer next fills at 15000 and
the commit statement fires then, then at 22500, and so on.
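The target-based arithmetic above can be checked with a short sketch. The assumption (taken from the example, not a statement of Informatica internals) is that a commit fires at the first buffer fill on or after each multiple of the commit interval.

```python
# Sketch: reproduce the target-based commit arithmetic from the example.
# The buffer fills every `buffer_fill` rows; a commit fires at the first
# fill that reaches the next multiple of `interval`.
def commit_points(interval, buffer_fill, total_rows):
    points, next_target = [], interval
    for filled in range(buffer_fill, total_rows + 1, buffer_fill):
        if filled >= next_target:
            points.append(filled)
            next_target = (filled // interval + 1) * interval
    return points

print(commit_points(10000, 7500, 30000))  # → [15000, 22500, 30000]
```

This matches the example: with an interval of 10000 and fills every 7500 rows, commits land at 15000, 22500, and so on.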
63. How did you schedule sessions in your project?
Run once (set 2 parameters: the date and time when the session should start).
Run every (the Informatica Server runs the session at a regular interval, as configured; parameters:
days, hours, minutes, end on, end after, forever).
Customized repeat (repeat every 2 days; daily frequency hr, min; every week; every month).
Run only on demand (manually run); this is not session scheduling.