
Master Informatica Questions and Answer Set
Version 2.5
The one stop master manual of Informatica™ interview questions and answers

DWBIConcepts.com
www.dwbiconcepts.com – Community of DWBI Professionals

Copyright Notice

Informatica Master Question and Answer Set is copyright © DWBIConcepts 2013.

All rights reserved. No part of this book shall be reproduced, stored in a retrieval system,
or transmitted by any means – electronic, mechanical, photocopying, recording, or otherwise –
without written permission from the publisher. No patent liability is assumed
with respect to the use of the information contained herein. Although every precaution
has been taken in the preparation of this book, the publisher and author assume no
responsibility for errors or omissions. Neither is any liability assumed for damages
resulting from the use of the information contained herein.

Trademarks
All terms mentioned in this book that are known to be trademarks or service marks have
been appropriately capitalized. New Riders Publishing cannot attest to the accuracy of
this information. Use of a term in this book should not be regarded as affecting the
validity of any trademark or service mark.

Warning and disclaimer


Every effort has been made to make this book as complete and as accurate as possible,
but no warranty of fitness is implied. The information is provided on an “as is” basis. The
author and the publisher shall have neither liability nor responsibility to any person or
entity with respect to any loss or damages arising from the information contained in this
book.

© www.dwbiconcepts.com – All rights reserved.

How this book should be used

This book contains questions and answers on Informatica PowerCenter™ and allied tools
that are commonly asked in job interviews. It is therefore written for candidates who are
preparing for job interviews. We suggest that candidates start preparing from the material
at least one week in advance, so that they can finish reading the entire content before
appearing for the interview. A candidate who is stuck on a question or answer, is unclear
about something or has a doubt can interact with the experts through the DWBIConcepts forum.

To help readers, we have tagged certain questions as shown below:

Common / Frequently Asked Questions

Harder Questions

Additional Information


Table of Contents
COPYRIGHT NOTICE 2
TRADEMARKS 2
WARNING AND DISCLAIMER 2
HOW THIS BOOK SHOULD BE USED 3

1. AGGREGATOR TRANSFORMATION 13
1. WHAT IS AN AGGREGATOR TRANSFORMATION? 13
2. HOW AN EXPRESSION TRANSFORMATION DIFFERS FROM AGGREGATOR TRANSFORMATION? 13
3. DOES AN AGGREGATOR TRANSFORMATION SUPPORT ONLY AGGREGATE EXPRESSIONS? 13
4. GIVE ONE EXAMPLE FOR EACH OF CONDITIONAL AGGREGATION, NON-AGGREGATE EXPRESSION AND NESTED AGGREGATION. 13
5. HOW DOES AGGREGATOR TRANSFORMATION HANDLE NULL VALUES? 13
6. WHAT ARE THE PERFORMANCE CONSIDERATIONS WHEN WORKING WITH AGGREGATOR TRANSFORMATION? 14
7. WHAT ARE THE USES OF INDEX AND DATA CACHE? 14

8. WHAT DIFFERS WHEN WE CHOOSE SORTED INPUT FOR AGGREGATOR TRANSFORMATION? 14
9. UNDER WHAT CONDITIONS SELECTING SORTED INPUT IN AGGREGATOR WILL STILL NOT BOOST SESSION PERFORMANCE? 15
10. UNDER WHAT CONDITION SELECTING SORTED INPUT IN AGGREGATOR MAY FAIL THE SESSION? 15
11. SUPPOSE WE DO NOT GROUP BY ON ANY PORTS OF THE AGGREGATOR WHAT WILL BE THE OUTPUT. 15
12. WHAT IS THE EXPECTED VALUE IF THE COLUMN IN AN AGGREGATOR TRANSFORMATION IS NEITHER A GROUP BY NOR AN
AGGREGATE EXPRESSION? 15
13. WHAT IS INCREMENTAL AGGREGATION? 15
14. SORTED INPUT FOR AGGREGATOR TRANSFORMATION WILL IMPROVE PERFORMANCE OF MAPPING. HOWEVER, IF SORTED INPUT IS
USED FOR NESTED AGGREGATE EXPRESSION OR INCREMENTAL AGGREGATION, THEN THE MAPPING MAY RESULT IN SESSION FAILURE.

EXPLAIN WHY? 16
15. HOW CAN WE DELETE DUPLICATE RECORD USING INFORMATICA AGGREGATOR? 16
16. SCENARIO IMPLEMENTATION 1 16
17. SCENARIO IMPLEMENTATION 2 18

2. EXPRESSION TRANSFORMATION 19
1. WHAT IS AN EXPRESSION TRANSFORM? 19
2. HOW MANY TYPES OF PORTS ARE THERE IN EXPRESSION TRANSFORM? 19
3. WHAT IS THE EXECUTION ORDER OF THE PORTS IN AN EXPRESSION? 19
4. DESCRIBE THE APPROACH FOR THE REQUIREMENT. SUPPOSE THE INPUT IS: 19
5. HOW CAN WE IMPLEMENT AGGREGATION OPERATION WITHOUT USING AN AGGREGATOR TRANSFORMATION IN INFORMATICA? 20
6. SCENARIO IMPLEMENTATION 1 20
7. SCENARIO IMPLEMENTATION 2 21
8. SCENARIO IMPLEMENTATION 3 22
9. SCENARIO IMPLEMENTATION 4 22
10. SCENARIO IMPLEMENTATION 5 22

3. FILTER TRANSFORMATION 24
1. WHAT IS A FILTER TRANSFORMATION AND WHY IT IS AN ACTIVE ONE? 24
2. WHAT IS THE DIFFERENCE BETWEEN SOURCE QUALIFIER TRANSFORMATIONS SOURCE FILTER OPTION AND FILTER
TRANSFORMATION? 24


4. JOINER TRANSFORMATION 25
1. WHAT IS A JOINER TRANSFORMATION AND WHY IT IS AN ACTIVE ONE? 25
2. STATE THE LIMITATIONS WHERE WE CANNOT USE JOINER IN THE MAPPING PIPELINE. 25
3. OUT OF THE TWO INPUT PIPELINES OF A JOINER, WHICH ONE WILL WE SET AS THE MASTER PIPELINE? 25
4. WHAT ARE THE DIFFERENT TYPES OF JOINS AVAILABLE IN JOINER TRANSFORMATION? 26
5. DEFINE THE VARIOUS JOIN TYPES OF JOINER TRANSFORMATION. 27
6. DESCRIBE THE IMPACT OF NUMBER OF JOIN CONDITIONS AND JOIN ORDER IN A JOINER. 27

7. HOW DOES JOINER TRANSFORMATION TREAT NULL VALUE MATCHING? 27
8. WHEN WE CONFIGURE THE JOIN CONDITION, WHAT ARE THE GUIDELINES WE NEED TO FOLLOW TO MAINTAIN THE SORT ORDER? 28
9. WHAT ARE THE TRANSFORMATIONS THAT CANNOT BE PLACED BETWEEN THE SORT ORIGIN AND THE JOINER TRANSFORMATION SO
THAT WE DO NOT LOSE THE INPUT SORT ORDER? 28
10. WHAT IS THE USE OF SORTED INPUT IN JOINER TRANSFORMATION? 28
11. CAN WE JOIN TWO TABLES BASED ON A JOIN COLUMN HAVING DIFFERENT DATA TYPE? 29
12. IMPLEMENTATION SCENARIO1 - JOINER TRANSFORMATION IS JOINING TWO TABLES S1 AND S2. S1 HAS 10,000 ROWS AND S2
HAS 1000 ROWS . WHICH TABLE YOU WILL SET MASTER FOR BETTER PERFORMANCE OF JOINER TRANSFORMATION? WHY? 29

5. LOOKUP TRANSFORMATION 30
1. WHAT IS A LOOKUP TRANSFORM? 30
2. WHAT ARE THE DIFFERENCES BETWEEN CONNECTED AND UNCONNECTED LOOKUP? 30
3. WHAT ARE THE DIFFERENT LOOKUP CACHE(S)? 30
4. IS LOOKUP AN ACTIVE OR PASSIVE TRANSFORMATION ? 31
5. WHAT IS THE DIFFERENCE BETWEEN STATIC AND DYNAMIC LOOKUP CACHE? 31
6. WHAT ARE THE USES OF INDEX AND DATA CACHES? 31
7. WHAT IS PERSISTENT LOOKUP CACHE? 31
8. WHAT TYPE OF JOIN DOES LOOKUP SUPPORT? 32

9. EXPLAIN HOW LOOKUP TRANSFORMATION WORKS LIKE SQL LEFT OUTER JOIN. 32
10. WHERE AND WHY DO WE USE UNCONNECTED LOOKUP INSTEAD OF CONNECTED LOOKUP? 32
11. HOW CAN WE IDENTIFY PERSISTENT CACHE FILES IN INFORMATICA SERVER? 33
12. HOW TO CONFIGURE A LOOKUP ON A FLAT FILE WITH HEADER? 33
13. WHAT IS THE DIFFERENCE BETWEEN PERSISTENT CACHE AND SHARED CACHE? 33
14. DESCRIBE HOW TO RETURN MULTIPLE PORT VALUES FROM UNCONNECTED LOOKUP IN INFORMATICA. 34
15. HOW TO MAKE THE PERSISTENT LOOKUP CACHE IN SYNC WITH LOOKUP TABLE? 34
16. IF WE USE PERSISTENT CACHE FOR A DYNAMIC LOOKUP, WILL THE CACHE FILE BE UPDATED OR INSERTED AS REQUIRED? 34
17. IS THERE ANYTHING WRONG IN SHARING A PERSISTENT CACHE BETWEEN STATIC AND DYNAMIC LOOKUP? 34
18. WHAT IS THE DIFFERENCE BETWEEN THE TWO UPDATE PROPERTIES - UPDATE ELSE INSERT, INSERT ELSE UPDATE IN DYNAMIC
LOOKUP CACHE? 35
19. IF THE DEFAULT VALUE FOR THE LOOKUP RETURN PORT IS NOT SET, WHAT WILL BE THE OUTPUT WHEN THE LOOKUP CONDITION
FAILS? 35
20. HOW CAN WE ENSURE DATA IS NOT DUPLICATED IN THE TARGET WHEN THE SOURCE HAS DUPLICATE RECORDS, USING LOOKUP
TRANSFORMATION? 35

6. NORMALIZER TRANSFORMATION 36
1. WHAT IS A NORMALIZER TRANSFORMATION? 36
2. SCENARIO IMPLEMENTATION 1 36
3. WHAT ARE LEVELS IN NORMALIZER TRANSFORMATION? 36
4. WHAT IS THE PURPOSE OF GCID AND GK IN A NORMALIZER TRANSFORMATION? 37


7. RANK TRANSFORMATION 38
1. WHAT IS A RANK TRANSFORM? 38
2. HOW DOES A RANK TRANSFORM DIFFER FROM AGGREGATOR TRANSFORM FUNCTIONS MAX AND MIN? 38
3. HOW DOES A RANK CACHE WORKS? 38
4. WHAT IS A RANK PORT AND RANKINDEX? 38
5. HOW CAN YOU GET RANKS BASED ON DIFFERENT GROUPS? 38
6. WHAT HAPPENS IF TWO RANK VALUES MATCH? 39

7. WHAT ARE THE RESTRICTIONS OF RANK TRANSFORMATION? 39
8. HOW DOES RANK TRANSFORMATION HANDLE STRING VALUES? 39
9. WHAT IS DENSE RANK AND DOES INFORMATICA SUPPORTS DENSE RANK? 39
10. HOW DO WE ACHIEVE DENSE_RANK IN INFORMATICA? 40
11. SOURCE TABLE HAS 5 ROWS. RANK IN RANK TRANSFORMATION IS SET TO 10. HOW MANY ROWS THE RANK TRANSFORMATION
WILL OUTPUT? 40
12. HOW YOU WILL LOAD UNIQUE RECORD INTO TARGET FLAT FILE FROM SOURCE FLAT FILES HAS DUPLICATE DATA? 40

8. ROUTER TRANSFORMATION 42

1. WHAT IS THE DIFFERENCE BETWEEN ROUTER AND FILTER? 42
2. WHAT IS THE MINIMUM NUMBER OF GROUPS WE CAN DECLARE IN A ROUTER TRANSFORMATION? 42
3. SCENARIO IMPLEMENTATION 1 42
4. SCENARIO IMPLEMENTATION 2 43
5. SCENARIO IMPLEMENTATION 3 44

9. SEQUENCE GENERATOR TRANSFORMATION 45


1. WHAT IS A SEQUENCE GENERATOR TRANSFORMATION? 45
2. DEFINE THE PROPERTIES AVAILABLE IN SEQUENCE GENERATOR TRANSFORMATION IN BRIEF. 45

3. SCENARIO IMPLEMENTATION 1 46
4. SCENARIO IMPLEMENTATION 2 46
5. WHAT ARE THE CHANGES WE OBSERVE WHEN WE PROMOTE A NON-REUSABLE SEQUENCE GENERATOR TO A REUSABLE ONE? AND
WHAT HAPPENS IF WE SET THE NUMBER OF CACHED VALUES TO 0 FOR A REUSABLE TRANSFORMATION? 47
6. HOW SEQUENCE GENERATOR IN THE MAPPING IS HANDLED WHEN WE MIGRATE THE MAPPING FROM ONE ENVIRONMENT TO
ANOTHER? 47
7. SCENARIO IMPLEMENTATION 3 48
8. HOW DO I GET A SEQUENCE GENERATOR TO "PICK UP" WHERE ANOTHER "LEFT OFF"? 48
10. STORED PROCEDURE TRANSFORMATION 49
1. WHAT IS A STORED PROCEDURE TRANSFORMATION? 49
2. HOW MANY TYPES OF STORED PROCEDURE TRANSFORMATION ARE THERE? 49
3. HOW DO WE CALL AN UNCONNECTED STORED PROCEDURE TRANSFORMATION? 49
4. HOW DO WE SET THE EXECUTION ORDER OF PRE-POST LOAD STORED PROCEDURE? 49
5. HOW DO WE SET THE CALL TEXT FOR STORED PROCEDURE TRANSFORMATION? 49
6. HOW DO WE RECEIVE OUTPUT/RETURN PARAMETERS FROM UNCONNECTED STORED PROCEDURE? 50

11. SORTER TRANSFORMATION 51


1. WHAT IS A SORTER TRANSFORMATION? 51
2. WHY IS SORTER AN ACTIVE TRANSFORMATION? 51
3. HOW DOES SORTER HANDLE CASE SENSITIVE SORTING? 51

4. HOW DOES SORTER HANDLE NULL VALUES? 51


5. HOW DOES A SORTER CACHE WORKS? 51
6. HOW TO DELETE DUPLICATE RECORDS OR RATHER TO SELECT DISTINCT ROWS FOR FLAT FILE SOURCES? 52

12. UNION TRANSFORMATION 53


1. WHAT IS A UNION TRANSFORMATION? 53
2. WHAT ARE THE RESTRICTIONS OF UNION TRANSFORMATION? 53
3. HOW COME UNION TRANSFORMATION IS ACTIVE? 53

13. UPDATE STRATEGY TRANSFORMATION 54
1. WHAT IS UPDATE STRATEGY TRANSFORM? 54
2. WHAT ARE UPDATE STRATEGY CONSTANTS? 54
3. HOW CAN WE UPDATE A RECORD IN TARGET TABLE WITHOUT USING UPDATE STRATEGY? 54
4. WHAT IS DATA DRIVEN? 54
5. WHAT HAPPENS WHEN DD_UPDATE IS DEFINED IN UPDATE STRATEGY AND TREAT SOURCE ROWS AS INSERT IS SELECTED IN
SESSION? 55

6. WHAT ARE THE THREE AREAS WHERE THE ROWS CAN BE FLAGGED FOR PARTICULAR TREATMENT? 55
7. BY DEFAULT OPERATION CODE FOR ANY ROW IN INFORMATICA WITHOUT BEING ALTERED IS INSERT. THEN STATE WHEN DO WE
NEED DD_INSERT? 55
8. WHAT IS THE DIFFERENCE BETWEEN UPDATE STRATEGY AND FOLLOWING UPDATE OPTIONS IN TARGET? 55
9. WHAT IS THE USE OF FORWARD REJECT ROWS IN MAPPING? 56
10. SCENARIO IMPLEMENTATION 1 56

14. JAVA TRANSFORMATION 57


1. SCENARIO IMPLEMENTATION 1 57

2. SCENARIO IMPLEMENTATION 2 57

15. SOURCE QUALIFIER TRANSFORMATION 59


1. WHAT IS A SOURCE QUALIFIER? WHAT ARE THE TASKS WE CAN PERFORM USING A SOURCE QUALIFIER AND WHY IT IS AN ACTIVE
TRANSFORMATION? 59
2. WHAT HAPPENS TO A MAPPING IF WE ALTER THE DATA TYPES BETWEEN SOURCE AND ITS CORRESPONDING SOURCE QUALIFIER? 59
3. SUPPOSE WE HAVE USED THE SELECT DISTINCT AND THE NUMBER OF SORTED PORTS PROPERTY IN THE SOURCE QUALIFIER AND
THEN WE ADD CUSTOM SQL QUERY. EXPLAIN WHAT WILL HAPPEN. 59
4. DESCRIBE THE SITUATIONS WHERE WE WILL USE THE SOURCE FILTER, SELECT DISTINCT AND NUMBER OF SORTED PORTS
PROPERTIES OF SOURCE QUALIFIER TRANSFORMATION. 60
5. WHAT WILL HAPPEN IF THE SELECT LIST COLUMNS IN THE CUSTOM OVERRIDE SQL QUERY AND THE OUTPUT PORTS ORDER
IN SOURCE QUALIFIER TRANSFORMATION DO NOT MATCH? 60
6. WHAT HAPPENS IF IN THE SOURCE FILTER PROPERTY OF SQ TRANSFORMATION WE INCLUDE KEYWORD WHERE SAY, WHERE
CUSTOMERS.CUSTOMER_ID > 1000. 60
7. DESCRIBE THE SCENARIOS WHERE WE GO FOR JOINER TRANSFORMATION INSTEAD OF SOURCE QUALIFIER TRANSFORMATION. 60
8. WHAT IS THE MAXIMUM NUMBER WE CAN USE IN NUMBER OF SORTED PORTS FOR SYBASE SOURCE SYSTEM? 61
9. WHAT IS USE OF SOURCE QUALIFIER IN INFORMATICA? CAN WE CREATE A MAPPING WITHOUT A SOURCE QUALIFIER? 61
10. SUPPOSE WE HAVE TWO TABLES OF SAME DATABASE TYPE, RESIDING IN DIFFERENT DATABASE INSTANCE. IF A DATABASE LINK IS
AVAILABLE, HOW CAN WE JOIN THE TWO TABLES USING A SOURCE QUALIFIER IN INFORMATICA PROVIDED THERE ARE VALID JOIN
COLUMNS. 61
11. WHAT IS THE MEANING OF “OUTPUT IS DETERMINISTIC” PROPERTY IN SOURCE QUALIFIER TRANSFORMATION? 61


12. SCENARIO IMPLEMENTATION 1 62

16. MISCELLANEOUS 63
1. WHAT ARE THE NEW FEATURES OF INFORMATICA 9.X IN DEVELOPER LEVEL? 63
2. NAME THE TRANSFORMATIONS WHICH CONVERTS ONE TO MANY ROWS I.E. INCREASES THE I/P: O/P ROW COUNT. ALSO WHAT IS
THE NAME OF ITS REVERSE TRANSFORMATION? 63
3. HOW MANY WAYS WE CAN FILTER RECORDS? 63
4. WHAT ARE THE TRANSFORMATIONS THAT USE CACHE FOR PERFORMANCE? 63

5. WHAT IS THE FORMULA FOR CALCULATION OF LOOKUP/RANK/AGGREGATOR INDEX & DATA CACHES? 64
6. WHAT IS THE DIFFERENCE BETWEEN INFORMATICA POWERCENTER AND EXCHANGE AND MART? 64
7. HOW DO WE HANDLE DELIMITER CHARACTER AS A PART OF THE DATA IN A DELIMITED SOURCE FILE? 65
8. WE HAVE JUST RECEIVED SOURCE FILES FROM UNIX. WE WANT TO STAGE THAT DATA TO ETL PROCESS. WHAT ARE THE POINTS
WE NEED TO LOOK FOR? 65
9. WHAT IS THE DIFFERENCE BETWEEN JOINER AND LOOKUP. PERFORMANCE WISE WHICH ONE IS BETTER TO USE. 65
10. WHAT IS THE B2B IN INFORMATICA? HOW CAN WE USE IT IN INFORMATICA? 66
11. WHAT IS CDC, SCD AND MD5 IN INFORMATICA? 66

12. HOW CAN WE IMPLEMENT AN SCD TYPE2 MAPPING WITHOUT USING A LOOKUP TRANSFORMATION? 67
13. HOW DOES JOINER AND LOOKUP TRANSFORMATION TREAT NULL VALUE MATCHING? 67
14. DOES MICROSOFT SQL SERVER SUPPORTS BULK LOADING? IF YES, WHAT HAPPENS WHEN YOU SPECIFY BULK MODE AND DATA
DRIVEN FOR SQL SERVER TARGET 67
15. HOW CAN YOU UTILIZE COM COMPONENTS IN INFORMATICA? 67
16. WHAT IS SQL TRANSFORMATION IN INFORMATICA? 67
17. WHAT IS A XML SOURCE QUALIFIER? 68
18. WHAT IS THE “METADATA EXTENSIONS” TAB IN INFORMATICA? 68
19. DESCRIBE SOME OF THE ETL BEST PRACTICES 69

20. IS THERE A SCOPE OF CLOUD COMPUTING IN DATA WAREHOUSING TECHNOLOGY? 69

17. MAPPING 71
1. SCENARIO IMPLEMENTATION 1 71
2. WHAT ARE MAPPING PARAMETERS AND VARIABLES? 71
4. WHAT ARE THE DEFAULT VALUES FOR VARIABLES? 72
5. WHAT DOES FIRST COLUMN OF BAD FILE (REJECTED ROWS) INDICATES? 72
6. OUT OF 100000 SOURCE ROWS SOME ROWS GET DISCARD AT TARGET, HOW WILL YOU TRACE THEM AND WHERE IT GETS LOADED?
72
7. WHAT IS REJECT LOADING? 72
8. WHY INFORMATICA WRITER THREAD MAY REJECT A RECORD? 74
9. WHY TARGET DATABASE CAN REJECT A RECORD? 74
10. DESCRIBE VARIOUS STEPS FOR LOADING REJECT FILE? 74
11. VARIABLE V1 HAS VALUES SET AS 5 IN DESIGNER (DEFAULT), 10 IN PARAMETER FILE, AND 15 IN REPOSITORY. WHILE RUNNING
SESSION WHICH VALUE INFORMATICA WILL READ? 74
12. WHAT ARE SHORTCUTS? WHERE IT CAN BE USED? WHAT ARE THE ADVANTAGES? 74
13. CAN WE HAVE AN INFORMATICA MAPPING WITH TWO PIPELINES, WHERE ONE FLOW IS HAVING A TRANSACTION CONTROL
TRANSFORMATION AND ANOTHER NOT. EXPLAIN WHY? 75
14. HOW CAN WE IMPLEMENT REVERSE PIVOTING USING INFORMATICA TRANSFORMATIONS? 75
15. IS IT POSSIBLE TO UPDATE A TARGET TABLE WITHOUT ANY KEY COLUMN IN TARGET? 75

18. MAPPLET 77

1. WHAT IS A MAPPLET? 77
2. WHAT IS THE DIFFERENCE BETWEEN REUSABLE TRANSFORMATION AND MAPPLET? 77
3. WHAT ARE THE TRANSFORMATIONS THAT ARE NOT SUPPORTED IN MAPPLET? 77
4. IS IT POSSIBLE TO CONVERT REUSABLE TRANSFORMATION TO A NON-REUSABLE ONE? 77
5. WHAT IS THE USE OF MAPPLET & WORKLET IN PROJECT? 78
6. IS IT POSSIBLE TO HAVE A MAPPLET WITHIN A MAPPLET AND WORKLET WITHIN A WORKLET? 78

19. SESSION 79

1. WHAT IS SESSION AND BATCHES? 79
2. WHAT ARE VARIOUS SESSION TRACING LEVELS? 79
3. CAN WE COPY A SESSION TO NEW FOLDER OR NEW REPOSITORY? 79
4. IS IT POSSIBLE TO STORE ALL THE INFORMATICA SESSION LOG INFORMATION IN A DATABASE TABLE? NORMALLY THE SESSION LOG IS
STORED AS A BINARY COMPRESSION .BIN FILE IN SESSLOGS DIRECTORY. CAN WE STORE THE SAME INFORMATION IN DATABASE TABLES
FOR FUTURE ANALYSIS? 79
5. CAN WE CALL A SHELL SCRIPT FROM SESSION PROPERTIES? 80
6. CAN WE CHANGE THE SOURCE AND TARGET TABLE NAMES IN SESSION LEVEL? 81

7. HOW TO WRITE FLAT FILE COLUMN NAMES IN TARGET? 81
8. WHAT ARE THE ERROR TABLES PRESENT IN INFORMATICA? 81
9. WHAT ARE THE ALTERNATE WAYS TO STOP A SESSION WITHOUT USING “STOP ON ERRORS” OPTION SET TO 1 IN SESSION
PROPERTIES? 81
10. SUPPOSE A SESSION FAILS AFTER LOADING OF 10,000 RECORDS IN THE TARGET. HOW CAN WE LOAD THE RECORDS FROM 10,001
WHEN WE RUN THE SESSION NEXT TIME? 82
11. DEFINE THE TYPES OF COMMIT INTERVALS APART FROM USER DEFINED? 82
12. SUPPOSE SESSION IS CONFIGURED WITH COMMIT INTERVAL OF 10,000 ROWS AND SOURCE HAS 50,000 ROWS EXPLAIN THE
COMMIT POINTS FOR SOURCE BASED COMMIT & TARGET BASED COMMIT. ASSUME APPROPRIATE VALUE WHEREVER REQUIRED? 82

13. HOW TO CAPTURE PERFORMANCE STATISTICS OF INDIVIDUAL TRANSFORMATION IN THE MAPPING AND EXPLAIN SOME
IMPORTANT STATISTICS THAT CAN BE CAPTURED? 83
14. HOW CAN WE PARAMETERIZE SUCCESS OR FAILURE EMAIL LIST? 83
15. IS IT POSSIBLE THAT A SESSION FAILED BUT STILL THE WORKFLOW STATUS IS SHOWING SUCCESS? 83
16. WHAT IS BUSY PERCENTAGE? 83
17. CAN WE WRITE A PL/SQL BLOCK IN PRE AND POST SESSION OR IN TARGET QUERY OVERRIDE? 84
18. WHENEVER A SESSION RUNS DOES THE DATA GETS OVERWRITTEN IN A FLAT FILE TARGET? IS IT POSSIBLE TO KEEP THE EXISTING
DATA AND ADD THE NEW DATA TO THE TARGET FILE? 84
19. CAN WE USE THE SAME SESSION TO LOAD A TARGET TABLE IN DIFFERENT DATABASES HAVING SAME TARGET DEFINITION? 84
20. HOW DO YOU REMOVE THE CACHE FILES AFTER THE TRANSFORMATION? 84
21. WHY DOESN'T A RUNNING SESSION QUIT WHEN ORACLE OR SYBASE RETURN FATAL ERRORS? 84

20. WORKFLOW 86
1. WHAT IS THE DIFFERENCE BETWEEN STOP AND ABORT OPTIONS IN WORKFLOW? 86
2. RUNNING INFORMATICA WORKFLOW CONTINUOUSLY – HOW TO RUN A WORKFLOW CONTINUOUSLY UNTIL A CERTAIN CONDITION
IS MET? 86
3. HOW DO WE SEND EMAILS FROM INFORMATICA AFTER THE SUCCESSFUL COMPLETION OF ONE SESSION? THE EMAIL WILL CONTAIN
THE JOB NAME/ SESSION START TIME AND SESSION END TIME IN THE MESSAGE BODY. 87
4. SCENARIO IMPLEMENTATION 1 87
5. HOW CAN WE SEND TWO SEPARATE EMAILS AFTER A SUCCESSFUL SESSION RUN? 87
6. WHAT IS COLD START IN INFORMATICA? 88
7. SCENARIO IMPLEMENTATION 2 88

8. WE KNOW THERE ARE 3 OPTIONS FOR SESSION RECOVERY STRATEGY - RESTART TASK, FAIL TASK AND CONTINUE RUNNING THE
WORKFLOW, RESUME FROM LAST CHECKPOINT WHENEVER A SESSION FAILS. HOW DO WE RESTART A WORKFLOW AUTOMATICALLY
WITHOUT ANY MANUAL INTERVENTION IN THE EVENT OF SESSION FAILURE? 89
9. WHAT IS THE DIFFERENCE REAL-TIME AND CONTINUOUS WORKFLOWS? 89
11. SCENARIO IMPLEMENTATION 3 89
12. HOW DO WE SEND A SESSION FAILURE MAIL WITH THE WORKFLOW OR SESSION LOG AS ATTACHMENT? 90
13. EXPLAIN DEADLOCK IN INFORMATICA AND HOW DO WE RESOLVE IT? 90
14. SCENARIO IMPLEMENTATION 4 90

15. HOW CAN WE PASS A VALUE FROM ONE WORKFLOW TO ANOTHER? 91

21. ADMINISTRATION 92
1. WHAT IS LOAD MANAGER? 92
2. WHAT IS DTM PROCESS? HOW MANY THREADS IT CREATES TO PROCESS DATA, EXPLAIN EACH THREAD IN BRIEF? 92
3. CAN YOU CREATE A FOLDER WITHIN DESIGNER? 92
4. HOW DO YOU TAKE CARE OF SECURITY USING A REPOSITORY MANAGER? 93
5. WHAT ARE THE DIFFERENT USES OF A REPOSITORY MANAGER? 93

6. WHAT ARE 2 MODES OF DATA MOVEMENT IN INFORMATICA SERVER? 93
7. WHAT IS CODE PAGE USED FOR? 93
8. WHAT IS CODE PAGE COMPATIBILITY? 94
9. WHAT IS DEFAULT BLOCK BUFFER SIZE? 94
10. WHAT IS DEFAULT LM SHARED MEMORY SIZE? 94
11. DEFINE SERVER CONCEPTS WITH RESPECT TO MEMORY BUFFERS 94
12. WHAT ARE THE TWO PROGRAMS THAT COMMUNICATE WITH THE INFORMATICA SERVER? 95

22. COMMAND LINE ARGUMENTS 96

1. WHAT IS PMCMD COMMANDS? 96
2. WHAT IS PMREP COMMANDS? 96
3. HOW DO WE START & STOP SESSION FROM PMCMD COMMAND LINE? 96

23. METADATA REPOSITORY 97


1. IS THERE ANY METADATA QUERY TO FIND THE LIST OF INFORMATICA FOLDER NAME, WORKFLOW NAMES WHICH ARE MIGRATED IN
A PARTICULAR QUARTER? 97
3. WRITE A METADATA QUERY TO IDENTIFY THE SESSIONS HAVING TRUNCATE OPTION ENABLED 97
4. WHERE CAN I FIND A HISTORY / METRICS OF THE LOAD SESSIONS THAT HAVE OCCURRED IN INFORMATICA? 97
5. HOW TO EXTRACT THE WORKFLOW MONITOR RECORD INFORMATION FROM INFORMATICA METADATA REPOSITORY? 98

24. REPOSITORY MANAGER 100


1. DESCRIBE THE STEPS FOR EXPORT AND IMPORT? 100
2. WHAT ARE THE VARIOUS METHODS OF CODE MIGRATION OR WHICH IS THE BEST WAY OF DEPLOYMENT? 100
3. WHAT ARE THE VARIOUS OPTIONS FOR ETL CODE MIGRATION 101
4. WHAT IS LABELING IN INFORMATICA? 101
5. SUPPOSE HAVING INFORMATICA VERSION CONTROL IN PLACE, CAN WE REVERT BACK AN OBJECT TO A STATE OF TWO PREVIOUS
VERSION. 102
6. WHAT DO WE MEAN BY TEAM BASED DEVELOPMENT IN INFORMATICA? 102

25. SCENARIO QUESTIONS 104


1. SUPPOSE WE HAVE TEN SOURCE FLAT FILES OF SAME STRUCTURE. HOW CAN WE LOAD ALL THE FILES IN TARGET DATABASE IN A
SINGLE BATCH RUN USING A SINGLE MAPPING? 104
2. SUPPOSE WE HAVE TWO SOURCE QUALIFIER TRANSFORMATIONS SQ1 AND SQ2 CONNECTED TO TARGET TABLES TGT1 AND TGT2
RESPECTIVELY. HOW DO YOU ENSURE TGT2 IS LOADED AFTER TGT1? 104
3. SUPPOSE WE HAVE A SOURCE QUALIFIER TRANSFORMATION THAT POPULATES TWO TARGET TABLES. HOW DO YOU ENSURE TGT2
IS LOADED AFTER TGT1? 106
4. SUPPOSE WE HAVE THE EMP TABLE AS OUR SOURCE. IN THE TARGET WE WANT TO VIEW THOSE EMPLOYEES WHOSE SALARY ARE
GREATER THAN OR EQUAL TO THE AVERAGE SALARY FOR THEIR DEPARTMENTS. DESCRIBE YOUR MAPPING APPROACH. 106

5. HOW CAN WE PERFORM CHANGED DATA CAPTURE BASED ON LOAD SEQUENCE NUMBER (INTEGER) COLUMN PRESENT IN THE
SOURCE TABLE? 110
6. SCENARIO IMPLEMENTATION 1 111
7. HOW CAN WE LOAD ‘X’ RECORDS (USER DEFINED RECORD NUMBERS) OUT OF ‘N’ RECORDS FROM SOURCE DYNAMICALLY,
WITHOUT USING FILTER AND SEQUENCE GENERATOR TRANSFORMATION? 112
8. SUPPOSE WE HAVE ‘N’ NUMBER OF ROWS IN THE SOURCE AND WE HAVE TWO TARGET TABLES. HOW CAN WE LOAD ‘N/2’ I.E. FIRST
HALF THE SOURCE DATA INTO ONE TARGET AND THE REMAINING HALF INTO THE NEXT TARGET? 112
9. SUPPOSE WE HAVE A FLAT FILE WHICH HAS A HEADER RECORD WITH ‘FILE CREATION DATE’, AND DETAILED DATA RECORDS.
DESCRIBE THE APPROACH TO LOAD THE 'FILE CREATION DATE' COLUMN ALONG WITH EACH AND EVERY DETAILED RECORD. 113

10. SCENARIO IMPLEMENTATION 2 113
11. SUPPOSE WE HAVE A FLAT FILE WHICH CONTAINS JUST A NUMERIC VALUE. WE NEED TO POPULATE THIS VALUE IN ONE COLUMN
OF THE TARGET TABLE FOR EVERY SOURCE RECORD. HOW CAN WE ACHIEVE THIS? 113
12. HOW WILL YOU LOAD A SOURCE FLAT FILE INTO A STAGING TABLE WHEN THE FILE NAME IS NOT FIXED? THE FILE NAME IS LIKE
SALES_2013_02_22.TXT, I.E. DATE IS APPENDED AT THE END OF THE FILE AS A PART OF FILE NAME. 114
13. SOLVE THE BELOW SCENARIO USING INFORMATICA AND DATABASE SQL. 114
14. SUPPOSE WE HAVE A COLUMN IN SOURCE WITH VALUES AS BELOW: 115
15. CAN WE PASS THE VALUE OF A MAPPING VARIABLE BETWEEN 2 PIPELINES UNDER THE SAME MAPPING? IF NOT HOW CAN WE

ACHIEVE THIS? 116
16. SCENARIO IMPLEMENTATION 3 116
17. SCENARIO IMPLEMENTATION 4 117
18. IMPLEMENT SLOWLY CHANGING DIMENSION OF TYPE 2 WHICH WILL LOAD CURRENT RECORD IN CURRENT TABLE AND OLD DATA
IN LOG TABLE. 118

26. PERFORMANCE TUNING 119


1. WHICH ONE IS FASTER CONNECTED OR UNCONNECTED LOOKUP? 119
2. HOW WE CAN IMPROVE PERFORMANCE OF INFORMATICA NORMALIZATION TRANSFORMATION. 119
3. HOW TO IMPROVE THE SESSION PERFORMANCE? 119
4. HOW DO YOU IDENTIFY THE BOTTLENECKS IN MAPPINGS? 120
5. HOW DO YOU HANDLE PERFORMANCE ISSUES IN INFORMATICA? WHERE CAN YOU MONITOR THE PERFORMANCE? 121
6. WHAT ARE PERFORMANCE COUNTERS? 122
7. HOW CAN WE INCREASE SESSION PERFORMANCE? 122
8. SCENARIO IMPLEMENTATION 1 124


Topic Matrix:

Serial Number   Topics                    Questions
1               Aggregator                17
2               Expression                10
3               Filter                    2
4               Joiner                    12
5               Lookup                    20
6               Normalizer                4
7               Rank                      12
8               Router                    5
9               Sequence Generator        8
10              Stored Procedure          6
11              Sorter                    6
12              Union                     3
13              Update Strategy           10
14              Java                      2
15              Source Qualifier          12
16              Miscellaneous             20
17              Mapping                   12
18              Mapplet                   6
19              Session                   22
20              Workflow                  15
21              Administration            12
22              Command Line Arguments    3
23              Metadata Repository       5
24              Repository Manager        6
25              Scenario Questions        18
26              Performance Tuning        8


1. Aggregator Transformation

1. What is an Aggregator Transformation?

Answer:

An Aggregator is an active, connected transformation that performs aggregate calculations such as AVG,
COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM and VARIANCE.

2. How does an Expression Transformation differ from an Aggregator Transformation?

Answer:

An Expression Transformation performs calculations on a row-by-row basis, whereas an Aggregator
Transformation performs calculations on groups of rows.

3. Does an Aggregator Transformation support only aggregate expressions?

Answer:

No. Apart from aggregate expressions, the Aggregator Transformation also supports non-aggregate
expressions and conditional clauses.

4. Give one example for each of Conditional Aggregation, Non-Aggregate expression and
Nested Aggregation.

Answer:

• Conditional aggregation: use a conditional clause in the aggregate expression to restrict the rows
  that take part in the aggregation. The clause can be any expression that evaluates to TRUE or FALSE.
      SUM(SALARY, JOB = 'CLERK')
• Non-aggregate expression: use a non-aggregate expression in a group by port to modify or replace groups.
      IIF(PRODUCT = 'Brown Bread', 'Bread', PRODUCT)
• Nested aggregation: an aggregate expression can include one aggregate function nested within another.
      MAX(COUNT(PRODUCT))
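
The three kinds of expression above can be mimicked outside Informatica to check one's understanding. The following is a minimal Python sketch, not Informatica code; the sample rows and variable names are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample rows; the column names mirror the expressions above.
employees = [
    {"JOB": "CLERK", "SALARY": 1000},
    {"JOB": "MANAGER", "SALARY": 3000},
    {"JOB": "CLERK", "SALARY": 1200},
]
products = ["Brown Bread", "Bread", "Milk", "Brown Bread"]

# Conditional aggregation, like SUM(SALARY, JOB = 'CLERK'):
# only rows that satisfy the condition contribute to the sum.
clerk_salary = sum(e["SALARY"] for e in employees if e["JOB"] == "CLERK")

# Non-aggregate expression on a group by port, like
# IIF(PRODUCT = 'Brown Bread', 'Bread', PRODUCT): rows are regrouped.
group_keys = ["Bread" if p == "Brown Bread" else p for p in products]

# Nested aggregation, like MAX(COUNT(PRODUCT)): count per group,
# then take the maximum of those counts.
counts = Counter(group_keys)
max_count = max(counts.values())
```

Here the regrouped keys are reused for the nested aggregate only to keep the sketch short; in a mapping the two would normally be separate ports.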

5. How does Aggregator Transformation handle NULL values?

Answer:

By default, the Aggregator Transformation treats null values as NULL in aggregate functions. The
Integration Service can, however, be configured to treat null values in aggregate functions as
either NULL or zero.
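
The practical difference between the two settings shows up in functions such as AVG. The sketch below is plain Python rather than Informatica code, and it rests on an assumption: that NULL rows are skipped when nulls stay NULL (with a NULL result when every value is NULL), and that they count as 0 otherwise. The function name `avg` and its `nulls_as_zero` flag are hypothetical.

```python
def avg(values, nulls_as_zero=False):
    """Average a column that may contain None (standing in for NULL)."""
    if nulls_as_zero:
        # Assumed behaviour of the "treat nulls as zero" setting.
        values = [0 if v is None else v for v in values]
    present = [v for v in values if v is not None]
    if not present:
        return None  # every value was NULL, so the result is NULL
    return sum(present) / len(present)


marks = [10, None, 20]
avg_null = avg(marks)                      # NULL row is skipped
avg_zero = avg(marks, nulls_as_zero=True)  # NULL row counts as 0
```

With these inputs the two settings give different averages, which is why the choice matters for reports built on the aggregated data.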

6. What are the performance considerations when working with Aggregator Transformation?

Answer:

• Filter out unnecessary data before aggregating it. Place a Filter transformation ahead of the
  Aggregator in the mapping to reduce unnecessary aggregation.
• Connect only the necessary input/output ports to subsequent transformations; this reduces the
  size of the data cache.
• Use Sorted Input, which reduces the amount of data cached and improves session performance.

Aggregator performance improves dramatically if the records are sorted before they reach the Aggregator and
the “Sorted Input” option under the Aggregator properties is checked. The record set should be sorted on the
columns that are used in the Group By operation.

It is often a good idea to sort the record set at the database level, e.g. inside
a Source Qualifier transformation, unless there is a chance that the already sorted records from
the Source Qualifier become unsorted again before reaching the Aggregator.

7. What are the uses of index and data cache?

Answer:

The Integration Service stores the group values (the group by ports) in the index cache and the row data in the data cache.

8. What differs when we choose Sorted Input for Aggregator Transformation?


Answer:

The Integration Service creates the index and data caches in memory to process the Aggregator
transformation. If it requires more space than is allocated for the index and data cache sizes in the
transformation properties, it stores the overflow values in cache files, i.e. it pages to disk.

One way to increase session performance is to increase the index and data cache sizes in the transformation
properties.

When Sorted Input is checked, however, the Integration Service processes the Aggregator
transformation in memory and does not use cache files.
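
Why sorted input needs so little cache can be seen in a small sketch. This is plain Python, not a description of Informatica's internals, and the function names are hypothetical. With unsorted input every group must be kept until the last row has been read; with sorted input a group is complete as soon as the key changes, so only one group is held at a time.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_unsorted(rows):
    """Hash aggregation: one running total per group stays cached until the end."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


def aggregate_sorted(rows):
    """Streaming aggregation: rows arrive sorted by key, so each group
    is emitted as soon as the key changes and only one group is held."""
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


rows = [("A", 1), ("B", 5), ("A", 2), ("B", 1)]
unsorted_result = aggregate_unsorted(rows)
sorted_result = dict(aggregate_sorted(sorted(rows, key=itemgetter(0))))
```

Both routes give the same totals; the difference lies only in how much state has to be held while the rows flow through.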


9. Under what conditions will selecting Sorted Input in the Aggregator still not boost session performance?

Answer:

• The Incremental Aggregation session option is enabled.
• The aggregate expression contains nested aggregate functions.
• The session property Treat Source Rows As is set to Data Driven.

10. Under what conditions may selecting Sorted Input in the Aggregator fail the session?

Answer:

• If the input data is not sorted correctly, the session will fail.
• Even if the input data is properly sorted, the session may fail if the sort order by ports and the
  group by ports of the Aggregator are not in the same order.
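
The failure mode can be illustrated with a streaming aggregator that trusts its input to be sorted. The sketch below is plain Python with a hypothetical function name; it assumes ascending order on a single, non-null group key and does not claim to reproduce how Informatica actually detects the problem.

```python
def aggregate_sorted_strict(rows):
    """Sum values per key, assuming rows are sorted ascending by key.

    Raises ValueError as soon as a key arrives out of order.
    """
    results = []
    current_key, total = None, 0
    for key, value in rows:
        if current_key is not None and key < current_key:
            raise ValueError(f"input not sorted on group key: {key!r}")
        if key == current_key:
            total += value
        else:
            if current_key is not None:
                results.append((current_key, total))  # group is complete
            current_key, total = key, value
    if current_key is not None:
        results.append((current_key, total))
    return results
```

Feeding it `[("A", 1), ("B", 5), ("A", 2)]` raises an error, because the second "A" arrives after the "A" group has already been closed.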

11. Suppose we do not group by any port of the Aggregator. What will be the output?

Answer:

If no port is marked as group by, the Integration Service treats all input rows as a single group and
returns one row. For an input port that is used neither in group by nor in an aggregate expression, it
returns the value of that column from the last input row.

For example, if 100 rows come from the source, the Aggregator outputs only the last record (the 100th
record).

12. What is the expected value if the column in an aggregator transformation is neither a
group by nor an aggregate expression?

Answer:

The Integration Service produces one row for each group, based on the group by ports. A column that is
neither a group by port nor an aggregate expression returns the corresponding value from the last record
received for the group.

However, if we explicitly apply the FIRST function, the Integration Service returns the value from the
first row of the group. In effect, the default behaves like the LAST function.
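
The last-row behaviour, with and without group by ports, can be imitated in a few lines. This is an illustrative Python sketch only; the function `aggregate`, its `pick` parameter and the sample rows are hypothetical.

```python
def aggregate(rows, group_by, pick="last"):
    """Return one row per group, keeping the first or the last row seen."""
    out = {}
    for row in rows:
        key = tuple(row[c] for c in group_by)
        if pick == "last" or key not in out:
            out[key] = row
    return list(out.values())


rows = [
    {"DEPT": 10, "NAME": "Ann"},
    {"DEPT": 20, "NAME": "Bob"},
    {"DEPT": 10, "NAME": "Cid"},
]
last_rows = aggregate(rows, ["DEPT"])                 # default: last row per group
first_rows = aggregate(rows, ["DEPT"], pick="first")  # like applying FIRST
no_group = aggregate(rows, [])                        # no group by: one group
```

With an empty group by list every row falls into the same group, so only the last row of the whole input survives.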

13. What is Incremental Aggregation?


Answer:

We can enable the Incremental Aggregation session option for a session that includes an Aggregator
Transformation. When the Integration Service performs incremental aggregation, it passes only the changed
source data through the mapping and uses the historical cache data to perform the aggregate calculations
incrementally.
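
The idea can be sketched as a cache of running totals that is updated with each new batch instead of being rebuilt from the full history. The following Python is a simplified illustration, not Informatica's cache format; `apply_increment` and the sample data are hypothetical.

```python
def apply_increment(cache, new_rows):
    """Fold only the new rows into a cache of running (sum, count) per group."""
    for key, amount in new_rows:
        total, count = cache.get(key, (0, 0))
        cache[key] = (total + amount, count + 1)
    return cache


cache = {}  # stands in for the historical cache kept between runs
apply_increment(cache, [("East", 100), ("West", 50)])  # first run
apply_increment(cache, [("East", 25)])                 # later run: new rows only
averages = {key: total / count for key, (total, count) in cache.items()}
```

Keeping the sum and the count, rather than the finished average, is what allows a later run to update the result without re-reading earlier rows.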

14. Sorted input for the Aggregator transformation improves mapping performance. However, if sorted
input is used with a nested aggregate expression or with incremental aggregation, the session may
fail. Explain why.

Answer:

In a nested aggregation there are multiple levels of sorting, because each aggregation function requires
its own sorting pass. After the first level of aggregation, the sort order of the group-by column may get
jumbled, so Informatica must internally sort the data again before the second level of aggregation.
However, if we have already indicated that the input is sorted, Informatica will not do this sorting,
resulting in failure.

In incremental aggregation, the aggregate calculations are stored in a historical cache on the server, and
the data in this historical cache may not be in sorted order. With sorted input, the records arrive presorted
for that particular run, but they still have to be combined with historical cache data that may not be sorted.

15. How can we delete duplicate records using the Informatica Aggregator?

Answer:

One way to handle duplicate records in a source batch run is to use an Aggregator transformation and check
Group By on the ports that contain the duplicated data. This gives us the flexibility to keep either the
last or the first of the duplicate records.

16. Scenario Implementation 1


Suppose in our Source Table we have data as given below:

Student Name    Subject Name        Marks
Sam             Maths               100
Tom             Maths               80
Sam             Physical Science    80
John            Maths               75
Sam             Life Science        70
John            Life Science        100
John            Physical Science    85
Tom             Life Science        100
Tom             Physical Science    85


We want to load our Target Table as:

Student Name    Maths    Life Science    Physical Science
Sam             100      70              80
John            75       100             85
Tom             80       100             85

Describe your approach.

Answer:

Here our scenario is to convert many rows to one row, and the transformation which will help us to achieve
this is Aggregator.

Our Mapping will look like this:

We will sort the source data based on STUDENT_NAME ascending followed by SUBJECT ascending.


Now, with STUDENT_NAME as the GROUP BY port, the output subject columns are populated as:

• MATHS: MAX( MARKS, SUBJECT = 'Maths' )
• LIFE_SC: MAX( MARKS, SUBJECT = 'Life Science' )
• PHY_SC: MAX( MARKS, SUBJECT = 'Physical Science' )
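
To make the row-to-column logic concrete, here is a minimal Python sketch of the same idea: one output row per student, with a conditional maximum per subject. It is not Informatica code, and the function name pivot_marks is an assumption made for this illustration.

```python
def pivot_marks(rows):
    """rows: iterable of (student_name, subject, marks) tuples."""
    column_for_subject = {
        "Maths": "MATHS",
        "Life Science": "LIFE_SC",
        "Physical Science": "PHY_SC",
    }
    result = {}
    for student, subject, marks in rows:
        # One output row per student (the GROUP BY port)
        out = result.setdefault(student, dict.fromkeys(column_for_subject.values()))
        column = column_for_subject[subject]
        # MAX(MARKS, SUBJECT = '...'): only rows of that subject contribute
        if out[column] is None or marks > out[column]:
            out[column] = marks
    return result


source = [
    ("Sam", "Maths", 100), ("Tom", "Maths", 80), ("Sam", "Physical Science", 80),
    ("John", "Maths", 75), ("Sam", "Life Science", 70), ("John", "Life Science", 100),
    ("John", "Physical Science", 85), ("Tom", "Life Science", 100),
    ("Tom", "Physical Science", 85),
]
target = pivot_marks(source)
```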

17. Scenario Implementation 2
Source:

100 XYZ AAA
100 XYZ BBB
100 XYZ CCC

The expected output data:

100 XYZ AAA BBB CCC

Which transformations are used for this?

Answer:

Use an Aggregator transformation with a variable port.


2. Expression Transformation

1. What is an Expression Transform?

Answer:

Expression is a passive, connected transformation used to calculate values in a single row before we write
to the target. We can use the Expression transformation to perform any non-aggregate calculation, and also
to test conditional statements before we output the results to target tables or other transformations.

For example, we might need to adjust employee salaries, concatenate first and last names, or convert strings
to numbers.

2. How many types of ports are there in Expression transform?

Answer:

There are three types of ports: INPUT, OUTPUT and VARIABLE.

3. What is the execution order of the ports in an expression?

Answer:

• All ports are evaluated top to bottom in their physical order, but in the following groups:
   - First, all input ports receive their values.
   - Then all variable ports are evaluated (top to bottom, in their physical order in the expression).
   - Last, all output expressions are evaluated to push values to the output ports.

We can use this to our advantage by placing lookups into variable ports and then using those variables
later in the evaluation cycle.

4. Describe the approach for the requirement. Suppose the input is:

Col1    Col2
10      a
20      b
30      c
40
50      d

The desired output is:

Col1    Col2
10      a
20      a,b
30      a,b,c
40      a,b,c
50      a,b,c,d

Answer: Use an Expression transformation:-

Port Name    Port Type    Expression
Col1         I/O
Col2         I
V_Seq        V            CUME(1)
V_Col2       V            IIF (V_Seq = 1, Col2, IIF ( ISNULL (Col2), Prev_Col2, Prev_Col2 || ',' || Col2))
Prev_Col2    V            V_Col2
Out_Col2     O            Prev_Col2

Keep in mind the string length of the variable and output ports.

The CUME function returns a running total of its argument. If we call it with the argument 1, i.e.
CUME(1), the first call returns 1, the second call returns 2, the third call returns 3, and so on. Since
Informatica processes data row by row, CUME(1) returns 1 when the first row is processed, 2 for the
next row, and so on.
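
The port logic above can be traced with a small Python simulation. This is a sketch only: the variables seq and prev_col2 stand in for the V_Seq and Prev_Col2 ports, None stands in for NULL, and it assumes that Col2 of the first row is not null.

```python
def cumulative_concat(rows):
    """rows: list of (col1, col2) tuples; col2 may be None (NULL)."""
    seq = 0           # V_Seq: running count, like CUME(1)
    prev_col2 = None  # Prev_Col2: value carried over from the previous row
    output = []
    for col1, col2 in rows:
        seq += 1
        if seq == 1:
            v_col2 = col2                    # first row: start the list
        elif col2 is None:
            v_col2 = prev_col2               # NULL: repeat the previous list
        else:
            v_col2 = prev_col2 + "," + col2  # append to the previous list
        prev_col2 = v_col2                   # remember for the next row
        output.append((col1, prev_col2))     # Out_Col2
    return output


result = cumulative_concat([(10, "a"), (20, "b"), (30, "c"), (40, None), (50, "d")])
```

The key point is that V_Col2 reads Prev_Col2 before Prev_Col2 is reassigned, so it still holds the previous row's value at that moment.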

5. How can we implement an aggregation operation without using an Aggregator transformation
in Informatica?

Answer:

We will use a very basic property of the Expression transformation: at any time we can access the
previous row's data as well as the data of the row currently being processed. All we need is a simple
Sorter, Expression and Filter transformation to achieve aggregation at the Informatica level.

For detailed understanding visit Aggregation without Aggregator.
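
As a rough illustration of the Sorter, Expression and Filter idea, the Python sketch below computes a sum per key. The text above does not spell out how the final row of each group is identified, so the look-ahead in the last step is an assumption of this sketch, as is the function name sum_by_key.

```python
def sum_by_key(rows):
    """rows: list of (key, amount) tuples; returns one (key, total) per key."""
    # Sorter: bring rows with the same key together
    ordered = sorted(rows, key=lambda r: r[0])

    # Expression: variables remember the previous key and the running total
    running = []
    prev_key, total = None, 0
    for key, amount in ordered:
        total = amount if key != prev_key else total + amount
        prev_key = key
        running.append((key, total))

    # Filter: keep only the last row of each key, which carries the full total
    return [
        row for i, row in enumerate(running)
        if i == len(running) - 1 or running[i + 1][0] != row[0]
    ]


totals = sum_by_key([("A", 10), ("B", 5), ("A", 20), ("B", 1), ("C", 7)])
```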

6. Scenario Implementation 1
Source

Col1    Col2
A       W
B       R
C       E
A       R
B       E

Target

Col1    READ    WRITE    EXECUTE
A       1       1        0
B       1       0        1
C       0       0        1

In this scenario, the source values W, R and E in Col2 stand for write, read and execute respectively.

Answer:

Take an Expression transformation followed by Aggregator transformation.

In Expression Transformation:

Port Name    Port Type    Expression
Col1         I/O
Col2         I/O
Read         O            IIF ( Col2 = 'R', 1, 0 )
Write        O            IIF ( Col2 = 'W', 1, 0 )
Execute      O            IIF ( Col2 = 'E', 1, 0 )

In Aggregator Transform:

Col1         I/O          GROUP BY
Read         I/O          MAX (Read)
Write        I/O          MAX (Write)
Execute      I/O          MAX (Execute)

7. Scenario Implementation 2

Source data is like below:


Id name1 name2
10 A B
10 C D
20 E F

Desired Target data is like below

Id name
10 AB
10 CD
20 EF


Answer:

Use an Expression transformation to concatenate both values: name = name1 || name2

8. Scenario Implementation 3

Suppose we have a field named DATA in the source file. For each record, we need to output the first 9
characters of DATA, provided that the first 2 characters are alphabetic (A-Z) and the next 7 characters are
alphanumeric (A-Z or 0-9). Records that do not match this condition should be marked as “Invalid”.
How do we implement this?
E.g.

DATA           OUTPUT
AB345GH6756    AB345GH67
CD56789PJ      CD56789PJ
56CHJK97889    Invalid
DG//*67DF      Invalid

Answer:

Use the logic below in an output port of an Expression transformation in Informatica:

IIF( REG_MATCH( SUBSTR(DATA,1,2), '[[:alpha:]]{2}' ) = 1
     AND REG_MATCH( SUBSTR(DATA,3,7), '[[:alnum:]]{7}' ) = 1,
     SUBSTR(DATA, 1, 9), 'Invalid' )
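
The same check can be tried out quickly in Python before building the mapping. This is only a sketch with an assumed function name (validate_data); it also assumes that lowercase letters count as alphabetic, which the requirement above does not state explicitly.

```python
import re

# Two letters followed by seven letters or digits (nine characters in total)
PATTERN = re.compile(r"[A-Za-z]{2}[A-Za-z0-9]{7}")


def validate_data(data):
    """Return the first 9 characters if they match the pattern, else 'Invalid'."""
    candidate = data[:9]  # like SUBSTR(DATA, 1, 9)
    return candidate if PATTERN.fullmatch(candidate) else "Invalid"


samples = ["AB345GH6756", "CD56789PJ", "56CHJK97889", "DG//*67DF"]
results = [validate_data(s) for s in samples]
```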

9. Scenario Implementation 4
How do we convert a Date field coming as data type string from a flat file?

Answer:

Use the date conversion functions:

IIF( IS_DATE( Column1, 'YYYYMMDD' ) = 1, TO_DATE( Column1, 'YYYYMMDD' ), NULL )

In the above example, we have assumed that the date field arrives in the format 'YYYYMMDD'. If the format
is something else (e.g. YYYY-MM-DD), we need to specify that format in both functions instead.
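
For comparison, the "validate first, then convert, else NULL" pattern looks like this in Python. It is a sketch, not Informatica code; the function name to_date_or_none is an assumption, and Python format codes (%Y%m%d) replace the Informatica format string.

```python
from datetime import datetime


def to_date_or_none(value, fmt="%Y%m%d"):
    """Convert a string to a datetime, or return None if it is not a valid date."""
    try:
        return datetime.strptime(value, fmt)
    except (TypeError, ValueError):
        # TypeError: value is None; ValueError: wrong format or impossible date
        return None


valid = to_date_or_none("20130215")
invalid = to_date_or_none("20130230")  # 30 February does not exist
other_format = to_date_or_none("2013-02-15", "%Y-%m-%d")
```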

10. Scenario Implementation 5

Source:

Col1    Col2
1       B
2       C
3       D
4       E

Target

Col1    Col2    Col3    Col4
1       B       2       C
3       D       4       E

Describe the approach for the above scenario, where the 1st source record is loaded to Col1 and Col2 of the
target, the 2nd record to Col3 and Col4 of the same target row, the 3rd record again to Col1 and Col2 of the
next row, and so on.

Answer:

Use an Expression transformation:

Port Name    Port Type    Expression
Col1         I
Col2         I
V_ID         V            1 - MOD (Col1, 2)
O_ID         O            V_ID
O_Col1       O            V_Col1
O_Col2       O            V_Col2
O_Col3       O            Col1
O_Col4       O            Col2
V_Col1       V            Col1
V_Col2       V            Col2

Next use a Filter transformation with condition O_ID = 1

Next map O_Col1, O_Col2, O_Col3, O_Col4 to Col1, Col2, Col3, Col4 of the target respectively.
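
The intent of this mapping, pairing each even-numbered record with the record before it, can be sketched in Python as follows. The sketch assumes that Col1 is a consecutive sequence starting at 1, as in the sample data; the function name pair_rows and the extra guard for a missing previous row are additions made for this illustration.

```python
def pair_rows(rows):
    """rows: list of (col1, col2) tuples, with col1 running 1, 2, 3, ..."""
    output = []
    previous = None  # holds the previous row, like V_Col1 / V_Col2
    for col1, col2 in rows:
        o_id = 1 - (col1 % 2)  # 1 for even col1, 0 for odd col1
        if o_id == 1 and previous is not None:  # Filter: O_ID = 1
            output.append((previous[0], previous[1], col1, col2))
        previous = (col1, col2)
    return output


target = pair_rows([(1, "B"), (2, "C"), (3, "D"), (4, "E")])
```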


3. Filter Transformation

1. What is a Filter Transformation and why is it an Active one?

Answer:

A Filter transformation is an active, connected transformation that filters rows in a mapping.

Only the rows that meet the filter condition pass through the Filter transformation to the next
transformation in the pipeline. TRUE and FALSE are the implicit return values of any filter condition we
set. If the filter condition evaluates to NULL, the row is treated as FALSE. The numeric equivalent of
FALSE is zero (0), and any non-zero value is the equivalent of TRUE.

As an active transformation, the Filter transformation may change the number of rows that pass through it.
The filter condition returns TRUE or FALSE for each row, depending on whether the row meets the specified
condition, and only rows that return TRUE pass through. Discarded rows do not appear in the session log
or reject files.

2. What is the difference between the Source Filter option of the Source Qualifier transformation
and the Filter transformation?

Answer:

SQ Source Filter                                        Filter Transformation

Source Qualifier transformation filters rows when       Filter transformation filters rows from within
they are read from a source.                            a mapping.

Source Qualifier transformation can only filter rows    Filter transformation filters rows coming from any
from relational sources.                                type of source system, at the mapping level.

Source Qualifier limits the row set extracted from      Filter transformation limits the row set sent to
a source.                                               a target.

Source Qualifier reduces the number of rows used        To maximize session performance, include the Filter
throughout the mapping and hence provides better        transformation as close to the sources in the mapping
performance.                                            as possible, to filter out unwanted data early in the
                                                        flow of data from sources to targets.

The filter condition in the Source Qualifier            The Filter transformation can define a condition using
transformation only uses standard SQL, as it runs       any statement or transformation function that returns
in the database.                                        either a TRUE or FALSE value.


4. Joiner Transformation

1. What is a Joiner Transformation and why is it an Active one?

Answer:

A Joiner is an active, connected transformation used to join two source data streams coming from the same
or heterogeneous databases or files.

The Joiner transformation joins sources with at least one matching column. The Joiner transformation uses
a condition that matches one or more pairs of columns between the two sources.

In the Joiner transformation, we must configure the transformation properties namely Join Condition, Join
Type and optionally Sorted Input option to improve Integration Service performance.

The join condition contains ports from both input sources that must match for the Integration Service to
join two rows. Depending on the join condition and the type of join selected, the Integration Service
either adds the row to the result set or discards it. For this reason, the number of rows in the Joiner
output may not be equal to the number of rows in the Joiner input, which is why the Joiner is considered
an active transformation.

2. State the limitations where we cannot use Joiner in the mapping pipeline.

Answer:

The Joiner transformation accepts input from most transformations. However, following are the
limitations:

• A Joiner transformation cannot be used when either of the input pipelines contains an Update
Strategy transformation.
• A Joiner transformation cannot be used if we connect a Sequence Generator transformation directly
before the Joiner transformation.

3. Out of the two input pipelines of a joiner, which one will we set as the master pipeline?

Answer:

During a session run, the Integration Service compares each row of the master source against the
detail source. The master and detail sources need to be configured for optimal performance.

When the Integration Service processes an unsorted Joiner transformation, it blocks the detail source while
it caches rows from the master source. Once the Integration Service finishes reading and caching all master
rows, it unblocks the detail source and reads the detail rows. This is why if we have the source containing
fewer input rows in master, the cache size will be smaller, thereby improving the performance.

For a sorted Joiner transformation, use the source with fewer duplicate key values as the master source for
optimal performance and disk storage. When the Integration Service processes a sorted Joiner
transformation, it caches rows for one hundred keys at a time. If the master source contains many rows with
the same key value, the Integration Service must cache more rows, and performance can be slowed.

The Integration Service can use blocking logic if the master and detail inputs to the Joiner transformation
originate from different sources. Otherwise, it does not use blocking logic and instead stores more rows in
the cache.

4. What are the different types of Joins available in Joiner Transformation?

Answer:

In SQL, a join is a relational operator that combines data from multiple tables into a single result set. The
Joiner transformation is similar to an SQL join except that data can originate from different types of sources.

The Joiner transformation supports the following types of joins:

• Normal
• Master Outer
• Detail Outer
• Full Outer

A normal or master outer join performs faster than a full outer or detail outer join.


5. Define the various Join Types of Joiner Transformation.

Answer:

• In a normal join, the Integration Service discards all rows of data from the master and detail sources
that do not match, based on the join condition.
• A master outer join keeps all rows of data from the detail source and the matching rows from the
master source. It discards the unmatched rows from the master source.
• A detail outer join keeps all rows of data from the master source and the matching rows from the
detail source. It discards the unmatched rows from the detail source.
• A full outer join keeps all rows of data from both the master and detail sources.
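
The four join types can be compared side by side with a small Python sketch. It is a simplified model on lists of dictionaries with a single equality condition; the function name joiner and the sample columns are assumptions for this illustration, and None stands in for NULL.

```python
def joiner(master, detail, key, join_type="normal"):
    """Join two lists of dicts on one key column, Joiner-style."""
    master_cols = {col for row in master for col in row}
    detail_cols = {col for row in detail for col in row}
    result, matched_master = [], set()

    for d in detail:
        # NULL keys never match, mirroring the Joiner's NULL handling
        matches = [i for i, m in enumerate(master)
                   if m[key] is not None and m[key] == d[key]]
        for i in matches:
            matched_master.add(i)
            result.append({**master[i], **d})
        # Master outer / full outer: keep unmatched detail rows
        if not matches and join_type in ("master outer", "full outer"):
            result.append({**dict.fromkeys(master_cols), **d})

    # Detail outer / full outer: keep unmatched master rows
    if join_type in ("detail outer", "full outer"):
        for i, m in enumerate(master):
            if i not in matched_master:
                result.append({**dict.fromkeys(detail_cols), **m})
    return result


master = [{"id": 1, "m": "a"}, {"id": 2, "m": "b"}]
detail = [{"id": 2, "d": "x"}, {"id": 3, "d": "y"}]
normal_rows = joiner(master, detail, "id")
full_rows = joiner(master, detail, "id", "full outer")
```

Note that "master outer" keeps all detail rows and "detail outer" keeps all master rows, which is easy to get backwards.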

6. Describe the impact of number of join conditions and join order in a Joiner.

Answer:

We can define one or more conditions based on equality between the specified master and detail sources.
Both ports in a condition must have the same data type.

If we need to use two ports in the join condition with non-matching data types we must convert the data
types so that they match. The Designer validates data types in a join condition.

Additional ports in the join condition increase the time necessary to join the two sources.

The order of the ports in the join condition can impact the performance of the Joiner transformation. If we
use multiple ports in the join condition, the Integration Service compares the ports in the order we specified.

Only the equality operator is available in a Joiner join condition.

7. How does Joiner transformation treat NULL value matching?


Answer:

The Joiner transformation does not match null values.

For example, if both EMP_ID1 and EMP_ID2 contain a row with a null value, the Integration Service does not
consider them a match and does not join the two rows.

To join rows with null values, replace null input with default values in the Ports tab of the joiner, and then
join on the default values.


If a result set includes fields that do not contain data in either of the sources, the Joiner
transformation populates the empty fields with null values. If we know that a field will return NULL and
we do not want to insert NULLs into the target, we can set a default value on the Ports tab for the
corresponding port.

8. When we configure the join condition, what guidelines do we need to follow to maintain
the sort order?

Suppose we configure Sorter transformations in the master and detail pipelines with the following sorted
ports in order: ITEM_NO, ITEM_NAME and PRICE.

Answer:

If we have sorted both the master and detail pipelines on the ports ITEM_NO, ITEM_NAME and PRICE, in that
order, we must follow these rules:

• Use ITEM_NO in the first join condition.
• If we add a second join condition, it must use ITEM_NAME.
• If we want to use PRICE in a join condition in addition to ITEM_NO, we must also use ITEM_NAME in
the second join condition.
• If we skip ITEM_NAME and join on ITEM_NO and PRICE, we lose the input sort order and the
Integration Service fails the session.
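
The rules above amount to saying that the join ports must be the leading ports of the sort order, taken in the same order and without gaps. A minimal Python sketch of that check, using an assumed helper name preserves_sort_order, could look like this:

```python
def preserves_sort_order(sorted_ports, join_ports):
    """True if join_ports are the leading sorted ports, in the same order."""
    return list(join_ports) == list(sorted_ports[:len(join_ports)])


sorted_ports = ["ITEM_NO", "ITEM_NAME", "PRICE"]
ok = preserves_sort_order(sorted_ports, ["ITEM_NO", "ITEM_NAME"])
broken = preserves_sort_order(sorted_ports, ["ITEM_NO", "PRICE"])  # skips ITEM_NAME
```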

9. Which transformations must not be placed between the sort origin and the Joiner
transformation, so that we do not lose the input sort order?

Answer:

The best option is to place the Joiner transformation directly after the sort origin to maintain sorted data.
In any case, do not place any of the following transformations between the sort origin and the Joiner
transformation:

• Custom
• Unsorted Aggregator
• Normalizer
• Rank
• Union transformation
• XML Parser transformation
• XML Generator transformation
• Mapplet (if it contains any of the above-mentioned transformations)

10. What is the use of sorted input in joiner transformation?

Answer:

It is recommended to join sorted data when possible. We can improve session performance by
configuring the Joiner transformation to use sorted input. When we configure the Joiner transformation
to use sorted data, it improves performance by minimizing disk input and output. The improvement is
greatest when we work with large data sets.

For an unsorted Joiner transformation, designate the source with fewer rows as the master source for
optimal performance and disk storage. During a session, the Joiner transformation compares each row of the
master source against the detail source. The fewer unique rows in the master, the fewer iterations of the
join comparison occur, which speeds up the join process.

11. Can we join two tables based on join columns that have different data types?
For example table 1 EMPNO (string) and table 2 EMPNUM (number)

Answer:

Yes, it is possible. If we are using a Joiner, we can do an explicit data type conversion in an Expression
transformation before joining the tables.

12. Implementation Scenario 1 - A Joiner transformation is joining two tables, s1 and s2. s1 has
10,000 rows and s2 has 1,000 rows. Which table will you set as master for better performance
of the Joiner transformation? Why?

Answer:

Set table s2 as the master table. The Informatica server has to keep the master table in the cache, so
caching 1,000 rows performs better than caching 10,000 rows.


5. Lookup Transformation

1. What is a Lookup transform?

Answer:

The Lookup transformation is used to look up data in a flat file, relational table, view or synonym. The
Informatica server queries the lookup table based on the lookup ports in the transformation. It compares
Lookup transformation port values to lookup table column values based on the lookup condition. The result
is passed to other transformations and the target.

Uses:

• Get a related value.
• Perform a calculation.
• Update slowly changing dimension tables.

2. What are the differences between Connected and Unconnected Lookup?

Answer:

The differences are illustrated in the below table:

Connected Lookup                                        Unconnected Lookup

Connected lookup participates in the data flow and      Unconnected lookup receives input values from the
receives input directly from the pipeline.              result of a :LKP expression in another transformation.

Connected lookup can use both dynamic and static        Unconnected lookup cache can NOT be dynamic.
cache.

Connected lookup can return more than one column        Unconnected lookup can return only one column value,
value (output port).                                    i.e. one output port.

Connected lookup caches all lookup columns.             Unconnected lookup caches only the lookup output ports
                                                        in the lookup conditions and the return port.

Supports user-defined default values (i.e. the value    Does not support user-defined default values.
to return when lookup conditions are not satisfied).

3. What are the different lookup cache(s)?

Answer:

Informatica lookups can be cached or un-cached (no cache). A cached lookup can be either static or
dynamic.


A static cache is one which does not modify the cache once it is built and the data remains same during the
session run.

On the other hand, a dynamic cache is refreshed during the session run by inserting or updating the records
in cache based on the incoming source data.

By default, an Informatica lookup cache is static.

A lookup cache can also be divided as persistent or non-persistent based on whether Informatica retains the
cache even after the completion of session run or deletes it.

4. Is lookup an active or passive transformation?

Answer:

From Informatica 9.x onwards, the Lookup transformation can be configured as an "Active" transformation.

Find out How to configure lookup as active transformation.

However, in earlier versions of Informatica, Lookup is a passive transformation.

5. What is the difference between Static and Dynamic Lookup Cache?

Answer:

We can configure a Lookup transformation to cache the underlying lookup table. In the case of a static or
read-only lookup cache, the Integration Service caches the lookup table at the beginning of the session and
does not update the lookup cache while it processes the Lookup transformation. Rows are not added
dynamically to the cache.

In the case of a dynamic lookup cache, the Integration Service dynamically inserts or updates data in the
lookup cache and passes the data to the target. The dynamic cache is synchronized with the target.
Basically, it caches the rows as and when they are passed.

In case you are wondering why we need to make lookup cache dynamic, read this article on dynamic lookup.

6. What are the uses of index and data caches?

Answer:

The values of the lookup condition columns are stored in the index cache, and the records (output values)
from the lookup are stored in the data cache.

7. What is Persistent Lookup Cache?

Answer:


If the cache generated for a Lookup needs to be preserved for subsequent use, a persistent cache is used:
the Integration Service does not delete the index and data files. It is useful only if the lookup table
remains constant.

Lookups are cached by default in Informatica. A lookup cache can be either non-persistent or persistent. The
Integration Service saves or deletes the lookup cache files after a successful session run, depending on
whether the lookup cache is marked as persistent.

8. What type of join does Lookup support?

Answer:

A Lookup behaves like a SQL LEFT OUTER JOIN.

9. Explain how lookup transformation works like SQL Left Outer Join.

Answer:

A lookup works as follows: if the source input column value matches the comparison column value in the
lookup table, it returns the corresponding values from the lookup table; otherwise it returns NULL.

Let’s consider the EMP table as Source and DEPT table as lookup. We want to extract the location of each
employee based on his or her department number. So if the Location details are not available in the DEPT
table, still we want to have all the other information of the employee coming from the source EMP table,
apart from NULL as location and load in our target table.

So the equivalent SQL query looks like this:

SELECT EMP.*, DEPT.LOC
FROM EMP LEFT OUTER JOIN DEPT
ON EMP.DEPTNO = DEPT.DEPTNO

Hence Lookup is associated with the Source table as Left Outer Join.
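
The left-outer-join behaviour can also be pictured as a dictionary lookup. The Python sketch below is an illustration only: the function name lookup_location and the sample rows are assumptions, and it presumes one DEPT row per DEPTNO (with duplicates, a real SQL left join would return several rows per employee).

```python
def lookup_location(emp_rows, dept_rows):
    """Add LOC to every EMP row; LOC is None when no DEPT row matches."""
    # Lookup cache keyed on the lookup condition column
    cache = {dept["DEPTNO"]: dept["LOC"] for dept in dept_rows}
    # Every source row is kept, matched or not, as in a LEFT OUTER JOIN
    return [{**emp, "LOC": cache.get(emp["DEPTNO"])} for emp in emp_rows]


emp = [{"ENAME": "SMITH", "DEPTNO": 20}, {"ENAME": "KING", "DEPTNO": 50}]
dept = [{"DEPTNO": 20, "LOC": "DALLAS"}]
rows = lookup_location(emp, dept)
```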

10. Where and why do we use Unconnected Lookup instead of Connected Lookup?

Answer:

The best part of an unconnected lookup is that we can call it only when some condition is met, rather
than for every row. That is, we can invoke the unconnected lookup from an Expression transformation
when the condition holds and skip it otherwise. In this way we may optimize the performance of a flow.

We may think of an unconnected lookup as a function in a procedural language: it takes multiple parameters
as input, returns one value, and can be called repeatedly. In the same way, an unconnected lookup can be
used in any scenario where we need to use the lookup repeatedly, either in a single transformation or in
multiple transformations.

With the unconnected lookup, we get the performance benefit of not caching the same data multiple times.
Also it is a good coding practice.


11. How can we Identify Persistent Cache Files in Informatica Server?

Answer:

• Cache files are generated in the cache directory of the Informatica server for transformations like
Aggregator, Joiner, Lookup, Rank and Sorter.
• Two types of cache files are generated, namely data and index files; the exception is the Sorter
transformation.
• The most important point is that Informatica automatically deletes all the generated .dat and .idx
cache files after a session run is finished.
• So the files that remain in the cache directory are basically the persistent cache files of Lookup
transformations, the Aggregator cache files of Incremental Aggregation sessions, or files left behind by
session runs that did not complete successfully.
• Informatica-generated cache files are named as:
PMAGG*.idx, PMAGG*.dat, PMJNR*.idx, PMJNR*.dat, PMLKP*.idx, PMLKP*.dat.
• Often, while handling large caches, Informatica creates multiple index and data files due to paging
and appends a number to the end of the file names, e.g. PMAGG*.dat0, PMAGG*.idx0, PMAGG*.dat1,
PMAGG*.idx1.

So if we have followed a particular naming convention for the Lookup persistent cache name, e.g.
table_name_PC, or the table names follow a convention like GDW_, we can use shell commands accordingly to
identify the cache files on the server.
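
As an alternative to shell commands, the same filtering can be scripted. The Python sketch below matches file names against a prefix plus the .idx/.dat extensions (including numbered variants such as .dat0); the function name find_cache_files and the sample file names are assumptions made for this illustration.

```python
from fnmatch import fnmatchcase


def find_cache_files(file_names, prefix=""):
    """Return the sorted names that look like cache files for the given prefix."""
    patterns = [f"{prefix}*.idx*", f"{prefix}*.dat*"]
    return sorted(
        name for name in file_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    )


# Hypothetical directory listing
listing = ["GDW_CUSTOMER_PC.idx", "GDW_CUSTOMER_PC.dat", "PMAGG123.dat0", "session.log"]
persistent = find_cache_files(listing, "GDW_")
aggregator = find_cache_files(listing, "PMAGG")
```

In practice the listing would come from something like os.listdir on the cache directory, whose path depends on the server configuration.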

In this context you can revisit Lookup Persistent Cache and Incremental Aggregation article

12. How to configure a Lookup on a flat file with header?

Answer:

When we create a Lookup transformation, we have the option to select the location of the lookup table
from any of Source, Target, Source Qualifier, Import from Relational Table or Import from Flat File.

So after selecting the flat file as lookup from the desired location, the edit Transformation tab of the lookup
will have the Flat file information to choose between Delimited or Fixed width and advanced properties to
modify like Column Delimiters, Code Page and obviously Number of initial rows to skip.
Set Number of initial rows to skip as 1. Set the Lookup condition as required.

Apart from that, go to the Mapping tab of the corresponding session and select the Lookup transformation to
configure the lookup source file directory, the file name and the lookup source file type, i.e. Direct or
Indirect.

13. What is the difference between persistent cache and shared cache?

Answer:

A persistent cache is a type of Informatica lookup cache in which the cache file is stored on disk. We
can configure the session to re-cache if necessary. It should be used only if we are sure that the
lookup table will not change between sessions, so it is mostly used when the mapping uses static
tables as lookups.


If the persistent cache is shared across mappings, we call it a shared (named) cache. We provide a name
for this cache file.

If the lookup table is used in more than one transformation or mapping, the cache built for the first lookup
can be used for the others. It can be used across mappings.

For a shared cache we have to give the name in the Cache File Name Prefix property. Use the same name in
every lookup where we want to use the cache.

Unshared cache: within a mapping, if the lookup table is used in more than one transformation, the cache
built for the first lookup can be used for the others. It cannot be used across mappings.

14. Describe how to return multiple port values from unconnected lookup in Informatica.

Answer:

By default, an Informatica unconnected lookup supports only one return port.

As a workaround, we can write a Lookup SQL override that concatenates the required port values into a
single string and use that string as the return port value.

Call the unconnected lookup from an Expression transformation and use separate output ports to retrieve
the individual lookup values from the concatenated return value. Use the SUBSTR and INSTR functions to
extract the column values from the concatenated return field.
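
The concatenate-then-split idea can be sketched in Python as follows. The "|" delimiter, the DNAME/LOC columns and the function names are assumptions for this illustration, and the sketch presumes that the delimiter never occurs inside the data itself.

```python
DELIMITER = "|"  # assumed delimiter; must not occur in the column values


def build_lookup_cache(dept_rows):
    """Mimic a SQL override that returns DNAME and LOC as one concatenated string."""
    return {row["DEPTNO"]: row["DNAME"] + DELIMITER + row["LOC"] for row in dept_rows}


def split_return_value(value):
    """Split the single return value back into its two columns."""
    if value is None:  # lookup condition failed
        return None, None
    position = value.find(DELIMITER)                # like INSTR
    return value[:position], value[position + 1:]   # like SUBSTR


cache = build_lookup_cache([{"DEPTNO": 10, "DNAME": "ACCOUNTING", "LOC": "NEW YORK"}])
dname, loc = split_return_value(cache.get(10))
missing = split_return_value(cache.get(99))
```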

15. How to make the persistent lookup cache in sync with lookup table?

Answer:

To bring the persistent cache in sync with the lookup table, simply enable the Re-cache option of the
Lookup transformation to rebuild the lookup cache from the lookup table. While loading the target
dimension table, we can choose to make the lookup cache dynamic and re-cache persistent, so that once the
dimension is loaded the persistent cache file is in sync and available during the fact table load.

16. If we use persistent cache for a dynamic lookup, will the cache file be updated or inserted
as required?

Answer:

Using a persistent cache does not affect the way a dynamic cache inserts into and updates the cache file.
The only difference is that the cache file gets a proper name, assigned through the persistent named cache,
and can be reused later.

17. Is there anything wrong in sharing a persistent cache between static and dynamic lookup?

Answer:


Static & Dynamic lookup cannot share the same persistent cache.

18. What is the difference between the two update properties - update else insert, insert else
update in dynamic lookup cache?

Answer:


In Dynamic Cache:

• Update else Insert: if the incoming record already exists in the lookup cache, the record is updated
in the cache and in the target; otherwise it is inserted.
• Insert else Update: if the incoming record does not exist in the lookup cache, the record is inserted
into the cache and the target; otherwise it is updated.

These options matter for performance. If we know the nature of the source data, we can set the update option accordingly. If most of the source data is destined for insert, we select Insert else Update; otherwise we go for Update else Insert. Likewise, if the source contains many duplicate records we choose Update else Insert, and if it contains only a few potential duplicates we choose Insert else Update.

19. If the default value for the lookup return port is not set, what will be the output when the
lookup condition fails?

Answer:

NULL will be returned from lookup transformation on lookup condition failure.

20. How can we ensure data is not duplicated in the target when the source has duplicate
records, using lookup transformation?

Answer:

A dynamic lookup cache ensures that duplicate records are not inserted in the target. Configure a dynamic lookup cache on the target table, associate the input ports with the lookup ports, and check the Insert Else Update option. The duplicate records in the source are then eliminated and only unique records are loaded into the target.
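The effect of the dynamic cache can be sketched in Python. This is a simplified illustration, not the Integration Service's implementation: a dictionary stands in for the cache, and the function name `load_unique` and the sample rows are hypothetical.

```python
def load_unique(source_rows, key):
    """Forward only the first row seen for each key, as a dynamic cache would."""
    cache = {}    # stands in for the dynamic lookup cache built on the target
    inserts = []  # rows that would be flagged for insert into the target
    for row in source_rows:
        k = row[key]
        if k not in cache:
            # Key not in cache: the row is new, so cache it and insert it.
            cache[k] = row
            inserts.append(row)
        else:
            # Key already cached: refresh the cache, do not insert again.
            cache[k] = row
    return inserts


rows = [
    {"id": 1, "name": "Sam"},
    {"id": 2, "name": "John"},
    {"id": 1, "name": "Sam"},  # duplicate of the first row
]
print(load_unique(rows, "id"))
```

Because the cache is updated while rows are processed, the second occurrence of a key is recognised within the same session run.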

For more details check, Dynamic Lookup Cache

6. Normalizer Transformation

1. What is a Normalizer transformation?

Answer:

The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs. A Normalizer transformation can appear anywhere in a data flow when you normalize a relational source. Use a Normalizer transformation instead of the Source Qualifier transformation when you normalize a COBOL source. When you drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation appears automatically, creating input and output ports for every column in the source.

2. Scenario Implementation 1

Suppose in our Source Table we have data as given below:

Student Name   Math   Life Science   Physical Science
Sam            100    70             80
John           75     100            85
Tom            80     100            85

We want to load our Target Table as:

Student Name   Subject Name       Marks
Sam            Math               100
Sam            Life Science       70
Sam            Physical Science   80
John           Math               75
John           Life Science       100
John           Physical Science   85
Tom            Math               80
Tom            Life Science       100
Tom            Physical Science   85

Describe your approach.

Answer:

To convert the columns into rows we use the Normalizer transformation, followed by an Expression transformation that decodes which source column each output row came from, so that Subject Name can be populated. For more details on how the mapping is built, see Working with Normalizer.
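The column-to-row conversion and the decode step can be sketched in Python. This is an illustration only: the generated column position (1, 2, 3) plays the role of the column ID that the Expression transformation decodes, and the names `SUBJECTS` and `normalize` are hypothetical.

```python
# Decode table: position of the marks column -> subject name.
SUBJECTS = {1: "Math", 2: "Life Science", 3: "Physical Science"}


def normalize(rows):
    """Turn one (name, math, life, physical) row into three (name, subject, marks) rows."""
    out = []
    for name, *marks in rows:
        for column_id, mark in enumerate(marks, start=1):
            out.append((name, SUBJECTS[column_id], mark))
    return out


source = [
    ("Sam", 100, 70, 80),
    ("John", 75, 100, 85),
    ("Tom", 80, 100, 85),
]
for record in normalize(source):
    print(record)
```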

3. What are levels in Normalizer transformation?

Answer:

The VSAM Normalizer transformation is the Source Qualifier for a COBOL source definition. A COBOL source can contain multiple-occurring data (groups of columns of the same type) and multiple types of records in the same file; levels exist mainly to describe this. The Normalizer tab defines the structure of the source data. A group of columns might define a record in a COBOL source, or it might define a group of multiple-occurring fields in the source.

The column level number identifies groups of columns in the data. Level numbers define a data hierarchy. Columns in a group have the same level number and display sequentially below a group-level column. A group-level column has a lower level number, and it contains no data.

4. What is the purpose of GCID and GK in a Normalizer transformation?

Answer:

Let’s take an example:

Source data is:

Name     FOOD   HOUSERENT   TRANSPORT
Saurav   1000   2000        500
Jenny    2000   2500        700

When we set the OCCURS property of the Normalizer to 3, the Normalizer creates 3 input ports to receive data from the source. Say the 3 columns FOOD, HOUSERENT and TRANSPORT are connected to these 3 input ports. The GCID then takes the values 1, 2 and 3, corresponding to the connected input columns FOOD, HOUSERENT and TRANSPORT, and the Normalizer generates 3 output rows for every source row, one per input column.

The GK, on the other hand, keeps a sequence value running from 1 to the number of source records. It holds the sequence number of the source record being processed.

The table below helps to visualize the Normalizer output in the GCID and GK fields:

Name     EXPENSEHEAD   GCID_EXPENSEHEAD   EXPENSE   GK_EXPENSEHEAD
Saurav   FOOD          1                  1000      1
Saurav   HOUSERENT     2                  2000      1
Saurav   TRANSPORT     3                  500       1
Jenny    FOOD          1                  2000      2
Jenny    HOUSERENT     2                  2500      2
Jenny    TRANSPORT     3                  700       2

7. Rank Transformation

1. What is a Rank Transform?

Answer:

Rank is an Active Connected transformation used to select a set of top or bottom values of data. It basically filters the required number of records from the top or from the bottom.

2. How does a Rank Transform differ from Aggregator Transform functions MAX and MIN?

Answer:

Like the Aggregator transformation, the Rank transformation also groups information. The Rank transformation allows us to select a group of top or bottom values, not just one value as in the case of the Aggregator MAX and MIN functions.

3. How does a Rank Cache work?

Answer:

During a session, the Integration Service compares each input row with the rows in the data cache. If the input row out-ranks a cached row, the Integration Service replaces the cached row with the input row. If we configure the Rank transformation to rank within groups, the Integration Service ranks incrementally for each group it finds. The Integration Service creates an index cache to store the group information and a data cache to store the row data.
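The compare-and-replace behaviour described above can be sketched in Python. This is a simplified model, assuming a top-N ranking on a numeric column; a per-group heap stands in for the index and data caches, and the function name `top_n_per_group` is hypothetical.

```python
import heapq
from collections import defaultdict


def top_n_per_group(rows, group_key, rank_key, n):
    """Keep the n highest-ranked rows per group, replacing out-ranked rows."""
    # group value -> min-heap of (rank value, arrival order, row)
    cache = defaultdict(list)
    for order, row in enumerate(rows):
        heap = cache[row[group_key]]
        entry = (row[rank_key], order, row)
        if len(heap) < n:
            heapq.heappush(heap, entry)      # cache not full yet
        elif entry[0] > heap[0][0]:
            heapq.heapreplace(heap, entry)   # input row out-ranks a cached row
    # Return each group's rows from highest to lowest rank value.
    return {
        group: [e[2] for e in sorted(heap, key=lambda e: (-e[0], e[1]))]
        for group, heap in cache.items()
    }


rows = [
    {"dept": 10, "sal": 100},
    {"dept": 10, "sal": 400},
    {"dept": 10, "sal": 300},
    {"dept": 20, "sal": 150},
]
print(top_n_per_group(rows, "dept", "sal", 2))
```

Only n rows per group are ever held, which is why the cache size depends on the number of groups and ranks rather than on the number of input rows.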

4. What is a RANK port and RANKINDEX?

Answer:

The Rank port is an input/output port used to specify the column for which we want to rank the source values. By default Informatica creates an output port RANKINDEX for each Rank transformation. It stores the ranking position for each row in a group.

5. How can you get ranks based on different groups?

Answer:

Rank transformation lets us group information. We can configure one of its input/output ports as a group by
port. For each unique value in the group port, the transformation creates a group of rows falling within the
rank definition (top or bottom, and a particular number in each rank).

6. What happens if two rank values match?

Answer:

If two rank values match, they receive the same value in the rank index and the transformation skips the
next value.

7. What are the restrictions of Rank Transformation?

Answer:

• We can connect ports from only one transformation to the Rank transformation.
• We can select the top or bottom rank.
• We need to select the Number of records in each rank.
• We can designate only one Rank port in a Rank transformation.

8. How does Rank transformation handle string values?

Answer:

The Rank transformation can return the strings at the top or the bottom of a session sort order. When the Integration Service runs in Unicode mode, it sorts character data in the session using the selected sort order associated with the code page of the Integration Service, which may be French, German, etc. When the Integration Service runs in ASCII mode, it ignores this setting and uses a binary sort order to sort character data.

9. What is Dense Rank and does Informatica support Dense Rank?

Answer:

With a normal RANK, when multiple rows share the same rank the next rank in the sequence is not consecutive. DENSE RANK, on the other hand, assigns consecutive ranks.

Take the following example: Let’s say we want to see the top 2 highest salaries of each department.

DEPTNO   SAL   RANK   DENSE_RANK
10       400   1      1
10       400   1      1
10       300   3      2
10       100   4      3
20       550   1      1
20       150   2      2
20       150   2      2
30       200   1      1
40       600   1      1

So the normal RANK generates a result set in which ranks can be missing (here RANK = 2 is missing for department 10) because multiple records share the same rank. DENSE RANK, on the other hand, generates all the consecutive ranks.

The Informatica Rank transformation performs a simple RANK, not a DENSE RANK. So with the Informatica Rank transformation we may miss consecutive ranks.

10. How do we achieve DENSE_RANK in Informatica?

Answer:

To achieve the DENSE RANK functionality in Informatica we use a combination of Sorter, Expression and Filter transformations. Based on the previous example data set, let’s say we want to get the top 2 highest salaries of each department as per DENSE RANK.

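• Use a SORTER transformation with the sort keys:

DEPTNO ASC, SAL DESC

• After the Sorter place an EXPRESSION transformation. The ports must be in this order, so that V_DEPT_PREV and V_SAL_PREV still hold the previous row's values when V_COMP is evaluated. V_COMP refers to itself for the running rank, because an output port such as RANK cannot be referenced in another port's expression.

PORT_NAME     TYPE   EXPRESSION
DEPTNO        I/O
SAL           I/O
V_COMP        V      IIF (DEPTNO <> V_DEPT_PREV, 1, IIF (SAL <> V_SAL_PREV, V_COMP + 1, V_COMP))
RANK          O      V_COMP
V_DEPT_PREV   V      DEPTNO
V_SAL_PREV    V      SAL

• Next use a FILTER transformation.

FILTER CONDITION: RANK < 3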
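The Sorter, Expression and Filter steps above can be sketched in Python. This is an illustration of the logic only, assuming rows are (DEPTNO, SAL) tuples; the function name `dense_rank_top` is hypothetical.

```python
def dense_rank_top(rows, limit):
    """Return (deptno, sal, dense_rank) for rows whose dense rank is <= limit."""
    # Sorter step: DEPTNO ascending, SAL descending.
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))
    prev_dept = prev_sal = None  # play the role of V_DEPT_PREV and V_SAL_PREV
    rank = 0                     # plays the role of V_COMP
    out = []
    for dept, sal in ordered:
        if dept != prev_dept:
            rank = 1             # new department: restart the ranking
        elif sal != prev_sal:
            rank += 1            # same department, new salary: next rank
        prev_dept, prev_sal = dept, sal
        if rank <= limit:        # Filter step: RANK < 3 when limit is 2
            out.append((dept, sal, rank))
    return out


rows = [(10, 300), (10, 400), (10, 100), (10, 400), (30, 200)]
print(dense_rank_top(rows, 2))
```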

11. The source table has 5 rows and the number of ranks in the Rank transformation is set to 10.
How many rows will the Rank transformation output?

Answer:

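5 rows, ranked 1 to 5. The source has fewer rows than the requested number of ranks, so every row is returned.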

12. How will you load unique records into a target flat file when the source flat files have duplicate data?

Answer:

In the Rank transformation, group the records using a group by port and set the number of ranks to 1. The Rank transformation then returns one row from each group, so each group value appears only once in the output.

8. Router Transformation

1. What is the difference between Router and Filter?

Answer:

The following differences can be noted:

Router: Divides the incoming records into multiple groups based on some conditions. Such groups can be mutually inclusive (different groups may contain the same record).
Filter: Restricts or blocks the incoming record set based on one given condition.

Router: Does not itself block any record. If a record does not match any of the routing conditions, the record is routed to the default group.
Filter: Does not have a default group. If a record does not match the filter condition, the record is blocked.

Router: Acts like a CASE... WHEN statement in SQL (or a switch () statement in C).
Filter: Acts like a WHERE condition in SQL.

In a Filter transformation the records are filtered based on the condition and the rejected rows are discarded. In a Router multiple conditions are defined, and the rows that match none of them remain available through the default group.

2. What is the minimum number of groups we can declare in a Router transformation?

Answer:

We must define at least 1 group condition for a Router transformation. The Router automatically creates another group called Default, which passes the records that do not satisfy the condition of the defined group.

3. Scenario Implementation 1
Loading Multiple Target Tables Based on Conditions- Suppose we have some serial numbers in a flat file
source. We want to load the serial numbers in two target files one containing the EVEN serial numbers and
the other file having the ODD ones.

Answer:

After the Source Qualifier place a Router transformation. Create two groups, EVEN and ODD, with the filter conditions:

• EVEN: MOD(SERIAL_NO, 2) = 0
• ODD: MOD(SERIAL_NO, 2) = 1

Then connect the two groups to the two flat file targets.
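The routing logic, including the Default group, can be sketched in Python. This is a simplified model with hypothetical names (`route`, `groups`); each group is a predicate, and a row goes to every group whose condition it satisfies.

```python
def route(rows, groups):
    """Send each row to every matching group, or to DEFAULT if none match."""
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed


groups = {
    "EVEN": lambda serial_no: serial_no % 2 == 0,  # MOD(SERIAL_NO, 2) = 0
    "ODD": lambda serial_no: serial_no % 2 == 1,   # MOD(SERIAL_NO, 2) = 1
}
print(route([1, 2, 3, 4, 5], groups))
```

With these two conditions every serial number matches exactly one group, so the Default group stays empty.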

4. Scenario Implementation 2
Suppose we have a source table and we want to load three target tables based on source rows such that the first row moves to the first target table, the second row to the second target table, the third row to the third target table, the fourth row again to the first target table, and so on. Describe your approach.

Answer:

Clearly we need a Router transformation to route the source data to the three target tables. The question is what the filter conditions should be.

First we need an Expression transformation that carries all the source table columns along with one more input/output port, say SEQ_NUM, which receives a sequence number for each source row from the NEXTVAL port of a Sequence Generator (Start Value 0, Increment By 1).

The filter conditions for the three Router groups are then:

• MOD(SEQ_NUM, 3) = 1 connected to the 1st target table
• MOD(SEQ_NUM, 3) = 2 connected to the 2nd target table
• MOD(SEQ_NUM, 3) = 0 connected to the 3rd target table
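The round-robin distribution can be sketched in Python. This is an illustration only, assuming the sequence numbers start at 1; the function name `distribute` is hypothetical.

```python
def distribute(rows, targets=3):
    """Assign rows to targets in turn using the remainder of the sequence number."""
    buckets = [[] for _ in range(targets)]
    for seq_num, row in enumerate(rows, start=1):
        remainder = seq_num % targets
        # Remainder 1 -> 1st target, 2 -> 2nd target, 0 -> last target.
        index = (remainder - 1) % targets
        buckets[index].append(row)
    return buckets


print(distribute(["a", "b", "c", "d", "e", "f", "g"]))
```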

5. Scenario Implementation 3
How can we distribute and load ‘n’ source records equally into two target tables, so that each has ‘n/2’ records?

Answer:

• After the Source Qualifier use an Expression transformation.
• In the Expression transformation create a counter variable:

V_COUNTER = V_COUNTER + 1 (variable port)
O_COUNTER = V_COUNTER (output port)

The counter is incremented by 1 for every incoming record.

• Router transformation, with the group filter conditions:

Group_ODD: MOD(O_COUNTER, 2) = 1
Group_EVEN: MOD(O_COUNTER, 2) = 0

Half of the records (all the odd-numbered records) go to Group_ODD and the rest to Group_EVEN.

• Finally connect the two groups to the target tables.

9. Sequence Generator Transformation

1. What is a Sequence Generator Transformation?

Answer:

A Sequence Generator is a Passive and Connected transformation that generates numeric values. It is used to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.

By default this transformation contains only two output ports, CURRVAL and NEXTVAL. We can neither edit or delete these ports nor add ports to this transformation. We can create approximately two billion unique numeric values, the widest range being 1 to 2147483647.

2. Define the Properties available in Sequence Generator transformation in brief.

Answer:

Sequence Generator Properties and their descriptions:

Start Value: Start value of the generated sequence that we want the Integration Service to use if we use the Cycle option. If we select Cycle, the Integration Service cycles back to this value when it reaches the end value. Default is 0.

Increment By: Difference between two consecutive values from the NEXTVAL port. Default is 1.

End Value: Maximum value generated by the Sequence Generator. After reaching this value the session will fail if the Sequence Generator is not configured to cycle. Default is 2147483647.

Current Value: Current value of the sequence. Enter the value we want the Integration Service to use as the first value in the sequence. Default is 1.

Cycle: If selected, when the Integration Service reaches the configured end value for the sequence, it wraps around and starts the cycle again, beginning with the configured Start Value.

Number of Cached Values: Number of sequential values the Integration Service caches at a time. Default value for a non-reusable Sequence Generator is 0. Default value for a reusable Sequence Generator is 1,000.

Reset: Restarts the sequence at the current value each time a session runs. This option is disabled for reusable Sequence Generator transformations.

3. Scenario Implementation 1
Suppose we have a source table populating two target tables. We connect the NEXTVAL port of the Se-
quence Generator to the surrogate keys of both the target tables.
Will the Surrogate keys in both the target tables be same? If not how can we flow the same sequence values
in both of them.

Answer:

When we connect the NEXTVAL output port of the Sequence Generator directly to the surrogate key columns of the target tables, the sequence numbers will not be the same.

A block of sequence numbers is sent to the surrogate key column of one target table. The second target receives a block of sequence numbers from the Sequence Generator transformation only after the first target table has received its block.

Suppose we have 5 rows coming from the source; the targets will then have the sequence values TGT1 (1,2,3,4,5) and TGT2 (6,7,8,9,10), taking Start Value 0, Current Value 1 and Increment By 1.

Now suppose the requirement is to have the same surrogate keys in both the targets. The easiest way to handle this is to put an Expression transformation between the Sequence Generator and the target tables. The Sequence Generator passes unique values to the Expression transformation, and the rows are then routed from the Expression transformation to the targets.
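The difference between the two wirings can be sketched in Python. This is a simplified model, not the Integration Service's actual block handling: `itertools.count` stands in for NEXTVAL, and the function names are hypothetical.

```python
from itertools import count


def load_direct(rows):
    """NEXTVAL wired straight to both targets: each target draws its own values."""
    nextval = count(1)
    tgt1 = [(next(nextval), row) for row in rows]
    tgt2 = [(next(nextval), row) for row in rows]
    return tgt1, tgt2


def load_via_expression(rows):
    """NEXTVAL passes through an expression: one value per row, sent to both targets."""
    nextval = count(1)
    keyed = [(next(nextval), row) for row in rows]
    return list(keyed), list(keyed)


rows = ["a", "b", "c", "d", "e"]
print(load_direct(rows))
print(load_via_expression(rows))
```

In the second function the key is generated once per row before the flow splits, which is why both targets receive identical surrogate keys.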

4. Scenario Implementation 2
Suppose we have 100 records coming from the source, and we use a Sequence Generator to populate a target column.
Suppose the Current Value is 0 and the End Value of the Sequence Generator is set to 80. What will happen?

Answer:

End Value is the maximum value the Sequence Generator will generate. After it reaches the End value the
session fails with the following error message:

TT_11009 Sequence Generator Transformation: Overflow error.

The session failure can be avoided if the Sequence Generator is configured to Cycle through the sequence: whenever the Integration Service reaches the configured End Value for the sequence, it wraps around and starts the cycle again, beginning with the configured Start Value.
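The End Value and Cycle behaviour can be sketched in Python. This is a simplified model, assuming the generator emits the Current Value first; the exception class `SequenceOverflow` and the function `generate` are hypothetical and only mimic the overflow error.

```python
class SequenceOverflow(Exception):
    """Raised when the end value is passed and cycling is off (hypothetical)."""


def generate(n_rows, current=0, start=0, end=80, increment=1, cycle=False):
    """Generate n_rows sequence values, failing or cycling at the end value."""
    values = []
    value = current
    for _ in range(n_rows):
        if value > end:
            if not cycle:
                raise SequenceOverflow(
                    f"end value {end} passed after {len(values)} rows"
                )
            value = start  # wrap around to the configured start value
        values.append(value)
        value += increment
    return values


try:
    generate(100)
except SequenceOverflow as err:
    print("session fails:", err)

print(len(generate(100, cycle=True)))
```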

5. What changes do we observe when we promote a non-reusable Sequence Generator to a reusable one? And what happens if we set the Number of Cached Values to 0 for a reusable transformation?

Answer:

When we convert a non-reusable Sequence Generator to a reusable one, we observe that the Number of Cached Values is set to 1000 by default and that the Reset property is disabled.

When we try to set the Number of Cached Values property of a reusable Sequence Generator to 0 in the Transformation Developer, we encounter the following error message:

The number of cached values must be greater than zero for reusable sequence transformation.

6. How is a Sequence Generator in a mapping handled when we migrate the mapping from one environment to another?

Answer:

While promoting Informatica objects using the Copy Folder Wizard, we can choose either to retain the existing values or to replace them with the values from the source folder.
Generally we retain the current values for the Sequence Generator transformation in the destination folder; otherwise we may end up with duplicate values in the sequence-generated column, which may result in session failure.

The following Informatica metadata query lists the current value of each Sequence Generator transformation:
SELECT
    OPB_SUBJECT.SUBJ_NAME AS "FOLDER NAME",
    OPB_MAPPING.MAPPING_NAME AS "MAPPING NAME",
    REP_WIDGET_INST.INSTANCE_NAME AS "SEQ NAME",
    OPB_WIDGET_ATTR.ATTR_VALUE AS "CURRENT VALUE"
FROM REP_WIDGET_INST
INNER JOIN OPB_MAPPING ON
    (REP_WIDGET_INST.MAPPING_ID = OPB_MAPPING.MAPPING_ID)
INNER JOIN OPB_WIDGET_ATTR ON
    (REP_WIDGET_INST.WIDGET_TYPE = OPB_WIDGET_ATTR.WIDGET_TYPE AND
     REP_WIDGET_INST.WIDGET_ID = OPB_WIDGET_ATTR.WIDGET_ID)
INNER JOIN OPB_SUBJECT ON
    (OPB_MAPPING.SUBJECT_ID = OPB_SUBJECT.SUBJ_ID)
WHERE
    REP_WIDGET_INST.WIDGET_TYPE_NAME LIKE 'Sequence%'
    AND OPB_WIDGET_ATTR.ATTR_ID = 4 --Current Value
ORDER BY OPB_MAPPING.MAPPING_NAME

7. Scenario Implementation 3
Consider we have two mappings that populate a single target table from two different source systems. Both mappings have a Sequence Generator transformation to generate the surrogate key in the target table. How can we ensure that the generated surrogate keys are consistent and contain no duplicate values when the data is populated from two different mappings?

Answer:

We should use a Reusable Sequence Generator in both the mappings to generate the target surrogate keys.

8. How do I get a Sequence Generator to "pick up" where another "left off"?

Answer:

Use an unconnected Lookup on the sequence ID column of the target table. Set the lookup to return the "LAST VALUE" on multiple matches, with an ID as the input port and the lookup condition SEQ_ID >= input_ID. Then, in an Expression transformation, connect a NEW self-resetting Sequence Generator to a new input port (input_seq) and set up a variable port v_seq with the expression: IIF( v_seq = 0 OR ISNULL(v_seq) = true, :LKP.lkp_sequence(1), v_seq). Finally, set up an output port with the expression: v_seq + input_seq (input_seq comes from the resetting Sequence Generator). This completes an "append" without a break in the sequence numbers.
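The idea behind this approach can be sketched in Python. This is a simplified illustration: the highest existing key stands in for the lookup result (v_seq), `itertools.count` stands in for the self-resetting Sequence Generator (input_seq), and the function name `append_keys` is hypothetical.

```python
from itertools import count


def append_keys(existing_ids, new_rows):
    """Continue numbering new rows after the highest key already in the target."""
    v_seq = max(existing_ids, default=0)  # looked up once, then reused
    input_seq = count(1)                  # restarts at 1 on every run
    return [(v_seq + next(input_seq), row) for row in new_rows]


print(append_keys([1, 2, 3], ["x", "y"]))
```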

10. Stored Procedure Transformation

1. What is a Stored Procedure Transformation?

Answer:

Stored Procedure is a Passive transformation used to execute stored procedures pre-built on the database
through Informatica ETL. It can also be used to call functions to return calculated values.

2. How many types of Stored Procedure transformation are there?

Answer:

There are two types of Stored Procedure transformation based on how they are called: Connected and Unconnected. Based on the execution order they can be classified as Source Pre Load, Source Post Load, Normal, Target Pre Load and Target Post Load.

Normal Stored Procedure transformation can be configured as both connected and unconnected whereas
Pre-Post Load Stored Procedures are unconnected ones.

3. How do we call an Unconnected Stored Procedure transformation?

Answer:

The unconnected Stored Procedure transformation is called from an Expression transformation using the syntax
:SP.<Stored_Procedure_Name>(Argument1, Argument2).

Unlike the connected one, an unconnected Stored Procedure can be executed conditionally.

4. How do we set the Execution order of Pre-Post Load Stored Procedure?


Answer:

We set the execution order using the Stored Procedure Plan from the mapping property.

5. How do we set the Call Text for Stored Procedure transformation?

Answer:

Once we specify the Stored Procedure Type other than Normal, the Call Text Attribute in the Properties tab
gets enabled. Here we have to specify how the procedure has to be called along with arguments to be
passed. E.g. <Stored_Procedure_Name>(Argument1, Argument2).

6. How do we receive output/return parameters from Unconnected Stored Procedure?

Answer:

Configure the expression to send any input parameters and capture any output parameters or return value. You must know whether the parameters shown in the Expression Editor are input or output parameters. Insert variables or port names between the parentheses in the exact order in which they appear in the stored procedure itself. The datatypes of the ports and variables must match those of the parameters passed to the stored procedure.

For example, when you click the stored procedure, something similar to the following appears:

:SP.GET_NAME_FROM_ID()

This particular stored procedure requires an integer value as an input parameter and returns a string value as an output parameter. How the output parameter or return value is captured depends on the number of output parameters and whether the return value needs to be captured.

If the stored procedure returns a single output parameter or a return value (but not both), you should use
the reserved variable PROC_RESULT as the output variable. In the previous example, the expression would
appear as:

:SP.GET_NAME_FROM_ID(inID, PROC_RESULT)

inID can be either an input port for the transformation or a variable in the transformation. The value of PROC_RESULT is applied to the output port for the expression.

If the stored procedure returns multiple output parameters, you must create variables for each output parameter. For example, if you created a port called varOUTPUT2 for the stored procedure expression, and a variable called varOUTPUT1, the expression would appear as:

:SP.GET_NAME_FROM_ID (inID, varOUTPUT1, PROC_RESULT)

The value of the second output parameter is applied to the output port for the expression, and the value of the first output parameter is applied to varOUTPUT1. The output parameters are returned in the order they are declared in the stored procedure itself.

With all these expressions, the datatypes for the ports and variables must match the datatypes for the input/output variables and return value.

11. Sorter Transformation

1. What is a Sorter Transformation?

Answer:

Sorter is an Active Connected transformation used to sort data in ascending or descending order according to specified sort keys. The Sorter transformation contains only input/output ports.

2. Why is Sorter an Active Transformation?

Answer:

This is because we can select the “distinct” option in the Sorter properties. When the Sorter transformation is configured to treat output rows as distinct, it assigns all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation. The number of output rows may therefore differ from the number of input rows, and hence it is an Active transformation.

3. How does Sorter handle Case Sensitive sorting?

Answer:

The Case Sensitive property determines whether the Integration Service considers case when sorting data. When we enable the Case Sensitive property, the Integration Service sorts uppercase characters higher than lowercase characters.

4. How does Sorter handle NULL values?

Answer:

We can configure the way the Sorter transformation treats null values. Enable the property Null Treated Low if we want to treat null values as lower than any other value when it performs the sort operation. Disable this option if we want the Integration Service to treat null values as higher than any other value.

5. How does a Sorter Cache work?

Answer:

The Integration Service passes all incoming data into the Sorter cache before the Sorter transformation performs the sort operation.

The Integration Service uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. If it cannot allocate enough memory, the Integration Service fails the session. For best performance, configure the Sorter cache size with a value less than or equal to the amount of available physical RAM on the Integration Service machine.

If the amount of incoming data is greater than the Sorter cache size, the Integration Service temporarily stores data in the Sorter transformation work directory. The Integration Service requires disk space of at least twice the amount of incoming data when storing data in the work directory.

6. How to delete duplicate records or rather to select distinct rows for flat file sources?

Answer:

Since the source is a flat file, we cannot select the Distinct option in the Source Qualifier; it is disabled for flat file sources. The next approach is to use a Sorter transformation and check its Distinct option. When we select the Distinct option, all the columns are selected as keys, in ascending order by default.

12. Union Transformation

1. What is a Union Transformation?

Answer:

Union is an Active, Connected, non-blocking, multiple input group transformation used to merge data from multiple pipelines or sources into one pipeline branch. Similar to the UNION ALL SQL statement, the Union transformation does not remove duplicate rows.

2. What are the restrictions of Union Transformation?

Answer:

• All input groups and the output group must have matching ports. The precision, data type, and scale must be identical across all groups.
• We can create multiple input groups, but only one default output group.
• The Union transformation does not remove duplicate rows.
• We cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.
• The Union transformation does not generate transactions.

3. How come union transformation is active?

Answer:

Active transformations are those that may change the number or position of rows in the data stream. Any transformation that splits or combines data streams or reduces, expands or sorts data is an active transformation because it cannot be guaranteed that when data passes through the transformation the number of rows and their position in the data stream are always unchanged.

Union is an active transformation because it combines two or more data streams into one. Though the total number of rows passing into the Union is the same as the total number of rows passing out of it, and the sequence of rows from any given input stream is preserved in the output, the positions of the rows are not preserved, i.e. row number 1 from input stream 1 might not be row number 1 in the output stream. Union does not even guarantee that the output is repeatable.

Seen from any single input group, the number of input rows also does not match the number of output rows. Consider two sources with 10 and 20 rows respectively: each input group receives 10 or 20 rows, but the output group carries 30 rows. We could compare this to a Joiner with 10 and 20 rows using a Full Outer Join with no matching columns, which would also give all the rows as output.

Why the Union transformation is Active is a debatable topic. The Union transformation is derived from the Multigroup External transformation. As the Multigroup External transformation is Active, the Union transformation can be termed active.

13. Update Strategy Transformation

1. What is Update Strategy transform?

Answer:

Update strategy defines the sources to be flagged for insert, update, delete, and reject at the targets.

2. What are Update Strategy Constants?

Answer:

• DD_INSERT - 0
• DD_UPDATE - 1
• DD_DELETE - 2
• DD_REJECT - 3

3. How can we update a record in target table without using Update strategy?

Answer:

A target table can also be updated without using an Update Strategy transformation. For this, we need to define the key of the target table at the Informatica level and then connect the key and the field we want to update in the mapping target. At the session level, we should set the target property to “Update as Update” and enable the “Update” check-box, with Treat Source Rows As set to Update.

Let's assume we have a target table "Customer" with the fields "Customer ID", "Customer Name" and "Customer Address". Suppose we want to update "Customer Address" without an Update Strategy. We then have to define "Customer ID" as the primary key at the Informatica level and connect the Customer ID and Customer Address fields in the mapping. If the session properties are set correctly as described above, the mapping will update only the customer address field for all matching customer IDs.

4. What is Data Driven?

Answer:

Data Driven is a value of the session property Treat Source Rows As. It is the default option selected when a mapping contains an Update Strategy transformation. With this option, the Integration Service follows the instructions coded in the mapping to flag the rows for insert, update, delete or reject.


5. What happens when DD_UPDATE is defined in update strategy and Treat source rows as
INSERT is selected in Session?

Answer:

If Treat Source Rows As in the session is set to anything other than Data Driven, the Update Strategy in the mapping is ignored. In this case the rows are treated as inserts despite DD_UPDATE.

6. What are the three areas where the rows can be flagged for particular treatment?

Answer:

• In Mapping - Update Strategy
• In Session - Treat Source Rows As
• In Session - Target Insert / Update / Delete Options

7. By default, the operation code for any row in Informatica is INSERT unless it is altered. When, then, do we need DD_INSERT?

Answer:

When we handle insertion, updating, deletion and/or rejection of data in a single mapping, we use the Update Strategy transformation to flag the rows for insert, update, delete or reject. We flag a row either by providing the values 0, 1, 2, 3 respectively or by using DD_INSERT, DD_UPDATE, DD_DELETE or DD_REJECT in the Update Strategy transformation. By default the transformation has the value '0' and hence it performs insertion.

Suppose we want to insert into or update the target table in a single pipeline. We can then write the expression below in the Update Strategy transformation to insert or update based on the incoming row.
IIF (ISNULL(LKP_EMPLOYEE_ID), DD_INSERT, DD_UPDATE)

If we can use more than one pipeline, this is not a problem. For the insert part we do not even need an explicit Update Strategy transformation (DD_INSERT); we can map it straight away.
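As a rough illustration of the single-pipeline insert-or-update logic, the Python sketch below mimics the flagging expression. The names `flag_row` and `load`, the dictionary standing in for the target table, and the sample employee rows are all hypothetical; this is a simulation of the idea, not Informatica code.

```python
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(lkp_employee_id):
    """A NULL lookup result means the key is new, so insert; otherwise update."""
    return DD_INSERT if lkp_employee_id is None else DD_UPDATE

def load(target, source_rows):
    """Flag each incoming row, apply it to the target, and return the flags."""
    flags = []
    for emp_id, name in source_rows:
        # Stand-in for a lookup on the target table by employee id.
        lkp_employee_id = emp_id if emp_id in target else None
        flags.append((emp_id, flag_row(lkp_employee_id)))
        target[emp_id] = name  # insert or update, depending on the flag
    return flags

employees = {101: "Ann"}
flags = load(employees, [(101, "Ann B."), (102, "Bob")])
print(flags)
```

Row 101 already exists and is flagged for update, while row 102 is new and is flagged for insert.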

8. What is the difference between Update Strategy and the following update options in the target: Update as Update, Update as Insert, Update else Insert? Even if we do not use an Update Strategy, we can still update the target by setting, for example, Update as Update and treating source rows as Update. So what is the difference here?

Answer:

The operations for the following options are done at the database level.

• Update as Update
• Update as Insert
• Update else Insert


For these options, a 'select' statement is issued against the target table and the result is compared with the source. Accordingly, if the record already exists it is updated, otherwise it is inserted. With the Update Strategy, on the other hand, the operations are done at the Informatica level itself.

Update Strategy also gives a conditional update option, wherein based on some condition you can update, insert or even reject rows. Such conditional options are not available in target-based updates (which will either “update” or perform “update else insert” based on the keys defined at the Informatica level).

9. What is the use of Forward Rejected Rows in a mapping?

Answer:

If DD_REJECT is used in the Update Strategy, we need to select this option to generate the reject (bad) file.

10. Scenario Implementation 1
Suppose we have a source employee table and we want to load employees who belong to department 10 to Target 1, 20 to Target 2 and 30 to Target 3. Describe the approach without using Filter or Router transformations.

Answer:

We will use three separate Update Strategy transformations, one before each of the target tables (T1, T2, T3), and provide the conditions below in their expression editors:

UPD_T1: IIF (DEPTNO = 10, DD_INSERT, DD_REJECT)
UPD_T2: IIF (DEPTNO = 20, DD_INSERT, DD_REJECT)
UPD_T3: IIF (DEPTNO = 30, DD_INSERT, DD_REJECT)
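To see why these three expressions behave like a router, the following Python sketch simulates them. It is a minimal simulation only: the employee names, the `update_strategy` helper and the lists used as targets are hypothetical.

```python
DD_INSERT, DD_REJECT = 0, 3

def update_strategy(deptno, wanted_deptno):
    """Mirror IIF(DEPTNO = n, DD_INSERT, DD_REJECT) for one target."""
    return DD_INSERT if deptno == wanted_deptno else DD_REJECT

# Hypothetical source rows: (ename, deptno)
employees = [("SMITH", 20), ("ALLEN", 30), ("CLARK", 10), ("KING", 10)]

# Every row flows to all three Update Strategy transformations;
# each target keeps only the rows flagged for insert.
targets = {10: [], 20: [], 30: []}
for wanted, rows in targets.items():
    rows.extend(e for e in employees if update_strategy(e[1], wanted) == DD_INSERT)

print(targets)
```

Each row reaches all three branches but is inserted in exactly one target and rejected in the other two.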


14. Java Transformation

1. Scenario Implementation 1
Source:

Col1 Col2
A 3
B 2
C 2

Target:

Col1 Col2
A 3
A 3
A 3
B 2
B 2
C 2
C 2

Answer:

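Using an active Java transformation in Informatica, we can generate as many output rows per input row as the requirement calls for. Here is the Java code for the On Input Row tab, with Col1 and Col2 as input ports and Out_Col1 and Out_Col2 as output ports:

// Copy the input ports into local variables
// (Col1 is taken to be a string port and Col2 an integer port).
String In_Col1 = Col1;
int In_Col2 = Col2;

// Emit the current row Col2 times.
for (int i = 0; i < In_Col2; i++) {
    Out_Col1 = In_Col1;
    Out_Col2 = In_Col2;
    generateRow();
}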
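The same row-multiplication logic can be checked outside Informatica with a short Python sketch. The function name `expand` is hypothetical; the data is the source table from the scenario.

```python
def expand(rows):
    """Yield each (col1, col2) row col2 times, like one generated row per loop pass."""
    for col1, col2 in rows:
        for _ in range(col2):
            yield col1, col2

source = [("A", 3), ("B", 2), ("C", 2)]
target = list(expand(source))
print(target)
```

The result has seven rows, matching the target table in the scenario.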

2. Scenario Implementation 2
How can we replace the characters (e.g. A to Z) in a particular string with their ASCII values?
E.g. Input string: AB123C1; Output string: 6566123671
Answer:

If the INPUT string has a fixed size of 9 characters, use the code below as the expression of an output port in an Informatica Expression transformation.
Alternatively, you can use an Informatica User-Defined Function with the INPUT string as an argument:

IIF( IS_NUMBER( SUBSTR( INPUT, 1, 1 ) ) = 1, SUBSTR( INPUT, 1, 1 ),


TO_CHAR( ASCII( SUBSTR( INPUT, 1, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 2, 1 ) ) = 1, SUBSTR( INPUT, 2, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 2, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 3, 1 ) ) = 1, SUBSTR( INPUT, 3, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 3, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 4, 1 ) ) = 1, SUBSTR( INPUT, 4, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 4, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 5, 1 ) ) = 1, SUBSTR( INPUT, 5, 1 ),
57
© www.dwbiconcepts.com – All rights reserved.
www.dwbiconcepts.com – Community of DWBI Professionals

TO_CHAR( ASCII( SUBSTR( INPUT, 5, 1 ) ) ) ) ||


IIF( IS_NUMBER( SUBSTR( INPUT, 6, 1 ) ) = 1, SUBSTR( INPUT, 6, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 6, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 7, 1 ) ) = 1, SUBSTR( INPUT, 7, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 7, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 8, 1 ) ) = 1, SUBSTR( INPUT, 8, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 8, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 9, 1 ) ) = 1, SUBSTR( INPUT, 9, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 9, 1 ) ) ) )

As per the requirement, we want to convert just the characters in the input string to their ASCII equivalents, not the digits.

If the requirement were to convert a single character to its ASCII equivalent, the built-in ASCII function of Informatica would have been sufficient, e.g. ASCII(inp_chr). But since this is a string and we need the ASCII equivalent of each character in it, we have to parse every character, and the concept of a loop comes into the picture. So use the Informatica Java transformation.

Use a passive Informatica Java transformation:

Create the input port INPUT and the output port OUTPUT in the Java transformation. On the Java Code tab of the Java transformation, use the Java code below:

String inp = INPUT;
String out = "";

for (int i = 0; i < inp.length(); i++) {
    char c = inp.charAt(i);
    if (!Character.isDigit(c)) {
        // Non-digit: append its character code, e.g. 'A' becomes 65.
        out = out + (int) c;
    } else {
        // Digit: copy it unchanged.
        out = out + c;
    }
}
OUTPUT = out;
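To verify the expected output from the question, the same per-character logic can be sketched in Python. The function name `letters_to_ascii` is hypothetical, and the sketch assumes plain ASCII input.

```python
def letters_to_ascii(text):
    """Keep digits as they are and replace every other character by its code."""
    return "".join(ch if ch.isdigit() else str(ord(ch)) for ch in text)

print(letters_to_ascii("AB123C1"))
```

A, B and C become 65, 66 and 67 while the digits are copied through, which reproduces the output string given in the question.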


15. Source Qualifier Transformation

1. What is a Source Qualifier? What are the tasks we can perform using a Source Qualifier and why is it an ACTIVE transformation?

Answer:

A Source Qualifier is an active and connected transformation that reads the rows from a relational database or flat file source.

• We can configure the SQ to join (both INNER and OUTER JOIN) data originating from the same source database.
• We can use a source filter to reduce the number of rows the Integration Service queries.
• We can specify a number for sorted ports and the Integration Service adds an ORDER BY clause to the default SQL query.
• We can choose the Select Distinct option for relational databases and the Integration Service adds a SELECT DISTINCT clause to the default SQL query.
• We can also write a custom/user-defined SQL query, which overrides the default query in the Source Qualifier, by changing the default settings of the transformation properties for relational databases.
• We also have the option to write Pre- and Post-SQL statements to be executed before and after the Source Qualifier query in the source database.

The transformation provides the Select Distinct property. When it is enabled, the Integration Service adds a SELECT DISTINCT clause to the default SQL query, which in turn affects the number of rows returned by the database to the Integration Service. Hence it is an active transformation.

2. What happens to a mapping if we alter the data types between a Source and its corresponding Source Qualifier?

Answer:

The Source Qualifier transformation displays the Informatica data types. The transformation data types determine how the source database binds data when the Integration Service reads it.

If we alter the data types in the Source Qualifier transformation, or if the data types in the Source definition and the Source Qualifier transformation do not match, the Designer marks the mapping as invalid when we save it.

3. Suppose we have used the Select Distinct and the Number of Sorted Ports property in the
Source Qualifier and then we add Custom SQL Query. Explain what will happen.

Answer:

Whenever we add a Custom SQL or SQL override query, it overrides the User-Defined Join, Source Filter, Number of Sorted Ports, and Select Distinct settings in the Source Qualifier transformation. Hence only the user-defined SQL query will be fired in the database and all the other options will be ignored.


4. Describe the situations where we will use the Source Filter, Select Distinct and Number of
Sorted Ports properties of Source Qualifier transformation.

Answer:

The Source Filter option is used basically to reduce the number of rows the Integration Service queries, so as to improve performance.

The Select Distinct option is used when we want the Integration Service to select unique values from a source. Filtering out unnecessary data earlier in the data flow improves performance.

The Number Of Sorted Ports option is used when we want the source data to be sorted, so that it can be used by following transformations such as Aggregator or Joiner, which improve performance when configured for sorted input.

5. What will happen if the SELECT list COLUMNS in the Custom override SQL Query and the
OUTPUT PORTS order in Source Qualifier transformation do not match?

Answer:

If the order of the columns selected in the SQL Query override of the Source Qualifier does not match the order of the connected output ports of the transformation, the ports may receive unexpected values if the data types happen to match; otherwise the session fails.

6. What happens if we include the keyword WHERE in the Source Filter property of the SQ transformation, say, WHERE CUSTOMERS.CUSTOMER_ID > 1000?

Answer:

We use the Source Filter to reduce the number of source records. If we include the string WHERE in the source filter, the Integration Service fails the session. In the above case, the correct syntax is CUSTOMERS.CUSTOMER_ID > 1000
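One way to picture why the keyword must be left out is the Python sketch below. It rests on an assumption: that the filter text is appended after a WHERE keyword generated for the default query. The `build_query` helper is hypothetical and only illustrates that idea.

```python
def build_query(columns, table, source_filter=None):
    """Build a default SELECT and append the source filter after a generated WHERE."""
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if source_filter:
        sql += f" WHERE {source_filter}"
    return sql

good = build_query(["CUSTOMERS.CUSTOMER_ID"], "CUSTOMERS", "CUSTOMERS.CUSTOMER_ID > 1000")
bad = build_query(["CUSTOMERS.CUSTOMER_ID"], "CUSTOMERS", "WHERE CUSTOMERS.CUSTOMER_ID > 1000")
print(good)
print(bad)  # contains "WHERE WHERE", which is invalid SQL
```

Under that assumption, typing WHERE into the filter yields a duplicated keyword and the database rejects the statement.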

7. Describe the scenarios where we go for Joiner transformation instead of Source Qualifier
transformation.

Answer:

We use the Joiner transformation to join source data from heterogeneous sources as well as to join flat files. Use the Joiner transformation when we need to join the following types of sources:

• Join data from different relational databases.
• Join data from different flat files.
• Join relational sources and flat files.


8. What is the maximum number we can use in Number of Sorted Ports for Sybase source
system?

Answer:

Sybase supports a maximum of 16 columns in an ORDER BY clause. So if the source is Sybase, do not sort more than 16 columns.

9. What is use of Source Qualifier in Informatica? Can we create a mapping without a source
qualifier?

Answer:

The Source Qualifier is used to convert the data types of the heterogeneous source objects supported by Informatica to native Informatica data types, after which Informatica processes the following objects in a mapping with consistent Informatica data types.

Also, for relational tables, the Source Qualifier helps to join multiple tables from the same database and allows Pre- or Post-SQL operations.

We cannot create a mapping without a Source Qualifier; it is the first transformation in Informatica that is attached to the source table or source flat file instance.

10. Suppose we have two tables of the same database type, residing in different database instances. If a Database Link is available, how can we join the two tables using a Source Qualifier in Informatica, provided there are valid join columns?

Answer:

Source Qualifier Override:

SELECT e.empno, e.ename, s.salary, s.comm
FROM emp e, sal@dblinkname s
WHERE e.empno = s.empno

It is advisable to create a public synonym in the database for the remote tables so that we can avoid using the syntax TableName@DBLinkName.

11. What is the meaning of the “output is deterministic” property in the Source Qualifier transformation?

Answer:

Output is deterministic means we are informing Informatica that the output does not change (for the same input) across session runs. Why is this required? Consider that the source is relational and we have enabled the session for recovery. The session fails and we resume it. In this case, if we have set the source as deterministic, the session would have created a cache (on disk) of the source during the normal run to be used for recovery. This saves time during recovery because we need not issue the SQL command to the source database again.

If this property is not set, the source data cache is not created during the normal run and the SQL is reissued during recovery. In some cases, if this property is not set you will not be able to enable recovery for the session.

12. Scenario Implementation 1
How to delete duplicate rows present in relational database using Informatica? Suppose we have duplicate
records in Source System and we want to load only the unique records in the Target System eliminating the
duplicate rows. What will be the approach?

Answer:

Assuming that the source system is a Relational Database, to eliminate duplicate records, we can check
the Distinct option of the Source Qualifier of the source table and load the target accordingly.


16. Miscellaneous

1. What are the new features of Informatica 9.x in developer level?

Answer:

From a developer's perspective, some of the new features in Informatica 9.x are as follows:

• Lookup can now be configured as an active transformation: it can return multiple rows on a successful match.
• SQL override can now be written on an un-cached lookup as well. Previously it could be done only on a cached lookup.
• You can control the size of your session log. In a real-time environment you can control the session log file size or time.
• Database deadlock resilience: this ensures that your session does not immediately fail if it encounters a database deadlock; it will now retry the operation. You can configure the number of retry attempts.
• Cache can be updated based on a condition or expression.
• New interface for the admin console, now called Informatica Administrator (create connection objects, grant permission on database connections, deploy or configure deployment units from the Informatica Administrator).
• PowerCenter licensing is now based on the number of CPUs and repositories.

2. Name the transformations which convert one row to many rows, i.e. increase the input-to-output row count. Also, what is the name of the reverse transformation?

Answer:

The Normalizer and Router transformations are two active transformations that can output more rows than they receive.

The Aggregator transformation performs the reverse action of the Normalizer transformation.

3. In how many ways can we filter records?

Answer:

 Source Qualifier
 Filter transformation
 Router transformation
 Update strategy

4. What are the transformations that use cache for performance?

Answer:

Aggregator, Sorter, Lookups, Joiner and Rank transformations use cache.

5. What is the formula for calculation of Lookup/Rank/Aggregator index & data caches?

Answer:

• Index cache size = Total no. of rows * size of the column in the lookup condition (e.g. 50 * 4)
• Aggregator/Rank transformation data cache size = (Total no. of rows * size of the column in the lookup condition) + (Total no. of rows * size of the connected output ports)
• Aggregator index cache: #Groups * ((Σ column size) + 7)
• Aggregator data cache: #Groups * ((Σ column size) + 7)
• Lookup index cache: #Rows in lookup table * ((Σ column size) + 16)
• Lookup data cache: #Rows in lookup table * ((Σ column size) + 8)
• Joiner index cache: #Master rows * ((Σ column size) + 16)
• Joiner data cache: #Master rows * ((Σ column size) + 8)
• Rank index cache: #Groups * ((Σ column size) + 7)
• Rank data cache: #Groups * ((#Ranks * (Σ column size + 10)) + 20)
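A few of the formulas above can be worked through with a short Python sketch. The function names and the sample row counts and column sizes are hypothetical; the constants are taken from the list above as stated, so treat the results as rough estimates in bytes.

```python
def lookup_index_cache(rows, condition_column_sizes):
    """#Rows in lookup table * ((sum of condition column sizes) + 16)."""
    return rows * (sum(condition_column_sizes) + 16)

def lookup_data_cache(rows, output_column_sizes):
    """#Rows in lookup table * ((sum of output column sizes) + 8)."""
    return rows * (sum(output_column_sizes) + 8)

def rank_data_cache(groups, ranks, column_sizes):
    """#Groups * ((#Ranks * (sum of column sizes + 10)) + 20)."""
    return groups * (ranks * (sum(column_sizes) + 10) + 20)

# 50 lookup rows, one 4-byte condition column, two output columns of 20 and 8 bytes
print(lookup_index_cache(50, [4]))
print(lookup_data_cache(50, [20, 8]))
# 10 groups, top 5 ranks, columns of 4 and 16 bytes
print(rank_data_cache(10, 5, [4, 16]))
```

Each function multiplies the number of rows or groups by the per-row size, which is the summed column size plus a fixed overhead.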

6. What is the difference between Informatica PowerCenter, PowerExchange and PowerMart?

Answer:

PowerCenter:

• PowerCenter can have many repositories.
• It supports the Global Repository and networked local repositories.
• PowerCenter can connect to all native legacy source systems such as Mainframe, ERP, CRM, EAI (TIBCO, MSMQ, JMQ).
• High Availability and load sharing on multiple servers in the grid.
• Informatica session-level partitioning is available.
• Informatica Pushdown Optimizer is available.

PowerMart:

• PowerMart supports only one repository.
• PowerMart can connect to relational and flat file sources.

PowerExchange:

• PowerExchange Client and PowerExchange ODBC are PowerExchange interfaces to extract and load data for a variety of data types on a variety of platforms (relational, non-relational, and changed data) in batch mode or real time using PowerCenter.
• The PowerExchange Client for PowerCenter is installed with PowerCenter and integrates PowerExchange (separate license for the required source system; check Sources -> Import from PowerExchange) and PowerCenter to extract relational, non-relational, and changed data.

7. How do we handle delimiter character as a part of the data in a delimited source file?

Answer:

For delimited files, the delimiter is the separator that identifies the data values of the fields present in the file. So ideally, if the data file contains the delimiter character as part of the data in a field value, either the field value is enclosed within double or single quotes, or an escape character precedes the delimiter that is to be treated as a normal character.

To handle such flat files in Informatica, use the following options, as per the data file format, while defining the file structure.

1. Set Optional Quotes to Double or Single Quote. The column delimiters within the quote characters are ignored.

2. Set the Escape Character used to escape the delimiter or quote character. An escape character preceding the delimiter character in an unquoted string, or the quote character in a quoted string, makes that character be treated as a regular character.
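The two conventions can be tried out with Python's standard csv module, which offers comparable quote and escape settings. This is only an analogy for the flat file options above, not Informatica behaviour, and the sample records are hypothetical.

```python
import csv
import io

# Convention 1: the field containing the delimiter is enclosed in double quotes.
quoted = io.StringIO('101,"Smith, John",NY\n')
rows_quoted = list(csv.reader(quoted, delimiter=",", quotechar='"'))

# Convention 2: a backslash escape character precedes the delimiter.
escaped = io.StringIO("101,Smith\\, John,NY\n")
rows_escaped = list(
    csv.reader(escaped, delimiter=",", quoting=csv.QUOTE_NONE, escapechar="\\")
)

print(rows_quoted)
print(rows_escaped)
```

Both files parse to the same three fields, with the comma inside the name treated as ordinary data.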

8. We have just received source files from UNIX. We want to stage that data to ETL process.
What are the points we need to look for?

Answer:

When a source flat file is loaded to a staging database table, we generally focus on the items below:

• Define the proper file format for the input file (delimited/fixed-width), code page etc.
• Check any processing date in the header information against sysdate or some other business logic.
• Check the count of detail records in the file against the trailer information, if any.
• Check that the sum of any measure fields of the detail records matches the header/trailer information, if any.
• In case of indirect loading, we can add the file name and the record number in the file as columns in the staging table.

Basically everything depends on your business requirement.

9. What is the difference between Joiner and Lookup? Performance-wise, which one is better to use?

Answer:

Joiner:

• Only the “=” operator can be used in the join condition
• Supports normal, full outer, left/right outer join
• Active transformation
• No SQL override
• Connected transformation
• No dynamic cache
• Heterogeneous source

Lookup:

• =, <, <=, >, >=, != operators can be used in the join condition
• Supports left outer join
• Earlier a passive transformation; from version 9 onwards it can be an active transformation (it can return more than one record in case of multiple matches)
• Supports SQL override
• Connected/Unconnected
• Supports dynamic cache update
• Relational/flat file source/target
• Pipeline Lookup

Selection between these two transformations is completely dependent on the project requirement. Which of the two performs better is a debatable topic.

10. What is the B2B in Informatica? How can we use it in Informatica?

Answer:

B2B allows us to parse and read unstructured data such as PDF, Excel, HTML etc. It can also read binary data such as messages, EBCDIC files etc., and has a very large list of supported formats.

B2B Data Transformation Studio is the developer tool with which the parsing (reading) of the unstructured data is done. B2B mostly gives its output as an XML file.

B2B Data Transformation is integrated with Informatica PowerCenter through the "Unstructured Data Transformation". This transformation can receive the output of B2B Data Transformation Studio and load it into any target supported by PowerCenter.

11. What is CDC, SCD and MD5 in Informatica?

Answer:

• CDC - Changed Data Capture: how only the changed data is captured from the source system.
• SCD - Slowly Changing Dimension: how history data is maintained in the dimension tables.
• MD5 - MD5 checksum encoding: it generates a 32-character hexadecimal code that can be used to decide the insert/update strategy for target records.
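The MD5-based insert/update decision can be sketched in Python with the standard hashlib module. The helper names `row_md5` and `decide`, the "|" column separator and the sample values are hypothetical; the sketch only shows the comparison idea, not Informatica's MD5 function.

```python
import hashlib

def row_md5(*columns):
    """Return the 32-character hex MD5 of the concatenated column values."""
    joined = "|".join("" if c is None else str(c) for c in columns)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

def decide(incoming_hash, stored_hash):
    """No stored hash means a new row; a different hash means a changed row."""
    if stored_hash is None:
        return "INSERT"
    return "UPDATE" if incoming_hash != stored_hash else "NO CHANGE"

stored = row_md5("Ann", "Pune")
print(len(stored))
print(decide(row_md5("Ann", "Oslo"), stored))
```

Comparing one checksum per row avoids comparing every column individually when deciding whether a target record has changed.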


12. How can we implement an SCD Type2 mapping without using a lookup transformation?

Answer:

The entire implementation will be the same as that using a lookup. The only change is to replace the Lookup transformation with a Joiner transformation. In the Joiner transformation, the Source table will be used as Master and the Target table as Detail. The join condition will be the same as the lookup condition, and the join type will be Detail Outer Join.
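The effect of a Detail Outer Join with the source as master can be sketched in Python as follows. All names and sample rows are hypothetical, and the sketch assumes the target holds only one current row per key.

```python
def detail_outer_join(master_rows, detail_rows, key):
    """Keep every master (source) row; attach the matching detail (target) row or None."""
    detail_by_key = {row[key]: row for row in detail_rows}
    return [(m, detail_by_key.get(m[key])) for m in master_rows]

def classify(source_row, target_row, tracked_columns):
    """Decide what an SCD Type 2 load would do with the joined row."""
    if target_row is None:
        return "NEW"
    changed = any(source_row[c] != target_row[c] for c in tracked_columns)
    return "CHANGED" if changed else "UNCHANGED"

source = [{"cust_id": 1, "city": "Pune"}, {"cust_id": 2, "city": "Oslo"}, {"cust_id": 3, "city": "Lima"}]
target = [{"cust_id": 1, "city": "Pune"}, {"cust_id": 2, "city": "Rome"}]

result = {s["cust_id"]: classify(s, t, ["city"])
          for s, t in detail_outer_join(source, target, "cust_id")}
print(result)
```

Source rows without a target match come through with an empty detail side, which plays the same role as a lookup returning NULL.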

13. How do the Joiner and Lookup transformations treat NULL value matching?

Answer:

A NULL value is not equal to another NULL value in a Joiner, whereas the Lookup transformation matches NULL values.
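The difference can be expressed as two comparison rules in Python, with None standing in for NULL. The function names are hypothetical and the sketch covers only the equality condition.

```python
def joiner_match(a, b):
    """Joiner rule: a NULL never matches anything, not even another NULL."""
    return a is not None and b is not None and a == b

def lookup_match(a, b):
    """Lookup rule: two NULLs are treated as equal."""
    return a == b

print(joiner_match(None, None))
print(lookup_match(None, None))
```

Rows with a NULL key are therefore dropped by a normal join in the Joiner but can still find a match in a Lookup.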

14. Does Microsoft SQL Server support bulk loading? If yes, what happens when you specify bulk mode and Data Driven for a SQL Server target?

Answer:

Yes MS SQL Server supports Bulk Loading. But if we select Treat Source Rows as Data Driven with the Target
Load Type as Bulk then the session will fail. We have to select Normal Load with Data Driven source records.

15. How can you utilize COM components in Informatica?

Answer:

By writing C++, VB or VC++ code in an External Stored Procedure transformation.

16. What is SQL transformation in Informatica?

Answer:

A SQL transformation can process SQL queries midstream in an Informatica pipeline. It supports almost all DDL, DML, DCL and TCL statements.

For quick reference, following are some important notes:

• We can configure the SQL transformation in two modes, which make it active or passive.
• Query mode, which is active, fires the SQL query defined in the transformation against the database.
• Script mode, which is passive, calls external SQL scripts to be executed.
• Query mode can be configured to handle a Static SQL Query (i.e. the SQL query stays the same, with bind variables) or a Dynamic SQL Query (i.e. different query statements for each input row).
• In a Dynamic Query, substituting the entire SQL query through the Query_Port is called a Full Query, and substituting a portion of the query statement is called a Partial Query.
• We can configure the SQL transformation to connect to a database with a Static Connection (i.e. selecting a particular connection object) or a Dynamic Connection (i.e. based on the logic it will dynamically select the connection object to connect to a database).
• We can also pass the entire database connection information (i.e. username, password, connect string, code page), which is called a Full Database Connection.

17. What is an XML source qualifier?

Answer:

The XML source qualifier represents the data elements that the Informatica server reads when it runs a session with XML sources.

18. What is the “metadata extensions” tab in Informatica?

Answer:

PowerCenter allows end users and partners to extend the metadata stored in the repository by associating information with individual objects in the repository. That is why it is called a Metadata Extension.

For example, when we create a mapping, we can store information such as the mapping functionality, business user information and CR information. Similarly, for a session we can store schedule information and the contact person for failed sessions. We basically associate the information with repository metadata using metadata extensions.

When we create reusable metadata extensions for a repository object using the Repository Manager, the metadata extension becomes part of the properties of that type of object. For example, we can create a reusable metadata extension for source definitions called SourceCreator. When we create or edit any source definition in the Designer, the SourceCreator extension appears on the Metadata Extensions tab. Anyone who creates or edits a source can enter the name of the person that created the source into this field.

PowerCenter Client applications can contain the following types of metadata extensions:

• Vendor-defined. Third-party application vendors create vendor-defined metadata extensions. We can view and change the values of vendor-defined metadata extensions, but we cannot create, delete, or redefine them.
• User-defined. We create user-defined metadata extensions using PowerCenter. We can create, edit, delete, and view user-defined metadata extensions. We can also change the values of user-defined extensions.

All metadata extensions exist within a domain. We see the domains when we create, edit, or view metadata extensions. Vendor-defined metadata extensions exist within a particular vendor domain. If we use third-party applications or other Informatica products, we may see domains such as Ariba or PowerExchange for Siebel. We cannot edit vendor-defined domains or change the metadata extensions in them.

User-defined metadata extensions exist within the User Defined Metadata Domain. When we create
metadata extensions for repository objects, we add them to this domain.

Both vendor and user-defined metadata extensions can exist for the repository objects- Source definitions,
Target definitions, Transformations, Mappings, Mapplets, Sessions, Tasks, Workflows, Worklets.

19. Describe some of the ETL Best Practices

Answer:

A lot of best practices may be applicable to a certain tool and pointless for another. At a very high level and in a tool-independent way:

• Naming conventions for ETL objects
• Naming conventions for database objects
• Parameterization of connections (so that moving from one environment to another is easy)
• Maintaining an ETL job log, ideally automated through logging of each job run
• Handling of rejected records (and logging)
• Data reconciliation
• Metadata management, e.g. maintaining metadata columns in tables (use of audit columns such as load date, load user, batch id etc.)
• Error reporting
• ETL job performance evaluation
• Following generic coding standards
• Documentation
• Decomposing complex logic into multiple ETL stages, load balancing (pushdown optimization wherever applicable) etc.
• Removal of unwanted ports from the different transformations used in a mapping
• Using shortcuts for sources, targets and lookups
• Using mapplets and worklets as and when required
• Writing some comments for every transformation
• Using the Decode function rather than “if then else”
• Making sure that sorted data is passed into the Aggregator transformation
• If the target table has indexes, loading data into it will decrease performance; in such situations, use pre-SQL to drop the index before loading the data into the target table and re-create the index using post-SQL once the data is loaded.

20. Is there a scope of cloud computing in Data warehousing technology?

Answer:

This is not only possible; in fact, this is the way to go for many of the providers of modern-day BI tools. There are certain advantages and benefits of using cloud computing for Business Intelligence applications, and this is a big topic of discussion today. I will quickly touch upon a few points that substantiate the need for Cloud BI, and in the future I will try to post a comprehensive article on this website with more details. First, the current state of BI has these typical characteristics:

• High infrastructure requirement, leading to high upfront investment
• High development cost (needs special talent) as well as high maintenance cost
• Unpredictable workload (data volume) and skewed business growth pattern

All these lead to longer cycle times and limited adoption of BI solutions. A cloud platform, as opposed to a typical in-house software platform, is basically an alternative delivery method for the software service. When you deliver the software, platform or infrastructure (as a service) through the cloud, you can instantly start to get the following benefits:

• Lower entry cost
• Lower maintenance cost (pay as you use)
• Faster deployment
• Reduced risk
• Lower TCO (total cost of ownership)
• Multiple deployment models, etc.

Moreover, small and medium enterprises (SMEs) can easily adapt to this model given the typical constraints of a small business. Companies like Pentaho are already “in” with their products in the SaaS (software as a service) model of cloud computing. But cloud models like SaaS have some typical problems (e.g. no flexibility of design, security concerns, etc.).

As opposed to the SaaS model, there is another cloud model called PaaS (platform as a service), which has the benefit of design flexibility. PaaS is very suitable for custom applications and even enterprise-level BI applications. This kind of cloud service is offered by almost everyone in the BI market, e.g. BusinessObjects, SAS, Microsoft Azure (see http://en.wikipedia.org/wiki/SQL_Azure), Vertica, Greenplum, etc.


17. Mapping

1. Scenario Implementation 1
Suppose we have a source port called ename with data type varchar(20) and the corresponding target port
as ename with varchar(20). The data type is now altered to varchar(50) in both source and target database.
Describe the changes required to modify the mapping.

Answer:

Reimport the source and target definitions. Next, open the mapping, right-click the source port ename and use the "Propagate Attributes" option. This option lets us change the properties of one port across multiple transformations without manually modifying the port in each and every transformation. We can choose the direction of propagation (forward, backward or both) and select the attributes to propagate, e.g. data type, scale, precision, etc.

2. What are mapping parameters and variables?

Answer:

A mapping parameter is a user-definable constant that takes its value before the session runs and keeps it for the whole run. It can be used in Source Qualifier (SQ) expressions, Expression transformations, etc.

A mapping variable is defined in a similar way, except that its value can change during the session. At the start of a session it picks up its value from the first of the following that is available:

• The session parameter file
• The value stored in the repository from the previous run
• The initial value defined in the Designer
• The default value of the data type
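The lookup order above can be sketched in a few lines of Python. This is only an illustrative model, not Informatica code: the function name `resolve_start_value` and the `_UNSET` sentinel are hypothetical, and the datatype default is simplified to a single argument.

```python
_UNSET = object()  # sentinel meaning "this source supplies no value"


def resolve_start_value(param_file=_UNSET, repository=_UNSET,
                        initial=_UNSET, datatype_default=0):
    """Return the first available value, following the lookup order above."""
    for candidate in (param_file, repository, initial):
        if candidate is not _UNSET:
            return candidate
    return datatype_default


# Hypothetical variable: 10 in the parameter file, 15 saved in the
# repository, 5 as the initial value in the Designer.
print(resolve_start_value(param_file=10, repository=15, initial=5))  # prints 10
print(resolve_start_value(repository=15, initial=5))                 # prints 15
```

A sentinel is used instead of None so that a value that is legitimately empty can still win over a lower-priority source.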

3. Which types of variables or parameters can be declared in a parameter file? Can $, $$ and $$$ all be declared?

Answer:

There is a difference between a variable and a parameter.

• A variable, as the name suggests, holds a value that can change within a session run.
• A parameter is fixed; its value does not change during the session run.
• $ - session-level parameters, which can be declared in parameter files.
• $$ - mapping-level parameters, which can be declared in parameter files.
• $$$ - built-in Informatica system variables, which cannot be declared in parameter files. E.g. $$$SessStartTime: these are constant throughout the mapping and cannot be changed.
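As a rough rule of thumb, the prefix convention described above can be expressed as a small Python helper. This is a sketch only: the function name `classify` is hypothetical, `$$LoadDate` is a made-up mapping parameter, and the rule looks at the prefix alone, so it does not account for any individual names that behave differently.

```python
def classify(name):
    """Return (kind, declarable_in_parameter_file) based only on the prefix."""
    if name.startswith("$$$"):
        return ("built-in system variable", False)
    if name.startswith("$$"):
        return ("mapping-level parameter or variable", True)
    if name.startswith("$"):
        return ("session-level parameter", True)
    raise ValueError("not a $-prefixed name: %r" % name)


for name in ("$DBConnection_TGT", "$$LoadDate", "$$$SessStartTime"):
    print(name, classify(name))
```

The longest prefix is tested first, because every `$$$` name also starts with `$$` and `$`.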


Read this article for a detailed understanding: http://www.dwbiconcepts.com/etl/14-etl-informatica/74-stop-hardcoding-follow-parameterization-technique.html

4. What are the default values for variables?

Answer:

• String = empty string
• Number = 0
• Date = 1/1/1753

5. What does the first column of a bad file (rejected rows) indicate?

Answer:

• First column - row indicator (0, 1, 2, 3)
• Second column - column indicator (D, O, N, T)

6. Out of 100,000 source rows, some rows get discarded at the target. How will you trace them, and where do they get loaded?

Answer:

• Rejected records are written to bad files. Each rejected row carries a row indicator and column indicators.
• The row indicator is one of 0 (insert), 1 (update), 2 (delete) or 3 (reject).
• Each column indicator is one of D (valid), O (overflow), N (null) or T (truncated).
• Data may be rejected for different reasons, typically because of the transformation logic.

7. What is Reject loading?

Answer:

During a session, the Informatica server creates a reject file for each target instance in the mapping. If the writer or the target rejects data, the Informatica server writes the rejected row into the reject file. The reject file and session log contain information that helps you determine the cause of the reject. You can correct reject files and load them to relational targets using the Informatica reject load utility. The reject loader also creates another reject file for the data that the writer or target rejects during reject loading.

Reject Loading

During a session, the server creates a reject file for each target instance in the mapping. If the writer or the target rejects data, the server writes the rejected rows into the reject file. You can correct the rejected data and reload it to relational targets using the reject loading utility. (You cannot load rejected data into a flat file target.) Each time you run a session, the server appends the rejected data to the reject file.

Locating the bad files

• $PMBadFileDir/Filename.bad

When you run a partitioned session, the server creates a separate reject file for each partition.

Reading rejected data

Ex: 3,D,1,D,D,0,D,1094345609,D,0,0.00

Two things help us find the reason for a rejection:

• Row indicator - tells the writer what to do with the row of wrong data.

Row indicator   Meaning   Rejected by
0               Insert    Writer or target
1               Update    Writer or target
2               Delete    Writer or target
3               Reject    Writer

If the row indicator is 3, the writer rejected the row because an update strategy expression marked it for reject.

• Column indicator - a column indicator follows the first column of data, and another follows each subsequent column. Column indicators appear after every column of data and define the type of the data preceding them.

Column indicator   Meaning      Writer treats as
D                  Valid data   Good data. The target accepts it unless a database error occurs, such as finding a duplicate key.
O                  Overflow     Bad data.
N                  Null         Bad data.
T                  Truncated    Bad data.

NOTE: NULL columns appear in the reject file with commas marking their column.
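To make the layout concrete, here is a minimal Python sketch that decodes one reject-file row using the two indicator tables above. It rests on stated assumptions: every value, including the leading row indicator, is followed by its column indicator, and values contain no embedded commas. The function name `parse_reject_row` and the sample row are made up for illustration; real reject files may differ between versions.

```python
ROW_INDICATORS = {"0": "insert", "1": "update", "2": "delete", "3": "reject"}
COLUMN_INDICATORS = {"D": "valid", "O": "overflow", "N": "null", "T": "truncated"}


def parse_reject_row(line):
    """Split one reject-file row into its row action and (value, status) columns."""
    fields = line.rstrip("\r\n").split(",")
    if len(fields) < 2 or len(fields) % 2 != 0:
        raise ValueError("expected value,indicator pairs: %r" % line)
    values, indicators = fields[0::2], fields[1::2]
    row_action = ROW_INDICATORS[values[0]]
    # Skip the first pair: it is the row indicator and its own column indicator.
    columns = [(value, COLUMN_INDICATORS[ind])
               for value, ind in zip(values[1:], indicators[1:])]
    return row_action, columns


# Made-up sample row: an insert whose third data column is NULL.
action, columns = parse_reject_row("0,D,1921,D,Nelson,D,,N")
print(action)   # insert
print(columns)  # [('1921', 'valid'), ('Nelson', 'valid'), ('', 'null')]
```

Checking the row action before looking at the column statuses mirrors the advice in this section: a row marked 3 was rejected by the writer, whatever its column indicators say.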

Correcting the reject file

Use the reject file and the session log to determine the cause of the rejected data. Keep in mind that correcting the reject file does not necessarily correct the source of the reject. Correct the mapping and the target database to eliminate some of the rejected data when you run the session again. Trying to correct target-rejected rows before correcting writer-rejected rows is not recommended, since they may contain misleading column indicators. For example, a series of “N” indicators might lead you to believe that the target database does not accept NULL values, so you decide to change those NULL values to zero. However, if those rows also had a 3 in the row indicator column, the row was rejected by the writer because of an update strategy expression, not because of a target database restriction. If you try to load the corrected file to the target, the writer will again reject those rows, and they will now contain inaccurate 0 values in place of NULL values.

8. Why might the Informatica writer thread reject a record?

Answer:

• The data overflowed the column constraints
• An update strategy expression marked the row for reject

9. Why might the target database reject a record?

Answer:

• The data contains a NULL value for a column that does not accept NULLs
• Database errors, such as key violations

10. Describe the steps for loading a reject file.

Answer:

• After correcting the rejected data, rename the reject file to reject_file.in
• The reject loader uses the data movement mode configured for the server, and also the code page of the server/OS. Hence do not change these in the middle of reject loading
• Run the reject loader utility: pmrejldr pmserver.cfg [folder name] [session name]

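11. Variable v1 has its value set to 5 in the Designer (default), 10 in the parameter file, and 15 in the repository. Which value will Informatica read while running the session?

Answer:

Informatica reads the value 10 from the parameter file, because the parameter file takes precedence over the value saved in the repository and over the initial value defined in the Designer.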

12. What are shortcuts? Where can they be used? What are the advantages?

Answer:

There are two types of shortcuts: local and global. A local shortcut is used within the local repository, and a global shortcut refers to an object in the global repository. The advantage is that an object can be reused without creating multiple copies of it. For example, if we want to use a source definition in 10 mappings in 10 different folders, we create 10 shortcuts to it instead of creating 10 copies of the source.


13. Can we have an Informatica mapping with two pipelines, where one flow is having a
Transaction Control transformation and another not. Explain why?

Answer:

No, it is not possible. Whenever we have a Transaction Control transformation in a mapping, the session commit type is ‘User Defined’, whereas for a pipeline without a Transaction Control transformation the session expects the commit type to be either source-based or target-based.

Hence we cannot have both pipelines in a single mapping; we have to develop a separate mapping for each pipeline.

14. How can we implement Reverse Pivoting using Informatica transformations?

Answer:

Pivoting (turning columns into rows) can be done using the Normalizer transformation. For reverse pivoting (turning rows into columns) we need an Aggregator transformation, as below.

From,

Col1 Col2
A 10
B 20

To,

Col1 Col2
A B
10 20

This can be done using one Expression transformation and one Aggregator transformation.

In the Expression transformation, create two output ports, o_col_a and o_col_b:

o_col_a = IIF (Col1 = 'A', Col2, 0)
o_col_b = IIF (Col1 = 'B', Col2, 0)

Next, in the Aggregator transformation, take the MAX () of o_col_a and o_col_b and map them to the target columns A and B. (We may need to take SUM () instead of MAX () if we have multiple A or B rows.)
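The two-step logic above can be simulated outside Informatica to check the idea. The Python sketch below is only an illustration: the function name `reverse_pivot` is hypothetical, and it models an Aggregator with no group-by port, so all input rows collapse into one output row.

```python
def reverse_pivot(rows):
    """Turn (key, value) rows for keys 'A' and 'B' into one row with columns A and B."""
    # Expression step: one output port per key, 0 when the row belongs to the other key.
    expanded = [(value if key == "A" else 0, value if key == "B" else 0)
                for key, value in rows]
    # Aggregator step: MAX of each port across all rows.
    return {
        "A": max((a for a, _ in expanded), default=0),
        "B": max((b for _, b in expanded), default=0),
    }


print(reverse_pivot([("A", 10), ("B", 20)]))  # {'A': 10, 'B': 20}
```

Using 0 as the filler only works with MAX when the real values are not negative; that is a property of this sketch's input, not something stated in the text.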

15. Is it possible to update a Target table without any key column in target?

Answer:

Yes, it is possible to update the target table, either by defining keys at the Informatica level in the Warehouse Designer or by using an update override.


18. Mapplet

1. What is a Mapplet?

Answer:

Mapplets are reusable objects that represent a collection of transformations.

2. What is the difference between Reusable transformation and Mapplet?

Answer:

Any Informatica transformation that is created in the Transformation Developer, or a non-reusable transformation that is promoted to reusable in the Mapping Designer, and that can be used in multiple mappings is known as a reusable transformation. When we add a reusable transformation to a mapping, we actually add an instance of the transformation. Since the instance of a reusable transformation is a pointer to that transformation, when we change the transformation in the Transformation Developer, its instances reflect these changes.

A mapplet is a reusable object created in the Mapplet Designer which contains a set of transformations and lets us reuse the transformation logic in multiple mappings. A mapplet can contain as many transformations as we need. As with a reusable transformation, when we use a mapplet in a mapping we use an instance of the mapplet, and any change made to the mapplet in the Mapplet Designer is inherited by all instances of the mapplet.

3. Which transformations and objects are not supported in a Mapplet?

Answer:

• Normalizer
• COBOL sources
• XML sources
• XML Source Qualifier
• Target definitions
• Pre- and post-session stored procedures
• Other mapplets

4. Is it possible to convert a reusable transformation to a non-reusable one?

Answer:

Reusable transformations are created in the Transformation Developer. Another way is to promote a non-reusable transformation in a mapping or mapplet to a reusable one.

Converting a non-reusable transformation into a reusable transformation is not reversible.

However, we can use a reusable transformation as a non-reusable one in any mapping or mapplet by dragging it from the Repository Navigator and pressing the Ctrl key just before dropping it in the Mapping/Mapplet Designer.

The same applies to creating a non-reusable session from a reusable one in the Worklet/Workflow Designer.

5. What is the use of Mapplets and Worklets in a project?

Answer:

Mapplets and worklets allow us to create reusable objects and thus make our Informatica code reusable.

Just like a procedure or function in a procedural language, we can build a mapplet or worklet to encapsulate a piece of business logic that can be used again and again in different mappings and workflows.

A mapplet is created in the PowerCenter Designer and reused in mappings. A worklet is created in the Workflow Manager and reused in workflows.

6. Is it possible to have a mapplet within a mapplet and worklet within a worklet?

Answer:

Informatica does not support a mapplet within a mapplet, but it does support a worklet within a worklet.


19. Session

1. What are Sessions and Batches?

Answer:

SESSION - A session is a set of instructions that tells the Informatica Server / Integration Service how and when to move data from sources to targets. After creating the session, we can use either the Server Manager or the command line program pmcmd to start or stop the session.

BATCHES - A batch provides a way to group sessions for either serial or parallel execution by the Informatica Server. There are two types of batches:

• SEQUENTIAL - runs sessions one after the other.
• CONCURRENT - runs sessions at the same time.

2. What are the various session tracing levels?

Answer:

Normal (default) - Logs initialization and status information, errors encountered and rows skipped due to transformation errors. Summarizes session results, but not at the row level.

Terse - Logs initialization information, error messages and notification of rejected data.

Verbose Initialization - In addition to Normal tracing, logs additional initialization details, the names of the index and data files used, and detailed transformation statistics.

Verbose Data - In addition to Verbose Initialization tracing, records row-level logs.

3. Can we copy a session to a new folder or a new repository?

Answer:

Yes, we can copy a session to a new folder or repository, provided the corresponding mapping already exists in that folder or repository.

4. Is it possible to store all the Informatica session log information in a database table? Normally the session log is stored as a compressed binary .bin file in the SessLogs directory. Can we store the same information in database tables for future analysis?

Answer:

It is not possible to store all the session log information in a table. Apart from error-related information, we can get some other session-related information from metadata repository tables such as REP_SESS_LOG.

To capture error data, we can configure the session as below:

Go to Session -> Config Object -> Error Handling section and set the following:

Error Log Type: Relational Database.
Error Log DB Connection: The database connection where we want to store the error tables.
Error Log Table Name Prefix: Prefix for the error tables. By default, Informatica creates 4 different error tables. If we provide a prefix here, the error tables will be created with that prefix in the database.
Log Row Data: Logs the row data at the point where the error happened.
Log Source Row Data: Captures the source data for the error record.
Data Column Delimiter: Error data is stored in a single column of the database table. We can specify the delimiter for the source data here.

List of Error tables created by Informatica:

PMERR_DATA. Stores data and metadata about a transformation row error and its corresponding source
row.
PMERR_MSG. Stores metadata about an error and the error message.
PMERR_SESS. Stores metadata about the session.
PMERR_TRANS. Stores metadata about the source and transformation ports, such as name and data type,
when a transformation error occurs.

The above tables are specifically used to store information about exception (error) records, e.g. records in the reject file. We can use them as a base for an error handling strategy. However, they do not contain all the information present in the session log, such as performance details (thread busy percentage) or details of the transformations invoked in the session. We can also check the contents of the REP_SESS_LOG view in the Informatica repository schema; however, that too does not contain all the information.

5. Can we call a shell script from session properties?


Answer:

The Integration Service can execute shell commands at the beginning or at the end of the session. The Workflow Manager provides the following types of shell commands for each Session task:

• Pre-session command
• Post-session success command
• Post-session failure command

Use any valid UNIX command or shell script for UNIX nodes, or any valid DOS command or batch file for Windows nodes. Configure the session to run the pre- or post-session shell commands.


6. Can we change the Source and Target table names in Session level?

Answer:

Yes, we can change the source and target table names at the session level. Go to the session and navigate to the Mapping tab. Select the source or target to be changed: for a target, enter the new table name in “Target Table Name”; for a source, use “Source Table Name”.

A more suitable method is to parameterize the source and target table names. We can then run the same mapping concurrently using different parameter files; to do so, we have to enable concurrent run mode at the workflow level.

7. How to write flat file column names in target?

Answer:

There are two options available in the session properties to take care of this requirement. Go to the Mapping tab, open the target properties, and set the header option to either "Output Field Names" or "Use header command output".

Option 1 creates the output file with a header record whose column headings are the same as the target transformation port names.

Option 2 lets us supply our own command to generate the header record text. We can use an 'echo' command for this, for example:
echo '"Employee ID"|"Department ID"'

The second option is recommended, as it gives more flexibility in writing the column names.

8. What are the ERROR tables present in Informatica?

Answer:

• PMERR_DATA - Stores data and metadata about a transformation row error and its corresponding source row.
• PMERR_MSG - Stores metadata about an error and the error message.
• PMERR_SESS - Stores metadata about the session.
• PMERR_TRANS - Stores metadata about the source and transformation ports, such as name and data type, when a transformation error occurs.

9. What are the alternative ways to stop a session without setting the “Stop on errors” option to 1 in the session properties?

Answer:

We can use the ABORT () function in an Expression transformation to stop the execution of a session based on a user-defined condition. The ERROR () function can also be used: it skips the row and logs an error, and these errors count towards the session's error threshold.


10. Suppose a session fails after loading of 10,000 records in the target. How can we load the
records from 10,001 when we run the session next time?

Answer:

If we configure the session for Normal load rather than Bulk load, and select the recovery strategy “Resume from last checkpoint” in the session properties, then we can resume the session from the last committed row.

In this case, if the commit interval is 10,000 and the Integration Service issued a commit after loading 10,000 records, then the next run loads the records from 10,001 onward.

If only 9,999 rows were loaded when the session failed, the Integration Service had not issued any commit (the commit interval being 10,000), so we cannot perform recovery. In this case, truncate the target table and restart the session.
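The arithmetic behind the two cases can be sketched as follows. This is a simplified model, assuming commits land exactly on multiples of the commit interval; the function names are hypothetical and nothing here calls Informatica.

```python
def rows_committed(rows_loaded, commit_interval):
    """Rows covered by the last commit, assuming commits at exact multiples of the interval."""
    if commit_interval <= 0:
        raise ValueError("commit_interval must be positive")
    return (rows_loaded // commit_interval) * commit_interval


def next_row_to_load(rows_loaded, commit_interval):
    """First row the next run has to load."""
    return rows_committed(rows_loaded, commit_interval) + 1


print(next_row_to_load(10000, 10000))  # 10001: resume after the committed rows
print(next_row_to_load(9999, 10000))   # 1: nothing was committed, start over
```

In the second case the result of 1 corresponds to truncating the target and restarting, since no checkpoint exists to resume from.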

11. Define the types of commit intervals apart from user-defined.

Answer:

The different commit types are:

Target-based commit. The Informatica Server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.

Source-based commit. The Informatica Server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.

12. Suppose a session is configured with a commit interval of 10,000 rows and the source has 50,000 rows. Explain the commit points for source-based commit and target-based commit. Assume appropriate values wherever required.

Answer:

• Target-based commit (assuming the writer buffer is first full at 7,500 rows and next at 15,000): commits at 15,000, 22,500, 30,000, 40,000 and 50,000 rows.

• Source-based commit (does not depend on the rows held in the buffer): commits at 10,000, 20,000, 30,000, 40,000 and 50,000 rows.
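The source-based case is simple enough to compute directly. The Python sketch below covers only that case, under the assumption that a final commit happens at the end of the data; the function name is hypothetical. Target-based commit points depend on when the writer buffer fills, which this sketch does not try to model.

```python
def source_based_commit_points(total_rows, commit_interval):
    """Row counts at which a source-based commit would be issued."""
    if total_rows <= 0:
        return []
    if commit_interval <= 0:
        raise ValueError("commit_interval must be positive")
    points = list(range(commit_interval, total_rows + 1, commit_interval))
    if not points or points[-1] != total_rows:
        points.append(total_rows)  # assumed final commit at end of data
    return points


print(source_based_commit_points(50000, 10000))
# [10000, 20000, 30000, 40000, 50000]
```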


13. How to capture performance statistics of individual transformation in the mapping and
explain some important statistics that can be captured?

Answer:

Use tracing level Verbose data.

14. How can we parameterize success or failure email list?

Answer:

We can parameterize the email user list and modify the values in parameter file.
Use $PMSuccessEmailUser, $PMFailureEmailUser.

Also we can use pmrep command to update the email task:

updateemailaddr
-d <folder_name>
-s <session_name>
-u <success_email_address>
-f <failure_email_address>

15. Is it possible that a session failed but the workflow status still shows success?

Answer:

Yes. If the workflow itself completes, it shows an execution status of Succeeded irrespective of whether any session within it failed. By default, the workflow status does not depend on session failures. Only if we set the general option “Fail parent if this task fails” on the session in the Workflow Designer will the workflow status show as Failed when that session fails.
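The rule above can be written as a tiny Python model. This is a simplification for illustration only: the class and function names are hypothetical, and it ignores every other way a workflow can fail.

```python
from dataclasses import dataclass


@dataclass
class SessionResult:
    name: str
    failed: bool
    fail_parent_if_task_fails: bool = False  # models the session's general option


def workflow_status(results):
    """Workflow fails only if a failed session has 'Fail parent if this task fails' set."""
    if any(r.failed and r.fail_parent_if_task_fails for r in results):
        return "Failed"
    return "Succeeded"


# Hypothetical session names, one failed session in each run.
print(workflow_status([SessionResult("s_load", failed=True)]))  # Succeeded
print(workflow_status([SessionResult("s_load", failed=True,
                                     fail_parent_if_task_fails=True)]))  # Failed
```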

16. What is Busy Percentage?

Answer:

Busy percentage is the share of the total run time of the mapping during which a thread was occupied.

Say we have one writer thread, which is internally responsible for writing data to the target table or file. If the mapping runs for 100 seconds but the writer spends only 20 seconds writing data to the target (the rest of the time the session was busy reading and transforming the data), then the busy percentage of the writer thread is 20%.
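The calculation in the writer-thread example is a single ratio. A minimal Python sketch, with a hypothetical function name:

```python
def busy_percentage(busy_seconds, total_run_seconds):
    """Share of the total run time during which the thread was occupied, in percent."""
    if total_run_seconds <= 0:
        raise ValueError("total_run_seconds must be positive")
    return 100.0 * busy_seconds / total_run_seconds


print(busy_percentage(20, 100))  # 20.0
```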


17. Can we write a PL/SQL block in pre and post session or in target query override?

Answer:

Yes, we can. Always remember to put a backslash (\) before any semicolon (;) used in the PL/SQL block.

18. Whenever a session runs, does the data in a flat file target get overwritten? Is it possible to keep the existing data and add the new data to the target file?

Answer:

Normally the target file data is overwritten with every session run, unless we select the “Append if Exists” option (8.x onwards) in the target session properties, which appends the new data to the existing data in the flat file target.

19. Can we use the same session to load a target table in different databases having the same target definition?

Answer:

Yes, we can use the same session to load the same target definition in different databases with the help of parameterization, i.e. by using different parameter files with different values for the parameterized target connection object ($DBConnection_TGT) and for the owner/schema name in Table Name Prefix ($Param_Tgt_Tablename). To run a single workflow with this session and load target tables in two different databases, we can consider using concurrent workflow instances with different parameter files.

We can even load two instances of the same target connected in the same pipeline; at the session level, assign each instance a different relational connection object created for the respective database.

20. How do you remove the cache files after the transformation?

Answer:

After the session completes, the DTM releases the cache memory and deletes the cache files. If a persistent cache or incremental aggregation is used, the cache files are kept.

21. Why doesn't a running session QUIT when Oracle or Sybase return fatal errors?

Answer:

The session will quit only when its "Stop on errors" threshold is set to 1. Otherwise the session continues to run.


22. If we have written a source override query in the Source Qualifier at the mapping level but have modified the query in the session-level SQL override, how does the Integration Service behave?

Answer:

The Informatica Integration Service treats the session-level query as final during the session run. If the two queries are different, the Integration Service executes the session-level query and ignores the mapping-level query.


20. Workflow

1. What is the difference between STOP and ABORT options in Workflow?

Answer:

When we issue the STOP command on an executing Session task, the Integration Service stops reading data from the source. It continues processing, writing and committing the data to the targets. If the Integration Service cannot finish processing and committing data, we can issue the ABORT command.

ABORT, in contrast, has a timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.

We can stop or abort tasks and worklets within a workflow from the Workflow Monitor, from a Control task in the workflow, or from a Command task by using the pmcmd stop or abort commands. We can also call the ABORT function at the mapping level.

When we stop or abort a task, the Integration Service stops processing that task and any other tasks in its path. It does, however, continue processing concurrent tasks in the workflow. If the Integration Service cannot stop the task, we can abort it.

The Integration Service aborts any workflow if the Repository Service process shuts down.

2. Running an Informatica workflow continuously: how do we run a workflow continuously until a certain condition is met?

Answer:

We can schedule a workflow to run continuously. A continuous workflow starts as soon as the Integration Service initializes. If we schedule a real-time session to run as a continuous workflow, the Integration Service starts the next run of the workflow as soon as it finishes the first. When the workflow stops, it restarts immediately.

Alternatively, for a normal batch scenario, we can create a conditional continuous workflow as below.

Suppose wf_Bus contains the business session that we want to run continuously until a certain condition is met, e.g. the presence of a file or a particular value of a workflow variable.

Modify the workflow so that the Start task is followed by a Decision task, which evaluates a condition to TRUE or FALSE. Based on this condition the workflow will run or stop.

Next, link the Decision task to the business session with the link condition $Decision.Condition=TRUE.

For the other branch, link a Command task with the condition $Decision.Condition=FALSE.

In the command task create a command to call a dummy workflow using pmcmd functionality. e.g.
"C:\Informatica\PowerCenter8.6.0\server\bin\pmcmd.exe" startworkflow -sv
IS_info_repo8x -d Domain_hp -u info_repo8x -p info_repo8x -f WorkFolder
wf_dummy

Next, create the dummy workflow and name it wf_dummy. Place a Command task after the Start task.


Within the command task put the pmcmd command as


"C:\Informatica\PowerCenter8.6.0\server\bin\pmcmd.exe" startworkflow -sv
IS_info_repo8x -d Domain_sauravhp -u info_repo8x -p info_repo8x -f
WorkFolder wf_Bus

In this way we can manage to run a workflow continuously. So the basic concept is to use two workflows and
make them call each other.
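If the pmcmd call is issued from a script rather than typed into the Command task, it helps to build the argument list in one place. The Python sketch below only assembles the command shown above; the helper names are hypothetical, the flags (-sv, -d, -u, -p, -f) and sample values are copied from the example, and nothing is executed unless `run_workflow` is called on a machine where pmcmd is installed.

```python
import subprocess


def build_startworkflow_command(pmcmd_path, service, domain, user,
                                password, folder, workflow):
    """Assemble the pmcmd startworkflow call as an argument list."""
    return [pmcmd_path, "startworkflow",
            "-sv", service, "-d", domain,
            "-u", user, "-p", password,
            "-f", folder, workflow]


def run_workflow(command):
    """Launch pmcmd; raises CalledProcessError if it exits with a non-zero code."""
    subprocess.run(command, check=True)


cmd = build_startworkflow_command(
    r"C:\Informatica\PowerCenter8.6.0\server\bin\pmcmd.exe",
    "IS_info_repo8x", "Domain_hp", "info_repo8x", "info_repo8x",
    "WorkFolder", "wf_dummy")
print(cmd)
```

Passing a list instead of a single string avoids quoting problems with the spaces and backslashes in the Windows path.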

3. How do we send emails from Informatica after the successful completion of a session? The email should contain the job name, session start time and session end time in the message body.

Answer:

First, a "mail" utility must be configured on the Informatica server (UNIX/Windows).

After that, we use the Informatica Email task. We can create an Email task and call it at the session level under “On Success Email”. Here we can use Informatica built-in email variables such as session name (%s), mapping name (%m), session start time (%b), session completion time (%c), etc.

4. Scenario Implementation 1
How do we pass a value calculated in a mapping variable to the email message? The email will be sent in HTML format with a predefined message in which one value is populated from a mapping variable. Suppose the predefined message is:
<html> <body>
The last transaction service ID is: <informatica_variable>
</body> </html>
In place of <informatica_variable>, the value of the mapping variable at the end of the session should appear.

Answer:

We cannot use a mapping variable at the Workflow or Session level. It is local to a mapping. Instead, we have to
use a Workflow variable for this purpose. However, we cannot pass the value of the mapping variable to the
Workflow variable directly from the mapping.

1) Write the calculated value to a flat file, say "value.txt", using the mapping.
2) Create a shell script, say "mail.sh", to send the mail. Read the value from "value.txt" into a variable
in "mail.sh" and use this variable in the body of the mail.

3) Create a Command Task at the workflow level and call "mail.sh" in that Command Task.

4) Place this Command Task downstream of the actual session and link it on the session's success.
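
The mail script in step 2 can be written in any scripting language. Below is a minimal Python sketch of the same idea, assuming "value.txt" contains only the value written by the mapping. The function name, subject line and addresses are hypothetical and are not part of Informatica.

```python
from email.message import EmailMessage
from pathlib import Path


def build_mail(value_file, sender, recipient):
    """Build an HTML mail whose body carries the value written by the mapping."""
    # Assumption: the file holds just the value, possibly with a trailing newline.
    value = Path(value_file).read_text().strip()

    msg = EmailMessage()
    msg["Subject"] = "Informatica session completed"  # hypothetical subject
    msg["From"] = sender
    msg["To"] = recipient

    html = (
        "<html> <body>\n"
        f"The last transaction service ID is: {value}\n"
        "</body> </html>"
    )
    # Send the body as text/html so that the mail client renders the markup.
    msg.set_content(html, subtype="html")
    return msg
```

The message returned here could then be handed to smtplib (for example `smtplib.SMTP(host).send_message(msg)`), where the SMTP host depends on the environment.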

5. How can we send two separate emails after a successful session run?

Answer:


The problem is that we cannot call two Email Tasks from one session, i.e. from the session level “On Success Email”.
So, for the second email, we can create another Email Task after the Session and connect the two with a
link whose condition checks that the session status is SUCCEEDED.

6. What is Cold Start in Informatica?

Answer:

In general terms, “Cold Start” means ‘To start a program from the very beginning, without being able to
continue the processing that was occurring previously when the system was interrupted.’

With respect to Informatica, we can resume a stopped or failed real-time session. To resume a session, we
must restart or recover it. The Integration Service can recover a session automatically if the session is
enabled for automatic task recovery. When we restart a session, the Integration Service resumes the session
based on the real-time source. Depending on the real-time source, it restarts the session with or
without recovery.

We can also restart a task or workflow in cold start mode. In cold start mode, the Integration Service
discards the recovery information and restarts the task or workflow.

For example, suppose a workflow failed midway and we do not want to recover the data because we have already
cleaned up the impacted target tables manually. If workflow recovery is enabled, we can opt for a cold start,
which skips the recovery step. A cold start removes any recovery data that was stored when the session failed.

• When we restart a stopped or failed task or workflow that has recovery enabled in cold start mode,
the Integration Service discards the recovery information and restarts the task or workflow.
• The Cold Start Task, Cold Start Workflow and Cold Start Workflow from Task commands can be executed
from the Workflow Manager, the Workflow Monitor, or the pmcmd command line program.
• If we restart a session in cold start mode, targets may receive duplicate rows.
• So, unless the targets have been cleaned up, avoid cold start and restart the session with recovery to prevent data duplication.
• If recovery is not enabled in a session, there is no difference between cold start and restart.

7. Scenario Implementation 2
Email - I have a list of 10 people who receive an email after a session failure. Can we edit the list of email
addresses dynamically, i.e. can we add or delete an email ID without touching the mapping?
Answer:

We can parameterize the email user list and modify the values in the parameter file. Use $PMSuccessEmailUser
and $PMFailureEmailUser. We can also use the pmrep command to update the email addresses of a session:
updateemailaddr -d <folder_name> -s <session_name> -u <success_email_address> -f <failure_email_address>

Alternatively, we can create a distribution list (DL) and use that DL in the session failure email. Every address
listed in the DL will receive the mail. Later on, addresses can be added to or removed from the DL as required.


8. We know there are 3 options for Session recovery strategy - Restart task, Fail task and
continue running the workflow, Resume from last checkpoint whenever a session fails.
How do we restart a workflow automatically without any manual intervention in the
event of session failure?

Answer:

Select the “Automatically recover terminated tasks” option in the workflow properties. We can also specify the
maximum number of automatic attempts in the workflow property “Maximum automatic recovery attempts”.

9. What is the difference between Real-time and continuous workflows?


Answer:

A Real-time workflow is triggered by a message arriving from its source (for example, an XML message), whereas a
continuous workflow is any workflow that is kept running continuously, for example by using two workflows that
call each other through command line (pmcmd) calls.

11. Scenario Implementation 3
Suppose we have two workflows workflow 1 (wf1) having two sessions (s1, s2) and workflow 2 (wf2) having
three sessions (s3, s4, s5) in the same folder, like below

wf1: s1, s2
wf2: s3, s4, s5

How can we run s1 first, then s3, then s2, and finally s4 and s5, without using the pmcmd command or a UNIX script?

Answer:

Use a Command Task or a Post Session Command to create a touch file, and use an Event Wait Task to wait for the
file (Filewatch Name).

A combination of Command Tasks and Event Wait Tasks solves the problem.

WF1----->S1------>CMD1----->EW2------>S2------->CMD3
WF2----->EW1--->S3--------->CMD2----->EW3---->S4------>S5

Run both workflows. Session s1 starts and, after successful execution, calls command task cmd1, which
generates a touch file, say s3.txt.

Execution in wf1 then passes to event wait ew2. As soon as s3.txt has been generated, event wait ew1 in wf2
releases session s3. After s3 succeeds, control passes to command task cmd2, which generates a touch file,
say s2.txt, and then to event wait task ew3.
At the same time, event wait ew2 is released by the file s2.txt and passes control to session s2. After s2
completes, it triggers command task cmd3, which generates the file s4.txt, and workflow wf1 ends. Event wait
ew3 is then released by s4.txt and calls session s4, which after success triggers the last session s5,
and workflow wf2 completes.
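
The hand-off between the two workflows can be checked with a small simulation. The Python sketch below is only an illustration of the ordering logic: it models Command Tasks as "touch" steps and Event Wait Tasks as "wait" steps, and it runs the workflows in turns rather than truly in parallel. The function name and the step encoding are assumptions made for this example.

```python
def run_workflows(workflows):
    """Simulate workflows that synchronise through touch files.

    Each workflow is a list of (kind, name) steps, where kind is "session",
    "touch" (a Command Task creating a file) or "wait" (an Event Wait Task
    watching for a file). Returns the order in which the sessions ran.
    """
    files = set()                                   # touch files created so far
    position = {name: 0 for name in workflows}      # next step of each workflow
    session_order = []
    progressed = True
    while progressed:
        progressed = False
        for name, steps in workflows.items():
            # Run this workflow until it finishes or blocks on a missing file.
            while position[name] < len(steps):
                kind, target = steps[position[name]]
                if kind == "wait" and target not in files:
                    break                           # file has not arrived yet
                if kind == "touch":
                    files.add(target)
                elif kind == "session":
                    session_order.append(target)
                position[name] += 1
                progressed = True
    return session_order


wf1 = [("session", "s1"), ("touch", "s3.txt"), ("wait", "s2.txt"),
       ("session", "s2"), ("touch", "s4.txt")]
wf2 = [("wait", "s3.txt"), ("session", "s3"), ("touch", "s2.txt"),
       ("wait", "s4.txt"), ("session", "s4"), ("session", "s5")]

print(run_workflows({"wf1": wf1, "wf2": wf2}))  # ['s1', 's3', 's2', 's4', 's5']
```

The order comes out the same whichever workflow is started first, because each Event Wait blocks until its file exists.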


12. How do we send a session failure mail with the workflow or session log as attachment?

Answer:

Design an Informatica Email Task to send an email in the event of session failure, and use the email
variable %g to attach the corresponding session log.
Email Variables:
(%g) - Attaches the session log.
(%a<>) - Attaches any file; the absolute path of the file must be given inside <>.

13. Explain deadlock in Informatica and how do we resolve it?

Answer:

At the database level, a deadlock occurs when two concurrent user sessions each hold a lock that the other
needs, typically while applying DML commands to the same rows of a table. Say, for example, the query below is
executed by user1 in session1:

update emp set deptno=20 where deptno=10;

If user2 in session2 executes the query below on the same rows before user1 commits the transaction, session2
has to wait for the locks held by session1. If session1 is in turn waiting for a lock held by session2, neither
can proceed and the database raises a deadlock error.

update emp set deptno=30 where deptno=10;

In Informatica, a deadlock normally occurs when two sessions update or delete records in the same table in
parallel (parallel inserts are not a problem). One option to avoid deadlocks is to identify those sessions and
run them sequentially. Another option is to make use of the session level properties such as ‘deadlock retry
limits’ and ‘deadlock recovery option’.

14. Scenario Implementation 4


Busy Percentage is given by (run time - idle time) * 100 / run time.
A thread with 0 idle time therefore has the highest Busy Percentage. Do we need to tune that thread
component, and why? In other words, should we tune the thread whose busy percentage (BP) is highest, or the
one with the most idle time?

Answer:

Three persons are asked to run 1 mile each. Each of them is allotted 20 minutes. The first person completes
the mile in 5 minutes and stands idle for the other 15 minutes of the allotted time. The second person completes
it in 10 minutes and sits idle for the remaining 10 minutes. The last one takes all 20 minutes and is idle for
0 minutes. Who is the worst performer?

Isn't it the last person, who had no idle time? It's the same for a thread with 0 idle time: the thread with
the highest Busy Percentage is the bottleneck and the one to tune.
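
As a worked example, the formula can be applied to the three runners. This is a minimal Python sketch; the function name is chosen for illustration only.

```python
def busy_percentage(run_time, idle_time):
    """Busy Percentage = (run time - idle time) * 100 / run time."""
    if run_time <= 0:
        raise ValueError("run_time must be positive")
    return (run_time - idle_time) * 100 / run_time


# The three runners, each allotted 20 minutes, with their idle minutes.
for runner, idle in [("first", 15), ("second", 10), ("third", 0)]:
    print(runner, busy_percentage(20, idle))
# first 25.0
# second 50.0
# third 100.0
```

The third runner, like a thread with no idle time, is busy 100% of the allotted time.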


15. How can we pass a value from one workflow to another?

Answer:

Pass the Workflow variable value to a session variable in the pre-session variable assignment, and from there to
a mapping parameter.
Next, develop a mapping that generates a parameter file containing the desired value as a workflow variable. The
value can then be passed to the next workflow using this parameter file.

Alternatively, develop the mapping to store the value in a flat file or database table. Next, create another
mapping in the next workflow that reads the value, and pass it to the session in the post-session variable
assignment and then to the workflow level if required.


21. Administration

1. What is Load Manager?

Answer:

The Load Manager performs the following tasks:

• Manages session and batch scheduling.
• Locks the session and reads the session properties.
• Reads the parameter file.
• Expands the server and session variables and parameters.
• Verifies permissions and privileges.
• Validates source and target code pages.
• Creates the session log file.
• Creates the Data Transformation Manager (DTM), which executes the session.

2. What is the DTM process? How many threads does it create to process data? Explain each thread
in brief.

Answer:

After the Load Manager performs validations for the session, it creates the DTM process. The DTM process is
the second process associated with the session run. The primary purpose of the DTM process
is to create and manage threads that carry out the session tasks. The DTM allocates process
memory for the session and divides it into buffers. This is also known as buffer memory. It creates
the main thread, which is called the master thread. The master thread creates and manages
all other threads. If we partition a session, the DTM creates a set of threads for each partition
to allow concurrent processing. When the Informatica server writes messages to the session log, it includes
the thread type and thread ID. The DTM creates the following types of threads:

• MASTER THREAD - Main thread of the DTM process. Creates and manages all other threads.
• MAPPING THREAD - One thread for each session. Fetches session and mapping information.
• PRE- AND POST-SESSION THREAD - One thread each to perform pre-session and post-session operations.
• READER THREAD - One thread for each partition for each source pipeline.
• WRITER THREAD - One thread for each partition, if a target exists in the source pipeline, to write to the
target.
• TRANSFORMATION THREAD - One or more transformation threads for each partition.

3. Can you create a folder within designer?

Answer:


No, this is not possible. Folders are created in the Repository Manager, not in the Designer.

4. How do you take care of security using a repository manager?

Answer:

• Using repository privileges, folder permissions and locking.
• Repository privileges (Session operator, Use designer, Browse repository, Create session and batches,
Administer repository, Administer server, Super user)
• Folder permissions (owner, groups, users)
• Locking (Read, Write, Execute, Fetch, Save)

5. What are the different uses of a repository manager?

Answer:

The Repository Manager is used to create the repository, which contains the metadata that Informatica uses to
transform data from source to target. It is also used to create Informatica users and folders, and to copy,
back up and restore the repository.

6. What are 2 modes of data movement in Informatica Server?

Answer:

The data movement mode depends on whether the Informatica Server should process single-byte or multi-byte
character data. This mode selection can affect the enforcement of code page relationships and code page
validation in the Informatica Client and Server.

• Unicode – The IS allows 2 bytes for each character and uses an additional byte for each non-ASCII character
(such as Japanese characters).
• ASCII – The IS holds all data in a single byte.

The IS data movement mode can be changed in the Informatica Server configuration parameters. The change comes
into effect once you restart the Informatica Server.
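
To see why single-byte storage is not enough for multi-byte data, the following Python sketch compares how many bytes a string needs in a single-byte and in a two-byte encoding. It uses Python's own "ascii" and "utf-16-le" codecs as stand-ins; this is an assumption for illustration and says nothing about how the Informatica Server stores data internally.

```python
def encoded_sizes(text):
    """Return (bytes needed in ASCII or None, bytes needed in UTF-16)."""
    try:
        single_byte = len(text.encode("ascii"))
    except UnicodeEncodeError:
        single_byte = None  # the text cannot be held in single-byte ASCII
    two_byte = len(text.encode("utf-16-le"))
    return single_byte, two_byte


print(encoded_sizes("Informatica"))  # (11, 22)
print(encoded_sizes("データ"))        # (None, 6)
```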

7. What is Code Page used for?


Answer:

A code page contains the encoding to specify characters in a set of one or more languages. An encoding is
the assignment of a number to a character in the character set. A code page is used to identify characters that
might be in different languages. If you are importing Japanese data into a mapping, then you must select the
Japanese code page for the source data.


8. What is Code Page Compatibility?


Answer:

Compatibility between code pages is used for accurate data movement when the Informatica Server runs in
the Unicode data movement mode. If the code pages are identical, then there will not be any data loss. One
code page can be a subset or superset of another. For accurate data movement, the target code page must
be a superset of the source code page.

Superset - A code page is a superset of another code page when it contains all the characters encoded in the
other code page. It also contains additional characters not contained in the other code page.

Subset - A code page is a subset of another code page when all characters in the code page are encoded in
the other code page.
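
The superset rule can be illustrated with Python's built-in codecs. The sketch below treats a codec as a code page and checks whether every character of a single-byte source codec can be encoded in the target codec. The function names are hypothetical, and the check only works for single-byte source codecs such as "ascii" or "latin-1".

```python
def source_characters(codec):
    """Characters that a single-byte codec can represent."""
    chars = set()
    for byte in range(256):
        try:
            chars.add(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            pass  # this byte value is not assigned in the codec
    return chars


def is_superset(target_codec, source_codec):
    """True if every character of source_codec can be encoded in target_codec."""
    for ch in source_characters(source_codec):
        try:
            ch.encode(target_codec)
        except UnicodeEncodeError:
            return False
    return True


print(is_superset("latin-1", "ascii"))  # True: no data loss moving ASCII to Latin-1
print(is_superset("ascii", "latin-1"))  # False: accented characters would be lost
```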

9. What is default block buffer size?

Answer: 64K

10. What is default LM shared memory size?

Answer: 2MB

11. Define Server Concepts with respect to memory buffers

Answer:

The Informatica server uses three system resources: CPU, shared memory and buffer memory. It uses shared
memory, buffer memory and cache memory for session information and to move data between session threads.

LM Shared Memory - The Load Manager uses both process and shared memory. The LM keeps the Informatica
server list of sessions and batches, and the schedule queue, in process memory. Once a session starts, the LM
uses shared memory to store session details for the duration of the session run or session schedule. This
shared memory appears as the configurable parameter LMSharedMemory, and the server allots 2,000,000
bytes by default. This allows you to schedule or run approximately 10 sessions at one time.

DTM Buffer Memory - The DTM process allocates buffer memory to the session based on the DTM buffer
pool size setting in the session properties. By default, it allocates 12,000,000 bytes of memory to the session.
The DTM divides the memory into buffer blocks as configured in the buffer block size setting (default: 64,000
bytes per block).
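
With the default figures quoted above, the number of buffer blocks available to a session follows from a simple division. A minimal Python sketch, with constant and function names chosen for illustration:

```python
DTM_BUFFER_POOL_BYTES = 12_000_000  # default DTM buffer size quoted above
BUFFER_BLOCK_BYTES = 64_000         # default buffer block size quoted above


def buffer_blocks(pool_bytes, block_bytes):
    """Number of whole buffer blocks that fit in the DTM buffer pool."""
    return pool_bytes // block_bytes


print(buffer_blocks(DTM_BUFFER_POOL_BYTES, BUFFER_BLOCK_BYTES))  # 187
```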


12. What are the two programs that communicate with the Informatica Server?

Answer:

Informatica provides Server Manager and pmcmd programs to communicate with the Informatica Server:

Server Manager - A client application used to create and manage sessions and batches, and to monitor and
stop the Informatica Server. You can use information provided through the Server Manager to troubleshoot
sessions and improve session performance.

pmcmd - A command-line program that allows you to start and stop sessions and batches, stop the
Informatica Server, and verify if the Informatica Server is running.


22. Command Line Arguments

1. What are pmcmd commands?

Answer:

pmcmd is a command line program used to communicate with the Informatica server. It does not replace the
Server Manager, since there are many tasks that you can perform only with the Server Manager.

Some of the operations that you can perform using pmcmd are starting, stopping and aborting a session.

2. What are pmrep commands?

Answer:

You can use pmrep to create or delete repository users and groups. You can also use pmrep to modify
repository privileges assigned to users and groups.

3. How do we start & stop session from pmcmd command line?

Answer:

Use the following syntax to ping the Informatica Server on a UNIX system:

pmcmd ping [{user_name | %user_env_var} {password | %password_env_var}]
    [hostname:]portno

Use the following syntax to start a session or batch on a UNIX system:

pmcmd start {user_name | %user_env_var} {password | %password_env_var}
    [hostname:]portno [folder_name:]{session_name | batch_name}
    [:pf=param_file] session_flag wait_flag

Use the following syntax to stop a session or batch on a UNIX system:

pmcmd stop {user_name | %user_env_var} {password | %password_env_var}
    [hostname:]portno [folder_name:]{session_name | batch_name} session_flag

Use the following syntax to stop the Informatica Server on a UNIX system:

pmcmd stopserver {user_name | %user_env_var} {password | %password_env_var}
    [hostname:]portno


23. Metadata Repository

1. Is there any metadata query to find the list of Informatica folder name, workflow names
which are migrated in a particular Quarter?

Answer:

The below SQL will give you the list of folders, workflows and their last saved date.

SELECT W.SUBJECT_AREA FOLDER_NAME, W.WORKFLOW_NAME, W.WORKFLOW_LAST_SAVED
FROM REP_WORKFLOWS W
ORDER BY TO_DATE (W.WORKFLOW_LAST_SAVED, 'MM/DD/YYYY HH24:MI:SS') DESC
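
The query above lists every workflow; to restrict it to one quarter, a date filter on the last saved date can be added. The Python sketch below computes the quarter boundaries and builds such a WHERE clause. It assumes an Oracle repository database (for the DATE literal syntax) and uses the last saved date as a stand-in for the migration date; the function names are hypothetical.

```python
from datetime import date


def quarter_bounds(year, quarter):
    """Return (first day of the quarter, first day of the next quarter)."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be between 1 and 4")
    start = date(year, 3 * (quarter - 1) + 1, 1)
    end = date(year + 1, 1, 1) if quarter == 4 else date(year, 3 * quarter + 1, 1)
    return start, end


def quarter_filter(year, quarter):
    """Build a WHERE clause that keeps workflows last saved in the given quarter."""
    start, end = quarter_bounds(year, quarter)
    column = "TO_DATE(W.WORKFLOW_LAST_SAVED, 'MM/DD/YYYY HH24:MI:SS')"
    return (f"WHERE {column} >= DATE '{start.isoformat()}' "
            f"AND {column} < DATE '{end.isoformat()}'")


print(quarter_filter(2013, 2))
```

The generated clause would be placed between the FROM and ORDER BY lines of the query.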

2. How can I run Metadata Queries in Informatica PowerCenter?

Answer:

Informatica metadata is stored in a repository database. This can be the same database where we have
our source / staging / target tables, or it may be a completely different database (which is generally the case).
We can execute user-defined metadata queries only on this database.
We may need to ask the Informatica administrator for the database login credentials, as we need a username and
password with read access to the database. After that we can connect to the database and run the
metadata queries.

3. Write a metadata query to identify the sessions having truncate option enabled

Answer:

select
task_name,
'Truncate Target Table' ATTR,
decode(attr_value,1,'Yes','No') Value
from OPB_EXTN_ATTR OEA,
REP_ALL_TASKS RAT
where
OEA.SESSION_ID=rat.TASK_ID
and attr_id=9

4. Where can I find a history / metrics of the load sessions that have occurred in
Informatica?

Answer:


The tables which house this information are OPB_LOAD_SESSION, OPB_SESSION_LOG, and
OPB_SESS_TARG_LOG. OPB_LOAD_SESSION contains the single session entries, and OPB_SESSION_LOG contains
a historical log of all session runs that have taken place. OPB_SESS_TARG_LOG keeps track of the errors and
the target tables which have been loaded. Keep in mind that these tables are tied together by Session_ID. If a
session is deleted from OPB_LOAD_SESSION, its history is not necessarily deleted from OPB_SESSION_LOG,
nor from OPB_SESS_TARG_LOG. Unfortunately, this leaves unidentified session IDs in these tables. However,
when we join them together, we can get the start and completion times of each session.

5. How do we extract the Workflow Monitor record information from the Informatica metadata repository?

Answer:

SELECT DISTINCT
FOLDER_NAME, WORKFLOW_NAME, SESSION_NAME,
START_DATE, START_TIME, END_DATE, END_TIME,
DURATION "DURATION IN DD:HH:MI:SS",
SOURCE_ROWS, TARGET_ROWS, REJECTED_ROWS, REJECTED_STATUS, STATUS,
FAILED_REASON
FROM
( SELECT
t.SUBJECT_AREA FOLDER_NAME, t.WORKFLOW_NAME, t.SESSION_NAME,
DECODE(t.RUN_STATUS_CODE, 2,NULL, TO_CHAR(t.ACTUAL_START,'DD-MON-YYYY')) START_DATE,
DECODE(t.RUN_STATUS_CODE, 2,NULL, TO_CHAR(t.ACTUAL_START,'HH24:MI:SS AM')) START_TIME,
DECODE(t.RUN_STATUS_CODE, 2,NULL, TO_CHAR(t.SESSION_TIMESTAMP,'DD-MON-YYYY')) END_DATE,
DECODE(t.RUN_STATUS_CODE, 2,NULL, TO_CHAR(t.SESSION_TIMESTAMP,'HH24:MI:SS PM')) END_TIME,
DECODE(t.RUN_STATUS_CODE, 2,NULL,
  TRUNC((((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)/60)/24)||':'
  || (TRUNC(((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)/60) -
      24*(TRUNC((((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)/60)/24)))||':'
  || (TRUNC((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60) -
      60*(TRUNC(((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)/60))) ||':'
  || (TRUNC(86400*(SESSION_TIMESTAMP-ACTUAL_START)) -
      60*(TRUNC((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)))) DURATION,
DECODE(t.RUN_STATUS_CODE, 2,NULL, t.SUCCESSFUL_SOURCE_ROWS) SOURCE_ROWS,
DECODE(t.RUN_STATUS_CODE, 2,NULL, t.SUCCESSFUL_ROWS) TARGET_ROWS,
DECODE(t.RUN_STATUS_CODE, 2,NULL, t.FAILED_ROWS) REJECTED_ROWS,
DECODE(t.RUN_STATUS_CODE, 2,NULL, CASE WHEN t.SUCCESSFUL_SOURCE_ROWS <>
  t.SUCCESSFUL_ROWS THEN 'VALIDATE THE MISMATCH' END) REJECTED_STATUS,
DECODE(t.RUN_STATUS_CODE, 1,'Succeeded', 2,'Disabled', 3,'Failed',
  4,'Stopped', 5,'Aborted', 6,'Running', 7,'Suspending', 8,'Suspended',
  9,'Stopping', 10,'Aborting', 11,'Waiting', 15,'Terminated') AS STATUS,
REPLACE(REPLACE(t.FIRST_ERROR_MSG,CHR(10),' '),'No errors encountered.','') AS FAILED_REASON,
RANK() OVER (PARTITION BY session_name ORDER BY t.SESSION_TIMESTAMP DESC) rnk
FROM REP_SESS_LOG t WHERE t.SUBJECT_AREA='<<informatica_folder_name>>'
) sess_run
WHERE sess_run.rnk = 1
ORDER BY START_DATE, START_TIME
Don't forget to put the Informatica folder name in the SUBJECT_AREA filter above. We might also need to
make some other small adjustments to the query to better suit our purpose and Informatica version.
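
The nested TRUNC arithmetic for the DURATION column is easier to follow when written out step by step. The Python sketch below reproduces the same DD:HH:MI:SS breakdown from a number of elapsed seconds; the function name is chosen for illustration, and like the SQL it does not zero-pad the parts.

```python
from datetime import datetime


def format_duration(total_seconds):
    """Format elapsed seconds as DD:HH:MI:SS, mirroring the TRUNC arithmetic."""
    total_seconds = int(total_seconds)
    days, rest = divmod(total_seconds, 86400)   # 86400 seconds in a day
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}:{hours}:{minutes}:{seconds}"


start = datetime(2013, 1, 1, 22, 0, 0)
end = datetime(2013, 1, 2, 1, 30, 15)
print(format_duration((end - start).total_seconds()))  # 0:3:30:15
```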


24. Repository Manager

1. Describe the steps for export and import?

Answer:

• Open the folder which contains the mapping.
• Check out the mapping to be exported.
• Click Repository-->Export Objects and save the XML file to your local drive.
• Open the folder into which you want to import the mapping.
• Click Repository-->Import Objects, select the mapping XML file and click Import.
• Once the mapping is imported into the new folder, save it and check it in.

2. What are the various methods of code migration or which is the best way of deployment?

Answer:

The best way is, arguably, the XML export and import, as it is very easy.
But again it all depends upon the requirement; if we want to migrate some workflows with their
dependent objects in one shot, then the suggested way is XML export and import.

If we need to migrate only a few small objects (say some Designer or Workflow Manager objects), then we
can copy them through the Repository Manager, through the Designer (for Designer objects) or through the
Workflow Manager (for Workflow Manager objects) itself. But for this we have to be connected to both
repositories while copying.

Sometimes we may need to migrate an entire project and want to have a complete log of the deployment. In that
case we can create a Deployment Group and deploy it using the Deployment Wizard.

We might use pmrep to automate exporting objects on a daily or weekly basis. A deployment control file is an XML
file that we use with the DeployFolder and DeployDeploymentGroup pmrep commands to deploy a folder or
deployment group. To use these commands, we must create a control file with all the specifications that the
Copy Wizard requires. The control file is an XML file defined by the depcntl.dtd file.

We can create a deployment control file manually to provide parameters for deployment, or we can create
it with the Copy Wizard. If we create the deployment control file manually, it must
conform to the depcntl.dtd file that is installed with the PowerCenter Client. The location of the
depcntl.dtd file is included in the deployment control file.

One good thing is that we can roll back a deployment to purge the deployed versions from the target repository
or folder. When we roll back a deployment, we roll back all the objects in a deployment group that we
deployed at a specific date and time. We cannot roll back part of a deployment.

In the PowerCenter Client, we can export repository objects to an XML file and then import repository
objects from the XML file. Use the following client applications to export and import repository objects:

• Repository Manager: You can export and import both Designer and Workflow Manager objects.
• Designer: You can export and import Designer objects.
• Workflow Manager: You can export and import Workflow Manager objects.
• pmrep: You can export and import both Designer and Workflow Manager objects. You might use
pmrep to automate exporting objects on a daily or weekly basis.

3. What are the various options for ETL code migration

Answer:

There are a couple of options available for code migration.

If you have a versioned repository, first check in all the workflows and dependent objects. There are
then several ways to achieve the migration.

Option 1. Export the workflow from Repository Manager using the Export Objects option to
export it as XML, and then import it into QA using the Repository Manager Import Objects option.

Option 2. If Dev and QA are in the same repository, you can simply drag and drop. For
this, open both the Dev and QA folders in Repository Manager and drag the objects from Dev to QA.

Option 3. Create a Deployment Group using Repository Manager, add all the workflows you
need to migrate to the Deployment Group, and migrate the Deployment Group.

Option 4. You also have the option to migrate the entire folder.

When to use each option:

Option 1. Use this option when the number of workflows to migrate is small. If you do not have an
Informatica versioned repository, the exported XML files can be used to keep your versions.

Option 2. Use this option when you have only a few workflows to migrate.

Option 3. Use this option when a large number of objects are migrated together. It keeps the list of objects
migrated as a group, which makes a rollback easy if one is required.

Option 4. Mostly used when you migrate a project to QA for the first time with a large number of workflows.

4. What is labeling in Informatica?

Answer:

We see the concept of labels in many places, for example in a mailbox, where we sometimes group mails
under different labels, such as marking some mails as personal.

In Informatica, a label is a global object that you can associate with any versioned object or group of
versioned objects in a repository. You may want to apply labels to versioned objects to achieve the following
results:

- Track versioned objects during development.
- Improve query results.
- Associate groups of objects for deployment.
- Associate groups of objects for import and export.

For example, you might apply a label to sources, targets, mappings, and sessions associated with a workflow
so that you can deploy the workflow to another repository without breaking any dependency.

You can apply the label to multiple versions of an object, or you can specify that the label applies to
only one version of the object.

You can create and modify labels in the Label Browser. From the Repository Manager, click Versioning > Labels
to browse for a label.

Informatica version control is a team-based development methodology in which we create copies
of the actual objects to track modifications using the check in and check out options.

5. With Informatica version control in place, can we revert an object to the state it had
two versions earlier?

Answer:

• From the Version History of the object, open the required version of the object in the Workspace.
• Next, export the XML metadata of the object.
• Next, check out the object.
• Then import the metadata exported earlier.
• Save and check in the object.

6. What do we mean by Team based development in Informatica?

Answer:
Team based development is essentially version control for the metadata objects.

If we have the team-based development option, we can enable version control for the repository. A
versioned repository stores multiple versions of an object. Each version is a separate object with unique
properties. The PowerCenter version control feature allows us to efficiently develop, test, and deploy
metadata into production.

During development, we can perform the following change management tasks to create and manage
multiple versions of objects in the repository:

• Check out and check in versioned objects.
• Compare objects.
• Track changes to an object.
• Delete or purge a version.
• Use global objects such as queries, deployment groups, and labels to group versioned objects.

25. Scenario Questions

1. Suppose we have ten source flat files of same structure. How can we load all the files in
target database in a single batch run using a single mapping?

Answer:

After we create a mapping to load data into the target database from the source flat file definition, we move
on to the session properties of the Source Qualifier.

To load a set of source files, we create a file, say final.txt, containing the names of the source flat files
(ten files in our case) and set the Source filetype option to Indirect. We then point the session to final.txt
through the Source file directory and Source filename properties.
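
The list file itself is plain text with one source file name per line, so it can be generated rather than typed by hand. A minimal Python sketch follows; the function name and the example file pattern are hypothetical, and whether the entries need full paths depends on how the session's source file directory is configured.

```python
from pathlib import Path


def write_file_list(source_dir, pattern, list_name="final.txt"):
    """Write the names of the matching source files, one per line, to the list file."""
    source_dir = Path(source_dir)
    # Skip the list file itself in case the pattern also matches it.
    names = sorted(p.name for p in source_dir.glob(pattern) if p.name != list_name)
    list_path = source_dir / list_name
    list_path.write_text("\n".join(names) + "\n")
    return list_path
```

For example, `write_file_list("/data/src", "emp_*.dat")` would list every file matching that pattern in final.txt inside the same directory.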


2. Suppose we have two Source Qualifier transformations SQ1 and SQ2 connected to Target
tables TGT1 and TGT2 respectively. How do you ensure TGT2 is loaded after TGT1?

Answer:

If we have multiple Source Qualifier transformations connected to multiple targets, we can designate the
order in which the Integration Service loads data into the targets.

In the Mapping Designer, we need to configure the Target Load Plan based on the Source Qualifier
transformations in the mapping to specify the required loading order.

The Target Load Plan defines the order in which the Informatica server loads the data into the targets. This
helps to avoid integrity constraint violations.

3. Suppose we have a Source Qualifier transformation that populates two target tables. How
do you ensure TGT2 is loaded after TGT1?

Answer:

In the Workflow Manager, we can configure constraint-based load ordering for a session. The Integration
Service orders the target load on a row-by-row basis. For every row generated by an active source, the
Integration Service loads the corresponding transformed row first to the primary key table, then to the
foreign key table.

Hence, if one Source Qualifier transformation provides data for multiple target tables that have primary
and foreign key relationships, we use constraint-based load ordering.

4. Suppose we have the EMP table as our source. In the target we want to see those employees
whose salaries are greater than or equal to the average salary for their departments.
Describe your mapping approach.

Answer:

The mapping is built from the following transformations, in order:

After the Source Qualifier of the EMP table, place a Sorter transformation and sort on the DEPTNO port.

Next we place a sorted Aggregator transformation. Here we compute the average salary (AVG_SAL) for each
DEPTNO, using DEPTNO as the group-by port.

When we perform this aggregation, we lose the data for individual employees.

To keep the employee data, we pass one branch of the pipeline to the Aggregator transformation and a
second branch, carrying the same sorted source data, directly to the Joiner transformation.

When we join both branches of the pipeline, we join the aggregated data with the original data.


So next we need a sorted Joiner transformation to join the aggregated data with the original data on
DEPTNO. We take the aggregated pipeline as the Master and the original data flow as the Detail pipeline.


After that we need a Filter transformation to drop the employees whose salary is less than the average
salary for their department.

Filter Condition: SAL >= AVG_SAL

Finally we place the Target table instance.
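The data flow of this mapping (sort, aggregate, join back, filter) can be checked outside Informatica with a few lines of Python. This is only an illustrative sketch of the logic; the function name and the dictionary keys ENAME, DEPTNO, SAL and AVG_SAL are assumptions chosen to mirror the port names used above.

```python
from collections import defaultdict


def at_or_above_dept_average(employees):
    """employees: list of dicts with the keys ENAME, DEPTNO and SAL."""
    # Sorter: order the rows by DEPTNO.
    rows = sorted(employees, key=lambda r: r["DEPTNO"])
    # Aggregator branch: AVG(SAL) per DEPTNO.
    totals = defaultdict(lambda: [0, 0])
    for r in rows:
        totals[r["DEPTNO"]][0] += r["SAL"]
        totals[r["DEPTNO"]][1] += 1
    avg_sal = {dept: total / count for dept, (total, count) in totals.items()}
    # Joiner: attach AVG_SAL to every original row, then Filter: SAL >= AVG_SAL.
    joined = [dict(r, AVG_SAL=avg_sal[r["DEPTNO"]]) for r in rows]
    return [r for r in joined if r["SAL"] >= r["AVG_SAL"]]
```

Employees earning exactly the department average are kept, which matches the "greater than or equal to" wording of the question.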

5. How can we perform changed data capture based on a load sequence number (integer)
column present in the Source table?

Answer:

Create a mapping variable with an integer data type and Aggregation type MAX. Set the value of this
mapping variable in an Expression, Filter, Router or Update Strategy transformation using the function
SETMAXVARIABLE( $$Variable, load_seq_column ). This assigns the maximum sequence number of that
particular load to the variable $$Variable.

This function executes only if a row is marked as insert. SETMAXVARIABLE ignores all other row types, and
the current value remains unchanged. For each record, the function sets the current value of the mapping
variable to the higher of two values: the current value of the variable or the value from the source column.
At the end of a successful session, the Integration Service saves the final current value to the repository.

When used with a session that contains multiple partitions, the Integration Service generates different
current values for each partition. At the end of the session, it saves the highest current value across all
partitions to the repository. Unless overridden, it uses the saved value as the initial value of the variable
for the next session run.

Since the maximum sequence number of the previous load is captured in this mapping variable and saved in
the repository, we can use the variable as a filter in the Source Qualifier query. The next time we run the
workflow, it extracts only those records whose load sequence number is greater than the saved value.
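The high-water-mark behaviour described above can be sketched in plain Python. This is a simulation of the idea only, not Informatica code: the function incremental_extract and the key load_seq are hypothetical names, and the repository is represented simply by the value returned from one run and passed into the next.

```python
def incremental_extract(source_rows, last_seq):
    """Return (rows newer than last_seq, new high-water mark)."""
    # Source Qualifier filter: load_seq > $$Variable
    extracted = [r for r in source_rows if r["load_seq"] > last_seq]
    # SETMAXVARIABLE-like behaviour: keep the higher of the current value
    # and the value seen on each row.
    current = last_seq
    for r in extracted:
        current = max(current, r["load_seq"])
    return extracted, current
```

Running it repeatedly with the returned value shows that a run with no new rows extracts nothing and leaves the saved value unchanged.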


6. Scenario Implementation 1
My mapping joins 3 tables.
In the source query we want to filter the data based on a value that is stored in one of our target tables. Is
there a way of pulling that one particular value from the target table and using it in the filter in the
Source Qualifier? Basically the value is a load sequence number that gets incremented with each session run,
so when the session runs again we only pull records that are greater than that load sequence number.

Answer:

There are different options to solve the problem.

Option 1: Assumption - the source and target tables cannot be accessed using a single DB connection, and
the "load sequence number" is modified by the current process.

In this case you can use a mapping variable in the mapping and set its value to the highest/current value
using the SETMAXVARIABLE function. This value is stored in the Informatica repository and can be used in
the Source Qualifier filter for the next session run. If the workflow fails, the value of the mapping
variable does not get incremented.

Steps

• Define a mapping variable with Aggregation type MAX.
• Use the SETMAXVARIABLE($$Variable, "Current load Sequence Number") function to store the value in the
repository.
• Use the variable $$Variable in the Source Qualifier filter.

We can provide a default value for the variable and change it during code migration to set the starting
value.

Option 2: Assumption - the source and target tables cannot be accessed using a single DB connection, and
the "load sequence number" is modified by a different process.

In this case you can create a mapping parameter and pass the value in through a parameter file.

Steps

• Create a workflow that gets the latest "load sequence number" and creates a parameter file.
This workflow writes a flat file containing the parameter value, e.g.
[wf_DAILY_INCR_LOAD]
$$Variable=100
• In the actual mapping, define a mapping parameter $$Variable and use $$Variable in the Source Qualifier.

Each time, run the workflow that creates the parameter file before running the actual workflow.
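In the answer the parameter file is written by a separate workflow. Purely to illustrate the file layout shown in the example, the sketch below writes the same two lines from Python. The function name write_param_file is an assumption, and the section header is copied from the example above rather than from any Informatica reference.

```python
from pathlib import Path


def write_param_file(path, section, name, value):
    """Write a two-line parameter file: a [section] header and NAME=value."""
    text = f"[{section}]\n{name}={value}\n"
    Path(path).write_text(text)
    return text
```

For instance, write_param_file("incr_load.par", "wf_DAILY_INCR_LOAD", "$$Variable", 100) reproduces the example file; the file name incr_load.par is hypothetical.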

Option 3: Assumption - the source and target tables can be accessed using a single DB connection.

If both the source and target tables are reachable through a single DB connection, we can write the filter
in the Source Qualifier itself, joining all the tables to get only the latest data.


7. How can we load ‘x’ records (user defined record numbers) out of ‘N’ records from source
dynamically, without using filter and sequence generator transformation?

Answer:

• Use a mapping parameter, say $$CNT, to pass the number of records we want to load, and change its
value in the parameter file before each session run.
• After the Source Qualifier, use an Expression transformation and create an output port, say CNTR,
with the value CUME(1).
• Next use an Update Strategy with the condition IIF($$CNT >= CNTR, DD_INSERT, DD_REJECT).
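The effect of the running counter and the Update Strategy condition can be sketched as follows. This is a plain Python illustration, not Informatica code; load_first_x is a hypothetical name and the two returned lists stand in for inserted and rejected rows.

```python
def load_first_x(rows, cnt):
    """Split rows into (loaded, rejected) using a running counter."""
    loaded, rejected = [], []
    cntr = 0
    for row in rows:
        cntr += 1  # plays the role of CUME(1): a running total of 1 per row
        # IIF($$CNT >= CNTR, DD_INSERT, DD_REJECT)
        if cnt >= cntr:
            loaded.append(row)
        else:
            rejected.append(row)
    return loaded, rejected
```

Changing cnt between runs corresponds to editing $$CNT in the parameter file.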

8. Suppose we have ‘n’ number of rows in the Source and we have two target tables. How
can we load ‘n/2’ i.e. first half the source data into one target and the remaining half into
the next target?

Answer:

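Use an Expression transformation with an output port ROWNUM with the expression CUME(1).

Next use a Router with two groups having the conditions below:

MOD( ROWNUM, 2 ) = 0

MOD( ROWNUM, 2 ) = 1

Connect each group to the corresponding target instance. Note that this routes alternate rows (even and
odd row numbers) to the two targets, so each target receives about half of the rows but not the first and
second halves in source order.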

Alternatively, to load the first n/2 rows into one target and the rest into the other, follow the
implementation steps below.

• First place the Source table and its corresponding Source Qualifier in the mapping.
• Next split the data into two flows: one going to an Expression transformation with all the ports, and
the other, with any one column, going to an Aggregator transformation.
• In the Aggregator add a numeric output port, say CNT, with the expression COUNT(1), and do not
group by any input port.
• Propagate this output column CNT to a second Expression transformation. In this Expression
transformation create another numeric output port JN with the expression value 1.
• Now go back to the first Expression transformation, the one with all the source columns. Introduce a
Sequence Generator transformation with the Reset property enabled and propagate its NEXTVAL port
to this Expression transformation. Also add one more numeric output port JN with the expression
value 1.
• Now take a Joiner transformation and check the Sorted Input property.
• Bring all the columns from the first Expression transformation (the one after the Source Qualifier)
into the Joiner. The other flow into the Joiner comes from the second Expression transformation, with
the two columns CNT and JN. The join condition is based on the JN ports.
• After the Joiner place a Router transformation. Create one group, say FST, with the condition
NEXTVAL <= (CNT/2).
• Next introduce two target tables, first and second. Propagate the columns of the FST group of the
Router to the first target, and the columns of the Default group of the Router to the second target.
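The count-and-sequence approach can be sketched in Python as below. It is a sketch of the routing logic only; split_in_half is a hypothetical name. The comparison uses <= so that an even number of rows splits exactly in half, which is a choice made for this illustration.

```python
def split_in_half(rows):
    """Route the first half of rows to one list and the rest to another."""
    cnt = len(rows)  # Aggregator: COUNT(1) over the whole source
    first, second = [], []
    # Sequence Generator: NEXTVAL starts at 1 for each run.
    for nextval, row in enumerate(rows, start=1):
        if nextval <= cnt / 2:   # Router group FST
            first.append(row)
        else:                    # Router Default group
            second.append(row)
    return first, second
```

With an odd number of rows the extra row goes to the second target.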


9. Suppose we have a flat file which has a header record with 'file creation date', followed by
detail data records. Describe the approach to load the 'file creation date' column along
with each and every detail record.

Answer:

• Use the shell command below as a pre-session command to write the header record to another flat
file:
head -1 Sourc_File.dat > header.txt
• Use this flat file header.txt as a Lookup source in the mapping.
• Create an output port in an Expression transformation with the value 'H', or whichever tag identifies
the header record in the source data file.
• Use this port in the Lookup condition, take the file creation date as the return field and populate it
in the target table.

10. Scenario Implementation 2
Suppose we have the two tables below. What will be the output if we select Table 1 as the source and use a
Joiner or a Lookup transformation on Table 2 based on column ID?

Table 1
ID
10

Table 2
ID    Name
10    A
10    B
10    C

Answer:

When we use a Joiner transformation with an inner join on column ID, we get 3 rows as output.

When we use a passive Lookup transformation, we get 1 row as output. With multiple lookup matches, the
lookup returns either the first or the last match, as configured in the "On Multiple Matches" property of
the transformation.

When we use an active Lookup transformation, we get 3 rows as output, because an active lookup returns all
the matching values on multiple lookup matches.
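The three outcomes can be reproduced with a small Python sketch. The functions inner_join and lookup and the option values "first", "last" and "all" are hypothetical names for illustration; here "first" and "last" follow list order, which is an assumption and need not match how Informatica orders its lookup cache.

```python
table1 = [{"ID": 10}]
table2 = [{"ID": 10, "Name": "A"}, {"ID": 10, "Name": "B"}, {"ID": 10, "Name": "C"}]


def inner_join(left, right):
    """Joiner-style inner join on ID: one output row per matching pair."""
    return [{**l, **r} for l in left for r in right if l["ID"] == r["ID"]]


def lookup(left, right, on_multiple_match="first"):
    """Lookup on ID; 'first'/'last' return one row, 'all' returns every match."""
    out = []
    for l in left:
        matches = [r for r in right if r["ID"] == l["ID"]]
        if not matches:
            out.append({**l, "Name": None})   # no match: NULL return value
        elif on_multiple_match == "all":
            out.extend({**l, **m} for m in matches)
        else:
            pick = matches[0] if on_multiple_match == "first" else matches[-1]
            out.append({**l, **pick})
    return out
```

Calling inner_join(table1, table2) and lookup(table1, table2, "all") both give three rows, while the "first" and "last" settings give a single row each.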

11. Suppose we have a flat file which contains just a numeric value. We need to populate this
value in one column of the target table for every source record. How can we achieve this?

Answer:

• Use an Expression transformation and create a decimal output port, say DUMMY, set to a very high
number, alongside the other I/O ports from the source table.
Say, DUMMY = 99999999999 [Note: use a value that can never appear in the lookup flat file.]
• Now use a Lookup transformation based on the flat file. Say the column name in the lookup is VALUE.
• Map DUMMY from the Expression to the Lookup and use the lookup condition
DUMMY != VALUE
• Next use the VALUE column of the Lookup to populate the target column.

12. How will you load a source flat file into a staging table when the file name is not fixed?
The file name is like sales_2013_02_22.txt, i.e. the date is appended to the end of the file
name.

Answer:

The generic file name is of the form sales_YYYY_MM_DD.txt.

One option is to rename the file in a pre-session task. We use an OS-level command to rename the file to a
fixed name, then set the Informatica source filename to this fixed name and load the file.
E.g. in Unix:
$> mv sales_*.txt sales.txt

Another option is to use indirect loading with a fixed file name. The fixed-name file then contains the
actual name of the file to be processed.
E.g. in Unix:
$> ls sales_*.txt > sales.txt
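When the run date is known, the dated name can also be computed instead of matched with a wildcard, which avoids picking up more than one file. The sketch below assumes the sales_YYYY_MM_DD.txt convention from the question; the function names and the fixed list name sales.txt used as a default are illustrative assumptions.

```python
from datetime import date
from pathlib import Path


def sales_file_name(run_date):
    """Build the dated file name, e.g. sales_2013_02_22.txt."""
    return f"sales_{run_date:%Y_%m_%d}.txt"


def write_indirect_list(directory, run_date, list_name="sales.txt"):
    """Write the dated file's path into the fixed-name list file."""
    directory = Path(directory)
    target = directory / sales_file_name(run_date)
    if not target.exists():
        raise FileNotFoundError(target)
    (directory / list_name).write_text(f"{target}\n")
    return target
```

The session would then read sales.txt with the Source filetype set to Indirect, as in the second option above.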

13. Solve the below scenario using Informatica and Database SQL.
Source

PRODUCT_ID   PRODUCT_NAME   PRODUCT_PRICE
10           Lux            100
10           Dove           200
20           Cinthol        400
20           Dettol         500
30           Fiama          600

Target

PRODUCT_ID   PRODUCT_NAME   PRODUCT_PRICE   SUM_PRODUCT_PRICE
10           Lux            100             300
10           Dove           200             300
20           Cinthol        400             900
20           Dettol         500             900
30           Fiama          600             600

Answer:

Using Informatica:

In one pipeline, calculate SUM(PRODUCT_PRICE) grouped by PRODUCT_ID using an Aggregator transformation.

In the other flow bring all the data through unchanged, then join the first flow with the second using a
Joiner transformation, with PRODUCT_ID as the join column and an inner join as the join type.

Using SQL:

SELECT M.*, N.SUM_PRODUCT_PRICE
FROM SOURCE M,
     (SELECT SUM(PRODUCT_PRICE) SUM_PRODUCT_PRICE, PRODUCT_ID
      FROM SOURCE
      GROUP BY PRODUCT_ID) N
WHERE M.PRODUCT_ID = N.PRODUCT_ID
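The query can be tried against the sample data with Python's built-in sqlite3 module. This is a minimal sketch: the in-memory database and the function name sum_price_per_product are assumptions, the join is written with explicit JOIN syntax, and an ORDER BY is added only to make the output order deterministic.

```python
import sqlite3


def sum_price_per_product(rows):
    """rows: (PRODUCT_ID, PRODUCT_NAME, PRODUCT_PRICE) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE SOURCE (PRODUCT_ID INTEGER, PRODUCT_NAME TEXT, PRODUCT_PRICE INTEGER)"
    )
    con.executemany("INSERT INTO SOURCE VALUES (?, ?, ?)", rows)
    # Join every row to the per-product total computed in the subquery.
    result = con.execute(
        """
        SELECT M.PRODUCT_ID, M.PRODUCT_NAME, M.PRODUCT_PRICE, N.SUM_PRODUCT_PRICE
        FROM SOURCE M
        JOIN (SELECT PRODUCT_ID, SUM(PRODUCT_PRICE) AS SUM_PRODUCT_PRICE
              FROM SOURCE
              GROUP BY PRODUCT_ID) N
          ON M.PRODUCT_ID = N.PRODUCT_ID
        ORDER BY M.PRODUCT_ID, M.PRODUCT_PRICE
        """
    ).fetchall()
    con.close()
    return result
```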

14. Suppose we have a source with values as below:

EMPNO   ENAME    SAL
1       Tom      100
2       Jack     200
3       Peter    150
4       Donald   230
999     TEST     999
6       Eric     300

If we encounter EMPNO = 999, then the whole record set should not be loaded into the target table. Describe
the approach.

Answer:

From the source create two flows:

1: Source -> Expression -> Sorter

2: Source -> Filter -> Expression -> Sorter

1.1 In the Expression create an output field dummy_M as 'X'.
1.2 Sort on dummy_M.

2.1 In the Filter set the filter condition as EMPNO = 999.
2.2 In the Expression create an output field dummy_D as 'X'.
2.3 Sort on dummy_D.

3. Next use a Joiner transformation:

Set the first flow as Master and the second flow as Detail.
Set the join condition as dummy_M = dummy_D.
Set the join type as Detail Outer Join, which keeps all rows of the master flow.
Use Sorted Input.

4. Next use a Filter transformation:

Set the filter condition as ISNULL(dummy_D).

Rows reach the target only when the second flow is empty, that is, when no record with EMPNO = 999 exists.

And finally your Target.
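The all-or-nothing effect of the outer join and the null check can be sketched in Python. This simulates the logic only; load_unless_sentinel and the sentinel parameter are hypothetical names, and None stands in for the NULL that the outer join produces when the detail flow is empty.

```python
def load_unless_sentinel(rows, sentinel=999):
    """Return all rows, or none of them if any row has EMPNO == sentinel."""
    # Flow 1 (master): every row, tagged dummy_M = 'X'.
    master = [dict(r, dummy_M="X") for r in rows]
    # Flow 2 (detail): only the sentinel rows, tagged dummy_D = 'X'.
    detail = [{"dummy_D": "X"} for r in rows if r["EMPNO"] == sentinel]
    # Outer join keeping all master rows; dummy_D is None when detail is empty.
    joined = []
    for m in master:
        if detail:
            joined.extend(dict(m, dummy_D=d["dummy_D"]) for d in detail)
        else:
            joined.append(dict(m, dummy_D=None))
    # Filter: keep a row only when dummy_D is NULL, then drop the helper fields.
    return [
        {k: v for k, v in j.items() if not k.startswith("dummy_")}
        for j in joined
        if j["dummy_D"] is None
    ]
```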


15. Can we pass the value of a mapping variable between two pipelines in the same mapping?
If not, how can we achieve this?

Answer:

We cannot pass the value of a mapping variable between two pipelines in the same mapping. Mapping
variables are values that can change between sessions. The Integration Service saves the latest value of a
mapping variable to the repository only at the end of each successful session run. When two pipelines sit in
the same mapping, the mapping runs as a single session, so the value of the mapping variable is saved to the
repository only when this session succeeds, that is, after both pipelines have finished executing.

The alternative method to solve this scenario is as below:

1. Split the pipelines into two different mappings, say "map1" and "map2".
2. Create a mapping variable, say "var1", in "map1" and set its value using the SETVARIABLE() function.
Our goal is to pass the value of "var1" at the end of the successful session run to "map2".
3. Create a mapping variable, say "var2", in "map2" and use it wherever the value of "var1" from the first
mapping is required.
4. Create the workflow with a workflow variable, say "wfvar".
5. Create two non-reusable sessions, say "ses1" and "ses2", for "map1" and "map2" respectively.
6. In the Post-session success variable assignment of "ses1", assign the value of the mapping variable
"var1" to the workflow variable "wfvar".
7. In the Pre-session variable assignment of "ses2", assign the value of the workflow variable "wfvar" to
the mapping variable "var2".

With this approach, we can pass the value from the first session to the second session.

16. Scenario Implementation 3
Suppose we have a huge (size in GB) flat file as source. The flat file contains 22 columns, of which 4
columns are considered "key" columns: CUST_SRC_ID, PRODUCT_ID, FF_ID, SNM_ID.

There is one more column in the flat file relevant to the discussion, DATE_ID, which stores a date in
YYYY-MM-DD format.

The flat file contains duplicate records based on the above 4 columns (that is, the records are not entirely
duplicated; some values may differ in the other columns).

The requirement is to choose all the unique records from the flat file based on the uniqueness of the
above-mentioned "keys". If there are duplicate records, we must select the record whose DATE_ID column
contains the latest value. So suppose we get the following records in the flat file:

CUST_SRC_ID   PRODUCT_ID   FF_ID   SNM_ID   DATE_ID      OTHER COLUMNS
123           P1           F1      S1       2013-01-02   X, Y, Z
123           P1           F1      S1       2013-01-06   P, Q, R
123           P1           F1      S1       2013-01-02   S, T, U

In the above case we want the following row in the target:

CUST_SRC_ID   PRODUCT_ID   FF_ID   SNM_ID   DATE_ID      OTHER COLUMNS
123           P1           F1      S1       2013-01-06   P, Q, R


How can we achieve this in a single mapping?

Answer:

Use a Sorter transformation after the Source Qualifier. The sort keys, in order, are:

• CUST_SRC_ID, ascending
• PRODUCT_ID, ascending
• FF_ID, ascending
• SNM_ID, ascending
• DATE_ID, descending

Next use an Expression transformation and create three variable ports and one output port in the order
below:

• V_Keys = CUST_SRC_ID || PRODUCT_ID || FF_ID || SNM_ID
• V_FLAG = IIF (V_Keys != V_Keys_PREV, 1, 0)
• V_Keys_PREV = V_Keys
• O_FLAG = V_FLAG (output port)

Now use a Filter transformation with the filter condition below:

• O_FLAG=1

After sorting, the first record in every group of identical keys has the latest date, because we sorted on
DATE_ID descending. With this expression logic, the first record of every group (the one with the latest
date) gets O_FLAG = 1 and all the others get 0. The Filter transformation then removes the unwanted
duplicate records.
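The sort-then-flag logic can be sketched in Python as follows. It is an illustration rather than Informatica code: latest_per_key is a hypothetical name, rows are dictionaries keyed by the column names from the question, and the key comparison uses a tuple instead of the concatenated string V_Keys.

```python
def latest_per_key(rows, keys=("CUST_SRC_ID", "PRODUCT_ID", "FF_ID", "SNM_ID")):
    """Keep, for each key combination, the row with the latest DATE_ID."""
    # Sorter: keys ascending, DATE_ID descending (two stable sorts).
    ordered = sorted(rows, key=lambda r: r["DATE_ID"], reverse=True)
    ordered = sorted(ordered, key=lambda r: tuple(r[k] for k in keys))
    result = []
    prev = object()  # plays the role of V_Keys_PREV before the first row
    for r in ordered:
        v_keys = tuple(r[k] for k in keys)
        if v_keys != prev:  # V_FLAG = 1: first row of a new key group
            result.append(r)
        prev = v_keys
    return result
```

Because DATE_ID is stored as YYYY-MM-DD, sorting the strings gives the same order as sorting the dates.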

17. Scenario Implementation 4


I have a flat file with just one column as given below:
C1
L1
C2
L2
C3
L3

where data starting with C denotes the company name and data starting with L denotes the location of the
company. I have to load this data into the target table (using Informatica) as:
C1, L1
C2, L2
C3, L3

Answer:

This is what I would do to meet this requirement.

1. After the Source Qualifier, in an Expression transformation, generate the following (this is tricky; use
variable port logic):
a unique sequence number for each group
a unique number for each record within the group
a duplicate of the source column
After the Expression the output will be as below:
Col1, Col2, Col3, Col4
1,1,C1,C1
1,2,L1,L1
2,1,C2,C2
2,2,L2,L2
3,1,C3,C3
3,2,L3,L3

2. Add an Aggregator with
group by on the first column
aggregate expression MAX(Col3, Col2 = 1)
aggregate expression MAX(Col3, Col2 = 2)
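The group numbering and the conditional aggregation can be sketched in Python. This is an illustrative sketch only; pair_company_location is a hypothetical name, and it relies on the assumption stated in the question that company values start with "C" and are each followed by their location.

```python
def pair_company_location(lines):
    """Turn alternating company/location lines into (company, location) pairs."""
    grouped = {}
    group_no = 0
    for value in lines:
        if value.startswith("C"):
            group_no += 1   # a company line opens a new group (Col1)
            pos = 1         # position within the group (Col2)
        else:
            pos = 2
        grouped.setdefault(group_no, {})[pos] = value
    # Aggregator: per group, pick the value at position 1 and at position 2.
    return [(g.get(1), g.get(2)) for _, g in sorted(grouped.items())]
```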

18. Implement slowly changing dimension of Type 2 which will load current record in Current
table and old data in Log table.

Answer:

• Use a Joiner transformation to join the Source and Current tables with a Full Outer Join.
• Next use an Expression transformation to mark each row as new or old, assigning a value such as 0 or 1
to a new output port.
• Pass all the columns to a Router transformation and route the rows based on the new port.
• If the value is 0, use an Update Strategy transformation with DD_INSERT to insert the row into the
Current table.
• If the value is 1, use an Update Strategy transformation with DD_UPDATE to update the row in the
Current table.
• For rows with value 1, also write the existing data from the Current table to the Log table.


26. Performance Tuning

1. Which one is faster Connected or Unconnected Lookup?

Answer:

There can be some very specific situations where an unconnected lookup adds some performance benefit to
the total execution.

If you call the unconnected lookup conditionally (e.g. from an Expression transformation only when some
specific condition is met, as opposed to a connected lookup, which is called for every row), then you might
save some calls to the lookup, thereby marginally improving performance.

The improvement will be more apparent if the data volume is really huge. Keep the "Pre-build Lookup
Cache" option set to "Always disallowed" for the lookup, so that the lookup is not even cached if it is
never called. This technique has other disadvantages, however; see
http://www.dwbiconcepts.com/etl/14-etl-informatica/46-tuning-informatica-lookup.html , especially the
points under the following subheadings:
- Effect of choosing connected OR Unconnected Lookup, and
- WHEN TO set Pre-build Lookup Cache OPTION (AND WHEN NOT TO)

2. How can we improve the performance of the Informatica Normalizer transformation?

Answer:

There is no specific way to improve the performance of a session by using the Normalizer. The Normalizer is
a transformation used to pivot or normalize datasets and has nothing to do with performance. In fact, the
Normalizer does not have much impact on performance (apart from taking a little more memory).

3. How to improve the Session performance?

Answer:
• Run concurrent sessions.
• Partition the session (PowerCenter).
• Tune parameters: DTM buffer pool, buffer block size, index cache size, data cache size, commit
interval, tracing level (Normal, Terse, Verbose Initialization, Verbose Data).
• The session has memory to hold 83 sources and targets. If there are more, the DTM buffer can be
increased.
• The Informatica server uses the index and data caches for Aggregator, Rank, Lookup and Joiner
transformations. The server stores the transformed data from these transformations in the data cache
before returning it to the data flow, and stores group information for them in the index cache. If
the allocated data or index cache is not large enough to store the data, the server stores the data in
a temporary disk file as it processes the session data. Each time the server pages to disk,
performance slows. This can be seen from the counters. Since the data cache is generally larger than
the index cache, it should be allocated more memory than the index cache.
• Remove the staging area.
• Turn off session recovery.
• Reduce error tracing.

4. How do you identify the bottlenecks in Mappings?

Answer:

Bottlenecks can occur in

• Targets - The most common performance bottleneck occurs when the Informatica server writes to a
target database. You can identify a target bottleneck by configuring the session to write to a flat file
target. If session performance increases significantly when you write to a flat file, you have a target
bottleneck.

Solution:

• Drop or disable indexes or constraints.
• Perform a bulk load (bypasses the database log).
• Increase the commit interval (recovery is compromised).
• Tune the database for RBS, dynamic extension etc.

• Sources - Add a Filter transformation after each Source Qualifier and set its condition so that no
records pass through. If the session takes the same time as before, there is a source bottleneck. You can
also identify a source problem with a read test session: copy the mapping, keep only the sources and
Source Qualifiers, remove all transformations and connect them to a flat file target. If the performance
is the same, there is a source bottleneck.

Using a database query - Copy the read query directly from the session log and execute it against the
source database with a query tool. If the time it takes to execute the query and the time to fetch the
first row are significantly different, the query can be modified using optimizer hints.

Solution:

• Optimize queries using hints.
• Use indexes wherever possible.

• Mapping - If both the source and the target are OK, the problem could be in the mapping. Add a Filter
transformation before the target so that no records pass through; if the session takes the same time,
there is a mapping bottleneck. Alternatively, look at the performance monitor in the session property
sheet and view the counters.

Solutions:

• High error-row counts and high rows-in-lookup-cache counts indicate a mapping bottleneck.
• Optimize single-pass reading.
• Optimize the Lookup transformation:
o Caching the lookup table: When caching is enabled, the Informatica server caches the lookup table
and queries the cache during the session. When this option is not enabled, the server queries the
lookup table on a row-by-row basis. Cache types are static, dynamic, shared, unshared and persistent.

o Optimizing the lookup condition: When multiple conditions are used, place the conditions with an
equality sign first.


o Indexing the lookup table: For a cached lookup, index the columns in the lookup ORDER BY statement;
the session log contains the ORDER BY statement. For an un-cached lookup, the server issues a SELECT
statement for each row passing into the Lookup transformation, so it is better to index the lookup
table on the columns in the lookup condition.

• Optimize the Filter transformation: You can improve efficiency by filtering early in the data flow.
Instead of using a Filter transformation halfway through the mapping to remove a sizable amount of data,
use a Source Qualifier filter to remove those same rows at the source. If the filter cannot be moved into
the Source Qualifier, move the Filter transformation as close to the Source Qualifier as possible to remove
unnecessary data early in the data flow.

• Optimize the Aggregator transformation:
o Group by simpler columns, preferably numeric columns.
o Use sorted input. Sorted input decreases the use of aggregate caches. The server assumes all input
data is sorted and performs aggregate calculations as it reads.
o Use incremental aggregation in the session property sheet.

• Optimize the Sequence Generator transformation:
o Try creating a reusable Sequence Generator transformation and use it in multiple mappings.
o The Number of Cached Values property determines the number of values the Informatica server
caches at one time.

• Optimize the Expression transformation:
o Factor out common logic.
o Minimize aggregate function calls.
o Replace common sub-expressions with local variables.
o Use operators instead of functions.

• Sessions - If you do not have a source, target, or mapping bottleneck, you may have a session
bottleneck. You can identify a session bottleneck by using the performance details. The Informatica
server creates performance details when you enable Collect Performance Data on the General tab of the
session properties. Performance details display information about each Source Qualifier, target
definition, and individual transformation. All transformations have some basic counters that indicate the
number of input rows, output rows, and error rows. Any value other than zero in the readfromdisk and
writetodisk counters for Aggregator, Joiner, or Rank transformations indicates a session bottleneck. Low
BufferInput_efficiency and BufferOutput_efficiency counters also indicate a session bottleneck. Small
cache size, low buffer memory, and small commit intervals can cause session bottlenecks.

• System (networks)

5. How do you handle performance issues in Informatica? Where can you monitor the
performance?

Answer:

There are several aspects to performance handling. Some of them are:

• Source tuning
• Target tuning
• Repository tuning
• Session performance tuning
• Incremental change identification on the source side
• Software, hardware (use multiple servers) and network tuning
• Bulk loading
• Use of the appropriate transformation

To monitor this:

• Set performance detail criteria.
• Enable performance monitoring.
• Monitor the session at run time and/or check the performance monitor file.

6. What are performance counters?

Answer:

The performance details provide counters that help you understand the session and mapping efficiency. Each
Source Qualifier, target definition, and individual transformation appears in the performance details,
along with counters that display performance information about each transformation.

Understanding Performance Counters

All transformations have some basic counters that indicate the number of input rows, output rows, and error
rows. Source Qualifiers, Normalizers, and targets have additional counters that indicate the efficiency of
data moving into and out of buffers. You can use these counters to locate performance bottlenecks. Some
transformations have counters specific to their functionality. For example, each Lookup transformation has
a counter that indicates the number of rows stored in the lookup cache. When you read performance details,
the first column displays the transformation name as it appears in the mapping, the second column contains
the counter name, and the third column holds the resulting number or efficiency percentage. When you
partition a source, the Informatica Server generates one set of counters for each partition. The following
performance counters illustrate two partitions for an Expression transformation:

Transformation   Counter                  Value
EXPTRANS [1]     Expression_input rows    8
                 Expression_output rows   8
EXPTRANS [2]     Expression_input rows    16
                 Expression_output rows   16

Note: When you partition a session, the number of aggregate or rank input rows may be different from the
number of output rows from the previous transformation.

7. How can we increase Session Performance?


Answer:

• Minimum logging (Terse tracing level)
• Partitioning the source data
• Performing ETL for each partition in parallel (multiple CPUs are needed for this)
• Adding indexes
• Changing the commit level
• Using a Filter transformation to remove unwanted data movement
• Increasing buffer memory for large volumes of data
• Multiple lookups can reduce performance. Verify the largest lookup table and tune the expressions.
• At session level, the causes are small cache size, low buffer memory and small commit interval.

At system level:

• Windows NT/2000: use the Task Manager
• UNIX: vmstat, iostat

Hierarchy of optimization

• Target
• Source
• Mapping
• Session
• System

Optimizing Target Databases:

• Drop indexes/constraints
• Increase checkpoint intervals
• Use bulk loading/external loading
• Turn off recovery
• Increase database network packet size

Source level

• Optimize the query (for example, its GROUP BY clauses)
• Use conditional filters
• Connect to the RDBMS using the IPC protocol

Mapping

• Optimize data type conversions
• Eliminate transformation errors
• Optimize transformations/expressions

Session

• Concurrent batches
• Partition sessions
• Reduce error tracing
• Tune session parameters

System

• Improve network speed
• Use multiple servers on separate systems
• Reduce paging

8. Scenario Implementation 1
What would be the best approach to update a huge table (more than 200 million records) using Informatica?
The table does not contain any primary key; however, there are a few indexes defined on it. The target table
is partitioned. The source table, on the other hand, contains only a few records (fewer than a thousand)
that will go to the target and update it. Is there any better approach than just using an Update Strategy
transformation?

Answer:

Since the target busy percentage is 99.99%, it is very clear that the bottleneck is on the target, so we
need to tweak the target. There are a couple of options.

1. Since the target table is partitioned on time_id, you need to include time_id in the WHERE clause of the
SQL fired by Informatica. To do that, define the time_id column as a primary key in the target definition.
The update query will then have time_id in its WHERE clause.

2. With the Informatica Update Strategy, an UPDATE statement is fired for every row marked for update. To
avoid multiple UPDATE statements, you can INSERT all the records meant for UPDATE into a temporary table
and then use a correlated SQL statement to update the records in the actual table (the 200M table). This
query can be fired as post-session SQL. Please see the sample SQL:

UPDATE TGT_TABLE U
SET (U.COLUMNS_LIST /* column list to be updated */) =
    (SELECT I.COLUMNS_LIST /* column list to be updated */
     FROM UPD_TABLE I
     WHERE I.KEYS = U.KEYS AND I.TIME_ID = U.TIME_ID)
WHERE EXISTS (SELECT 1 FROM UPD_TABLE I WHERE I.KEYS = U.KEYS AND I.TIME_ID = U.TIME_ID)

TGT_TABLE - actual table with 200M records
UPD_TABLE - table with records meant for UPDATE (1K records)

We need to make sure that the indexes are up to date and statistics are collected. Since this has more to do
with DB performance, you may also need the help of a DBA to check the DB throughput, SQL cost etc.
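The staging-table pattern can be tried on a small scale with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the in-memory database, the single key column K, the single updated column VAL and the function name apply_staged_updates are all hypothetical, and SQLite stands in for the real target database, whose UPDATE syntax and performance characteristics will differ.

```python
import sqlite3


def apply_staged_updates(target_rows, update_rows):
    """Rows are (K, TIME_ID, VAL) tuples; returns the target after the update."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE TGT_TABLE (K INTEGER, TIME_ID INTEGER, VAL TEXT)")
    con.execute("CREATE TABLE UPD_TABLE (K INTEGER, TIME_ID INTEGER, VAL TEXT)")
    con.executemany("INSERT INTO TGT_TABLE VALUES (?, ?, ?)", target_rows)
    # Stage the rows meant for update instead of updating row by row.
    con.executemany("INSERT INTO UPD_TABLE VALUES (?, ?, ?)", update_rows)
    # One correlated UPDATE, as would be fired from post-session SQL.
    con.execute(
        """
        UPDATE TGT_TABLE
        SET VAL = (SELECT I.VAL FROM UPD_TABLE I
                   WHERE I.K = TGT_TABLE.K AND I.TIME_ID = TGT_TABLE.TIME_ID)
        WHERE EXISTS (SELECT 1 FROM UPD_TABLE I
                      WHERE I.K = TGT_TABLE.K AND I.TIME_ID = TGT_TABLE.TIME_ID)
        """
    )
    result = con.execute(
        "SELECT K, TIME_ID, VAL FROM TGT_TABLE ORDER BY K, TIME_ID"
    ).fetchall()
    con.close()
    return result
```

The WHERE EXISTS clause matters: without it, target rows with no staged counterpart would have VAL set to NULL.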
