ODS is nothing but the Operational Data Store, which holds the data from when the
business gets started. That is, it holds the history of data up to yesterday's
data (depending upon customer requirements). Some...
Yes, ODS is an Operational Data Store, and it contains real-time data (because we should
apply any changes to real-time data, right..!). So we dump the real-time data into the ODS,
called the landing area; later we get the data into the staging area, which is the place where
we do all the transformations.
Let's suppose we have some 10,000-odd records in the source system, and
when we load them into the target, how do we ensure that all 10,000 records
loaded to the target don't contain any garbage values? How do...
It requires 2 steps:
As other posts have mentioned, I would do some of the following:
SELECT COLUMN, COUNT(*) FROM TABLE GROUP BY COLUMN ORDER BY COLUMN;
SELECT MIN(COLUMN), MAX(COLUMN) FROM TABL...
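As a concrete illustration of this profiling approach (a minimal sketch, assuming a hypothetical target table tgt_customers with status and order_amount columns), queries like these make garbage values visible as unexpected distinct values or absurd extremes:

-- Value distribution: unexpected or garbage codes stand out in the list
SELECT status, COUNT(*) AS row_count
FROM tgt_customers
GROUP BY status
ORDER BY status;

-- Range check: garbage often shows up as impossible minimum/maximum values
SELECT MIN(order_amount) AS min_amount,
       MAX(order_amount) AS max_amount
FROM tgt_customers;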
An ETL tester primarily tests source data extraction, business transformation logic, and
target table loading. There are many tasks involved in doing the same, which
are given below - 1. Stage tab...
Answered by: radhakrishna on: Nov 4th, 2013
ETL tester responsibilities are: writing SQL queries for various scenarios like count
test, primary key test, duplicate test, attribute test, default check, technical data
quality, business data qua...
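For instance (a minimal sketch, assuming a hypothetical target table tgt_orders keyed by order_id with a defaulted order_status column), the primary key test and default check might be written as:

-- Primary key test: the key column must contain no duplicates
SELECT order_id, COUNT(*) AS dup_count
FROM tgt_orders
GROUP BY order_id
HAVING COUNT(*) > 1;

-- Default check: a defaulted column should never be left unpopulated
SELECT COUNT(*) AS missing_status
FROM tgt_orders
WHERE order_status IS NULL;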
In ETL testing, if the source is a flat file, then how will you verify the data count validation?
Not always (in my experience, quite rarely in fact). Most often the flat file is just
that, a flat file. If in UNIX, the wc command is great; in Windows, one could open the
file in Notepad and CTRL+...
To find the count in a flat file, you have to import that file into an Excel sheet (use Data
-> Import External Data -> Import Data) and find the count of records using COUNT()
(go to Insert -> Function -> COUNT) in Excel....
1. To find faults in software. 2. To verify that software has no defects. 3. To find
performance problems. 4. To give confidence in software. Select the best option.
List of all the ETL and reporting tools available in the market
Can anyone let me know where I can find the list of ETL & reporting tools available in
the market? Thanks in advance.
Hi, I would like to share my knowledge on this. ETL Tools: Ab Initio, BusinessObjects
Data Integrator, IBM InfoSphere DataStage, Informatica, Oracle Warehouse Builder,
SQL Server In...
As an ETL tester, what are the things to test in the Informatica tool? What types of
testing have to be done in Informatica, and what are the things to test in each type?
Is there any documentation on ETL testing?
ETL Testing in Informatica: 1. First, check that the workflow exists in the specified folder. 2.
Run the workflow. If the workflow succeeds, then check that the target table is loaded with
proper data; else we can nee...
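One hedged way to check that "the target table is loaded with proper data" is a set-difference comparison (a sketch, assuming hypothetical tables src_orders and tgt_orders with the same column list; Oracle uses MINUS instead of EXCEPT):

-- Rows present in the source but missing or altered in the target
SELECT order_id, order_amount FROM src_orders
EXCEPT
SELECT order_id, order_amount FROM tgt_orders;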
If you do a dynamic lookup, you should make use of an Update Strategy to mark the
records either to insert or to update in the target using the lookup cache (the lookup will be
cached on the target).
Your requirement is not so clear here, whether to use a dynamic lookup or session
properties.
Note: When you create a mapping with a Lookup transformation that uses a dynamic
lookup cache, you must use Update Strategy transformations to flag the rows for
the target tables.
Data cleansing is the process of detecting and correcting corrupt and inaccurate
data in a table or database.
The following steps are used:
1) Data auditing
2) Workflow specification
3) Workflow execution
4) Post-processing and controlling
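As a small SQL sketch of the detect-and-correct idea (assuming a hypothetical customers table with a phone column and a 10-digit validity rule):

-- Data auditing: detect records with missing or invalid phone numbers
SELECT customer_id, phone
FROM customers
WHERE phone IS NULL
   OR LENGTH(phone) < 10;  -- LENGTH is dialect-dependent (LEN in SQL Server)

-- Correction: standardize a placeholder for the missing values found above
UPDATE customers
SET phone = 'UNKNOWN'
WHERE phone IS NULL;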
If we are using a flat file in our loading, and the flat file name changes daily, how do we
handle this without changing the file name manually every day? For example, the file name
changes depending on the date, so what should I do? Please help…
Use the indirect filelist option in the Informatica session. Say the filelist name is
daily_file.txt; put a shell script daily_filename.sh in the pre-session command. The
content of daily_filename.sh is ...
You can use the Informatica File List option in the target to get dynamic file names, along
with the Transaction Control transformation, so that you can create dynamic
filenames based on some transaction properties.....
The main difference between a connected and an unconnected lookup is that we can call
an unconnected lookup based on some conditions, but not the connected lookup.
A Lookup can be used as connected or unconnected. Apart from caching and receiving
input values from the pipeline, there is one more difference: if you want to use the
same lookup more than once in a mapping ...
How to check the existence of a lookup file in a graph? The requirement is: if the lookup
file is present, then a search will be carried out in it; else a default value
will be set. Please note we need to check the existence of the lookup file at graph
level only.
Simply put:
Mapping: a set of transformations, moving data from source to target along with
the transformations.
Cache files
Aggregator
Joiner
Lookup
Rank
Sorter
Answer posted by staline on 2005-05-27 00:42:44: You can use a Command task to
call the shell scripts in the following ways: 1. Standalone Command task. You can
use a Command task anywhere in the workflow or worklet to run shell commands.
2. Pre- and post-session shell command. You can call...
1. What is a data warehouse?
Bill Inmon is known as the father of data warehousing. His definition: "A data warehouse is a
subject-oriented, integrated, time-variant, non-volatile collection of data in support
of management's decision making process".
Subject-oriented: means that the data addresses a specific subject, such
as sales, inventory, etc.
Time-variant: implies that the data is stored in such a way that changes made
over time are preserved along with their point in time.
Non-volatile: implies that data is never removed, i.e., historical data is
also kept.
In a database we maintain only current data (typically not more than 3 years), but in a
data warehouse we maintain historical data, from the starting day of the enterprise.
DML commands (insert, update, delete) can be run against a database; in a data
warehouse, once the data is loaded, we do not run such operations on it.
A database is used for insert, update, and delete operations, whereas a data warehouse is
used for selects to analyse the data.
Database: tables and joins are complex since they are normalized. Data warehouse: tables
and joins are simple since they are de-normalized.
Database: performance is slow for analytical queries. Data warehouse: high performance
for analytical queries.
Database: uses the OLTP concept. Data warehouse: uses the OLAP concept, meaning it
stores historical data.
A database is a collection of related data. A data warehouse, on the other hand, is a
collection of data integrated from different sources and stored in one container for
making managerial decisions (getting knowledge).
In a database we use CRUD operations, meaning create, read, update, delete; but in a
data warehouse we mainly use the select operation.
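To illustrate the contrast (a sketch with hypothetical tables orders for the OLTP database and dw_sales_fact for the warehouse):

-- OLTP database: row-level CRUD on current data
UPDATE orders SET quantity = 5 WHERE order_id = 1001;

-- Data warehouse: analytical SELECT over historical data
SELECT product_id, SUM(sales_amount) AS total_sales
FROM dw_sales_fact
GROUP BY product_id;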
3. What are the benefits of data warehousing?
4. What are the types of data warehouse?
Data Mart
5. What is the difference between data mining and data warehousing?
In data mining, the operational data is analyzed using statistical techniques and
clustering techniques to find hidden patterns and trends. So, data mining does
a kind of summarization of the data, and its results can be used by data warehouses
for faster analytical processing for business intelligence.
A data warehouse may make use of data mining for analytical processing of the
data in a faster way.
Generally, basic testing concepts remain the same across all domains, so the basic
testing questions will also remain the same. The only addition would be some questions
on the domain; e.g., in the case of ETL testing interview questions, it would be some
concepts of ETL, how-tos on specific types of checks/tests in SQL, and a
set of best practices. Here is the list of some ETL testing interview questions:
Q. 1) What is ETL?
Ans. ETL - extract, transform, and load. Extracting data from outside source
systems. Transforming raw data to make it fit for use by different departments.
Loading transformed data into target systems like a data mart or data warehouse.
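In SQL terms, the transform-and-load steps often amount to a single set-based statement; a minimal sketch, assuming hypothetical staging and warehouse tables stg_sales and dw_sales:

-- Load transformed staging data into the warehouse target
INSERT INTO dw_sales (sale_date, product_id, amount_usd)
SELECT CAST(sale_ts AS DATE),      -- transform: timestamp to calendar date
       product_id,
       amount_cents / 100.0        -- transform: cents to dollars
FROM stg_sales
WHERE amount_cents IS NOT NULL;    -- basic cleansing filter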
Q5) What is the difference between Data Mining and Data Warehousing?
Ans. Data mining - analyzing data from different perspectives and condensing it
into useful decision-making information. It can be used to increase revenue, cut
costs, increase productivity, or improve any business process. There are lots of tools
available in the market for various industries to do data mining. Basically, it is all about
finding correlations or patterns in large relational databases.
Data warehousing comes before data mining. It is the process of compiling and
organizing data into one database from various source systems, whereas data
mining is the process of extracting meaningful data from that database (the data
warehouse).
Production Reconciliation
IT Developer Productivity
Data Integrity
Reconciliation testing: sometimes also referred to as 'source to target count testing'. In this
check, the counts of records in source and target are matched. Although this is not the best
check on its own, it helps in case of a time crunch (see the SQL sketch after this list).
Constraint testing: here the test engineer maps data from source to target and identifies
whether the data is mapped or not. The following are the key checks: UNIQUE, NULL,
NOT NULL, PRIMARY KEY, FOREIGN KEY, DEFAULT, CHECK.
Validation testing (source to target data): generally executed in mission-critical or financial
projects. Here the test engineer validates each data point and matches source to target data.
Testing for duplicate check: done to ensure that there are no duplicate values in unique
columns. Duplicate data can arise for any reason, like a missing primary key; an example
query appears in the sketch after this list.
Testing for attribute check: to check that all attributes of the source system are present in the target table.
Logical or transformation testing: to test for any logical gaps in the transformation logic.
Here, depending upon the scenario, the following methods can be used: boundary value
analysis, equivalence partitioning, comparison testing, error guessing, or sometimes
graph-based testing methods. It also covers testing of lookup conditions.
Incremental and historical data testing: tests to check the data integrity of old and new
data with the addition of new data. It also covers validation of purging-policy-related scenarios.
GUI / navigation testing: To check the navigation or GUI aspects of the front end reports.
In case of ETL or data warehouse testing, re-testing or regression testing is also part of this effort.
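As referenced above, a minimal SQL sketch of the reconciliation, duplicate, and constraint checks (assuming hypothetical tables src_customer and tgt_customer keyed by customer_id; Oracle would need FROM dual on the first query):

-- Reconciliation: source-to-target count comparison
SELECT (SELECT COUNT(*) FROM src_customer) AS src_count,
       (SELECT COUNT(*) FROM tgt_customer) AS tgt_count;

-- Duplicate check on a column expected to be unique
SELECT customer_id, COUNT(*) AS dup_count
FROM tgt_customer
GROUP BY customer_id
HAVING COUNT(*) > 1;

-- NOT NULL constraint check on the key column
SELECT COUNT(*) AS null_keys
FROM tgt_customer
WHERE customer_id IS NULL;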
As these tools help in dealing with huge amounts of data and historical data, it is necessary
to carry out ETL testing. To keep a check on the accuracy of the data, ETL testing is very important.
There are two types of ETL testing available:
Application Testing
Data-Centric Testing
Having a well-defined ETL testing strategy can make the testing process much easier. Hence,
this process needs to be followed before you start the data integration process with the
selected ETL tool.
In this ETL testing process, a group of experts comprising the programming and development
team will start writing SQL statements. The development team may customize them according
to the requirements.
But only if you are well aware of the technical features and applications will you have a
chance of getting hired for this profile. You have to be well prepared on these basic concepts
of ETL tools and their techniques and processes to give your best shot.
Below you can find a few questions and answers which are most frequently asked in ETL
testing interviews:
Q #1. What is ETL?
Ans. ETL refers to extracting, transforming, and loading data from an outside system into
the required place. These are the basic 3 steps in the data integration process. Extracting
means locating the data and pulling it from the source file; transforming is the process of
converting the data into the format required by the target; and loading means placing the
data in the target system in the applicable format.
Q #2. Why is ETL testing required?
Ans.
To keep a check on the data being transferred from one system to the other.
To keep track of the efficiency and speed of the process.
To be well acquainted with the ETL process before it gets implemented into your business and
production.
Q #3. What are an ETL tester's responsibilities?
Ans.
Requires in-depth knowledge of the ETL tools and processes.
Needs to write SQL queries for the various given scenarios during the testing phase.
Should be able to carry out different types of tests, such as primary key and defaults, and
keep a check on the other functionality of the ETL process.
Quality check.
Q #4. What are Dimensions?
Ans. Dimensions are the groups or categories by which the summarized data is sorted.
Q #5. What is the staging area referring to?
Ans. The staging area is the place where the data is stored temporarily in the process of data
integration. Here, the data is cleansed and checked for any duplication.
Q #6. Explain ETL mapping sheets.
Ans. ETL mapping sheets contain all the required information from the source file, including
all the rows and columns. This sheet helps the experts in writing the SQL queries for ETL
tool testing.
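For example (a hedged sketch, assuming a mapping-sheet rule stating that the target column full_name is the source first_name concatenated with last_name):

-- Verify one mapping rule: list rows violating the documented transformation
SELECT s.emp_id
FROM src_employee s
JOIN tgt_employee t ON t.emp_id = s.emp_id
WHERE t.full_name <> s.first_name || ' ' || s.last_name;  -- || is ANSI concatenation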
Q #7. Mention a few test cases and explain them.
Ans.
Mapping doc validation – verifying that the ETL information is provided in the mapping doc.
Data check – every aspect of the data, such as data check, number check, and null check,
is tested in this case.
Correctness issues – misspelled data, inaccurate data, and null data are tested.
Q #8. List a few ETL bugs.
Ans. Calculation bug, user interface bug, source bugs, load condition bug, ECP-related bug.
4. What are snapshots? What are materialized views & where do we use them? What is a materialized view log? - ETL
11. What are the differences between connected and unconnected lookup? - ETL
12. When do we use dynamic cache and static cache in connected and unconnected lookup transformations? Difference
between Stop and Abort - ETL
13. What is tracing level? Which transformations support sorted input? - ETL
14. What are the differences between SQL Override and Update Override? - ETL
16. In Informatica, I have 10 records in my source system. I need to load 2 records to the target for each run. How do I do this? -
ETL
18. How are DTM buffer size and buffer block size related? - ETL
19. State the differences between a shortcut and a reusable transformation - ETL
20. What are the output files that the Informatica server creates during the session run? - ETL
2. Snapshot Fact Table: This type of fact table deals with a particular period of
time. It contains non-additive and semi-additive facts.
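For example, a balance in a snapshot fact table is semi-additive: it can be summed across accounts at one snapshot date, but not across dates (a sketch with a hypothetical acct_balance_snapshot table):

-- Valid: sum a semi-additive fact across accounts at one point in time
SELECT snapshot_date, SUM(balance) AS total_balance
FROM acct_balance_snapshot
WHERE snapshot_date = DATE '2013-12-31'
GROUP BY snapshot_date;

-- Summing balance across dates is meaningless; average over time instead
SELECT account_id, AVG(balance) AS avg_balance
FROM acct_balance_snapshot
GROUP BY account_id;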