You can enable bulk loading when you load to Sybase, Oracle, or Microsoft SQL Server. If you
enable bulk loading for other database types, the Informatica Server reverts to a normal load.
Bulk loading improves the performance of a session that inserts a large amount of data to the
target database.
When bulk loading, the Informatica Server invokes the database bulk utility and
bypasses the database log, which speeds performance. Without writing to the database log,
however, the target database cannot perform rollback. As a result, you may not be able to
perform recovery.
You must drop indexes and constraints in the target tables before running a bulk load
session. After the session completes, you can rebuild them.
Note: When loading to Microsoft SQL Server and Oracle targets, you must specify a normal
load if you select data driven for the Treat Source Rows As session property. When you specify
bulk mode and data driven, the Informatica Server fails the session.
When bulk loading to Sybase targets, the Informatica Server ignores the commit interval you
define in the session properties and commits data when the writer block is full.
When bulk loading to Microsoft SQL Server and Oracle targets, the Informatica Server
commits data at each commit interval. Also, Microsoft SQL Server and Oracle start a new bulk
load transaction after each commit.
Tip: When bulk loading to Microsoft SQL Server or Oracle targets, define a large commit
interval to reduce the number of bulk load transactions and increase performance.
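To see why the commit interval matters, note that every commit closes one bulk load transaction and opens another, so the number of transactions is the row count divided by the interval, rounded up. A minimal sketch (the function name is illustrative, not part of PowerCenter):

import math

def bulk_load_transactions(row_count: int, commit_interval: int) -> int:
    # Each commit ends one bulk load transaction and starts the next.
    return math.ceil(row_count / commit_interval)

# 10 million rows: a 10,000-row interval needs 1,000 bulk load transactions,
# while a 1,000,000-row interval needs only 10.
print(bulk_load_transactions(10_000_000, 10_000))     # 1000
print(bulk_load_transactions(10_000_000, 1_000_000))  # 10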
If one Source Qualifier provides data for multiple targets, you can enable constraint-based
loading in a session to have the Informatica Server load data based on target table primary and
foreign key relationships.
When you select this option, the Informatica Server orders the target load on a row-by-row
basis. For every row generated by an active source, the Informatica Server loads the
corresponding transformed row first to the primary key table, then to any foreign key tables
(see the sketch after the requirements list below).
Constraint-based loading depends on the following requirements:
Active source. Related target tables must have the same active source.
Key relationships. Target tables must have key relationships.
Target connection groups. Targets must be in one target connection group.
Treat rows as insert. Use this option when you insert into the target. You cannot use
updates with constraint-based loading.
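The load ordering itself amounts to a topological sort of the targets along their key relationships. The following is a minimal sketch of that idea, not Informatica's actual implementation; the table names and foreign keys are hypothetical:

from graphlib import TopologicalSorter

# Map each target to the primary key tables it references through foreign keys.
fk_references = {
    "CUSTOMERS": set(),          # primary key table, loaded first
    "ORDERS": {"CUSTOMERS"},     # ORDERS.customer_id -> CUSTOMERS.id
    "PAYMENTS": {"ORDERS"},      # PAYMENTS.order_id  -> ORDERS.id
}

# Dependencies come first, so primary key tables load before foreign key tables.
print(list(TopologicalSorter(fk_references).static_order()))
# ['CUSTOMERS', 'ORDERS', 'PAYMENTS']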
3. What is a target connection group?
A target connection group is the group of targets that the Informatica Server uses to determine commits and loading. Targets in the same target connection group meet the following criteria:
Belong to the same partition.
Belong to the same target load order group.
Have the same target type in the session.
Have the same database connection name for relational targets.
Have the same target load type, either normal or bulk mode.
Differences between connected and unconnected Lookup transformations include:
Connected Lookup: You can use a dynamic or static cache. It can return multiple columns from the same row or insert into the dynamic lookup cache.
Unconnected Lookup: You can use a static cache. You designate one return port (R), and it returns one column from each row.
10. What are the differences between MAPPLET & REUSABLE TRANSFORMATION?
A mapplet is a reusable object that contains a set of transformations, letting you reuse that whole block of transformation logic in multiple mappings; a reusable transformation is a single transformation that you can reuse in multiple mappings.
We cannot use a Joiner transformation in the following situations:
Both input pipelines originate from the same Source Qualifier transformation.
Both input pipelines originate from the same Normalizer transformation.
Both input pipelines originate from the same Joiner transformation.
Either input pipeline contains an Update Strategy transformation.
You connect a Sequence Generator transformation directly before the Joiner
transformation.
You can use the Stored Procedure transformation to perform the following tasks (a sketch follows the list):
Check the status of a target database before loading data into it.
Determine if enough space exists in a database.
Perform a specialized calculation.
Drop and recreate indexes.
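The last task pairs naturally with the bulk loading advice earlier: drop indexes before the load and recreate them afterward. The following is a hedged sketch using the standard Python DB-API; the connection, table, and index names are placeholders, and the inline DDL stands in for what a pre- and post-load stored procedure would execute:

def load_with_index_maintenance(conn, rows):
    cur = conn.cursor()
    cur.execute("DROP INDEX idx_orders_customer")    # pre-load step
    cur.executemany(
        "INSERT INTO orders (id, customer_id, amount) VALUES (?, ?, ?)",
        rows,
    )
    # Post-load step: rebuild the index after the data is in place.
    cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    conn.commit()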
18. When do we use LOOKUP transformation?
You can use the Lookup transformation to perform many tasks, including:
Get a related value. For example, your source table includes the employee ID, but you
want to include the employee name in your target table to make your summary data
easier to read.
Perform a calculation. Many normalized tables include values used in a calculation,
such as gross sales per invoice or sales tax, but not the calculated value (such as net
sales).
Update slowly changing dimension tables. You can use a Lookup transformation to
determine whether records already exist in the target.
You can use the Source Qualifier to perform the following tasks:
Create a custom query to issue a special SELECT statement for the Informatica Server
to read source data.
You can perform the following tasks with a Sequence Generator transformation (a small sketch follows the list):
Create keys.
Replace missing values.
Cycle through a sequential range of numbers.
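A minimal generator sketch of these three tasks; the function and range are illustrative:

def sequence_generator(start=1, end=1000, cycle=False):
    # Yields keys from start to end; restarts at start when Cycle is enabled.
    value = start
    while True:
        yield value
        value += 1
        if value > end:
            if not cycle:
                return  # range exhausted
            value = start

nextval = sequence_generator(start=1, end=3, cycle=True)
print([next(nextval) for _ in range(7)])   # [1, 2, 3, 1, 2, 3, 1]

# Replace missing values: assign a generated key where one is absent.
keys = sequence_generator()
row = {"employee_id": None, "name": "Smith"}
if row["employee_id"] is None:
    row["employee_id"] = next(keys)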
24. What are the output files created or used by the Informatica server?
The Informatica server creates a log for all status and error messages. On UNIX, the
default name of the Informatica Server log file is pmserver.log. On Windows, the Informatica
Server logs status and error messages in the event log. Use the Event Viewer to access those
messages.
The Informatica Server creates a workflow log file for each workflow it runs. It writes
information in the workflow log such as initialization of processes, workflow task run
information, errors encountered, and workflow run summary.
The Informatica Server creates a session log file for each session it runs. It writes
information in the session log such as initialization of processes, session validation, creation of
SQL commands for reader and writer threads, errors encountered, and load summary. The
amount of detail in the session log depends on the tracing level that you set.
When you run a session, the Workflow Manager creates session details that provide load
statistics for each target in the mapping. You can monitor session details during the session or
after the session completes. Session details include information such as table name, number of
rows written or rejected, and read and write throughput. You can view this information by
double-clicking the session in the Workflow Monitor.
Performance details log: (file name: session_name.perf & dir: $PMSessionLogDir)
The Informatica Server can create a set of information known as session performance
details to help determine where performance can be improved. Performance details provide
transformation-by-transformation information on the flow of data through the session.
The Informatica Server creates a reject file for each target in the session. The reject file
contains rows of data that the writer does not write to targets.
When you run a session that uses an external loader, the Informatica Server creates a
control file and a target flat file. The control file contains information about the target flat file
such as data format and loading instructions for the external loader.
If you use a flat file as a target, you can configure the Informatica Server to create an
indicator file for target row type information. For each target row, the indicator file contains a
number to indicate whether the row was marked for insert, update, delete, or reject.
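As a sketch of the indicator file idea: one numeric row-type code is written per target row. The 0/1/2/3 coding below is an assumption for illustration only; check the documentation for your PowerCenter version for the exact codes.

ROW_TYPE = {"insert": 0, "update": 1, "delete": 2, "reject": 3}  # assumed codes

rows = [("101,Smith", "insert"), ("102,Jones", "update")]
with open("target.out", "w") as data, open("target.ind", "w") as ind:
    for record, row_type in rows:
        data.write(record + "\n")
        ind.write(str(ROW_TYPE[row_type]) + "\n")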
If the session writes to a target file, the Informatica Server creates the target file based
on the file target definition.
Cache files log: (file name: PM*.idx and PM*.dat & dir: $PMCacheDir)
The Informatica Server creates index and data cache files for the following
transformations in a mapping:
Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation
The Informatica Server writes to the index and data cache files during the session in the
following cases:
The mapping contains one or more Aggregator transformations, and the session is configured for incremental aggregation.
The mapping contains a Lookup transformation that is configured to use a persistent lookup cache, and the Informatica Server runs the session for the first time.
The Informatica Server runs out of cache memory and pages to the local cache files.
When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY
table and notes the row ID of the last row committed to the target database. The Informatica
Server then reads all sources again and starts processing from the next row ID.
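Conceptually, recovery is a filter on the source read: rows at or below the last committed row ID are skipped. A toy sketch, with the OPB_SRVR_RECOVERY lookup reduced to a plain argument:

def recover(source_rows, last_committed_row_id):
    # Yield only the rows that still need to be loaded after a failure.
    for row_id, row in enumerate(source_rows, start=1):
        if row_id > last_committed_row_id:
            yield row

# The first two rows were committed before the failure, so only row 3 loads.
print(list(recover(["a", "b", "c"], last_committed_row_id=2)))  # ['c']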
When you run a session with a Joiner transformation, the Informatica Server reads all the rows
from the master source and builds index and data caches based on the master rows. Because the
caches hold only the master source rows, you should specify the source with fewer rows as the
master source. After building the caches, the Joiner transformation reads rows from the detail
source and performs joins.
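This is essentially a hash join: cache the smaller (master) source, then stream the detail source against the cache. A simplified sketch with hypothetical column names:

def joiner(master_rows, detail_rows, key):
    cache = {}                                  # built from master rows only
    for row in master_rows:
        cache.setdefault(row[key], []).append(row)
    for detail in detail_rows:                  # detail rows are streamed, not cached
        for master in cache.get(detail[key], []):
            yield {**master, **detail}

master = [{"dept_id": 1, "dept": "Sales"}]
detail = [{"dept_id": 1, "emp": "Smith"}, {"dept_id": 2, "emp": "Jones"}]
print(list(joiner(master, detail, "dept_id")))
# [{'dept_id': 1, 'dept': 'Sales', 'emp': 'Smith'}]   (normal/inner join shown)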
When you run a workflow that uses an Aggregator transformation, the Informatica Server
creates index and data caches in memory to process the transformation. If the Informatica
Server requires more space, it stores overflow values in cache files.
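A toy version of that overflow behavior, with in-memory groups spilling to a temporary file once a limit is hit; the limit and the final merge step are simplified away from what the server actually does:

import json, tempfile

def aggregate(rows, key, value, max_groups=2):
    cache, spill = {}, tempfile.TemporaryFile(mode="w+")
    for row in rows:
        k = row[key]
        if k in cache or len(cache) < max_groups:
            cache[k] = cache.get(k, 0) + row[value]   # in-memory cache
        else:
            spill.write(json.dumps(row) + "\n")       # overflow to a cache file
    # A real implementation merges the spilled rows back in afterward.
    return cache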
During a workflow, the Informatica Server compares an input row with rows in the data cache.
If the input row out-ranks a cached row, the Informatica Server replaces the cached row with
the input row. If the Rank transformation is configured to rank across multiple groups, the
Informatica Server ranks incrementally for each group it finds.
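The rank cache is a bounded top-N structure per group. A sketch using a heap (the tuple index i only breaks ties between equal rank values):

import heapq
from collections import defaultdict

def top_n_per_group(rows, group_key, rank_key, n=2):
    heaps = defaultdict(list)                    # one cache per group
    for i, row in enumerate(rows):
        heap = heaps[row[group_key]]
        entry = (row[rank_key], i, row)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:              # input row out-ranks cached row
            heapq.heapreplace(heap, entry)       # replace the lowest cached row
    return {g: [r for _, _, r in sorted(h, reverse=True)]
            for g, h in heaps.items()}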
The Informatica Server builds a cache in memory when it processes the first row of data in a
cached Lookup transformation. It allocates memory for the cache based on the amount you
configure in the transformation or session properties. The Informatica Server stores condition
values in the index cache and output values in the data cache.
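A sketch of that two-cache layout, reusing the "employee name by ID" example from the Lookup discussion above; the split between the two dicts mirrors the index/data cache division:

index_cache = {}   # condition values -> row pointer
data_cache = {}    # row pointer -> output values

for ptr, (emp_id, name, dept) in enumerate(
        [(10, "Smith", "Sales"), (20, "Jones", "HR")]):
    index_cache[emp_id] = ptr          # lookup condition: employee ID
    data_cache[ptr] = (name, dept)     # returned output values

def lookup(emp_id):
    ptr = index_cache.get(emp_id)
    return data_cache[ptr] if ptr is not None else None

print(lookup(10))   # ('Smith', 'Sales')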
It refers to the logical organization of data used for analysis in OLAP applications. This logical
organization is generally specialized for the most efficient data representation and access by the
end users of the OLAP applications.
31. What is meant by aggregation?
It refers to a pre-stored summary of data, or a grouping of detailed data, that satisfies a
business rule.
Slowly growing dimensions are dimension tables that have slowly increasing data without
updates to existing dimensions. We maintain these dimensions by appending new data to the
existing table.
Slowly changing dimensions are dimension tables that have slowly increasing data as well as
updates to existing dimensions. When updating existing dimensions, we decide whether to keep
all historical data, no historical data, or just the current and previous versions of the
dimension.
Mapping wizards are designed to create mappings for loading and maintaining star schemas, a
series of dimension tables related to a central fact table.
It loads static fact or dimension tables by inserting all rows. We use this mapping when we
want to drop existing data before loading new data.
It filters the source rows based on user-defined comparisons and inserts only those found to
be new to the target. We use this mapping to determine which source rows are new and to load
them to an existing target table when the existing target does not require updates.
It filters the source rows based on user-defined comparisons and inserts rows found to be new
to the target. Rows containing changes to existing dimensions are updated in the target by
overwriting the existing rows. When we use this mapping, the Designer automatically creates
an additional column called PM_PRIMARYKEY in the target.
It filters the source rows based on user-defined comparisons and inserts rows found to be new
to the target. Changes are tracked in the target by flagging the current version of each
dimension and versioning the primary key. When we use this mapping, the Designer
automatically creates two additional columns called PM_PRIMARYKEY and
PM_CURRENT_FLAG in the target.
It filters the source rows based on user-defined comparisons and inserts rows found to be new
to the target. Changes are tracked in the target by maintaining an effective date range for
each version of each dimension. When we use this mapping, the Designer automatically creates
three additional columns called PM_PRIMARYKEY, PM_BEGIN_DATE, and PM_END_DATE
in the target.
It filters the source rows based on user-defined comparisons and inserts rows found to be new
to the target. The Informatica Server tracks changes by saving the existing data in different
columns of the same row and replacing the existing data with the updates. The Informatica
Server optionally enters the system date as a timestamp for each row it inserts or updates.
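The wizard behaviors above reduce to a few lines of update logic. The following is a compact, hypothetical sketch with plain dicts standing in for the target table; the real mappings implement this through Lookup, Filter, Expression, and Update Strategy transformations, and the PM_* column handling is simplified:

import itertools

def scd_type1(target, row, key):
    # Type 1: overwrite the existing dimension; no history is kept.
    target[row[key]] = dict(row)

def scd_type2_flag(target, row, key, pk_seq):
    # Type 2 (flag current): insert a new version and flag it as current.
    for version in target.setdefault(row[key], []):
        version["PM_CURRENT_FLAG"] = 0             # retire older versions
    target[row[key]].append(
        {**row, "PM_PRIMARYKEY": next(pk_seq), "PM_CURRENT_FLAG": 1})

def scd_type3(target, row, key, col):
    # Type 3: save the existing value in another column of the same row,
    # then overwrite it with the update ("PM_PREV_" naming is illustrative).
    current = target.get(row[key])
    if current is None:
        target[row[key]] = dict(row)
    else:
        current["PM_PREV_" + col] = current[col]
        current[col] = row[col]

pk, dim = itertools.count(1), {}
scd_type2_flag(dim, {"emp_id": 7, "dept": "Sales"}, "emp_id", pk)
scd_type2_flag(dim, {"emp_id": 7, "dept": "HR"}, "emp_id", pk)
# dim[7][0] is the retired Sales version; dim[7][1] is the current HR version.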