ETL (Extraction, Transformation and Load)

ETL is a way of moving data from one environment to another to meet an
organization’s business needs.
ETL is also described as a process that extracts raw data from various systems,
transforms it into valuable information and loads it for business use.
Informatica is a tool used for the ETL process.
Informatica provides an environment or platform that allows one to move data
from a variety of systems, transform and integrate the data, and store it in a
centralized location such as a data mart, a data warehouse or an operational
data store (ODS).
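
The three ETL steps described above can be illustrated with a minimal Python sketch. It is not Informatica code: the sample CSV text, the `sales` table and the function names are all assumptions made for illustration, and an in-memory SQLite database stands in for the centralized target.

```python
import csv
import io
import sqlite3

# Hypothetical raw source data; a real source would be a file or a database.
RAW_CSV = """id,name,amount
1, alice ,10.50
2,BOB,20
3,carol,
"""

def extract(text):
    """Extract: read raw rows from a delimited source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy names, convert types, drop rows without an amount."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # incomplete record, skipped in this sketch
        cleaned.append(
            (int(row["id"]), row["name"].strip().title(), float(row["amount"]))
        )
    return cleaned

def load(rows, conn):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

def run_etl(text, conn):
    """Run the three steps in order and return what ended up in the target."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
```

Calling `run_etl(RAW_CSV, sqlite3.connect(":memory:"))` loads the two complete records and skips the one without an amount.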

Informatica Main Components:


Informatica Repository
Informatica Client
Informatica Server

Informatica Repository:
Created on a database (e.g. Oracle, Sybase).
The center of the Informatica suite. Tables within the repository database
hold the metadata.
Other components of the tool access the repository to save and retrieve
metadata.

Informatica Client:
Comprises three applications:
1. Repository Manager:
Used to create and administer the metadata repository, e.g. to create users
and to manage folders and locks.
Used to assign privileges and permissions.
Used to generate repository reports.

2. Designer:
Used to create mappings that contain the transformation instructions for the
Informatica server.
The Designer has five tools that are used to analyze sources, design
target schemas, and build the source-to-target mappings.

Designer tools:
Source Analyzer: Import or create source definitions.
Warehouse Designer: Import or create target definitions.
Transformation Developer: Develop reusable transformations to use in
mappings.
Mapplet Designer: Create sets of transformations to use in mappings.
Mapping Designer: Create mappings that the Informatica server uses to extract,
transform and load data.

3. Server Manager:
Used to create, schedule, execute and monitor sessions.
A session is created from a mapping in the repository and scheduled to run
against an Informatica server.
One can also view scheduled and running sessions through the domain.

The session properties display all the information that the Informatica server
is going to use to move the data:
The source information, including the database.
The target information, including type and database.
The memory size it is going to use for the data transformation.

Informatica Server:
Reads mapping and session information from the repository.
Extracts data from the mapping sources and stores the data in memory while it
applies the transformation rules that were configured in the mapping.
Loads the transformed data into the configured mapping targets.

Repository Objects:
Repository objects are created using the Informatica client tools. Some of the
objects are:
Source Definitions: Definitions of database objects or files that provide the
source data.
Informatica PowerCenter accesses the following sources:
Relational: Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, Microsoft
Access, Teradata.
Files: Fixed-width and delimited flat files, XML files, Excel and COBOL files.
Extended Apps: PeopleSoft, SAP R/3, Siebel, JD Edwards and IBM MQSeries.
Mainframe: MVS and DB2.

Target Definitions: Definitions of database objects or files that contain or
will contain the target or final data.
Informatica PowerCenter can load data to the following targets:
Relational: Oracle, Sybase, Informix, Sybase IQ, IBM DB2, Microsoft SQL Server,
Microsoft Access and Teradata.
Files: Fixed-width and delimited files, XML.
Extended Apps: PeopleSoft, SAP R/3, Siebel, JD Edwards and IBM MQSeries.
Mainframe: MVS and DB2.

Mapping:
A set of source and target definitions linked by transformations that contain
the business logic. These are the instructions that the Informatica server uses
to transform and move data.

Transformations:
A transformation object is any object in a mapping that generates or modifies
data. A transformation is added to a mapping to transform the data to meet the
required business logic.
Examples of transformation objects are:
Expression: Evaluates and calculates a value.
Filter: Filters records based on a condition.
Joiner: Joins data from different databases or file systems.
Lookup: Looks up values.
Normalizer: Normalizes flat records, including those sourced from COBOL.
Router: Routes data into multiple transformations based on expressions
and filters.
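
To make the transformation types listed above more concrete, here is a rough Python sketch in which rows are plain dictionaries and each function imitates one transformation. It is only an analogy, not Informatica's implementation; the field names (`qty`, `price`, `cust_id`) and all function names are hypothetical.

```python
def expression(rows):
    """Expression: calculate a new value for each row."""
    return [{**r, "total": r["qty"] * r["price"]} for r in rows]

def filter_rows(rows, condition):
    """Filter: keep only the rows that satisfy the condition."""
    return [r for r in rows if condition(r)]

def lookup(rows, table, key, field, default=None):
    """Lookup: fetch a value from a reference table by key."""
    return [{**r, field: table.get(r[key], default)} for r in rows]

def joiner(left, right, key):
    """Joiner: inner-join two row sets from different sources on a key."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def normalizer(rows, columns, value_field):
    """Normalizer: turn repeating columns into one output row per occurrence."""
    out = []
    for r in rows:
        base = {k: v for k, v in r.items() if k not in columns}
        for i, col in enumerate(columns, start=1):
            out.append({**base, "occurrence": i, value_field: r[col]})
    return out

def router(rows, groups):
    """Router: send each row to every group whose condition it meets.

    Rows that match no group end up in the 'default' group.
    """
    out = {name: [] for name in groups}
    out["default"] = []
    for r in rows:
        matched = False
        for name, cond in groups.items():
            if cond(r):
                out[name].append(r)
                matched = True
        if not matched:
            out["default"].append(r)
    return out

# Hypothetical sample data chained through several transformations.
orders = [
    {"order_id": 1, "cust_id": "C1", "qty": 2, "price": 5.0},
    {"order_id": 2, "cust_id": "C2", "qty": 0, "price": 9.0},
    {"order_id": 3, "cust_id": "C1", "qty": 10, "price": 20.0},
]
customers = {"C1": "Acme", "C2": "Globex"}

rows = expression(orders)
rows = filter_rows(rows, lambda r: r["qty"] > 0)
rows = lookup(rows, customers, "cust_id", "cust_name")
routed = router(rows, {"large": lambda r: r["total"] >= 100})
```

In this sketch order 2 is filtered out, order 3 is routed to the `large` group and order 1 falls through to `default`.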

Mapplet:
A mapplet is a reusable object that represents a set of transformations.
A mapplet allows one to build logic into source definitions and transformations
that can be reused in multiple mappings.
Think of a mapplet as a function: you supply some data and it returns an answer.
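
The function analogy can be taken literally in a short Python sketch: a fixed sequence of row-level steps is bundled once and then reused by two different mappings. All names here (`make_mapplet`, `standardize`, the row fields) are hypothetical and only illustrate the idea of reuse.

```python
def clean_name(row):
    """Step 1: trim and capitalize the name."""
    return {**row, "name": row["name"].strip().title()}

def add_full_price(row):
    """Step 2: derive a price including tax."""
    return {**row, "full_price": round(row["price"] * (1 + row["tax_rate"]), 2)}

def make_mapplet(*steps):
    """Bundle a fixed sequence of row-level steps into one reusable unit."""
    def mapplet(rows):
        out = []
        for row in rows:
            for step in steps:
                row = step(row)
            out.append(row)
        return out
    return mapplet

# The reusable "mapplet": defined once, used in several mappings.
standardize = make_mapplet(clean_name, add_full_price)

def products_mapping(rows):
    """One mapping: standardize, then keep only priced products."""
    return [r for r in standardize(rows) if r["full_price"] > 0]

def catalog_mapping(rows):
    """Another mapping: standardize, then sort by name."""
    return sorted(standardize(rows), key=lambda r: r["name"])
```

Changing `standardize` in one place changes the behavior of both mappings, which is the point of building shared logic into a mapplet.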

Slowly changing dimension:

A slowly changing dimension is a dimension table that is loaded according to
how much historical data needs to be kept and how that history is handled.

There are 3 types:

Type 1: Loads by inserting new dimensions and overwriting the existing
dimensions. It is used when the history of the previous dimension is not needed.
Type 2: Loads by inserting new and changed dimensions as new rows, using an
incremental primary key to track the changes.
Type 3: Loads by inserting new dimensions and updating values in existing
dimensions, typically keeping the previous value in a separate column so that
only limited history is retained.
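
The difference between the three types is easier to see side by side. The Python sketch below is an illustration under stated assumptions, not Informatica's generated logic: the dimension is held in memory, and the column names (`surrogate_key`, `natural_key`, `is_current`, `start_date`, `end_date`, `previous_<column>`) are hypothetical conventions chosen for the example.

```python
from datetime import date

def scd_type1(dim, key, new_attrs):
    """Type 1: insert new members and overwrite existing ones (no history)."""
    dim[key] = dict(new_attrs)

def scd_type2(dim_rows, key, new_attrs, today):
    """Type 2: keep full history; every version gets a new incremental key."""
    current = next(
        (r for r in dim_rows if r["natural_key"] == key and r["is_current"]),
        None,
    )
    if current is not None:
        if {k: current[k] for k in new_attrs} == new_attrs:
            return  # nothing changed, keep the current version
        current["is_current"] = False  # expire the old version
        current["end_date"] = today
    next_key = max((r["surrogate_key"] for r in dim_rows), default=0) + 1
    dim_rows.append({
        "surrogate_key": next_key,
        "natural_key": key,
        **new_attrs,
        "start_date": today,
        "end_date": None,
        "is_current": True,
    })

def scd_type3(dim, key, column, new_value):
    """Type 3: keep the current and the previous value in separate columns."""
    row = dim.get(key)
    if row is None:
        dim[key] = {column: new_value, "previous_" + column: None}
    elif row[column] != new_value:
        row["previous_" + column] = row[column]
        row[column] = new_value
```

For a customer who moves from Boston to Denver, `scd_type1` leaves one row holding Denver, `scd_type2` leaves two rows (an expired Boston row and a current Denver row), and `scd_type3` leaves one row holding Denver with Boston in the previous-value column.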

Teradata MultiLoad:
Supervalu uses MultiLoad to push the final output of data from Informatica to its
Teradata data warehouse.
MultiLoad is a table-maintenance utility.
It allows all base SQL/DML functions.
It can work on up to five tables at a time.
It can handle multiple input files concurrently.
It has a full restart capability.

Data mart: A logical subset of a compl…
Data warehouse: A queryable source of data …
Operational Data Store (ODS): An operational system of re…
Repository: Metadata used by the Inform…
Metadata: All of the information in the …
Source: A place where the data to b…
Target: A place that will hold the pro…
Staging area: A storage area that is used … archive and prepare source …