3
1. Access
Access
All data must be accessible, regardless of its source or structure.
Data must be extracted from many kinds of systems:
– Mainframe systems
– Relational databases
– Applications
– XML
– Messages
– Spreadsheets
The Informatica products used for Data Access are:
Informatica PowerExchange
Informatica B2B Data Exchange
4
Informatica PowerExchange
5
Access Major Enterprise and Packaged Applications (Supported Applications)
6
Database and Data Warehouse Environment
7
Get Secure, Scalable, Real-Time Access to
Mainframe Data
8
Message-Oriented Middleware
9
PowerExchange Features
Accelerated Access to Enterprise Application Data
10
PowerExchange Features
11
Informatica B2B Data Exchange
12
Informatica B2B Data Exchange Features
Provides a single, centralized, consistent, and reusable data transformation service that enables true “any to any” data transformation

Universal Data Transformation features full support for complex flat files, including deep hierarchy, complex looping, delimited, and fixed- and variable-width data

Supplies unique support for:
− Binary documents (e.g., PDF, Excel, Word)
− Printing formats (e.g., AFP and PostScript)
− Large batch files and real-time messages

Provides native support for XML and for complex hierarchical industry standards:
HIPAA, HL7, NCPDP, ACORD, DTCC, MVR, EDI, EDIFACT, SWIFT, FIX, NACHA, Telekurs
13
Data Integration Life cycle ( Discover )
14
Informatica Data Explorer
16
Data Integration Life cycle ( Cleanse )
17
Informatica Data Quality
18
Data Quality Features
19
Data Quality Features
20
Data Quality Features
22
Informatica PowerCenter
23
Development Life Cycle (Deliver)
The right data must be delivered in the right format, at the right time,
to all the applications and users that need it.
Delivering data can range from a single data element or record in
support of a real-time business operation to millions of records for
trend analysis and enterprise reporting.
It also involves delivering inactive data to archives/history databases and provisioning masked subsets of production data for non-production systems.
Data must be both highly available and secure in its delivery.
Learn more
Informatica PowerExchange
Informatica B2B Data Exchange
Informatica Data Archive
24
Development Life Cycle
Audit, Manage, and Monitor
Data stewards and IT administrators need to collaborate to audit, manage, and monitor
data. Key metrics, such as data quality, are constantly measured with an eye toward
steady improvement over time.
The goal is to track progress on key data attributes and flag any new issues for resolution
and continual improvement once data is fed back into the data integration life cycle.
The Informatica Platform provides shared metadata to document where your data is, as well as the business rules and logic associated with your data. The Platform shows the impact of potential changes, which helps all roles respond more quickly and cost-effectively to change.
Define, Design, and Develop
Business analysts, data architects, and IT developers need a powerful set of tools to help
them collaborate on defining, designing, and developing data integration rules and
processes.
The Informatica Platform includes a common set of integrated tools to make sure everyone is working together effectively. The Platform also ensures that metadata is shared and consistent across all data integration roles.
25
Informatica PowerCenter
Course Objectives
27
Extract, Transform, and Load
Diagram: data is extracted from operational systems (RDBMS, mainframe, and other sources), transformed, and loaded (ETL) into the data warehouse for decision support.
28
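The Extract, Transform, and Load flow above can be sketched in miniature (an illustrative Python sketch with invented sample rows and rules, not PowerCenter's actual engine):

```python
# Minimal illustration of the Extract -> Transform -> Load pattern.
# The "operational system" is a list of dicts; the "warehouse" another list.

def extract(source):
    """Read raw rows from the operational source."""
    return list(source)

def transform(rows):
    """Apply business rules: normalize names, derive a column."""
    out = []
    for r in rows:
        out.append({
            "customer": r["customer"].strip().upper(),
            "revenue": r["qty"] * r["unit_price"],
        })
    return out

def load(rows, warehouse):
    """Append transformed rows to the target warehouse table."""
    warehouse.extend(rows)

operational = [{"customer": " acme ", "qty": 3, "unit_price": 10.0}]
warehouse = []
load(transform(extract(operational)), warehouse)
print(warehouse)  # [{'customer': 'ACME', 'revenue': 30.0}]
```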
PowerCenter Architecture
29
PowerCenter Architecture
Diagram: the PowerCenter Server connects natively to heterogeneous sources and targets. It communicates over TCP/IP with the Repository Server, whose Repository Agent accesses the Repository natively. The client tools (Designer, Repository Manager, Workflow Manager, Workflow Monitor, and Repository Server Administrative Console) also connect over TCP/IP.
Not Shown: Client ODBC Connections for Source and Target metadata
30
PowerCenter 8.6 Components
PowerCenter Domain
PowerCenter Node
PowerCenter Repository Services
PowerCenter Integration Services
PowerCenter Reporting Services
PowerCenter Client
• Designer
• Repository Manager
• Administration Console
• Workflow Manager
• Workflow Monitor
External Components
• Sources
• Targets
31
Repository Topics
Repository Server
Repository Agent
The Repository Server runs on the same system as the Repository Agent.
34
Repository Server Administration Console
35
Repository Server Administration Console
Console Tree
Hypertext links to Repository maintenance tasks
36
Repository Management
Perform all Repository maintenance tasks through the Repository Server from the Repository Server Administration Console
Create the Repository Configuration
Select the Repository Configuration and perform maintenance tasks:
Navigator Window
Main Window
Dependency Window
Output Window
39
Users, Groups and Repository Privileges
Steps:
Create groups
Create users
Assign users to
groups
Assign privileges to
groups
Assign additional
privileges to users
(optional)
40
Managing Privileges
41
Folder Permissions
43
Object Searching
(Menu: Analyze - Search)
Keyword search
• Limited to keywords previously defined in the Repository (via Warehouse Designer)
Search all
• Filter and search objects
44
Object Sharing
Reuse existing objects
Enforces consistency
Decreases development time
Share objects by using copies and shortcuts
COPY: copies the object to another folder; changes to the original object are not captured; duplicates space; can copy from a shared or unshared folder.
SHORTCUT: links to an object in another folder; dynamically reflects changes to the original object; preserves space; created from a shared folder.
46
Sample Metadata Extensions
48
Source Object Definitions
49
Source Analyzer
Designer Tools
Analyzer Window
Navigation Window
50
Methods of Analyzing Sources
Diagram: each of these methods feeds a source definition through the Source Analyzer into the Repository:
Import from Database
Import from File
Import from Cobol File
Import from XML file
Create manually
51
Analyzing Relational Sources
Diagram: the Source Analyzer connects via ODBC to a relational source (table, view, or synonym); the definition (DEF) travels over TCP/IP to the Repository Server, and the Repository Agent stores it natively in the Repository.
52
Analyzing Relational Sources
Editing Source Definition Properties
53
Analyzing Flat File Sources
Diagram: the Source Analyzer reads a flat file (fixed-width or delimited) from a mapped drive, NFS mount, or local directory; the definition (DEF) travels over TCP/IP to the Repository Server, and the Repository Agent stores it natively in the Repository.
54
Flat File Wizard
Three-step wizard
Columns can be renamed within the wizard
Text, Numeric and Datetime datatypes are supported
Wizard ‘guesses’ the datatype
55
XML Source Analysis
Diagram: the Source Analyzer reads a .DTD file (and its data) from a mapped drive, NFS mount, or local directory; the definition (DEF) is stored in the Repository via the Repository Server and Repository Agent.
In addition to the DTD file, an XML Schema or XML file can be used as a Source Definition
56
Analyzing VSAM Sources
Diagram: the Source Analyzer reads a .CBL (COBOL copybook) file from a mapped drive, NFS mount, or local directory; the definition (DEF) is stored in the Repository via the Repository Server and Repository Agent.
Supported Numeric Storage Options: COMP, COMP-3, COMP-6
57
VSAM Source Properties
58
Target Object Definitions
59
Creating Target Definitions
60
Automatic Target Creation
Drag-and-drop a Source Definition into the Warehouse Designer Workspace
61
Import Definition from Database
Can “reverse engineer” existing object definitions from a database system catalog or data dictionary
Diagram: the Warehouse Designer connects via ODBC to the database (table, view, or synonym); the definition (DEF) travels over TCP/IP to the Repository Server, and the Repository Agent stores it natively in the Repository.
62
Manual Target Creation
1. Create empty definition 2. Add desired columns
64
Target Definition Properties
65
Creating Physical Tables
LOGICAL: Repository target table definitions
PHYSICAL: target database tables
66
Creating Physical Tables
Create tables that do not already exist in target database
Connect - connect to the target database
Generate SQL file - create DDL in a script file
Edit SQL file - modify DDL script as needed
Execute SQL file - create physical tables in target database
70
Transformation Views
A transformation has three views:
Iconized - shows the transformation in relation to the rest of the mapping
Normal - shows the flow of data through the transformation
Edit - shows transformation ports and properties; allows editing
71
Edit Mode
Allows users with folder “write” permissions to change or create transformation ports and properties
Define transformation level properties
Define port level handling
Enter comments
Make reusable
Switch between transformations
72
Expression Transformation
Passive Transformation
Connected
Ports
• Mixed
• Variables allowed
Create an expression in an output or variable port (click the port's expression field to invoke the Expression Editor)
Usage
• Perform the majority of data manipulation
73
Expression Editor
An expression formula is a calculation or conditional statement
Used in Expression, Aggregator, Rank, Filter, Router, Update Strategy
Performs calculations based on ports, functions, operators, variables, literals, constants and return values from other transformations
74
Informatica Functions - Samples
Character Functions - used to manipulate character data:
ASCII, CHR, CHRCODE, CONCAT, INITCAP, INSTR, LENGTH, LOWER, LPAD, LTRIM, RPAD, RTRIM, SUBSTR, UPPER, REPLACESTR, REPLACECHR
CHRCODE returns the numeric value (ASCII or Unicode) of the first character of the string passed to this function
CONCAT exists for backwards compatibility only - use || instead
75
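For readers more familiar with general-purpose languages, rough Python analogues of a few of these character functions (the mappings are approximations for illustration; note Informatica's SUBSTR is 1-based, unlike Python slicing):

```python
s = "informatica"

# INITCAP -> str.capitalize
assert s.capitalize() == "Informatica"      # INITCAP('informatica')
# LENGTH -> len
assert len(s) == 11                         # LENGTH(s)
# LPAD -> str.rjust with a fill character
assert "42".rjust(5, "0") == "00042"        # LPAD('42', 5, '0')
# SUBSTR -> slicing (1-based in Informatica, 0-based in Python)
assert s[0:4] == "info"                     # SUBSTR(s, 1, 4)
# CHRCODE -> ord of the first character
assert ord(s[0]) == 105                     # CHRCODE(s)
# CONCAT -> || is preferred in Informatica; + in Python
assert "data" + "flow" == "dataflow"        # 'data' || 'flow'
```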
Informatica Functions
77
Informatica Functions
Special Functions: ERROR, ABORT, DECODE, IIF
Used to handle specific conditions within a session; search for certain values; test conditional statements
IIF(Condition, True, False)
Test Functions: ISNULL, IS_DATE, IS_NUMBER, IS_SPACES
Used to test if a lookup result is null; used to validate data
Encoding Functions: SOUNDEX, METAPHONE
Used to encode string values
78
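A sketch of how IIF, DECODE, and ISNULL behave, re-implemented in Python for illustration (the helper names are invented; only the argument patterns follow the Informatica functions):

```python
def iif(condition, true_val, false_val):
    """Analogue of Informatica IIF(Condition, True, False)."""
    return true_val if condition else false_val

def decode(value, *pairs_and_default):
    """Analogue of DECODE: search/result pairs plus an optional default."""
    pairs, default = pairs_and_default, None
    if len(pairs) % 2 == 1:                 # trailing odd argument = default
        pairs, default = pairs[:-1], pairs[-1]
    for search, result in zip(pairs[::2], pairs[1::2]):
        if value == search:
            return result
    return default

def is_null(value):
    """Analogue of ISNULL, with None standing in for NULL."""
    return value is None

assert iif(10 > 5, "big", "small") == "big"
assert decode("CA", "CA", "California", "NY", "New York", "Unknown") == "California"
assert decode("TX", "CA", "California", "NY", "New York", "Unknown") == "Unknown"
assert is_null(None) and not is_null(0)
```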
Expression Validation
79
Variable Ports
Use to simplify complex expressions
• e.g. - create and store a depreciation formula to be referenced more than once
Use in another variable port or an output port expression
Local to the transformation (a variable port cannot also be an input or output port)
Available in the Expression, Aggregator and Rank transformations
80
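The depreciation example can be sketched like this (illustrative Python; the straight-line formula and figures are invented): a variable port computes the shared intermediate value once, and two output ports reuse it:

```python
# A variable port computes an intermediate value once; output ports reuse it.
# (Hypothetical straight-line depreciation figures, for illustration only.)
def expression_transform(cost, salvage, life_years):
    v_annual_dep = (cost - salvage) / life_years   # variable port
    out_monthly_dep = v_annual_dep / 12            # output port 1
    out_five_year_dep = v_annual_dep * 5           # output port 2
    return out_monthly_dep, out_five_year_dep

monthly, five_year = expression_transform(12000.0, 2000.0, 10)
assert five_year == 5000.0     # annual depreciation is 1000.0
```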
Informatica Data Types
NATIVE DATATYPES: specific to the source and target database types; displayed in source and target tables within Mapping Designer.
TRANSFORMATION DATATYPES: PowerMart / PowerCenter internal datatypes based on ANSI SQL-92; displayed in transformations within Mapping Designer.
83
Mapping Designer
Transformation Toolbar
Mapping List
Iconized Mapping
84
Pre-SQL and Post-SQL Rules
85
Data Flow Rules
Each Source Qualifier starts a single data stream
(a dataflow)
Transformations can send rows to more than one
transformation (split one data flow into multiple pipelines)
Two or more data flows can meet together -- if (and only if) they originate from a common active transformation
x Cannot add an active transformation into the mix
Diagram: merging two branches through a passive transformation is ALLOWED; adding an active transformation into one branch before the merge is DISALLOWED.
The example holds true with a Normalizer in lieu of the Source Qualifier. Exceptions are: Mapplet Input and Joiner transformations
86
Connection Validation
87
Mapping Validation
Mappings must:
• Be valid for a Session to run
• Be end-to-end complete and contain valid expressions
• Pass all data flow rules
Mappings are always validated when saved; they can also be validated without being saved
The Output Window will always display the reason for invalidity
88
Workflows
By the end of this section, you will be familiar with:
The Workflow Manager GUI interface
Workflow Schedules
Setting up Server Connections
x Relational, FTP and External Loader
Screen layout: Task Tool Bar, Workflow Designer Tools, Workspace, Navigator Window, Output Window, and Status Bar
90
Workflow Manager Tools
Workflow Designer
• Maps the execution order and dependencies of Sessions,
Tasks and Worklets, for the Informatica Server
Task Developer
• Create Session, Shell Command and Email tasks
• Tasks created in the Task Developer are reusable
Worklet Designer
• Creates objects that represent a set of tasks
• Worklet objects are reusable
91
Workflow Structure
Diagram: a Start Task linked to a Session Task.
92
Workflow Scheduler Objects
93
Server Connections
Configure Server data access connections
− Used in Session Tasks
Configure:
1. Relational
2. MQ Series
3. FTP
4. Custom
5. External Loader
94
Relational Connections (Native)
Create a relational (database) connection
− Instructions to the Server to locate relational tables
− Used in Session Tasks
95
Relational Connection Properties
Define a native relational (database) connection:
User Name / Password
Database connectivity information
Rollback Segment assignment (optional)
96
FTP Connection
Create an FTP connection
− Instructions to the Server to ftp flat files
− Used in Session Tasks
97
External Loader Connection
Create an External Loader connection
− Instructions to the Server to invoke database bulk loaders
− Used in Session Tasks
98
Task Developer
Create basic reusable “building blocks” to use in any Workflow
Reusable Tasks
• Session - a set of instructions to execute Mapping logic
• Command - specify OS shell / script command(s) to run during the Workflow
• Email - send email at any point in the Workflow
Session
Command
Email
99
Session Task
Server instructions to run the logic of ONE specific Mapping
• e.g. - source and target data location specifications, memory allocation, optional Mapping overrides, scheduling, processing and load instructions
Becomes a component of a Workflow (or Worklet)
If configured in the Task Developer, the Session Task is reusable (optional)
100
Command Task
Specify one (or more) Unix shell or DOS (NT, Win2000) commands to run at a specific point in the Workflow
Becomes a component of a Workflow (or Worklet)
If configured in the Task Developer, the Command Task is reusable (optional)
102
Additional Workflow Components
103
Developing Workflows
Create a new Workflow in the Workflow Designer
Customize the Workflow name
Select a Server
104
Workflow Properties
Customize Workflow Properties
Select a Workflow Schedule (optional)
May be reusable or non-reusable
105
Workflow Properties
106
Building Workflow Components
Add Sessions and other Tasks to the Workflow
Connect all Workflow components with Links
Save the Workflow
Start the Workflow
108
Session Tasks
109
Session Task
110
Session Task - General
111
Session Task - Properties
112
Session Task – Config Object
113
Session Task - Sources
114
Session Task - Targets
115
Session Task - Transformations
Allows overrides of some transformation properties
Does not change the properties in the Mapping
116
Session Task - Partitions
117
Monitor Workflows
118
Monitor Workflows
The Workflow Monitor is the tool for monitoring Workflows and Tasks
Review details about a Workflow or Task in two views:
• Gantt Chart view
• Task view
119
Monitoring Workflows
Perform operations in the Workflow Monitor
• Restart -- restart a Task, Workflow or Worklet
• Stop -- stop a Task, Workflow, or Worklet
• Abort -- abort a Task, Workflow, or Worklet
• Resume -- resume a suspended Workflow after a
failed Task is corrected
Stopping a Session Task means the Server stops reading data
120
Monitoring Workflows
Task View columns: Task, Workflow, Worklet, Start Time, Completion Time
Monitoring filters can be set using drop-down menus
Minimizes items displayed in Task View
123
Debugger Features
124
Debugger Interface
Debugger windows & indicators:
Debugger Mode indicator
Solid yellow arrow - Current Transformation indicator
Flashing yellow SQL indicator
Transformation Instance Data window
Debugger Log tab
Active Transformation
Connected
Ports
• All input / output
Usage
• Filter rows from flat file sources
• Single pass source(s) into multiple targets
126
Aggregator Transformation
Active Transformation
Connected
Ports
• Mixed
• Variables allowed
• Group By allowed
Create expressions in
output or variable ports
Usage
• Standard aggregations
127
Informatica Functions
Aggregate Functions
AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE
Return summary values for non-null data in selected ports
Use only in Aggregator transformations
Use in output ports only
Calculate a single value (and row) for all records in a group
Only one aggregate function can be nested within an aggregate function
Conditional statements can be used with these functions
128
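The group-wise, null-skipping behavior described above can be sketched in Python (an illustration of the semantics with invented rows, not Informatica code):

```python
from statistics import median

# Group rows by a key and compute one summary row per group,
# skipping nulls (None) the way Informatica aggregate functions do.
rows = [
    {"dept": "A", "sal": 100}, {"dept": "A", "sal": 300},
    {"dept": "A", "sal": None}, {"dept": "B", "sal": 50},
]

groups = {}
for r in rows:
    if r["sal"] is not None:                 # non-null data only
        groups.setdefault(r["dept"], []).append(r["sal"])

summary = {dept: {"count": len(v), "max": max(v), "median": median(v)}
           for dept, v in groups.items()}
print(summary["A"])   # {'count': 2, 'max': 300, 'median': 200.0}
```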
Aggregate Expressions
Aggregate functions are supported only in the Aggregator Transformation
Conditional aggregate expressions are supported
Conditional SUM format: SUM(value, condition)
129
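The conditional SUM(value, condition) format can be illustrated like this (Python sketch; the order data is invented):

```python
# Conditional aggregate: SUM(value, condition) adds a value into the
# running total only for rows where the condition holds.
def conditional_sum(rows, value_key, condition):
    total = 0
    for r in rows:
        if condition(r):                     # SUM(value, condition)
            total += r[value_key]
    return total

orders = [
    {"amount": 100, "region": "EMEA"},
    {"amount": 250, "region": "APAC"},
    {"amount": 175, "region": "EMEA"},
]
emea_total = conditional_sum(orders, "amount", lambda r: r["region"] == "EMEA")
assert emea_total == 275     # 100 + 175
```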
Aggregator Properties
Instructs the Aggregator to expect the data to be sorted
Set Aggregator cache sizes (on the Informatica Server machine)
130
Sorted Data
131
Incremental Aggregation
Example: an MTD (month-to-date) calculation
Triggered in Session Properties, Performance Tab
Best Practice is to copy these files in case a rerun of the data is ever required.
Reinitialize when no longer needed, e.g. - at the beginning of new month processing
132
Joiner Transformation
133
Homogeneous Joins
Joins that can be performed with a SQL SELECT statement:
Source Qualifier contains a SQL join
Tables on same database server (or are synonyms)
Database server does the join “work”
Multiple homogeneous tables can be joined
134
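The point of a homogeneous join is that one SQL SELECT does the join "work" on the database server, as a Source Qualifier join does. A sketch using Python's sqlite3 standing in for the shared database (the table names are invented):

```python
import sqlite3

# Both tables live in the same database, so a single SELECT performs
# the join server-side instead of in the mapping.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders    (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders    VALUES (1, 99.5), (1, 0.5);
""")
rows = con.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
""").fetchall()
con.close()
assert rows == [("Acme", 100.0)]
```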
Heterogeneous Joins
135
Joiner Transformation
Active Transformation
Connected
Ports
• All input or input / output
• “M” denotes a port that comes from the master source
Specify the Join condition
Usage
• Join two flat files
• Join two tables from
different databases
• Join a flat file with a relational table
136
Sorter Transformation
137
Lookup Transformation
138
Lookup Transformation
Looks up values in a database table and provides
data to other components in a Mapping
Passive Transformation
Connected / Unconnected
Ports
• Mixed
• “L” denotes a Lookup port
• “R” denotes a port used as a return value (unconnected Lookup only)
Specify the Lookup Condition
Usage
• Get related values
• Verify if a record exists or if data has changed
139
Lookup Properties
Override Lookup SQL option
Toggle caching
Native Database Connection Object name
140
Additional Lookup Properties
Set cache directory
Make cache persistent
Set Lookup cache sizes
141
To Cache or not to Cache?
Caching can significantly impact performance
Cached
• Lookup table data is cached locally on the Server
• Mapping rows are looked up against the cache
• Only one SQL SELECT is needed
Uncached
• Each Mapping row needs one SQL SELECT
Rule of Thumb: cache if the number (and size) of records in the Lookup table is small relative to the number of mapping rows requiring lookups
142
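The cached strategy can be sketched as follows: one SELECT populates a local dictionary, and every mapping row probes it (Python sketch with an invented lookup table; an uncached lookup would instead issue one SELECT per mapping row):

```python
import sqlite3

# Cached lookup: one SQL SELECT loads the whole lookup table locally,
# then each mapping row is looked up against the cache.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE country (code TEXT, name TEXT)")
con.executemany("INSERT INTO country VALUES (?, ?)",
                [("DE", "Germany"), ("FR", "France")])

cache = dict(con.execute("SELECT code, name FROM country"))  # the one SELECT

mapping_rows = ["DE", "FR", "DE"]
resolved = [cache.get(code) for code in mapping_rows]        # probe per row
con.close()
assert resolved == ["Germany", "France", "Germany"]
```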
Update Strategy Transformation
143
Update Strategy Transformation
Used to specify how each individual row will be used to
update target tables (insert, update, delete, reject)
Active Transformation
Connected
Ports
• All input / output
Usage
• Updating Slowly
Changing Dimensions
• IIF or DECODE logic
determines how to
handle the record
144
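The IIF/DECODE-style row tagging can be sketched in Python. The DD_* values mirror PowerCenter's update-strategy constants; the decision rules themselves are invented for illustration:

```python
# Tag each row with the operation the target should apply.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def update_strategy(row, existing_keys):
    """Decide the row's fate, as an Update Strategy expression would."""
    if row.get("amount") is None:
        return DD_REJECT                  # bad data -> reject
    if row["id"] in existing_keys:
        return DD_UPDATE                  # key already in target -> update
    return DD_INSERT                      # new key -> insert

existing = {101, 102}
assert update_strategy({"id": 101, "amount": 5}, existing) == DD_UPDATE
assert update_strategy({"id": 999, "amount": 5}, existing) == DD_INSERT
assert update_strategy({"id": 999, "amount": None}, existing) == DD_REJECT
```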
Router Transformation
Active Transformation
Connected
Ports
• All input/output
• Specify filter conditions
for each Group
Usage
• Link source data in
one pass to multiple
filter conditions
145
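A Router's one-pass evaluation of several filter conditions can be sketched like this (Python; the group conditions are invented, and rows matching no group fall into a default group):

```python
# Each row is tested against every group's filter condition in one pass;
# a row can land in more than one group, and a DEFAULT group catches
# rows that match none.
groups = {
    "high_value": lambda r: r["amount"] >= 1000,
    "emea":       lambda r: r["region"] == "EMEA",
}

def route(rows):
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for r in rows:
        matched = False
        for name, cond in groups.items():
            if cond(r):
                routed[name].append(r)
                matched = True
        if not matched:
            routed["DEFAULT"].append(r)
    return routed

out = route([{"amount": 1500, "region": "EMEA"},
             {"amount": 10,   "region": "APAC"}])
assert len(out["high_value"]) == 1 and len(out["emea"]) == 1
assert len(out["DEFAULT"]) == 1
```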
Router Transformation in a Mapping
146
Parameters and Variables
147
System Variables
SYSDATE - provides the current datetime on the Informatica Server machine
• Not a static value
149
Mapping Parameters and Variables
Sample declarations
User-defined names
Set the appropriate aggregation type
Set optional Initial Value
Sequence Generator Transformation
Passive Transformation
Connected
Ports
• Two predefined output ports, NEXTVAL and CURRVAL
• No input ports allowed
Usage
• Generate sequence numbers
• Shareable across mappings
151
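A simplified sequence-generator sketch (Python; CURRVAL is modeled here as the last value handed out, which simplifies PowerCenter's actual CURRVAL semantics):

```python
import itertools

# NEXTVAL yields the next number in the sequence; currval records the
# last value generated (a simplification for illustration).
class SequenceGenerator:
    def __init__(self, start=1, increment=1):
        self._counter = itertools.count(start, increment)
        self.currval = None

    def nextval(self):
        self.currval = next(self._counter)
        return self.currval

seq = SequenceGenerator(start=100, increment=1)
keys = [seq.nextval() for _ in range(3)]
assert keys == [100, 101, 102]
assert seq.currval == 102
```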
Sequence Generator Properties
Number of Cached Values
152
Dynamic Lookup
153
Additional Lookup Cache Options
155
Dynamic Lookup Cache Advantages
156
Update Dynamic Lookup Cache
157
Example: Dynamic Lookup Configuration
Rank Transformation
Active Transformation
Connected
Ports
• Mixed
• One pre-defined output port, RANKINDEX
• Variables allowed
• Group By allowed
Usage
• Select top/bottom number of records
159
Normalizer Transformation
Active Transformation
Connected
Ports
• Input / output or output
Usage
• Required for VSAM Source definitions
• Normalize flat file or relational source definitions
• Generate multiple records from one record
160
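Generating multiple records from one record, as a Normalizer does for repeating columns, can be sketched in Python (the column names are invented):

```python
# Turn one denormalized record with repeating columns (e.g. quarterly
# sales buckets) into one output row per occurrence.
def normalize(record, repeating_keys, value_name):
    rows = []
    for i, key in enumerate(repeating_keys, start=1):
        rows.append({"account": record["account"],
                     "occurrence": i,
                     value_name: record[key]})
    return rows

rec = {"account": "A-1", "qtr1": 10, "qtr2": 20}
rows = normalize(rec, ["qtr1", "qtr2"], "sales")
assert rows == [
    {"account": "A-1", "occurrence": 1, "sales": 10},
    {"account": "A-1", "occurrence": 2, "sales": 20},
]
```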