Optimization
Jason Hamby
Agenda
How it works
Demo
Overview
Reduce data movement when source and target are in the same database
Customer Scenario
Batch transformation and load -- staging and target tables in the same target database
Transformation and load from real-time status table to data warehouse in the same database
[Diagram: Step 1 loads Data Sources into Staging; Step 2 loads Staging into Warehouse. Staging and Warehouse are both in the Target Database.]
Solution Overview
Pushdown optimization is an option that the user selects
The SQL to be processed in the database is generated automatically
A session may be partially or completely pushed down
[Diagram: Data Sources, DI Server (with Optimizer and Metadata Repository), and Target Database (Staging, Warehouse); Step 1 and Step 2, with SQL sent to the Target Database.]
How It Works
Available as a session property
Pushdown Optimization Options
Partial pushdown optimization to source
Partial pushdown optimization to target
Full pushdown optimization
[Figure: Transformations pushed to source or target database, with the generated SQL statement]
Supported Databases
Teradata (V2R5 or above)
Oracle (9i or above)
DB2 (v8 or above)
SQL Server (7 or above)
Sybase (ASE 12.5)
ODBC source/target
Supported Transformations
To Source: Aggregator, Expression, Filter, Joiner, Lookup, Sorter, Union
To Target: Expression, Lookup
Unsupported Transformations
Custom Transformation
Router
External Procedure
Sequence Generator
XML
Stored Procedure
Normalizer
Transaction Control (TCT)
Rank
Update Strategy
[Diagram: Extract, Transform, and Load steps between Source DB and Target]
[Diagram: Extract, Transform, and Load steps between Source and Target DB]
Full Pushdown
Conditions:
Source and target are in the same RDBMS
All transformations can be processed in database
[Diagram: Extract, Transform, and Load steps between Source DB and Target DB]
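As an illustration of what full pushdown amounts to, the sketch below folds filter, expression, and aggregation logic into a single INSERT ... SELECT that runs entirely inside one database. It uses Python's built-in sqlite3 module as a stand-in for the supported databases; the table names (staging_orders, warehouse_sales) and columns are hypothetical, and the SQL that PowerCenter actually generates may look different.

```python
import sqlite3

# Hypothetical staging and warehouse tables living in the same database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_orders (region TEXT, amount REAL, status TEXT);
    CREATE TABLE warehouse_sales (region TEXT, total_amount REAL);
    INSERT INTO staging_orders VALUES
        ('east', 10.0, 'OK'), ('east', 5.0, 'OK'),
        ('west', 7.5, 'OK'), ('west', 99.0, 'CANCELLED');
""")

# Filter (WHERE), Expression (UPPER), and Aggregator (SUM / GROUP BY)
# logic expressed as one statement, so no rows leave the database.
conn.execute("""
    INSERT INTO warehouse_sales (region, total_amount)
    SELECT UPPER(region), SUM(amount)
    FROM staging_orders
    WHERE status = 'OK'
    GROUP BY region
""")
conn.commit()

rows = conn.execute(
    "SELECT region, total_amount FROM warehouse_sales ORDER BY region"
).fetchall()
print(rows)
```

Because source and target tables sit in the same database, the rows never travel to the integration server and back, which is the data-movement saving described in the overview.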
Design (Two-Pass)
Pass 1:
Start from the source and traverse transformations downstream, building a SQL query (SELECT statement).
Stop if a transformation cannot be processed in the source database and settle for partial pushdown to source.
If the target is reached, full pushdown can be done with an INSERT ... SELECT statement.
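Pass 1 can be sketched roughly as follows. This is a simplified model, not the optimizer's actual implementation: the pipeline is assumed to be a linear list of transformation type names, the SOURCE_PUSHABLE set is taken from the "To Source" list in these slides, and the function name pass_one and the same_rdbms flag are hypothetical.

```python
# Transformation types assumed to be pushable to the source database
# (from the "To Source" list in these slides).
SOURCE_PUSHABLE = {"Aggregator", "Expression", "Filter", "Joiner",
                   "Lookup", "Sorter", "Union"}


def pass_one(pipeline, same_rdbms):
    """Walk downstream from the source.

    pipeline   -- hypothetical ordered list of transformation type names
                  between the source and the target
    same_rdbms -- True if source and target are in the same RDBMS

    Returns (pushed_to_source, remaining, full_pushdown).
    """
    pushed = []
    for name in pipeline:
        if name not in SOURCE_PUSHABLE:
            break  # stop here and settle for partial pushdown to source
        pushed.append(name)
    remaining = pipeline[len(pushed):]
    # The target is reached only if nothing is left over; full pushdown
    # (INSERT ... SELECT) also needs source and target in one RDBMS.
    full = same_rdbms and not remaining
    return pushed, remaining, full


print(pass_one(["Filter", "Expression", "Router", "Expression"], True))
```

In this example the traversal stops at the Router, so only the Filter and Expression ahead of it would be folded into the SELECT statement.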
Design (Two-Pass)
Pass 2:
Bypass if pass 1 results in full pushdown optimization.
Start from the target and traverse transformations upstream, building SQL statements (INSERT, DELETE, and UPDATE) for partial pushdown to target.
Stop if a transformation cannot be processed in the target database or has already been pushed to the source database.
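Pass 2 can be sketched in the same simplified way. Again this is only a model under stated assumptions: the pipeline is a linear list of transformation type names, TARGET_PUSHABLE is taken from the "To Target" list in these slides, and pass_two and its pushed_to_source argument (the number of leading transformations that pass 1 already claimed) are hypothetical names.

```python
# Transformation types assumed to be pushable to the target database
# (from the "To Target" list in these slides).
TARGET_PUSHABLE = {"Expression", "Lookup"}


def pass_two(pipeline, pushed_to_source):
    """Walk upstream from the target.

    pipeline         -- hypothetical ordered list of transformation type
                        names between the source and the target
    pushed_to_source -- how many leading transformations pass 1 pushed
                        to the source database

    Returns (pushed_to_target, processed_in_server).
    """
    pushed = []
    idx = len(pipeline) - 1
    # Stop at the first transformation that the target database cannot
    # process, or at one that was already pushed to the source.
    while idx >= pushed_to_source and pipeline[idx] in TARGET_PUSHABLE:
        pushed.append(pipeline[idx])
        idx -= 1
    pushed.reverse()  # report in pipeline order
    in_server = pipeline[pushed_to_source:idx + 1]
    return pushed, in_server


print(pass_two(["Filter", "Router", "Expression", "Lookup"], 1))
```

Here the Filter is assumed to have been pushed to the source in pass 1, the Expression and Lookup are pushed to the target, and the Router in between is left for the integration server.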
Considerations
Error handling is subject to DBMS error handling
No row-level error logging
For mappings that generate long transactions:
Require more database resources (locks and log space)
No partial commit: the entire transaction is rolled back when an error is encountered
Case sensitivity
How null is treated in sort order
Formats (numeric value conversion to char; date conversion to char)
Data precision
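The "no partial commit" point can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in database, with hypothetical staging and warehouse tables; one bad row makes the whole set-based load fail, and none of the good rows are kept. Exact error and rollback behaviour differs between database systems, so treat this only as an illustration of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging (id INTEGER, amount REAL);
    CREATE TABLE warehouse (id INTEGER, amount REAL NOT NULL);
    INSERT INTO staging VALUES (1, 10.0), (2, NULL), (3, 30.0);
""")

try:
    # One set-based statement loads every row; the NULL amount in row 2
    # violates the NOT NULL constraint on the target.
    conn.execute("INSERT INTO warehouse SELECT id, amount FROM staging")
    conn.commit()
    failed = False
except sqlite3.IntegrityError:
    # The database reports a single error for the statement; there is
    # no per-row reject log, and the work is rolled back as a whole.
    conn.rollback()
    failed = True

loaded = conn.execute("SELECT COUNT(*) FROM warehouse").fetchone()[0]
print(failed, loaded)
```

With row-by-row processing outside the database, rows 1 and 3 could have been loaded and row 2 logged as rejected; with a pushed-down statement the load is all or nothing.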
Limitations
A transformation will not be pushed down, and the optimization stops, if:
A Source Qualifier, lookup, or update transformation contains a SQL override
The optimizer does not parse user-defined SQL overrides (i.e. lookup, update, DSQ)
The DSQ SQL override limitation will be removed in GA by using temporary views
Limitations
Processing within PowerCenter is used when:
The operation can't be done in the database (i.e. using SQL)
The source or target is not a database