Agenda
Introduction to PowerCenter
Lab 1: Sources and Targets
Lab 3: Building a Mapping
Data Integration Part 1
Lab 4: Working with Data
Data Integration Part 2
Lab 5: Workflow: Using Workflow Manager & Workflow Monitor
Lab 6: Using the Debugger
Lab 7: Web Services: Making Your Mapping SOA Ready
Lab 9: Extra credit: sorted input and dynamic partitioning
Lab 8: Putting It All Together
Helpful VMware Commands
Ctrl-Alt-Enter: switch VMware to full screen
Ctrl-Alt-Insert (in VMware): same as Ctrl-Alt-Del
Ctrl-Alt-Esc: return to Desktop from VMware
3
Workshop Objectives
Understand the broad set of data integration challenges facing organizations today and how the Informatica Platform can be used to address them Access data from different data sources and targets Profile a data set and understand how to look for basic problems that need to be solved Integrate data from multiple sources through Extraction, Transformation and Load (ETL) Debug data integration processes (mappings) Expose integration logic as Web Services for use in a SOA architecture
Introduction to PowerCenter
Enterprise Data Integration and ETL
PowerCenter
ORACLE
-----------------------------------------------------------------------------
"AGGREGATOR_TOTAL_AMOUNT_c" Cursor declaration
-----------------------------------------------------------------------------
CURSOR "AGGREGATOR_TOTAL_AMOUNT_c" IS
  SELECT NVL("CUSTOMER_DETAILS_LKP"."CUST_KEY", NULL) "CUST_KEY",
         NVL("CUSTOMER_DETAILS_LKP"."NAME", NULL) "NAME",
         NVL("CUSTOMER_DETAILS_LKP"."TYPE", NULL) "TYPE",
         "AGGREGATOR_TOTAL_AMOUNT"."REGION" "REGION",
         "AGGREGATOR_TOTAL_AMOUNT"."TOTAL_AMOUNT" "TOTAL_AMOUNT"
  FROM (
    SELECT SUM("AGG_INPUT"."QUANTITY"*"AGG_INPUT"."PRICE") /* AGGREGATOR_TOTAL_AMOUNT.OUTGRP1.TOTAL_AMOUNT */ "TOTAL_AMOUNT",
           "AGG_INPUT"."CUST_ID$1" /* AGGREGATOR_TOTAL_AMOUNT.OUTGRP1.CUST_ID */ "CUST_ID",
           "AGG_INPUT"."REGION$1" /* AGGREGATOR_TOTAL_AMOUNT.OUTGRP1.REGION */ "REGION"
    FROM (
      SELECT "SPLITTER_INPUT_SUBQUERY"."CUST_ID$2" "CUST_ID$1",
             "SPLITTER_INPUT_SUBQUERY"."QUANTITY$1" "QUANTITY",
             NVL("PRODUCT_LKP"."PRICE", NULL) "PRICE",
             "SPLITTER_INPUT_SUBQUERY"."REGION$2" "REGION$1"
      FROM (
        SELECT "ODS_INVOICE_SUMMARY"."PAYMNT_TYPE" "PAYMNT_TYPE",
        ...
ORACLE
10
TERADATA ORACLE
11
TERADATA
12
customers.xml, Oracle PL/SQL, Teradata BTEQ, XML + Java, SAP ABAP, PowerCenter
TERADATA
13
14
15
source_system_cd
? => $ !!
17
Informatica by Example
PowerCenter
Implementing Changes
source_system_cd
18
Informatica by Example
PowerCenter
Impact of Changes
source_system_cd
19
20
Proven Scalability
Pipeline Parallel Processing
[Architecture diagram: the PowerCenter Client (Designer, Workflow Manager, Workflow Monitor, Administrator) works against the Services Framework (Repository Service, Integration Service) and the Repository. Providers (packaged applications, relational and flat files, web services) feed an in-memory pipeline of Provider, Transformation, and Consumer threads that delivers data to consumers such as portals, dashboards, and reports, and XML, messaging, and web services.]
21
Informatica Platform
Single unified architecture
[Architecture diagram: a single unified architecture in which the PowerCenter Client (Designer, Workflow Manager, Workflow Monitor, Administrator) connects through the Services Framework (Repository Service, Integration Service) and the Repository to provider and consumer systems: packaged applications, relational and flat files, web services, portals, dashboards, and reports.]
22
Options: Real-time, Advanced XML, Data Masking, High Availability, Partitioning, Metadata Exchange
Editions: PCSE, PCAE, PCRE
23
PowerExchange
24
Packaged Applications
JD Edwards, SAP NetWeaver, Lotus Notes, SAP NetWeaver BI, Oracle E-Business, SAS, PeopleSoft, Siebel
SaaS / BPO
Industry Standards
XML Standards
One platform for any data, any latency, for 24x7 operations
25
Any Time
Timely data delivered when you want it, how you want it
Batch Data Integration and Data Migration, Analytical Data Integration, Operational Data Integration, Transactional Integration
"…synchronization to support [our] high performance requirements. PowerCenter and PowerExchange enabled us to imp…"
26
Packaged Applications
Batch
Informix, Teradata, Netezza, ODBC, Flat Files, Web Logs, VSAM, C-ISAM, Complex Files, Tape Formats
Change
Real time
Print Streams Unstructured
Unstructured Data
PowerExchange
27
28
29
How to:
1. Launch PowerCenter Designer to start your project
2. Connect to the PowerCenter Repository
3. Import Source and Target Structures from Relational Tables and Flat Files
30
31
Informatica Platform
Single unified architecture
Provider
PowerCenter
Design Manage Workflow Manager Monitor Workflow Monitor
Consumer
Portals , Dashboards , and Reports XML , Messaging , and Web Services
Client
Administrat or
Designer
Packaged Applications
Services Framework
Repository Service
Packaged Application s
Repository
Relational and Flat Files
Integration Service
Web Services
32
Mapping Designer - Used to create mappings to extract, transform and load data.
33
Source Analyzer
Integrated. A key component of PowerCenter Designer, the Source Analyzer offers universal data access in a single unified platform.
Consistent. A single consistent method to access and manage any data source regardless of type or location.
Visual. A simple graphical interface for importing and creating source definitions for any of the data sources supported by PowerCenter.
34
Target Designer
Integrated. A key component of PowerCenter Designer, the Target Designer offers universal data access in a single unified platform.
Consistent. A single consistent method to access and manage any data target regardless of type or location.
Visual. A simple graphical interface for importing target definitions for any of the data types supported by PowerCenter.
Extensible. Can create target definitions, executable DDL, and even create new tables in the warehouse.
35
Create a relational target structure and build it in the relational instance
CUSTOMER_DATES
36
37
Using Designer
38
Using Designer
39
Using Designer
40
Using Designer
1. Right-click the MappingLab folder
2. Select Open. This is where most of our work will be done.
41
Using Designer
1. Make sure you see Source Analyzer at the top left-hand part of the workspace
42
Using Designer
43
Using Designer
In the Import Tables dialog, choose the ODBC connection for the data source
1. Click the ODBC data source drop-down box
2. Select the data source called source
Note Informatica only uses ODBC to import the metadata structures into PowerCenter.
44
Using Designer
1. Enter Username: source
2. The Owner name will self-populate
3. Enter Password: source
4. Press Connect
45
Using Designer
1. Open the directory tree under Select tables
2. While holding the SHIFT key, select both the CUSTOMERS and GOOD_CUST_STG tables
3. Press OK
46
Using Designer
Review the source structure
47
Using Designer
1. From the menu bar, select Sources > Import from File
48
Using Designer
Sources
49
Using Designer
1. Check the Import field names from the first line check box; this tells PowerCenter to take the column names from the first line and start importing data on the next line
2. Verify the flat file source is Delimited, not Fixed Width
The flat file wizard is now displayed which allows us to parse through our flat file source.
50
Using Designer
1. Keep the defaults (the flat file is comma delimited)
2. Press Next
Look around this page. Notice you can account for multiple delimiters, consecutive delimiters and quotes around data.
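For orientation only, these wizard options map onto the standard concerns of delimited-file parsing: the delimiter character and quotes around data. A minimal Python sketch with made-up rows (the lab itself uses the wizard, not code):

```python
import csv
import io

# A sample comma-delimited file with a quoted field containing a comma.
raw = 'CUST_ID,CUST_NAME,CITY\n101,"Smith, John",Boston\n102,Ann Lee,Austin\n'

# csv.reader handles the delimiter and quote character, the same two
# settings the flat file wizard asks about.
rows = list(csv.reader(io.StringIO(raw), delimiter=",", quotechar='"'))

header, records = rows[0], rows[1:]
print(header)      # column names taken from the first line
print(records[0])  # the quoted comma stays inside the field
```

Note how the quoted comma in "Smith, John" survives parsing; without quote handling, that row would split into four fields instead of three.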
51
Using Designer
Earlier we told PowerCenter to use the first line of the original flat file for the column names. Note that the columns are now named for us. Review the other options on this page.
1.Press Finish
52
Using Designer
Congratulations!
You just successfully imported one flat file and two relational source structures.
53
Using Designer
Select the Target Designer to bring in our target structures
1. Select the second icon on the shortcut line
54
Using Designer
Notice when you select the Target Designer, the menu options change. On the menu bar:
1. Select Targets and choose the Import from Database option
55
Using Designer
The target structures are in the target instance of the database 1.Select the target ODBC data source named target
56
Using Designer
1. Enter Username: target
2. The Owner name will self-populate
3. Enter Password: target
4. Press Connect
57
Using Designer
CUSTOMER_NONAME will capture all of our records that do not have an associated customer name. GOOD_CUSTOMERS will capture all clean records to be loaded into our Data Warehouse.
58
Using Designer
59
Using Designer
Select the database type
1. Click the drop-down box and choose Oracle for the database
60
Using Designer
1.Enter CUSTOMER_DATES as the name for the target table 2.Press Create
61
Using Designer
A new table should appear in the workspace behind the pop-up
1. Select Done to close the Create Target Table dialog
62
Using Designer
Edit the table - CUSTOMER_DATES 1.Double-click the CUSTOMER_DATES table
63
Using Designer
64
Using Designer
65
Using Designer
1. Press the Add icon three times to add three new columns
66
Using Designer
67
Using Designer
68
Using Designer
1. Click in the Key Type drop-down for CUST_ID to make this field a Primary Key
69
Using Designer
1. Change the second Column Name to TRANSACTION_ID
2. Change the Datatype of TRANSACTION_ID to number
3. Change the third Column Name to Date_of_Purchase
4. Change the Datatype of the Date_of_Purchase column to date
70
Using Designer
We now have a metadata target structure in the PowerCenter Metadata Repository. We will now build the table in the Oracle target instance.
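Generate/Execute SQL (used in the next step) turns repository metadata like this into a DDL statement. As a rough illustration of that idea, here is a hypothetical Python generator over the three columns from this lab (the function is not Informatica's; column types are the lab's Oracle-style names):

```python
# Column metadata as captured in the repository: (name, datatype, is_key)
columns = [
    ("CUST_ID", "NUMBER", True),
    ("TRANSACTION_ID", "NUMBER", False),
    ("DATE_OF_PURCHASE", "DATE", False),
]

def generate_ddl(table, cols):
    """Build a CREATE TABLE statement from column metadata."""
    parts = [f"{name} {dtype}" + (" PRIMARY KEY" if key else "")
             for name, dtype, key in cols]
    return f"CREATE TABLE {table} (" + ", ".join(parts) + ")"

ddl = generate_ddl("CUSTOMER_DATES", columns)
print(ddl)
```

The point is only that the DDL is derived from stored metadata, which is why the same definition can be re-generated against any target database type.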
71
Using Designer
Build the table in the Oracle target instance
1. Select Targets > Generate/Execute SQL
72
Using Designer
73
Using Designer
1.Press the ODBC data source drop-down menu 2.Select the target database
74
Using Designer
1.Enter the Username target 2.Enter the Password target 3.Press Connect
75
Using Designer
The table has been successfully built 1.Close the Database Object Generation box
76
Using Designer
The table GOOD_CUST_STG is for staging good customer records prior to loading them into the data warehouse. It will be used as both a target (when we clean the data) and a source (when we load the clean data into the warehouse). We can reuse the source definition to create the target.
1. Expand the Sources folder so that GOOD_CUST_STG is visible
2. Drag the GOOD_CUST_STG object from the Sources directory tree in the navigation pane to the Target Designer workspace
77
Using Designer
GOOD_CUST_STG is now set up to be used as both a source and a target in PowerCenter. However, while the table exists in PowerCenter, it does not yet exist in our target Oracle database. Let's build this table in our target Oracle database.
78
Using Designer
Build the table in the Oracle target instance
1. Select Targets > Generate/Execute SQL
79
Using Designer
80
Using Designer
1.Press the ODBC data source drop-down menu 2.Select the target database
81
Using Designer
1.Enter the Username target 2.Enter the Password target 3.Press Connect
82
Using Designer
Build the GOOD_CUST_STG table
1. Select the Create option on the radio menu (we only want to build the GOOD_CUST_STG table)
2. Check the Drop Table option above (we know the table doesn't exist, but let's drop the table before we build it)
3. Press Generate and execute
83
Using Designer
The table has been successfully built 1.Close the Database Object Generation box
84
Using Designer
If we look back at the directory tree in the Navigation Pane, we will see three Sources: TRANSACTIONS (flat file), CUSTOMERS (relational), and GOOD_CUST_STG (relational); and four Targets (all relational): CUSTOMER_DATES, CUSTOMER_NONAME, GOOD_CUSTOMERS, and GOOD_CUST_STG.
85
86
What will we learn in this chapter?
What is a mapping?
What are Transformation Objects?
How do we build a mapping?
How do we join sources together?
How do we separate out records with missing data?
87
88
PowerCenter Transformations
Some examples
Transformations used in this mapping. For a detailed description of these Transformations and their function see the tables in Appendix A
XML Parser XML Generator Expression Joiner Mapplet Input Mapplet Output
89
PowerCenter Functions
Some Examples. A more complete reference can be found in the Appendix B at the end of this Guide
Summary view of all available functions
Character manipulation (CONCAT, LTRIM, UPPER, …)
Datatype conversion (TO_CHAR, TO_DECIMAL, …)
Data matching and parsing (Reg_Match, Soundex, …)
Date manipulation (Date_Compare, Get_Date_Part, …)
Encryption/Encoding (AES_Encrypt, Compress, MD5, …)
Financial functions (PV, FV, Pmt, Rate, …)
Mathematical operations (LOG, POWER, SQRT, Abs, …)
Trigonometric functions (SIN, SINH, COS, TAN, …)
Flow control and conditional (IIF, DECODE, ERROR, …)
Test and validation (ISNULL, IS_DATE, IS_NUMBER, …)
Library of reusable user-created functions
Variable updates (SETVARIABLE, SETMINVARIABLE, …)
Available lookups that may be used
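To make a few of these categories concrete, here are rough Python analogues of three listed functions. These only approximate the PowerCenter semantics and are not how the tool is used; they illustrate what the expressions compute:

```python
from datetime import date

def iif(cond, a, b):
    """IIF(condition, a, b): inline conditional, like a ternary."""
    return a if cond else b

def isnull(x):
    """ISNULL(x): test for a missing (NULL) value."""
    return x is None

def date_compare(d1, d2):
    """DATE_COMPARE(d1, d2): -1 if d1 < d2, 0 if equal, 1 if d1 > d2."""
    return (d1 > d2) - (d1 < d2)

print(iif(isnull(None), "missing", "present"))            # -> missing
print(date_compare(date(2024, 1, 1), date(2024, 6, 1)))   # -> -1
```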
90
In this Scenario
We will learn how to build mappings with Designer. A mapping is a logical process that defines the structure of data and how it is changed as it flows from one or more data sources to target locations. Mappings are the core of the Informatica data integration tool set. With Informatica, transformations and mappings are reusable and can be used in multiple different scenarios.
For our first mapping we need to combine two sets of data for our data warehouse. We also need to separate good records from bad ones that are missing the customer name.
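In plain code terms, this first mapping is a join followed by a null-check split. A hedged Python sketch of the same logic on toy data (field and target names borrowed from the lab; the real work is done graphically in Designer):

```python
transactions = [
    {"CUST_ID": 1, "AMOUNT": 20.0},
    {"CUST_ID": 2, "AMOUNT": 35.0},
]
customers = {1: {"CUST_NAME": "Ada Lovelace"}, 2: {"CUST_NAME": None}}

good, noname = [], []
for txn in transactions:
    cust = customers.get(txn["CUST_ID"], {})  # the Joiner: match on CUST_ID
    record = {**txn, **cust}
    if record.get("CUST_NAME") is None:       # the Router: checks ISNULL(CUST_NAME)
        noname.append(record)                 # bad records -> CUSTOMER_NONAME
    else:
        good.append(record)                   # clean records -> staging table
print(len(good), len(noname))
```

Each toy record follows the same path a row takes through the mapping: joined on customer ID, then routed by whether the name is present.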
91
92
93
94
95
96
97
Add the source TRANSACTIONS to the mapping
1. Expand Sources and Flatfile so TRANSACTIONS and CUSTOMERS are visible
2. Drag the TRANSACTIONS source into the work area
98
99
Add the source CUSTOMERS table to the mapping 1.Click and drag the CUSTOMERS source into the workspace
100
101
Add the target tables to the mapping
1. Expand the Targets folder
2. While holding CTRL, select the CUSTOMER_NONAME and GOOD_CUST_STG tables
3. Still holding CTRL, drag them onto the workspace
102
103
1. Collapse the Navigation Pane for now to give us more work space (single click, left icon)
2. Collapse the Output Window at the bottom of our screen (single click, right icon)
104
Add a Joiner transformation to join the two sources together
1. Single-click the Joiner transformation icon
2. Single-click in the workspace; the Joiner transformation should appear
105
1. Highlight the fields in the TRANSACTIONS Source Qualifier transformation (holding SHIFT, click the first and last fields)
2. Drag the selection to the Joiner transformation
106
1. Highlight the fields in the CUSTOMERS Source Qualifier
2. Drag them to the Joiner
Next, we need to edit the Joiner properties
1. Double-click on the Joiner
107
1.Click Rename
Remember, all of this metadata will be captured in the PowerCenter Metadata Repository. Since we have the ability to report on the PowerCenter Metadata Repository, we want the names of our transformation objects to be meaningful.
108
109
Notice that since each source has a field named CUST_ID, Designer renamed the second instance of CUST_ID to CUST_ID1.
110
Add a join condition 1.Click on the Condition tab 2.Click on the Add condition icon
111
A default condition will be displayed. Since we have two fields with similar names, the condition will use these two field names by default.
1. Press OK
112
With the data joined, we need to separate good records from those with a missing customer name
1. Click on the Router transformation icon
2. Click on the workspace to add a Router to the mapping
113
We want to keep all of the fields from the Joiner except CUST_ID1, which is the same as CUST_ID
1. Hold CTRL and select all fields except CUST_ID1
2. Drag the selected fields to the Router
We need to tell the Router what conditions to check for
1. Double-click the Router to edit it
114
Rename the Router 1.Click Rename 2.Type Transformation Name rtr_check_customer_name 3.Click OK
115
116
The Router groups data based on user defined conditions. All records that meet the Group Filter Condition are included in the output for that group.
We need to create two groups: one for records with a customer name and one for records without
1. Click the Add button twice
117
Rename the Groups 1.Click on the first Group Name 2.Rename the group GOOD_CUSTOMER 3.Click on the second Group Name 4.Rename the group CUSTOMER_NONAME Next we need to edit the Group Filter Condition 1. Click the arrow on the first condition to open editor
118
Bad records have a NULL value for the customer name. If the record is not NULL, then it is good.
1. Enter the expression: NOT ISNULL(CUST_NAME)
2. Press Validate to test your expression
3. Close the message window
4. Close the Expression Editor
119
120
121
1.Press OK
122
The Router appears. Expand the transformation and scroll down to see the two groups we created.
123
1. Expand the Router transformation and scroll until the GOOD_CUSTOMER group is visible
2. Select all of the fields (or ports) under GOOD_CUSTOMER
3. Drag the selected fields to the CUST_ID field on the GOOD_CUST_STG target
Note: When you drag and release, Designer connects the first field in the set being dragged to the field under the cursor when the mouse is released, the second with the second and so on. If your fields are not in matching order you may need to connect them one at a time.
124
1. Connect the CUSTOMER_NONAME group to the CUSTOMER_NONAME target
Both tables should now be connected
125
1. Click the disk icon to Save the mapping
2. Click the Toggle Output Window icon to view save status and other messages
3. Verify the mapping is VALID; if it is not, check for error messages
4. Finally, clean up the workspace: right-click and select Arrange All Iconic
126
127
128
What will we learn in this chapter?
How to use a lookup to enrich records with data from another source
What is a reusable transformation
How to use expressions to format data
How to use aggregate functions to generate results from a data set
129
In this Scenario
We will use Designer to build another mapping. Where the last lab focused on joining raw data and removing bad records, this lab focuses on using transformations to convert, enrich, and reformat the data and, finally, load it into the data warehouse.
Specifically, we will be working with the good records that the first mapping loaded into the staging table.
130
PowerCenter Transformations
Some examples
XML Parser XML Generator Expression Joiner Mapplet Input Mapplet Output
131
132
133
Starting from the Mapping Designer
1. Select Mappings > Create to build a new mapping
134
135
Add the source GOOD_CUST_STG to the mapping
1. Drag the GOOD_CUST_STG source into the work area
136
137
Add the target tables to the mapping
1. Expand the Targets folder
2. Select and drag the CUSTOMER_NONAME and GOOD_CUST_STG tables onto the workspace
138
All sources and targets are now imported
1. Collapse the Navigation Pane for now to give us more space
2. Collapse the Output Window at the bottom of our screen
139
The Lookup Transformation will allow us to pull back the product description names from our PRODUCT table. This is required by the end user so they can see exactly what products were purchased by our customers.
Add a Lookup transformation to the mapping
140
Select the Lookup Table 1.Click on the Import tab 2.Select From Relational Table
141
We have to connect to the database instance that holds our lookup table. Note that PowerCenter will NEVER override database level security.
1. Click on the ODBC data source drop-down box
2. Select the source ODBC connection
142
1.Enter the Username source 2.Enter the Password source 3.Press Connect
143
The Lookup transformation appears in the workspace. We will use the PRODUCT_ID as the lookup input.
1. Highlight the PRODUCT_ID field from the Source Qualifier and drag it onto the Lookup
145
146
147
Much like the joiner, the lookup transformation requires a condition to be true for it to pass values. In this case, we want the product ID from the TRANSACTIONS file to match the product ID in the PRODUCTS table. Once there is a match, the lookup will return the proper product description value.
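Conceptually the Lookup behaves like a keyed dictionary probe: match on PRODUCT_ID, return PRODUCT_DESC. An illustrative Python sketch with made-up product rows (the real lookup reads the PRODUCTS table):

```python
# The lookup table, keyed by the condition column (PRODUCT_ID).
product_lkp = {
    10: "Road Bike",
    11: "Helmet",
}

def lookup_product_desc(product_id):
    """Return the PRODUCT_DESC for a PRODUCT_ID, or None when unmatched."""
    return product_lkp.get(product_id)

print(lookup_product_desc(10))   # a matched row returns the description
print(lookup_product_desc(99))   # no match returns None, a NULL downstream
```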
148
Verify that the lookup condition matches the PRODUCT_ID we passed in; Designer automatically identified the correct condition.
149
Select the Return value 1.Check the box in the R column for PRODUCT_DESC 2.Press OK
150
We would like to do some formatting on our source data. We want the initial character of our customer names and product descriptions to be upper case.
151
152
153
Minimize completed transformations 1.Click on the minimize icon for each completed transformation 2.Next, double-click on the Expression Transformation to edit it
154
155
1. Select the Ports tab
2. Click the Add field button twice to add two output ports
3. Select the first field and rename it CUST_NAME_OUT
4. Select the second field and rename it PRODUCT_DESC_OUT
156
1. Set the precision to 50 for each new port
2. De-select the O (output) check box for the CUST_NAME and PRODUCT_DESC ports (they will be replaced by the new ports)
3. De-select the I for the new fields (they originate here and have no input)
4. When the O is selected, the expression editor box on the right becomes active
157
1. Click in the Expression box next to the first field, CUST_NAME_OUT (an arrow will appear)
2. Click the arrow to open the Expression Editor
158
1.Edit the Expression 2.Expand the Character folder 3.Select the Initcap function 4.Double-click the function to add it to the Formula
159
This is a simple expression telling PowerCenter to capitalize the first letter of the customer first and last name.
1. Edit the formula so that it matches the one above; remember, CUST_NAME is the input being modified
2. Close the editor
160
Repeat for PRODUCT_DESC_OUT 1.Press the down arrow to open the Expression Editor
161
1.Select the Initcap function 2.Edit the Formula so it matches the one above 3.Press OK
162
1. Click the row number at left and use the black arrows to move the row up or down
163
164
Next we need to format our date. In our flat file, the date is an 8-character string. We need to convert that string to a date format so that it matches the format the target database (Oracle) is expecting.
1. Validate the mapping; it should look like this
2. Open the Navigation Pane
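The conversion the expression performs is a classic string-to-date parse. Assuming the 8-character string is in YYYYMMDD form (the exact format string in the lab's expression may differ), the equivalent in Python is:

```python
from datetime import datetime

def to_date(yyyymmdd: str) -> datetime:
    """Parse an 8-character date string such as '20240315' into a date value."""
    return datetime.strptime(yyyymmdd, "%Y%m%d")

d = to_date("20240315")
print(d.year, d.month, d.day)  # -> 2024 3 15
```

Once parsed into a real date value, the target database receives a date, not text, so Oracle's own format expectations are satisfied.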
165
1. Select the Transformation Developer
2. Expand the Transformations folder in the left Navigation Pane
3. Drag the exp_formatted_date transformation onto the workspace
4. Double-click the transformation to edit it
166
1.Select the Ports tab 2.Open the Expression Editor for the formatted_date port
167
1. Review the expression formula
2. Press OK
3. Press OK on the next screen to close the Edit Transformations dialog
168
169
1. Select any Object Types that should be included in the report
2. Press OK
170
1. Review the report content. In this case there are no dependencies
2. Close the report
171
172
1. Connect the DATEOFTRANSACTION port to the DATE_IN port on the new Expression transformation
2. Add an Aggregator to our mapping
173
We need to calculate the total revenue for each customer. The Aggregator transformation performs these types of functions on groups of data. It can also collapse records based on a grouping criterion (CUST_ID in this case), eliminating duplicate sets of results.
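The Aggregator is essentially a group-by with an aggregate expression. The revenue roll-up can be sketched in Python like this (toy rows; the real session reads the staging table):

```python
from collections import defaultdict

rows = [
    {"CUST_ID": 1, "QUANTITY": 2, "PRICE": 10.0},
    {"CUST_ID": 1, "QUANTITY": 1, "PRICE": 5.0},
    {"CUST_ID": 2, "QUANTITY": 3, "PRICE": 4.0},
]

# Group by CUST_ID and sum QUANTITY * PRICE, collapsing the input
# down to one output row per customer.
total_revenue = defaultdict(float)
for r in rows:
    total_revenue[r["CUST_ID"]] += r["QUANTITY"] * r["PRICE"]

print(dict(total_revenue))  # -> {1: 25.0, 2: 12.0}
```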
174
1. Map the output ports from the two expressions to the Aggregator
2. Minimize the Expressions; we are done with them
3. Double-click the Aggregator to edit the transformation properties
175
Rename the Aggregator transformation
1. Click Rename
2. Name the transformation agg_revenue
176
1. Select the Ports tab
2. Remove the _OUT from the CUST_NAME and PRODUCT_DESC port names
3. Click the Add new port button once
A new port is added to the Aggregator
177
1.Rename NEWFIELD to TOTAL_REVENUE 2.Change the Datatype to Double 3.De-select the I so the Expression Editor becomes available 4.Click the arrow to open the Expression Editor
178
179
Our calculation computes the total revenue by customer. To accomplish this, data needs to be grouped by CUST_ID.
180
We are ready to map the fields from the aggregator to the GOOD_CUSTOMERS table
1. Select the relevant ports from the Aggregator
2. Connect the selected fields to the matching ports on the GOOD_CUSTOMERS target table
181
We want to map three of the fields in the Aggregator to our second target, CUSTOMER_DATES. The CUST_ID field will go to both targets.
1.Select the relevant ports from the Aggregator to map to CUSTOMER_DATES 2.Connect the selected fields to the matching ports on the target table
182
1. Save the mapping
2. Verify the mapping is VALID
3. Clean up: right-click anywhere in the workspace and select Arrange All
183
Congratulations!
You are now ready to load your data into the Data Warehouse.
184
Lab 5: Workflow
Using Workflow Manager and Monitor
185
186
Informatica Platform
Workflow Manager and Workflow Monitor
[Architecture diagram (repeated): PowerCenter Client tools, with the Workflow Manager and Workflow Monitor highlighted, connecting through the Services Framework (Repository Service, Integration Service) and the Repository to provider and consumer systems.]
187
Workflow Tasks
Assignment: Assigns a value to a workflow variable.
Command: Specifies a shell command to run during the workflow.
Control: Stops or aborts the workflow.
Decision: Specifies a condition to evaluate.
Email: Sends email during the workflow.
Event-Raise: Notifies the Event-Wait task that an event has occurred.
Event-Wait: Waits for an event to occur before executing the next task.
Session: Runs a mapping you create in the Designer.
Timer: Waits for a timed event to trigger.
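Linked tasks run in sequence, and a failing task can stop what follows (the Control-style behavior above). A toy Python sketch of that sequencing idea, with made-up session names, standing in for the real workflow engine:

```python
def run_workflow(tasks):
    """Run (name, task) pairs in link order; stop when one fails."""
    results = []
    for name, task in tasks:
        ok = task()
        results.append((name, "Succeeded" if ok else "Failed"))
        if not ok:
            break  # downstream sessions never start
    return results

results = run_workflow([
    ("s_remove_missing_customers", lambda: True),
    ("s_load_warehouse", lambda: True),
])
print(results)
```

Swapping either lambda for one returning False shows the gate: the later session is never reached, which mirrors why session order matters in the lab's workflow.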
188
In this Scenario
We will use the Workflow Manager and Workflow Monitor to build a workflow to execute the mappings we just built. We will configure our workflow and then monitor the workflow in the Workflow Monitor. Along the way, we will investigate the various options in both tool sets.
189
190
191
Press the orange W on the toolbar above to launch the Workflow Manager
192
Create worklets
Create workflows
193
194
195
We need to join our data and remove records with no customer name before we can load them into the Data Warehouse
1. Select the mapping m_remove_missing_customers
196
197
We need to configure the Session to connect to the source and target
1. Double-click the Session task to open and edit it
198
199
1.Scroll down under Properties until you see Source file directory 2.Enter the location C:\Workshop\IPC\Sources as the Source file directory 3.Enter TRANSACTIONS.dat as the Source filename
200
1. Select SQ_CUSTOMERS under Sources on the left
2. Click the drop-down to select the correct Oracle instance that houses this source
201
1. Select the Source connection under Objects (this is the Oracle instance where the CUSTOMERS table resides)
2. Press OK
202
Configure the target structures
1. Select CUSTOMER_NONAME from the Targets folder on the left
2. Click the drop-down box under Value to open the Relational Connection Browser
203
204
1. Right-click in the Value box
2. Select Apply Connection Value To All Instances to assign this connection value to all targets
205
1.Review the information for GOOD_CUST_STG 2.Notice Target is already filled in 3.Press OK to close the Session Editor
206
1. Under Properties, scroll down and select the Truncate target table option
2. Select OK to close the Session Editor
207
208
209
A new Session is added. We need to sequence the sessions so they execute in the proper order.
1. Select Link Tasks
2. Click on the left session and drag to the session on the right, so the sessions are connected
3. Double-click the new Session (on the right) to edit it
210
211
1.Select SQ_GOOD_CUST_STG under Sources on the left 2.Click the arrow under Value to select the correct Oracle instance for this source
212
In the Relational Connection Browser
1. Select the Target connection under Objects (why?)
2. Press OK
213
Configure the target structures
1. Select CUSTOMER_DATES from the Targets folder on the left
2. Click the arrow under Value to open the Relational Connection Browser
214
215
1. Right-click in the Value box
2. Select Apply Connection Value To All Instances to assign this connection value to all targets
216
217
1. Verify the lkp_product_description Connection Value uses Source; if not, click the arrow to set it
2. Press OK
218
Take a moment to review the configuration options under the remaining tabs
1. Review Properties
Properties allow you to specify log options, recovery strategy, commit intervals for this session in the workflow and so forth. Note in this case the workflow will continue even if the mapping fails.
219
The Config Object tab allows you to specify a variety of Advanced, Logging, Error Handling, and Grid related options. Scroll down to view the range of options available.
220
In the Components tab, you can configure pre-session shell commands, post-session commands, email messages if the session succeeds or fails, and variable assignments.
221
222
1. Verify the workflow is VALID; if not, scroll up to check for errors
2. Select Workflows > Start Workflow
223
Workflow Monitor provides a variety of views for monitoring workflows and sessions. This view shows the status of running jobs.
1. Notice that the Workflow Monitor is displayed when you start a workflow
2. Let the task run to completion
224
225
1. Select the Gantt Chart view
2. Right-click on the first session in the workflow we just ran
3. Select Get Run Properties
226
1. Review the Task Details
2. Note the session Status
3. Note the number of Source and Target rows
Do the results make sense? Two tables were joined, so we would expect a lower total written than read.
1. Click and expand Source/Target Statistics
227
Review the Source/Target Statistics. Rows were written to the CUSTOMER_NONAME table, but 11 rows were rejected. Scroll over and check the Last Error Message for that target.
Looks like the Writer execution failed for some reason with error 8425. Let's take a look at the session log and find out what the 8425 error is.
228
1.Select Find. . . 2.Enter the Error Number 8425 3.Select the radio button for All fields 4.Select Find Next
229
In order to debug, let's override writing our data to the CUSTOMER_NONAME table.
231
1.Select CUSTOMER_NONAME from the Mapping tab 2.Override the Relational Writer 3.Select the drop-down box
232
233
1. Under Properties, scroll down to Header Options
2. Click the drop-down and select Output Field Names
3. Select Set File Properties
234
1.Switch the radio button to Delimited 2.Press OK 3.Press OK to exit the Session Editor
4.
235
1.Save the changes we made 2.Verify the workflow is VALID 3.Run the workflow again
236
237
238
As we suspected, we have duplicate Customer IDs and will have to deal with that in our mapping, but we'll save that for another day!
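Spotting such duplicates ahead of the fix is straightforward. A quick Python check over made-up preview rows (not the lab's actual data) shows the idea:

```python
from collections import Counter

cust_ids = [101, 102, 102, 103, 101]  # IDs as seen in a data preview

# Any ID occurring more than once is a duplicate we will need to handle.
dupes = sorted(cid for cid, n in Counter(cust_ids).items() if n > 1)
print(dupes)  # -> [101, 102]
```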
239
1.Go to Designer and open the Target Designer 2.Right click on the GOOD_CUSTOMERS target 3.Select Preview Data. . .
240
1.Verify the ODBC data source is target 2.Enter Username: target 3.Enter Password: target 4.Press Connect
241
242
243
244
In this Scenario
As a developer you want to test the mapping you built prior to running the data to ensure that the logic in the mapping will work. For this lab we will use a pre-built mapping to review the features of the Debugger
245
246
247
1.Open the Mapping Designer 2.Expand the Mappings Folder 3.Drag M_DebuggerLab to the Mapping Designer workspace
249
Debugger Toolbar Start Debugger Stop the Debugger Next Instance Step to Instance Show Current Instance Continue Break Now Edit Breakpoints
250
251
252
1. Select Int_Workshop_Service as the Integration Service on which to run the session
2. Leave the defaults
3. Click Next
253
1. Choose the Source and Target database connections
2. Leave the default values of Source and Target
3. Click Next
254
255
256
Configure Session parameters 1.Check Discard target data 2.Click Finish to start the session
257
Let's adjust the toolbars so it is easier to work with the Debugger.
1. Right-click on the toolbar and unselect the Advanced Transformations toolbar
2. Repeat and select Debugger so the toolbar is visible
258
259
260
1. Select the Add button to create a breakpoint at the Expression transformation
2. Under Condition, click the Add box to set the breakpoint rules
3. Edit the rule so that it will stop when CUST_ID = 325
4. Click OK
261
262
Debugger Menu
Breakpoint
263
Next Instance
From the Debugger Toolbar:
1. Click Next Instance to step into the mapping
2. Review values and outputs in the debug panes
3. Continue to step through and monitor changes
See Output
Examine values
264
265
266
267
268
Informatica Platform
Workflow Manager and Workflow Monitor
Provider
PowerCenter
Design Manage Workflow Manager Monitor Workflow Monitor
Consumer
Portals , Dashboards , and Reports XML , Messaging , and Web Services
Client
Administrator
Designer
Packaged Applications
Services Framework
Repository Service
Packaged Applications
Repository
Relational and Flat Files
Integration Service
Web Services
269
Metadata Services
Allows retrieval of PowerCenter metadata via web services including metadata for Repositories, Servers, Folders, Workflows, and Tasks
Repository 1
Repository 2
Response
Data Servers
ETL Workflow
S1 S2 S3 S4
271
Usability
Web Service Wizards: does not require WSDL knowledge
WSDL Workspace: efficiently handles complex SOAP request/response
Built-in Web Services Testing tool
Web Services Monitoring and Reports
272
Supports RPC/Encoded and Document/Literal web services Does not use the Web Service Hub
273
Purchasing System
Inventory System
PowerCenter Session
274
275
In this Scenario
We have been asked to develop a reusable Web Service that will accept an Employee Number and return the Employee's Name, Job, Manager Name, Hire Date, Salary, Commission, and Department Name. The Web Service needs to be exposed via the Web Services Hub so it can be called by external systems that require this information.
276
277
278
From the menu bar, select Mappings > Create Web Service Mapping > Use Source/Target definitions
280
281
282
1. Scroll to the bottom of the Source Ports list
2. Click on 8
3. Click the Delete a port icon to remove fields
4. Repeat until only EMPNO remains
283
284
285
1. Enter the Web Service name GetEmpInfoLab (the name will be used by PowerCenter)
2. Check Create Mapping
3. Click OK
PowerCenter will generate a mapping that returns all the fields in the EMP table.
286
287
288
1. Select Lookup from the drop-down list
2. Name the transformation lkp_GetEMP
3. Press Create
289
1. Select Source as the location of the Lookup Table
2. Navigate to the EMP table under WebServiceLab > Sources > Scott
3. Click OK
290
291
1. Click and drag n_EMPNO in the Source Qualifier to the new Lookup transformation
2. Double-click the Lookup to open the transformation editor
292
The default condition links EMPNO with the input value of n_EMPNO; this looks up the incoming value and checks whether it is found in the EMP table. This is the condition we want.
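The lookup behavior described here can be sketched in plain Python: a hypothetical in-memory stand-in for the EMP table keyed on EMPNO, where the condition EMPNO = n_EMPNO either finds the matching row or returns nothing. The sample rows are illustrative, not the lab data.

```python
# Hypothetical in-memory stand-in for the EMP lookup table; the rows
# here are illustrative, not the actual lab data.
EMP = {
    7844: {"ENAME": "TURNER", "JOB": "SALESMAN"},
    7369: {"ENAME": "SMITH", "JOB": "CLERK"},
}

def lkp_get_emp(n_empno):
    # Lookup condition: EMPNO = n_EMPNO -- return the matching row,
    # or None when the incoming value is not found in EMP.
    return EMP.get(n_empno)

print(lkp_get_emp(7844))   # matching row
print(lkp_get_emp(9999))   # None: no match in EMP
```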
293
Map the Lookup ports to the Web Services target:
1. Select all Lookup ports except for the input n_EMPNO
2. Drag the fields to the n_EMPNO port in the target
294
In the Output Window you should see Mapping M_GetEmpInfoLab is VALID. We are now ready to create our Workflow.
295
296
297
1. Name the Workflow wf_GetEmpInfoLab
2. Click Enabled to enable the Workflow as a Web Service
3. Click Config Service...
298
1. Enter GetEmpInfoLab as the Service Name
2. Set the Maximum Run Count Per Hub to 5
3. Click Visible and Runnable
4. Click the icon to configure Web Service Hubs
299
1. Choose Select from the list
2. Highlight PowerCenter WebServices Hub
3. Click OK
300
1. Validate that the Config Service dialog looks like this
2. Click OK
301
1. Validate that the Create Workflow dialog looks like this
2. Click OK to close
302
Add a Session
Add a Session to the Workflow:
1. Click the Session icon
2. Click in the Workflow Designer workspace
303
Add a Session
304
1. Click the Link Tasks icon
2. Click on the Start task and drag to the s_M_GetEmpInfoLab Session to link the two
3. Double-click the s_M_GetEmpInfoLab task to open it for editing
305
1. Select the Mapping tab to configure the mapping
2. Click lkp_GetEMP under the Transformations folder
3. Click the arrow under Connections to choose the correct database connection
306
307
Configure Session
1. Validate that the Session looks like this
2. Click OK
308
309
Connect to the Web Services Hub:
1. Open Internet Explorer
2. Click on the Web Services Hub link
310
In the Web Services Hub navigation pane, under Realtime Web Services:
1. Select Valid WebService
2. In the Web Services pane, scroll and find GetEmpInfoLab
3. Click GetEmpInfoLab so that the row is selected (green)
4. Click Try-it
311
312
1. Enter 7844 for the EMPNO
2. Press Send
Other valid EMPNOs are 7369, 7499, 7521.
313
Success!
314
315
Informatica Community
my.informatica.com
316
My.informatica.com Assets
Searchable knowledge base
Online support and service request management
Product documentation and demos
Comprehensive partner sales, support, and training tools
Velocity, Informatica's implementation methodology
Educational services offerings
Mapping Templates
Link to devnet
Many more
317
Developer Network
devnet.informatica.com
318
319
Some examples are shown here. A more complete set of transformation naming conventions can be found in Appendix C at the end of this Guide.
320
In this Scenario
You are the regional manager for a series of car dealerships. Management has asked you to track the progress of your employees. Specifically, you need to capture:
Employee name Name of the dealership they work at What they have sold How much they have sold (net revenue)
321
1. Create a new target definition to use in the mapping, and create a target table based on the new target definition.
2. Create a mapping using the new target definition. You will add the following transformations to the mapping:
Lookup transformation: finds the names of the employees, the dealerships they work at, and the products they have sold.
Aggregator transformation: calculates the net revenue each employee has generated.
Expression transformation: formats all employee names and product descriptions.
3. Create a workflow to run the mapping in the Workflow Manager.
4. Monitor the workflow in the Workflow Monitor.
322
323
Step 2: Mapping
1. Open up Mapping Designer
2. Create a new mapping; call it whatever you like
3. Bring in the mm_transaction source and T_Employee_Summary target
4. Find the dealership name (hint: use the mm_data user, as all dealership names are kept in the mm_dealership table)
5. Find the product description (hint: use the mm_data user, as all product descriptions are kept in the mm_product table)
6. Find the employee name (hint: use the mm_data user, as all employee names are kept in the mm_employees table)
7. Format the employee name and make sure the name is capitalized
8. Format the product description and make sure the initial letters are capitalized
9. Calculate net revenue (hint: keep it simple, net revenue is revenue minus cost)
10. Group by Employee_ID to collapse all unique employees
11. Map to the target table
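The transformation chain in these steps (lookup, format, aggregate, group) can be sketched outside PowerCenter as plain Python. The rows, names, and amounts below are invented purely for illustration; this is a sketch of the logic, not the lab solution.

```python
# Hypothetical rows standing in for mm_transaction already joined with
# the lookup tables (mm_employees, mm_dealership, mm_product).
transactions = [
    {"employee": "jane doe", "dealership": "Westside Motors",
     "product": "compact sedan", "revenue": 21000.0, "cost": 18500.0},
    {"employee": "jane doe", "dealership": "Westside Motors",
     "product": "luxury suv", "revenue": 55000.0, "cost": 47000.0},
]

summary = {}
for row in transactions:
    # Expression step: capitalize the employee name
    # (product formatting omitted for brevity).
    key = row["employee"].title()
    # Aggregator step: net revenue = revenue - cost, grouped by employee.
    summary.setdefault(key, 0.0)
    summary[key] += row["revenue"] - row["cost"]

print(summary)  # {'Jane Doe': 10500.0}
```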
324
325
326
327
Sorted input
1. For the mapping in Lab 3, turn on the Sorted Input property of the Joiner.
2. Do the same for the Aggregator in the mapping of Lab 4.
Hints: Sorted Input requires the data coming into the transformation to be sorted. This can be achieved either by sorting in the Source Qualifier or by adding a Sorter transformation. In Lab 3, one source is a flat file, which will require adding a Sorter transformation. The second source is an Oracle table; it can be sorted by Oracle, or you can use a Sorter transformation.
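The reasoning behind Sorted Input can be illustrated with Python's itertools.groupby, which has the same contract: it only groups correctly when the input already arrives sorted on the key, and in exchange each group can be emitted as soon as the key changes instead of caching every group. The data below is made up.

```python
from itertools import groupby

# Rows of (region, amount); unsorted, as they might arrive from a source.
rows = [("east", 10), ("west", 5), ("east", 7), ("west", 3)]

# The Sorter-transformation step: sort on the group key first.
rows.sort(key=lambda r: r[0])

# Streaming aggregation, valid only because the input is sorted.
totals = {region: sum(amount for _, amount in grp)
          for region, grp in groupby(rows, key=lambda r: r[0])}
print(totals)  # {'east': 17, 'west': 8}
```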
328
Dynamic Partitioning
1. Your VM is set for 2 CPU cores.
2. Partitioning is a powerful way of improving performance.
3. Informatica provides automated ways of partitioning.
4. Revisit the sessions created in Lab 5 and set dynamic partitioning on them based on the number of CPUs.
Hint: look in the session Properties tab.
329
Thank You!
330
331
Transformation Objects
Aggregator: Performs aggregate calculations.
Application Source Qualifier: Represents the rows that the PowerCenter Server reads from an application, such as an ERP source, when it runs a session.
Custom: Calls a procedure in a shared library or DLL.
Data Masking: Replaces sensitive production data with realistic test data for non-production environments.
Expression: Calculates a value.
External Procedure: Calls a procedure in a shared library or in the COM layer of Windows.
Filter: Filters data.
HTTP: Connects to an HTTP server to read or update data.
Input: Defines mapplet input rows. Available in the Mapplet Designer.
Java: Executes user logic coded in Java. The byte code for the user logic is stored in the repository.
Joiner: Joins data from different databases or flat file systems.
Lookup: Looks up values.
Normalizer: Source qualifier for COBOL sources. Can also be used in the pipeline to normalize data from relational or flat file sources.
Output: Defines mapplet output rows. Available in the Mapplet Designer.
Rank: Limits records to a top or bottom range.
332
Transformation Objects
Router: Routes data into multiple transformations based on group conditions.
Sequence Generator: Generates primary keys.
Sorter: Sorts data based on a sort key.
Source Qualifier: Represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session.
SQL: Executes SQL queries against a database.
Stored Procedure: Calls a stored procedure.
Transaction Control: Defines commit and rollback transactions.
Union: Merges data from different databases or flat file systems.
Unstructured Data: Transforms data in unstructured and semi-structured formats.
Update Strategy: Determines whether to insert, delete, update, or reject rows.
XML Generator: Reads data from one or more input ports and outputs XML through a single output port.
XML Parser: Reads XML from one input port and outputs data to one or more output ports.
XML Source Qualifier: Represents the rows that the Integration Service reads from an XML source when it runs a session.
333
334
Aggregate Functions
Function Description
AVG: Returns the average of all values in a group.
COUNT: Returns the number of records with non-null values in a group.
FIRST: Returns the first record in a group.
LAST: Returns the last record in a group.
MAX: Returns the maximum value, or latest date, found in a group.
MEDIAN: Returns the median of all values in a selected port.
MIN: Returns the minimum value, or earliest date, found in a group.
PERCENTILE: Calculates the value that falls at a given percentile in a group of numbers.
STDDEV: Returns the standard deviation for a group.
SUM: Returns the sum of all records in a group.
VARIANCE: Returns the variance of all records in a group.
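Several of these aggregates have direct counterparts in Python's statistics module, which can help make their definitions concrete. This is a rough analogy only; PowerCenter's handling of nulls and groups differs.

```python
import statistics

# Rough Python analogues of several aggregate functions; illustrative
# only (PowerCenter's null handling and group-by semantics differ).
values = [4, 8, 15, 16, 23, 42]

avg = sum(values) / len(values)       # AVG
cnt = len(values)                     # COUNT
med = statistics.median(values)       # MEDIAN
print(avg, cnt, max(values), min(values), med, sum(values))
```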
335
Character Functions
ASCII: In ASCII mode, returns the numeric ASCII value of the first character of the string passed to the function. In Unicode mode, returns the numeric Unicode value of the first character of the string passed to the function.
CHR: Returns the ASCII or Unicode character corresponding to the specified numeric value.
CHRCODE: In ASCII mode, returns the numeric ASCII value of the first character of the string passed to the function. In Unicode mode, returns the numeric Unicode value of the first character of the string passed to the function.
CONCAT: Concatenates two strings.
INITCAP: Capitalizes the first letter in each word of a string and converts all other letters to lowercase.
INSTR: Returns the position of a character set in a string, counting from left to right.
336
LENGTH: Returns the number of characters in a string, including trailing blanks.
LOWER: Converts uppercase string characters to lowercase.
LPAD: Adds a set of blanks or characters to the beginning of a string, to set a string to a specified length.
LTRIM: Removes blanks or characters from the beginning of a string.
METAPHONE: Encodes characters of the English language alphabet (A-Z). It encodes both uppercase and lowercase letters in uppercase.
REPLACECHR: Replaces characters in a string with a single character or no character.
REPLACESTR: Replaces characters in a string with a single character, multiple characters, or no character.
RPAD: Converts a string to a specified length by adding blanks or characters to the end of the string.
RTRIM: Removes blanks or characters from the end of a string.
SOUNDEX: Works for characters in the English alphabet (A-Z). It uses the first character of the input string as the first character in the return value and encodes the remaining three unique consonants as numbers.
SUBSTR: Returns a portion of a string.
UPPER: Converts lowercase string characters to uppercase.
337
Conversion Functions
TO_BIGINT: Converts a string or numeric value to a bigint value.
TO_CHAR: Converts numeric values and dates to text strings.
TO_DATE: Converts a character string to a date datatype in the same format as the character string.
TO_DECIMAL: Converts any value (except binary) to a decimal.
TO_FLOAT: Converts any value (except binary) to a double-precision floating point number (the Double datatype).
TO_INTEGER: Converts any value (except binary) to an integer by rounding the decimal portion of a value.
338
GREATEST: Returns the greatest value from a list of input values.
IN: Matches input data to a list of values.
INSTR: Returns the position of a character set in a string, counting from left to right.
IS_DATE: Returns whether a value is a valid date.
IS_NUMBER: Returns whether a string is a valid number.
IS_SPACES: Returns whether a value consists entirely of spaces.
ISNULL: Returns whether a value is NULL.
LEAST: Returns the smallest value from a list of input values.
LTRIM: Removes blanks or characters from the beginning of a string.
METAPHONE: Encodes characters of the English language alphabet (A-Z). It encodes both uppercase and lowercase letters in uppercase.
339
REG_EXTRACT: Extracts subpatterns of a regular expression within an input value.
REG_MATCH: Returns whether a value matches a regular expression pattern.
REG_REPLACE: Replaces characters in a string with another character pattern.
REPLACECHR: Replaces characters in a string with a single character or no character.
REPLACESTR: Replaces characters in a string with a single character, multiple characters, or no character.
RTRIM: Removes blanks or characters from the end of a string.
SOUNDEX: Encodes a string value into a four-character string.
SUBSTR: Returns a portion of a string.
TO_BIGINT: Converts a string or numeric value to a bigint value.
TO_CHAR: Converts numeric values and dates to text strings.
TO_DATE: Converts a character string to a date datatype in the same format as the character string.
TO_DECIMAL: Converts any value (except binary) to a decimal.
TO_FLOAT: Converts any value (except binary) to a double-precision floating point number (the Double datatype).
TO_INTEGER: Converts any value (except binary) to an integer by rounding the decimal portion of a value.
340
Date Functions
ADD_TO_DATE: Adds a specified amount to one part of a date/time value, and returns a date in the same format as the specified date.
DATE_COMPARE: Returns a value indicating the earlier of two dates.
DATE_DIFF: Returns the length of time between two dates, measured in the specified increment (years, months, days, hours, minutes, or seconds).
GET_DATE_PART: Returns the specified part of a date as an integer value, based on the default date format of MM/DD/YYYY HH24:MI:SS.
IS_DATE: Returns whether a string value is a valid date.
LAST_DAY: Returns the date of the last day of the month for each date in a port.
MAKE_DATE_TIME: Returns the date and time based on the input values.
ROUND: Rounds one part of a date.
SET_DATE_PART: Sets one part of a date/time value to a specified value.
TO_CHAR: Passes the date values you want to convert to character strings.
TRUNC: Truncates dates to a specific year, month, day, hour, or minute.
341
Encoding Functions
AES_DECRYPT: Returns encrypted data to string format.
AES_ENCRYPT: Returns data in encrypted format.
COMPRESS: Compresses data using the zlib compression algorithm.
CRC32: Returns a 32-bit Cyclic Redundancy Check (CRC32) value.
DEC_BASE64: Decodes the value and returns a string with the binary data representation of the data.
DECOMPRESS: Decompresses data using the zlib compression algorithm.
ENC_BASE64: Encodes data by converting binary data to string data using Multipurpose Internet Mail Extensions (MIME) encoding.
MD5: Calculates the checksum of the input value. The function uses the Message-Digest algorithm 5 (MD5).
342
Financial Functions
FV: Returns the future value of an investment, where you make periodic, constant payments and the investment earns a constant interest rate.
NPER: Returns the number of periods for an investment based on a constant interest rate and periodic, constant payments.
PMT: Returns the payment for a loan based on constant payments and a constant interest rate.
PV: Returns the present value of an investment.
RATE: Returns the interest rate earned per period by a security.
343
Numeric Functions
ABS: Returns the absolute value of a numeric value.
CEIL: Returns the smallest integer greater than or equal to the specified numeric value.
CONVERT_BASE: Converts a number from one base value to another base value.
CUME: Returns a running total of all numeric values.
EXP: Returns e raised to the specified power (exponent), where e = 2.71828183.
FLOOR: Returns the largest integer less than or equal to the specified numeric value.
LN: Returns the natural logarithm of a numeric value.
LOG: Returns the logarithm of a numeric value.
MOD: Returns the remainder of a division calculation.
MOVINGAVG: Returns the average (record-by-record) of a specified set of records.
MOVINGSUM: Returns the sum (record-by-record) of a specified set of records.
POWER: Returns a value raised to the specified exponent.
RAND: Returns a random number between 0 and 1.
ROUND: Rounds numbers to a specified digit.
SIGN: Indicates whether a numeric value is positive, negative, or 0.
SQRT: Returns the square root of a positive numeric value.
TRUNC: Truncates numbers to a specific digit.
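A few of these numeric functions translate directly to Python operators, and MOVINGAVG can be sketched as a windowed average. Illustrative only; record-by-record semantics in PowerCenter operate per row within a pipeline.

```python
# Python analogues of a few numeric functions (sketch).
print(abs(-7))                      # ABS -> 7
print(17 % 5)                       # MOD -> 2
print(2 ** 10)                      # POWER(2, 10) -> 1024
print(round(3.14159, 2))            # ROUND to 2 digits -> 3.14
print(int("ff", 16))                # CONVERT_BASE-like: base 16 -> 255

def moving_avg(values, n):
    # MOVINGAVG-like: average of the last n values, record by record.
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]

print(moving_avg([2, 4, 6, 8], 2))  # [3.0, 5.0, 7.0]
```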
344
Scientific Functions
Function Description
COS: Returns the cosine of a numeric value (expressed in radians).
COSH: Returns the hyperbolic cosine of a numeric value.
SIN: Returns the sine of a numeric value (expressed in radians).
SINH: Returns the hyperbolic sine of a numeric value.
TAN: Returns the tangent of a numeric value (expressed in radians).
TANH: Returns the hyperbolic tangent of a numeric value.
345
Special Functions
Function Description
ABORT: Stops the session and issues the specified error message to the session log file.
DECODE: Searches a port for a value you specify.
ERROR: Causes the PowerCenter Server to skip a record and issue the specified error message.
IIF: Returns one of two values you specify, based on the results of a condition.
LOOKUP: Searches for a value in a lookup source column. Informatica recommends using the Lookup transformation.
346
String Functions
Function Description
CHOOSE: Chooses a string from a list of strings based on a given position.
INDEXOF: Finds the index of a value among a list of values.
REVERSE: Reverses the input string.
347
Test Functions
IS_DATE: Returns whether a value is a valid date.
IS_NUMBER: Returns whether a string is a valid number.
IS_SPACES: Returns whether a value consists entirely of spaces.
ISNULL: Returns whether a value is NULL.
348
Variable Functions
SETCOUNTVARIABLE: Counts the rows evaluated by the function and increments the current value of a mapping variable based on the count.
SETMAXVARIABLE: Sets the current value of a mapping variable to the higher of two values: the current value of the variable or the value specified. Returns the new current value.
SETMINVARIABLE: Sets the current value of a mapping variable to the lower of two values: the current value of the variable or the value specified. Returns the new current value.
SETVARIABLE: Sets the current value of a mapping variable to a value you specify. Returns the specified value.
SYSTIMESTAMP: Returns the current date and time of the node hosting the Integration Service with precision to the nanosecond.
349
350
Transformation Naming
Each object in a PowerCenter repository is identified by a unique name. This allows PowerCenter to efficiently manage and track statistics all the way down to the object level.
When an object is created, PowerCenter automatically generates a unique name. These names, however, do not reflect project/repository specific context. As a best practice Informatica recommends the following convention for naming PowerCenter objects:
Abbreviations are usually 1 to 3 letters, the minimum needed to easily identify the object type. For example:
agg_quarterly_total (an Aggregator that computes quarterly totals)
m_load_dw (a mapping that loads the data warehouse)
exp_string_to_date (an Expression that converts a string to a date)
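A convention like this is easy to enforce mechanically. The sketch below checks names against a small prefix map; the prefix list here is illustrative, not the full set from Appendix C.

```python
import re

# Small checker for the prefix naming convention described above.
# The prefix map is illustrative, not the full Appendix C set.
PREFIXES = {"m": "mapping", "exp": "expression", "agg": "aggregator",
            "lkp": "lookup", "s": "session", "wf": "workflow"}

def check_name(name):
    # Expect: 1-3 lowercase letters, an underscore, then the object name.
    m = re.match(r"([a-z]{1,3})_\w+", name)
    return PREFIXES.get(m.group(1)) if m else None

print(check_name("agg_quarterly_total"))   # "aggregator"
print(check_name("m_load_dw"))             # "mapping"
print(check_name("BadName"))               # None: does not follow the convention
```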
351