
MDM Multidomain Edition (Version 9.0.1)

Administrator Guide
Informatica MDM Multidomain Hub - Version 9.0.1 - 2010
Copyright (c) 2010 Informatica. All rights reserved.
This software and documentation contain proprietary information of Informatica Corporation and are provided under a license
agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is
prohibited.
No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or
otherwise) without prior consent of Informatica Corporation. This Software is protected by U.S. and/or international Patents and
other Patents Pending.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software
license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR
12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.
The information in this product or documentation is subject to change without notice. If you find any problems in this product or
documentation, please report them to us in writing.
Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter
Data Analyzer, PowerExchange, PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica
B2B Data Transformation, Informatica B2B Data Exchange, Informatica On Demand and Siperian are trademarks or registered
trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product
names may be trade names or trademarks of their respective owners.
Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright
DataDirect Technologies. All rights reserved. Copyright © Sun Microsystems. All rights reserved.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and other software which is
licensed under the Apache License, Version 2.0 (the "License"). You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the
License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and limitations under the License.
This product includes software which is licensed under the GNU Lesser General Public License Agreement, which may be found at
http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind,
either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.
This product includes software which is licensed under the CDDL (the "License"). You may obtain a copy of the License at
http://www.sun.com/cddl/cddl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind,
either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. See
the License for the specific language governing permissions and limitations under the License.
This product includes software which is licensed under the BSD License (the "License"). You may obtain a copy of the License at
http://www.opensource.org/licenses/bsd-license.php. The materials are provided free of charge by Informatica, "as-is", without
warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a
particular purpose. See the License for the specific language governing permissions and limitations under the License.
This product includes software Copyright (c) 2003-2008, Terence Parr, all rights reserved which is licensed under the BSD License (the
"License"). You may obtain a copy of the License at http://www.antlr.org/license.html. The materials are provided free of charge by
Informatica, "as-is", without warranty of any kind, either express or implied, including but not limited to the implied warranties of
merchantability and fitness for a particular purpose. See the License for the specific language governing permissions and limitations
under the License.
This product includes software Copyright (c) 2000 - 2009 The Legion Of The Bouncy Castle (http://www.bouncycastle.org) which is
licensed under a form of the MIT License (the "License"). You may obtain a copy of the License at
http://www.bouncycastle.org/licence.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any
kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.
See the License for the specific language governing permissions and limitations under the License.
DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied,
including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. Informatica
Corporation does not warrant that this software or documentation is error free. The information provided in this software or
documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is
subject to change at any time without notice.
NOTICES
This Informatica product (the “Software”) may include certain drivers (the “DataDirect Drivers”) from DataDirect Technologies, an
operating company of Progress Software Corporation (“DataDirect”) which are subject to the following terms and conditions:
1. THE DATADIRECT DRIVERS ARE PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR
IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE
ODBC DRIVERS, WHETHER OR NOT INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS
APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH OF CONTRACT, BREACH OF
WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.
Contents

Contents 3

Preface 10

Organization 10

Learning About Informatica MDM Hub 13

Informatica Global Customer Support 15

Informatica Resources 16

Part 1: Introduction 18

Chapter 1: Introduction to Informatica MDM Hub Administration 19

About Informatica MDM Hub Administrators 19

Phases in Informatica MDM Hub Administration 20

Summary of Administration Tasks 21

Chapter 2: Getting Started with the Hub Console 29

About the Hub Console 29

Starting the Hub Console 30

Navigating the Hub Console 32

Informatica MDM Hub Workbenches and Tools 46

Part 2: Building the Data Model 50

Chapter 3: About the Hub Store 51

Databases in the Hub Store 51

How Hub Store Databases Are Related 51

Creating Hub Store Databases 52

Version Requirements 52

Chapter 4: Configuring Operational Reference Stores and Datasources 54

Before You Begin 54

About the Databases Tool 54

Starting the Databases Tool 55

Configuring Operational Reference Stores 55

Configuring Datasources 71

Chapter 5: Building the Schema 73

Before You Begin 73

About the Schema 73

Starting the Schema Manager 81

Configuring Base Objects 82

Configuring Columns in Tables 102

Configuring Foreign-Key Relationships Between Base Objects 113

Viewing Your Schema 119

Chapter 6: Configuring Queries and Packages 127

Before You Begin 127

About Queries and Packages 127

Configuring Queries 127

Configuring Packages 151

Chapter 7: State Management 159

Before You Begin 159

About State Management in Informatica MDM Hub 159

Configuring State Management for Base Objects 162

Modifying the State of Records 164

Rules for Loading Data 168

Chapter 8: Configuring Hierarchies 169

About Configuring Hierarchies 169

Starting the Hierarchies Tool 178

Configuring Hierarchies 191

Configuring Relationship Base Objects and Relationship Types 193

Configuring Packages for Use by HM 205

Configuring Profiles 211

Sandboxes 216

Part 3: Configuring the Data Flow 217

Chapter 9: Informatica MDM Hub Processes 218

Before You Begin 218

About Informatica MDM Hub Processes 218

Land Process 221

Stage Process 224

Load Process 227

Tokenize Process 240

Match Process 245

Consolidate Process 255

Publish Process 260

Chapter 10: Configuring the Land Process 264

Before You Begin 264

Configuration Tasks for the Land Process 264

Configuring Source Systems 264

Configuring Landing Tables 269

Chapter 11: Configuring the Stage Process 274

Before You Begin 274

Configuration Tasks for the Stage Process 274

Configuring Staging Tables 275

Mapping Columns Between Landing and Staging Tables 286

Using Audit Trail and Delta Detection 300

Chapter 12: Configuring Data Cleansing 307

Before You Begin 307

About Data Cleansing in Informatica MDM Hub 307

Configuring Cleanse Match Servers 308

Using Cleanse Functions 314

Configuring Cleanse Lists 333

Chapter 13: Configuring the Load Process 343

Before You Begin 343

Configuration Tasks for Loading Data 343

Configuring Trust for Source Systems 344

Configuring Validation Rules 353

Chapter 14: Configuring the Match Process 363

Before You Begin 363

Configuration Tasks for the Match Process 363

Navigating to the Match/Merge Setup Details Dialog 365

Configuring Match Properties for a Base Object 366

Configuring Match Paths for Related Records 373

Configuring Match Columns 387

Configuring Match Rule Sets 399

Configuring Match Column Rules for Match Rule Sets 407

Configuring Primary Key Match Rules 434

Investigating the Distribution of Match Keys 438

Excluding Records from the Match Process 441

Chapter 15: Configuring the Consolidate Process 443

Before You Begin 443

About Consolidation Settings 443

Changing Consolidation Settings 447

Chapter 16: Configuring the Publish Process 449

Before You Begin 449

Configuration Steps for the Publish Process 450

Starting the Message Queues Tool 450

Configuring Global Message Queue Settings 451

Configuring Message Queue Servers 452

Configuring Outbound Message Queues 454

Configuring Message Triggers 456

JMS Message XML Reference 464

Legacy JMS Message XML Reference 479

Part 4: Executing Informatica MDM Hub Processes 495

Chapter 17: Using Batch Jobs 496

Before You Begin 496

About Informatica MDM Hub Batch Jobs 496

Running Batch Jobs Using the Batch Viewer Tool 501

Running Batch Jobs Using the Batch Group Tool 512

Batch Jobs Reference 530

Chapter 18: Writing Custom Scripts to Execute Batch Jobs 559

About Executing Informatica MDM Hub Batch Jobs 559

Setting Up Job Execution Scripts 560

Monitoring Job Results and Statistics 563

Stored Procedure Reference 566

Executing Batch Groups Using Stored Procedures 598

Developing Custom Stored Procedures for Batch Jobs 604

Part 5: Configuring Application Access 610

Chapter 19: Generating ORS-specific APIs and Message Schemas 611

Before You Begin 611

Generating ORS-specific APIs 611

Generating ORS-specific Message Schemas 615

Chapter 20: Setting Up Security 621

About Setting Up Security 621

Securing Informatica MDM Hub Resources 629

Configuring Roles 638

Configuring Informatica MDM Hub Users 646

Configuring User Groups 658

Assigning Users to the Current ORS Database 661

Assigning Roles to Users and User Groups 662

Managing Security Providers 664

Chapter 21: Viewing Registered Custom Code 678

About User Objects 678

About the User Object Registry Tool 678

Starting the User Object Registry Tool 679

Viewing User Exits 679

Viewing Custom Stored Procedures 680

Viewing Custom Java Cleanse Functions 681

Viewing Custom Button Functions 682

Chapter 22: Auditing Informatica MDM Hub Services and Events 684

About Integration Auditing 684

Starting the Audit Manager 686

Auditing SIF API Requests 688

Auditing Message Queues 689

Auditing Errors 690

Using the Audit Log 691

Part 6: Appendixes 697

Appendix A: Configuring International Data Support 698

Configuring Unicode in Informatica MDM Hub 698

Configuring the ANSI Code Page (Windows Only) 703

Configuring NLS_LANG 704

Appendix B: Backing Up and Restoring Informatica MDM Hub 706

Backing Up Informatica MDM Hub 706

Backup and Recovery Strategies for Informatica MDM Hub 706

Appendix C: Configuring User Exits 708

About User Exits 708

Types of User Exits 708

Appendix D: Viewing Configuration Details 715

About the Enterprise Manager 715

Starting the Enterprise Manager 715

Viewing Enterprise Manager Properties 716

Viewing Version History 723

Using ORS Database Logs 724

Appendix E: Implementing Custom Buttons in Hub Console Tools 730

About Custom Buttons in the Hub Console 730

Adding Custom Buttons 731

Appendix F: Configuring Access to Hub Console Tools 737

About User Access to Hub Console Tools 737

Starting the Tool Access Tool 737

Granting User Access to Tools and Processes 738

Revoking User Access to Tools and Processes 739

Appendix G: Row-level Locking 740

About Row-level Locking 740

Configuring Row-level Locking 741

Locking Interactions Between SIF Requests and Batch Processes 742

Glossary 744

Index 786

Preface

Organization
This guide contains the following chapters:

"Introduction" Provides an overview of Informatica MDM Hub


on page 18 administration and explains how to navigate the Hub
Console.
"Introduction to Introduces Informatica MDM Hub administration phases,
Informatica tools, and tasks.
MDM Hub
Administration"
on page 19
"Getting Started Introduces tools in the Hub Console and provides general
with the Hub navigation instructions.
Console" on
page 29
"Building the Describes how to construct the schema (data model) used in
Data Model" your Informatica MDM Hub implementation and stored in the
on page 50 Hub Store. It provides instructions on using Hub Console
tools to configure Operational Reference Stores (ORSs),
datasources, the data model, queries, packages, hierarchies,
and other objects.
"About the Hub Describes the key components of the Hub Store: the Master
Store" on page Database and Operational Reference Stores (ORS).
51
"Configuring Explains how to configure Operational Reference Stores
Operational (ORS) and datasources.
Reference
Stores and
Datasources" on
page 54
"Building the Describes the Hub Store schema and provides instructions on
Schema" on building the schema for your Informatica MDM Hub
page 73 implementation.
"Configuring Explains how to use and create Informatica MDM Hub queries
Queries and and packages.
Packages" on
page 127
"State Describes state management concepts and provides
Management" instructions for configuring state management in your
on page 159 Informatica MDM Hub implementation.
"Configuring Explains how to configure Informatica Hierarchy Manager
Hierarchies" on (HM) and describes how to create and configure relationships
page 169 based on foreign keys.
"Configuring Describes the flow of data through the Informatica MDM Hub
the Data via a series of processes (land, stage, load, match,
Flow" on page consolidate, and distribute), and provides instructions for
217 configuring each process using tools in the Hub Console.
"Informatica Describes the flow of data through the Informatica MDM Hub
MDM Hub via batch processes, starting with the land process and
Processes" on concluding with the distribution process.
page 218

- 10 -
"Configuring the Describes the data landing process and explains how to
Land Process" configure source systems and landing tables.
on page 264
"Configuring the Describes the data staging process and explains how to
Stage Process" configure staging tables, mappings, and other settings in that
on page 274 affect Stage jobs.
"Configuring Explains how to configure data cleansing rules that are run
Data Cleansing" during Stage jobs.
on page 307
"Configuring the Explains how to use the load process, and how to define trust
Load Process" and validation rules.
on page 343
"Configuring the Explains how to configure your Hub Store to match data.
Match Process"
on page 363
"Configuring the Explains how to configure your Hub Store to consolidate
Consolidate data.
Process" on
page 443
"Configuring the Explains how to configure Informatica MDM Hub to write
Publish Process" changes to a message queue.
on page 449
"Executing Describes how to use Hub Console tools to run Informatica
Informatica MDM Hub processes via batch jobs, and how to use third-
MDM Hub party job management tools to schedule and manage
Processes" on Informatica MDM Hub processes via stored procedures.
page 495
"Using Batch Explains how to use the Informatica MDM Hub batch jobs and
Jobs " on page batch groups.
496
"Writing Custom Explains how to schedule Informatica MDM Hub batch jobs
Scripts to using job execution scripts.
Execute Batch
Jobs " on page
559
"Configuring Describes how to use Hub Console tools to configure
Application Informatica MDM Hub client applications that access
Access" on Informatica MDM Hub using Services Integration Framework
page 610 (SIF) requests.
"Generating Describes how to generate ORS-specific SIF request APIs
ORS-specific using the SIF Manager tool in the Hub Console.
APIs and
Message
Schemas" on
page 611
"Setting Up Explains how to set up security for users who will access
Security" on Informatica MDM Hub resources via the Hub Console or
page 621 third-party applications.
"Viewing Explains how to register custom code using the User Object
Registered Registry tool in the Hub Console.
Custom Code"
on page 678
"Auditing Describes how to set up auditing and debugging in the Hub
Informatica Console.
MDM Hub
Services and
Events" on page
684
"Appendixes" Describes other administration-related topics.

- 11 -
on page 697
"Configuring Describes how to configure different character sets for
International internationalization purposes.
Data Support"
on page 698
"Backing Up and Explains how to back up and restore a Informatica MDM Hub
Restoring implementation.
Informatica
MDM Hub" on
page 706
"Configuring Explains how to configure user exits, which are user-
User Exits" on customized, unencrypted stored procedures that are
page 708 configured to execute at a specific point during batch job
execution.
"Viewing Explains how to view details of your Informatica MDM Hub
Configuration implementation using the Enterprise Manager tool in the Hub
Details" on page Console.
715
"Implementing Explains how to add custom buttons to tools in the Hub
Custom Buttons Console that allow users to invoke external services on
in Hub Console demand.
Tools" on page
730
"Configuring Describes how to grant or revoke user access to tools in the
Access to Hub Hub Console using the Tool Access tool.
Console Tools"
on page 737
"Row-level Describes row-level locking, which assists API processing on
Locking" on the Hub concurrent with the execution of batch processes.
page 740
"Glossary" on Defines Informatica MDM Hub terminology.
page 744

Learning About Informatica MDM Hub
Here are the Informatica MDM Hub technical manuals and training materials.

What's New in Informatica MDM Hub

What's New in Informatica MDM Hub describes the new features in this
release.

Informatica MDM Hub Release Notes

The Informatica MDM Hub Release Notes contain important information about
this Informatica MDM Hub release. Installers should read the Informatica
MDM Hub Release Notes before installing Informatica MDM Hub.

Informatica MDM Hub Overview

The Informatica MDM Hub Overview introduces Informatica MDM Hub, describes the product architecture, and explains core concepts that all users need to understand before using the product.

Informatica MDM Hub Installation Guide

The Informatica MDM Hub Installation Guide explains to installers how to set
up Informatica MDM Hub, the Hub Store, Cleanse Match Servers, and other
components.

Informatica MDM Hub Upgrade Guide

The Informatica MDM Hub Upgrade Guide explains to installers how to upgrade a previous Informatica MDM Hub version to the most recent version.

Informatica MDM Hub Cleanse Adapter Guide

The Informatica MDM Hub Cleanse Adapter Guide explains to installers how to
configure Informatica MDM Hub to use the supported adapters and cleanse
engines.

Informatica MDM Hub Data Steward Guide

The Informatica MDM Hub Data Steward Guide explains to data stewards how
to use Informatica MDM Hub tools to consolidate and manage their
organization's data. After reading the Informatica MDM Hub Overview, data
stewards should read the Informatica MDM Hub Data Steward Guide.

Informatica MDM Hub Administrator Guide

The Informatica MDM Hub Administrator Guide explains to administrators how to use Informatica MDM Hub tools to build their organization’s data model, configure and execute Informatica MDM Hub data management processes, set up security, provide for external application access to Informatica MDM Hub services, and perform other customization tasks. After reading the Informatica MDM Hub Overview, administrators should read the Informatica MDM Hub Administrator Guide.

Informatica MDM Hub Services Integration Framework Guide

The Informatica MDM Hub Services Integration Framework Guide explains to developers how to use the Informatica MDM Hub Services Integration Framework (SIF) to integrate Informatica MDM Hub functionality with their applications, and how to create applications using the data provided by Informatica MDM Hub. SIF allows developers to integrate Informatica MDM Hub smoothly with their organization's applications. After reading the Informatica MDM Hub Overview, developers should read the Informatica MDM Hub Services Integration Framework Guide.

Informatica MDM Hub Metadata Manager Guide

The Informatica MDM Hub Metadata Manager Guide explains how to use the Informatica MDM Hub Metadata Manager tool to validate an organization’s metadata, promote changes between repositories, import objects into repositories, export repositories, and perform related tasks.

Informatica MDM Hub Resource Kit Guide

The Informatica MDM Hub Resource Kit Guide explains how to install and use
the Informatica MDM Hub Resource Kit, which is a set of utilities, examples,
and libraries that assist developers with integrating the Informatica MDM Hub
into their applications and workflows. This document provides a description of
the sample applications that are included with the Resource Kit.

Informatica Training and Materials

Informatica provides live, instructor-based training to help professionals become proficient users as quickly as possible. From initial installation onward, a dedicated team of qualified trainers ensures that an organization’s staff is equipped to take advantage of this powerful platform. To inquire about training classes or to find out where and when the next training session is offered, please visit Informatica’s web site or contact Informatica directly.

Informatica Global Customer Support
You can contact a Customer Support Center by telephone or through the
WebSupport Service. WebSupport requires a user name and password. You
can request a user name and password at http://my.informatica.com. Use the
following telephone numbers to contact Informatica Global Customer Support:

North America / South America
Toll Free: North America +1 877 463 2435; Brazil 0800 891 0202; Mexico 001 888 209 8853
Standard Rate: North America +1 650 653 6332

Europe / Middle East / Africa
Toll Free: United Kingdom 00800 4632 4357 or 0800 023 4632; France 00800 4632 4357; Netherlands 00800 4632 4357; Germany 00800 4632 4357; Switzerland 00800 4632 4357; Israel 00800 4632 4357; Spain 900 813 166; Portugal 800 208 360; Italy 800 915 985
Standard Rate: Belgium +32 15 281 702; France 0805 804632; Germany +49 1805 702 702; Netherlands +31 306 022 797; Switzerland 0800 463 200

Asia / Australia
Toll Free: Australia 1 800 151 830; New Zealand 1 800 151 830; Singapore 001 800 4632 4357
Standard Rate: India +91 80 4112 5738

Informatica Resources
Informatica Customer Portal
As an Informatica customer, you can access the Informatica Customer Portal site at http://my.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica How-To Library, the Informatica Knowledge Base, the Informatica Multimedia Knowledge Base, the Informatica Documentation Center, and access to the Informatica user community.

Informatica Documentation
The Informatica Documentation team makes every effort to create accurate, usable documentation. If you have questions, comments, or ideas about this documentation, contact the Informatica Documentation team through email at infa_documentation@informatica.com. We will use your feedback to improve our documentation. Let us know if we can contact you regarding your comments.

The Documentation team updates documentation as needed. To get the latest documentation for your product, navigate to the Informatica Documentation Center from http://my.informatica.com.

Informatica Web Site

You can access the Informatica corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and sales offices. You will also find product and partner information. The services area of the site includes important information about technical support, training and education, and implementation services.

Informatica How-To Library

As an Informatica customer, you can access the Informatica How-To Library at http://my.informatica.com. The How-To Library is a collection of resources to help you learn more about Informatica products and features. It includes articles and interactive demonstrations that provide solutions to common problems, compare features and behaviors, and guide you through performing specific real-world tasks.

Informatica Knowledge Base
As an Informatica customer, you can access the Informatica Knowledge Base
at http://my.informatica.com. Use the Knowledge Base to search for
documented solutions to known technical issues about Informatica products.
You can also find answers to frequently asked questions, technical white
papers, and technical tips. If you have questions, comments, or ideas about
the Knowledge Base, contact the Informatica Knowledge Base team through
email at KB_Feedback@informatica.com.

Informatica Multimedia Knowledge Base

As an Informatica customer, you can access the Informatica Multimedia Knowledge Base at http://my.informatica.com. The Multimedia Knowledge Base is a collection of instructional multimedia files that help you learn about common concepts and guide you through performing specific tasks. If you have questions, comments, or ideas about the Multimedia Knowledge Base, contact the Informatica Knowledge Base team through email at KB_Feedback@informatica.com.

Part 1: Introduction

Contents
• "Introduction to Informatica MDM Hub Administration" on page 19
• "Getting Started with the Hub Console" on page 29

Chapter 1: Introduction to
Informatica MDM Hub Administration

This chapter introduces and provides an overview of administering Informatica MDM Multidomain Hub™ (hereinafter referred to as Informatica MDM Hub). It is recommended for anyone who manages an Informatica MDM Hub implementation.

Note: This document assumes that you have read the Informatica MDM Hub
Overview and have a basic understanding of Informatica MDM Hub
architecture and key concepts.

Chapter Contents
• "About Informatica MDM Hub Administrators" on page 19
• "Phases in Informatica MDM Hub Administration" on page 20
• "Summary of Administration Tasks" on page 21

About Informatica MDM Hub Administrators


Informatica MDM Hub administrators have primary responsibility for the configuration of the Informatica MDM Hub system. Administrators access Informatica MDM Hub through the Hub Console, which comprises a set of tools for managing an Informatica MDM Hub implementation.

Informatica MDM Hub administrators use the Hub Console to:


• build the data model and other objects in the Hub Store
• configure and execute Informatica MDM Hub data management processes
• configure external application access to Informatica MDM Hub functionality
and resources
• monitor ongoing operations

For an introduction to using the Hub Console, see "Getting Started with the
Hub Console" on page 29.

Phases in Informatica MDM Hub
Administration

This section describes typical phases in Informatica MDM Hub administration. These phases may vary for your Informatica MDM Hub implementation based on your organization’s methodology.

Startup Phase
The startup phase involves installing and configuring core Informatica MDM
Hub components: Hub Store, Hub Server, Cleanse Match Server(s), and
cleanse adapters. For instructions on installing the Hub Store, Hub Server, and
Cleanse Match Servers, see the Informatica MDM Hub Installation Guide for
your platform. For instructions on setting up a cleanse adapter, see the
Informatica MDM Hub Cleanse Adapter Guide.

Note: The instructions in this document assume that you have already
completed the startup phase and are ready to begin configuring your
Informatica MDM Hub implementation.

Configuration Phase
After Informatica MDM Hub has been installed and set up, administrators can
begin configuring and testing Informatica MDM Hub functionality—the data
model and other objects in the Hub Store, data management processes,
external application access, and so on. This phase involves a dynamic,
iterative process of building and testing Informatica MDM Hub functionality to
meet the stated requirements of an organization. The bulk of the material in
this document refers to tasks associated with the configuration phase.

After a schema has been sufficiently built and the Informatica MDM Hub has
been properly configured, developers can build external applications to access
Informatica MDM Hub functionality and resources. For instructions on
developing external applications, see the Informatica MDM Hub Services
Integration Framework Guide.

Production Phase
After an Informatica MDM Hub implementation has been sufficiently configured
and tested, administrators deploy the Informatica MDM Hub in a production
environment. In addition to managing ongoing Informatica MDM Hub
operations, this phase can involve performance tuning to optimize the
processing of actual business data.

Summary of Administration Tasks


This section provides a summary of administration tasks.

Setting Up Security
In this document, "Setting Up Security" on page 621 describes the tasks
associated with setting up security in an Informatica MDM Hub implementation.
Setup tasks vary depending on the particular security requirements of your
Informatica MDM Hub implementation, as described in "Security
Implementation Scenarios" on page 625. Additional security tasks are
involved if external applications access your Informatica MDM Hub
implementation using Services Integration Framework (SIF) requests. For
more information, see "About Setting Up Security" on page 621, "Summary of
Security Configuration Tasks" on page 627, and "Configuration Tasks For
Security Scenarios" on page 628.

To configure security for an Informatica MDM Hub implementation using Informatica MDM Hub’s internal security framework, you complete the following tasks using tools in the Hub Console:
High-Level Tasks for Setting Up Security
• "Managing the Global Password Policy" on page 654: Required to define global password policies for all users according to your organization’s security policies and procedures.
• "Configuring Informatica MDM Hub Users" on page 646: Required to define user accounts for users to access Informatica MDM Hub resources.
• "Assigning Users to the Current ORS Database" on page 661: Required to provide users with access to the database(s) they need to use.
• "Configuring User Groups" on page 658: Optional. Simplifies security configuration tasks by letting you configure user groups and assign users to them.
• "Securing Informatica MDM Hub Resources" on page 629: Required in order to selectively and securely expose Informatica MDM Hub resources to external applications.
• "Configuring Roles" on page 638: Required to define roles and assign resource privileges to them.
• "Assigning Roles to Users and User Groups" on page 662: Required to assign roles to users and (optionally) user groups.
• "Managing Security Providers" on page 664: Required if you are using external security providers to handle any portion of security in your Informatica MDM Hub implementation.
• "Configuring Access to Hub Console Tools" on page 737: Required to provide non-administrator users with access to Hub Console tools.

Building the Data Model


In this document, "Building the Data Model" on page 50 describes how to
construct the schema (data model) used in your Informatica MDM Hub
implementation and stored in the Hub Store. It provides instructions for using
Hub Console tools to configure Operational Reference Stores (ORSs),
datasources, the data model, queries, packages, hierarchies, and other
metadata.
High-Level Tasks for Building the Data Model
• "Creating Hub Store Databases" on page 52: Required for all Informatica MDM Hub implementations. For more information, see the instructions for installing the Hub Store in the Informatica MDM Hub Installation Guide.
• "Configuring Operational Reference Stores" on page 55: Required for all Informatica MDM Hub implementations. You must register an ORS so that Informatica MDM Hub can connect to it. For more information, see "Databases in the Hub Store" on page 51.
• "Configuring Datasources" on page 71: Required only if the datasource was not automatically created upon registering an ORS. Every ORS requires a datasource definition in the application server environment. For more information, see "About Datasources" on page 71.
• "Configuring Base Objects" on page 82: Required for each base object in your schema. Base objects are used for a central business entity (such as customer, product, or employee) or a lookup table (such as country or state). For more information, see "About the Schema" on page 73, "Process Overview for Defining Base Objects" on page 83, and "About Base Objects" on page 82.
• "Configuring Columns in Tables" on page 102: Required for all base objects, landing tables, and staging tables. For more information, see "About Columns" on page 102.
• "Configuring Foreign-Key Relationships Between Base Objects" on page 113: Required only when you want to explicitly define a foreign-key (parent-child) relationship between two base objects. For more information, see "Process Overview for Defining Foreign-Key Relationships" on page 114 and "About Foreign Key Relationships" on page 113. For Hierarchy Manager, see "Configuring Hierarchies" on page 169 instead.
• "Viewing Your Schema" on page 119: Useful for visualizing your schema in a graphical format.
• "Configuring Queries" on page 127: Required for creating queries used in packages (see "About Queries" on page 128 and "Configuring Packages" on page 151), and for queries used by data stewards in the Merge Manager tool (see the Informatica MDM Hub Data Steward Guide).
• "Configuring Packages" on page 151: Required to allow external application users to access Informatica MDM Hub functionality using Services Integration Framework (SIF) requests (see the Informatica MDM Hub Services Integration Framework Guide and "About Packages" on page 152), and to allow data stewards to merge and update records in the Hub Store using the Merge Manager and Data Manager tools (see the Informatica MDM Hub Data Steward Guide).

Configuring the Data Flow


In this document, "Configuring the Data Flow" on page 217 describes the flow
of data through the Informatica MDM Hub through a series of processes (land,
stage, load, match, consolidate, and publish), and provides instructions for
configuring each process using tools in the Hub Console.

Configuring the Land Process

To configure the land process for a base object, see "Land Process" on page
221, "Configuring the Land Process" on page 264, and the following topics:
High-Level Tasks for Configuring the Land Process
• "Configuring Source Systems" on page 264: Required to define a unique internal name for each source system (external applications or systems that provide data to Informatica MDM Hub). For more information, see "About Source Systems" on page 265.
• "Configuring Landing Tables" on page 269: Required to create landing tables, which provide intermediate storage in the flow of data from source systems into Informatica MDM Hub. For more information, see "About Landing Tables" on page 269.

Configuring the Stage Process

To configure the stage process for a base object, see "Stage Process" on page
224, "Configuring the Stage Process" on page 274, and the following topics:
High-Level Tasks for Configuring the Stage Process
• "Configuring Staging Tables" on page 275: Required to create staging tables, which provide temporary, intermediate storage in the flow of data from landing tables into base objects via load jobs. For more information, see "About Staging Tables" on page 275.
• "Mapping Columns Between Landing and Staging Tables" on page 286: Required to enable Informatica MDM Hub to move data from a landing table to a staging table during the stage process, and also to specify cleanse operations on columns of data that are moved. For more information, see "About Mapping Columns" on page 286.
• "Configuring Data Cleansing" on page 307: Required to set up data cleansing for a base object during the stage process using the Informatica MDM Hub internal cleanse functionality. For more information, see "About Data Cleansing in Informatica MDM Hub" on page 307 and the following topics:
  • "Configuring Cleanse Match Servers" on page 308 to deploy Cleanse Match Servers that execute cleanse operations and the match process for an Operational Reference Store (ORS). For more information, see "About the Cleanse Match Server" on page 308.
  • "Configuring Cleanse Lists" on page 333 to specify a logical grouping of cleanse functions that are executed at run time in a predefined order. For more information, see "About Cleanse Lists" on page 333.
  • "Using Cleanse Functions" on page 314 to build and execute cleanse functions that cleanse (standardize or verify) data. For more information, see "About Cleanse Functions" on page 314.

Configuring the Load Process

To configure the load process for a base object, see "Load Process" on page
227, "Configuring the Load Process" on page 343, and the following topics:
High-Level Tasks for Configuring the Load Process
• "Configuring Trust for Source Systems" on page 344: Used when multiple source systems contribute data to a column in a base object. Required if you want to designate the relative trust level (confidence factor) for each contributing source system. For more information, see "About Trust" on page 344.
• "Configuring Validation Rules" on page 353: Required if you want to use validation rules to downgrade trust scores for cell data based on configured conditions. For more information, see "About Validation Rules" on page 353.

Configuring the Match Process

To configure the match process for a base object, see "Match Process" on page
245, "Configuring the Match Process" on page 363, and the following topics:
High-Level Tasks for Configuring the Match Process
• "Configuring Match Properties for a Base Object" on page 366: Required for each base object that will be involved in matching. For more information, see "Match Properties" on page 367.
• "Configuring Match Paths for Related Records" on page 373: Required for match column rules involving related records in either separate tables or in the same table. For more information, see "About Match Paths" on page 373.
• "Configuring Match Columns" on page 387: Required to specify the base object columns to use in match column rules. For more information, see "About Match Columns" on page 387.
• "Configuring Match Rule Sets" on page 399: Required if you want to use match rule sets to execute different sets of match column rules at different stages in the match process. For more information, see "About Match Rule Sets" on page 399.
• "Configuring Match Column Rules for Match Rule Sets" on page 407: Required to specify match column rules that determine whether two records for a base object are similar enough to consolidate. For more information, see "About Match Column Rules" on page 408.
• "Configuring Primary Key Match Rules" on page 434: Required to specify the base object columns (primary keys) to use in primary key match rules. For more information, see "About Primary Key Match Rules" on page 434.
• "Investigating the Distribution of Match Keys" on page 438: Useful for investigating the distribution of generated match keys upon completion of the match process. For more information, see "About Match Keys Distribution" on page 438.
• "Configuring Match Settings for Non-US Populations" on page 699: Required for configuring matches involving non-US populations and multiple populations.

Configuring the Consolidation Process

To configure the consolidation process for a base object, see "Consolidate


Process" on page 255 and "Configuring the Consolidate Process" on page 443.

Configuring the Publish Process

To configure the publish process for a base object, see "Publish Process" on
page 260, "Configuring the Publish Process" on page 449, and the following
topics:
High-Level Tasks for Configuring the Publish Process
• "Configuring Global Message Queue Settings" on page 451: Required to specify global settings for all message queues involving outbound Informatica MDM Hub messages.
• "Configuring Message Queue Servers" on page 452: Required to set up one or more message queue servers that Informatica MDM Hub will use for incoming and outgoing messages. The message queue server must already be defined in your application server environment according to the application server instructions. For more information, see "About Message Queue Servers" on page 452.
• "Configuring Outbound Message Queues" on page 454: Required to set up one or more outbound message queues for a message queue server. For more information, see "About Message Queues" on page 454.
• "Configuring Message Triggers" on page 456: Required for configuring message triggers for a base object. Message queue triggers identify which actions within Informatica MDM Hub are communicated to outside applications via messages in message queues. For more information, see "About Message Triggers" on page 456.

Executing Informatica MDM Hub Processes


In this document, "Executing Informatica MDM Hub Processes" on page 495
describes how to use Hub Console tools to run Informatica MDM Hub
processes, either:
• as batch jobs from the Hub Console, or
• as stored procedures, using third-party job management tools to schedule and manage job execution (a minimal sketch follows this list)
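
As an illustration of the second approach, the following sketch shows how a job execution script might invoke a Hub stored procedure over JDBC. This is a hedged example, not the documented interface: the cmxbg.execute_batchgroup procedure is covered in "Stored Procedure Reference" on page 566, but the connection details and the parameter list shown here are assumptions for illustration and must be checked against that chapter.

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

// Hypothetical job execution script: invokes a batch group stored procedure
// in the ORS schema. The connection URL, credentials, batch group ROWID,
// and parameter positions are all placeholders.
public class RunBatchGroup {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/orcl", "cmx_ors", "password")) {
            try (CallableStatement cs = conn.prepareCall(
                    "{call cmxbg.execute_batchgroup(?, ?, ?, ?)}")) {
                cs.setString(1, "SVR1.ABC");                // batch group ROWID (assumed)
                cs.setString(2, "admin");                   // executing user (assumed)
                cs.registerOutParameter(3, Types.INTEGER);  // return code (assumed)
                cs.registerOutParameter(4, Types.VARCHAR);  // error message (assumed)
                cs.execute();
                System.out.println("Return code: " + cs.getInt(3)
                        + ", message: " + cs.getString(4));
            }
        }
    }
}

A scheduler such as Tivoli or CA Unicenter would run a class or script like this at the desired time and act on the return code.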

Executing Processes in the Hub Console

To execute Informatica MDM Hub processes using tools in the Hub Console,
see "About Informatica MDM Hub Batch Jobs" on page 496, "Using Batch Jobs "
on page 496, and the following topics:
High-Level Tasks for Executing Informatica MDM Hub Processes in the Hub Console
• "Running Batch Jobs Using the Batch Viewer Tool" on page 501: Required if you want to run individual batch jobs from the Hub Console using the Batch Viewer tool. For more information, see "Batch Viewer Tool" on page 501.
• "Running Batch Jobs Using the Batch Group Tool" on page 512: Required if you want to run batch jobs in a group from the Hub Console, allowing you to configure the execution sequence for batch jobs and to execute batch jobs in parallel. For more information, see "About Batch Groups" on page 512.

Executing Processes Using Job Management Tools

To execute and manage Informatica MDM Hub stored procedures on a scheduled basis (using job management tools that control IT processes), see "About Executing Informatica MDM Hub Batch Jobs" on page 559, "Writing Custom Scripts to Execute Batch Jobs" on page 559, and the following topics:

High-Level Tasks for Executing Informatica MDM Hub Processes Using Job Management Tools
• "Setting Up Job Execution Scripts" on page 560: Required for writing job execution scripts for job management tools. For more information, see "About Job Execution Scripts" on page 560 and "About the C_REPOS_TABLE_OBJECT_V View" on page 560.
• "Monitoring Job Results and Statistics" on page 563: Required for determining the execution results of job execution scripts. For more information, see "Error Messages and Return Codes" on page 563 and "Job Execution Status" on page 564.
• "Executing Batch Groups Using Stored Procedures" on page 598: Required for executing batch jobs in groups via stored procedures using job scheduling software (such as Tivoli, CA Unicenter, and so on). For more information, see "About Executing Batch Groups" on page 598.
• "Developing Custom Stored Procedures for Batch Jobs" on page 604: Required for creating, registering, and running custom stored procedures for batch jobs. For more information, see "About Custom Stored Procedures" on page 604.

Configuring Hierarchies
If your Informatica MDM Hub implementation uses Hierarchy Manager to
manage hierarchies, you need to configure hierarchies and their related
objects, including entity icons, entity objects and entity types, relationship
base objects (RBOs) and relationship types, Hierarchy Manager profiles, and
Hierarchy Manager packages. For more information, see "Configuring
Hierarchies" on page 169

Configuring Workflow Integration


If your Informatica MDM Hub implementation integrates with a supported
workflow engine, you need to enable states for base objects and configure
other settings. For more information, see "Configuring State Management for
Base Objects" on page 162.

Other Administration Tasks


In this document, "Configuring Application Access" on page 610 and
"Appendixes" on page 697 provide additional information about
administration-related topics.
Other High-Level Administration Tasks
• "Generating ORS-specific APIs and Message Schemas" on page 611: Required for application developers to generate ORS-specific SIF request APIs using the SIF Manager tool in the Hub Console.
• "Viewing Registered Custom Code" on page 678: Used for viewing the following types of user objects that are registered in the selected ORS: user exits, custom stored procedures, custom Java cleanse functions, and custom button functions.
• "Auditing Informatica MDM Hub Services and Events" on page 684: Used for integration auditing to track activities associated with the exchange of data between Informatica MDM Hub and external systems. For more information, see "About Integration Auditing" on page 684.
• "Backing Up and Restoring Informatica MDM Hub" on page 706: Used for backing up and restoring an Informatica MDM Hub implementation.
• "Configuring International Data Support" on page 698: Required only to configure different character sets in an Informatica MDM Hub implementation.
• "Configuring User Exits" on page 708: Required only if user exits are used. For more information, see "About User Exits" on page 708.
• "Viewing Configuration Details" on page 715: Used for remotely monitoring an Informatica MDM Hub environment, showing configuration settings for the Hub Server, Cleanse Match Servers, Master Database, and Operational Reference Stores.
• "Implementing Custom Buttons in Hub Console Tools" on page 730: Used only if you want to create custom buttons for Hub Console users to provide on-demand, real-time access to specialized data services. Applies only to the Merge Manager, Data Manager, and Hierarchy Manager tools.

Chapter 2: Getting Started with the
Hub Console

This chapter introduces the Hub Console and provides a high-level overview of
the tools involved in configuring your Informatica MDM Hub implementation.

Chapter Contents
• "About the Hub Console" on page 29
• "Starting the Hub Console" on page 30
• "Navigating the Hub Console" on page 32
• "Informatica MDM Hub Workbenches and Tools" on page 46

About the Hub Console


Administrators and data stewards can access Informatica MDM Hub features
via the Informatica MDM Hub user interface, which is called the Hub Console.
The Hub Console comprises a set of tools. Each tool allows you to perform a
specific action, or a set of related actions.

Note: The available tools in the Hub Console depend on your Informatica license agreement. Therefore, the set of tools in your Hub Console might differ from what is described here.

Starting the Hub Console
To access the Hub Console:
1. Open a browser window and enter the following URL:
http://YourHubHost:port/cmx/

where YourHubHost is your local Informatica MDM Hub host and port is the
port number. Check with your administrator for the correct port number.
Note: You must use an HTTP connection to start the Hub Console. SSL
connections are not supported.
The Informatica MDM Hub launch screen is displayed.

2. Click the Launch button.


The first time (only) that you launch Hub Console from a client machine,
Java Web Start downloads application files and displays a progress bar.

The Informatica MDM Hub Login dialog box is displayed.

3. Enter your user name and password.
Note: If you do not have any user names set up, contact Informatica
support.
4. Click OK.
After you have logged in with a valid user name and password,
Informatica MDM Hub will prompt you to choose a target database—the
Master Database or an Operational Reference Store (ORS) with which to
work.

The list of databases to which you can connect is determined by your security profile.
• The Master Database stores Informatica MDM Hub environment
configuration settings—user accounts, security configuration, ORS
registry, message queue settings, and so on. A given Informatica MDM
Hub environment can have only one Master Database.
• An Operational Reference Store (ORS) stores the rules for
processing the master data, the rules for managing the set of master
data objects, along with the processing rules and auxiliary logic used
by the Informatica MDM Hub in defining the best version of the truth
(BVT). An Informatica MDM Hub configuration can have one or more
ORS databases.
Throughout the Hub Console, an icon next to an ORS indicates whether it
has been validated and, if so, whether the most recent validation resulted
in issues.

The icon states are as follows:
• Unknown: the ORS has not been validated since it was initially created, or since the last time it was updated.
• Validated with no issues: no change has been made to the ORS since the last validation.
• Validated with warnings.
• Validated with errors.

For more information, see "About the Hub Store" on page 51.
5. Select the Master Database or the ORS to which you want to connect.
6. Click Connect.
Note: You can easily change the target database once inside the Hub
Console, as described in "Changing the Target Database" on page 37.
The Hub Console screen is displayed (in which the Schema Manager is
selected from the Model workbench).

When you select a tool from the Workbenches page or start a process from the Processes page, the window is typically divided into several panes:
• Workbenches / Processes: Displays either the list of workbenches and tools to which you have access, or the list of the steps in the process that you are running. Note: The workbenches and tools that you see depend on what your company has purchased, as well as on what your administrator has given you access to. If you do not see a particular workbench or tool when you log into the Hub Console, then your user account has not been assigned permission to access it.
• Navigation Tree: Allows you to navigate the items (a list of objects) in the current tool. For example, in the Schema Manager, the middle pane contains a list of schema objects (base objects, landing tables, and so on).
• Properties Panel: Shows details (properties) for the selected item in the navigation tree, and possibly other panels if available in the current tool. Some of the properties might be editable.

Navigating the Hub Console


This section describes how to navigate the Hub Console interface. Hub Console
is a collection of tools that you use to configure and manage your Informatica
MDM Hub implementation (see "Informatica MDM Hub Workbenches and
Tools" on page 46 for a complete list). Each tool allows you to focus on a
particular area of your Informatica MDM Hub implementation.

Toggling Between the Processes and Workbenches
Views
Informatica MDM Hub groups its tools in two different ways:
• By workbench: Similar tools are grouped together by workbench, a logical collection of related tools.
• By process: Tools are grouped into a logical workflow that walks you through the tools and steps required for completing a task.

You can click the tabs at the left-most side of the Hub Console window to
toggle between the Processes and Workbenches views.

Note: When you log into Informatica MDM Hub, you see only those
workbenches and processes that contain the tools that your Informatica MDM
Hub security administrator has authorized you to use. The screen shots in this
document show the full set of workbenches, processes, and tools available.

Workbenches View

To view tools by workbench:


• Click the Workbenches tab on the left side of the page.

Hub Console displays a list of available workbenches on the Workbenches tab.


The Workbenches view organizes Hub Console tools by similar functionality.

The workbench names and tool descriptions are metadata-driven, as is the way in which tools are grouped. It is possible to have customized tool groupings. Therefore, the arrangement of tools and workbenches that you see after you log in to Hub Console might differ somewhat from what is shown here.

Processes View

To view tools by process:


• Click the Processes tab on the left side of the page.

Hub Console displays a list of available processes on the Processes tab. Tools
are organized into common sequences or processes.

Processes step you through a logical sequence of tools to complete a specific task. The same tool can belong to several processes, and can appear many times in one process.

Starting a Tool in the Workbenches View
To start a Hub Console tool from the Workbenches view:
1. In the Workbenches view, expand the workbench that contains the tool
that you want to start (see "Informatica MDM Hub Workbenches and Tools"
on page 46).
2. If necessary, expand the workbench node to show the tools associated
with that workbench.
3. Click the tool.
If you selected a tool that requires a different database, the Hub Console
prompts you to select it.

All tools in the Configuration workbench (Databases, Users, Security Providers, Tool Access, Message Queues, Metadata Manager, and Enterprise Manager) require a connection to the Master Database. All other tools require a connection to an ORS.

The Hub Console displays the tool that you selected.

Acquiring Locks to Change Settings in the Hub Console
In the Hub Console, a lock is required to make changes to the underlying
schema. All non-data steward tools (except the ORS security tools) are in
read-only mode unless you acquire a lock. Hub Console locking allows
multiple users to make changes to the Informatica MDM Hub schema at the
same time.

Types of Locks

In the Hub Console, the Write Lock menu provides two types of locks:
• Exclusive lock: Allows only one user to make changes to the underlying ORS, preventing any other users from changing the ORS while the exclusive lock is in effect. For more information, see "Acquiring an Exclusive Lock" on page 36.
• Write lock: Allows multiple users to make changes to the underlying metadata at the same time. Write locks can be obtained on the Master Database or on an ORS. For more information, see "Acquiring a Write Lock" on page 36.

Note: Locks cannot be obtained on an ORS that is in production mode. If an ORS is in production mode and you attempt to obtain a write lock, you will see a message stating that you cannot acquire the lock. For more information, see "Editing ORS Properties" on page 64.

Tools that Require a Lock

The following tools require a lock in order to make changes:
• Master Database tools: Databases, Users, Security Providers, Tool Access, Message Queues, and Metadata Manager.
• ORS tools: Mappings, Cleanse Match Server, Cleanse Functions, Queries, Packages, Schema Manager, Schema Viewer, Secure Resources, Hierarchy Manager, Roles, Users and Groups, Batch Group, Systems and Trust, SIF Manager, and Hierarchies.

Note: The data steward tools—Data Manager, Merge Manager, and Hierarchy
Manager—do not require write locks. For more information about these tools,
see the Informatica MDM Hub Data Steward Guide. The Audit Manager does
not require write locks, either.

Automatic Lock Expiration

The Hub Console automatically refreshes the lock every 60 seconds on the current connection. You can manually release a lock according to the instructions in "Releasing a Lock" on page 36. If you switch to a different database while holding a lock, the lock is automatically released. If the Hub Console is terminated, the lock expires after one minute.

Server Caching and Hub Console Locks

When no locks are in effect in the Hub Console, the Hub Server caches
metadata and other configuration settings for performance reasons. As soon
as a Hub Console user acquires a write lock or exclusive lock, caching is
disabled, the cache is emptied, and Informatica MDM Hub retrieves this
information from the database instead. When all locks are released, caching is
enabled again.

Acquiring a Write Lock

Write locks allow multiple users to edit data in the Hub Console at the same time. However, write locks do not prevent those users from editing the same data at the same time. In such cases, the most recently saved changes prevail.

To acquire a write lock in Hub Console:


1. From the Write Lock menu, choose Acquire Lock.
• If the lock has already been acquired by someone else, then the login name and machine address of that person is displayed.
• If the ORS is in production mode, then a message is displayed explaining that you cannot acquire the lock.
• If the lock is acquired successfully, then the tools are in read-write mode. Multiple users can hold a write lock at the same time on the same ORS or on the Master Database.
2. When you are finished, you can explicitly release the write lock according
to the instructions in "Releasing a Lock" on page 36.

Acquiring an Exclusive Lock

To acquire an exclusive lock in Hub Console:


1. From the Write Lock menu, choose Clear Lock to clear any write locks
held by other users, as described in "Clearing Locks" on page 36.
2. From the Write Lock menu, choose Acquire Exclusive Lock.
If the ORS is in production mode, then a message is displayed explaining
that you cannot acquire the exclusive lock.
3. When you are finished making changes, release the exclusive lock, as
described in "Releasing a Lock" on page 36.

Releasing a Lock

To release a lock in Hub Console:


• From the Write Lock menu, choose Release Lock.

Clearing Locks

You can force the release of any locks—write or exclusive locks—held by other
users. You might want to do this, for example, to obtain an exclusive lock on the ORS. Because other users are not warned to save changes before their
write locks are released, you should use this only when necessary.

To clear all locks:


• From the Write Lock menu, choose Clear Lock.
Hub Console releases any locks on the ORS.

Changing the Target Database


The status bar at the bottom of the Hub Console window always shows:
• the name of the target database to which you connected
• the user name you used to log in

To change the target database in the Hub Console:
1. On the status bar, click the database name.

Hub Console prompts you to choose a target database with which to work.

For a description of the types of databases that you can select, see
"Starting the Hub Console" on page 30.
2. Select the Master Database or the ORS to which you want to connect.
3. Click Connect.

Logging in as a Different User


To log in as a different user in the Hub Console:
1. Click the user name on the status bar, or choose Re-Login As... from the Options menu.
2. Specify the user name and password for the user account that you want to use.

Changing the Password for a User
To change the password for the currently logged-in user in the Hub Console:
1. From the Options menu, choose Change Password.
2. Specify the password that you want to use instead.
3. Click OK.

Using the Navigation Tree in the Navigation Pane


The navigation tree in the Hub Console allows you to view and manage a
hierarchical collection of objects. This section uses the Schema Manager as an
example, but the functionality described in this section also applies to using
the navigation tree for the following Hub Console tools: Message Queues,
Mappings, Queries, Packages, Schema, Users and Groups, and the Batch
Viewer.

Parent and Child Nodes

Each named object is represented as a node in the hierarchy tree. A node that
contains other nodes is called a parent node. A node that belongs to a parent
node is called a child node.

In the following example in the Schema Manager, the Address base object is
the parent node to the associated child nodes (Columns, Cross-References,
and so on).

Showing and Hiding Child Nodes

To show child nodes beneath a parent node:


• Click the plus (+) sign next to the parent node.

To hide child nodes beneath a parent node:


• Click the minus (-) sign next to the parent node.

Sorting by Display Name

The display name is the name of an object as it appears in the navigation tree.
You can change the order in which the objects are displayed in the navigation
tree by clicking Sort By in the tree options area and selecting the appropriate
sort option.

Choose from the following sort options:
• Display Name (a-z) sorts the objects in the tree alphabetically according
to display name.
• Display Name (z-a) sorts the objects in the tree in descending
alphabetical order according to display name.

Filtering Items

You can filter the items shown in the navigation tree by clicking the Filter area
at the bottom of the left pane and selecting the appropriate filter option. The
figures in this section are from the Schema Manager, but the same principles apply to other Hub Console tools for which filtering is available.

Choose from the following filter options:


• No Filter (All Items)—Removes any filter that was previously defined.
• One Item—Displays a drop-down list above the navigation tree from which to select an item. In the Schema Manager, for example, you can choose Table type or Table. If you choose Table type, you click the down arrow to display a list of table types from which to select for your filter. If you choose Table, you click the down arrow to display a list of tables from which to select for your filter.
• Some Items—Allows you to select one or more items. For example, in the Schema Manager, you can choose tables based on either the table type or table name. When you choose Some Items, the Hub Console displays the Define Item Filter button above the navigation tree. Click the Define Item Filter button, select the item(s) that you want to include in the filter, and then click OK.

Note: Use the No Filter (All Items) option to remove the filter.

Changing the Item View

Certain Hub Console tools show a View or View By area below the navigation
tree.
• In the Schema Manager, you can show or hide the public Informatica MDM
Hub items by clicking the View area below the navigation tree and
choosing the appropriate command.

For example, you can view all system tables.

• In the Mappings tool, you can view items by mapping, staging table, or
landing table.
• In the Packages tool, you can view items by package or by table.
• In the Users and Groups tool, you can display sub groups and sub users.
• In the Batch Viewer, you can group jobs by table, date, or procedure type.

Searching For Items

When there is no filter, or when the Some Items filter is selected, Hub Console
displays a Find area above the navigation tree so that you can search for
items by name.

For example, in the Schema Manager, you can search for tables and columns.
1. Click anywhere in the Find area to display the Find window.

2. Type the name (or first few letters of the name) that you want to find.
3. Click the F3 - Find button.
The Hub Console highlights the matched item(s). In the following example, the Schema Manager displays the list of tables and highlights the table that matches the find criteria.

4. Click anywhere in the Find area to hide the Find window.

Running Commands On Objects in the Navigation Tree

To run commands on an object in the navigation tree, do one of the following:


• Right-click an object name to display a pop-up menu of commands that
you can perform on the object.
OR
• Select an object in the navigation tree, and then choose the command you
want from the Hub Console menu at the top of the window.

Note: Whenever possible, this document describes the first approach—right-clicking an object in the navigation tree and choosing a command from the pop-up menu. Alternatively, however, you can always choose the command from the Hub Console menu.

For example, in the Schema Manager, you can right-click on certain types of
objects in the navigation tree to see a popup menu of the commands available
for the selected object.

Adding, Editing, and Removing Objects Using
Command Buttons
This section describes generally how you use command buttons to add, edit,
and delete objects in the Hub Console.

Command Buttons

If you have access to create, modify, or delete objects in a Hub Console window, and if you have acquired a write lock (see "Acquiring a Write Lock" on page 36), you might see some or all of the following command buttons in the Properties panel. There are other command buttons as well.
• Add—Add a new object.
• Edit—Edit a property for the selected item in the Properties panel. Indicates that the property is editable.
• Delete—Remove the selected item.
• Save—Save changes.

The following figure shows an example of command buttons on the right side of the Properties panel for the Secure Resources tool.

To see a description of what a command button does, hold the mouse pointer over the button to display a tooltip.

Adding Objects

To add an object:
1. Acquire a write lock.

2. In the Hub Console tool, click the Add button.


The Hub Console displays an Add object window, where object is the name
of the type of object that you are adding.
3. Specify the object properties.
4. Click OK.

Editing Object Properties

To edit an object’s properties:


1. Acquire a write lock.

2. In the Hub Console tool, select the object whose properties you want to
edit.

3. For each property that you want to edit, click the Edit button next to it,
and specify the new value.

4. Click the Save button to save your changes.

Removing Objects

To remove an object:
1. Acquire a write lock.
2. In the Hub Console tool, select the object that you want to remove.

3. Click the Remove button.


4. If prompted to confirm deletion, choose the appropriate option (OK or Yes).

Customizing the Hub Console Interface


To customize the Hub Console interface:
1. From the Options menu, choose Options.
The Options dialog box is displayed.

2. Specify the options you want, including:


• General tab: Specify whether to show wizard welcome screens, and
whether to save window sizes and positions.
• Quick Launch tab: Specify tools that you want to appear as icons in a
tool bar below the menu.

Showing Version Details


To show version details about the currently installed Informatica MDM Hub:
1. In the Hub Console, choose Help | About.

The Hub Console displays the About Informatica MDM Hub dialog.

2. Click Installation Details.


The Hub Console displays the Installation Details dialog.

3. Click Close to close the Installation Details dialog.
4. Click Close again to close the About Informatica MDM Hub dialog.

Informatica MDM Hub Workbenches and Tools


This section provides an overview of the Informatica MDM Hub workbenches
and tools.

Tools in the Configuration Workbench


• Databases—Register and manage Operational Reference Stores (ORSs). For more information, see "Configuring Operational Reference Stores and Datasources" on page 54.
• Users—Define users and specify which databases they can access. Manage global and individual password policies. Note that Informatica MDM Hub supports external authentication for users, such as LDAP. For more information, see "Configuring Informatica MDM Hub Users" on page 646.
• Security Providers—Configure security providers, which are third-party organizations that provide security services (authentication, authorization, and user profile services) for users accessing Informatica MDM Hub. For more information, see "Managing Security Providers" on page 664.
• Tool Access—Define which Hub Console tools and processes a user can access. By default, new user accounts do not have access to any tools until access is explicitly assigned. For more information, see "Configuring Access to Hub Console Tools" on page 737.
• Message Queues—Define inbound and outbound message queue interfaces to Informatica MDM Hub. For more information, see "Configuring the Publish Process" on page 449.
• Metadata Manager—Validate Operational Reference Store (ORS) metadata, promote changes between repositories, import objects into repositories, and export repositories. For more information, see the Informatica MDM Hub Metadata Manager Guide.
• Enterprise Manager—View configuration details and version information for the Hub Server, Cleanse Servers, the Master Database, and Operational Reference Stores. For more information, see "Viewing Configuration Details" on page 715.

Tools in the Model Workbench


• Schema—Define base objects, relationships, history and security requirements, staging and landing tables, validation rules, match criteria, and other data model attributes. For more information, see "Building the Schema" on page 73.
• Schema Viewer—View and navigate the current schema. For more information, see "Viewing Your Schema" on page 119.
• Systems and Trust—Name the source systems that can provide data for consolidation in Informatica MDM Hub. Define the trust settings associated with each source system for each base object column. For more information, see "Configuring Source Systems" on page 264 and "Configuring Trust for Source Systems" on page 344.
• Queries—Define query groups and queries used by packages. For more information, see "Configuring Queries" on page 127.
• Packages—Define packages (table views). For more information, see "Configuring Packages" on page 151.
• Cleanse Functions—Define cleanse functions to perform on your data. For more information, see "Using Cleanse Functions" on page 314.
• Mappings—Map cleanse function outputs to target columns in staging tables. For more information, see "Mapping Columns Between Landing and Staging Tables" on page 286.
• Hierarchies—Set up the structures required to view and manipulate data relationships in Hierarchy Manager. For more information, see "Configuring Hierarchies" on page 169.

Tools in the Security Access Manager Workbench


• Secure Resources—Manage secure resources in Informatica MDM Hub. Configure the status (Private, Secure) for each Informatica MDM Hub resource, and define resource groups to organize secure resources. For more information, see "Securing Informatica MDM Hub Resources" on page 629.
• Roles—Define roles and privilege assignments to resources and resource groups. Assign roles to users and user groups. For more information, see "Configuring Roles" on page 638.
• Users and Groups—Manage the users and user groups within a single Hub Store. For more information, see "Setting Up Security" on page 621.

Tools in the Data Steward Workbench


For more information about these tools, see the Informatica MDM Hub Data
Steward Guide.
• Data Manager—Manage the content of consolidated data, view cross-references, edit data, view history, and unmerge consolidated records. For more information, see the Informatica MDM Hub Data Steward Guide.
• Merge Manager—Review and merge the matched records that have been queued for manual merging. For more information, see the Informatica MDM Hub Data Steward Guide.
• Hierarchy Manager—Define and manage hierarchical relationships in the Hub Store. For more information, see the Informatica MDM Hub Data Steward Guide.

Tools in the Utilities Workbench


• Batch Group—Configure and run batch groups, which are collections of individual batch jobs (for example, Stage, Load, and Match jobs) that can be executed with a single command. For more information, see "Running Batch Jobs Using the Batch Viewer Tool" on page 501.
• Batch Viewer—Execute batch jobs to cleanse, load, match, or auto-merge data, and view job logs. For more information, see "Running Batch Jobs Using the Batch Viewer Tool" on page 501.
• Cleanse Match Server—View Cleanse Match Server information, including name, port, server type, and whether the server is online or offline. For more information, see "About the Cleanse Match Server" on page 308.
• Audit Manager—Configure auditing and debugging of application requests and message queue events. For more information, see "Auditing Informatica MDM Hub Services and Events" on page 684.
• SIF Manager—Generate ORS-specific Services Integration Framework (SIF) request APIs. SIF Manager generates and deploys the code to support SIF request APIs for packages, remote packages, mappings, and cleanse functions in an ORS. Once generated, the ORS-specific APIs are available as a Web service and via the Informatica Client JAR. For more information, see "Generating ORS-specific APIs and Message Schemas" on page 611.
• User Object Registry—View registered user exits, user stored procedures, custom Java cleanse functions, and custom GUI functions for an ORS. For more information, see "Viewing Registered Custom Code" on page 678.

Part 2: Building the Data Model

Contents
• "About the Hub Store" on page 51
• "Configuring Operational Reference Stores and Datasources" on page 54
• "Building the Schema" on page 73
• "Configuring Queries and Packages" on page 127
• "State Management" on page 159
• "Configuring Hierarchies" on page 169

Chapter 3: About the Hub Store

The Hub Store is where business data is stored and consolidated in


Informatica MDM Hub. The Hub Store contains common information about all
of the databases that are part of your Informatica MDM Hub implementation.

Chapter Contents
• "Databases in the Hub Store" on page 51
• "How Hub Store Databases Are Related" on page 51
• "Creating Hub Store Databases" on page 52
• "Version Requirements" on page 52

Databases in the Hub Store


The Hub Store is a collection of databases that includes:
• Master Database—Contains the Informatica MDM Hub environment configuration settings—user accounts, security configuration, ORS registry, message queue settings, and so on. A given Informatica MDM Hub environment can have only one Master Database. The default name of the Master Database is CMX_SYSTEM. In the Hub Console, the tools in the Configuration workbench (Databases, Users, Security Providers, Tool Access, and Message Queues) manage configuration settings in the Master Database.
• Operational Reference Store (ORS)—Database that contains the master data, content metadata, the rules for processing the master data, the rules for managing the set of master data objects, along with the processing rules and auxiliary logic used by Informatica MDM Hub in defining the best version of the truth (BVT). An Informatica MDM Hub configuration can have one or more ORS databases. The default name of an ORS is CMX_ORS.

Users for Hub Store databases are created globally—within the Master
Database—and then assigned to specific ORSs. The Master Database also
stores site-level information, such as the number of incorrect log-in attempts
allowed before a user account is locked out.

How Hub Store Databases Are Related


An Informatica MDM Hub implementation contains one Master Database and zero or more ORSs. If no ORS exists, then only the Configuration workbench tools are available in the Hub Console. An Informatica MDM Hub implementation can have multiple ORSs, such as separate ORSs for development and production, or separate ORSs for each geographical location or for different parts of the organization.

You can access and manage multiple ORSs from one Master Database. The
 Master Database stores the connection settings and properties for each ORS.

Note: An ORS can be registered in only one Master Database. Multiple Master
Databases cannot share the same ORS. A single ORS cannot be associated
with multiple Master Databases.

Creating Hub Store Databases


Databases are initially created and configured when you install Informatica
MDM Hub.
• To create the Master Database and one ORS, you run the setup.sql script.
• To create an individual ORS, you run the setup_ors.sql script.

For more information, see the Informatica MDM Hub Installation Guide.
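The following is a minimal sketch of running these scripts from SQL*Plus as a DBA-privileged user; the connect string is hypothetical, and the actual parameters, prompts, and required privileges are documented in the Informatica MDM Hub Installation Guide.

-- Hypothetical SQL*Plus session (connect string and privileges are examples only)
sqlplus sys@mdmorcl AS SYSDBA
SQL> @setup.sql      -- creates the Master Database and one ORS
SQL> @setup_ors.sql  -- creates an additional ORS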

Version Requirements
Different versions of the Informatica MDM Hub cannot operate together in the
same environment. All components of your installation must be the same
version, including the Informatica MDM Hub software and the databases in the
Hub Store.

If you want to have multiple versions of Informatica MDM Hub at your site,
you must install each version in a separate environment. If you try to work
with a different version of a database, you will receive a message telling you
to upgrade the database to the current version.

Chapter 4: Configuring Operational
Reference Stores and Datasources

This chapter describes how to configure Operational Reference Stores (ORSs) and datasources for the Hub Store using the Databases tool in the Hub Console.

Chapter Contents
• "Before You Begin" on page 54
• "About the Databases Tool" on page 54
• "Starting the Databases Tool" on page 55
• "Configuring Operational Reference Stores" on page 55
• "Configuring Datasources" on page 71

Before You Begin


Before you begin, you must have installed Informatica MDM Hub, created the
Master Database and at least one ORS (running the setup.sql script creates
both) according to the instructions in the Informatica MDM Hub Installation
Guide. You can create additional ORSs by running the setup_ors.sql script.

About the Databases Tool


After the Hub Store has been created, you can use the Databases tool in the
Hub Console to complete the following tasks:
• Register an ORS so that the Master Reference Manager can connect to it.
Registration stores the database connection properties in the Master
Database.
• Define an ORS datasource in the application server environment for
Informatica MDM Hub.
An ORS datasource contains a set of properties for the ORS, such as the
location of the database server, the name of the database, the network
protocol used to communicate with the server, the database user ID and
password, and so on.

Note: The Databases tool refers to an ORS as a database.

Starting the Databases Tool
To start the Databases tool:
1. In the Hub Console, connect to your Master Database. For more
information, see "Changing the Target Database" on page 37.
2. Expand the Informatica Configuration workbench and then click
Databases.
The Hub Console displays the Databases tool (in which a registered ORS is
selected).

The Databases tool displays the following areas:


• Number of databases—Number of ORSs currently defined in the Hub Store.
• Database List—List of registered Informatica MDM Hub ORS databases.
• Database Properties—Database properties for the selected ORS.

Configuring Operational Reference Stores


This section describes how to configure an ORS in your Hub Store. If you need
assistance with configuring the ORS, consult with your database
administrator. For more information about Operational Reference Stores, see
"Databases in the Hub Store" on page 51 and the Informatica MDM Hub
Installation Guide.

Registering an ORS
Note: Registration will fail if you try to register an ORS that does not contain the Informatica MDM Hub repository objects or Informatica MDM Hub procedures.

To register an ORS:
1. Start the Databases tool. For more information, see "Starting the
Databases Tool" on page 55.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.

3. Click the Add button.


The Databases tool launches the Connection Wizard and prompts you to
select a database type.

4. Accept the default (Oracle) and choose Next.


The Connection Wizard prompts you to select an Oracle connection
method.

• Service—Connect to Oracle via the service name.
• SID—Connect to Oracle via the Oracle System ID (SID).

For more information about SERVICE and SID names, see your Oracle documentation.
5. Select the connection type that you want and choose Next.
The Connection Wizard prompts you to specify connection properties based
on your selected connection type. (Fields in bold are required.)

Connection Type Properties
• Database Display Name—Name for this ORS as it will be displayed in the Hub Console.
• Machine Identifier—Prefix given to keys to uniquely identify records from this instance of the Hub Store.
• Database hostname—IP address or name (if supported on your network) of the server hosting the Oracle database.
• SID—Oracle System Identifier (SID) that refers to the instance of the Oracle database running on the server. Displayed only if the selected connection type is SID.
• Service—Name of the Oracle SERVICE used to connect to the Oracle database. Displayed only if the selected connection type is Service.
• Port—The TCP port of the Oracle listener running on the Oracle database server. The Oracle installation default is 1521.
• Oracle TNS Name—Name by which the database is known on your network, as defined in the application server’s TNSNAMES.ORA file. For example: mydatabase.mycompany.com. This value is set when you install Oracle. See your Oracle documentation to learn more about this name.
• Schema Name—Name of the ORS.
• User—User name for the ORS. By default, this is the user name that was specified in the script used to create the ORS. This user name owns all of the ORS database objects in the Hub Store. If a proxy user has been configured for this ORS, then you can specify the proxy user instead. For instructions on creating ORS databases and defining proxy users, see the Informatica MDM Hub Installation Guide.
• Password—Password associated with the User Name for the ORS. For Oracle, this password is case-insensitive; for DB2, this password is case-sensitive. By default, this is the password associated with the user name that was specified in the script used to create the ORS. If a proxy user has been configured for this ORS, then you specify the password for the proxy user instead. For instructions on running the setup_ors.sql script and defining proxy users, see the Informatica MDM Hub Installation Guide.

Note: The Schema Name and the User Name are both the name of the ORS that was specified in the script used to create the ORS. If you need this information, consult your database administrator.
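For reference, the TNSNAMES.ORA entry behind such a name might look like the following sketch; the host, port, and service name are illustrative only (they match the hypothetical URL example shown in the next step):

MDMORCL.MYDOMAIN.COM =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = orclhost)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = mdmorcl.mydomain.com))
  )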
6. Specify the connection properties and choose Next.

The Connection Wizard displays a summary of the selected connection properties (for either the Service or the SID connection type).

Additional Connection Properties
• Connection URL—Connect URL. The default is automatically generated by the Connection Wizard. Format:
Service connection type:
jdbc:oracle:thin:@//database_host:port/service_name
SID connection type:
jdbc:oracle:thin:@//database_host:port/sid
For a service connection type (only), you have the option to customize and subsequently test a different connection URL. Example:
jdbc:oracle:thin:@//orclhost:1521/mdmorcl.mydomain.com
• Create datasource after registration—Check (select) to create the datasource on the application server after registration. For WebLogic users, you will need to specify the WebLogic username and password.

7. For a service connection type, if you want to change the default URL, click
the Edit button. The Connection Wizard prompts you to specify a different
URL:

Specify the URL (it can differ from the URL specified when running the
database creation script described in the Informatica MDM Hub Installation
Guide) and then click OK.
8. If you want to create the datasource on the application server after
registration, check (select) the Create datasource after registration
check box.
Informatica MDM Hub uses the datasources provided by the application
server.
Note for WebLogic: If you are using WebLogic, a dialog box prompts you
for your username and password. This process writes only to the Master
Database. The ORS and datasource need not be available at registration
time.

If you do not check this option, then you will need to manually configure
the datasource, as described in "Configuring Datasources" on page 71.
9. Click OK.
10. Test your database connection settings. For more information, see
"Testing ORS Connections" on page 67.
Note: When you register an ORS that has been used elsewhere, and the ORS already has Cleanse Match Servers registered but no other servers get registered, you need to re-register one of the Cleanse Match Servers. Doing so updates the data in c_repos_db_release.

Editing ORS Registration Properties


Note: Only certain ORS registration properties are editable. For non-editable
properties, you must instead unregister and re-register the ORS with the new
properties.

To edit registration settings for an ORS:


1. Start the Databases tool. For more information, see "Starting the
Databases Tool" on page 55.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Select the ORS that you want to configure.

4. Click the Edit button.


The Databases tool displays the Update Database Registration dialog box
for the selected ORS.

(The dialog box shows the properties for either the Service or the SID connection type.)

5. Edit any of the following settings:
• Database Display Name—Name for this ORS as it will be displayed in the Hub Console.
• Machine Identifier—Prefix given to keys to uniquely identify records from this instance of the Hub Store.
• Oracle TNS name—Name by which the database is known on your network, as defined in the application server’s TNSNAMES.ORA file.
• Password—By default, this is the password associated with the user name that was specified when the ORS was created. If a proxy user has been configured for this ORS, then you specify the password for the proxy user instead. For instructions on running the setup_ors.sql script and defining proxy users, see the Informatica MDM Hub Installation Guide.
• Update datasource after registration—Update the datasource on the application server with the updated settings.
6. To update the datasource on the application server with the modified
settings, select (check) the Update datasource after registration
check box.
Note: Updating the datasource settings might cause the JDBC connection
pool settings to be reset to the default values. Be sure to check the JDBC
connection pool settings before and after you click OK so that you can
reapply any customizations to the JDBC connection pool settings.
7. Click OK.
The Databases tool saves your changes.
8. Test your updated database connection settings. For more information,
see "Testing ORS Connections" on page 67.

Editing ORS Properties


To change properties for a registered ORS:
1. Start the Databases tool. For more information, see "Starting the
Databases Tool" on page 55.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Select the ORS that you want to configure.
The Databases tool displays the database properties for the selected ORS.

The following table describes these properties.

• Database Type—Oracle or DB2.
• Database ID—Identification for the ORS. This ID is used in SIF requests. The database ID lookup is case-sensitive.
SID connection type: hostname-sid-databasename
Service connection type: servicename-databasename
When registering a new ORS, the host, server, and database names are normalized: the host name is converted to lowercase, and the database name is converted to uppercase (the standard for schemas, tables, and so on). The normalization of each field can be done on a database-specific basis so that it can be changed if needed.
• JNDI Datasource Name—Displays the datasource JNDI name for the selected ORS. This is the JNDI name that is configured for this JDBC connection on the application server.
SID connection type: jdbc/siperian-hostname-sid-databasename-ds
Service connection type: jdbc/siperian-servicename-databasename-ds
• Machine Identifier—Prefix given to keys to uniquely identify records from this instance of the Hub Store.
• GETLIST Limit (records)—Limits the number of records returned through SIF search requests, such as searchQuery, searchMatch, getLookupValues, and so on.
• Production Mode—Specifies whether this ORS is in production mode. If not enabled (unchecked, the default), production mode is disabled, allowing authorized users to edit metadata for this ORS in the Hub Console. If enabled (checked), then production mode is enabled and users cannot make changes to the metadata for this ORS. If a user attempts to acquire a write lock on an ORS in production mode, the Hub Console will display a message explaining that the lock cannot be obtained. Note: Only Informatica MDM Hub administrator users can change this setting. For more information, see "Changing an ORS to Production Mode" on page 69.
• Transition Mode—Specifies whether this ORS is running in transition mode. Available only if Production Mode is enabled for this ORS. If selected (checked), then transition mode is enabled, allowing users to execute Metadata Manager Promote actions. If not selected (the default), then transition mode is not enabled. For more information, see the Informatica MDM Zero Downtime (ZDT) Install Guide and Informatica MDM Zero Downtime (ZDT) User Guide.
• Batch API Interoperability—Specifies whether this ORS will allow row-level locking for concurrently-executing, asynchronous SIF and batch operations. If selected (checked), then row-level locking is allowed for asynchronous SIF and batch operations. If not selected (the default), row-level locking is unavailable. For more information, see "Row-level Locking" on page 740.
• ZDT Enabled—Specifies whether this ORS is running in Zero Downtime (ZDT) mode. If selected (checked), then ZDT is enabled. If not selected (the default), then ZDT is not enabled. For more information, see the Informatica MDM Zero Downtime (ZDT) Install Guide and Informatica MDM Zero Downtime (ZDT) User Guide.
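For example (hypothetical names), an ORS registered over a service connection to mdmorcl.mydomain.com with database name CMX_ORS would be identified as follows:

Database ID:          mdmorcl.mydomain.com-CMX_ORS
JNDI datasource name: jdbc/siperian-mdmorcl.mydomain.com-CMX_ORS-ds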

4. To change a property, click the Edit button next to it, and edit the property.

5. Click the Save button to save your changes.


If production mode is enabled for an ORS, then the Databases tool displays
a lock icon next to it in the list.

Testing ORS Connections
To test a Hub Store connection to an ORS:
1. Start the Databases tool. For more information, see "Starting the
Databases Tool" on page 55.
2. Select the ORS that you want to test.

3. Click the Test Database button.

The Test Database command tests for:


• the database connection parameters via the JDBC connection
• the existence of the datasource
• a valid connection via the datasource
• a valid ORS version
Note for WebSphere: If the test connection fails through the Hub
Console, verify that the test connection is successful from the WebSphere
Console. The JNDI name is case sensitive and should match what is
generated in the Hub Console.
4. Click OK.
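If the test fails and you want to isolate the problem outside of the Hub Console and the application server, one option is to try the same connection parameters directly with SQL*Plus; the host, port, service, and user below are hypothetical examples:

-- Verify the ORS credentials and listener using the same connection details
sqlplus cmx_ors@"//orclhost:1521/mdmorcl.mydomain.com"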

Changing Passwords
To change passwords for the Master Database or an ORS, you need to make
changes first on your database server and possibly on your application server
as well.

Changing the Password for the Master Database

To change the Master Database password:


1. On your database server, change the password for the CMX_SYSTEM
database.
2. Log into the administration console for your application server and edit the
datasource connection information, specifying the new password for CMX_
SYSTEM, and then saving your changes.
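On Oracle, for example, the database-side change in step 1 is a standard password reset for the schema owner; the new password value here is a placeholder:

ALTER USER CMX_SYSTEM IDENTIFIED BY new_password;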

Changing the Password for an ORS

To change the password for an ORS:


1. On your database server, change the password for the ORS schema.
2. Start the Hub Console and select Master Database as the target database.
For more information, see "Changing the Target Database" on page 37.
3. Start the Databases tool. For more information, see "Starting the
Databases Tool" on page 55.
4. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
5. Select the ORS that you want to configure.
6. In the Database Properties panel, make a note of the JNDI Datasource
Name for the selected ORS.
7. Log into the administration console for your application server and edit the
datasource connection information for this ORS, specifying the new
password for the noted JNDI Datasource name, and then saving your
changes.
8. In the Databases tool, test the connection to the database according to the
instructions in "Testing ORS Connections" on page 67.

Encrypting Passwords

In order to successfully change the schema password, you must change it in the data sources defined in the application server. This password is not encrypted, because the application server protects it. In addition to updating the data sources on the application server, Informatica MDM Hub requires the password to be encrypted and stored in various tables.

Steps to Encrypt New Passwords

To encrypt the new password, execute the following command from the
prompt:
Usage:
java -classpath siperian-common.jar com.siperian.common.security.Blowfish
[key_type] plain_text_password

where key_type is either DB_PASSWORD_KEY (default) or PASSWORD_KEY.

The results will be echoed to the terminal window:


Plaintext Password: your_new_password
Encrypted Password: encrypted password

For example, if admin is your new password, then the command would be:

java -classpath siperian-common.jar com.siperian.common.security.Blowfish
PASSWORD_KEY admin
Plaintext Password: admin
Encrypted Password: A75FCFBCB375F229

Steps to Update Passwords for Your Schema

Execute the following commands to update the passwords for your ORS and
Master Database:

To update your ORS database password:

UPDATE C_REPOS_DB_RELEASE SET DB_PASSWORD = '<new_encrypted_password>';
COMMIT;

To update your Master Database password:

UPDATE C_REPOS_DATABASE SET PASSWORD = '<new_encrypted_password>'
WHERE USER_NAME = '<user_name>';
COMMIT;
CMX_SYSTEM/ORS User and Passwords

The following user names and passwords can be changed when installing and configuring MRM:
• The CMX_SYSTEM user should not be changed.
• The CMX_SYSTEM password can be changed after MRM is installed. You need to change the password for the CMX user in Oracle, and you need to set the same password in the datasource on the application server.
• The CMX_ORS user and password can be changed when setup_ors.sql is run. You need to use the same password when registering the ORS in the Hub Console.

Changing an ORS to Production Mode


The Hub Console allows administrators to lock the design of an ORS by
enabling production mode. Once production mode is enabled, write locks and
exclusive locks are not permitted, and no changes can be made to the schema
definition in the ORS. When a Hub Console user attempts to place a lock on an
ORS for which production mode is enabled, the Hub Console displays a
message to the user explaining that the lock cannot be obtained because the
ORS is in production mode. For more information, see "Acquiring Locks to
Change Settings in the Hub Console" on page 34.

To change the production mode flag for an ORS:


1. Log into the Hub Console with administrator-level privileges to the
Informatica MDM Hub implementation.
In order to change this setting, you must have sufficient privileges to run
the Databases tool and be able to obtain a lock on the Master Database.

2. Start the Databases tool. For more information, see "Starting the
Databases Tool" on page 55.
3. Clear any exclusive locks on the ORS.
Note: This setting cannot be changed if the ORS is locked exclusively.
4. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
5. Select the ORS that you want to configure.
The Databases tool displays the database properties for the selected ORS.
6. Change the setting of the Production Mode check box, as described in
"Editing ORS Properties" on page 64.
Select (check) the check box to enable production mode, or clear
(uncheck) it to disable it.

7. Click the Save button to save your changes.

Unregistering an ORS
Unregistering an ORS removes the connection information to this ORS from
the Master Database and removes the datasource definition from the
application server environment.

To unregister an ORS:
1. Start the Databases tool. For more information, see "Starting the
Databases Tool" on page 55.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Select the ORS that you want to unregister.

4. Click the button.


Note: If you are running WebLogic, enter the WebLogic user name and
password when prompted.
The Databases tool prompts you to confirm unregistering the ORS.
5. Click Yes.

Configuring Datasources
This section describes how to configure datasources for an ORS. Every ORS
requires a datasource definition in the application server environment.

About Datasources
In Informatica MDM Hub, a datasource specifies properties for an ORS, such
as the location of the database server, the name of the database, the database
user ID and password, and so on. An Informatica MDM Hub datasource points to
a JDBC resource defined in your application server environment. To learn
more about JDBC datasources, see your application server documentation.
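As a concrete (hypothetical) illustration, the JDBC resource for an ORS registered via an Oracle SID pairs the JNDI name generated by the Hub Console with the connection details supplied at registration:

JNDI name: jdbc/siperian-orclhost-orcl-CMX_ORS-ds
JDBC URL:  jdbc:oracle:thin:@//orclhost:1521/orcl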

Managing Datasources in WebLogic


For WebLogic application servers, whenever you attempt to add, delete, or
update a datasource, Informatica MDM Hub prompts you to specify the
application server administrative username and password. If you are
performing multiple operations in the Databases tool, this dialog box
remembers the last username that was entered, but always requires you to
enter the password.

Creating Datasources
You might need to explicitly create a datasource if, for example, you created
an ORS using a different application server, or if you did not check (select) the
Create datasource after registration check box when registering the
ORS.

To create a datasource:
1. Start the Databases tool. For more information, see "Starting the
Databases Tool" on page 55.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Right-click the ORS in the Databases list, and then choose Create
Datasource.
Note: If you are running WebLogic, enter the WebLogic user name and
password when prompted.
The Databases tool creates the datasource and displays a progress
message.

4. Click OK.

Removing Datasources
If you have registered an ORS with a configured datasource, you can use the
Databases tool to manually remove its datasource definition from your
application server. After removing the datasource definition, however, the
ORS will still appear in Hub Console. To completely remove a database from
the Hub Console, you need to unregister it (see "Unregistering an ORS" on
page 70).

To remove a datasource:
1. Start the Databases tool. For more information, see "Starting the
Databases Tool" on page 55.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Right-click an ORS in the Databases list, and then choose Remove
Datasource.
Note: If you are running WebLogic, enter the WebLogic user name and
password when prompted.
The Databases tool removes the datasource and displays a progress
message.

4. Click OK.

Chapter 5: Building the Schema

This chapter explains how to design and build your schema in Informatica
MDM Hub.

Chapter Contents
• "Before You Begin" on page 73
• "About the Schema" on page 73
• "Starting the Schema Manager" on page 81
• "Configuring Base Objects" on page 82
• "Configuring Columns in Tables" on page 102
• "Configuring Foreign-Key Relationships Between Base Objects" on page
113
• "Viewing Your Schema" on page 119

Before You Begin


Before you begin, you must have installed Informatica MDM Hub and created the Hub Store (including an Operational Reference Store) according to the instructions in the Informatica MDM Hub Installation Guide.

About the Schema


The schema is the data model that is used in your Informatica MDM Hub
implementation. Informatica MDM Hub does not impose or require any
particular schema. The schema exists inside Informatica MDM Hub and is
independent of the source systems providing data to Informatica MDM Hub.

Note: The process of designing the schema for your Informatica MDM Hub
implementation is outside the scope of this document. It is assumed that you
have developed a data model—using industry-standard data modeling
methodologies—that is based on a thorough understanding of your
organization’s requirements and in-depth knowledge of the data you are
working with.

The Informatica schema is a flexible, repository-driven model that supports the data structure of any vertical business sector. The Hub Store is the database that underpins Informatica MDM Hub and provides the foundation of Informatica MDM Hub’s functionality. Every Informatica MDM Hub installation has a Hub Store, which includes one Master Database and one or more Operational Reference Store (ORS) databases. Depending on the configuration of your system, you can have multiple ORS databases in an installation. For
example, you could have a development ORS, a testing ORS, and a production
ORS. For more information, see "About the Hub Store" on page 51 and
"Configuring Operational Reference Stores and Datasources" on page 54.

Before you begin to implement the schema, you must understand the basic
structure of the underlying Informatica MDM Hub schema and its components.
This section introduces the most important tables in an ORS and how they
work together.

Note: You must use tools in the Hub Console to define and manage the
consolidated schema—you cannot make changes directly to the database. For
example, you must use the Schema Manager to define tables and columns.
For details, see "Requirements for Defining Schema Objects" on page 77.

Types of Tables in an Operational Reference Store


An ORS contains both tables that you configure and system support tables.

Configurable Tables

The following types of Informatica MDM Hub tables are used to model
business reference data. You must explicitly create and configure these
tables.
Types of Configurable Tables in an ORS
• Base object—Used to store data for a central business entity (such as customer, product, or employee) or a lookup table (such as country or state). In a base object table (or simply a base object), you can consolidate data from multiple source systems and use trust settings to determine the most reliable value of each base object cell. You can define one-to-many relationships between base objects. Base objects must be explicitly created and configured according to the instructions in "Process Overview for Defining Base Objects" on page 83.
• Landing table—Used to receive batch loads from a source system. Landing tables must be explicitly created and configured according to the instructions in "Configuring Landing Tables" on page 269.
• Staging table—Used to load data into a base object. Mappings are defined between landing tables and staging tables to specify whether and how data is cleansed and standardized when it is moved from a landing table to a staging table. Staging tables must be explicitly created and configured according to the instructions in "Configuring Staging Tables" on page 275.

Infrastructure Tables

The following types of Informatica MDM Hub infrastructure tables are used to
manage and support the flow of data in the Hub Store. Informatica MDM Hub
automatically creates, configures, and maintains these tables whenever you
configure base objects.
Types of Infrastructure Tables in an ORS
• Cross-reference table—Used for tracking the origin of each record in the base object. Named according to the following pattern: C_baseObjectName_XREF, where baseObjectName is the root name of the base object (for example, C_PARTY_XREF). For this reason, this table is sometimes referred to as the XREF table. When you create a base object, Informatica MDM Hub automatically creates a cross-reference table to store information about data coming from source systems. For more information, see "Cross-Reference Tables" on page 86.
• History table—Used if history is enabled for a base object (see "Enable History" on page 90). Named according to the following pattern: C_baseObjectName_HIST is the base object history table, as described in "Base Object History Tables" on page 89, and C_baseObjectName_HXRF is the cross-reference history table, as described in "Cross-Reference History Tables" on page 89, where baseObjectName is the root name of the base object (for example, C_PARTY_HIST and C_PARTY_HXRF). Informatica MDM Hub creates and maintains several different history tables to provide detailed change-tracking options, including merge and unmerge history, history of the pre-cleansed data, history of the base object, and the cross-reference history.
• Match key table—Contains the match keys that were generated for all base object records. Named according to the following pattern: C_baseObjectName_STRP, where baseObjectName is the root name of the base object (for example, C_PARTY_STRP). For more information, see "Match Key Tables" on page 240.
• Match table—Contains the pairs of matched records in the base object resulting from the execution of the match process on this base object. Named according to the following pattern: C_baseObjectName_MTCH, where baseObjectName is the root name of the base object (for example, C_PARTY_MTCH). For more information, see "Populating the Match Table with Match Pairs" on page 252.
• External match table—Uses input (C_baseObjectName_EMI) and output (C_baseObjectName_EMO) tables. The EMI table contains records to match against the records in the base object. The EMO table contains the output data for External Match jobs; each row in the EMO table represents a pair of matched records—one from the EMI table and one from the base object. For more information, see "External Match Jobs" on page 535 and "External Match Jobs" on page 572.
• Temporary tables—Informatica MDM Hub creates various temporary tables as needed while processing data (such as during batch jobs). Once the temporary tables are no longer needed, they are automatically and periodically removed by a background process.
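Putting these naming patterns together, a base object named C_PARTY would be accompanied by companion tables such as:

C_PARTY_XREF  (cross-reference table)
C_PARTY_HIST  (base object history table)
C_PARTY_HXRF  (cross-reference history table)
C_PARTY_STRP  (match key table)
C_PARTY_MTCH  (match table)
C_PARTY_EMI   (external match input table)
C_PARTY_EMO   (external match output table)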

Supported Relationships Among Data

Informatica MDM Hub supports one:many and many:many relationships among tables, as well as hierarchical relationships between records in the same base object. In Informatica MDM Hub, relationships between records can be defined in various ways.

The following types of relationships are supported:
• Foreign key relationship between base objects—One base object (the child) contains a foreign key column, which contains values that match values in the primary key column of another base object (the parent). For more information, see "Process Overview for Defining Foreign-Key Relationships" on page 114 and "Configuring Foreign-Key Relationships Between Base Objects" on page 113.
• Records within the same base object—Within a base object, records are related to each other hierarchically. This allows you to define many-to-many relationships within the base object. For more information, see "Intra-Table Paths" on page 377.

Once these relationships are configured in the Hub Console, you can use these
relationships to configure match column rules by defining match paths
between records. For more information, see "Configuring Match Paths for
Related Records" on page 373.

Requirements for Defining Schema Objects
This section describes requirements for configuring schema objects.

Make Schema Changes Only in the Hub Console

Informatica MDM Hub maintains schema consistency, provided that all model changes are done using the Hub Console tools, and that no changes are made directly to the database. Informatica MDM Hub provides all the tools necessary for maintaining the schema.

Think Before You Change the Schema

Important: Schema changes can involve risk to data and should be approached in a managed and controlled manner. You should plan the changes to be made and analyze the impact of the changes before making them. You should also back up the database before making any changes.

You Must Have a Write Lock to Change the Schema

In order to make any changes to the schema, you must have a write lock. For
more information, see "Acquiring a Write Lock" on page 36.

Rules for Database Object Names

Database object names cannot be longer than 22 characters.

Reserved Strings for Database Object Names

Note: To understand which Hub processes create which tables and how to best
manage these tables, please refer to the “Transient Tables” technical note
found on the SHARE portal.

Informatica MDM Hub creates metadata objects that use prefixes and suffixes
added to the names you use for base objects. In order to avoid confusion and
possible data loss, database object names must not use the following strings
as either names or suffixes.
_BVTB _STRPT _TMIN BVLNK_ TCMN_ TGV_
_BVTC _T _TML0 BVTX_ TCMO_ TGV1_
_BVTV _TBKF _TMMA BVTXC_ TCRN_ TLL
_C _TBVB _TMNX BVTXV_ TCRO_ TMA_
_CL _TBVC _TMP0 CLC_ TCSN_ TMF_
_D _TBVV _TMST CSC_ TCSO_ TMMA_

_DLT _TC0 _TNPMA CTL TCVN_ TMR_
_EMI _TC1 _TPMA EXP_ TCVO_ TPBR_
_EMO _TDEL _TPRL GG TCXN_ TRBX_
_HIST _TEMI _TRAW HMRG TCXO_ TUCA_
_HUID _TEMO _TRLG LNK TDCC_ TUCC_
_HXRF _TEMP _TRLT M TDEL_ TUCF_
_JOBS _TEST _TSD PRL TDUMP_ TUCR_
_L _TGA _TSI T_verify_ TUCT_
_LINK _TGA1 _TSNU TBDL_ TFK_ TUCX_
_LMH _TGB _TSTR TBOX_ TFX_ TUDL_
_LMT _TGB1 _TUID TBXR_ TGA_ TUGR_
_MTBM _TGC _TVXR TCBN_ TGB_ TUHM_
_MTCH _TGC1 _VCT TCBO_ TGB1_ TUID_
_MTFL _TMG0 _XREF TCCN_ TGC_ TUK_
_MTFU _TMG1 BV0_ TCCO_ TGC1_ TUPT_
_MVLE _TMG2 BV1_ TCGN_ TGD_ TUTR_
_OPL _TMG3 BV2_ TCGO_ TGF_ TVXRD_
_ORAW _TMGA BV3_ TCHN_ TGM_ TXDL_
_STRP _TMGB BV5_ TCHO_ TGMD_ TXPR_

Reserved Column Names

The following column names are reserved and cannot be used for user-defined
columns.
AFFECTED_LEVEL_CODE ORIG_TGT_ROWID_OBJECT
AFFECTED_ROWID_COLUMN PKEY_SRC_OBJECT
AFFECTED_ROWID_OBJECT PKEY_SRC_OBJECT1
AFFECTED_ROWID_XREF PKEY_SRC_OBJECT2
AFFECTED_SRC_VALUE PREFERRED_KEY_IND
AFFECTED_TGT_VALUE PROMOTE_IND
AUTOLINK_IND PUT_UPDATE_MERGE_IND
AUTOMERGE_IND REPOINTED_IND
CONSOLIDATION_IND ROOT_IND
CREATE_DATE ROU_IND
CREATOR ROWID_GROUP
CTL_ROWID_OBJECT ROWID_JOB
DATA_COUNT ROWID_KEY_CONSTRAINT
DATA_ROW ROWID_MATCH_RULE

DELETED_BY ROWID_OBJECT
DELETED_DATE ROWID_OBJECT_MATCHED
DELETED_IND ROWID_OBJECT_NUM
DEP_PKEY_SRC_OBJECT ROWID_OBJECT1
DEP_ROWID_SYSTEM ROWID_OBJECT2
DIRTY_IND ROWID_SYSTEM
ERROR_DESCRIPTION ROWID_TASK
FILE_NAME ROWID_USER
FIRSTV ROWID_XREF
GENERATED_XREF ROWID_XREF1
GROUP_ID ROWID_XREF2
GVI_NO ROWKEY
HIST_CREATE_DATE RULE_NO
HIST_UPDATE_DATE SDSRCFLG
HSI_ACTION SEQ
HUB_STATE_IND SOURCE_KEY
INTERACTION_ID SOURCE_NAME
INVALID_IND SRC_LUD
LAST_ROWID_SYSTEM SRC_ROWID
LAST_UPDATE_DATE SRC_ROWID_OBJECT
LASTV SRC_ROWID_XREF
LOST_VALUE SSA_DATA
MATCH_REVERSE_IND SSA_KEY
MERGE_DATE STRIP_DATE
MERGE_OPERATION_ID TGT_ROWID_OBJECT
MERGE_UPDATE_NULL_ALLOW_IND TOTAL_BO_IND
MERGE_VIA_UNMERGE_IND TREE_UNMERGE_IND
MRG_SRC_ROWID_OBJECT UNLINK_IND
MRG_TGT_ROWID_OBJECT UNMERGE_DATE
NULL_INDICATOR_BITMAP UNMERGE_IND
NUM_CONTR UNMERGE_OPERATION_ID
OLD_AFFECTED UPDATED_BY
ONLINE_IND WIN_VALUE
ORIG_ROWID_OBJECT_MATCHED XREF_LUD

If you use a reserved column name, a warning message is displayed. For
example: "The column physical name "XREF_LUD" is a reserved name.
Reserved names cannot be used."

Other Reserved Words
ADD DECLARE LANGUAGE SELECT
ADMIN DEFAULT LEVEL SEQUENCE
AFTER DELETE LIKE SESSION
ALL DESC MAX SET
ALLOCATE DISTINCT MIN SIZE
ALTER DOUBLE MODIFY SMALLINT
AND DROP MODULE SOME
ANY DUMP NATURAL SPACE
ARRAY EACH NEW SQL
AS ELSE NEXT SQLCODE
ASC END NONE SQLERROR
AT ESCAPE NOT SQLSTATE
AUTHORIZATION EXCEPT NULL START
AVG EXCEPTION NUMERIC STATEMENT
BACKUP EXEC OF STATISTICS
BEFORE EXECUTE OFF SUM
BEGIN EXISTS OLD TABLE
BETWEEN EXIT ON TEMPORARY
BLOB FETCH ONLY TERMINATE
BOOLEAN FILE OPEN THEN
BY FLOAT OPTION TIME
CASCADE FOR OR TO
CASE FOREIGN ORDER TRANSACTION
CHAR FORTRAN OUT TRIGGER
CHARACTER FOUND PLAN TRUNCATE
CHECK FROM PRECISION UNDER
CHECKPOINT FUNCTION PRIMARY UNION
CLOB GO PRIOR UNIQUE
CLOSE GOTO PRIVILEGES UPDATE
COLUMN GRANT PROCEDURE USE
COMMIT GROUP PUBLIC USER
CONNECT HAVING READ USING
CONSTRAINT IF REAL VALUES
CONSTRAINTS IMMEDIATE REFERENCES VARCHAR
CONTINUE IN REFERENCING VIEW
COUNT INDEX RETURN WHEN
CREATE INDICATOR REVOKE WHENEVER
CURRENT INSERT ROLE WHERE
CURSOR INT ROLLBACK WHILE
CYCLE INTEGER ROW WITH
DATABASE INTERSECT ROWS WORK
DATE INTO SAVEPOINT WRITE
DEC IS SCHEMA FALSE
DECIMAL KEY SECTION TRUE

Adding Columns for Technical Reasons

For purely technical reasons, you might want to add columns to a base object.
For example, for a segment match, you must add a segment column. For
more information on adding columns for segment matches, see "Segment
Matching" on page 422.

We recommend that you distinguish columns added to base objects for purely
technical reasons from those added for other business reasons, because you
generally do not want to include these columns in most views used by data
stewards. Prefixing these column names with a specific identifier, such as
CSTM_, is one way to easily filter them out.
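
For example, if you adopt a CSTM_ prefix, you can list the remaining
(business) columns of a base object with a simple dictionary query. The
following is a minimal sketch for Oracle; the C_CUSTOMER table name and the
CSTM_ prefix are illustrative assumptions, not part of the product.

    -- List columns on a hypothetical C_CUSTOMER base object, excluding
    -- columns that were added for purely technical reasons (CSTM_ prefix).
    SELECT column_name
    FROM   user_tab_columns
    WHERE  table_name = 'C_CUSTOMER'
    AND    column_name NOT LIKE 'CSTM\_%' ESCAPE '\'
    ORDER  BY column_id;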

Starting the Schema Manager


You use the Schema Manager in the Hub Console to define the schema, staging
tables, and landing tables. The Schema Manager is also used to define rules
for match and merge, validation, and message queues.

To start the Schema Manager:


• In the Hub Console, expand the Model workbench, and then click Schema.
The Hub Console displays the Schema Manager.

The Schema Manager is divided into two panes.


• Navigation pane: Shows (in a tree view) the core schema objects: base
objects and landing tables. Expanding an object in the tree shows you the
property groups available for that object.
• Properties pane: Shows the properties for the selected object in the
left-hand pane. Clicking any node in the schema tree displays the
corresponding properties page (which you can view and edit) in the
right-hand pane.

For general instructions about using the Schema Manager, see "Navigating the
Hub Console" on page 32. You must use the Schema Manager when defining

- 81 -
tables in an ORS, as described in "Requirements for Defining Schema Objects"
on page 77.

Configuring Base Objects


This section describes how to configure base objects for your Informatica
MDM Hub implementation.

About Base Objects


In Informatica MDM Hub, central business entities—such as customers,
accounts, products, or employees—are represented in tables called base
objects. A base object is a table in the Hub Store that contains collections of
data about individual entities—such as customer A, customer B, customer C,
and so on.

Each individual entity has a single master record—the best version of the
truth—for that entity. An individual entity might have additional records in the
base object (contributing records) that contain the “multiple versions of the
truth” that need to be consolidated into the master record. Consolidation is the
process of merging duplicate records into a single consolidated record that
contains the most reliable cell values from all of the source records.

Important: You must use the Schema Manager to define base objects—you
cannot configure them directly in the database. For more information, see
"Requirements for Defining Schema Objects" on page 77.

Relationships Between Base Objects and Other Tables in the Hub Store

The following figure shows base objects in relation to other tables in the
Hub Store.

Process Overview for Defining Base Objects
To define a base object:
1. Using the Schema Manager, create a base object table according to the
instructions in "Creating Base Objects" on page 95.
The Schema Manager automatically adds system columns, as described in
"Base Object Columns" on page 84.
2. Add the user-defined columns that will contain business data according to
the instructions in "Configuring Columns in Tables" on page 102.
Note: Column names cannot be longer than 26 characters.
3. While configuring column properties, specify which column(s) will use
trust to determine the most reliable value when different source systems
provide different values for the same cell. For more information, see
"Configuring Trust for Source Systems" on page 344.
4. For this base object, create one staging table per source system according
to the instructions in "Configuring Staging Tables" on page 275. For each
staging table, select the base object columns that you want to include.
5. Create any landing tables that you need to store data from source
systems. For more information, see "Configuring Landing Tables" on page
269.
6. Map the landing tables to the staging tables according to the instructions in
"Mapping Columns Between Landing and Staging Tables" on page 286.
If any columns need data cleansing, specify the cleanse function in the
mapping according to the instructions in "Configuring Data Cleansing" on
page 307.
Each staging table must get its data from one landing table (with any
intervening cleanse functions), but the same landing table can provide
data to more than one staging table. Map the primary key column of the
landing table to the PKEY_SRC_OBJECT column in the staging table.
7. Populate each landing table with data using an ETL tool or some other
process, as described in "Land Process" on page 221.

Base Object Columns


Base objects have two types of columns:
• System columns: Columns that are automatically created and maintained by
the Schema Manager.
• User-defined columns: Columns that have been added by users according to
the instructions in "Configuring Columns in Tables" on page 102.

Base objects have the following system columns.


System Columns for Base Objects

ROWID_OBJECT (CHAR(14))
Primary key. Unique value assigned by Informatica MDM Hub whenever a new
record is inserted into the base object.

CREATOR (VARCHAR(50))
User or process responsible for creating the record.

CREATE_DATE (DATE)
Date on which the record was created.

UPDATED_BY (VARCHAR(50))
User or process responsible for the most recent update on the record.

LAST_UPDATE_DATE (DATE)
Date of the most recent update to any cell on the record.

CONSOLIDATION_IND (INT)
Integer value indicating the consolidation state of this record. Valid
values are:
• 1=unique (represents the best version of the truth)
• 2=ready for consolidation
• 3=ready for match; this record is a match candidate for the
currently-executing match process
• 4=available for match; this record is new (load insert) or has been
updated (load update) and needs to undergo the match process
• 9=on hold (data steward has put this record on hold until further notice)
For more information, see "Consolidation Status for Base Object Records" on
page 219.

DELETED_IND (INT)
Reserved for future use.

DELETED_BY (VARCHAR(50))
Reserved for future use.

DELETED_DATE (DATE)
Reserved for future use.

LAST_ROWID_SYSTEM (CHAR(14))
The identifier of the system responsible for the most recent update to any
cell in the base object record. Foreign key referencing the ROWID_SYSTEM
column on the C_REPOS_SYSTEM table.

DIRTY_IND (INT)
Used to determine whether the tokenize process generates match keys for
this record. Valid values are:
• 0 = record is up to date
• 1 = record is new or has been updated and needs to be tokenized
After the record has been tokenized, this flag is reset to zero (0). For
more information, see "Base Object Records Flagged for Tokenization" on
page 243.

INTERACTION_ID (INT)
For state-enabled base objects only. Interaction identifier that is used to
protect a pending cross-reference record from updates that are not part of
the same process as the original cross-reference record. For details, see
"Protecting Pending Records Using the Interaction ID" on page 161.

HUB_STATE_IND (INT)
For state-enabled base objects only. Integer value indicating the state of
this record. Valid values are:
• 0=Pending
• 1=Active (Default)
• -1=Deleted
For details, see "Hub State Indicator" on page 160.
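
Because CONSOLIDATION_IND is an ordinary column on the base object, you can
check the consolidation state of your data with a read-only query. A
minimal sketch, assuming a hypothetical base object named C_CUSTOMER:

    -- Count records in each consolidation state.
    -- 1=unique, 2=ready for consolidation, 3=ready for match,
    -- 4=available for match, 9=on hold
    SELECT consolidation_ind, COUNT(*) AS record_count
    FROM   c_customer
    GROUP  BY consolidation_ind
    ORDER  BY consolidation_ind;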

Cross-Reference Tables
This section describes cross-reference tables in the Hub Store.

About Cross-Reference Tables

Each base object has one associated cross-reference table (or XREF table),
which is used for tracking the lineage (origin) of records in the base object.
Informatica MDM Hub automatically creates a cross-reference table when you
create a base object. Informatica MDM Hub uses cross-reference tables to
translate all source system identifiers into the appropriate ROWID_OBJECT
values.

Records in Cross-Reference Tables

Each row in the cross-reference table represents a separate record from a
source system. If multiple sources provide data for a single column (for
example, the phone number comes from both the CRM and ERP systems), then
the cross-reference table contains separate records from each source
system. Each base object record will have one or more associated
cross-reference records.

The cross-reference record contains:


• an identifier for the source system that provided the record
• the primary key value of that record in the source system
• the most recent cell value(s) provided by that system
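
For example, you can see the lineage of a single consolidated record with a
read-only query against its cross-reference table. A minimal sketch,
assuming a hypothetical C_CUSTOMER base object whose cross-reference table
is named C_CUSTOMER_XREF:

    -- Show which source systems contributed to one base object record.
    SELECT rowid_system, pkey_src_object, src_lud
    FROM   c_customer_xref
    WHERE  rowid_object = :rowid_object;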

Load Process and Cross-Reference Tables

The load process populates cross-reference tables. During load inserts, new
records are added to the cross-reference table. During load updates, changes
are written to the affected cross-reference record(s).

Data Steward Tools and Cross-Reference Tables

Cross-reference records are visible in the Merge Manager and can be modified
using the Data Manager. For more information, see the Informatica MDM Hub
Data Steward Guide.

Relationships Between Base Objects and Cross-Reference Tables

The following figure shows an example of the relationships between base
objects, cross-reference tables, and C_REPOS_SYSTEM.

Columns in Cross-reference Tables

Cross-reference tables have the following system columns. Note that cross-
reference tables have a unique key representing the combination of the PKEY_
SRC_OBJECT and ROWID_SYSTEM columns.
ROWID_XREF (NUMBER(38))
Primary key that uniquely identifies this record in the cross-reference
table.

PKEY_SRC_OBJECT (VARCHAR2(255))
Primary key value from the source system. Multi-field/multi-column keys
from source systems must be concatenated into a single key value using the
Informatica MDM Hub internal cleanse process (see "About Data Cleansing in
Informatica MDM Hub" on page 307) or an external cleanse process (an ETL
tool or some other data loading utility).

ROWID_SYSTEM (CHAR(14))
Foreign key to C_REPOS_SYSTEM, which is the Informatica MDM Hub repository
table that stores an Informatica MDM Hub identifier and description of each
source system that can populate the ORS. For more information, see
"Configuring Source Systems" on page 264.

ROWID_OBJECT (CHAR(14))
Foreign key to the base object. Unique value assigned by Informatica MDM
Hub to the associated record in the base object.

SRC_LUD (DATE)
Last source update date. Updated only when an update is received from the
source system.

CREATOR (VARCHAR2(50))
User or process responsible for creating the cross-reference record.

CREATE_DATE (DATE)
Date on which the cross-reference record was created.

UPDATED_BY (VARCHAR2(50))
User or process responsible for the most recent update to the
cross-reference record.

LAST_UPDATE_DATE (DATE)
Date of the most recent update to any cell in the cross-reference record.
Can be updated as applicable during the load and consolidation processes.

DELETED_IND (NUMBER(38))
Reserved for future use.

DELETED_BY (VARCHAR2(50))
Reserved for future use.

DELETED_DATE (DATE)
Reserved for future use.

PUT_UPDATE_MERGE_IND (NUMBER(38))
Indicates whether a record has been edited using the Data Manager.

INTERACTION_ID (NUMBER(38))
For state-enabled base objects only. Interaction identifier that is used to
protect a pending cross-reference record from updates that are not part of
the same process as the original cross-reference record. For more
information, see "Protecting Pending Records Using the Interaction ID" on
page 161.

HUB_STATE_IND (NUMBER(38))
For state-enabled base objects only. Integer value indicating the state of
this record. Valid values are:
• 0=Pending
• 1=Active (Default)
• -1=Deleted
For more information, see "Hub State Indicator" on page 160.

PROMOTE_IND (NUMBER(38))
For state-enabled base objects only. Integer value indicating the promotion
status. Used by the Promote job to determine whether to promote the record
to an ACTIVE state. Valid values are:
• 0=Do not promote this record
• 1=Promote this record to ACTIVE
This value is not changed to 0 during the Promote job if the record is not
promoted. For more information, see "Promoting Records Using the Promote
Batch Job" on page 166.

History Tables
This section describes history tables in the Hub Store. If history is enabled for
a base object (see "Enable History" on page 90), then Informatica MDM Hub
maintains history tables for base objects and cross-reference tables. History
tables are used by Informatica MDM Hub to provide detailed change-tracking
options, including merge and unmerge history, history of the pre-cleansed
data, history of the base object, the cross-reference history, and so on.

Base Object History Tables

A history-enabled base object has a single history table (named
C_baseObjectName_HIST) that contains historical information about data
changes in the base object. Whenever a record is added or updated in the
base object, a new record is inserted into the base object history table to
capture the event.

Cross-Reference History Tables

A history-enabled base object has a single cross-reference history table
(named C_baseObjectName_HXRF) that contains historical information about
data changes in the cross-reference table. Whenever a record changes in the
cross-reference table, a new record is inserted into the cross-reference
history table to capture the event.
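
Because history tables are ordinary tables in the Hub Store, you can
inspect the captured events with a read-only query. A minimal sketch,
assuming a history-enabled base object named C_CUSTOMER and assuming the
HIST_CREATE_DATE system column on its history table:

    -- Review change events captured during the last seven days.
    SELECT rowid_object, hist_create_date
    FROM   c_customer_hist
    WHERE  hist_create_date > SYSDATE - 7
    ORDER  BY hist_create_date DESC;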

Base Object Properties


This section describes the basic and advanced properties for base objects.

Basic Base Object Properties

This section describes the basic base object properties.

Item Type

The type of table that you are adding. Select Base Object.

Display Name

The name of this base object as it will be displayed in the Hub Console. Enter a
descriptive name.

Physical Name

The actual name of the table in the database. Informatica MDM Hub will
suggest a physical name for the table based on the display name that you
enter. Make sure that you do not use any reserved name suffixes, as
described in "Rules for Database Object Names" on page 77.

Data Tablespace

The name of the data tablespace. Read-only. For more information, see the
Informatica MDM Hub Installation Guide.

Index Tablespace

The name of the index tablespace. Read-only. For more information, see the
Informatica MDM Hub Installation Guide.

Description

A brief description of this base object.

Enable History

Specifies whether history is enabled for this base object. If enabled,
Informatica MDM Hub keeps a log of records that are inserted, updated, or
deleted for this base object. You can use the information in history tables
for audit purposes. For more information, see "History Tables" on page 88.

Advanced Base Object Properties

This section describes the advanced base object properties.

Complete Tokenize Ratio

When the percentage of records that have changed is higher than this value,
a complete re-tokenization is performed. If the percentage of records to be
tokenized does not exceed this threshold, then Informatica MDM Hub deletes
the records requiring re-tokenization from the match key table, calculates
the tokens for those records, and then reinserts them into the match key
table. The default value is 60. For more information, see "Tokenize
Process" on page 240.

Note: Deleting can be a slow process. However, if your Cleanse Match Server
is fast and the network connection between Cleanse Match Server and the
database server is also fast, then you may test with a much lower tokenization
threshold (such as 10%). This will enable you to determine whether there are
any gains in performance.

Allow constraints to be disabled

During the initial load/updates—or if there is no real-time, concurrent
access—you can disable the referential integrity constraints on the base
object to improve performance. The default value is 1, signifying that
constraints are disabled. For more information, see "Load Process" on page
227 and "Configuring the Load Process" on page 343.

Duplicate Match Threshold

This parameter is used only with the Match for Duplicate Data job for initial
data loads. The default value is 0. To enable this functionality, this value must
be set to 2 or above. For more information, see "Match for Duplicate Data
Jobs" on page 552 and the Informatica MDM Hub Data Steward Guide.

Load Batch Size

The load process inserts and updates records in the base object in batches.
The load batch size specifies the number of records to load per batch cycle
(the default is 1000000). For more information, see "Loading Records by
Batch" on page 231 and "Configuring the Load Process" on page 343.

Max Elapsed Match Minutes

This specifies the execution timeout (in minutes) when executing a match
rule. If this time limit is reached, then the match process (whenever a
match rule is executed, either manually or via a batch job) exits. If the
match process is executed as part of a batch job, the system moves on to
the next match; it stops if this is a single match process. The default
value is 20. Increase this value only if the match rule and data are very
complex. Generally, rules are able to complete within 20 minutes (the
default). For more information, see "Match Process" on page 245 and
"Configuring the Match Process" on page 363.

Parallel Degree

Oracle only. This specifies the degree of parallelism set on the base object
table and its related tables. It does not take effect for all batch processes, but
can have a beneficial effect on performance when it is used. However, its use
is constrained by the number of CPUs on the database server machine, as well
as the amount of memory available. The default value is 1.

Requeue On Parent Merge

If this value is greater than zero, then when parent records are merged,
the related child records are set as unconsolidated; that is, they are
flagged as New again (consolidation indicator of 4, see "Consolidation
Status for Base Object Records" on page 219) so that they can be matched.
The default value is 0. For more information, see "Consolidation Indicator"
on page 219 and "Immutable Rowid Object" on page 443.

Generate Match Tokens on Load

If selected (checked), then the tokenize process (see "Tokenize Process" on


page 240) executes after the completion of the load process. This is useful for
intertable match scenarios in which the parent must be loaded first, followed
by the child match/merge. By not generating match tokens for the parent, the
child match/merge will not need to update any of the parent records in the
match key table.

Once the child match/merge is complete, you can run the match process on
the parent to force it to tokenize. This is also useful in cases where you have a
limited window in which to perform the load process. Not tokenizing will save
time in the load process, at the cost of tokenizing the data later.

You must tokenize before you match your data. For more information, see
"Load Process" on page 227, "Generating Match Tokens (Optional)" on page
239, and "Generating Match Tokens During Load Jobs" on page 544.

Generate Match Tokens on Put

You can PUT data into a base object using the Data Manager (see the
Informatica MDM Hub Data Steward Guide). If you are using the Data
Manager to PUT data, you can enable (check) this value to tokenize your data
later. Performing this operation later allows you to process PUT requests
faster. Use this only when you know that the data will not be matched
immediately. For more information, see "Tokenize Process" on page 240.

Note: Do not use the Generate Match Tokens on Put option if you are using the
SIF API. If you have this parameter enabled, your SIF Put and CleansePut
requests will fail. Use the Tokenize request instead. Enable Generate Match
Tokens on Put only if you are not using the SIF API and you want data steward
updates from the Hub Console to be tokenized immediately. For more
information, see "Editing Base Object Properties" on page 95.

Match Flag Audit Table

Specifies whether a match flag audit table is created.


• If checked (selected), then an audit table (BusinessObjectName_FMHA) is
created and populated with the userID of the user who, in Merge Manager,
queued a manual match record for automerging. For more information
about the Merge Manager tool, see the Informatica MDM Hub Data
Steward Guide.
• If unchecked (not selected), then the Updated_By column is set to the
userID of the person who executed the Automerge batch job.

For more information, see "Match Process" on page 245 and "Configuring the
Match Process" on page 363.

API “lock wait” interval (seconds)

Specifies the maximum number of seconds that a SIF request will wait to
obtain a row-level lock. Applies only if row-level locking is enabled for an
ORS, as described in "Enabling Row-level Locking on an ORS" on page 741. For
more information, see "Row-level Locking" on page 740.

Batch “lock wait” interval (seconds)

Specifies the maximum number of seconds that a batch job will wait to obtain
a row-level lock. Applies only if row-level locking is enabled for an ORS, as
described in "Enabling Row-level Locking on an ORS" on page 741. For more
information, see "Row-level Locking" on page 740.

Enable State Management

Specifies whether Informatica MDM Hub manages the system state for
records in this base object. By default, state management is disabled. Select
(check) this check box to enable state management for this base object in
support of approval workflows. If enabled, this base object is referred to in
this document as a state-enabled base object. For more information, see
"State Management" on page 159 and "Enabling State Management" on page
163.

Note: If the base object has a custom query, then when you disable state
management on the base object, you will always get a warning pop-up window,
even when HUB_STATE_IND is not included in the custom query.

Enable History of Cross-Reference Promotion

For state-enabled base objects, specifies whether Informatica MDM Hub
maintains the promotion history for cross-reference records that undergo a
state transition from PENDING (0) to ACTIVE (1). By default, this option is
disabled. For more information, see "State Management" on page 159 and
"Enabling the History of Cross-Reference Promotion" on page 163.

Base Object Style

Select the style (merge or link) for this base object.


• A merge-style base object (the default) is used with Informatica MDM
Hub’s match and merge capabilities.
• A link-style base object is used with Informatica MDM Hub’s match and
link capabilities. If selected, Informatica MDM Hub creates a LINK table for
this base object.
If you change a link-style base object back to a merge-style base object,
the Schema Manager prompts you to confirm whether you want to drop the
LINK table.

Lookup Indicator

Specifies how values are retrieved in the Informatica MDM Hub Business Data
Director.
• If selected (enabled), then the Business Data Director displays drop-down
lists of lookup values.
• If not selected (disabled), then the Business Data Director displays a
search wizard that prompts users to select a value from a data table.

Creating Base Objects
To create each base object in your schema:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Right-click in the left pane of the Schema Manager and choose Add Item
from the popup menu.
The Schema Manager displays the Add Table dialog box.

4. Specify the basic base object properties. For more information, see "Basic
Base Object Properties" on page 89.
5. Click OK.
The Schema Manager creates the new base table in the Operational
Reference Store (ORS), along with any support tables, and then adds the
new base object table to the schema tree.

Editing Base Object Properties


To edit the properties of an existing base object:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, select the base object that you want to modify.
The Schema Manager displays the Basic tab of the Base Object Properties
page.

4. For each property that you want to edit on the Basic tab, click the Edit
button next to it, and specify the new value. For more information, see
"Basic Base Object Properties" on page 89.
5. If you want, check (select) the Enable History check box to have
Informatica MDM Hub keep a log of records that are inserted, updated, or
deleted. You can use a history table for audit purposes.
6. To modify other base object properties, click the Advanced tab.

7. Specify the advanced properties for this base object. For more
information, see "Advanced Base Object Properties" on page 90.
8. In the left pane, click Match/Merge Setup beneath the base object’s name.

9. Specify the match / merge object properties. At a minimum, consider
configuring the following properties:
• maximum number of matches for manual consolidation (see
"Maximum Matches for Manual Consolidation" on page 368)
• number of rows per match job batch cycle (see "Number of Rows per
Match Job Batch Cycle" on page 368)

To edit a property, click the Edit button next to it and enter a new value.

10. Click the Save button to save your changes.


For more information about setting the properties for matching and
merging, see "Configuring Match Properties for a Base Object" on page
366.

Configuring Custom Indexes for Base Objects


This section describes how to configure custom indexes for a base object.

About Custom Indexes

When you configure columns for a base object, system indexes are created
automatically for primary keys and unique columns. In addition, Informatica
MDM Hub automatically drops and creates system indexes as needed when
executing batch jobs or stored procedures.

A custom index is an optional, supplemental index for a base object that
you can define and have Informatica MDM Hub maintain automatically. Custom
indexes are non-unique.

You might want to add a custom index to a base object for performance
reasons. For example, suppose an external application calls the SIF
SearchQuery request to search a base object by last name. If the base
object has a custom index on the last name column, the last name search is
processed more quickly. Custom indexes that are registered in Informatica
MDM Hub are automatically dropped and recreated during batch execution to
improve performance.

You can also manually define indexes outside the Hub Console using a
database utility for your database platform. For example, you could create
a function-based index—such as Upper(Last_Name) in the index expression—in
support of some specialized operation. However, if you add a user-defined
index that is not supported by the Schema Manager, then the custom index is
not registered with Informatica MDM Hub, and you are responsible for
maintaining that index—Informatica MDM Hub will not maintain it for you. If
you do not properly maintain the index, you risk affecting batch processing
performance.
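
For example, the function-based index mentioned above could be created with
a database utility roughly as follows. This is a sketch only: the table,
column, and index names are hypothetical, and because the index is created
outside the Schema Manager, it is not registered with Informatica MDM Hub
and you must maintain it yourself.

    -- Hypothetical function-based index supporting case-insensitive
    -- last-name searches. Not registered with Informatica MDM Hub.
    CREATE INDEX ni_cust_lname_upper
    ON c_customer (UPPER(last_name));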

Navigating to the Custom Index Setup Node


1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, expand the tree beneath the base object you want to
work with.
4. Click the Custom Index Setup node.
The Schema Manager displays the Custom Index Setup page.

Creating a Custom Index

To add a new custom index:

1. In the Schema Manager, navigate to the Custom Index Setup node for the
base object that you want to work with, as described in "Navigating to the
Custom Index Setup Node" on page 98.

2. Click the Add button.


The Schema Manager creates a new custom index (named
NI_C_BaseObjectName_inc, where inc is an incremented number) and displays
the list of columns in the base object.

3. Select the column(s) that you want in the custom index.

4. Click the Save button to save your changes.

If an index already exists for the selected column(s), the Schema Manager
displays an error message and does not create the index.

Click OK to close the dialog box.

Editing a Custom Index

To change a custom index, you must delete the existing custom index and add
a new custom index with the columns that you want.

Deleting a Custom Index

To delete a custom index:


1. In the Schema Manager, navigate to the Custom Index Setup node for the
base object that you want to work with, as described in "Navigating to the
Custom Index Setup Node" on page 98.
2. In the Indexes list, select the custom index that you want to delete.

3. Click the Delete button.


The Schema Manager prompts you to confirm deletion.
4. Click Yes.

Viewing the Impact Analysis of a Base Object


The Schema Manager allows you to view all of the tables, packages, and
queries associated with a base object. You would typically do this before
deleting a base object to ensure that you do not delete other associated
objects by mistake.

To view the impact analysis for a base object:


1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, select the base object that you want to view.
4. Right-click the mouse and choose Impact Analysis.
The Schema Manager displays the Table Impact Analysis dialog box.

5. Click Close.

Deleting Base Objects


To delete a base object:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, select the base object that you want to delete.
4. Right-click the mouse and choose Remove.
The Schema Manager prompts you to confirm deletion.
5. Choose Yes.
The Schema Manager asks you whether you want to view the impact
analysis before deleting the base object.
6. Choose No if you want to delete the base object without viewing the
impact analysis.
The Schema Manager removes the deleted base object from the schema
tree.

Configuring Columns in Tables
After you have created a table (base object or landing table), you use the
Schema Manager to define the columns for that table according to the
"Requirements for Defining Schema Objects" on page 77. You must use the
Schema Manager to define columns in tables—you cannot configure them
directly in the database.

Note: In the Schema Manager, you can also view the columns for cross-
reference tables and history tables, but you cannot edit them.

About Columns
This section provides general information about table columns.

Types of Columns in ORS Tables

Tables in the Hub Store contain two types of columns:
• System columns: Columns that Informatica MDM Hub automatically creates
and maintains. System columns contain metadata.
• User-defined columns: Any column in a table that is not a system column.
User-defined columns are added in the Schema Manager and usually contain
business data.

Warning: The system columns contain Informatica MDM Hub metadata. Do not
alter Informatica MDM Hub metadata in any way. Doing so will cause
Informatica MDM Hub to behave in unpredictable ways, and you can lose data.

For more information about system columns in Hub Store tables, see:
• "Base Object Columns" on page 84
• "Columns in Cross-reference Tables" on page 87
• "History Tables" on page 88
• "Building the Schema" on page 73
• "Landing Table Columns" on page 269
• "Staging Table Columns" on page 275

Data Types for Columns

Informatica MDM Hub uses a common set of data types for columns that map
directly to the following Oracle and DB2 data types.

Note: For information regarding the available data types, refer to the product
documentation for your database platform.
Informatica MDM Hub Data Type   Oracle Data Type   DB2 Data Type
CHAR                            CHAR               CHAR
VARCHAR                         VARCHAR2           VARCHAR
NVARCHAR2                       NVARCHAR2
NCHAR                           NCHAR
DATE                            DATE               DATE
NUMBER                          NUMBER             NUMERIC
INT                             INTEGER            INT or INTEGER

Column Properties

Informatica MDM Hub columns have the following properties.


Column Properties

Display Name
Name for this column as it will be displayed in the Hub Console.

Physical Name
Actual name of the column in the table. Informatica MDM Hub will suggest a
physical name for the column based on the display name that you enter.
Note: For physical names of columns, do not use:
• any reserved column names, as described in "Reserved Column Names" on
page 78
• the dollar sign ($) character

Nullable
Enable (check) this option if the column can be empty (null).
• If null values are allowed, you do not need to specify a default value.
• If null values are not allowed, then you must specify a default value.

Data Type
For character data types, you can specify the length. For certain numeric
data types, you can specify the precision and scale. For more information,
see "Data Types for Columns" on page 102.

Has Default
Enable (check) this option if this column has a default value.

Default
Used if no value is provided for the column but the column cannot be null.
Required for Unique columns.

Trust
Enable (check) this option if this column will contain values from more
than one source system, and you want to use trust to determine the most
reliable value. If you do not enable trust for the column, then the most
recent value will always be used. For more information, see "Enabling Trust
for a Column" on page 349 and "Configuring Trust for Source Systems" on
page 344.

Unique
Enable (check) this option to enforce unique column constraints on loads
from a staging table. Most organizations use the primary key from the
source system for the lookup value. A record with a duplicate value in this
column will be rejected.
Note: Unique columns must have a configured Default value.
Warning: Avoid enabling the Unique option on base objects that might be
consolidated. If you have a base object with a unique column and then load
the same key from different systems, the insert into this base object
fails. To use this feature, you must have unique keys across all systems.

Validate
Enable (check) this option if validation rule(s) will be configured for
this column. Validation rules are applied during the load process to
downgrade trust scores for cell values in this column. For more
information, see "Enabling Validation Rules for a Column" on page 354.

Apply Null Values
Determines the survivorship of null values for put operations and during
the consolidation process.
• By default, this option is disabled. Trust scores for cells containing
null values are automatically downgraded so that, during put operations or
consolidation, null values are unlikely to win over non-null values.
Instead, non-null values from the next available trusted source would
survive.
Note: If a column value has been updated to NULL from the Data Manager or
Merge Manager tool, then Null can win over a Not Null value.
• If enabled (checked), trust scores for cells containing null values are
calculated normally, and null values might overwrite non-null values during
put operations or consolidation. If you want to reduce trust on cells
containing null data, you must write validation rules to do so.

GBID
Enable (check) this option if you want to define this column as a Global
Business Identifier (GBID) for this object. Examples include a social
security number, a driver’s license number, and so on. Doing so eliminates
the need to custom-define identifiers. You can configure any number of GBID
columns for API access and batch loads. For more information, see "Global
Identifier (GBID) Columns" on page 104.
Note: To be configured as a GBID column, the column must be an INT data
type, or it must be exactly 255 characters in length for one of the
following data types: CHAR, NCHAR, VARCHAR, or NVARCHAR2.

Putable
Specifies whether SIF requests can put (insert or update) values into this
system column. Applies to any system column except ROWID_OBJECT and
CONSOLIDATION_IND.
Note: All user-defined columns are putable.
• If selected (enabled), then SIF requests can put (insert or update)
values into this system column.
• If not selected (the default), then SIF requests cannot insert or update
values into this system column.

Global Identifier (GBID) Columns

A Global Business Identifier (GBID) column contains common identifiers
(key values) that allow you to uniquely and globally identify a record
based on your business needs. Examples include:
• Identifiers defined by applications external to Informatica MDM Hub, such
as ERP (SAP or Siebel customer numbers) or CRM systems.
• Identifiers defined by external organizations, such as industry-specific
codes (AMA numbers, DEA numbers, and so on), or government-issued
identifiers (social security number, tax ID number, driver’s license
number, and so on).

Note: To be configured as a GBID column, the column must be an integer,
CHAR, VARCHAR, NCHAR, or NVARCHAR column type. A non-integer column must be
exactly 255 characters in length.

In the Schema Manager, you can define multiple GBID columns in a base
object. For example, an employee table might have columns for social
security number and driver’s license number, or a vendor table might have a
tax ID number.

A Master Identifier (MID) is a common identifier that is generated by a
system of reference or system of record and that is used by others (for
example, CIF, legacy hubs, CDI/MDM Hub, counterparty hub, and so on). In
Informatica MDM Hub, the MID is the ROWID_OBJECT, which uniquely identifies
individual records from various source systems.

GBIDs do not replace the ROWID_OBJECT. GBIDs provide additional ways to
help you integrate your Informatica MDM Hub implementation with external
systems, allowing you to query and access data through unique identifiers
of your own choosing (using SIF requests, as described in the Informatica
MDM Hub Services Integration Framework Guide). In addition, by configuring
GBID columns using already-defined identifiers, you can avoid the need to
custom-define identifiers.

GBIDs help with the traceability of your data. Traceability is keeping track of
the data so that you can determine its lineage—which systems, and which
records from those systems, contributed to consolidated records. When you
define GBID columns in a base object, the Schema Manager creates a
separate table for this base object (the table name ends with _HUID) that
tracks the old and new values (current/obsolete value pairs).

For example, suppose two of your customers (both of which had different tax
ID numbers) merged into a single company, and one tax ID number survived
while the other one became obsolete. If you defined the tax ID number column
as a GBID, Informatica MDM Hub could help you track both the current and
historical tax ID numbers so that you could access data (via SIF requests)
using the historical value.

Note: Informatica MDM Hub does not perform any data verification or error
detection on GBID columns. If the source system has duplicate GBID values,
then those duplicate values will be passed into Informatica MDM Hub.
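
For example, once a tax ID column is defined as a GBID, a record can be
located by that identifier. The following read-only sketch is illustrative
only (production access would typically go through SIF requests), and the
C_CUSTOMER table and TAX_ID column are hypothetical:

    -- Look up the master record for a given GBID value.
    SELECT rowid_object, consolidation_ind
    FROM   c_customer
    WHERE  tax_id = :tax_id;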

Columns in Staging Tables

The columns for staging tables cannot be defined using the column editor.
Staging table columns are a special case, as they are based on some or all
columns in the staging table’s target object. You use the Add/Edit Staging
Table window to select the columns on the target table that can be populated
by the staging table. Informatica MDM Hub then creates each staging table
column with the same data types as the corresponding column in the target
table. See "Configuring Staging Tables" on page 275 for more information on
choosing the columns for staging tables.

Maximum Number of Columns for Base Objects

A base object cannot have more than 200 user-defined columns if it will have
match rules that are configured for automatic consolidation. For more
information, see "Flagging Matched Records for Automatic or Manual
Consolidation" on page 254 and "Specifying Consolidation Options for Matched
Records" on page 408.

Navigating to the Column Editor


To configure columns for base objects and landing tables:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Expand the schema tree for the object to which you want to add columns.
4. Select Columns.
The Schema Manager displays column definitions in the Properties pane.

Note: The Properties pane shows ANSI SQL data types, which Oracle converts
to its own data types. For more information, see "Data Types for Columns"
on page 102.

The Column Editor displays a “locked” icon next to system columns.

Command Buttons in the Column Editor

The Properties pane in the Column Editor contains the following command
buttons:
• Add: Adds new columns. For more information, see "Adding Columns" on
page 108.
• Delete: Removes existing columns. For more information, see "Deleting
Columns" on page 112.
• Move Up: Moves the selected column up in the display order. For more
information, see "Changing the Column Display Order" on page 112.
• Move Down: Moves the selected column down in the display order. For more
information, see "Changing the Column Display Order" on page 112.
• Import: Adds new columns by importing column definitions from another
table. For more information, see "Importing Column Definitions From Another
Table" on page 109.
• Expand View: Expands the table columns view. For more information, see
"Expanding the Table Columns View" on page 107.
• Restore View: Restores the table columns view. For more information, see
"Expanding the Table Columns View" on page 107.
• Save: Saves changes to the column definitions.

Showing or Hiding System Columns

You can toggle the Show System Columns check box to show or hide system
columns. For more information, see "Types of Columns in ORS Tables" on
page 102.

Expanding the Table Columns View

You can expand the properties pane to display all the column properties in a
single pane. By default, the Schema Manager displays column definitions in a
contracted view.

To show the expanded table columns view:

• Click the Expand View button.

The Schema Manager displays the expanded table columns view.

To show the default table columns view:

• Click the Restore View button.

The Schema Manager displays the default table columns view.

Adding Columns
To add a column:
1. Navigate to the column editor for the table that you want to configure. For
more information, see "Navigating to the Column Editor" on page 106.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. Click the Add button.


The Schema Manager displays an empty row.

4. For each column, specify its properties. For more information, see
"Column Properties" on page 103.

5. Click the Save button to save the columns that you have added.

Importing Column Definitions From Another Table
To import some of the column definitions from another table:
1. Navigate to the column editor for the table that you want to configure. For
more information, see "Navigating to the Column Editor" on page 106.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. Click the Import Schema button.


The Import Schema dialog is displayed.

4. Specify the connection properties for the schema that you want to import.
If you need more information about the connection information to specify
here, contact your database administrator.
The settings for the User name / Password fields depend on whether proxy
users are configured for your Informatica MDM Hub implementation.
• If proxy users are not configured (the default), then the user name will
be the same as the schema name.
• If proxy users are configured, then you must specify the custom user
name / password so that Informatica MDM Hub can use those
credentials to access the schema.
For more information about proxy user support, see the Informatica MDM
Hub Installation Guide.
5. Click Next.
Note: The database you enter does not need to be the same as the
Informatica ORS that you’re currently working in, nor does it need to be
an Informatica ORS.

The only restriction is that you cannot import from a relational database
that is a different type from the one in which you are currently working.
For example, if your database is an Oracle database, then you can import
columns only from another Oracle database.
The Schema Manager displays a list of the tables that are available for
import.

6. Select the table that you want to import.


7. Click Next.
The Schema Manager displays a list of columns for the selected table.

8. Select the column(s) you want to import.


9. Click Finish.

10. Click the Save button to save the column(s) that you have added.

Editing Column Properties


Once columns have been added and saved, you can change certain column
properties. Before you make any changes, however, bear in mind that once a
table has been defined and saved, you cannot:

• reduce the length of a CHAR, VARCHAR, NCHAR, or NVARCHAR2 field
• change the scale or precision of a NUMBER field

Important: As with any schema changes that are attempted after the tables
have been populated with data, manage changes to columns in a planned and
controlled fashion, and ensure that the appropriate database backups are done
before making changes.

To change column properties:


1. Navigate to the column editor for the table that you want to configure. For
more information, see "Navigating to the Column Editor" on page 106
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. For each column, you can change the following properties. Be sure to read
about the implications of changing a property before you make the change.
For more information about each property, see "Column Properties" on
page 103.

Display Name
Name for this column as it will be displayed in the Hub Console.

Length
You can only increase the length of a CHAR, VARCHAR, NCHAR, or NVARCHAR2
field.

Default
Used if no value is provided for the column but the column cannot be null.

Trust
Note: You need to synchronize metadata if you enable trust. If you enable
trust for a column on a table that already contains data, you will be
warned that your trust settings have changed and that you need to run the
trust Synchronization batch job in the Batch Viewer tool before doing any
further loads to the table (see "Running Synchronize Batch Jobs After
Changes to Trust Settings" on page 352). Informatica MDM Hub will
automatically make sure that the Synchronization job is available in the
Batch Viewer tool. For more information, see "Using Batch Jobs" on page 496.
Warning: You must execute the synchronization process before you run any
more Load jobs. Otherwise, the trusted values used to populate the column
will be incorrect.
Warning: Be very careful about disabling (unchecking) trust for columns
that already contain data. Disabling trust results in the removal of
columns from some of the underlying metadata tables and the resultant loss
of data. If you inadvertently disable trust and save that change, correct
your error by enabling trust again and immediately running the
Synchronization job to recreate the metadata.

Unique
Enabling the Unique indicator will fail if the column already contains
duplicate values. As noted before, it is recommended that you avoid using
the Unique option, particularly on base objects that might be merged.

Validate
Warning: Disabling validation results in the loss of metadata for the
associated column. Approach this with caution and only when you are certain
of the change.

Putable
Enable this property for system columns into which you want to put data
(insert or update) using SIF requests. Applies to any system column except
ROWID_OBJECT and CONSOLIDATION_IND.

4. Click the Save button to save your changes.

Changing the Column Display Order
You can move columns up or down in the display order. Changing the display
order does not affect the physical table in the database.

To change the column display order:


1. Navigate to the column editor for the table that you want to configure. For
more information, see "Navigating to the Column Editor" on page 106
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the column that you want to move.
4. Do one of the following:
• Click the Move Up button to move the selected column up in the display
order.
• Click the Move Down button to move the selected column down in the
display order.

5. Click the Save button to save your changes.

Deleting Columns
Removing columns should be approached with extreme caution. Any data that
has already been loaded into a column will be lost when the column is
removed. It can also be a slow process due to the number of underlying tables
that could be affected. You must save the changes immediately after
removing the existing columns.

To delete a column from base objects and landing tables:


1. Navigate to the column editor for the table that you want to configure. For
more information, see "Navigating to the Column Editor" on page 106
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Scroll the column definitions in the Properties pane and select a column
that you want to delete.

4. Click the Delete button.


The Schema Manager prompts you to confirm deletion.
5. Click Yes.
The Schema Manager removes the deleted column definition from the list.

6. Click the Save button to save your changes.

Configuring Foreign-Key Relationships
Between Base Objects
This section describes how to configure foreign key relationships between
base objects in your Informatica MDM Hub implementation. For a general
overview of foreign key relationships, see "Process Overview for Defining
Foreign-Key Relationships" on page 114. For more information about parent-
child relationships, see "Configuring Match Paths for Related Records" on page
373.

About Foreign Key Relationships


In Informatica MDM Hub, a foreign key relationship establishes an association
between two base objects via matching columns. In a foreign-key
relationship, one base object (the child) contains a foreign key column, which
contains values that match values in the primary key column of another base
object (the parent).

Types of Foreign Key Relationships in ORS Tables

There are two types of foreign-key relationships in Hub Store tables:
• System foreign key relationships: Automatically defined and enforced by
Informatica MDM Hub to protect the referential integrity of your schema.
• User-defined foreign key relationships: Custom foreign key relationships
that are manually defined according to the instructions later in this
section.

Parent and Child Base Objects


The following diagram shows a foreign key relationship between parent and
child base objects. The foreign key column in the child base object points to
the ROWID_OBJECT column in the parent base object.
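
For example, with a hypothetical C_ADDRESS child whose CONSUMER_CODE_FK
column holds the ROWID_OBJECT of its C_CONSUMER parent, the relationship
can be illustrated with a read-only join (all names are assumptions for
this sketch):

    -- Join child records to their parent via the foreign key column.
    SELECT a.rowid_object AS address_id,
           c.rowid_object AS consumer_id
    FROM   c_address  a
    JOIN   c_consumer c ON c.rowid_object = a.consumer_code_fk;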

Process Overview for Defining Foreign-Key
Relationships
To create a foreign-key relationship:
1. Create the parent table. For more information, see "Creating Base
Objects" on page 95.
2. Create the child table. For more information, see "Deleting Base Objects"
on page 101.
3. Define the foreign key relationship between them according to the
instructions in "Adding Foreign-Key Relationships" on page 115.

If the child table contains generated keys from the parent table, the load
process copies the appropriate primary key value from the parent table into
the child table.

Adding Foreign-Key Relationships
To add a foreign-key relationship between two base objects:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, expand a base object (the base object that will be the
child in the relationship).
4. Right-click Relationships.
The Schema Manager displays the Properties tab of the Relationships
page.

5. Click the Add button.


The Schema Manager displays the Add Relationship dialog.

6. Define the new relationship by selecting:


• a column in the Relate from tree, and
• a column in the Relate to tree
7. If you want, check (select) the Create associated index check box to
create an index on this foreign key relationship. Metadata indicating that
the index exists is defined in the ORS.

8. Click OK.
9. Click the Diagram tab to view the foreign-key relationship diagram.

10. Click the Save button to save your changes.

Note: After you have created a relationship, if you go back and try to
create another relationship, the column is no longer displayed because it
is in use. When you delete the relationship, the column is displayed again.

Editing Foreign-Key Relationships


You can change only the Lookup Display Name in a foreign key relationship.
To change any other properties, you need to delete the relationship, add it
again, and specify the properties that you want.

To edit the lookup display name for a foreign-key relationship between two
base objects:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, expand a base object and right-click Relationships.
The Schema Manager displays the Properties tab of the Relationships
page.

4. On the Properties tab, click the foreign-key relationship whose properties
you want to view.
The Schema Manager displays the relationship details.

5. Click the Edit button next to the Lookup Display Name and specify the
new value.
6. If you want, select the Has Associated Index check box to add an index
on this foreign key relationship, or clear it to remove an existing index.

7. Click the Save button to save your changes.

Configuring Lookups for Foreign-Key Relationships


After you have created a foreign key relationship, you can configure a lookup
for the column. A lookup causes Informatica MDM Hub to retrieve a data value
from a parent table during the load process. For example, if an Address
staging table includes a CONSUMER_CODE_FK column, you could have
Informatica MDM Hub perform a lookup to the ROWID_OBJECT column in the
Consumer base object and retrieve the ROWID_OBJECT value of the
associated parent record in the Consumer table. For more information, see
"Configuring Lookups For Foreign Key Columns" on page 283.

Deleting Foreign-Key Relationships


You can delete any user-defined foreign-key relationship that has been added
according to the instructions in "Adding Foreign-Key Relationships" on page
115. You cannot delete the system foreign key relationships that Informatica
MDM Hub automatically defines and enforces to protect the referential
integrity of your schema.

To delete a foreign-key relationship between two base objects:


1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, expand a base object and right-click Relationships.
4. On the Properties tab, click the foreign-key relationship that you want to
delete.

5. Click the Delete button.


The Schema Manager prompts you to confirm deletion.
6. Click Yes.
The Schema Manager deletes the foreign key relationship.

7. Click the Save button to save your changes.

Viewing Your Schema
You can use the Schema Viewer tool in the Hub Console to visualize the
schema in an ORS. The Schema Viewer is particularly helpful for visualizing a
complex schema.

Starting the Schema Viewer


Note: The Schema Viewer can also be launched from within the Metadata
Manager, as described in the Informatica MDM Hub Metadata Manager Guide.
Once started, however, the instructions for using the Schema Viewer are the
same, regardless of where it was launched from.

To start the Schema Viewer tool:


• In the Hub Console, expand the Model workbench, and then click Schema
Viewer.
The Hub Console starts the Schema Viewer and loads the data model,
showing a progress dialog.

The Hub Console displays the Schema Viewer tool.

Panes in the Schema Viewer

The Schema Viewer is divided into two panes.


Pane           Description
Diagram pane   Shows a detailed diagram of your schema.
Overview pane  Shows an abstract overview of your schema. The gray box
               highlights the portion of the overall schema diagram that is
               currently displayed in the diagram pane. Drag the gray box to
               move the display area over a particular portion of your schema.

Command Buttons in the Schema Viewer

The Diagram Pane in the Schema Viewer contains the following command
buttons:
Button    Description
Zoom In   Zooms in and magnifies a smaller area of the schema diagram, as
          described in "Zooming In" on page 120.
Zoom Out  Zooms out and displays a larger area of the schema diagram, as
          described in "Zooming Out" on page 121.
Zoom All  Zooms out to display the entire schema diagram, as described in
          "Zooming All" on page 121.
Layout    Toggles between a hierarchic and orthogonal view, as described in
          "Switching Views of the Schema Diagram" on page 121.
Options   Shows or hides column names and controls the orientation of the
          hierarchic view, as described in "Configuring Schema Viewer
          Options" on page 123.
Save      Saves the schema diagram as a JPG file, as described in "Saving
          the Schema Diagram as a JPG Image" on page 124.
Print     Prints the schema diagram, as described in "Printing the Schema
          Diagram" on page 125.

Zooming In and Out of the Schema Diagram


You can zoom in and out of the schema diagram.

Zooming In

To zoom into a portion of the schema diagram:


• Click the Zoom In button.

The Schema Viewer magnifies a portion of the screen.

Note that the gray highlight box in the Overview Pane has grown smaller to
indicate the portion of the schema that is displayed in the diagram pane.

Zooming Out

To zoom out of the schema diagram:


• Click the Zoom Out button.

The Schema Viewer zooms out of the schema diagram.

Note that the gray box in the Overview Pane has grown larger to indicate a
larger viewing area.

Zooming All

To zoom all of the schema diagram, which means that the entire schema
diagram is displayed in the Diagram Pane:
• Click the Zoom All button.

The Schema Viewer zooms out to display the entire schema diagram.

Switching Views of the Schema Diagram


The Schema Viewer displays the schema diagram in two different views:
hierarchic and orthogonal.

Hierarchic View

The following figure shows an example of the hierarchic view (the default).

Orthogonal View

The following figure shows the same schema in the orthogonal view.

Toggling Views

To switch between the hierarchic and orthogonal views:


• Click the Layout button.

The Schema Viewer displays the other view.

Navigating to Related Design Objects and Batch Jobs
Right-clicking on an object in the Schema Viewer displays a context menu.

The context menu displays the following commands.


Command              Description
Go to BaseObject     Launches the Schema Manager and displays this base
                     object with an expanded base object node.
Go to Staging Table  Launches the Schema Manager and displays the selected
                     staging table under the associated base object.
Go to Mapping        Launches the Mappings tool and displays the properties
                     for the selected mapping.
Go to Job            Launches the Batch Viewer and displays the properties
                     for the selected batch job.
Go to Batch Groups   Launches the Batch Group tool.

Configuring Schema Viewer Options


To configure Schema Viewer options:
1. Click the Options button.
The Schema Viewer displays the Options dialog.

2. Specify the options you want.

Option       Description
Show column  Controls whether column names appear in the entity boxes.
names        • Check (select) this option to display column names in the
               entity boxes.
             • Uncheck (clear) this option to hide column names and
               display only entity names in the entity boxes.
Orientation  Controls the orientation of the schema hierarchy. One of the
             following values:
             • Top to Bottom (default)—Hierarchy goes from top to
               bottom, with the highest-level node at the top.
             • Bottom to Top—Hierarchy goes from bottom to top, with
               the highest-level node at the bottom.
             • Left to Right—Hierarchy goes from left to right, with the
               highest-level node at the left.
             • Right to Left—Hierarchy goes from right to left, with the
               highest-level node at the right.

3. Click OK.

Saving the Schema Diagram as a JPG Image


To save the schema diagram as a JPG image:
1. Click the Save button.
The Schema Viewer displays the Save dialog.

2. Navigate to the location on the file system where you want to save the JPG
file.
3. Specify a descriptive name for the JPG file.
4. Click Save.
The Schema Viewer saves the file.

Printing the Schema Diagram


To print the schema diagram:
1. Click the Print button.
The Schema Viewer displays the Print dialog.

2. Select the print options that you want.

Option            Description
Print Area        Scope of what to print:
                  • Print All—Print the entire schema diagram.
                  • Print viewable—Print only the portion of the schema
                    diagram that is currently visible in the Diagram Pane.
Page Settings     Page output options, such as media, orientation, and
                  margins.
Printer Settings  Printer options based on available printers in your
                  environment.
3. Click Print.
The Schema Viewer sends the schema diagram to the printer.

Chapter 6: Configuring Queries and Packages

This chapter describes how to configure Informatica MDM Hub to provide
queries and packages that data stewards and applications can use to access
data in the Hub Store.

Chapter Contents
• "Before You Begin" on page 127
• "About Queries and Packages" on page 127
• "Configuring Queries" on page 127
• "Configuring Packages" on page 151

Before You Begin


Before you begin to define queries and packages, you must have:
• installed Informatica MDM Hub and created the Hub Store according to the
instructions in the Informatica MDM Hub Installation Guide
• built the schema according to the instructions "Building the Schema" on
page 73

About Queries and Packages


In Informatica MDM Hub, a query is a request to retrieve data from the Hub
Store. A package is a public view of one or more underlying tables in
Informatica MDM Hub. A package is based on a query, which can select
records from a table or from another package. Queries and packages go
together: queries define the criteria for selecting data, and packages are
the views through which users operate on that data. A query can be used in
multiple packages. For more information, see:
• "Configuring Queries" on page 127
• "Configuring Packages" on page 151

Configuring Queries
This section describes how to create and modify queries using the Queries tool
in the Hub Console. The Queries tool allows you to create simple, advanced,
and custom queries.

About Queries
In Informatica MDM Hub, a query is a request to retrieve data from the Hub
Store. Just like any SQL-based query statement, Informatica MDM Hub
queries allow you to specify, via the Hub Console, the criteria used to retrieve
that data—tables and columns to include, conditions for filtering records, and
sorting and grouping the results. Queries that you save in the Queries tool can
be used in packages, and data stewards can use them in the Data Manager and
Merge Manager tools.

Query Capabilities

You can define a query to:


• return selected columns
• filter the result set with a WHERE clause
• use complex query syntax, such as GROUP BY, ORDER BY, and HAVING
clauses
• use aggregate functions, such as SUM, COUNT, and AVG
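
For example, a single query combining these capabilities might generate SQL
along the following lines (the table and column names are illustrative only):

SELECT LAST_NAME, COUNT(*) AS NAME_COUNT  -- selected columns and an aggregate
  FROM C_CONSUMER
 WHERE LAST_NAME IS NOT NULL              -- WHERE clause filter
 GROUP BY LAST_NAME                       -- GROUP BY clause
HAVING COUNT(*) > 1                       -- HAVING clause
 ORDER BY LAST_NAME;                      -- ORDER BY clause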

Types of Queries

You can create the following types of queries:


Type          Description
query         Created by selecting tables and columns, and configuring
              query conditions, sort by, and group by options, according to
              the instructions in "Configuring Queries" on page 130.
custom query  Created by specifying a SQL statement according to the
              instructions in "Configuring Custom Queries" on page 147.

How Schema Changes Affect Queries

Queries are dependent on the base object columns from which they retrieve
data. If changes are made to the column configuration in the base object
associated with a query, then the queries—including custom queries—are
updated automatically. For example, if a column is renamed, then the name is
updated in any dependent queries. If a column is deleted in the base object,
then the consequences depend on the type of query:
• For a custom query, the query becomes invalid and must be manually
fixed in the Queries tool or the Packages tool. Otherwise, if executed, an
invalid query will return an error.
• For all other queries, the column is removed from the query, as well as
from any packages that depend on the query.

Starting the Queries Tool
To start the Queries tool:
• Expand the Model workbench and then click Queries.
The Hub Console displays the Queries tool.

The Queries tool is divided into two panes:


Pane             Description
navigation pane  Displays a hierarchical list of configured queries and
                 query groups.
properties pane  Displays the properties of the selected query or query
                 group.

Configuring Query Groups


This section describes how to configure query groups.

About Query Groups

A query group is a logical group of queries. It is simply a mechanism for
organizing queries in the Queries tool.

Adding Query Groups

To add a query group:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Right-click in the navigation pane and choose New Query Group.
The Queries tool displays the Add Query Group window.

4. Enter a descriptive name for this query group.
5. Enter a description for this query group.
6. Click OK.
The Queries tool adds the new query group to the tree.

Editing Query Group Properties

To edit query group properties:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the navigation pane, select the query group that you want to configure.

4. For each property that you want to edit, click the Edit button next to it,
and specify the new value.

5. Click the Save button to save your changes.

Deleting Query Groups

You can delete an empty query group but not a query group that contains
queries.

To delete a query group:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the navigation pane, right-click the empty query group that you want to
delete, and choose Delete Query Group.
The Queries tool prompts you to confirm deletion.
4. Click Yes.

Configuring Queries
This section describes how to configure queries.

Adding Queries

To add a query:

1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the query group to which you want to add the query.
4. Right-click in the Queries pane and choose New Query.
The Queries tool displays the New Query Wizard.
5. If you see a Welcome screen, click Next.

6. Specify the following query properties:

Property              Description
Query name            Descriptive name for this query.
Description           Optional description of this query.
Query Group           Select the query group to which this query belongs.
Select primary table  Primary table from which this query retrieves data.
7. Do one of the following:
• If you want the query to retrieve all columns and all records from the
primary table, click Finish to complete the process of creating the
query.
• If you want to specify selection criteria, click Next and continue.
The Queries tool displays the Select query columns window.

8. Select the query columns from which you want the query to retrieve data.
Note: PUT-enabled packages require the Rowid Object column in the
query.
9. Click Finish.
The Queries tool adds the new query to the tree.
10. Refine the query criteria by proceeding to the instructions in "Editing Query
Properties" on page 132.

Editing Query Properties

Once you have created a query, you can modify its properties to refine the
criteria it uses to retrieve data from the ORS.

To modify the query properties:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the navigation tree, select the query that you want to modify.
The current query properties are displayed in the properties pane.

The properties pane displays the following set of tabs:

Tab         Description
Tables      Tables associated with this query. Corresponds to the SQL FROM
            clause. For more information, see "Configuring the Table(s) in
            a Query" on page 134.
Select      Columns associated with this query. Corresponds to the SQL
            SELECT clause. For more information, see "Configuring the
            Column(s) in a Query" on page 136.
Conditions  Conditions associated with this query. Determines selection
            criteria for individual records. Corresponds to the SQL WHERE
            clause. For more information, see "Configuring Conditions for
            Selecting Records of Data" on page 139.
Sort        Sort order for the results of this query. Corresponds to the
            SQL ORDER BY clause. For more information, see "Specifying the
            Sort Order for Query Results" on page 142.
Grouping    Grouping for the results of this query. Corresponds to the SQL
            GROUP BY clause. For more information, see "Specifying the
            Grouping for Query Results" on page 144.
SQL         Displays the SQL associated with the selected query settings.
            For more information, see "Viewing the SQL for a Query" on
            page 147.
4. Make the changes you want.

5. Click the Save button.


The Queries tool validates your query settings and prompts you if it finds
errors.

Configuring the Table(s) in a Query

The Tables tab displays the table(s) from which the query will retrieve
information. The information in this tab corresponds to the SQL FROM clause.

Adding a Table to a Query

To add a table to a query:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Tables tab.

4. Click the Add button.


The Queries tool prompts you to select the table you want to add.

5. Select a table and then click OK.


If one or more other tables exist on the Tables tab, the Queries tool might
prompt you to select a foreign key relationship between the table you just
added and another table.

6. If prompted, select a foreign key relationship (if you want), and then click
OK.
The Queries tool displays the added table in the Tables tab.

For multiple tables, the Queries tool displays all added tables in the Tables
tab.

If you specified a foreign key between tables, the corresponding key
columns are linked. Also, if tables are linked by foreign key relationships,
then the Queries tool allows you to select the type of join for this query.

7. Click the Save button.

Deleting a Table from a Query

A query must have multiple tables in order for you to remove a table. You
cannot remove the last table in a query.

To remove a table from a query:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Tables tab.
4. Select the table that you want to delete.

5. Click the Delete button.


The Queries tool removes the selected table from the query.

6. Click the Save button.

Configuring the Column(s) in a Query

The Select tab displays the list of column(s) in one or more source tables from
which the query will retrieve information. The information in this tab
corresponds to the SQL SELECT clause.

Adding Table Column(s) to a Query

To add a table column to a query:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Select tab.

4. Click the Add button.


The Queries tool prompts you to select from a list of one or more tables.

5. Expand the list for the table containing the column that you want to add.

The Queries tool displays the list of columns for the selected table.

6. Select the column(s) you want to include in the query.


7. Click OK.
The Queries tool adds the selected column(s) to the list of columns on the
Select tab.

8. Click the Save button.

Removing Table Column(s) from a Query

To remove a table column from the query:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Select tab.
4. Select one or more column(s) that you want to remove.

5. Click the Delete button.


The Queries tool removes the selected column(s) from the query.

6. Click the Save button.

Changing the Column Order

To change the order in which the columns will appear in the result set (if the
list contains multiple columns):
1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Select tab.
4. Select one column that you want to move.
5. Do one of the following:

• To move the selected column up the list, click the up arrow button.
• To move the selected column down the list, click the down arrow button.


The Queries tool moves the selected column up or down.

6. Click the Save button.

Adding Functions

You can add aggregate functions to your queries (such as COUNT, MIN, or
MAX). At run time, these aggregate functions appear in the usual syntax for
the SQL statement used to execute the query—such as:
select col1, count(col2) as c1 from table_name group by col1

To add a function to a table column:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Select tab.

4. Click the Add Function button.


The Queries tool prompts you to select the function you want to add.

5. If you want, select a different column.


6. Select the function that you want to use on the selected column.

7. Click OK.

8. Click the Save button.

Adding Constants

To add a constant to a table column:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Select tab.

4. Click the Add Constant button.


The Queries tool prompts you to specify the constant that you want to add.

5. Select the data type from the list.


6. Enter a value that is compatible with the selected data type.
7. Click OK.

8. Click the Save button.

Configuring Conditions for Selecting Records of Data

The Conditions tab displays a list of condition(s) that the query will use to
select records from the table. A comparison is a query condition that involves
one column, one operator, and either another column or a constant value. The
information in this tab corresponds to the SQL WHERE clause.

Operators

For an operator, you can select one of the following values.

Operator     Description
=            Equals.
<>           Does not equal.
IS NULL      Value in the comparison column is null.
IS NOT NULL  Value in the comparison column is not null.
LIKE         Value in the comparison column must be like the search value
             (includes column values that match the search value). For
             example, if the search value is %JO% for the last_name column,
             then the parameter will match column values like “Johnson”,
             “Vallejo”, “Major”, and so on.
NOT LIKE     Value in the comparison column must not be like the search
             value (excludes column values that match the search value).
             For example, if the search value is %JO% for the last_name
             column, then the parameter will omit column values like
             “Johnson”, “Vallejo”, “Major”, and so on.
<            Less than.
<=           Less than or equal to.
>            Greater than.
>=           Greater than or equal to.
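
For example, using the %JO% search value from the table above, these
operators translate into WHERE clauses such as the following (the column
names are illustrative only):

WHERE LAST_NAME LIKE '%JO%'      -- includes values such as Johnson, Vallejo
WHERE LAST_NAME NOT LIKE '%JO%'  -- omits values such as Johnson, Vallejo
WHERE MIDDLE_NAME IS NULL        -- includes only records with no middle name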

Adding a Comparison

To add a comparison to this query:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Conditions tab.

4. Click the Add button.


The Queries tool prompts you to add a comparison.

5. If you want, select a different column.


6. Select the operator that you want to use on the selected column.
7. Select the type of comparison (Constant or Column).
• If you select Column, then select a column from the Edit Column drop-
down list.

• If you selected Constant, then click the Edit button, specify the
constant that you want to add, and then click OK.

8. Click OK.
The Queries tool adds the comparison to the list on the Conditions tab.

9. Click the Save button.

Editing a Comparison

To edit a comparison in this query:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Conditions tab.
4. Select the comparison that you want to edit.

5. Click the Edit button.


The Queries tool prompts you to edit the comparison.

6. Change the settings you want according to the instructions in "Adding a


Comparison" on page 140.
7. Click OK.
The Queries tool updates the comparison in the list on the Conditions tab.

8. Click the Save button.

Removing a Comparison

To remove a comparison from this query:

1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Conditions tab.
4. Select the comparison that you want to remove.

5. Click the Delete button.


The Queries tool removes the selected comparison from the query.

6. Click the Save button.

Specifying the Sort Order for Query Results

The Sort By tab displays a list of column(s) containing the values that the
query will use to sort the query results at run time. The information in this tab
corresponds to the SQL ORDER BY clause.

Selecting the Sort Columns

To select the sort columns in this query:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Sort tab.

4. Click the Add button.


The Queries tool prompts you to select sort columns.

5. Expand the list for the table containing the column(s) that you want to
select for sorting.
The Queries tool displays the list of columns for the selected table.

6. Select the column(s) you want to use for sorting.


7. Click OK.
The Queries tool adds the selected column(s) to the list of columns on the
Sort By tab.
8. Do one of the following:
• Enable (check) the Ascending check box to sort records in ascending
order for the specified column.
• Disable (uncheck) the Ascending check box to sort records in
descending order for the specified column.

9. Click the Save button.

Removing Table Column(s) from a Sort Order

To remove a table column from the sort by list:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Sort tab.
4. Select one or more column(s) that you want to remove.

5. Click the Delete button.


The Queries tool removes the selected column(s) from the sort by list.

6. Click the Save button.

Changing the Column Order

To change the order in which the columns will appear in the result set (if the
list contains multiple columns):
1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Sort tab.
4. Select one column that you want to move.
5. Do one of the following:

• To move the selected column up the list, click the up arrow button.
• To move the selected column down the list, click the down arrow button.


The Queries tool moves the selected column up or down the list.

6. Click the Save button.

Specifying the Grouping for Query Results

The Grouping tab displays a list of column(s) containing the values that the
query will use for grouping the query results at run time. The information in
this tab corresponds to the SQL GROUP BY clause.

Selecting the Grouping Columns

To select the grouping columns in this query:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Grouping tab.

4. Click the Add button.


The Queries tool prompts you to select grouping columns.

5. Expand the list for the table containing the column(s) that you want to
select for grouping.
The Queries tool displays the list of columns for the selected table.

6. Select the column(s) you want to use for grouping.
7. Click OK.
The Queries tool adds the selected column(s) to the list of columns on the
Grouping tab.

8. Click the Save button.

Removing Table Column(s) from a Grouping Order

To remove a table column from the grouping list:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Grouping tab.
4. Select one or more column(s) that you want to remove.

5. Click the Delete button.


The Queries tool removes the selected column(s) from the grouping list.

6. Click the Save button.

Changing the Column Order

To change the order in which the columns will be grouped in the result set (if
the list contains multiple columns):
1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Grouping tab.
4. Select one column that you want to move.
5. Do one of the following:

• To move the selected column up the list, click the up arrow button.
• To move the selected column down the list, click the down arrow button.


The Queries tool moves the selected column up or down the list.

6. Click the Save button.

Viewing the SQL for a Query

The SQL tab displays the SQL statement that corresponds to the query options
you have specified for the selected query.

Configuring Custom Queries


This section describes how to configure custom queries in the Queries tool.

About Custom Queries

A custom query is simply a query for which you supply the SQL statement
directly, rather than building it according to the instructions in "Configuring
Queries" on page 130. Custom queries can be used in packages and in the data
steward tools.
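
For example, a custom query might supply a hand-written statement such as
the following (the tables, columns, and join are illustrative only):

SELECT c.ROWID_OBJECT, c.LAST_NAME, a.ADDRESS_LINE_1
  FROM C_CONSUMER c
  JOIN C_ADDRESS a
    ON a.CONSUMER_CODE_FK = c.ROWID_OBJECT
 WHERE c.LAST_NAME LIKE 'JO%'
 ORDER BY c.LAST_NAME;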

Adding Custom Queries

To add a custom query:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the query group to which you want to add the query.
4. Right-click in the Queries pane and choose New Custom Query.
The Queries tool displays the New Custom Query Wizard.
5. If you see a Welcome screen, click Next.

6. Specify the following custom query properties:

Property     Description
Query name   Descriptive name for this query.
Description  Optional description of this query.
Query Group  Select the query group to which this query belongs.
7. Click Finish.
The Queries tool displays the newly-added custom query.

8. Click the Edit button next to the SQL field.
9. Enter the SQL query according to the syntax rules for your database
platform.

10. Click the Save button.


If an error occurs when the query is submitted to the database, then the
Queries tool displays the database error message.

Fix any errors and save your changes.

Editing a Custom Query

Once you have created a custom query, you can modify its properties to refine
the criteria it uses to retrieve data from the ORS.

To modify the custom query properties:


1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. In the navigation tree, select the custom query that you want to modify.
4. Edit the property settings that you want to change, clicking the Edit button

next to the field if applicable.

5. Click the Save button.


The Queries tool validates your query settings and prompts you if it finds
errors.

Deleting a Custom Query

You delete a custom query in the same way in which you delete a regular
query. For more information, see "Removing Queries" on page 151.

Viewing the Results of Your Query


To view the results of your query:
1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. In the navigation tree, expand the query for which you want to view the
results.
3. Click View.
The Queries tool displays the results of your query.

Viewing the Query Impact Analysis


The Queries tool allows you to view the packages based on a given query,
along with any tables and columns used by the query.

To view the impact analysis of a query:
1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Expand the query group associated with the query you want to select.
3. Right-click the query and choose Impact Analysis from the pop-up menu.
The Queries tool displays the Impact Analysis dialog.

4. If you want, expand the list next to a table to display the columns
associated with the query.
5. Click Close.

Removing Queries
If a query has one or more packages based on it, remove those packages first
before attempting to remove the query.

To remove a query:
1. In the Hub Console, start the Queries tool according to the instructions in
"Starting the Queries Tool" on page 129.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Expand the query group associated with the query you want to remove.
4. Select the query you want to remove.
5. Right click the query and choose Delete Query from the pop-up menu.
The Queries tool prompts you to confirm deletion.
6. Click Yes.
The Queries tool removes the query from the list.

Configuring Packages
This section describes how to create and modify PUT and display packages.
You use the Packages tool in the Hub Console to define packages.

About Packages
A package is a public view of one or more underlying tables in Informatica
MDM Hub. Packages represent subsets of the columns in those tables, along
with any other tables that are joined to the tables. A package is based on a
query. The underlying query can select a subset of records from the table or
from another package. For more information, see "Configuring Queries" on
page 127.
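
Conceptually, a display package behaves much like a database view defined
over its query. The sketch below is an analogy only; Informatica MDM Hub
creates and manages the actual package objects itself, and the names are
hypothetical.

-- Analogy only: a display package acts like a view over its query.
CREATE VIEW PKG_CONSUMER_DISPLAY AS
SELECT ROWID_OBJECT, FIRST_NAME, LAST_NAME
  FROM C_CONSUMER;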

What Packages Are Used For

Packages are used for:


• defining user views of the underlying data
• updating data via the Hub Console or applications that invoke Services
Integration Framework (SIF) requests. Some, but not all, SIF requests use
packages. For more information, see the Informatica MDM Hub Services
Integration Framework Guide.

How Packages Are Used

Packages are used in the following ways:


• The Informatica MDM Hub security model uses packages to control access
to data for third-party applications that access Informatica MDM Hub
functionality and resources using the Services Integration Framework
(SIF). To learn more, see "About Setting Up Security" on page 621 and the
Informatica MDM Hub Services Integration Framework Guide.
• The Merge Manager and Data Manager tools use packages to determine
the ways in which data stewards can view data. For more information, see
the Informatica MDM Hub Data Steward Guide.
• Hierarchy Manager uses packages. For more information, see
"Configuring Hierarchies" on page 169 and “Using the Hierarchy Manager”
in the Informatica MDM Hub Data Steward Guide.

Note: If an underlying package changes, the changes are not propagated
through all package layers. Workaround: do not create a package on another
package, or rebuild the top package when you change the underlying query.

Packages and SECURE Resources

Packages are configured as either SECURE or PRIVATE resources. For more
information, see "Securing Informatica MDM Hub Resources" on page 629.

When to Create a Package

You must create a package if you want your Informatica MDM Hub
implementation to:
• Merge and update records in the Hub Store using the Merge Manager and
Data Manager tools. For more information, see the Informatica MDM Hub
Data Steward Guide.
• Allow an external application user to access Informatica MDM Hub
functionality using Services Integration Framework (SIF) requests. For
more information, see the Informatica MDM Hub Services Integration
Framework Guide.

In most cases, you create one set of packages for the Merge Manager and
Data Manager tools, and a different set of packages for external application
users.

PUT-Enabled and Display Packages

There are two types of packages:


• PUT-enabled packages can be used to update data.
• Display packages cannot be used to update data.

You must use PUT-enabled packages when you:


• execute the SIF put request, which inserts or updates records
• use the Merge Manager and Data Manager tools

PUT-enabled packages:
• cannot include joins to other tables
• cannot be based on system tables or other packages
• cannot be based on queries that have constant columns, aggregate
functions, or group by settings

Note: In the Merge Manager Setup screen, a PUT-enabled package is referred
to as a merge package. The Merge Manager also allows you to choose a
display package.

Starting the Packages Tool


To start the Packages tool:
1. Select the Packages tool in the Model workbench.
The Packages tool is displayed.
2. Select a package in the list.
The Packages tool displays properties for the selected package.

The Packages tool is divided into two panes:

Pane Description
navigation pane Displays a hierarchical list of configured packages.
properties pane Displays the properties of the selected package.

Adding Packages
To add a new package:
1. In the Hub Console, start the Packages tool according to the instructions in
"Starting the Packages Tool" on page 153.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Right-click in the Packages pane and choose New Package.
The Packages tool displays the New Package Wizard.
Note: If the welcome screen is displayed, click Next.

4. Specify the following information.

Field          Description
Display Name   Name of this package as it will be displayed in the Hub
               Console.
Physical Name  Actual name of the package in the database. Informatica MDM
               Hub will suggest a physical name for the package based on
               the display name that you enter.
Description    Description of this package.
Enable PUT     Check (select) to create a PUT package, which can insert or
               update records in base object tables.
               Note: Every package that you use for merging data or
               updating data must be PUT-enabled.
               If you do not enable PUT, you create a display (read-only)
               package.
Secure         Check (enable) to make this package a secure resource, which
Resource       allows you to control access to this package. Once a package
               is designated as a secure resource, you can assign
               privileges to it in the Roles tool. For more information,
               see "Securing Informatica MDM Hub Resources" on page 629,
               and "Assigning Resource Privileges to Roles" on page 641.
5. Click Next.
The New Package Wizard displays the Select Query dialog.

6. If you want, click New Query Group to add a new query group, as
described in "Configuring Query Groups" on page 129.
7. If you want, click New Query to add a new query, as described in
"Configuring Queries" on page 130.
8. Select a query.
Note: For PUT-enabled packages:
• only queries with ROWID_OBJECT can be used
• custom queries cannot be used
9. Click Finish.
The Packages tool adds the new package to the list.

Modifying Package Properties


To edit the package properties:
1. In the Hub Console, start the Packages tool according to the instructions in
"Starting the Packages Tool" on page 153.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the package to configure.
4. In the properties panel, change any of the package properties that have an
Edit button to the right.

5. If you want, expand the package in the packages list.
6. To change the query, select Query beneath the package and modify the
query as described in "Editing Query Properties" on page 132.

7. To display the package view, select View beneath the package.

Refreshing Packages After Changing Queries
If a query has been changed, then any packages based on that query must be
refreshed.

To refresh a package:
1. In the Hub Console, start the Packages tool according to the instructions in
"Starting the Packages Tool" on page 153.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the package that you want to refresh.
4. From the Packages menu, choose Refresh.

Note: If after a refresh the query remains out of synch with the package, then
simply check (select) or uncheck (clear) any columns for this query. For more
information, see "Configuring the Column(s) in a Query" on page 136.

Specifying Join Queries


You can choose to allow data stewards to view base object information, along
with information from other tables, in the Data Manager or Merge Manager.

To expose this information:
1. Create a PUT-enabled base object package.
2. Create a query to join the PUT-enabled base object package with the other
tables.
3. Create a display package based on the query you just created.

Removing Packages
To remove a package:
1. In the Hub Console, start the Packages tool according to the instructions in
"Starting the Packages Tool" on page 153.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the package to remove.
4. Right click the package and choose Delete Package.
The Packages tool prompts you to confirm deletion.
5. Click Yes.
The Packages tool removes the package from the list.

Chapter 7: State Management

This chapter describes how to configure state management in your
Informatica MDM Hub implementation.

Chapter Contents
• "Before You Begin" on page 159
• "About State Management in Informatica MDM Hub" on page 159
• "State Transition Rules for State Management" on page 161
• "Configuring State Management for Base Objects" on page 162
• "Modifying the State of Records" on page 164
• "Rules for Loading Data" on page 168

Before You Begin


Before you begin to use state management, you must have:
• installed Informatica MDM Hub and created the Hub Store according to the
instructions in the Informatica MDM Hub Installation Guide
• built a schema (see "About the Schema" on page 73)

About State Management in Informatica MDM Hub
Informatica MDM Hub supports workflow tools by storing pre-defined system
states for base object and cross-reference records. By enabling state
management on your data, Informatica MDM Hub offers the following
additional flexibility:
• Allows integration with workflow integration processes and tools
• Supports a “change approval” process
• Tracks intermediate stages of the process (pending records)

System States
System state describes how base object records are supported by Informatica
MDM Hub. The following table describes the supported system states.
System States
State    Description
ACTIVE   Default state. Record has been reviewed and approved. Active
         records participate in Hub processes by default. This is a state
         associated with a base object or cross-reference record. A base
         object record is active if at least one of its cross-reference
         records is active. A cross-reference record contributes to the
         consolidated base object only if it is active. These are the
         records that are available to participate in any operation. If
         records are required to go through an approval process, then
         these records have been through that process and have been
         approved. Note that Informatica MDM Hub allows matches to and
         from PENDING and ACTIVE records.
PENDING  Pending records are records that have not yet been approved for
         general usage in the Hub. These records can have most operations
         performed on them, but operations have to specifically request
         pending records. If records are required to go through an
         approval process, then these records have not yet been approved
         and are in the midst of an approval process. If there are only
         pending cross-reference records, then the Best Version of the
         Truth (BVT) on the base object is determined through trust on
         the PENDING records. Note that Informatica MDM Hub allows
         matches to and from PENDING and ACTIVE records.
DELETED  Deleted records are records that are no longer desired to be
         part of the Hub’s data. These records are not used in processes
         (unless specifically requested). Records can only be deleted
         explicitly and once deleted can be restored if desired. When a
         record that is pending is deleted, it is physically deleted,
         does not enter the DELETED state, and cannot be restored. Note
         that Informatica MDM Hub does not include records in the
         DELETED state for trust and validation rules.

Hub State Indicator


All base objects and cross-reference tables have a system column, HUB_
STATE_IND, that indicates the system state for records in those tables. This
column contains the following values associated with system states:
System State      Value
ACTIVE (Default)  1
PENDING           0
DELETED           -1
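
For example, a query that inspects record states directly could filter on
this column as follows (the base object name is illustrative only):

SELECT ROWID_OBJECT, HUB_STATE_IND
  FROM C_CONSUMER
 WHERE HUB_STATE_IND = 0;  -- returns only PENDING records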

Protecting Pending Records Using the Interaction ID
You cannot use the tools in the Hub Console to change the state of a base
object or cross-reference record from PENDING to ACTIVE state if the
Interaction ID is set. The Interaction ID column is used to protect a pending
cross-reference record from updates that are not part of the same process as
the original cross-reference record. Use one of the state management SIF API
requests instead. For more information, see the Informatica MDM Hub
Services Integration Framework Guide.

Note: The Interaction ID can be specified through any API. However, it cannot
be specified when performing batch processing. For example, records that are
protected by an Interaction ID cannot be updated by the Load batch process.

The protection provided by interaction IDs is outlined in the following table.


Note that in the following table the Version A and Version B examples are
used to represent the situations where the incoming and existing interaction
ID do and do not match:
                         Existing Interaction ID
Incoming Interaction ID  Version A  Version B  Null
Version A                OK         Error      OK
Version B                Error      OK         OK
Null                     Error      Error      OK

State Transition Rules for State Management


This section describes transition rules for state management.

About State Transition Rules

State transition rules determine whether and when a record can change from
one state to another. State transition for base object and cross-reference
records can be enabled using the following methods:
• Using the Data Manager or Merge Manager tools in the Hub Console, as
described in the Informatica MDM Hub Data Steward Guide.
• Promote batch job (see "Promote Jobs" on page 552)
• SiperianClient API (see the Informatica MDM Hub Services Integration
Framework Guide)

State transition rules differ for base object and cross-reference records.

Transition Rules for Base Object Records


State    Description
ACTIVE   • Can transition to DELETED state.
         • Can transition to PENDING state only if the base object record
           becomes DELETED and a pending cross-reference record is added.
PENDING  • Can transition to ACTIVE state. This transition is called
           promotion. For more information, see "Modifying the State of
           Records" on page 164.
         • Cannot transition to DELETED state. Instead, a PENDING record
           is physically removed from the Hub.
DELETED  • Can transition to ACTIVE state only if cross-reference records
           are restored.
         • Cannot transition to PENDING state.

Transition Rules for Cross-reference (XREF) Records


State Description
ACTIVE • Can transition to DELETED state.
• Cannot transition to PENDING state.
PENDING • Can transition to ACTIVE state. This transition is called
promotion. For more information, see "Modifying the State of
Records" on page 164.
• Cannot transition to DELETED state. Instead, a PENDING record
is physically removed from the Hub.
DELETED • Can transition to ACTIVE state. This transition is called restore.
• Cannot transition to PENDING state.

Hub States and Base Object Record Value Survivorship
When there are active and pending (or deleted) cross-references together in
the same base object record, whether after a merge, put, or load, the values
on the base object record reflect only the values from the active
cross-reference records. As such:
• ACTIVE values always prevail over PENDING and DELETED values.
• PENDING values always prevail over DELETED values.

Configuring State Management for Base Objects
You can configure state management for base objects using the Schema tool.
How you configure the base object depends on your focus. Once you enable
state management for a base object, you can also configure the following
options for the base object:
• Enable the history of cross-reference promotion (see "Enabling the History
of Cross-Reference Promotion" on page 163)
• Include pending records in the match process (see "Enabling Match on
Pending Records" on page 163)

• Enable message queue triggers for a state-enabled base object record
(see "Enabling Message Queue Triggers for State Changes" on page 164)

Enabling State Management


State management is configured per base object and is disabled by default—it
must be explicitly enabled.

To enable state management for a base object:


1. Open the Model workbench and click Schema.
2. In the Schema tool, select the desired base object.
3. Click the Enable State Management checkbox on the Advanced tab of
the Base Object properties.

Note: If the base object has a custom query, then when you disable state
management on the base object, a warning pop-up window is always displayed,
even when HUB_STATE_IND is not included in the custom query.

Enabling the History of Cross-Reference Promotion


When the History of Cross-Reference Promotion option is enabled, the Hub
creates and stores history information in the _HXPR table for a base object
each time a cross-reference belonging to a record in this base object
undergoes a state transition from PENDING (0) to ACTIVE (1).

To enable the history of cross-reference promotion for a base object:


1. Open the Model workbench and click on the Schema tool.
2. In the Schema tool, select the desired base object.
3. Click the Enable State Management checkbox on the Advanced tab of
the Base Object properties.
4. Click the History of Cross-Reference Promotion checkbox on the
Advanced tab of Base Object properties.

Enabling Match on Pending Records


By default, the match process includes only active records and ignores
pending records. For state management-enabled objects, to include pending
records in the match process, match pending records must be explicitly
enabled.

To enable match on pending records for a base object:


1. Open the Model workbench and click on the Schema tool.
2. In the Schema tool, select the desired base object.

3. Click the Enable State Management checkbox on the Advanced tab of
the Base Object properties.
4. Select Match/Merge Setup for the base object.
5. Click the Enable Match on Pending Records checkbox on the
Properties tab of Match/Merge Setup Details panel.

Enabling Message Queue Triggers for State Changes


Informatica MDM Hub uses message triggers to identify which actions are
communicated to outside applications using messages in message queues.
When an action occurs for which a rule is defined, a message is placed in the
message queue. A message trigger specifies the queue in which messages are
placed.

Informatica MDM Hub enables you to trigger message events for base object
records when a pending update occurs. The following message triggers are
available for state changes to base object or cross-reference records:

Event Trigger                      Action
Add new pending data               A new pending record is created.
Update existing pending data       A pending base object record is updated.
Pending update; only XREF changed  A pending cross-reference record is
                                   updated. This event includes the
                                   promotion of a record.
Delete base object data            A base object record is soft deleted.
Delete XREF data                   A cross-reference record is soft deleted.
Delete pending base object data    A base object record is hard deleted.
Delete pending XREF data           A cross-reference record is hard deleted.

To enable the message queue triggers on a pending update for a base object:
1. Open the Model workbench and click on Message Queues.
2. In the Message Queues tool, click the Trigger on Pending Updates
checkbox for the desired message queues.

For more information about message queues and message triggers, including
how to enable message queue triggers for state changes to base object and
cross-reference records, see "Configuring Message Triggers" on page 456.

Modifying the State of Records


Promotion of a record is the process of changing the system state of individual
records in Informatica MDM Hub from PENDING state to the ACTIVE state. You
can set a record for promotion immediately using the Data Steward tools, or
you can flag records to be promoted at a later time using the Promote batch
process.

Promoting Records in the Data Steward Tools


You can immediately promote PENDING base object or cross-reference
records to ACTIVE state using the tools in the Data Steward workbench (that
is, the Data Manager or Merge Manager). You can also flag these records for
promotion at a later time using either tool. For more information about using
the Hub Console to perform these tasks, see the Informatica MDM Hub Data
Steward Guide.

Flagging Base Object or Cross-reference Records for Promotion at a Later Time

To flag base object or cross-reference records for promotion at a later time
using the Data Manager:
1. Open the Data Steward workbench and click on the Data Manager tool.
2. In the Data Manager tool, click on the desired base object or cross-
reference record.

3. Click the Flag for Promote button on the associated panel.

Note: If HUB_STATE_IND is set to read-only for a package, the Set Record
State button is disabled (greyed-out) in the Data Manager and Merge
Manager Hub Console tools for the associated records. However, the Flag
for Promote button remains active because it does not directly alter the
HUB_STATE_IND column for the record(s).
In addition, the Flag for Promote button will always be active for
link-style base objects because the Data Manager does not load the
cross-references of link-style base objects.
4. Run a batch job to promote records that are flagged for promotion. For
more information, see "Promoting Records Using the Promote Batch Job"
on page 166.

Promoting Matched Records Using the Merge Manager

To promote matched records at a later time using the Merge Manager:


1. Open the Data Steward workbench and click on the Merge Manager tool.
2. In the Merge Manager tool, click on the desired matched record.
3. Click on the Flag for Promote button on the Matched Records panel.

You can now promote these PENDING cross-reference records using the
Promote batch job.

Promoting Records Using the Promote Batch Job


You can run a batch job to promote records that are flagged for promotion
using the Batch Viewer or Batch Group tool.

Setting Up a Promote Batch Job Using the Batch Viewer

To set up a batch job using the Batch Viewer to promote records flagged for
promotion:
1. Flag the desired PENDING records for promotion.
For more information, see "Modifying the State of Records" on page 164.
2. Open the Utilities workbench and click on the Batch Viewer tool.
3. Click on the Promote batch job under the Base Object node displayed in
the Batch Viewer.
4. Select Promote flagged records in [XREF table], where [XREF table] is
the cross-reference table associated with the records that you previously
flagged for promotion.
5. Click the Execute Batch button to promote the records flagged for
promotion.

Setting Up a Promote Batch Job Using the Batch Group Tool

To add a Promote Batch job using the Batch Group Tool to promote records
flagged for promotion:
1. Flag the desired PENDING records for promotion.
For more information, see "Modifying the State of Records" on page 164.
2. Open the Utilities workbench and click on the Batch Group tool.

3. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
4. Right-click the Batch Groups node in the Batch Group tree and choose Add
Batch Group from the pop-up menu (or select Add Batch Group from
the Batch Group menu). For more information, see "Adding Batch Groups"
on page 514.
5. In the batch groups tree, right click on any level, and choose the desired
option to add a new level to the batch group.
The Batch Group tool displays the Choose Jobs to Add to Batch Group
dialog. For more information, see "Adding Levels to a Batch Group" on
page 516.
6. Expand the base object(s) for the job(s) that you want to add.
7. Select the Promote flagged records in [XREF table] job.
8. Click OK.
The Batch Group tool adds the selected job(s) to the batch group.

9. Click the Save button to save your changes.

You can now execute the batch group job. For more information, see
"Executing a Batch Group" on page 524.

Rules for Loading Data


The load batch process loads records in any state. The state is specified as an
input column on the staging table. The input state can be specified in the
mapping as a landing table column or it can be derived. If an input state is not
specified in the mapping, then the state is assumed to be ACTIVE (for Load
inserts). When a record is updated through a Load batch job and the incoming
state is null, the existing state of the record to update will remain unchanged.

The following list describes how input states affect the states of existing
cross-reference records.

Incoming XREF state ACTIVE:
• Existing ACTIVE XREF: Update
• Existing PENDING XREF: Update + Promote
• Existing DELETED XREF: Update + Restore
• No XREF (Load by rowid): Insert
• No base object record: Insert

Incoming XREF state PENDING:
• Existing ACTIVE XREF: Pending Update
• Existing PENDING XREF: Pending Update
• Existing DELETED XREF: Pending Update + Restore
• No XREF (Load by rowid): Pending Update
• No base object record: Pending Insert

Incoming XREF state DELETED:
• Existing ACTIVE XREF: Soft Delete
• Existing PENDING XREF: Hard Delete
• Existing DELETED XREF: Hard Delete
• No XREF (Load by rowid): Error
• No base object record: Error

Incoming XREF state undefined:
• Existing ACTIVE XREF: Treat as ACTIVE
• Existing PENDING XREF: Treat as PENDING
• Existing DELETED XREF: Treat as DELETED
• No XREF (Load by rowid): Treat as ACTIVE
• No base object record: Treat as ACTIVE

Note: If history is enabled, after a hard delete, the HUB_STATE_IND is
changed to -9 in the cross-reference history table (HXRF) when cross-
references are deleted. The history table (HIST) will have the
HUB_STATE_IND set to -9 if the base object record is deleted.
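
The decision logic in these rules can be restated compactly in code. The
sketch below is not Informatica MDM Hub code; it is a plain Java model of
the existing-cross-reference cases in the rules above, included only to make
the combinations easier to follow.

// Illustrative restatement of the load rules for existing cross-references.
enum HubState { ACTIVE, PENDING, DELETED, UNDEFINED }

class LoadRules {
    // Returns the action taken for an incoming XREF state against an
    // existing XREF state, per the rules above.
    static String action(HubState incoming, HubState existing) {
        switch (incoming) {
            case ACTIVE:
                if (existing == HubState.PENDING) return "Update + Promote";
                if (existing == HubState.DELETED) return "Update + Restore";
                return "Update";
            case PENDING:
                if (existing == HubState.DELETED) return "Pending Update + Restore";
                return "Pending Update";
            case DELETED:
                // Soft delete only when the existing XREF is ACTIVE.
                return existing == HubState.ACTIVE ? "Soft Delete" : "Hard Delete";
            default:
                // An undefined incoming state is treated as the existing state.
                return "Treat as " + existing;
        }
    }
}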

Chapter 8: Configuring Hierarchies

This chapter explains how to configure Informatica Hierarchy Manager (HM)
using the Hierarchies tool in the Hub Console. The chapter describes how to
set up your data and how to configure the components needed by Hierarchy
Manager for your Informatica MDM Hub implementation, including entity
types, hierarchies, relationship types, packages, and profiles. For
instructions on using the Hierarchy Manager, see the Informatica MDM Hub
Data Steward Guide. This chapter is recommended for Informatica MDM Hub
administrators and implementers.

Chapter Contents
• "About Configuring Hierarchies" on page 169
• "Starting the Hierarchies Tool" on page 178
• "Configuring Hierarchies" on page 191
• "Configuring Relationship Base Objects and Relationship Types" on page
193
• "Configuring Packages for Use by HM" on page 205
• "Configuring Profiles" on page 211

About Configuring Hierarchies


Informatica MDM Hub administrators use the Hierarchies tool to set up the
structures required to view and manipulate data relationships in Hierarchy
Manager. Use the Hierarchies tool to define Hierarchy Manager components,
such as entity types, hierarchies, relationship types, packages, and
profiles, for your Informatica MDM Hub implementation.

When you have finished defining Hierarchy Manager components, you can use
the Packages tool or the Queries tool to update the query criteria.

Note: Packages must also be configured for use in HM, and the profile
must be validated.

To understand the concepts in this chapter, you must be familiar with the
concepts in the following chapters in this guide:
• "Building the Schema" on page 73
• "Configuring Queries and Packages" on page 127
• "Configuring the Consolidate Process" on page 443
• "Setting Up Security" on page 621

Before You Begin
Before you begin to configure your Hierarchy Manager (HM) system, you must
have completed the following tasks:
• Start with a blank ORS or a valid ORS and register the database in
CMX_SYSTEM, as described in "Registering an ORS" on page 56.

• Verify that you have a license for Hierarchy Manager. For details, consult
your Informatica MDM Hub administrator.
• Perform data analysis, as described in "Preparing Your Data for Hierarchy
Manager" on page 170.

Overview of Configuration Steps


To configure Hierarchy Manager, complete the following steps:
1. Start the Hub Console, as described in "Starting the Hub Console" on page
30.
2. Launch the Hierarchies tool, as described in "Starting the Hierarchies Tool"
on page 178.
If you have not already created the Repository Base Object (RBO) tables,
Hub Console walks you through the process, as described in "Creating the
HM Repository Base Objects" on page 178.
3. Create entity objects and types, as described in "Configuring Entity Objects
and Entity Types" on page 181.
4. Create hierarchies, as described in "Configuring Hierarchies" on page 191.
5. Create relationship objects and types, as described in "Configuring
Relationship Base Objects and Relationship Types" on page 193.
6. Create packages, as described in "Configuring Packages for Use by HM" on
page 205.
7. Configure profiles, as described in "Configuring Profiles" on page 211.
8. Validate the profile, as described in "Validating Profiles" on page 213.

Note: The same options you see on the right-click menu in the Hierarchy
Manager are also available on the Hierarchies menu.

Preparing Your Data for Hierarchy Manager


To make the best use of HM, you should analyze your information and make
sure you have done the following:
• Verified that your data sources are valid and trustworthy.

For more information on security issues, see "Setting Up Security" on page
621.
• Created a valid schema to work with Informatica MDM Hub and HM.
For more information on schemas and how to create them, see "Building
the Schema" on page 73.
• Created all relationships between your entities, including:
• Hierarchical relationships:
• All child entities must have a valid parent entity related to them.
Your data cannot have any ‘orphan’ child entities when it enters
HM.
• All hierarchies must be validated (see "Informatica MDM Hub
Processes" on page 218).
• Foreign key relationships.
For a general overview of foreign key relationships, see "Process
Overview for Defining Foreign-Key Relationships" on page 114. For
more information about parent-child relationships, see "Configuring
Match Paths for Related Records" on page 373.
• One-hop and multi-hop relationships (direct and indirect relationships
between entities). For more information on these kinds of
relationships, see the Informatica MDM Hub Data Steward Guide.
• Derived HM types.
• Consolidated duplicate entities from multiple source systems.
For example, a group of entities (Source A) might be the same as another
group of entities (Source B), but the two groups of entities might have
different group names. Once the entities are identified as being identical,
the two groups can be consolidated.
For more information on consolidation, see "Informatica MDM Hub
Processes" on page 218.
• Grouped your entities into logical categories, such as physician’s names
into the “Physician” category.
For more information on how to group your data, see "Configuring
Operational Reference Stores and Datasources" on page 54.
• Made sure that your data complies with the rules for:
• Referential integrity.
• Invalid data.
• Data volatility.
For more information on these database concepts, see a database
reference text.

Use Case Example of How to Prepare Data for
Hierarchy Manager
This section contains an example of how to manipulate your data before it
enters Informatica MDM Hub and before it is viewed in Hierarchy Manager.
Typically, a company’s data would be much larger than the example given
here.

Scenario

John has been tasked with manipulating his company’s data so that it can be
viewed and used within Hierarchy Manager in the most efficient way. To
simplify the example, we are describing a subset of the data that involves
product types and products of the company, which sells computer
components.

The company sells three types of products: mice, trackballs, and keyboards.
Each of these product types includes several vendors and different levels of
products, such as the Gaming keyboard and the TrackMan trackball.

Methodology

This section describes the method of data simplification.

Step 1 - Organizing Data into the Hierarchy

In this step you organize the data into the Hierarchy that will then be
translated into the HM configuration.

John begins by analyzing the product and product group hierarchy. He
organizes the products by their product group, and product groups by their
parent product group. The sheer volume of data and the relationships
contained within the data are difficult to visualize, so John lists the
categories and sees if there are relationships between them.

The following table (which contains data from the Marketing department)
shows an example of how John might organize his data.

Note: Most data sets will have many more items.

The table shows the data that will be stored in the Products BO. This is the BO
to convert (or create) in HM. The table shows Entities, such as Mice or Laser
Mouse. The relationships are shown by the grouping, that is, there is a
relationship between Mice and Laser Mouse. The heading values are the Entity
Types: Mice is a Product Group and Laser Mouse is a Product. This Type is
stored in a field on the Product table.

Organizing the data in this manner allows John to clearly see how many
entities and entity types are part of the data, and what relationships those
entities have.

The major category is ProdGroup, which can include both product groups
(such as Mice + Pointers) and the category Product, which holds the
products themselves (such as the Trackman Wheel). The relationships between
these items can be encapsulated in a relationship object, which John calls
Product Rel. In the information for the Product Rel, John has explained the
relationships: Product Group is the parent of both Product and Product Group.

Step 2 - Creating Relationship Base Object Tables

Having analyzed the data, John comes to the following conclusions:


• Product (the BO) should be converted to an Entity Object.
• Product Group and Product are the Entity Types.
• Product Rel is the Relationship Object to be created.
• The following relationship types (not all shown in the table) need to be
created:
• Product is the parent of Product (not shown).
• Product Group is the parent of Product (such as with the Mice to Laser
Mouse example).
• Product Group is the parent of Product Group (such as with Mice +
Pointers being the parent of Mice).

John begins by accessing the Hierarchies tool. When he accesses the tool,
the system creates the Relationship Base Object tables (RBO tables). RBO
tables are system base objects that contain the specific columns required
by Hierarchy Manager. They store the HM configuration data, such as the
data that you see in the table in Step 1.

For instructions on how to create base objects, see "Configuring Base Objects"
on page 82. This section describes the choices you would make when you
create the example base objects in the Schema tool.

You must create and configure a base object for each entity object and
relationship object that you identified in the previous step. In the example,
you would create a base object for Product and convert it to an HM Entity
Object. The Product Rel BO should be created in HM directly (an easier
process) instead of converting. Each new base object is displayed in the
Schema panel under the category Base Objects. Repeat this process to create
all your base objects.

In the next section, you configure the base objects so that they are optimized
for HM use.

Step 3 - Configuring Base Objects

You created the two base objects (Product and Product Rel) in the previous
section. This section describes how to configure them.

Configuring a base object involves filling in the criteria for the object’s
properties, such as the number and type of columns, the content of the staging
tables, the name of the cross-reference tables (if any), and so on. You might
also enable the history function, set up validation rules and message triggers,
create a custom index, and configure the external match table (if any).

Whether or not you choose these options and how you configure them depends
on your data model and base object choices.

In the example, John configures his base objects as the following sections
explain.

Note: Not all components of the base-object creation are addressed here,
only the ones that have specific significance for data that will be used in the
HM. For more information on the components not discussed here, see
"Building the Schema" on page 73.

Columns

This table shows the Product BO after conversion to an HM entity object. In
this list, only the Product Type field is an HM field.

Every base object has system columns and user-defined columns. System
columns are created automatically and include the required Rowid Object
column. This is the primary key for each base object table and contains a
unique, Hub-generated value. Because this value is the HM lookup for the
class code, HM creates a foreign key constraint in the database, so a
ROWID_OBJECT value is required and cannot be null.

For the user-defined columns, John chose logical names that would
effectively include information about the products, such as Product Number,
Product Type, and Product Description. These same columns and column values
must appear in the staging tables.

Staging Tables

John makes sure that all the user-defined columns from the staging tables are
added as columns in the base object, as the graphic above shows. The Lookup
column shows the HM-added lookup value.

Notice that several columns in the Staging Table (Status Cd, Product Type, and
Product Type Cd) have references to lookup tables. You can set these
references up when you create the Staging Table. You would use lookups if
you do not want to hardcode a value in your staging table, but would rather
have the server look up a value in the parent table.

Most of the lookups are unrelated to HM and are part of the data model. The
Rbo Bo Class lookup is the exception because it was added by HM. HM adds
the lookup on the Product Type column.

Note: When you are converting entities to entity base objects (entities that
are configured to be used in HM), you must have lookup tables to check the
values for the Status Cd, Product Type, and Product Type Cd.

Warning: HM Entity objects do not require start and end dates; any start
and end dates would be user-defined. However, Rel Objects do use them. Do
not create new Rel Objects with different names for the start and end
dates; these columns are already provided.

Step 4 - Creating Entity Types

You create entity types in the Hierarchies tool. John creates two entity
types: ProdGroup and Product. The following figure shows the completed
Product Entity Type information.

Each entity type has a code that derives from the data analysis and the design.
In this example, John chose to use Product as one type, and Product Group as
another.

This code must be referenced in the corresponding RBO base object table. In
this example, the code Product is referenced in the C_RBO_BO_CLASS table.
The value of the BO_CLASS_CODE is ‘Product’.
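
Because the entity type codes live in an ordinary table, you can also
inspect them outside the Hub Console. Below is a minimal read-only JDBC
sketch, assuming an Oracle ORS and placeholder credentials; the names
C_RBO_BO_CLASS and BO_CLASS_CODE come from the example above.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ListEntityTypeCodes {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@hub-db:1521:ORS", "ors_user", "ors_password");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT BO_CLASS_CODE FROM C_RBO_BO_CLASS")) {
            while (rs.next()) {
                // Expect configured entity type codes such as "Product".
                System.out.println(rs.getString("BO_CLASS_CODE"));
            }
        }
    }
}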

The following figure shows the relationship of the HM entity objects and
HM relationship objects to the RBO tables:

When John has completed all the steps in this section, he will be ready to
create other HM components, such as packages, and to view his data in HM.
For example, the following graphic shows the relationships that John has
set up in the Hierarchies tool, displayed in the Hierarchy Manager. This
example shows the full hierarchy involving Mice devices. For more
information on how to use HM, see the Informatica MDM Hub Data Steward
Guide.

Starting the Hierarchies Tool
To start the Hierarchies tool:
• In the Hub Console, do one of the following:
• Expand the Model workbench, and then click Hierarchies.
OR

• In the Hub Console toolbar, click the Hierarchies tool button.

The Hub Console displays the Hierarchies tool.

If you are setting up the Hierarchies tool, see "Creating the HM Repository
Base Objects" on page 178. If you already have RBO tables set up, see
"Configuring Entity Icons" on page 180.

Creating the HM Repository Base Objects


To use the Hierarchies tool with an ORS, the system must first create the
Repository Base Objects (RBO tables) for the ORS. RBO tables are system
base objects that must contain specific columns.

Queries and MRM packages (and their associated queries) will also be
created for these RBO tables.

Warning: Never modify these RBO tables, queries, or packages.

To create the RBOs:


1. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
2. Start the Hierarchies tool. Expand the Model workbench and click
Hierarchies. For more information, see "Starting the Hierarchies Tool" on
page 178.

Note: Any option that you can select by right-clicking in the navigation
panel is also available from the Hierarchies tool menu.

After you start the Hierarchies tool, if an ORS does not have the necessary
RBO tables, then the Hierarchies tool walks you through the process of
creating them.

The following steps explain what to select in the dialog boxes that the
Hierarchies tool displays:

1. Choose Yes in the Hub Console dialog to create the metadata (RBO tables)
for HM in the ORS.
2. Select the tablespace names in the Create RBO tables dialog, and then
click OK.

Uploading Default Entity Icons


The Hierarchies tool prompts you to upload default entity icons. These
icons are useful when you are creating entities.
1. Click Yes.
The Hub Console displays the Hierarchies tool with the default metadata.

Upgrading From Previous Versions of Hierarchy Manager

After you upgrade a pre-XU schema to XU, you will be prompted to upgrade
the XU-specific Hierarchy Manager (HM) metadata when you open the
Hierarchies tool in the Hub Console.

To upgrade the HM metadata:


1. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
2. Start the Hub Console. For more information, see "Starting the Hub
Console" on page 30.

3. Launch the Hierarchies tool in the Hub Console.
4. Click Yes to add additional columns.

After you upgrade a pre-XU schema to XU, you will also be reminded to
remove obsolete HM metadata when you open the Hierarchies tool.

To remove obsolete HM metadata:


1. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
2. Start the Hub Console. For more information, see "Starting the Hub
Console" on page 30.
3. Launch the Hierarchies tool in the Hub Console.

4. Click Yes to delete a base object.

Note: If the Rbo Rel Type Usage base object is being used by some other
non-HM base object, you are prompted to delete the table manually using
the Schema Manager.

Informatica MDM Hub shows relationship and entity types under the base
object with which they are associated. If a type is not associated with a
base object (for example, if it does not have packages assigned), it is not
displayed in the GUI, but it does remain in the database.

During the ORS upgrade process, the migration script skips over the orphan
entity and relationship types, displays a related warning message, then
continues. After the ORS upgrade, you can delete the orphan types or
associate entities and relationship types with them.

If you want to associate orphan types but you have not created the
corresponding base objects, create the objects, and then click Refresh. The
software prompts you to create the association.

Configuring Entity Icons


Using the Hierarchies tool, you can add or configure your own entity icons that
you can subsequently use when configuring your entity types. These entity
icons are used to represent your entities in graphic displays within Hierarchy
Manager. Entity icons must be stored in a JAR or ZIP file.

Adding Entity Icons

To import your own icons, create a ZIP or JAR file containing your icons.
For each icon, create a 16 x 16 pixel image for the small icon and a
48 x 48 pixel image for the large icon.
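
For example, assuming the images are collected under an icons directory
(the directory layout and file names here are illustrative), you could
package them with the JDK's jar utility:

jar cf entity_icons.jar icons/

The resulting entity_icons.jar file can then be selected in the Add Entity
Icons procedure below.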

To add new entity icons:

1. Acquire a write lock.
2. Start the Hierarchies tool.
3. Right-click anywhere in the navigation pane and choose Add Entity
Icons.
Note: You must acquire a lock to display the right-click menu.
A browse files window opens.
4. Browse for the JAR or ZIP file containing your icons.
5. Click Open to add the icons.

Modifying Entity Icons

You cannot modify icons directly from the console. You can download a ZIP
or JAR file, modify its contents, and then upload it again into the console.

You can either delete icon groups or make them inactive. If an icon is
already associated with an entity, or if you might use a group of icons in
the future, consider inactivating them instead of deleting them.

You inactivate a group of icons by marking the icon package Inactive.
Inactive icons are not displayed in the UI and cannot be assigned to an
entity type. To reactivate the icon package, mark it Active.

Warning: Informatica MDM Hub does not validate icon assignments before
deleting. If you delete an icon that is currently assigned to an Entity
Type, you will get an error when you try to save the edit.

Deleting Entity Icons

You cannot delete individual icons from a ZIP or JAR file from the console; you
can only delete them as a group or package.

To delete a group of entity icons:


1. Acquire a write lock.
2. Start the Hierarchies tool. For more information, see "Starting the
Hierarchies Tool" on page 178.
3. Right-click the icon collections in the navigation pane and choose Delete
Entity Icons.

Configuring Entity Objects and Entity Types


This section describes how to define entity objects and entity types using the
Hierarchies tool.

About Entities, Entity Objects, and Entity Types

This section describes entities, entity objects, and entity types in Hierarchy
Manager.

Entities

In Hierarchy Manager, an entity is any object, person, place, organization,
or other thing that has a business meaning and can be acted upon in your
database. Examples include a specific person's name, a specific checking
account number, a specific company, a specific address, and so on.

Entity Base Objects

An entity base object is a base object that has been configured in HM, and that
is used to store HM entities. When you create an entity base object using the
Hierarchies tool (instead of the Schema Manager), the Hierarchies tool
automatically creates the columns required for Hierarchy Manager. You can
also convert an existing MRM base object to an entity base object by using the
options in the Hierarchies tool.

After adding an entity base object, you use the Schema Manager to view, edit,
or delete it. For more information, see "Configuring Base Objects" on page 82.

Entity Types

In Hierarchy Manager, an entity type is a logical classification of one or
more entities. Examples include doctors, checking accounts, banks, and so
on. An Entity Base Object must have a Foreign Key to the Entity Type table
(Rbo BO Class). The foreign key can be defined as either a ROWID or a
predefined Code value. All entities with the same entity type are stored in
the same entity object. In the Hierarchies tool, entity types are displayed
in the navigation tree under the Entity Object with which the Type is
associated.

Well-defined entity types have the following characteristics:


• They effectively segment the data in a way that reflects the real-world
nature of the entities.
• They are disjoint. That is, no entity can have more than one entity type.
• Taken collectively, they cover the entire set of entities. That is, every
entity has one and only one entity type.
• They are granular enough so that you can easily define the types of
relationships that each entity type can have. For example, an entity type
of “doctor” can have the relationships: “member of” with a medical group,
“staff” (or “non-staff with admitting privileges”) with a hospital, and so on.

• A more general entity type, such as "care provider" (which encompasses
nurses, nurse practitioners, doctors, and others), is not granular enough.
In this case, the types of relationships that such a general entity type
can have will depend on something beyond just the entity type. Therefore,
you need to define more granular entity types.

Adding Entity Base Objects

To create a new entity base object:


1. In the Hierarchies tool, acquire a write lock.
2. Right-click anywhere in the navigation pane and choose Create New
Entity/Relationship Object. You can also choose this option from the
Hierarchies tool menu.
3. In the Create New Entity/Relationship Base Object dialog, select
Create New Entity Base Object.
4. Click OK.
The Hierarchies tool prompts you to enter information about the new base
object.

5. Specify the following properties for this new entity base object.
• Item Type: Read-only. Already specified.
• Display name: Name of this base object as it will be displayed in the
Hub Console.
• Physical name: Actual name of the table in the database. Informatica
MDM Hub will suggest a physical name for the table based on the display
name that you enter.
• Data tablespace: Name of the data tablespace. For more information, see
the Informatica MDM Hub Installation Guide.
• Index tablespace: Name of the index tablespace. For more information,
see the Informatica MDM Hub Installation Guide.
• Description: Description of this base object.
• Foreign Key column for Entity Types: Column used as the Foreign Key for
this entity type; can be either ROWID or CODE. The RowId is generated and
assigned by the system, but the BO Class Code is created by the user,
making it easier to remember. The ability to choose a BO Class CODE column
reduces complexity by allowing you to define the foreign key relationship
based on a predefined code, rather than the Informatica MDM Hub-generated
ROWID.
• Display name: Descriptive name of the column of the Entity Type Foreign
Key that is displayed in Hierarchy Manager.
• Physical name: Actual name of the FK column in the table. Informatica
MDM Hub will suggest a physical name for the FK column based on the
display name that you enter.
6. Click OK to save the new base object.

The base object you created has the columns required by Hierarchy Manager.
You probably require additional columns in the base object, which you can add
using the Schema Manager, as described in "Configuring Columns in Tables"
on page 102.

Important: When you modify the base object using the Schema Manager, do
not change any of the columns added by Hierarchy Manager. Modifying any of
these Hierarchy Manager columns will result in unpredictable behavior and
possible data loss.

Converting Base Objects to Entity Base Objects

You must convert base objects to entity base objects before you can use them
in HM.

Base objects created in MRM do not have the metadata required by Hierarchy
Manager. In order to use these MRM base objects with HM, you must add this
metadata via a conversion process. Once you have done this, you can use
these converted base objects with both MRM and HM.

To convert an existing MRM base object to work with HM:


1. In the Hierarchies tool, acquire a write lock.

2. Right-click anywhere in the navigation pane and choose Convert BO to
Entity/Relationship Object.
Note: The same options you see on the right-click menu are also available
on the Hierarchies menu.
3. In the Modify Existing Base Object dialog, select Convert to Entity and
click OK.

Note: If you do not see any choices in the Modify Base Object field, then
there are no non-hierarchy base objects available. You must create one in
the Schema tool.
4. Click OK.
If the base object already has HM metadata, the Hierarchies tool will
display a message indicating the HM metadata that exists.

5. In the Foreign Key Column for Entity Types field, select the column to be
added: RowId Object or BO Class Code.

This is the descriptive name of the column of the Entity Type Foreign Key
that is displayed in Hierarchy Manager.
The ability to choose a BO Class Code column reduces the complexity by
allowing you to define the foreign key relationship based on a predefined
code, rather than the Informatica MDM Hub-generated ROWID.
6. In the Existing BO Column to Use field, select an existing column or
select the Create New Column option.
If no BO columns exist, only the Create New Column option is available.
7. In the Display Name and Physical Name fields, create display and physical
names for the column, and click OK.

The base object will now have the columns that Hierarchy Manager requires.
To add additional columns, use the Schema Manager (see "Configuring
Columns in Tables" on page 102).

Important: When you modify the base object using the Schema Manager
tool, do not change any of the columns added using the Hierarchies tool.
Modifying any of these columns will result in unpredictable behavior and
possible data loss.

Adding Entity Types

To add a new entity type:


1. In the Hierarchies tool, right-click on the entity object in which you want to
store the new entity type you are creating and select Add Entity Type.

The Hierarchies tool displays a new entity type (called New Entity Type)
in the navigation tree under the Entity Object you selected.

2. In the properties panel, specify the following properties for this new
entity type.
• Code: Unique code name of the Entity Type. Can be used as a foreign key
from HM entity base objects.
• Display name: Name of this entity type as it will be displayed in the
Hub Console. Specify a unique, descriptive name.
• Description: Description of this entity type.
• Color: Color of the entities associated with this entity type, as they
are displayed in the Hierarchy Manager Console and Business Data Director.
• Small Icon: Small icon for entities associated with this entity type, as
they are displayed in the Hierarchy Manager Console and Business Data
Director.
• Large Icon: Large icon for entities associated with this entity type, as
they are displayed in the Hierarchy Manager Console and Business Data
Director.

3. To designate a color for this entity type, click the button next to
Color.
The color choices window is displayed. The color you choose determines how
entities of this type are displayed in the Hierarchy Manager. Select a
color and click OK.

4. To select a small icon for the new entity type, click the button next
to Small Icon.
The Choose Small Icon window is displayed. Small icons determine how
entities of this type are displayed when the graphic view window shows
many entities. To learn more about adding icon graphics for your entity
types, see "Configuring Entity Icons" on page 180.
Select a small icon and click OK.

5. To select a large icon for the new entity type, click the button next
to Large Icon.
The Choose Large Icon window is displayed. Large icons determine how
entities of this type are displayed when the graphic view window shows few
entities. To learn more about adding icon graphics for your entity types,
see "Configuring Entity Icons" on page 180.
Select a large icon and click OK.

6. Click the Save button to save the new entity type.

Editing Entity Types

To edit an entity type:


1. In the Hierarchies tool, in the navigation tree, click the entity type to edit.

2. For each field that you want to edit, click the Edit button and make
the change that you want. For more information about these fields, see
"Adding Entity Types" on page 186.
3. When you have finished making changes, click the Save button to save
your changes.

Warning: If your entity object uses the code column, you probably do not
want to modify the entity type code if you already have records for that entity
type.

Deleting Entity Types

You can delete any entity type that is not used by any relationship types. If the
entity type is being used by one or more relationship types, attempting to
delete it will generate an error.

To delete an entity type:


1. Acquire a write lock.
2. In the Hierarchies tool, in the navigation tree, right-click the entity type
that you want to delete, and choose Delete Entity Type.
If the entity type is not used by any relationship types, then the
Hierarchies tool prompts you to confirm deletion.
3. Choose Yes.
The Hierarchies tool removes the selected entity type from the list.

Warning: You probably do not want to delete an entity type if you already
have entity records that use that type. If your entity object uses the code
column instead of the rowid column and you have records in that entity object
for the entity type you are trying to delete, you will get an error.

Display Options for Entities

In addition to configuring color and icons for entities, you can also configure
the font size and maximum width. While color and icons can be specified for
each entity type, the font size and width apply to entities of all types.

To change the font size and entity box size in HM, use the HM font size
and entity box width settings. The default entity font size (38 points)
and maximum entity box width (600 pixels) can be overridden by settings in
the cmxserver.properties file. The settings to use are:
sip.hm.entity.font.size=fontSize
sip.hm.entity.max.width=maxWidth

The value for fontSize can be from 6 to 100, and the value for maxWidth
can be from 20 to 5000. If a specified value is outside its range, the
minimum or maximum value is used instead. Default values are used if the
specified values are not numbers.
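
For example, to render entity labels at 24 points in boxes no wider than
400 pixels, you would add the following lines to cmxserver.properties (the
values are illustrative; any values within the documented ranges work the
same way):

sip.hm.entity.font.size=24
sip.hm.entity.max.width=400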

Reverting Entity Base Objects to Base Objects

If you inadvertently converted a base object to an entity object, or if you
no longer want to work with an entity object in Hierarchy Manager, you can
revert the entity object to a base object. In doing so, you remove the HM
metadata from the object.

To revert an entity base object to a base object:


1. In the Hierarchies tool, acquire a write lock.
2. Right-click on an entity base object and choose Revert
Entity/Relationship Object to BO.
3. If you are prompted to revert associated relationship objects, click OK.

Note that when you revert the entity object, you are also reverting its
corresponding relationship objects.
4. In the Revert Entity/Relationship Object dialog box, click OK.

A dialog is displayed when the entity is reverted.

Configuring Hierarchies
This section describes how to define hierarchies using the Hierarchies tool.

About Hierarchies
A hierarchy is a set of relationship types (as described in "About Relationships,
Relationship Objects, and Relationship Types" on page 193). These
relationship types are not ranked, nor are they necessarily related to each
other. They are merely relationship types that are grouped together for ease
of classification and identification. The same relationship type can be
associated with multiple hierarchies. A hierarchy type is a logical classification
of hierarchies.

Adding Hierarchies
To add a new hierarchy:
1. In the Hierarchies tool, acquire a write lock.
2. Right-click an entity object in the navigation pane and choose Add
Hierarchy.

The Hierarchies tool displays a new hierarchy (called New Hierarchy) in
the navigation tree under the Hierarchies node. The default properties are
displayed in the properties pane.

3. Specify the following properties for this new hierarchy.

• Code: Unique code name of the hierarchy. Can be used as a foreign key
from HM relationship base objects.
• Display name: Name of this hierarchy as it will be displayed in the Hub
Console. Specify a unique, descriptive name.
• Description: Description of this hierarchy.

4. Click the Save button to save the new hierarchy.

Editing Hierarchies
To edit a hierarchy:
1. In the Hierarchies tool, acquire a write lock.
2. In the navigation tree, click the hierarchy to edit.

3. Click the Edit button and edit the name.

4. Click the Save button to save your changes.

Warning: If your relationship object uses the hierarchy code column (instead
of the rowid column), you probably do not want to modify the hierarchy code if
you already have records for that hierarchy in the relationship object.

Deleting Hierarchies
Warning: You probably do not want to delete a hierarchy if you already
have relationship records that use the hierarchy. If your relationship
object uses the hierarchy code column instead of the rowid column, and you
have records in that relationship object for the hierarchy you are trying
to delete, you will get an error.

To delete a hierarchy:
1. In the Hierarchies tool, acquire a write lock.
2. In the navigation tree, right-click the hierarchy that you want to delete,
and choose Delete Hierarchy.
The Hierarchies tool prompts you to confirm deletion.
3. Choose Yes.
The Hierarchies tool removes the selected hierarchy from the list.

Note: You are allowed to delete a hierarchy that has relationship types
associated with it. There will be a warning with the list of associated
relationship types. If you elect to delete the hierarchy, all references to it will
automatically be removed.

Configuring Relationship Base Objects and Relationship Types

This section describes how to define relationship base objects and
relationship types using the Hierarchies tool.

About Relationships, Relationship Objects, and Relationship Types

This section introduces relationships, relationship base objects, and
relationship types in Hierarchy Manager.

Relationships

A relationship describes the affiliation between two specific entities.
Hierarchy Manager relationships are defined by specifying the relationship
type, hierarchy type, attributes of the relationship, and dates for when
the relationship is active.

Relationship Base Objects

A relationship base object is a base object used to store HM relationships.

Relationship Types

A relationship type describes classes of relationships and defines the
types of entities that a relationship of this type can include, the
direction of the relationship (if any), and how the relationship is
displayed in the Hub Console.

Note: Relationship Type is a physical construct and can be
configuration-heavy, while Hierarchy Type is more of a logical construct
and is typically configuration-light. Therefore, it is often easier to have
many Hierarchy Types than to have many Relationship Types. Be sure to
understand your data and hierarchy management requirements before defining
Hierarchy Types and Relationship Types within Informatica MDM Hub.

A well-defined set of Hierarchy Manager relationship types has the
following characteristics:
• It reflects the real-world relationships between your entity types.
• It supports multiple relationship types for each relationship.

Configuring Relationship Base Objects


This section describes how to configure relationship base objects in the
Hierarchies tool.

Creating Relationship Base Objects

A relationship base object is used to store HM relationships.

To add a new relationship base object:


1. In the Hierarchies tool, acquire a write lock.
2. Right-click anywhere in the navigation pane and choose Create New
Entity/Relationship Object...
The Hierarchies tool prompts you to select the type of base object to
create.

3. Select Create New Relationship Base Object.


4. Click OK.

The Hierarchies tool prompts you to enter information about the new
relationship base object.

5. Specify the following properties for this new relationship base object.
• Item Type: Read-only. Already specified.
• Display name: Name of this base object as it will be displayed in the
Hub Console.
• Physical name: Actual name of the table in the database. Informatica
MDM Hub will suggest a physical name for the table based on the display
name that you enter.
• Data tablespace: Name of the data tablespace. For more information, see
the Informatica MDM Hub Installation Guide.
• Index tablespace: Name of the index tablespace. For more information,
see the Informatica MDM Hub Installation Guide.
• Description: Description of this base object.
• Entity Base Object 1: Entity base object to be linked via this
relationship base object.
• Display name: Name of the column that is a FK to entity base object 1.
• Physical name: Actual name of the column in the database. Informatica
MDM Hub will suggest a physical name for the column based on the display
name that you enter.
• Entity Base Object 2: Entity base object to be linked via this
relationship base object.
• Display name: Name of the column that is a FK to entity base object 2.
• Physical name: Actual name of the column in the database. Informatica
MDM Hub will suggest a physical name for the column based on the display
name that you enter.
• Hierarchy FK Column: Column used as the foreign key for the hierarchy;
can be either ROWID or CODE. The ability to choose a BO Class CODE column
reduces complexity by allowing you to define the foreign key relationship
based on a predefined code, rather than the Informatica MDM Hub-generated
ROWID.
• Hierarchy FK Display Name: Name of this FK column as it will be
displayed in the Hub Console.
• Hierarchy FK Physical Name: Actual name of the hierarchy foreign key
column in the table. Informatica MDM Hub will suggest a physical name for
the column based on the display name that you enter.
• Rel Type FK Column: Column used as the foreign key for the relationship
type; can be either ROWID or CODE.
• Rel Type Display Name: Name of the column that is used to store the Rel
Type CODE or ROWID.
• Rel Type Physical Name: Actual name of the relationship type FK column
in the table. Informatica MDM Hub will suggest a physical name for the
column based on the display name that you enter.
6. Click OK to save the new base object.

The relationship base object you created has the columns required by
Hierarchy Manager. You may require additional columns in the base object,
which you can add using the Schema Manager, as described in "Configuring
Columns in Tables" on page 102.

Important: When you modify the base object using the Schema Manager, do
not change any of the columns added by Hierarchy Manager. Modifying any of
these columns will result in unpredictable behavior and possible data loss.

Creating a Foreign Key Relationship Base Object

A foreign key relationship base object is an entity base object with a foreign
key to another entity base object.

To create a foreign key relationship base object:


1. In the Hierarchies tool, acquire a write lock.
2. Right-click anywhere in the navigation pane and choose Create Foreign
Key Relationship.
The Hierarchies tool displays the Modify Existing Base Object dialog.

3. Specify the base object and the number of Foreign Key columns, then click
OK.
The Hierarchies tool displays the Convert to FK Relationship Base Object
dialog.

4. Specify the following properties for this new FK relationship object.
• FK Constraint Entity BO 1: Select the FK entity base object from the
list.
• Existing BO Column to Use: Name of the existing base object column to
use for the FK, or choose to create a new column.
• FK Column Display Name 1: Name of the FK column as it will be displayed
in the Hub Console.
• FK Column Physical Name 1: Actual name of the FK column in the database.
Informatica MDM Hub will suggest a physical name based on the display name
that you enter.
• FK Column Represents: Choose Entity1 or Entity2, depending on what the
FK column represents in the relationship.

5. Click OK to save the new FK relationship object.

The base object you created has the columns required by Hierarchy Manager.
You may require additional columns in the base object, which you can add
using the Schema Manager, as described in "Configuring Columns in Tables"
on page 102.

Important: When you modify the base object using the Schema Manager
tool, do not change any of the columns added by the Hierarchies tool.
Modifying any of these columns will result in unpredictable behavior and
possible data loss.

For more information about foreign key relationships, see "Building the
Schema" on page 73.

Converting Base Objects to Relationship Base Objects

Relationship base objects are tables that contain information about two entity
base objects.

Base objects created in MRM do not have the metadata required by Hierarchy
Manager for relationship information. In order to use these MRM base objects
with Hierarchy Manager, you must add this metadata via a conversion
process. Once you have done this, you can use these converted base objects
with both MRM and HM.

To convert a base object to a relationship object for use with HM:


1. In the Hierarchies tool, acquire a write lock.
2. Right-click in the navigation pane and choose Convert BO to
Entity/Relationship Object.

3. Click OK.
The Convert to Relationship Base Object screen is displayed.

4. Specify the following properties for this base object.
• Entity Base Object 1: Entity base object to be linked via this
relationship base object.
• Display name: Name of the column that is a FK to entity base object 1.
• Physical name: Actual name of the column in the database. Informatica
MDM Hub will suggest a physical name for the column based on the display
name that you enter.
• Entity Base Object 2: Entity base object to be linked via this
relationship base object.
• Display name: Name of the column that is a FK to entity base object 2.
• Physical name: Actual name of the column in the database. Informatica
MDM Hub will suggest a physical name for the column based on the display
name that you enter.
• Hierarchy FK Column: Column used as the foreign key for the hierarchy;
can be either ROWID or CODE. The ability to choose a BO Class CODE column
reduces complexity by allowing you to define the foreign key relationship
based on a predefined code, rather than the Informatica MDM Hub-generated
ROWID.
• Existing BO Column to Use: Actual column in the existing BO to use for
the hierarchy FK.
• Hierarchy FK Display Name: Name of this FK column as it will be
displayed in the Hub Console.
• Hierarchy FK Physical Name: Actual name of the hierarchy foreign key
column in the table. Informatica MDM Hub will suggest a physical name for
the column based on the display name that you enter.
• Rel Type FK Column: Column used as the foreign key for the relationship
type; can be either ROWID or CODE.
• Existing BO Column to Use: Actual column in the existing BO to use for
the relationship type FK.
• Rel Type FK Display Name: Name of the FK column that is used to store
the Rel Type CODE or ROWID.
• Rel Type FK Physical Name: Actual name of the relationship type FK
column in the table. Informatica MDM Hub will suggest a physical name for
the column based on the display name that you enter.
5. Click OK.

Warning: When you modify the base object using the Schema Manager tool,
do not change any of the columns added by HM. Modifying any of these HM
columns will result in unpredictable behavior and possible data loss.

Reverting Relationship Base Objects to Base Objects

Reverting removes HM metadata from the relationship object. The
relationship object remains as a base object, but it is no longer displayed
in the Hierarchy Manager.

To revert a relationship object to a base object:


1. In the Hierarchies tool, acquire a write lock.
2. Right-click on a relationship base object and choose Revert
Entity/Relationship Object to BO.
3. In the Revert Entity/Relationship Object dialog box, click OK.

4. A confirmation dialog is displayed when the relationship object is
reverted.

Configuring Relationship Types


This section describes how to configure relationship types in the Hierarchies
tool.

Adding Relationship Types

To add a new relationship type:


1. In the Hierarchies tool, acquire a write lock.
2. Right-click on a relationship object and choose Add Relationship Type.
The Hierarchies tool displays a new relationship type (called New Rel
Type) in the navigation tree under the Relationship Types node. The
default properties are displayed in the properties pane.
Note: You can only save a relationship type if you associate it with a
hierarchy.
A Foreign Key Relationship Base Object is an Entity Base Object containing
a foreign key to another Entity Base Object. A Relationship Base Object is
a table that relates the two Entity Base Objects.
Note: FK relationship types can only be associated with a single hierarchy.
3. Review the properties panel, which displays the properties that you
must enter to create the relationship type.
4. In the properties panel, specify the following properties for this new
relationship type.

• Code: Unique code name of the relationship type. Can be used as a
foreign key from HM relationship base objects.
• Display name: Name of this relationship type as it will be displayed in
the Hub Console. Specify a unique, descriptive name.
• Description: Description of this relationship type.
• Color: Color of the relationships associated with this relationship
type, as they are displayed in the Hierarchy Manager Console and Business
Data Director.
• Entity Type 1: First entity type associated with this new relationship
type. Any entities of this type will be able to have relationships of this
relationship type.
• Entity Type 2: Second entity type associated with this new relationship
type. Any entities of this type will be able to have relationships of this
relationship type.
• Direction: Direction of the new relationship type, to allow a directed
hierarchy. The possible directions are: Entity 1 to Entity 2, Entity 2 to
Entity 1, Undirected, Bi-Directional, and Unknown. An example of a
directed hierarchy is an organizational chart, with the relationship
"reports to" being directed from employee to supervisor, and so on, up to
the head of the organization.
• FK Rel Start Date: The start date of the foreign key relationship.
• FK Rel End Date: The end date of the foreign key relationship.
• Hierarchies: Check the box next to any hierarchy that you want
associated with this new relationship type. Any selected hierarchies can
contain relationships of this relationship type.

5. To designate a color for this relationship type, click the button next
to Color.
The color choices window is displayed. The color you choose determines how
relationships of this type are displayed in the Hierarchy Manager. Select
a color and click OK.

6. Click the Calendar button to designate a start and end date for a
foreign key relationship. All relationships of this FK relationship type
will have the same start and end date. If you do not specify these dates,
the default values are added automatically.
7. Select a hierarchy.

8. Click the Save button to save the new relationship type.

Editing Relationship Types

To edit a relationship type:


1. In the Hierarchies tool, acquire a write lock.
2. In the navigation tree, click the relationship type that you want to edit.

3. For each field that you want to edit, click the Edit button and make
the change that you want. To learn more about these fields, see "Adding
Relationship Types" on page 201.
4. When you have finished making changes, click the Save button to save
your changes.

Warning: If your relationship object uses the code column, you probably do
not want to modify the relationship type code if you already have records for
that relationship type.

This warning does not apply to FK relationship types.

Deleting Relationship Types

Warning: You probably do not want to delete a relationship type if you
already have relationship records that use the relationship type. If your
relationship object uses the relationship type code column instead of the
rowid column, and you have records in that relationship object for the
relationship type you are trying to delete, you will get an error. These
warnings do not apply to FK relationship types.

You can delete relationship types that are associated with hierarchies.
The confirmation dialog displays the hierarchies associated with the
relationship type being deleted.

To delete a relationship type:


1. In the Hierarchies tool, acquire a write lock.
2. In the navigation tree, right-click the relationship type that you want to
delete, and choose Delete Relationship Type.
The Hierarchies tool prompts you to confirm deletion.
3. Choose Yes.

The Hierarchies tool removes the selected relationship type from the list.

Configuring Packages for Use by HM


This section describes how to add MRM packages to your schema using the
Hierarchies tool. You can create MRM packages for entity base objects,
relationship base objects, and foreign key relationship base objects. If records
will be inserted or changed in the package, be sure to enable the Put option.

About Packages
As described in "Configuring Queries and Packages" on page 127, a package is
a public view of one or more underlying tables in Informatica MDM Hub.
Packages represent subsets of the columns in those tables, along with any
other tables that are joined to the tables. A package is based on a query. The
underlying query can select a subset of records from the table or from another
package. Packages are used for configuring user views of the underlying data.
For more information, see "Configuring Queries and Packages" on page 127.

You must first create a package to use with Hierarchy Manager, then you must
associate it with Entity Types or Relationship Types.

Creating Packages
This section describes how to create HM and Relationship packages.

Creating Entity, Relationship, and FK Relationship Object Packages

To create an HM package:
1. Acquire a write lock.
2. In the Hierarchies tool, right-click anywhere in the navigation pane and
choose Create New Package.
The Hierarchies tool starts the Create New Package wizard and displays
the first dialog box.

3. Specify the following information for this new package.
• Type of Package: One of the following types: Entity Object, Relationship
Object, or FK Relationship Object.
• Query Group: Select an existing query group or choose to create a new
one. In Informatica MDM Hub, query groups are logical groups of queries.
For more information, see "Configuring Query Groups" on page 129.
• Query group name: Name of the new query group; needed only if you chose
to create a new group above.
• Description: Optional description for the new query group that you are
creating.
4. Click Next.
The Create New Package wizard displays the next dialog box.

5. Specify the following information for this new package.
• Query Name: Name of the query. In Informatica MDM Hub, a query is a
request to retrieve data from the Hub Store. For more information, see
"Configuring Queries" on page 130.
• Description: Optional description.
• Select Primary Table: Primary table for this query.
6. Click Next.
The Create New Package wizard displays the next dialog box.

7. Specify the following information for this new package.
• Display Name: Display name for this package, which is used to display
the package in the Hub Console.
• Physical Name: Physical name for this package. The Hub Console will
suggest a physical name based on the display name you entered.
• Description: Optional description.
• Enable PUT: Select to enable records to be inserted or changed
(optional). If you do not choose this, your package will be read-only. If
you are creating a foreign key relationship object package, you have
additional steps in Step 9 of this procedure. Note: You must have both a
PUT and a non-PUT package for every foreign key relationship. Both PUT and
non-PUT packages that you create for the same foreign key relationship
object must have the same columns.
• Secure Resource: Select to create a secure resource (optional).
8. Click Next.
The Create New Package wizard displays a final dialog box. The dialog box
you see depends on the type of package you are creating.
• If you selected to create either a package for entities or relationships
or a PUT package for FK relationships, a dialog box similar to the
following dialog box is displayed. The required columns (shown in grey)
are automatically selected; you cannot deselect them. Deselect the columns
that are irrelevant to your package.
Note: You must have both a PUT and a non-PUT package for every foreign key
relationship. Both PUT and non-PUT packages that you create for the same
foreign key relationship object must have the same columns.

• If you selected to create a non-PUT enabled package for foreign key
relationships (see Step 7 of this procedure; do not check the PUT check
box), the following dialog box is displayed.

9. If you are creating a non-PUT enabled package for foreign key
relationships, specify the following information for this new package.
• Hierarchy: Hierarchy associated with this package. For more information,
see "Configuring Hierarchies" on page 191.
• Relationship Type: Relationship type associated with this package. For
more information, see "Configuring Relationship Base Objects and
Relationship Types" on page 193.
Note: You must have both a PUT and a non-PUT package for every foreign key
relationship. Both PUT and non-PUT packages that you create for the same
foreign key relationship object must have the same columns.
10. Select the columns for this new package.
11. Click Finish to create the package.

Use the Packages tool to view, edit, or delete this newly-created package, as
described in "Configuring Packages" on page 151.

Do not remove columns that are needed by Hierarchy Manager. These columns are automatically selected (and greyed out) when you create packages using the Hierarchies tool.

After You Create a Package


After creating a package, assign that package to an entity or relationship type.

Assigning Packages to Entity or Relationship Types
After you create a profile and a package for each of the entity and relationship types in the profile, you must assign the packages to those entity and relationship types. This assignment defines which fields are displayed when an entity or relationship is displayed in HM. For more information, see "Customizing the Hub Console Interface" on page 45.

To assign a package to an entity/relationship type:


1. Acquire a write lock.
2. In the Hierarchies tool, select the entity/relationship type.
The Hierarchy Manager displays the package properties for that type, if they exist; otherwise it displays the same properties pane with empty fields. When you make the display and PUT package selections, the HM package column information is displayed in the lower panel.
The numbers in the cells define the sequence in which the attributes are displayed.
3. Configure the package for your entity or relationship type.

Label: Columns used to display the label of the entity/relationship you are viewing in the HM graphical console. These columns are used to create the Label Pattern in the Hierarchy Manager Console and Business Data Director. To edit a label, click the label value to the right of the label. In the Edit Pattern dialog, enter a new label or double-click a column to use it in a pattern.
Tooltip: Columns used to display the description or comment that appears when you scroll over the entity/relationship. Used to create the tooltip pattern in the Hierarchy Manager Console and Business Data Director. To edit a tooltip, click the tooltip pattern value to the right of the Tooltip Pattern label. In the Edit Pattern dialog, enter a new tooltip pattern or double-click a column to use it in a pattern.
Common: Columns used when entities/relationships of different types are displayed in the same list. The selected columns must be in packages associated with all entity/relationship types in the profile.
Search: Columns that can be used with the search tool.
List: Columns to be displayed in a search result.
Detail: Columns used for the detailed view of an entity/relationship displayed at the bottom of the screen.
Put: Columns that are displayed when you edit a record.
Add: Columns that are displayed when you create a new record.

4. When you have finished making changes, click Save to save your changes.

Configuring Profiles
This section describes how to configure profiles using the Hierarchies tool.

About Profiles
In Hierarchy Manager, a profile is used to define user access to HM objects—
what users can view and what the HM objects look like to those users. A
profile determines what fields and records an HM user may display, edit, or
add. For example, one profile can allow full read/write access to all entities
and relationships, while another profile can be read-only (no add or edit
operations allowed). Once you define a profile, you can configure it as a
secure resource, as described in "Securing Informatica MDM Hub Resources"
on page 629.

Adding Profiles
A new profile (called Default) is created automatically for you before you
access the HM. The default profile can be maintained, and you can also add
additional profiles.

Note: The Business Data Director uses the Default profile to define how entity labels, as well as relationship and entity tooltips, are displayed. Additional profiles, as well as the additional information defined within profiles, are used only within the Hierarchy Manager Console, not the Business Data Director.

To add a new profile:


1. Acquire a write lock.
2. In the Hierarchies tool, right-click anywhere in the navigation pane and choose Add Profiles.

The Hierarchies tool displays a new profile (called New Profile) in the
navigation tree under the Profiles node. The default properties are
displayed in the properties pane.

When you select relationship types (in Step 3 below) and click Save, the tree below the profile is populated with entity objects, entity types, relationship objects, and relationship types. When you deselect a relationship type, only the relationship types are removed from the tree, not the entity types.
3. Specify the following information for this new profile.

Name: Unique, descriptive name for this profile.
Description: Description of this profile.
Relationship Types: Select one or more relationship types associated with this profile.

4. Click Save to save the new profile.


The Hierarchies tool displays information about the relationship types you
selected in the References section of the screen. Entity types are also
displayed. This information is derived from the relationship types you
selected.

Editing Profiles
To edit a profile:
1. Acquire a write lock.
2. In the Hierarchies tool, in the navigation tree, click the profile that you
want to edit.

3. Configure the profile as needed (specifying the appropriate profile name,
description, and relationship types and assigning packages), according to
the instructions in "Adding Profiles" on page 211 and "Configuring
Packages for Use by HM" on page 205.

4. When you have finished making changes, click to save your changes.

Validating Profiles
To validate a profile:
1. Acquire a write lock.
2. In the Hierarchies tool, in the navigation pane, select the profile to
validate.

3. In the properties pane, click the Validate tab.
The Hierarchies tool displays the Validate tab.
Note: Profiles can be successfully validated only after packages are assigned to entity types and relationship types.

4. Select a sandbox to use.
For information about creating and configuring sandboxes, see the
Informatica MDM Hub Data Steward Guide.
5. To validate the data, check Validate Data. This may take a long time if
you have a lot of records.
6. To start the validation process, click Validate HM Configuration.
The Hierarchies tool displays a progress window during the validation
process. The results of the validation appear in the window below the
buttons.

7. When the validation is finished, click Save.

8. Choose the directory where the validation report will be saved.
9. Click Clear to clear the box containing the description of the validation
results.

Copying Profiles
To copy a profile:
1. Acquire a write lock.
2. In the Hierarchies tool, right-click the profile that you want to copy, and
then choose Copy Profile.
The Hierarchies tool displays a new profile (called New Profile) in the navigation tree under the Profiles node. This new profile is an exact copy (with a different name) of the profile that you selected to copy. The default properties are displayed in the properties pane.

3. Configure the profile as needed (specifying the appropriate profile name, description, and relationship types, and assigning packages), according to the instructions in "Adding Profiles" on page 211.

4. Click Save to save the new profile.

Deleting Profiles
To delete a profile:
1. Acquire a write lock.
2. In the Hierarchies tool, right-click the profile that you want to delete, and
choose Delete Profile.
The Hierarchies tool displays a window that warns that packages will be
removed when you delete this profile.
3. Click Yes.
The Hierarchies tool removes the deleted profile.

Deleting Relationship Types from a Profile


To delete a relationship type:

1. Acquire a write lock.
2. In the Hierarchies tool, right-click the relationship type and choose Delete Entity Type/Relationship Type From Profile.
If the profile contains relationship types that use the entity/relationship
type that you want to delete, you will not be able to delete it unless you
delete the relationship type from the profile first.

Deleting Entity Types from a Profile


To delete an entity type:
1. Acquire a write lock.
2. In the Hierarchies tool, right-click the entity type and choose Delete Entity Type/Relationship Type From Profile.
If the profile contains relationship types that use the entity type that you
want to delete, you will not be able to delete it unless you delete the
relationship type from the profile first.

Assigning Packages to Entity and Relationship Types
After you create a profile, you must:
• Assign packages to the entity types and relationship types associated with
the profile. For more information, see "Assigning Packages to Entity or
Relationship Types" on page 210.
• Configure the package as a secure resource. For more information, see
"Securing Informatica MDM Hub Resources" on page 629.

Sandboxes
To learn about sandboxes, see the Hierarchy Manager chapter in the
Informatica MDM Hub Data Steward Guide.

Part 3: Configuring the Data Flow

Contents
• "Informatica MDM Hub Processes" on page 218
• "Configuring the Land Process" on page 264
• "Configuring the Stage Process" on page 274
• "Configuring Data Cleansing" on page 307
• "Configuring the Load Process" on page 343
• "Configuring the Match Process" on page 363
• "Configuring the Consolidate Process" on page 443
• "Configuring the Publish Process" on page 449

Chapter 9: Informatica MDM Hub Processes

This chapter provides an overview of the processes associated with batch processing in Informatica MDM Hub, including key concepts, tasks, and references to related topics in the Informatica MDM Hub documentation.

Chapter Contents
• "About Informatica MDM Hub Processes" on page 218
• "Land Process" on page 221
• "Stage Process" on page 224
• "Load Process" on page 227
• "Tokenize Process" on page 240
• "Match Process" on page 245
• "Consolidate Process" on page 255
• "Publish Process" on page 260

Before You Begin


Before you begin, you should be thoroughly familiar with the concepts of
reconciliation, distribution, best version of the truth (BVT), and batch
processing that are described in Chapter 3, “Key Concepts,” in the Informatica
MDM Hub Overview.

About Informatica MDM Hub Processes


With batch processing, data flows through Informatica MDM Hub in a sequence of individual processes.

Overall Data Flow for Batch Processes


The following figure provides a detailed perspective on the overall flow of data
through the Informatica MDM Hub using batch processes, including individual
processes, source systems, base objects, and support tables.

Note: The publish process is not shown in this figure because it is not a batch
process.

Consolidation Status for Base Object Records


This section describes the consolidation status of records in a base object.

Consolidation Indicator

All base objects have a system column named CONSOLIDATION_IND. This consolidation indicator represents the consolidation status of individual records as they progress through various processes in Informatica MDM Hub.

The consolidation indicator is one of the following values:


1 CONSOLIDATED: This record has been consolidated (determined to be unique) and represents the best version of the truth.
2 UNMERGED: This record has gone through the match process and is ready to be consolidated.
3 QUEUED_FOR_MATCH: This record is a match candidate in the match batch that is being processed in the currently-executing match process.
4 NEWLY_LOADED: This record is new (load insert) or changed (load update) and needs to undergo the match process.
9 ON_HOLD: The data steward has put this record on hold until further notice. Any record can be put on hold regardless of its consolidation indicator value. The match and consolidate processes ignore on-hold records. For more information, see the Informatica MDM Hub Data Steward Guide.

How the Consolidation Indicator Changes

Informatica MDM Hub updates the consolidation indicator for base object
records in the following sequence.
1. During the load process, when a new or updated record is loaded into a
base object, Informatica MDM Hub assigns the record a consolidation
indicator of 4, indicating that the record needs to be matched.
2. Near the start of the match process, when a record is selected as a match
candidate, the match process changes its consolidation indicator to 3.
Note: Any change to the match or merge configuration settings will trigger
a reset match dialog, asking whether you want to reset the records in the
base object (change the consolidation indicator to 4, ready for match). For
more information, see "Configuring the Match Process" on page 363 and
"Configuring the Consolidate Process" on page 443.
3. Before completing, the match process changes the consolidation indicator
of match candidate records to 2 (ready for consolidation).
Note: The match process may or may not have found matches for the
record.
A record with a consolidation indicator of 2 or 4 is visible in Merge
Manager. For more information, see the Informatica MDM Hub Data
Steward Guide.
4. If Accept All Unmatched Rows as Unique is enabled, and a record has
undergone the match process but no matches were found, then
Informatica MDM Hub automatically changes its consolidation indicator to
1 (unique). For more information, see "Accept All Unmatched Rows as
Unique" on page 369.
5. If Accept All Unmatched Rows as Unique is enabled, after the record has
undergone the consolidate process, and once a record has no more
duplicates to merge with, Informatica MDM Hub changes its consolidation
indicator to 1, meaning that this record is unique in the base object, and
that it represents the master record (best version of the truth) for that
entity in the base object.
Note: Once a record has its consolidation indicator set to 1, Informatica
MDM Hub will never directly match it against any other record. New or updated records (with a consolidation indicator of 4) can be matched against consolidated records.

Survivorship and Order of Precedence


When evaluating cells to merge from two records, Informatica MDM Hub
determines which cell data should survive and which one should be discarded.
The surviving cell data (or winning cell) is considered to represent the better
version of the truth between the two cells. Ultimately, a single, consolidated
record contains the best surviving cell data and represents the best version of
the truth.

Survivorship applies to both trust-enabled columns and columns that are not
trust enabled. When comparing cells from two different records, Informatica
MDM Hub determines survivorship based on the following factors, in order of
precedence:
1. By trust score (only if the two columns are trust-enabled). The data with
the highest trust score wins. If the trust scores are equal, or if trust is not
enabled for both columns, then proceed to the next comparison.
2. By SRC_LUD in the cross-reference record. The data with the more recent
cross-reference SRC_LUD value wins. If the SRC_LUD values are equal,
then proceed to the next comparison.
3. If both records are incoming load updates, then by LAST_UPDATE_DATE
values in the associated cross-reference records. The data with the more
recent cross-reference LAST_UPDATE_DATE wins. If the cross-reference
LAST_UPDATE_DATE values are equal, or if both records are not load
updates, then proceed to the next comparison.
4. By LAST_UPDATE_DATE in the cross-reference record. The data with the
more recent cross-reference LAST_UPDATE_DATE value wins. If the cross-
reference LAST_UPDATE_DATE values are equal, then proceed to the next
comparison.
5. By ROWID_OBJECT in the base object. ROWID_OBJECT values are
evaluated in numeric descending order. The data with the highest ROWID_
OBJECT wins.
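
As a rough illustration, the precedence above can be expressed as an ordered comparison. The Python sketch below is a simplified reading of these rules; the field names mirror the columns named above, and it is not the Hub's actual implementation:

def surviving_cell(a, b, trust_enabled, both_load_updates):
    """Return the cell (a or b) whose data survives, per the precedence above.

    a and b are dicts with keys: trust_score, src_lud, last_update_date,
    rowid_object. Illustrative sketch only.
    """
    # 1. Trust score (only if both columns are trust-enabled); highest wins.
    if trust_enabled and a["trust_score"] != b["trust_score"]:
        return a if a["trust_score"] > b["trust_score"] else b
    # 2. SRC_LUD in the cross-reference record; more recent wins.
    if a["src_lud"] != b["src_lud"]:
        return a if a["src_lud"] > b["src_lud"] else b
    # 3. Cross-reference LAST_UPDATE_DATE, considered only if both records
    #    are incoming load updates; more recent wins.
    if both_load_updates and a["last_update_date"] != b["last_update_date"]:
        return a if a["last_update_date"] > b["last_update_date"] else b
    # 4. Cross-reference LAST_UPDATE_DATE in the general case; more recent wins.
    if a["last_update_date"] != b["last_update_date"]:
        return a if a["last_update_date"] > b["last_update_date"] else b
    # 5. ROWID_OBJECT in the base object, descending numeric order; highest wins.
    return a if a["rowid_object"] > b["rowid_object"] else b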

Land Process

This section describes concepts and tasks associated with the land process in
Informatica MDM Hub.

About the Land Process
Landing data is the initial step for loading data into Informatica MDM Hub.

Source Systems and Landing Tables

Landing data involves the transfer of data from one or more source systems to
Informatica MDM Hub landing tables.

• A source system is an external system that provides data to Informatica MDM Hub. Source systems can be applications, data stores, and other systems that are internal to your organization, or obtained or purchased from external sources. For more information, see "About Source Systems" on page 265.
• A landing table is a table in the Hub Store that contains the data that is initially loaded from a source system. For more information, see "About Landing Tables" on page 269.

Data Flow of the Land Process

The following figure shows the land process in relation to other Informatica
MDM Hub processes.

Land Process is External to Informatica MDM Hub

The land process is external to Informatica MDM Hub and is executed using an
external batch process or an external application that directly populates
landing tables in the Hub Store. Subsequent processes for managing data are
internal to Informatica MDM Hub.

Ways to Populate Landing Tables

Landing tables can be populated in the following ways:


external batch process: An ETL (Extract-Transform-Load) tool or other external process copies data from a source system to Informatica MDM Hub. Batch loads are external to Informatica MDM Hub. Only the results of the batch load are visible to Informatica MDM Hub in the form of populated landing tables. Note: This process is handled by a separate ETL tool of your choice. This ETL tool is not part of the Informatica MDM Hub suite of products.
real-time processing: External applications can populate landing tables in on-line, real-time mode. Such applications are not part of the Informatica MDM Hub suite of products.

For any given source system, the approach used depends on whether it is the most efficient, or perhaps the only, way to transfer data from that particular source system. In addition, batch processing is often used for the initial data load (the first time that business data is loaded into the Hub Store), as it can be the most efficient way to populate the landing table with a large number of records. For more information, see "Initial Data Loads and Incremental Loads" on page 229.

Note: Data in the landing tables cannot be deleted until after the load process
for the base object has been executed and completed successfully.

Managing the Land Process


To manage the land process, refer to the following topics in this
documentation:
Configuration: "Configuring the Land Process" on page 264
• "Configuring Source Systems" on page 264
• "Configuring Landing Tables" on page 269
Execution: Execution of the land process is external to Informatica MDM Hub and depends on the approach you are using to populate landing tables, as described in "Ways to Populate Landing Tables" on page 223.
Application Development: If you are using external application(s) to populate landing tables, see the developer documentation for the API used by your application(s).

Stage Process

This section describes concepts and tasks associated with the stage process in
Informatica MDM Hub.

About the Stage Process


The stage process transfers data from a populated landing table to the staging
table associated with a particular base object.

Data is transferred according to mappings that link a source column in the landing table with a target column in the staging table. Mappings also define data cleansing, if any, to perform on the data before it is saved in the target table.

If delta detection is enabled (see "Configuring Delta Detection for a Staging


Table" on page 302), Informatica MDM Hub detects which records in the
landing table are new or updated and then copies only these records,
unchanged, to the corresponding RAW table. Otherwise, all records are copied
to the target table. Records with obvious problems in the data are rejected
and stored in a corresponding reject table, which can be inspected after
running the stage process (see "Viewing Rejected Records" on page 510).

Data from landing tables can be distributed to multiple staging tables. However, each staging table receives data from only one landing table.

The stage process prepares data for the load process, described in "Load
Process" on page 227, which subsequently loads data from the staging table
into a target base object.

Data Flow of the Stage Process

The following figure shows the stage process in relation to other Informatica
MDM Hub processes.

Tables Associated With the Stage Process

The following tables in the Hub Store are associated with the stage process.
landing table: Contains data that is copied from a source system. For more information, see "About the Land Process" on page 222 and "About Landing Tables" on page 269.
staging table: Contains data that was accepted and copied from the landing table during the stage process. For more information, see "About Staging Tables" on page 275.
raw table: Contains data that was archived from landing tables. Raw data can be configured to be archived based on the number of loads or the duration (specific time interval). For more information, see "Configuring the Audit Trail for a Staging Table" on page 300 and "Configuring Delta Detection for a Staging Table" on page 302.
reject table: Contains records that Informatica MDM Hub has rejected for a specific reason. Records in these tables will not be loaded into base objects. Data is rejected automatically during Stage jobs for the following reasons:
• future date or NULL date in the LAST_UPDATE_DATE column
• NULL value mapped to the PKEY_SRC_OBJECT of the staging table
• duplicates found in PKEY_SRC_OBJECT (if multiple records with the same PKEY_SRC_OBJECT are found, the surviving record is the one with the most recent LAST_UPDATE_DATE; the other records are sent to the reject table; see "Survivorship and Order of Precedence" on page 221)
• invalid value in the HUB_STATE_IND field (for state-enabled base objects only)
• duplicate value found in a unique column
Note: Rejected records are removed from the reject table when the number of stage runs is greater than the number of loads. Records with NULL primary keys are inserted into the RAW and reject tables even when there are no changes to the landing table.
The reject table is associated with the staging table (called stagingTableName_REJ). Rejected records can be inspected after running Stage jobs (see "Viewing Rejected Records" on page 510).
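
To make the reject criteria concrete, here is a minimal sketch of stage-time validation, assuming simple dict records and a now() cutoff; the function name is hypothetical and does not correspond to any Informatica API:

from datetime import datetime

def stage_reject_reason(rec, seen_pkeys):
    """Return a reject reason for one landing record, or None if accepted.

    Illustrative only; mirrors the reject criteria listed above.
    """
    lud = rec.get("LAST_UPDATE_DATE")
    if lud is None or lud > datetime.now():
        return "future or NULL LAST_UPDATE_DATE"
    pkey = rec.get("PKEY_SRC_OBJECT")
    if pkey is None:
        return "NULL PKEY_SRC_OBJECT"
    if pkey in seen_pkeys:
        # The survivor is the duplicate with the most recent LAST_UPDATE_DATE;
        # a real implementation would compare the dates before rejecting.
        return "duplicate PKEY_SRC_OBJECT"
    seen_pkeys.add(pkey)
    return None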

Managing the Stage Process


To manage the stage process, refer to the following topics in this
documentation:
Configuration: "Configuring the Stage Process" on page 274
• "Configuring Staging Tables" on page 275
• "Mapping Columns Between Landing and Staging Tables" on page 286
• "Using Audit Trail and Delta Detection" on page 300
"Configuring Data Cleansing" on page 307
• "Configuring Cleanse Match Servers" on page 308
• "Using Cleanse Functions" on page 314
• "Configuring Cleanse Lists" on page 333
Execution: "Using Batch Jobs" on page 496
• "Stage Jobs" on page 556
"Writing Custom Scripts to Execute Batch Jobs" on page 559
• "Stage Jobs" on page 596

Load Process

This section describes concepts and tasks associated with the load process in
Informatica MDM Hub. For related tasks, see "Managing the Load Process" on
page 239.

About the Load Process


In Informatica MDM Hub, the load process moves data from a staging table to the corresponding target table (the base object) in the Hub Store.

The load process determines what to do with the data in the staging table
based on:
• whether a corresponding record already exists in the target table and, if
so, whether the record in the staging table has been updated since the load
process was last run
• whether trust is enabled for certain columns (base objects only); if so, the
load process calculates trust scores for the cell data
• whether the data is valid to load; if not, the load process rejects the record
instead
• other configuration settings

Data Flow for the Load Process


The following figure shows the load process in relation to other Informatica
MDM Hub processes.

Tables Associated with the Load Process
In addition to base objects, the following tables in the Hub Store are
associated with the load process.
staging table: Contains the data that was accepted and copied from the landing table during the stage process. For more information, see "Stage Process" on page 224 and "About Staging Tables" on page 275.
cross-reference table: Used for tracking the lineage of data, that is, the source system for each record in the base object. For each source system record that is loaded into the base object, Informatica MDM Hub maintains a record in the cross-reference table that includes:
• an identifier for the system that provided the record
• the primary key value of that record in the source system
• the most recent cell values provided by that system
Each base object record will have one or more cross-reference records. For more information, see "Cross-Reference Tables" on page 86.
history tables: If history is enabled for the base object, and records are updated or inserted, then the load process writes this information into two tables:
• base object history table
• cross-reference history table
For more information, see "History Tables" on page 88.
reject table: Contains records from the staging table that the load process has rejected for a specific reason. Rejected records will not be loaded into base objects. The reject table is associated with the staging table (called stagingTableName_REJ). For more information, see "Rejected Records in Load Jobs" on page 238. Rejected records can be inspected after running Load jobs (see "Viewing Rejected Records" on page 510).

Initial Data Loads and Incremental Loads
The initial data load (IDL) is the very first time that data is loaded into a
newly-created, empty base object.

During the initial data load, all records in the staging table are inserted into
the base object as new records. For more information, see "Load Inserts" on
page 232.

Once the initial data load has occurred for a base object, any subsequent load processes are called incremental loads because only new or updated data is loaded into the base object. Duplicate data is ignored. For more information, see "Run-time Execution Flow of the Load Process" on page 231.

Trust Settings and Validation Rules


Informatica MDM Hub uses trust and validation rules to help determine the
most reliable data.

Trust Settings

If a column in a base object derives its data from multiple source systems,
Informatica MDM Hub uses trust to help with comparing the relative reliability
of column data from different source systems. For example, the Orders
system might be a more reliable source of billing addresses than the Direct
Marketing system.

Trust is enabled and configured at the column level. For example, you can
specify a higher trust level for Customer Name in the Orders system and for
Phone Number in the Billing system.

Trust provides a mechanism for measuring the relative confidence factor associated with each cell based on its source system, change history, and other business rules. Trust takes into account the quality and age of the cell data, and how its reliability decays (decreases) over time. Trust is used to determine survivorship (when two records are consolidated) and whether updates from a source system are sufficiently reliable to update the master record. For more information, see "Survivorship and Order of Precedence" on page 221 and "Configuring Trust for Source Systems" on page 344.

Data stewards can manually override a calculated trust setting if they have
direct knowledge that a particular value is correct. Data stewards can also
enter a value directly into a record in a base object. For more information, see
the Informatica MDM Hub Data Steward Guide.

Validation Rules

Trust is often used in conjunction with validation rules, which might downgrade (reduce) trust scores according to configured conditions and actions. For more information, see "Configuring Validation Rules" on page 353.

When data meets the criterion specified by the validation rule, then the trust
value for that data is downgraded by the percentage specified in the validation
rule. For example:
Downgrade trust on First_Name by 50% if Length < 3
Downgrade trust on Address Line 1, City, State, Zip and Valid_address_ind
if Valid_address_ind= ‘False’

If the Reserve Minimum Trust flag is enabled (checked) for a column, then the
trust cannot be downgraded below the column’s minimum trust setting.
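
A small sketch of how such a downgrade might be applied, purely illustrative; validation rules in the Hub are configured in the Schema Manager, not written as code, and the function below is a hypothetical stand-in:

def apply_validation_rule(trust_score, downgrade_pct, condition_met,
                          reserve_minimum_trust=False, minimum_trust=0.0):
    """Downgrade a trust score by a percentage when a rule condition is met.

    Mirrors the example rule above, e.g. downgrade First_Name by 50%
    if Length < 3. Illustrative sketch only.
    """
    if not condition_met:
        return trust_score
    downgraded = trust_score * (1 - downgrade_pct / 100.0)
    if reserve_minimum_trust:
        # Trust cannot fall below the column's minimum trust setting.
        downgraded = max(downgraded, minimum_trust)
    return downgraded

# Example: "Downgrade trust on First_Name by 50% if Length < 3"
first_name = "Al"
score = apply_validation_rule(80.0, 50, len(first_name) < 3,
                              reserve_minimum_trust=True, minimum_trust=45.0)
# score == 45.0 (the downgrade to 40.0 was floored at the minimum trust of 45.0)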

Run-time Execution Flow of the Load Process


This section provides a detailed explanation of what can occur during the load
process based on configured settings as well as characteristics of the data
being processed. This section describes the default behavior of the
Informatica MDM Hub load process. Alternatively, for incremental loads, you
can streamline load, match, and merge processing by loading by RowID, as
described in "Loading by RowID" on page 296.

Loading Records by Batch

The load process handles staging table records in batches. For each base
object, the load batch size setting (see "Load Batch Size" on page 91) specifies
the number of records to load per batch cycle (default is 1000000).

During execution of the load process for a base object, Informatica MDM Hub
creates a temporary table (_TLL) for each batch as it cycles through records in
the staging table. For example, suppose the staging table contained 250
records to load, and the load batch size were set to 100. During execution, the
load process would:
• create a TLL table and process the first 100 records
• drop and create the TLL table and process the second 100 records
• drop and create the TLL table and process the remaining 50 records
• drop and create the TLL table and stop executing because the TLL table
contained no records
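
The cycle just described can be sketched as a simple loop. The following is an illustration of the control flow only; the _TLL temporary table is created and dropped by the Hub itself, and process_batch is a hypothetical placeholder:

def process_batch(batch):
    # Hypothetical placeholder for the per-batch load logic.
    print(f"loading {len(batch)} records")

def run_load_batches(staging_records, load_batch_size=1000000):
    """Illustrate how the load process cycles through the staging table.

    For 250 records and a batch size of 100, this processes batches of
    100, 100, and 50, then stops when a cycle yields no records.
    """
    offset = 0
    while True:
        # Corresponds to (re)creating the _TLL temporary table per cycle.
        batch = staging_records[offset:offset + load_batch_size]
        if not batch:
            break  # empty TLL table: stop executing
        process_batch(batch)
        offset += len(batch)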

Determining Whether Records Already Exist

During the load process, Informatica MDM Hub first checks to see whether the
record has the same primary key as an existing record from the same source
system. It compares each record in the staging table with records in the target
table to determine whether it already exists in the target table.

What occurs next depends on the results of this comparison.

load insert: If a record in the staging table does not already exist in the target table, then Informatica MDM Hub inserts that new record in the target table.
load update: If a record in the staging table already exists in the target table, then Informatica MDM Hub takes the appropriate action. A load update occurs if the target base object gets updated with data in a record from the staging table. The load process updates a record only if it has changed since the record was last supplied by the source system. Load updates are governed by current Informatica MDM Hub configuration settings and characteristics of the data in each record in the staging table. For example, if Force Update is enabled (see "Forcing Updates in Load Jobs" on page 544), the records will be updated regardless of whether they have already been loaded.

During the load process, load updates are executed first, followed by load
inserts.
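
In pseudocode terms, the decision reduces to a primary-key check against the cross-reference data for the same source system. This is a rough sketch with a hypothetical index structure, not the Hub's actual implementation:

def classify_staging_record(rec, xref_index):
    """Decide load insert vs. load update for one staging record.

    xref_index maps (source_system, PKEY_SRC_OBJECT) to the existing
    cross-reference record, if any. Illustrative sketch only.
    """
    key = (rec["source_system"], rec["PKEY_SRC_OBJECT"])
    existing = xref_index.get(key)
    if existing is None:
        return "load insert"
    if rec["LAST_UPDATE_DATE"] > existing["SRC_LUD"]:
        return "load update"
    return "ignore"  # unchanged since last supplied by the source system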

Load Inserts

What happens during a load insert depends on the target base object and other
factors.

Load Inserts and Target Base Objects

To perform a load insert for a record in the staging table:


• The load process generates a unique ROWID_OBJECT value for the new
record.
• The load process performs foreign key lookups and substitutes any foreign
key value(s) required to maintain referential integrity. For more
information, see "Performing Lookups Needed to Maintain Referential
Integrity" on page 237.
• The load process inserts the record into the base object, and copies into
this new record the generated ROWID_OBJECT value (as the primary key
for this record in the base object), any foreign key lookup values, and all
of the column data from the staging table (except PKEY_SRC_OBJECT)—
including null values.
The base object may have multiple records for the same object (for
example, one record from source system A and another from source
system B). Informatica MDM Hub flags both new records as new.
• For each new record in the base object, the load process sets its DIRTY_
IND to 1 so that match keys can be regenerated during the tokenize
process, as described in "Base Object Records Flagged for Tokenization"
on page 243.
• For each new record in the base object, the load process sets its CONSOLIDATION_IND to 4 (ready for match) so that the new record can be matched to other records in the base object. For more information, see "Consolidation Status for Base Object Records" on page 219.
• The load process inserts a record into the cross-reference table associated
with the base object. The load process generates a primary key value for
the cross-reference table, then copies into this new record the generated
key, an identifier for the source system, and the columns in the staging
table (including PKEY_SRC_OBJECT). For more information, see "Cross-
Reference Tables" on page 86.
Note: The base object does not contain the primary key value from the
source system. Instead, the base object’s primary key is the generated
ROWID_OBJECT value. The primary key from the source system (PKEY_
SRC_OBJECT) is stored in the cross-reference table instead.
• If history is enabled for the base object (see "History Tables" on page 88),
then the load process inserts a record into its history and cross-reference
history tables.
• If trust is enabled for one or more columns in the base object, then the
load process also inserts records into control tables that support the trust
algorithms, populating the elements of trust and validation rules for each
trusted cell with the values used for trust calculations. This information
can be used subsequently to calculate trust when needed. For more
information, see "Configuring Trust for Source Systems" on page 344 and
"Control Tables for Trust-Enabled Columns" on page 345.
• If Generate Match Tokens on Load is enabled for a base object (see
"Generate Match Tokens on Load" on page 92), then the tokenize process
is automatically started after the load process completes.

Load Updates

What happens during a load update depends on the target base object and
other factors.

Load Updates and Target Base Objects

For load updates on target base objects:


• By default, for each record in the staging table, the load process compares
the value in the LAST_UPDATE_DATE column with the source last update
date (SRC_LUD) in the associated cross-reference table.

• If the record in the staging table has been updated since the last time
the record was supplied by the source system, then the load process
proceeds with the load update.
• If the record in the staging table is unchanged since the last time the
record was supplied by the source system, then the load process
ignores the record (no action is taken) if the dates are the same and
trust is not enabled, or rejects the record if it is a duplicate.
Administrators can change the default behavior so that the load process
bypasses this LAST_UPDATE_DATE check and forces an update of the
records regardless of whether the records might have already been
loaded. For more information, see "Forcing Updates in Load Jobs" on page
544.
• The load process performs foreign key lookups and substitutes any foreign
key value(s) required to maintain referential integrity. For more
information, see "Performing Lookups Needed to Maintain Referential
Integrity" on page 237.
• If the target base object has trust-enabled columns, then the load process:
• calculates the trust score for each trust-enabled column in the record
to be updated, based on the configured trust settings for this trusted
column (as described in "Configuring Trust for Source Systems" on
page 344)
• applies validation rules, if defined, to downgrade trust scores where
applicable (see "Configuring Validation Rules" on page 353)
The load process updates the target record in the base object according to
the following rules:
• If the trust score for the cell in the staging table record is higher than
the trust score in the corresponding cell in the target base object
record, then the load process updates the cell in the target record.
• If the trust score for the cell in the staging table record is lower than
the trust score in the corresponding cell in the target base object
record, then the load process does not update the cell in the target
record.
• If the trust score for the cell in the staging table record is the same as
the trust score in the corresponding cell in the target base object
record, or if trust is not enabled for the column, then the cell value in
the record with the most recent LAST_UPDATE_DATE wins.

• If the staging table record has a more recent LAST_UPDATE_DATE,
then the corresponding cell in the target base object record is
updated.
• If the target record in the base object has a more recent LAST_
UPDATE_DATE, then the cell is not updated.
For more information, see "Survivorship and Order of Precedence" on
page 221.
• For each updated record in the base object, the load process sets its
DIRTY_IND to 1 so that match keys can be regenerated during the tokenize
process. For more information, see "Base Object Records Flagged for
Tokenization" on page 243.
• Whenever an update happens on a base object record, it retains the
consolidation indicator value. For more information, see "Consolidation
Status for Base Object Records" on page 219.
• Whenever the load process updates a record in the base object, it also
updates the associated record in the cross-reference table ("Cross-
Reference Tables" on page 86), history tables (if history is enabled, see
"History Tables" on page 88), and other control tables as applicable.

• If Generate Match Tokens on Load is enabled for a base object (see "Generate Match Tokens on Load" on page 92), then the tokenize process is automatically started after the load process completes.

Performing Lookups Needed to Maintain Referential Integrity

Regardless of whether the load process is inserting or updating a record, it performs any lookups needed to translate source system foreign keys into Informatica MDM Hub foreign key values using the lookup settings configured for the staging table. For more information, see "Configuring Lookups For Foreign Key Columns" on page 283.

Disabling Referential Integrity Constraints

During the initial load/updates, or if there is no real-time, concurrent access, you can disable the referential integrity constraints on the base object to improve performance. For more information, see "Allow constraints to be disabled" on page 90.

Undefined Lookups

If a lookup on a child object is not defined (that is, the lookup table and column were not populated), you must repeat the stage process for the child object before executing the load process; otherwise the data will not load successfully. For more information, see "Stage Jobs" on page 556 and "Load Jobs" on page 542.

Allowing Null Foreign Keys

When configuring columns for a staging table in the Schema Manager, you can
specify whether to allow NULL foreign keys for target base objects. In the
Schema Manager, the Allow Null Foreign Key check box (see "Properties for
Columns in Staging Tables" on page 279) determines whether NULL foreign
keys are permitted.
• By default, the Allow Null Foreign Key check box is unchecked, which
means that NULL foreign keys are not allowed. The load process:
• accepts records with valid lookup values
• rejects records with NULL foreign keys
• rejects records with invalid foreign key values
• If Allow Null Foreign Key is enabled (selected), then the load process:
• accepts records with valid lookup values
• accepts records with NULL foreign keys (and permits load inserts and
load updates for these records)
• rejects records with invalid foreign key values

The load process permits load inserts and load updates for accepted records
only. Rejected records are inserted into the reject table rather than being
loaded into the target table.

Note: During the initial data load only, when the target base object is empty,
the load process allows null foreign keys. For more information, see "Initial
Data Loads and Incremental Loads" on page 229.
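
The accept/reject behavior can be summarized in a few lines. This is an illustrative sketch; the lookup_table argument stands in for the configured foreign key lookup, and the function is not part of any Informatica API:

def check_foreign_key(fk_value, lookup_table, allow_null_foreign_key):
    """Classify a staging record by its foreign key, per the rules above.

    lookup_table maps source foreign key values to Hub ROWID_OBJECT values.
    Illustrative sketch only.
    """
    if fk_value is None:
        return "accept" if allow_null_foreign_key else "reject"
    if fk_value in lookup_table:
        return "accept"  # valid lookup value
    return "reject"      # invalid foreign key value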

Rejected Records in Load Jobs

During the load process, records in the staging table might be rejected for the
following reasons:
• future date or NULL date in the LAST_UPDATE_DATE column
• NULL value mapped to the PKEY_SRC_OBJECT of the staging table
• duplicates found in PKEY_SRC_OBJECT
• invalid value in the HUB_STATE_IND field (for state-enabled base objects
only)
• invalid or NULL foreign keys, as described in "Allowing Null Foreign Keys"
on page 237

Rejected records will not be loaded into base objects. Rejected records can be
inspected after running Load jobs (see "Viewing Rejected Records" on page
510).

For more information about configuring the behavior of delta detection for duplicates and the retention of records in the REJ and RAW tables for a staging table, see "Using Audit Trail and Delta Detection" on page 300.

Note: To reject records, the load process requires traceability back to the
landing table. If you are loading a record from a staging table and its
corresponding record in the associated landing table has been deleted, then
the load process does not insert it into the reject table.

Other Considerations for the Load Process


This section describes other considerations for the load process.

How the Load Process Handles Parent-Child Records

If the child table contains generated keys from the parent table, the load
process copies the appropriate primary key value from the parent table into
the child table. For example, suppose you had the following data.

PARENT TABLE:
PARENT_ID  FNAME  LNAME
101        Joe    Smith
102        Jane   Smith

CHILD TABLE (has a relationship to the parent's PKEY_SRC_OBJECT):
ADDRESS  CITY     STATE  FKEY_PARENT
1893     my city  CA     101
1893     my city  CA     102

In this example, you can have a relationship pointing to the ROWID_OBJECT, to PKEY_SRC_OBJECT, or to a unique column for table lookup.

Loading State-Enabled Base Objects

The load process has special considerations when processing records for
state-enabled base objects. For more information, see "Rules for Loading
Data" on page 168.

Note: The load process rejects any record from the staging table that has an
invalid value in the HUB_STATE_IND column. For more information, see "Hub
State Indicator" on page 160.

Generating Match Tokens (Optional)

Generating match tokens is required before running the match process. In the
Schema Manager, when configuring a base object, you can specify whether to
generate match tokens immediately after the Load job completes, or to delay
tokenizing data until the Match job runs. The setting of the Generate Match
Tokens on Load check box determines when the tokenize process occurs. For
more information, see "When to Generate Match Tokens" on page 242.

Managing the Load Process


To manage the load process, refer to the following topics in this
documentation:
Configuration: "Configuring the Load Process" on page 343
• "Configuring Trust for Source Systems" on page 344
• "Configuring Validation Rules" on page 353
Execution: "Using Batch Jobs" on page 496
• "Load Jobs" on page 542
• "Synchronize Jobs" on page 557
• "Revalidate Jobs" on page 556
"Writing Custom Scripts to Execute Batch Jobs" on page 559
• "Load Jobs" on page 580
• "Synchronize Jobs" on page 597
• "Revalidate Jobs" on page 595

Tokenize Process

This section describes concepts and tasks associated with the tokenize process
in Informatica MDM Hub.

About the Tokenize Process


In Informatica MDM Hub, the tokenize process generates match tokens and
stores them in a match key table associated with the base object. Match
tokens are used subsequently by the match process to identify candidates for
matching.

Match Tokens and Match Keys

Match tokens are encoded and non-encoded representations of the data in base object records. Match tokens include:
• Match keys, which are fixed-length, compressed strings consisting of
encoded values built from all of the columns in the Fuzzy Match Key (see
"Configuring Fuzzy Match Key Properties" on page 391) of a fuzzy-match
base object. Match keys contain a combination of the words and numbers
in a name or address such that relevant variations have the same match
key value.
• Non-encoded strings consisting of flattened data from the match columns
(Fuzzy Match Key as well as all fuzzy-match columns and exact-match
columns).

Match Key Tables

Match tokens are stored in the match key table associated with the base
object. For each record in the base object, the tokenize process generates one
or more records in the match key table.

The match key table has the following system columns.
ROWID_OBJECT, CHAR(14): Identifies the record for which this match token was generated.
SSA_KEY, CHAR(8): Match key for this record. Encoded representation of the value in the fuzzy match key column (such as names, addresses, or organization names) for the associated base object record. The string consists of fixed-length, compressed, and encoded values built from a combination of the words and numbers in a name or address.
SSA_DATA, VARCHAR2(500): Non-encoded (plain text) string representation of the concatenation of the match column(s) defined in the associated base object record (Fuzzy Match Key as well as all fuzzy-match columns and exact-match columns).

Each record in the match key table contains a match token (the data in both SSA_KEY and SSA_DATA).

Example Match Keys

The match keys that are generated depend on your configured match settings
and characteristics of the data in the base object. The following example
shows match keys generated from strings using a fuzzy match / search
strategy:
String in Record Generated Match Keys
BETH O'BRIEN MMU$?/$-
BETH O'BRIEN PCOG$$$$
BETH O'BRIEN VL/IEFLM
LIZ O'BRIEN PCOG$$$$
LIZ O'BRIEN SXOG$$$-
LIZ O'BRIEN VL/IEFLM

In this example, the strings BETH O'BRIEN and LIZ O'BRIEN have the same match key value (PCOG$$$$). The match process would therefore consider these two records to be match candidates when it searches for candidates during the match process.
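
The candidate search amounts to grouping records that share a match key. A toy illustration using the example keys above; this is not the actual search algorithm, which is driven by the configured key width and search level:

from collections import defaultdict

# (ROWID_OBJECT, SSA_KEY) pairs from the example above.
strp_rows = [
    ("1", "MMU$?/$-"), ("1", "PCOG$$$$"), ("1", "VL/IEFLM"),  # BETH O'BRIEN
    ("2", "PCOG$$$$"), ("2", "SXOG$$$-"), ("2", "VL/IEFLM"),  # LIZ O'BRIEN
]

by_key = defaultdict(set)
for rowid, ssa_key in strp_rows:
    by_key[ssa_key].add(rowid)

# Records sharing any match key are candidates for comparison.
candidates = {key: rows for key, rows in by_key.items() if len(rows) > 1}
print(candidates)  # {'PCOG$$$$': {'1', '2'}, 'VL/IEFLM': {'1', '2'}} (set order may vary)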

Tokenize Process Applies to Fuzzy-match Base Objects Only

The tokenize process applies to fuzzy-match base objects only; it does not apply to exact-match base objects. For fuzzy-match base objects, the tokenize process allows Informatica MDM Hub to match rows with a degree of fuzziness: the match need not be identical, just sufficiently similar to be considered a match. For more information, see the match / search strategy discussion in "Exact-match and Fuzzy-match Base Objects" on page 247.

Tokenize Data Flow


The following figure shows the tokenize process in relation to other
Informatica MDM Hub processes.

Key Concepts for the Tokenize Process


This section describes key concepts that apply to the tokenize process.

When to Generate Match Tokens

Match tokens are maintained independently of the match process. The match
process depends on the match tokens in the match key table being current.
Updating match tokens can occur:
• after the load process (see "Generating Match Tokens (Optional)" on page
239), with any changed records (load inserts or load updates)
• when data is put into the base object using SIF Put or CleansePut requests
(see "Generate Match Tokens on Load" on page 92, as well as the
Informatica MDM Hub Services Integration Framework Guide and the
Informatica MDM Hub Javadoc)
• when you run the Generate Match Tokens job (see "Generate Match Tokens
Jobs" on page 540)
• at the start of a match job, as described in "Regenerating Match Tokens If
Needed" on page 251

Base Object Records Flagged for Tokenization

All base objects have a system column named DIRTY_IND. This dirty indicator
identifies when match keys need to be generated for the base object record.
Match tokens are stored in the match key table.

The dirty indicator is one of the following values:

0 (record is up to date): The record does not need to be tokenized.
1 (record needs to be tokenized): This flag is set to 1 when a record has been:
• added (load insert)
• updated (load update)
• consolidated
• edited in the Data Manager

For each record in the base object whose DIRTY_IND is 1, the tokenize process generates match tokens, and then resets the DIRTY_IND to 0.
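
A compact sketch of this bookkeeping; generate_match_tokens is a caller-supplied function standing in for the Hub's internal key generation, and the sketch is illustrative only:

def tokenize_dirty_records(records, match_key_table, generate_match_tokens):
    """Generate match tokens for records flagged dirty, then clear the flag.

    Mirrors the DIRTY_IND handling described above. Illustrative sketch only.
    """
    for rec in records:
        if rec["DIRTY_IND"] == 1:
            match_key_table.extend(generate_match_tokens(rec))
            rec["DIRTY_IND"] = 0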

The following figure shows how the DIRTY_IND flag changes during various
batch processes:

Key Types and Key Widths in Fuzzy-Match Base Objects

For fuzzy-match base objects, match keys are generated based on the following settings:
key type: Identifies the primary type of information being tokenized (Person_Name, Organization_Name, or Address_Part1) for this base object. The match process uses its intelligence about name and address characteristics to generate match keys and conduct searches. Available key types depend on the population set being used, as described in "Population Sets" on page 248. For more information, see "Key Types" on page 392.
key width: Determines the thoroughness of the analysis of the fuzzy match key, the number of possible match candidates returned, and how much disk space the keys consume. Available key widths are Limited, Standard, Extended, and Preferred. For more information, see "Key Widths" on page 392.

Because match keys must be able to overcome errors, variations, and word
transpositions in the data, Informatica MDM Hub generates multiple match
tokens for each name, address, or organization. The number of keys
generated per base object record varies, depending on your data and the
match key width.

Match Key Distribution and Hot Spots

The Match Keys Distribution tab in the Match / Merge Setup Details pane of the
Schema Manager allows you to investigate the distribution of match keys in
the match key table. This tool can assist you with identifying potential hot
spots in your data—high concentrations of match keys that could result in
overmatching—where the match process generates too many matches,
including matches that are not relevant. For more information, see
"Investigating the Distribution of Match Keys" on page 438.

Tokenize Ratio

You can configure the match process to repeat the tokenize process whenever
the percentage of changed records exceeds the specified ratio, which is
configured as an advanced property in the base object. For more information,
see "Complete Tokenize Ratio" on page 90.

Managing the Tokenize Process
To manage the tokenize process, refer to the following topics in this documentation:
Configuration:
• "Complete Tokenize Ratio" on page 90
• "Generate Match Tokens on Load" on page 92
• "Generating Match Tokens (Optional)" on page 239
Execution: "Using Batch Jobs" on page 496
• "Generate Match Tokens Jobs" on page 540
"Writing Custom Scripts to Execute Batch Jobs" on page 559
• "Generate Match Token Jobs" on page 573
Application Development: Informatica MDM Hub Services Integration Framework Guide

Match Process

This section describes concepts and tasks associated with the match process
in Informatica MDM Hub.

About the Match Process


Before records in a base object can be consolidated, Informatica MDM Hub
must determine which records are likely duplicates (matches) of each other.
The match process uses match rules to:
• identify which records in the base object are likely duplicates (identical or
similar)
• determine which records are sufficiently similar to be consolidated
automatically, and which records should be reviewed manually by a data
steward prior to consolidation

In Informatica MDM Hub, the match process provides you with two main ways
in which to compare records and determine duplicates:
• Fuzzy matching is the most common means used in Informatica MDM Hub
to match records in base objects. Fuzzy matching looks for sufficient
points of similarity between records and makes probabilistic match
determinations that consider likely variations in data patterns, such as
misspellings, transpositions, the combining or splitting of words,
omissions, truncation, phonetic variations, and so on.

• Exact matching is less commonly used because it matches only records with identical values in the match column(s). An exact strategy is faster, but an exact match might miss some matches if the data is imperfect.

The best option to choose depends on the characteristics of the data, your
knowledge of the data, and your particular match and consolidation
requirements. For more information, see "Exact-match and Fuzzy-match Base
Objects" on page 247.

During the match process, Informatica MDM Hub compares records in the
base object for points of similarity. If the match process finds sufficient points
of similarity (identical or similar matches) between two records, indicating
that the two records probably are duplicates of each other, then the match
process:
• populates a match table with ROWID_OBJECT references to matched record pairs, along with the match rule that identified the match, and whether the matched records qualify for automatic consolidation
• flags those records for consolidation by changing their consolidation indicator to 2 (ready for consolidation), as described in "Consolidation Status for Base Object Records" on page 219

Match Data Flow


The following figure shows the match process in relation to other Informatica
MDM Hub processes.

Key Concepts for the Match Process
This section describes key concepts that apply to the match process.

Match Rules

A match rule defines the criteria by which Informatica MDM Hub determines
whether two records in the base object might be duplicates. Informatica MDM
Hub supports two types of match rules:
Match column rules: Used to match base object records based on the values in columns you have defined as match columns, such as last name, first name, address1, and address2. This is the most commonly-used method for identifying matches. For more information, see "Configuring Match Columns" on page 387.
Primary key match rules: Used to match records from two systems that use the same primary keys for records. It is uncommon for two different source systems to use identical primary keys. However, when this does occur, primary key matches are quick and very accurate. For more information, see "Configuring Primary Key Match Rules" on page 434.

Both kinds of match rules can be used together for the same base object.

Exact-match and Fuzzy-match Base Objects

A base object is configured to use one of the following types of matching:

exact-match base object: Can have only exact match columns. For more information, see "Match Column Types" on page 387.
fuzzy-match base object: Can have both fuzzy match and exact match columns:
• fuzzy match only
• exact match only, or
• some combination of fuzzy and exact match

The type of base object determines the type of match and the type of match
columns you can define. The base object type is determined by the selected
match / search strategy for the base object. For more information, see
"Match/Search Strategy" on page 370.

Support Tables Used in the Match Process

The match process uses the following support tables:


• Match key table: Contains the match keys that were generated for all
  base object records. A match key table uses the following naming
  convention:
  C_baseObjectName_STRP
  where baseObjectName is the root name of the base object. Example:
  C_PARTY_STRP. For more information, see "Match Key Tables" on page
  240.
• Match table: Contains the pairs of matched records in the base object
  resulting from the execution of the match process on this base object.
  Match tables use the following naming convention:
  C_baseObjectName_MTCH
  where baseObjectName is the root name of the base object. Example:
  C_PARTY_MTCH. For more information, see "Populating the Match Table
  with Match Pairs" on page 252.
  Note: Link-style base objects use a link table (*_LNK) instead.
• Match flag audit table: Contains the userID of the user who, in Merge
  Manager, queued a manual match record for automerging. Match flag
  audit tables use the following naming convention:
  C_baseObjectName_FHMA
  where baseObjectName is the root name of the base object. Used only if
  Match Flag Audit Table is enabled for this base object, as described in
  "Match Flag Audit Table" on page 92.
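
These naming conventions make the support tables easy to inspect directly.
The following read-only queries are a minimal sketch, assuming a base object
named C_PARTY (so C_PARTY_STRP and C_PARTY_MTCH, per the conventions
above):

    -- Count the match keys generated for the C_PARTY base object
    SELECT COUNT(*) AS match_key_count FROM c_party_strp;

    -- Count the match pairs recorded by the match process
    SELECT COUNT(*) AS match_pair_count FROM c_party_mtch;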

Population Sets

For base objects with the fuzzy match/search strategy, the match process
uses standard population sets to account for national, regional, and language
differences. The population set affects how the match process handles
tokenization, the match / search strategy, and match purposes. For more
information, see "Fuzzy Population" on page 370.

A population set encapsulates intelligence about name, address, and other
identification information that is typical for a given population. For example,
different countries use different address formats, such as the placement of
street numbers and street names, location of postal codes, and so on.
Similarly, different regions have different distributions for surnames—the
surname “Smith” is quite common in the United States population, for
example, but not so common for other parts of the world.

Population sets improve match accuracy by accommodating the variations
and errors that are likely to appear in data for a particular population. For
more information, see "Configuring Match Settings for Non-US Populations" on
page 699.

Matching for Duplicate Data

The match for duplicate data functionality is used to generate matches for
duplicates of all non-system base object columns. These matches are
generated when there are more than a set number of occurrences of complete
duplicates on the base object columns (see "Duplicate Match Threshold" on
page 91). For most data, the optimal value is 2.

Although the matches are generated, the consolidation indicator (see
"Consolidation Indicator" on page 219) remains at 4 (unconsolidated) for those
records, so that they can be later matched using the standard match rules.

Note: The Match for Duplicate Data job is visible in the Batch Viewer if the
threshold is set above 1 and there are no NON_EQUAL match rules defined on
the corresponding base object. For more information, see "Match for Duplicate
Data Jobs" on page 552.

Build Match Groups and Transitive Matches

The Build Match Group (BMG) process removes redundant matching in
advance of the consolidate process. For example, suppose a base object had
the following match pairs:
• record 1 matches to record 2
• record 2 matches to record 3
• record 3 matches to record 4

After running the match process and creating build match groups, and before
running the consolidate process, you might see the following match pairs:
• record 2 matches to record 1
• record 3 matches to record 1
• record 4 matches to record 1

In this example, there was no explicit rule that matched record 4 to record 1.
Instead, the match was made indirectly due to the behavior of other matches
(record 1 matched to 2, 2 matched to 3, and 3 matched to 4). An indirect
match is also known as a transitive match. In the Merge Manager and Data
Manager, you can display the complete match history to expose the details of
transitive matches.

Maximum Matches for Manual Consolidation

You can configure the maximum number of manual matches to process during
batch jobs. Setting a limit helps prevent data stewards from being
overwhelmed with thousands of manual consolidations to process. Once this
limit is reached, the match process stops running until the number of
records ready for manual consolidation has been reduced. For more
information, see "Maximum Matches for Manual Consolidation" on page 368
and "Consolidate Process" on page 255.

External Match Jobs

Informatica MDM Hub provides a way to match new data with an existing base
object without actually loading the data into the base object. Rather than run
an entire Match job, you can run the External Match job instead to test for
matches and inspect the results. External Match jobs can process both fuzzy-
match and exact-match rules, and can be used with fuzzy-match and exact-
match base objects. For more information, see "External Match Jobs" on page
535 and "Exact-match and Fuzzy-match Base Objects" on page 247.

Distributed Cleanse Match Servers

For your Informatica MDM Hub implementation, you can increase the
throughput of the match process by running multiple Cleanse Match Servers in
parallel. For more information, see "Configuring Cleanse Match Servers" on
page 308 and the material about distributed Cleanse Match Servers in the
Informatica MDM Hub Installation Guide.

Handling Application Server or Database Server Failures

When running very large Match jobs with large match batch sizes, if there is a
failure of the application server or the database, you must re-run the entire
batch. Match batches are a unit. There are no incremental checkpoints. To
address this, if you think there might be a database or application server
failure, set your match batch sizes smaller to reduce the amount of time that
will be spent re-running your match batches. For more information, see
"Number of Rows per Match Job Batch Cycle" on page 368 and "Match Jobs" on
page 547.

Run-Time Execution Flow of the Match Process
This section describes the overall sequence of activities that occur during the
execution of the match process. The following figure provides an overview of
the flow, which is determined by the configured match/search strategy for the
base object:

Cycles for Merge and Auto Match and Merge Jobs

The Merge job executes the match process for a single match batch (see
"Flagging the Match Batch" on page 251). The Auto Match and Merge job cycles
repeatedly until there are no more records to match (no more base object
records with a CONSOLIDATION_IND = 4).

Base Object Records Excluded from the Match Process

The following base object records are ignored during the match process:
• Records with a CONSOLIDATION_IND of 9 (on hold).
• Records with a PENDING or DELETED status. PENDING records can be
included if explicitly enabled according to the instructions in "Enabling
Match on Pending Records" on page 163.
• Records that are manually excluded according to the instructions in
"Excluding Records from the Match Process" on page 441

Regenerating Match Tokens If Needed

When the match process (such as a Match or Auto Match and Merge job)
executes, it first checks to determine whether match tokens need to be
generated for any records in the base object and, if so, generates the match
tokens and updates the match key table. Match tokens will be generated if the
c_repos_table.STRIP_INCOMPLETE_IND flag for the base object is 1, or if any
base object records have a DIRTY_IND=1 (see "Base Object Records Flagged
for Tokenization" on page 243). For more information, see "Match Tokens and
Match Keys" on page 240.

Flagging the Match Batch

The match process cycles through a series of batches until there are no more
base object records to process. It matches a subset of base object records
(the match batch) against all the records available for matching in the base
object (the match pool). The size of the match batch is determined by the
Number of Rows per Match Job Batch Cycle setting ("Number of Rows per
Match Job Batch Cycle" on page 368).

For the match batch, the match process retrieves, in no specific order, base
object records that meet the following conditions:
• the record has a CONSOLIDATION_IND value of 4 (ready for match)
The load process sets the CONSOLIDATION_IND to 4 for any record that is
new (load insert) or updated (load update).
• the record qualifies based on rule set filtering, if configured (see "Enable
Filtering" on page 403 and "Filtering SQL" on page 403)

Internally, the match process sets CONSOLIDATION_IND=3 for the records in
the match batch. At the end, the match process changes this setting to
CONSOLIDATION_IND=2 (match is complete).

Note: Conflicts can arise if base object records already have
CONSOLIDATION_IND=3. For example, if rule set filtering is used, the match
process performs an internal consistency check and displays an error if there
is a mismatch between the expected number of records meeting the filter
condition and the actual number of records in which CONSOLIDATION_IND=3.
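
Because these indicator values drive the batch cycle, a simple distribution
query shows how far a match run has progressed. A minimal sketch, assuming
a C_PARTY base object:

    -- 4 = ready for match, 3 = in the current match batch,
    -- 2 = match complete, 9 = on hold
    SELECT consolidation_ind, COUNT(*) AS record_count
    FROM   c_party
    GROUP  BY consolidation_ind
    ORDER  BY consolidation_ind;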

Applying Match Rules and Generating Matches

In this step, the match process applies the configured match rules to the
match candidates. The match process executes the match rules one at a time,
in the configured order. The match process executes exact-match rules and
exact match-column rules first, then it executes fuzzy-match rules.

For a match to be declared:
• all match columns in a match rule must pass
• only one match rule needs to pass

The match process continues executing the match rules until there is a match
or there are no more rules to execute.
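
To make the rule semantics concrete: match columns combine with AND
within a rule, and rules combine with OR across the rule set. The following
self-join is only a conceptual sketch of one exact-match rule on hypothetical
LAST_NAME and FIRST_NAME match columns; it is not how Informatica MDM
Hub executes match rules internally:

    -- One exact-match rule: both columns must agree (AND within the rule);
    -- a second rule would be a separate query (OR across rules)
    SELECT a.rowid_object, b.rowid_object AS rowid_object_matched
    FROM   c_party a
    JOIN   c_party b
      ON   a.last_name  = b.last_name
     AND   a.first_name = b.first_name
     AND   a.rowid_object < b.rowid_object;  -- skip self and mirror pairs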

Populating the Match Table with Match Pairs

When all of the records in the match batch have been processed, the match
process adds all of the matches for that group to the match table and changes
CONSOLIDATION_IND=2 for the records in the match batch.

Match Pairs

The match process populates a match table for that base object. Each row in
the match table represents a pair of matched records in the base object. The
match table stores the ROWID_OBJECT values for each pair of matched
records, as well as the identifier for the match rule that resulted in the match,
an automerge indicator, and other information.

Columns in the Match Table

Match (_MTCH) tables have the following columns:
• ROWID_OBJECT (CHAR(14)): Identifies one of the records in the matched
  pair.
• ROWID_OBJECT_MATCHED (CHAR(14)): Identifies the record that matched
  the record specified in ROWID_OBJECT.
• ORIG_ROWID_OBJECT_MATCHED (CHAR(14)): Identifies the original
  record that was matched to (prior to merge).
• MATCH_REVERSE_IND (NUMBER(38)): Indicates the direction of the
  original match. One of the following values:
  • Zero (0): ROWID_OBJECT matched ROWID_OBJECT_MATCHED.
  • One (1): ROWID_OBJECT_MATCHED matched ROWID_OBJECT.
• ROWID_USER (CHAR(14)): User who executed the match process.
• ROWID_MATCH_RULE (CHAR(14)): Identifies the match rule that was used
  to match the two records.
• AUTOMERGE_IND (NUMBER(38)): Specifies whether the base object
  records in the match pair qualify for automatic consolidation during the
  consolidate process. One of the following values:
  • Zero (0): Records do not qualify for automatic consolidation.
  • One (1): Records do qualify for automatic consolidation.
  • Two (2): Records are pending. For Build Match Group (BMG), do not
    build groups with PENDING records. PENDING records are to be left as
    individual matches.
  The Automerge and Autolink jobs process any records with an
  AUTOMERGE_IND of 1. For more information, see "Automerge Jobs" on
  page 534 and "Autolink Jobs" on page 532.
• CREATOR (VARCHAR2(50)): User or process responsible for creating the
  record.
• CREATE_DATE (DATE): Date on which the record was created.
• UPDATED_BY (VARCHAR2(50)): User or process responsible for the most
  recent update to the record.
• LAST_UPDATE_DATE (DATE): Date on which the record was last updated.
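
With these columns in mind, you can review the output of a match run
directly. A minimal read-only sketch, assuming a C_PARTY base object and its
C_PARTY_MTCH match table:

    -- Review match pairs, the rule that produced them, and the
    -- automerge flag (1 = automatic, 0 = manual review)
    SELECT rowid_object, rowid_object_matched, rowid_match_rule,
           automerge_ind
    FROM   c_party_mtch
    ORDER  BY rowid_match_rule;

    -- Summarize the manual vs. automatic consolidation workload
    SELECT automerge_ind, COUNT(*) AS pair_count
    FROM   c_party_mtch
    GROUP  BY automerge_ind;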

Flagging Matched Records for Automatic or Manual Consolidation

Match rules also determine how matched records are consolidated:
automatically or manually.
• Automatic consolidation: Identifies records in the base object that can be
  consolidated automatically, without manual intervention. For more
  information, see "Automerge Jobs" on page 534.
• Manual consolidation: Identifies records in the base object that have
  enough points of similarity to warrant attention from a data steward, but
  not enough points of similarity to automatically consolidate them. The
  data steward uses the Merge Manager to review and manually merge
  records. For more information, see the Informatica MDM Hub Data
  Steward Guide.

For more information, see "Specifying Consolidation Options for Matched
Records" on page 408.

Managing the Match Process


To manage the match process, refer to the following topics in this
documentation:
Configuration:
• "Configuring the Match Process" on page 363
  • "Configuring Match Properties for a Base Object" on page 366
  • "Configuring Match Paths for Related Records" on page 373
  • "Configuring Match Columns" on page 387
  • "Configuring Match Rule Sets" on page 399
  • "Configuring Match Column Rules for Match Rule Sets" on page 407
  • "Configuring Primary Key Match Rules" on page 434
  • "Investigating the Distribution of Match Keys" on page 438
  • "Excluding Records from the Match Process" on page 441
• "Configuring International Data Support" on page 698
  • "Configuring Match Settings for Non-US Populations" on page 699

Execution:
• "Using Batch Jobs" on page 496
  • "Auto Match and Merge Jobs" on page 532
  • "External Match Jobs" on page 535
  • "Generate Match Tokens Jobs" on page 540
  • "Key Match Jobs" on page 541
  • "Match Jobs" on page 547
  • "Match Analyze Jobs" on page 550
  • "Match for Duplicate Data Jobs" on page 552
  • "Reset Links Jobs" on page 555
  • "Reset Match Table Jobs" on page 555
• "Writing Custom Scripts to Execute Batch Jobs" on page 559
  • "Auto Match and Merge Jobs" on page 570
  • "External Match Jobs" on page 572
  • "Generate Match Token Jobs" on page 573
  • "Key Match Jobs" on page 579
  • "Match Jobs" on page 586
  • "Match Analyze Jobs" on page 588
  • "Match for Duplicate Data Jobs" on page 589

Application Development:
• Informatica MDM Hub Services Integration Framework Guide

Consolidate Process

This section describes concepts and tasks associated with the consolidate
process in Informatica MDM Hub.

About the Consolidate Process


After match pairs have been identified in the match process, consolidation is
the process of merging or linking data from matched records into a single
master record.

The following figure shows cell data in records from three different source
systems being consolidated into a single master record.

Consolidating Records Automatically or Manually

As described in "Flagging Matched Records for Automatic or Manual
Consolidation" on page 254, match rules set the AUTOMERGE_IND column in
the match table to specify how matched records are consolidated:
automatically or manually.
• Records flagged for manual consolidation are reviewed by a data steward
using the Merge Manager tool. For more information, see the Informatica
MDM Hub Data Steward Guide.
• Records flagged for automatic consolidation are automatically merged
(see "Automerge Jobs" on page 534). Alternately, you can run the
automatch-and-merge job (see "Auto Match and Merge Jobs" on page 532)
for a base object, which calls the match and then automerge jobs
repeatedly, until either all records in the base object have been checked
for matches, or the maximum number of records for manual consolidation
is reached.

Consolidate Data Flow

The following figure shows the consolidate process in relation to other
Informatica MDM Hub processes.

Traceability

The goal in Informatica MDM Hub is to identify duplicate records and to merge
or link them into a single, consolidated record while maintaining full
traceability. Traceability is Informatica MDM Hub functionality
that maintains knowledge about which systems—and which records from
those systems—contributed to consolidated records. Informatica MDM Hub
maintains traceability using cross-reference and history tables.
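
Cross-reference tables make this lineage directly queryable. The following
read-only query is a minimal sketch, assuming a C_PARTY base object whose
cross-reference table is named C_PARTY_XREF (the table and column names
here follow common conventions but are assumptions):

    -- List the source systems and source keys that contributed
    -- to one consolidated base object record
    SELECT rowid_system, pkey_src_object, last_update_date
    FROM   c_party_xref
    WHERE  rowid_object = '<rowid_object_value>';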

Key Configuration Settings for the Consolidate Process

The following configurable settings affect the consolidate process.
• Base object style: Determines whether the consolidate process uses
  merging or linking. For more information, see "Base Object Style" on page
  93 and "Consolidation Options" on page 258.
• Immutable sources: Allows you to specify source systems as immutable,
  meaning that records from that source system will be accepted as unique
  and, once a record from that source has been fully consolidated, it will not
  be changed subsequently. For more information, see "Immutable Rowid
  Object" on page 443.
• Distinct systems: Allows you to specify source systems as distinct,
  meaning that the data from that system gets inserted into the base object
  without being consolidated. For more information, see "Distinct Systems"
  on page 445.
• Cascade unmerge for child base objects: Allows you to enable cascade
  unmerging for child base objects and to specify what happens if records in
  the parent base object are unmerged. For more information, see
  "Unmerge Child When Parent Unmerges (Cascade Unmerge)" on page 446.
• Child base object records on parent merge: For two base objects in a
  parent-child relationship, if enabled on the child base object, child records
  are resubmitted for the match process if parent records are consolidated.
  For more information, see "Requeue On Parent Merge" on page 91.

Consolidation Options
There are two ways to consolidate matched records:
• Merging (physical consolidation) combines the matched records and
updates the base object. Merging occurs for merge-style base objects (link
is not enabled).
• Linking (virtual consolidation) creates a logical link between the matched
records. Linking occurs for link-style base objects (link is enabled).

By default, base object consolidation is physically saved, so merging is the
default behavior. For more information, see "Base Object Style" on page 93.

Merging combines two or more records in a base object table. Depending on
the degree of similarity between the two records, merging is done
automatically or manually.
• Records that are definite matches are automatically merged (automerge
process). For more information, see "Automerge Jobs" on page 534.
• Records that are close but not definite matches are queued for manual
review (manual merge process) by a data steward in the Merge Manager
tool. The data steward inspects the candidate matches and selectively
chooses matches that should be merged. Manual merge match rules are
configured to identify close matches. For more information, see "Manual
Merge Jobs" on page 545 and, for the Merge Manager, see the Informatica
MDM Hub Data Steward Guide.
• Informatica MDM Hub queues all other records for manual review by a
data steward in the Merge Manager tool.

Match rules are configured to identify definite matches for automerging and
close matches for manual merging.

To allow Informatica MDM Hub to automatically change the state of such
records to Consolidated (thus removing them from the Data Steward’s queue),
you can check (select) the Accept all other unmatched rows as unique
check box. For more information, see "Accept All Unmatched Rows as Unique"
on page 369.

Best Version of the Truth

For a base object, the best version of the truth (sometimes abbreviated as
BVT) is a record that has been consolidated with the best cells of data from the
source records. The precise definition depends on the base object style:
• For merge-style base objects, the base object record is the BVT record,
  and is built by consolidating the most-trustworthy cell values from the
  corresponding source records.
• For link-style base objects, the BVT Snapshot job builds the BVT record(s)
  by consolidating the most-trustworthy cell values from the corresponding
  linked base object records and returns a snapshot to the requestor for
  consumption.

Consolidation and Workflow Integration


For state-enabled base objects, consolidation behavior is affected by the
current system state of records in the base object. For example, only ACTIVE
records can be automatically consolidated—records with a PENDING or
DELETED system state cannot be. To understand the implications of system
states during consolidation, refer to the following topics:
• "State Management" on page 159, especially "State Transition Rules for
State Management" on page 161 and "Hub States and Base Object Record
Value Survivorship" on page 162
• “Consolidating Data” in the Informatica MDM Hub Data Steward Guide

Managing the Consolidate Process


To manage the consolidate process, refer to the following topics in this
documentation:
Configuration:
• "Configuring the Consolidate Process" on page 443
  • "About Consolidation Settings" on page 443
  • "Changing Consolidation Settings" on page 447

Execution:
• Informatica MDM Hub Data Steward Guide
  • “Managing Data”
  • “Consolidating Data”
• "Using Batch Jobs" on page 496
  • "Accept Non-Matched Records As Unique" on page 532
  • "Auto Match and Merge Jobs" on page 532
  • "Autolink Jobs" on page 532
  • "Automerge Jobs" on page 534
  • "BVT Snapshot Jobs" on page 535
  • "Manual Link Jobs" on page 545
  • "Manual Merge Jobs" on page 545
  • "Manual Unlink Jobs" on page 546
  • "Manual Unmerge Jobs" on page 546
  • "Multi Merge Jobs" on page 552
  • "Reset Links Jobs" on page 555
  • "Reset Match Table Jobs" on page 555
  • "Synchronize Jobs" on page 557
• "Writing Custom Scripts to Execute Batch Jobs" on page 559
  • "Auto Match and Merge Jobs" on page 570
  • "Autolink Jobs" on page 570
  • "Automerge Jobs" on page 571
  • "BVT Snapshot Jobs" on page 572
  • "Manual Link Jobs" on page 581
  • "Manual Unlink Jobs" on page 582
  • "Manual Unmerge Jobs" on page 583

Application Development:
• Informatica MDM Hub Services Integration Framework Guide

Publish Process

This section describes concepts and tasks associated with the publish process
in Informatica MDM Hub.

About the Publish Process


This section describes how Informatica MDM Hub integrates with external
systems by generating XML messages about data changes in the Hub Store
and publishing these messages to an outbound Java Messaging System (JMS)
queue—also known as a message queue in the Hub Console.

Other external systems, processes, or applications can listen on the JMS
message queue, retrieve the XML messages, and process them accordingly.

Informatica MDM Hub supports two JMS models:
• point-to-point: specific destination for a target external system
• publish/subscribe: point-to-point to an Enterprise Service Bus (ESB),
  then publish/subscribe from the ESB to other systems.

Using the Publish Process is Optional

Informatica MDM Hub implementations use the publish process in support of
stated business and technical requirements. However, not all organizations
will take advantage of this functionality, and its use in Informatica MDM Hub
implementations is optional.

Publish Process is Part of the Informatica MDM Hub Distribution Flow

The processes previously described in this chapter—land, stage, load, match,
and consolidate—are all associated with reconciliation, which is the main
inbound flow for Informatica MDM Hub. With reconciliation, Informatica MDM
Hub receives data from one or more source systems, cleanses the data if
applicable, and then reconciles “multiple versions of the truth” to arrive at the
master record—the best version of the truth—for that entity.

In contrast, the publish process belongs to the main Informatica MDM Hub
outbound flow—distribution. Once the master record is established or updated
for a given entity, Informatica MDM Hub can then (optionally) distribute the
master record data to other applications or databases. For an introduction to
reconciliation and distribution, see the Informatica MDM Hub Overview. In
another scenario, data changes can be sent to the Activity Manager Rules
queue so that the data change can be evaluated against user-defined rules.

Publish Process Executes By Message Triggers

The land, stage, load, match, and consolidate processes work with batches of
records and are executed as batch jobs or stored procedures. In contrast, the
publish process is executed as the result of a message trigger that executes
when a data change occurs in the Hub Store. The message trigger creates an
XML message that gets published on a JMS message queue.

Outbound JMS Message Queues

Informatica MDM Hub uses an outbound message queue as a communication
channel to feed data changes back to external systems. Informatica supports
embedded message queues, which use the JMS providers that come with
application servers. An embedded message queue uses the JNDI name of the
ConnectionFactory and the name of the JMS queue to connect with. It requires
those JNDI names to have been set up by the application server. The Hub
Console allows you to register message queue servers and message queues
that have already been configured in the application server environment.

ORS-specific XML Message Schemas

XML messages are created using an ORS-specific schema file (<ors-name>-
siperian-mrm-event.xsd) that is based on a common XML schema (siperian-
mrm-events.xsd). You use the JMS Event Schema Manager to generate this
ORS-specific schema. This is a required task for setting up the publish
process. For more information, see "Generating and Deploying ORS-specific
Schemas" on page 617.

Run-time Flow of the Publish Process


The following figure shows the run-time flow of the publish process.

In this scenario:
1. A batch load or a real-time SIF API request (SIF put or cleanse_put
request) may result in an insert or update on a base object.

You can configure a message rule to control data going to the C_REPOS_
MQ_DATA_CHANGE table.
2. The Hub Server polls data from the C_REPOS_MQ_DATA_CHANGE table at
regular intervals.
3. For data that has not been sent, the Hub Server constructs an XML
message based on the data and sends it to the outbound queue configured
for the message queue.
4. It is the external application's responsibility to retrieve the message from
the outbound queue and process it.
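
When troubleshooting, it can help to see what is waiting in the change-capture
table before the Hub Server packages it. A minimal read-only sketch; the
C_REPOS_MQ_DATA_CHANGE table is named above, but the columns used
here are assumptions that may differ by version:

    -- Inspect pending data-change rows awaiting publication
    -- (column names are illustrative; check your repository schema)
    SELECT rowid_object, create_date
    FROM   c_repos_mq_data_change
    ORDER  BY create_date;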

Managing the Publish Process


To manage the publish process, refer to the following topics in this
documentation:
Configuration:
• "Configuring the Publish Process" on page 449
  • "Configuring Global Message Queue Settings" on page 451
  • "Configuring Message Queue Servers" on page 452
  • "Configuring Outbound Message Queues" on page 454
  • "Configuring Message Triggers" on page 456
  • "Generating and Deploying ORS-specific Schemas" on page 617

Execution:
• Informatica MDM Hub publishes an XML message to an outbound message
  queue whenever a message trigger is fired. You do not need to explicitly
  execute a batch job from the Batch Viewer or Batch Group tool.
• To monitor run-time activity for message queues using the Audit Manager
  tool in the Hub Console, see "Auditing Message Queues" on page 689.

Application Development:
• Informatica MDM Hub Services Integration Framework Guide

Chapter 10: Configuring the Land Process

This chapter explains how to configure the land process for your Informatica
MDM Hub implementation. For an introduction, see "Land Process" on page
221.

Chapter Contents
• "Before You Begin" on page 264
• "Configuration Tasks for the Land Process" on page 264
• "Configuring Source Systems" on page 264
• "Configuring Landing Tables" on page 269

Before You Begin


Before you begin to configure the land process, you must have completed the
following tasks:
• Installed Informatica MDM Hub and created the Hub Store according to the
instructions in the Informatica MDM Hub Installation Guide
• Built the schema, including defining base objects, according to the
instructions in "Building the Schema" on page 73
• Learned about the land process described in "Land Process" on page 221

Configuration Tasks for the Land Process


To set up the land process for your Informatica MDM Hub implementation, you
must complete the following tasks in the Hub Console:
• "Configuring Source Systems" on page 264
• "Configuring Landing Tables" on page 269

Configuring Source Systems


This section describes how to define source systems for your Informatica MDM
Hub implementation. For an introduction, see "Land Process" on page 221.

About Source Systems
Source systems are external applications or systems that provide data to
Informatica MDM Hub. In order to manage input from various source systems,
Informatica MDM Hub requires a unique internal name for each source
system. You use the Systems and Trust tool in the Model workbench to define
source systems for your Informatica MDM Hub implementation.

Configuring Trust for Source Systems

If multiple source systems contribute data for the same column in a base
object, you can configure trust on a column-by-column basis to specify which
source system(s) are more reliable providers of data (relative to other source
systems) for that column. Trust is used to determine survivorship when two
records are consolidated, and whether updates from a source system are
sufficiently reliable to update the “best version of the truth” record. For more
information, see "Configuring Trust for Source Systems" on page 344.

Administration Source System

Informatica MDM Hub uses an administration source system for manual trust
overrides and data edits from the Data Manager or Merge Manager tools,
which are described in the Informatica MDM Hub Data Steward Guide. This
administration source system can contribute data to any trust-enabled
column. The administration source system is named Admin by default, but you
can optionally change its name according to the instructions in "Editing Source
System Properties" on page 267.

Informatica System Repository Table

The source systems that you define in the Systems and Trust tool are stored in
a special public Informatica MDM Hub repository table (C_REPOS_SYSTEM,
with a display name of MRM System). This table is visible in the Schema
Manager if the Show System Tables option is selected (for more information,
see "Changing the Item View" on page 40). C_REPOS_SYSTEM can also be
used in packages, as described in "Configuring Packages" on page 151.

Warning: The C_REPOS_SYSTEM table contains Informatica MDM Hub
metadata. As with any Informatica MDM Hub system tables, you should
never alter the structure of, or data in, the C_REPOS_SYSTEM table. Doing so
causes Informatica MDM Hub to behave unpredictably and can result in data
loss.

Starting the Systems and Trust Tool


To start the Systems and Trust tool:

• In the Hub Console, expand the Model workbench, and then click Systems
and Trust.

The Hub Console displays the Systems and Trust tool.

The Systems and Trust tool displays the following panes:
• Navigation pane:
  Systems: List of every source system that contributes data to Informatica
  MDM Hub, including the administration source system described in
  "Administration Source System" on page 265.
  Trust: Expand the tree to display:
  • base objects containing one or more trust-enabled columns
  • trust-enabled columns (only)
  For more information about configuring trust for base object columns, see
  "Configuring Trust for Source Systems" on page 344.
• Properties pane: Properties for the selected source system. Trust settings
  for the base object column if a base object column is selected.

Source System Properties


A source system definition in Informatica MDM Hub has the following
properties:
• Name: Unique, descriptive name for this source system.
• Primary Key: Primary key for this source system. Unique identifier for
  this system in the ROWID_SYSTEM column of C_REPOS_SYSTEM. Read
  only.
• Description: Optional description for this source system.

Adding Source Systems


Using the Systems and Trust tool, you need to define each source system that
will contribute data to your Informatica MDM Hub implementation.

To add a source system definition:


1. Start the Systems and Trust tool according to the instructions in "Starting
the Systems and Trust Tool" on page 265.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Right-click in the list of source systems and choose Add System.
The Systems and Trust tool displays the New System dialog.

4. Specify the source system properties. For more information, see "Source
System Properties" on page 266.
5. Click OK.
The Systems and Trust tool displays the newly added source system in the
list of source systems.
Note: When you add a source system, Hub Store uses the first 14
characters of the system name (in all uppercase letters) as its primary key
(ROWID_SYSTEM value in C_REPOS_SYSTEM).
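
You can verify the generated key with a read-only query against the
repository table described earlier (never modify C_REPOS_SYSTEM directly);
the NAME column used here is an assumption:

    -- Confirm that ROWID_SYSTEM is the first 14 characters of the
    -- system name, uppercased
    SELECT rowid_system, name
    FROM   c_repos_system
    ORDER  BY name;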

Editing Source System Properties


You can rename any source system, including the administration system (see
"Administration Source System" on page 265). You can change the display
name used in the Hub Console to identify this source system—renaming it has
no effect outside of the Hub Console.

Note: If this source system has already contributed data to your Informatica
MDM Hub implementation, Informatica MDM Hub continues to track the
lineage (history) of data from this source system even after you have
renamed it.

To edit source system properties:


1. Start the Systems and Trust tool according to the instructions in "Starting
the Systems and Trust Tool" on page 265.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the list of source systems, select the source system that you want to
configure.

The screen refreshes, showing the Edit button next to the name and
description fields for the selected source system.

4. Change any of the editable properties. For more information, see "Source
System Properties" on page 266.
5. To change trust settings for a source system, see "Configuring Trust for
Source Systems" on page 344.

6. Click the button to save your changes.

Removing Source Systems


You can remove any source system except:
• the administration system (see "Administration Source System" on page
  265)
• any source system that has contributed data to a staging table after the
  stage process has been run. You can remove a source system only before
  the stage process has copied data from an associated landing table to a
  staging table.
• any source system that is configured as a source for a base object
  (meaning that a staging table associated with a base object points to the
  source system)

Note: Removing a source system deletes only the source system definition in
the Hub Console—it has no effect outside of Informatica MDM Hub.

To remove a source system:


1. Start the Systems and Trust tool according to the instructions in "Starting
the Systems and Trust Tool" on page 265.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the list of source systems, right-click the source system that you want
to remove, and choose Remove System.
The Systems and Trust tool prompts you to confirm deletion.
4. Click Yes.

The Systems and Trust tool removes the source system from the list,
along with any metadata associated with this source system.

Configuring Landing Tables


This section describes how to configure landing tables in your Informatica
MDM Hub implementation. For an introduction, see "Land Process" on page
221.

About Landing Tables


A landing table provides intermediate storage in the flow of data from source
systems into Informatica MDM Hub. In effect, landing tables are “where data
lands” from source systems into the Hub Store. You use the Schema Manager
in the Model workbench to define landing tables.

The manner in which source systems populate landing tables with data is
entirely external to Informatica MDM Hub. The data model you use for
collecting data in landing tables from various source systems is also external
to Informatica MDM Hub. One source system could populate multiple landing
tables. A single landing table could receive data from different source
systems. The data model you use is entirely up to your particular
implementation requirements.

Inside Informatica MDM Hub, however, landing tables are mapped to staging
tables, as described in "Mapping Columns Between Landing and Staging
Tables" on page 286. It is in the staging table—mapped to a landing table—
where the source system supplying the data to the base object is identified.
During the stage process, Informatica MDM Hub copies data from a landing
table to a target staging table, tags the data with the source system
identification, and optionally cleanses data in the process. A landing table can
be mapped to one or more staging tables. A staging table is mapped to only
one landing table.

As described in "Ways to Populate Landing Tables" on page 223, landing tables


are populated using batch or real-time approaches that are external to
Informatica MDM Hub. After a landing table is populated, the stage process
pulls data from the landing tables, further cleanses the data if appropriate,
and then populates the appropriate staging tables. For more information, see
"Stage Process" on page 224.

Landing Table Columns

Landing tables have two types of columns:
• System columns: Columns that are automatically created and maintained
  by the Schema Manager.
• User-defined columns: Columns that have been added by users according
  to the instructions in "Configuring Columns in Tables" on page 102.

Landing tables have only one system column:
• LAST_UPDATE_DATE (DATE): Date on which the record was last updated
  in the source system (for base objects, this will populate LAST_UPDATE_
  DATE and SRC_LUD in the cross-reference table, and may also populate
  LAST_UPDATE_DATE on the base object, depending on trust).

All other columns in the landing table are user-defined columns.

Note: If the source system table has a multiple-column key, concatenate
these columns to produce a single unique VARCHAR value for the primary key
column.
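
For example, a composite source key can be flattened with a delimiter while
populating the landing table. A minimal sketch with hypothetical column
names; choose a delimiter that cannot occur in the key values:

    -- Build a single VARCHAR primary key from a two-column source key
    SELECT region_code || '|' || account_no AS customer_pkey
    FROM   src_customer_extract;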

Landing Table Properties

Landing tables have the following properties:
• Item Type: Type of table that you are adding. Select Landing Table.
• Display Name: Name of this landing table as it will be displayed in the
  Hub Console.
• Physical Name: Actual name of the landing table in the database.
  Informatica MDM Hub will suggest a physical name for the landing table
  based on the display name that you enter.
• Data Tablespace: Name of the data tablespace for this landing table. For
  more information, see the Informatica MDM Hub Installation Guide.
• Index Tablespace: Name of the index tablespace for this landing table.
  For more information, see the Informatica MDM Hub Installation Guide.
• Description: Description of this landing table.
• Create Date: Date and time when this landing table was created.
• Contains Full Data Set: Specifies whether this landing table contains the
  full data set from the source system, or only updates.
  • If selected (default), indicates that this landing table contains the full
    set of data from the source system (such as for the initial data load).
    When this check box is enabled, you can configure Informatica MDM
    Hub’s delta detection feature (see "Configuring Delta Detection for a
    Staging Table" on page 302) so that, during the stage process, only
    changed records are copied to the staging table.
  • If not selected, indicates that this landing table contains only changed
    data from the source system (such as for incremental loads). In this
    case, Informatica MDM Hub assumes that you filtered out unchanged
    records before populating the landing table. Therefore, the stage
    process inserts all records from the landing table directly into the
    staging table. When this check box is cleared, Informatica MDM Hub’s
    delta detection feature is not available.
  Note: You can change this property only when editing the landing table
  properties, as described in "Editing Landing Table Properties" on page 272.

Adding Landing Tables


To add a landing table:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the Landing Tables node.
4. Right-click the Landing Tables node and choose Add Item.
The Schema Manager displays the Add Table dialog box.

5. Specify the properties (described in "Landing Table Properties" on page
270) for this new landing table.
6. Click OK.
The Schema Manager creates the new landing table in the Operational
Reference Store (ORS), along with support tables, and then adds the new
landing table to the schema tree.

7. Configure the columns for your landing table according to the instructions
in "Configuring Columns in Tables" on page 102.
8. If you want to configure this landing table to contain only changed data
from the source system (by clearing the Contains Full Data Set check
box), edit the landing table properties according to the instructions in
"Editing Landing Table Properties" on page 272.

Editing Landing Table Properties


To edit properties in a landing table:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the landing table that you want to edit.
The Schema Manager displays the Landing Table Identity pane for the
selected table.

4. Change the landing table properties you want. For more information, see
"Landing Table Properties" on page 270.

5. Click the button to save your changes.


6. Change the column configuration for your landing table, if you want,
according to the instructions in "Configuring Columns in Tables" on page
102.

Removing Landing Tables


To remove a landing table:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, expand the Landing Tables node.
4. Right-click the landing table that you want to remove, and choose
Remove.
The Schema Manager prompts you to confirm deletion.
5. Choose Yes.
The Schema Manager drops the landing table from the database, deletes
any mappings between this landing table and any staging table (but does
not delete the staging table), and removes the deleted landing table from
the schema tree.

Chapter 11: Configuring the Stage Process

This chapter explains how to configure the data staging process for your
Informatica MDM Hub implementation. For an introduction, see "Stage
Process" on page 224. In addition, to learn about cleansing data during the
data staging process, see "Configuring Data Cleansing" on page 307.

Chapter Contents
• "Before You Begin" on page 274
• "Configuration Tasks for the Stage Process" on page 274
• "Configuring Staging Tables" on page 275
• "Mapping Columns Between Landing and Staging Tables" on page 286
• "Using Audit Trail and Delta Detection" on page 300

Before You Begin


Before you begin to configure staging data, you must have completed the
following tasks:
• Installed Informatica MDM Hub and created the Hub Store according to the
instructions in the Informatica MDM Hub Installation Guide
• Built the schema according to the instructions in "Building the Schema"
on page 73
• Learned about the stage process described in "Stage Process" on page 224

Configuration Tasks for the Stage Process


In addition to the prerequisites described in "Before You Begin" on page 274,
to set up the process of staging data in your Informatica MDM Hub
implementation, you must complete the following tasks in the Hub Console:
• "Configuring Staging Tables" on page 275
• "Mapping Columns Between Landing and Staging Tables" on page 286
• "Configuring Data Cleansing" on page 307, if you plan to use Informatica
MDM Hub internal cleansing to normalize your data.

Configuring Staging Tables
This section describes how to configure staging tables in your Informatica
MDM Hub implementation.

About Staging Tables


A staging table provides temporary, intermediate storage in the flow of data
from landing tables into base objects via load jobs (see "Load Jobs" on page
542). Staging tables:
• contain data from one source system for one table in the Hub Store
• are populated from landing tables by stage jobs (see "Stage Jobs" on page
556)
• can be created for base objects

The structure of a staging table is directly based on the structure of the target
object that will contain the consolidated data. You use the Schema Manager in
the Model workbench to configure staging tables.

Note: You must have at least one source system defined before you can
define a staging table. For more information, see "Configuring Source
Systems" on page 264.

Staging Table Columns

Staging tables have two types of columns:
• System columns: Columns that are automatically created and maintained
  by the Schema Manager.
• User-defined columns: Columns that have been added by users. To add
  columns to a staging table, you select from a list of columns that are
  already defined in the base object associated with the staging table. For
  more information, see "Adding Staging Tables" on page 280 and
  "Configuring Columns in Tables" on page 102.

Staging tables have the following system columns:
• PKEY_SRC_OBJECT (VARCHAR(255)): Primary key from the source
  system. This must be unique. If the source record does not have a single
  unique column, then concatenate the values from multiple columns to
  uniquely identify the record. Display name is Pkey Src Object (or, in some
  places, Primary Key from Source System).
• ROWID_OBJECT (CHAR(14)): Primary key. Unique value assigned by
  Informatica MDM Hub during the stage process.
• DELETED_IND (INT): Reserved for future use.
• DELETED_DATE (DATE): Reserved for future use.
• DELETED_BY (VARCHAR(50)): Reserved for future use.
• LAST_UPDATE_DATE (DATE): Date on which the record was last updated
  in the source system. For base objects, this will populate LAST_UPDATE_
  DATE and SRC_LUD in the cross-reference table, and (depending on trust
  settings) may also populate LAST_UPDATE_DATE on the base object.
• UPDATED_BY (VARCHAR(50)): User or process responsible for the most
  recent update.
• CREATE_DATE (DATE): Date on which the record was created.
• CREATOR (VARCHAR(50)): User or process responsible for creating the
  record.
• SRC_ROWID (VARCHAR(30)): Database internal Rowid column that is
  used to uniquely trace records back from the staging table to the landing
  table.
• HUB_STATE_IND (INT): For state-enabled base objects only. Integer
  value indicating the state of this record. Valid values are:
  • 0=Pending
  • 1=Active (Default)
  • -1=Deleted
  For details, see "Hub State Indicator" on page 160.

Staging tables must be based on the columns provided by the source system
for the target base object for which the staging table is defined, even if the
landing tables are shared across multiple source systems. If you do not make
the columns on staging tables source-specific, then you create unnecessary
trust and validation requirements.

Trust is a powerful mechanism, but it carries performance overhead. Use trust


where it is appropriate and necessary, but not where the most recent cell
value will suffice for the surviving record.

If you limit the columns in the staging tables to the columns actually provided
by the source systems, then you can restrict the trust columns to those that
come from two or more staging tables. Use this approach instead of treating
every column as if it comes from every source, which would mean needing to
add trust for every column, and then validation rules to downgrade the trust
on null values for all of the sources that do not provide values for the columns.

More trust columns and validation rules obviously affect the load and the
merge processes. Also, the more trusted columns there are, the longer the
update statements for the control table will be. Bear in mind that Oracle and
DB2 have a 32K limit on the size of the SQL buffer for SQL statements. For
this reason, more than 40 trust columns result in a horizontal split in the
update of the control table—MRM will try to update only 40 columns at a time.

Staging Table Properties

Staging tables have the following properties:
• Display Name: Name of this staging table as it will be displayed in the
  Hub Console.
• Physical Name: Actual name of the staging table in the database.
  Informatica MDM Hub will suggest a physical name for the staging table
  based on the display name that you enter.
• System: Select the source system for this data. For more information,
  see "Configuring Source Systems" on page 264.
• Preserve Source System Keys: Copy key values from the source system
  rather than using Informatica MDM Hub’s internally-generated key values.
  For more information, see "Preserving Source System Keys" on page 277.
• Highest Reserved Key: Specify the amount by which the key is increased
  after the first load. Visible only if the Preserve Source System Keys check
  box is selected. For more information, see "Specifying the Highest
  Reserved Key" on page 278.
• Data Tablespace: Name of the data tablespace for this staging table. For
  more information, see the Informatica MDM Hub Installation Guide.
• Index Tablespace: Name of the index tablespace for this staging table.
  For more information, see the Informatica MDM Hub Installation Guide.
• Description: Description of this staging table.
• Cell Update: Determines whether Informatica MDM Hub updates the cell
  in the target table if the value in the incoming record from the staging
  table is the same. For more information, see "Enabling Cell Update" on
  page 279.
• Columns: Columns in this staging table. For more information, see
  "Configuring Columns in Tables" on page 102.
• Audit Trail and Delta Detection: Configurable after mappings between
  landing and staging tables have been defined. For more information, see
  "Mapping Columns Between Landing and Staging Tables" on page 286.
  • Audit Trail: If enabled, retains the history of the data in the RAW table
    based on the number of loads and timestamps. For more information,
    see "Configuring the Audit Trail for a Staging Table" on page 300.
  • Delta Detection: If enabled, Informatica MDM Hub processes only new
    or changed records and ignores unchanged records. For more
    information, see "Configuring Delta Detection for a Staging Table" on
    page 302.

Preserving Source System Keys

By default, this option is not enabled. During Informatica MDM Hub stage jobs
(see "Stage Jobs" on page 556), for each inbound record of data, Informatica

- 277 -
MDM Hub generates an internal key that it inserts in the ROWID_OBJECT
column of the target base object.

Enable this option when you want to use the value from the primary key
column from the source system instead of Informatica MDM Hub’s internally-
generated key. To enable this option, when adding a staging table to a base
object (see "Adding Staging Tables" on page 280), check (select) the Preserve
Source System Keys check box in the Add staging to Base Object dialog. Once
enabled, during stage jobs, instead of generating an internal key, Informatica
MDM Hub takes the value in the PKEY_SOURCE_OBJECT column from the
staging table and inserts it into the ROWID_OBJECT column in the target base
object.

Note: Once a base object is created, you cannot change this setting.

Note: During the stage process, if multiple records contain the same PKEY_
SRC_OBJECT, the surviving record is the one with the most recent LAST_
UPDATE_DATE. The other records are sent to the reject table. For more
information, see "Tables Associated With the Stage Process" on page 225 and
"Survivorship and Order of Precedence" on page 221.

Specifying the Highest Reserved Key

If the Preserve Source System Keys check box is enabled, then the Schema
Manager displays the Highest Reserved Key field. If you want to insert a gap
between the source key and Informatica MDM Hub’s key, then enter the
amount by which the key is increased after the first load.

Note: Set the Highest Reserved Key to the upper boundary of the source
system keys. To allow a margin, set this number slightly higher, adding a
buffer to the expected range of source system keys. Any records added to the
base object that do not contain this key will be given a key by Informatica
MDM Hub that is above the highest reserved value you set.

Enabling this option has the following consequences when the base object is
first loaded:
1. From the staging table, Informatica MDM Hub takes the value in PKEY_
SOURCE_OBJECT and inserts that into the base object’s ROWID_OBJECT—
instead of generating Informatica MDM Hub’s internal key.
2. Informatica MDM Hub then resets the key's starting position to MAX
(PKEY_SOURCE_OBJECT) + the GAP value.
3. On the next load for this staging table, Informatica MDM Hub continues to
use the PKEY_SOURCE_OBJECT. For loads from other staging tables, it
uses the Informatica MDM Hub-generated key.

Note: Only one staging table per base object can have this option enabled
(even if it is from the same system). The reserved key range is set at the
initial load only.
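
The arithmetic behind the reserved range is easy to verify. A minimal sketch,
assuming a hypothetical staging table C_PARTY_STG, numeric source keys,
and a gap of 1000:

    -- After the first load, internal key generation resumes above
    -- MAX(PKEY_SRC_OBJECT) plus the configured gap
    SELECT MAX(TO_NUMBER(pkey_src_object)) + 1000 AS next_key_start
    FROM   c_party_stg;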

Enabling Cell Update

By default, during the stage process (see "Stage Jobs" on page 556), for each
inbound record of data, Informatica MDM Hub replaces the cell value in the
target base object whenever an incoming record has a higher trust level—
even if the value it replaces is identical. Even though the value has not
changed, Informatica MDM Hub updates the last update date for the cell to the
date associated with the incoming record, and assigns to the cell the same
trust level as a new value. For more information, see "Configuring Trust for
Source Systems" on page 344.

You can change this behavior by checking (selecting) the Cell Update check
box when configuring a staging table. If cell update is enabled, then during
Stage jobs, Informatica MDM Hub will compare the cell value with the current
contents of the cross-reference table before it updates the target record in the
base object. If the cross-reference record for this system has an identical
value in this cell, then Informatica MDM Hub will not update the cell in the Hub
Store. Enabling cell update can increase performance during Stage jobs if your
Informatica MDM Hub implementation does not require updates to the last
update date and trust value in the target base object record.

Properties for Columns in Staging Tables

Columns in staging tables have the following properties:
• Column: Name of this column as defined in the associated base object.
• Lookup System: Name of the lookup system if the Lookup Table is a
  cross-reference table.
• Lookup Table: For foreign key columns in the staging table, the name of
  the table containing the lookup column.
• Lookup Column: For foreign key columns in the staging table, the name
  of the lookup column in the lookup table. For more information, see
  "Configuring Lookups For Foreign Key Columns" on page 283.
• Allow Null Update: Determines whether null updates are allowed when a
  Load job specifies a null value for a cell that already contains a non-null
  value.
  • Check (select) this check box to have the Load job update the cell. Do
    this if you want Informatica MDM Hub to update the cell value even
    though the new value would be null.
  • Uncheck (clear, the default) this check box to prevent null updates and
    retain the existing non-null value.
  Exception: The Allow Null Update off flag is ignored when a null value is
  used in a load update that is performed against a base object that has
  only one cross-reference.
• Allow Null Foreign Key: Determines whether null foreign keys are
  allowed. Use this option only if null values are valid for the foreign key
  relationship—that is, if the foreign key is an optional relationship. For
  more information, see "Configuring Lookups For Foreign Key Columns" on
  page 283.
  • Check (select) this check box to allow data to be loaded when the child
    record does not contain a value for the lookup operation.
  • Uncheck (clear, the default) this check box to prevent null foreign
    keys. In this case, records with null values in the lookup column will be
    written to the rejects table instead of being loaded.

Adding Staging Tables


To add a staging table:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, expand the Base Objects node.
4. In the schema tree, expand the node for the base object associated with
this staging table.
5. If you want to add a staging table to this base object, right-click the
Staging Tables node and choose Add Staging Table.
The Schema Manager displays the Add Staging to Base Object dialog.

6. Specify the staging table properties. For more information, see "Staging
Table Properties" on page 277.

Note: Some of these settings cannot be changed after the staging table
has been added, so make sure that you specify the settings you want
before closing this dialog.
7. From the list of the columns in the base object, select all of the columns
that this source system will provide. For more information, see "Staging
Table Columns" on page 275.

• Click the Select All button to select all of the columns without
needing to click each column individually.

• Click the Clear All button to unselect all selected columns.


These staging table columns inherit the properties of their corresponding
columns in the base object. You can select columns, but you cannot change
their inherited data types and column widths.
Note: The Rowid Object and the Last Update Date are automatically
selected. You cannot uncheck these columns or change their properties.
8. Specify column properties. For more information, see "Properties for
Columns in Staging Tables" on page 279.
9. For each column that has an associated foreign key relationship, select
the row and click the Edit Lookup button to define the lookup column. For
more information, see "Configuring Lookups For Foreign Key Columns" on page
283.
Note: You will not be able to save this new staging table unless you
complete this step.
10. Click OK.
The Schema Manager creates the new staging table in the Operational
Reference Store (ORS), along with any support tables, and then adds the
new staging table to the schema tree.
11. If you want, configure an Audit Trail and Delta Detection for this staging
table. For more information, see "Using Audit Trail and Delta Detection" on
page 300.

Changing Properties in Staging Tables


To change properties in a staging table:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.

2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, expand the Base Objects node, and then expand the
node for the base object associated with this staging table.
4. Expand the Staging Tables node to display the staging tables for this
base object.
5. Select the staging table that you want to configure.
The Schema Manager displays the properties for the selected table.

6. Specify the staging table properties. For more information, see "Staging
Table Properties" on page 277.
For each property that you want to edit (Display Name and Description),
click the Edit button next to it, and specify the new value.
Note: You can change the source system only if the staging table and its
related support tables (raw, opl, and prl tables) are empty. You cannot
change the source system if the staging table or its related tables contain
data.
7. From the list of the columns in the base object, change the columns that
this source system will provide.

• Click the Select All button to select all of the columns without
needing to click each column individually.

• Click the Clear All button to unselect all selected columns.


Note: The Rowid Object and the Last Update Date are automatically
selected. You cannot uncheck these columns or change their properties.
8. If you want, change column properties. For more information, see
"Properties for Columns in Staging Tables" on page 279.

9. If you want, change lookups for foreign key columns. Select the column
and click the Edit Lookup button to configure the lookup column. For more
information, see "Configuring Lookups For Foreign Key Columns" on page
283.
10. If you want to change cell updating (see "Enabling Cell Update" on page
279), click the Cell Update check box.
11. Change the column configuration for your staging table, if you want. For
more information, see "Configuring Columns in Tables" on page 102.
12. If you want, configure an Audit Trail and Delta Detection for this staging
table. For more information, see "Using Audit Trail and Delta Detection" on
page 300.

13. Click the Save button to save your changes.

Jumping to the Source System for a Staging Table


To view the source system associated with a staging table:
• Right-click the staging table and choose Jump to Source System.

The Hub Console launches the Systems and Trust tool and displays the source
system associated with this staging table. For more information, see
"Configuring Source Systems" on page 264.

Configuring Lookups For Foreign Key Columns


This section describes how to configure lookups for foreign key columns in
staging tables associated with base objects.

About Lookups

A lookup is the process of retrieving a data value from a parent table during
Load jobs. In Informatica MDM Hub, when configuring a staging table
associated with a base object, if a foreign key column in the staging table (as
the child table) is related to the primary key in a parent table, you can
configure a lookup to retrieve data from that parent table. The target column
in the lookup table must be a unique column (such as the primary key). For
more information, see "Performing Lookups Needed to Maintain Referential
Integrity" on page 237.

For example, suppose your Informatica MDM Hub implementation had two
base objects: a Consumer parent base object and an Address child base
object, with the following relationship between them:
Consumer.Rowid_object = Address.Consumer_Fkey

In this case, the Consumer_Fkey column is included in the Address staging
table, and it looks up data in a column of the parent Consumer table.

Note: Address.Consumer_Fkey must be the same as Consumer.Rowid_object.

In this example, you could configure three types of lookups:


• to the ROWID_OBJECT (primary key) of the Consumer base object (lookup
table)
• to the PKEY_SRC_OBJECT column (primary key) of the cross-reference
table for the Consumer base object
In this case, you must also define the lookup system. Configuring a lookup
to the PKEY_SRC_OBJECT column of a cross-reference table allows you to
point to parent tables associated with a source system that differs from
the source system associated with this staging table.
• to any other unique column, if available, in the base object or its cross-
reference table

Once the lookup is defined, when the Load job runs on the base object,
Informatica MDM Hub looks up the source system's consumer key value in the
PKEY_SRC_OBJECT (primary key from source system) column of the Consumer
cross-reference table, and returns the ROWID_OBJECT value that corresponds
to the source consumer record.
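
The following SQL is a conceptual sketch only of the two most common lookup
targets; the table and bind-variable names are illustrative and are not the
actual queries that Informatica MDM Hub generates.

-- Lookup against the parent base object's primary key:
SELECT ROWID_OBJECT
FROM C_CONSUMER
WHERE ROWID_OBJECT = :consumer_fkey_value;

-- Lookup against the parent's cross-reference table, qualified by the
-- lookup system:
SELECT ROWID_OBJECT
FROM C_CONSUMER_XREF
WHERE PKEY_SRC_OBJECT = :consumer_fkey_value
  AND ROWID_SYSTEM    = :lookup_system;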

Configuring Lookups

To configure a lookup via foreign key relationship:


1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, expand the Base Objects node, and then expand the
node for the base object associated with this staging table.
4. Select the staging table that you want to configure.
5. Select the row of the foreign key column that you want to configure.

The Edit Lookup button is enabled only for foreign key columns.

6. Click the Edit Lookup button.
The Schema Manager displays the Define Lookup dialog, which contains the
parent base object and its cross-reference table, along with any unique
columns (only).
7. Select the target column for the lookup.
• To define the lookup to a base object, expand the base object and
select Rowid_Object (the primary key for this base object).
• To define the lookup to a cross-reference table, select PKey Src Object
(the primary key for the source system in this cross-reference table).
• To define the lookup to any other unique column, simply select the
column.
Note: When you delete a relationship, it clears the lookup.
8. If the lookup column is PKey Src Object in the relationship table, select
the lookup system from the Lookup System drop-down list.
9. Click OK.
10. If you want, configure the Allow Null Update check box to specify what
will happen if a Load job specifies a null value for a cell that already
contains a non-null value. For more information, see "Properties for Columns
in Staging Tables" on page 279.
11. For each column, configure the Allow Null Foreign Key option to specify
what happens if the foreign key column contains a null value (no lookup
value is available). For more information, see "Properties for Columns in
Staging Tables" on page 279.
12. Click the Save button to save your changes.

Removing Staging Tables


To remove a staging table:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the schema tree, expand the Base Objects node, and then expand the
node for the base object associated with this staging table.
4. Right-click the staging table that you want to remove, and then choose
Remove.
The Schema Manager prompts you to confirm deletion.
5. Choose Yes.
The Schema Manager drops the staging table from the Operational
Reference Store (ORS), deletes associated control tables, and removes the
deleted staging table from the schema tree.

Mapping Columns Between Landing and Staging Tables

This section describes how to configure the mapping between landing and
staging tables. Mapping defines how the data is transferred from landing to
staging tables via Stage jobs.

About Mapping Columns


To give Informatica MDM Hub the ability to move data from a landing table to
a staging table, you need to define a mapping from columns in the landing
table to columns in the staging table. This mapping defines:
• which landing table column is used to populate a column in the staging
table
• what standardization and verification (cleansing) must be done, if any,
before the staging table is populated

Mappings are configured as either SECURE or PRIVATE resources. For more
information, see "Securing Informatica MDM Hub Resources" on page 629.

Relationships Between Landing and Staging Tables

You can map columns from one landing table to multiple staging tables.
However, each staging table is mapped to only one landing table.

Data is Either Cleansed or Passed Through Unchanged

For each column of data in the staging table, the data comes from the landing
column in one of two ways:
Passed through: Informatica MDM Hub copies the data as is, without making
any changes to it. Data comes directly from a column in the landing table.
Cleansed: Informatica MDM Hub standardizes and verifies data using cleanse
functions. The output of the cleanse function becomes the input to the
target column in the staging table. For more information about cleanse
functions, see "Configuring Data Cleansing" on page 307.

In the following figure, data in the Name column is cleansed via a cleanse
function, while data from all other columns is passed directly to the
corresponding target column in the staging table.

Note: A staging table does not need to use every column in the landing table
or every output string from a cleanse function. The same landing table can
provide input to multiple staging tables, and the same cleanse function can be
reused for multiple columns in multiple landing tables.

Decomposition and Aggregation

Cleanse functions can also decompose and aggregate data. Either way, your
mappings need to accommodate the required inputs and outputs.

Cleanse Functions that Decompose Data

In the following figure, the cleanse function decomposes the name field,
breaking the data into smaller pieces.

This cleanse function has one input string and five output strings. In your
mapping, you need to make sure that the input string is mapped to the cleanse
function, and each output string is mapped to the correct target column in the
staging table.

Cleanse Functions that Aggregate Data

In the following figure, the cleanse function aggregates data from five fields
into a single string.

This cleanse function has five input strings and one output string. In your
mapping, you need to make sure that the input strings are mapped to the
cleanse function and the output string is mapped to the correct target column
in the staging table.

Considerations for Column Mappings

When mapping columns, consider the following rules and guidelines:


• The source column must have the same data type as the target column, or
it must be a data type that can be implicitly converted to the target
column’s data type.
• For string (char or varchar) columns, the length does not need to be the
same. When data is loaded from the landing table to the staging table, any
data value that is too long for the target column will trigger Informatica
MDM Hub to place the entire record in a reject table.
• Although more than three columns from the landing table can be mapped
to the Pkey Src Object column in the staging table, index creation is
restricted to only three columns.

Starting the Mappings Tool


To start the Mappings tool:
• In the Hub Console, expand the Model workbench, and then click
Mappings.
The Hub Console displays the Mappings tool.

The Mappings tool displays the following panels:


Mappings List: List of every defined landing-to-staging mapping.
Properties: Properties for the selected mapping.

When you select a mapping in the mappings list, its properties are displayed.

Tabs in the Mappings Tool

When a mapping is selected, the Mappings tool displays the following tabs.
General: General properties for this mapping. For more information, see
"Mapping Properties" on page 289.
Diagram: Interactive diagram that lets you define mappings between columns
in the landing and staging tables. For more information, see "Mapping
Columns Between Landing and Staging Table Columns" on page 291.
Query Parameters: Allows you to specify query parameters for this mapping.
For more information, see "Configuring Query Parameters for Mappings" on
page 294.
Test: Allows you to test the mapping.

Mapping Diagrams

When you click the Diagram tab for a mapping, the Mappings tool displays the
current column mappings.

Mapping lines show the mapping from source columns in the landing table to
target columns in the staging table. Colors in the circles at either end of the
mapping lines indicate data types.

Mapping Properties

Mappings have the following properties.

Name: Name of this mapping as it will be displayed in the Hub Console.
Description: Description of this mapping.
Landing Table: Select the landing table that will be the source of the
mapping.
Staging Table: Select the staging table that will be the target of the
mapping.
Secure Resource: Check (enable) to make this mapping a secure resource,
which allows you to control access to this mapping. Once a mapping is
designated as a secure resource, you can assign privileges to it in the
Secure Resources tool. For more information, see "Securing Informatica MDM
Hub Resources" on page 629, and "Assigning Resource Privileges to Roles" on
page 641.

Adding Mappings
To create a new mapping:
1. Start the Mappings tool according to the instructions in "Starting the
Mappings Tool" on page 288.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Right-click in the area where the mappings are listed and choose Add
Mapping.
The Mappings tool displays the Mapping dialog.

4. Specify the mapping properties. For more information, see "Mapping


Properties" on page 289.
5. Click OK.
The Mappings tool displays the landing table and staging table on the
workspace.
6. Using the workspace tools and the input and output nodes, connect the
column in the landing table to the corresponding column in the staging
table.
Tip: If you want to automatically map columns in the landing table to

columns with the same name in the staging table, click the button.
7. Click OK.

8. When you are finished, click the Save button to save your changes.

Copying Mappings
To create a new mapping by copying an existing one:

1. Start the Mappings tool according to the instructions in "Starting the
Mappings Tool" on page 288.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Right-click the mapping that you want to copy, and then choose Copy
Mapping.
The Mappings tool displays the Mapping dialog.

4. Specify the mapping properties. The landing table is already specified. For
more information, see "Mapping Properties" on page 289.
5. Click OK.

6. Click the Save button to save your changes.

Editing Mapping Properties


To edit the properties of an existing mapping:
1. Start the Mappings tool according to the instructions in "Starting the
Mappings Tool" on page 288.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the mapping that you want to edit.
4. Edit the mapping properties, diagram, and mapping settings as needed.

5. Click the Save button to save your changes.

Mapping Columns Between Landing and Staging Table Columns

You use the Diagram tab in the Mappings tool to define the mappings between
source columns in landing tables and target columns in staging tables. How
you map depends on whether it is a pass-through mapping (directly between
columns) or a cleansed mapping (data is processed by a cleanse function).

For each mapping:
• inputs are columns from the landing table
• outputs are the columns in the staging table

The workspace and the methods of creating a mapping are the same as for
creating cleanse functions. To learn how to use the workspace to define
functions, inputs, and outputs, see "Configuring Graph Functions" on page 321.

Navigate to the Diagrams Tab

To navigate to the Diagrams tab:


1. Start the Mappings tool according to the instructions in "Starting the
Mappings Tool" on page 288.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the mapping that you want to configure.
4. Click the Diagram tab.
The Mappings tool displays the Diagram tab for this mapping.

Mapping Columns Directly

To configure mappings directly between columns in landing and staging
tables:
1. Navigate to the Diagrams tab according to the instructions in "Navigate to
the Diagrams Tab" on page 292.
2. Mouse-over the output connector (circle) to the right of the column in the
landing table (the circle outline turns red), drag the line to the input
connector (circle) to the left of the column in the staging table, and then
release the mouse button.

Note: If you want to load by rowid, create a mapping between the primary
key in the landing table and the Rowid object in the staging table. For more
information, see "Loading by RowID" on page 296.

3. Click the Save button to save your changes.

Mapping Columns Using Cleanse Functions

To cleanse data during Stage jobs, you can include one or more cleanse
functions in your mapping. This section provides brief instructions for
configuring cleanse functions in mappings. For more information, see "Using
Cleanse Functions" on page 314.

To configure mappings between columns in landing and staging tables via
cleanse functions:
1. Navigate to the Diagrams tab according to the instructions in "Navigate to
the Diagrams Tab" on page 292.
2. Add the cleanse function(s) that you want to configure by right-clicking
anywhere in the workspace and choosing the cleanse function that you
want to add.
3. For each input connector on the cleanse function, mouse-over the output
connector from the appropriate column in the landing table, drag the line
to its corresponding input connector, and release the mouse button.
4. Similarly, for each output connector on the cleanse function, mouse-over
the output connector, drag the line to its corresponding column in the
staging table, and release the mouse button.
In the following example, the Titlecase cleanse function will process data
that comes from the Last Name column in the landing table and then
populate the Last Name column in the staging table with the cleansed data.

5. Click the Save button to save your changes.

Note: For column mappings (from landing to staging tables) that use cleanse
functions, cleanse functions can be automatically removed from the mappings
in the following circumstances:
• If you change cleanse engines in your Informatica MDM Hub
implementation and column mappings use cleanse functions that are not
available in the new cleanse engine. Unsupported cleanse functions are
automatically removed.
• If you restart the application server for the Cleanse Match Server and the
cleanse engine fails to initialize for some reason. Even after you resolve
the issue(s) that cause the cleanse engine initialization failure, unavailable
cleanse functions are automatically removed.

In either case, you will need to use the Mappings tool in the Hub Console to
reconfigure the mappings using cleanse functions that are supported in the
current cleanse engine.

Configuring Query Parameters for Mappings


To configure query parameters for a mapping:
1. Start the Mappings tool according to the instructions in "Starting the
Mappings Tool" on page 288.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the mapping that you want to configure.
4. Click the Query Parameters tab.
The Mappings tool displays the Query Parameters tab for this mapping.

5. If you want, check or uncheck the Enable Distinct check box, as
appropriate, to configure distinct mapping. For more information, see
"Distinct Mapping" on page 295.
6. If you want, check or uncheck the Enable Condition check box, as
appropriate, to configure conditional mapping. For more information, see
"Conditional Mapping" on page 296.
If enabled, type the SQL WHERE clause (omitting the WHERE keyword),
and then click Validate to validate the clause.

7. Click the Save button to save your changes.

Filtering Records in Mappings

By default, all records are retrieved from the landing table. Optionally, you
can configure a mapping that filters records in the landing table. There are two
types of filters: distinct and conditional. You configure these settings on the
Query Parameters tab in the Mappings tool. For more information, see
"Configuring Query Parameters for Mappings" on page 294.

Distinct Mapping

If you click the Enable Distinct check box on the Query Parameters tab, the
Stage job selects only the distinct records from the landing table. Informatica
MDM Hub populates the staging table using the following SELECT statement:

select distinct * from landing_table

Using distinct mapping is useful in situations in which you have a single
landing table feeding multiple staging tables and the landing table is
denormalized (for example, it contains both customer and address data). A
single customer could have three addresses. In this case, using distinct

mapping prevents the two extra customer records from being written to the
rejects table.

In another example, suppose a landing table contained the following data:


LUD CUST_ID NAME ADDR_ID ADDR
7/24 1 JOHN 1 1 MAIN ST
7/24 1 JOHN 2 1 MAPLE ST

In the mapping to the customer table, check (select) Enable Distinct to avoid
having duplicate records because only LUD, CUST_ID, and NAME are mapped
to the Customer staging table. With Distinct enabled, only one record would
populate your customer table and no rejects would occur.

Alternatively, for the address mapping, you map ADDR_ID and ADDR with
Distinct disabled so that you get two records and no rejects.
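
As a conceptual sketch only (the landing table name C_CUSTOMER_LND is
illustrative, and Informatica MDM Hub generates its own queries), the two
mappings behave like the following:

-- Customer mapping (LUD, CUST_ID, NAME mapped) with Enable Distinct
-- checked: returns one JOHN row instead of two.
SELECT DISTINCT LUD, CUST_ID, NAME
FROM C_CUSTOMER_LND;

-- Address mapping (ADDR_ID, ADDR mapped) with Enable Distinct cleared:
-- returns both address rows.
SELECT ADDR_ID, ADDR
FROM C_CUSTOMER_LND;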

Conditional Mapping

If you select the Enable Condition check box, you can apply a SQL WHERE
clause that filters the records the Stage job unloads and processes. For
example, suppose the data in your landing table is from all states in the US.
You can use the WHERE clause to filter the data that is written to the
staging tables to include only data from one state, such as California. To do
this, type in a WHERE clause (but omit the WHERE keyword): STATE = 'CA'. When
the cleanse job is run, it unloads and processes records as SELECT * FROM
LANDING WHERE STATE = 'CA'. If you specify conditional mapping, click the
Validate button to validate the SQL statement.
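
For example, a slightly richer condition might be entered as follows. This is
a sketch only; the table and column names are illustrative.

-- Condition entered on the Query Parameters tab (without the WHERE keyword):
--   STATE = 'CA' AND CUST_TYPE <> 'INACTIVE'
-- Effective stage query:
SELECT * FROM C_CUSTOMER_LND
WHERE STATE = 'CA' AND CUST_TYPE <> 'INACTIVE';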

Loading by RowID
You can streamline load, match, and merge processing by explicitly
configuring Informatica MDM Hub to load by RowID. Otherwise, Informatica
MDM Hub loads data according to its default behavior, which is described in
"Run-time Execution Flow of the Load Process" on page 231.

Note: If you clean the base object using the stored procedure, and you have
set up the TAKE-ON GAP for the particular staging table, the ROWID sequences
are reset to 1.

In the staging table, the Rowid Object column (a nullable column) has a
specialized usage. You can streamline load, match, and merge processing by
mapping any column in a landing table to the Rowid Object column in a staging
table. In the following example, the Customer Id column in the landing table is
mapped to the Rowid Object column in the staging table.

Mapping to the Rowid Object column allows for the loading of records by
present- or lineage-based ROWID_OBJECT. During the load, if an incoming
record with a populated ROWID_OBJECT is new (the incoming PKEY_SRC_
OBJECT + ROWID_SYSTEM is checked), then this record bypasses the match
and merge process and gets added to the base object directly—a real-time API
PUT(_XREF) by ROWID_OBJECT. Using this feature enhances lineage and
unmerge support, enables closed-loop integration with downstream systems,
and can increase throughput.
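
The new-record check described above can be sketched as follows; the names
are illustrative only, not the actual internal query.

-- Does this source record already have a cross-reference entry?
SELECT COUNT(*)
FROM C_CUSTOMER_XREF x
WHERE x.PKEY_SRC_OBJECT = :incoming_pkey_src_object
  AND x.ROWID_SYSTEM    = :incoming_rowid_system;
-- A count of zero means the record is new: it bypasses match and merge
-- and is inserted directly with the supplied ROWID_OBJECT.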

The initial data load for a base object inserts all records into the target base
object. Therefore, enable loading by rowID for incremental loads that occur
after the initial data load. For more information, see "Initial Data Loads and
Incremental Loads" on page 229 and "Run-time Execution Flow of the Load
Process" on page 231.

Jumping to a Schema
The Mappings tool allows you to quickly launch the Schema Manager and
display the schema associated with the selected mapping.

Note: The Jump to Schema command is available only in the Workbenches
view, not the Processes view.

To jump to the schema for a mapping:


1. Start the Mappings tool according to the instructions in "Starting the
Mappings Tool" on page 288.
2. Select the mapping whose schema you want to view.
3. In the View By list at the bottom of the navigation pane, choose one of the
following options:
• By Staging Table
• By Landing Table

• By Mapping
4. Right-click anywhere in the navigation pane, and then choose Jump to
Schema.

The Mappings tool displays the schema for the selected mapping.

Testing Mappings
To test a mapping that you have configured:
1. Start the Mappings tool according to the instructions in "Starting the
Mappings Tool" on page 288.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the mapping that you want to configure.
4. Click the Test tab.
The Mappings tool displays the Test tab for this mapping.

5. Specify input values for the columns under Input Name.
6. Click Test.
The Mappings tool tests the mapping and populates the columns under
Output Name with the results.

Removing Mappings
To remove a mapping:
1. Start the Mappings tool according to the instructions in "Starting the
Mappings Tool" on page 288.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Right-click the mapping that you want to remove, and choose Delete
Mapping.
The Mappings tool prompts you to confirm deletion.
4. Click Yes.
The Mappings tool drops supporting tables, removes the mapping from the
metadata, and updates the list of mappings.

Using Audit Trail and Delta Detection
After you have completed mapping columns between landing and staging
tables, you can configure the audit trail and delta detection features for a
staging table. For more information, see "Mapping Columns Between Landing
and Staging Tables" on page 286.

To configure audit trail and delta detection, click the Settings tab.

Configuring the Audit Trail for a Staging Table


Informatica MDM Hub allows you to configure an audit trail that retains the
history of the data in the RAW table based on the number of Loads and
timestamps. This audit trail is useful, for example, when using HDD (Hard
Delete Detection). By default, audit trails are not enabled, and the RAW table
is empty. If enabled, then records are kept in the RAW table for either the
configured number of stage job executions or the specified retention period.

Note: The Audit Trail has very different functionality from—and is not to be
confused with—the Audit Manager tool described in "Auditing Informatica MDM
Hub Services and Events" on page 684.

To configure the audit trail for a staging table:


1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. If you have not already done so, add a mapping for the staging table. For
more information, see "Adding Mappings" on page 290.
4. Select the staging table that you want to configure.

5. At the bottom of the properties panel, click Preserve an audit trail in
the raw table to enable the raw data audit trail.
The Schema Manager prompts you to select the retention period for the
audit table.

6. Select one of the following options for the audit retention period:

Loads: Number of batch loads for which to retain data.
Time Period: Period of time for which to retain data.

7. Click Save to save your changes.

Once configured, the audit trail keeps data for the retention period that you
specified. For example, suppose you configured the audit trail for two loads
(Stage job executions). In this case, the audit trail will retain data for the two
most recent loads to the staging table. If there were ten records in each load
in the landing table, then the total number of records in the RAW table would
be 20.

If the Stage job is run multiple times, then the data in the RAW table will be
retained for the most recent two sets based on the ROWID_JOB. Data for older
ROWID_JOBs will be deleted. For example, suppose the value of the ROWID_
JOB for the first Stage job is 1, for the second Stage job is 2, and so on. When
you run the Stage job a third time, then the records in which ROWID_JOB=1
will be discarded.

Note: If the audit trail is enabled for a staging table and you click the
Clear History button in the Batch Viewer while the associated Stage job is
selected, the records in the RAW and REJ tables will be cleared the next
time the Stage job is run.

Configuring Delta Detection for a Staging Table
If you enable delta detection for a staging table, Informatica MDM Hub
processes only new or changed records and ignores unchanged records.

Enabling Delta Detection on Specific Columns

To enable delta detection for specific columns:


1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Select the Staging Table Properties tab.
3. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

4. Select (check) the Enable Delta detection box.


5. Select (check) the Detect deltas using specific columns button.

6. From the displayed list of available columns, choose the ones that you
want to use for delta detection.

When data is loaded into the staging table, if any column in the defined set
has a value that differs from its value in the previous load, the row is
treated as changed. If all columns in the defined set are unchanged, the row
is treated as unchanged. Columns that are not mapped are ignored.
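
Conceptually, comparing a defined set of columns against the previous load
(PRL) table resembles the following query. This is a sketch with illustrative
names, not the actual implementation.

-- Rows that are new, or whose compared columns changed since the last load:
SELECT l.*
FROM C_CUSTOMER_LND l
LEFT OUTER JOIN C_CUSTOMER_PRL p
  ON p.CUST_ID = l.CUST_ID
WHERE p.CUST_ID IS NULL      -- new record
   OR l.NAME <> p.NAME       -- changed in a compared column
   OR l.ADDR <> p.ADDR;
-- (A real implementation must also treat NULL-to-value changes explicitly,
-- since <> does not match rows where either side is NULL.)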

Enabling Delta Detection for a Staging Table

To enable delta detection for a staging table:


1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Select the staging table that you want to configure.
3. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
4. Select (check) the Enable delta detection check box to enable delta
detection for the table. You might need to scroll down to see this option.

5. Specify the manner in which you want to have deltas detected.

Detect deltas by comparing all columns in mapping: All columns are selected
for delta comparison, including the Last Update Date.
Detect deltas via a date column: If your schema has an applicable date
column, choose this option and select the date column you want to use for
delta comparison. This is the preferred option in cases where you have an
applicable date column.

6. Specify whether to allow staging if a prior duplicate was rejected during
the stage process or load process.
• Select (check) this option to allow the duplicate record being staged,
during this next stage process execution, to bypass delta detection if
its previously-staged duplicate was rejected.
Note: If this option is enabled, and a user in the Batch Viewer clicks
the Clear History button while the associated stage job is selected,
then the history of the prior rejection (that this feature relies on) will
be discarded because the records in the REJ table will be cleared the
next time the stage job is run.
• Clear (uncheck) this option (the default) to prevent the duplicate
record being staged, during this next stage process execution, from
bypassing delta detection if its previously-staged duplicate was
rejected. Delta detection will filter out any corresponding duplicate
landing record that is subsequently processed in the next stage process
execution.

How Informatica MDM Hub Handles Delta Detection

If delta detection is enabled, then the Stage job compares the contents of the
landing table—which is mapped to the selected staging table—against the data

set processed in the previous run of the stage job. This comparison is done to
determine whether the data has changed since the previous run. Changed
records, new records, and rejected records will be put into the staging
table. Duplicate records are ignored. For more information, see "Mapping Columns Between
Landing and Staging Tables" on page 286.

Note: Reject records move from cleanse to load after the second stage run.

Considerations for Using Delta Detection

When using delta detection, consider the following issues:


• Delta detection can be done either by comparing entire records or via a
date column. Delta detection on last update date is the most efficient, as
Informatica MDM Hub can simply compare the last update date columns
for each incoming record against the record’s previous last update date.
• With delta detection, you have the option of excluding from the delta
comparison the landing table column that is mapped to the last_update_date
column in the staging table.
• When processing records by last update date, do not use the Now cleanse
function to compare last update values (for example, testing whether the
last update date in a source record occurred before the current system
date). Using Now in this way can produce unpredictable results. For more
information, see "Configuring Data Cleansing" on page 307.
• Perform delta detection only on columns for those sources where the Last
Update Date is not a true indicator of change. The Informatica MDM Hub
stage job will compare the entire source record against the most recent
corresponding record in the PRL (previous load) table. If any cell is
different, then the record is passed on to the staging table. Delta detection
is being done from the PRL table.
• If the Last Update Date data in the landing table is changed (to an older or
newer date), the record will be inserted into the staging table if the delta
detection is based on the all columns or subset of columns.
• If the delta detection is based on the date column (LUD), then only the
newer LUD date value (in comparison to the corresponding record in the
PRL table, not the max date in the RAW table), will go into the staging
table.
• During delta detection, when you are checking for deltas on all columns,
only records that have null primary keys are rejected. This is expected
behavior. Any other records that fail the delta process are rejected on
subsequent stage processes.
• When delta detection is based on the Last Update Date, any changes to the
last update date or the primary key will be detected. Updates to any

values that are not the last update date or part of the concatenated
primary key will not be detected.
• Duplicate primary keys are not considered during subsequent stage
processes when using delta detection by mapped columns.
• Reject handling allows you to:
• View all reject records for a given staging table, regardless of the
batch job
• View all reject records by day across all staging tables
• Query reject tables based on query filters

Chapter 12: Configuring Data
Cleansing

This chapter describes how to configure your Hub Store to cleanse data during
the stage process. This chapter is a companion to the material provided in
"Configuring the Stage Process" on page 274.

Chapter Contents
• "Before You Begin" on page 307
• "About Data Cleansing in Informatica MDM Hub" on page 307
• "Configuring Cleanse Match Servers" on page 308
• "Using Cleanse Functions" on page 314
• "Configuring Cleanse Lists" on page 333

Before You Begin


Before you begin, you must have completed the following tasks:
• Installed Informatica MDM Hub and created the Hub Store according to the
instructions in the Informatica MDM Hub Installation Guide.
• Built the schema according to the instructions in "Building the Schema" on
page 73
• Created staging tables and landing tables according to the instructions in
"Configuring the Stage Process" on page 274
• Installed and configured your cleanse engine according to the
documentation included in your cleanse engine distribution.

About Data Cleansing in Informatica MDM Hub


Data cleansing is the process of standardizing data to optimize it for input into
the match process. Matching cleansed data results in a greater number of
reliable matches. This chapter describes internal cleansing—the data
cleansing that occurs inside Informatica MDM Hub, specifically during a Stage
job, when data is copied from landing tables to the appropriate staging tables
(see "Configuring the Stage Process" on page 274).

Note: Data cleansing that occurs prior to its arrival in the landing tables is
outside the scope of this chapter.

Setup Tasks for Data Cleansing


To set up data cleansing for your Informatica MDM Hub implementation, you
complete the following tasks:
• "Configuring Cleanse Match Servers" on page 308
• "Using Cleanse Functions" on page 314
• "Configuring Cleanse Lists" on page 333

Configuring Cleanse Match Servers


This section describes how to configure Cleanse Match Servers for your
Informatica MDM Hub implementation. For more information, see "About Data
Cleansing in Informatica MDM Hub" on page 307.

About the Cleanse Match Server


The Cleanse Match Server is a servlet that handles cleanse requests. This
servlet is deployed in an application server environment. The servlet contains
two server components:
• a cleanse server handles data cleansing operations
• a match server handles match operations

The Cleanse Match Server is multi-threaded so that each instance can process
multiple requests concurrently. It can be deployed on a variety of application
servers. See the Informatica MDM Hub Release Notes for a list of supported
application servers. See the Informatica MDM Hub Installation Guide for
instructions on installing and configuring Cleanse Match Server(s).

Informatica MDM Hub supports running multiple Cleanse Match Servers for
each Operational Reference Store (ORS). The cleanse process is generally
CPU-bound. This scalable architecture allows you to scale your Informatica
MDM Hub implementation as the volume of data increases. Deploying Cleanse
Match Servers on multiple hosts distributes the processing load across
multiple CPUs and permits the running of cleanse operations in parallel. In
addition, some external adapters are inherently single-threaded, so this
Informatica MDM Hub architecture allows you to simulate multi-threaded
operations by running one processing thread per application server instance.

Modes of Cleanse Operations

Cleanse operations can be classified according to the following modes:

• Online and Batch (default)
• Online Only
• Batch Only

The CLEANSE_TYPE can be used to specify which class(es) of operations a
particular Cleanse Match Server will run. If you deploy two Cleanse Match
Servers, you could make one batch-only and the other online-only, or you
could make them both accept both classes of requests. Unless otherwise
specified, a Cleanse Match Server will default to running both kinds of
requests.

Distributed Cleanse Match Servers

For your Informatica MDM Hub implementation, you can increase the
throughput of the cleanse process by running multiple Cleanse Match Servers
in parallel. To learn more about distributed Cleanse Match Servers, see the
Informatica MDM Hub Installation Guide.

Cleanse Match Servers and Proxy Users

If proxy users have been configured for your Informatica MDM Hub
implementation and you created proxy_user and cmx_ors with different
passwords, then you need to either:
• restart the application server and log in to the proxy user from the Hub
Console
or
• register the Cleanse Match Server for the proxy user again

Otherwise, Stage jobs will fail.

Cleanse Requests

All requests for cleansing are issued by database stored procedures. These
stored procedures package a cleanse request as an XML payload and transmit
it to a Cleanse Match Server. When the Cleanse Match Server receives a
request, it parses the XML and invokes the appropriate code:
On-line operations: The result is packaged as an XML response and sent back
via an HTTP POST connection.
Batch jobs: The Cleanse Match Server pulls the data to be processed into a
flat file, processes it, and then uses a bulk loader to write the data back.
• For Oracle, it uses the Oracle loader (SQLLDR) utility.
• For DB2, it uses the DB2 Load utility.

The default timeout for batch requests from
Oracle to a Cleanse Match Server is one year, and the default timeout for on-
line requests is one minute. For DB2, the default timeout for batch requests or
SIF requests is 600 seconds (10 minutes).

When running a Stage or Match job, if more than one Cleanse Match Server is
registered, and if the total number of records to be staged or matched is
more than 500, then the job is distributed in parallel among the available
Cleanse Match Servers.

Starting the Cleanse Match Server Tool


To view Cleanse Match Server information (including name, port, server type,
and whether the server is on- or off-line):
• In the Hub Console, expand the Model workbench and then click Cleanse
Match Server.

The Cleanse Match Server tool displays a list of any configured Cleanse Match
Servers.

Cleanse Match Server Properties


When configuring Cleanse Match Servers, you can specify the following
settings.
Server: Host or machine name of the application server on which you deployed
the Informatica MDM Hub Cleanse Match Server.

Port: HTTP port of the application server on which you deployed the Cleanse
Match Server.

Cleanse Server: Determines whether to use the Cleanse Match Server for
cleansing data.
• Select (check) this check box to use the Cleanse Match Server for
cleansing data.
• Clear (uncheck) this check box if you do not want to use the Cleanse
Match Server for cleansing data.
If an ORS has multiple associated Cleanse Match Servers, you can enhance
performance by configuring each Cleanse Match Server as either a match-only
or a cleanse-only server. Use this option in conjunction with the Match
Server check box to implement this configuration.

Cleanse Mode: Mode that the Cleanse Match Server uses for cleansing data.
For details, see "Modes of Cleanse Operations" on page 308.

Match Server: Determines whether to use the Match Server for matching data.
• Check (select) this check box to use the Match Server for matching data.
• Uncheck (clear) this check box if you do not want to use the Match Server
for matching data.
If an ORS has multiple associated Cleanse Match Servers, you can enhance
performance by configuring each Cleanse Match Server as either a match-only
or a cleanse-only server. Use this option in conjunction with the Cleanse
Server check box to implement this configuration.

Match Mode: Mode that the Match Server uses for matching data. For details,
see "Cleanse Requests" on page 309.

Offline: Determines whether the Cleanse Match Server is offline or online.
• Select (check) this check box to take the Cleanse Match Server offline,
making it temporarily unavailable. Once offline, no cleanse jobs are sent
to that Cleanse Match Server (servlet).
• Clear (uncheck) this check box to make an offline Cleanse Match Server
available again so that Informatica MDM Hub can once again send cleanse
jobs to that Cleanse Match Server.
Note: Informatica MDM Hub looks at this field but does not set it. Taking a
Cleanse Match Server offline is an administrative action.

Thread Count: Overrides the default thread count. The default, recommended,
value is 1 thread. Thread counts can be changed without needing to restart
the server. Consider the following factors:
• Number of processor cores available on your machine. You might consider
setting the number of threads to the number of processor cores available on
your machine. For example, set the number of threads for a dual-core
machine to two threads, and set the number of threads for a single
quad-core to four threads.
• Remote database connection. If you are working with a remote database,
you might consider setting the threads to a number that is slightly higher
than the number of processor cores, so that the wait time of one thread can
be used by another thread.
• Process memory requirements. If you are running a memory-intensive
process, you must restrict the total memory allocated to all threads that
run under the JVM to 1 gigabyte. Because Informatica MDM Hub runs in a
32-bit JVM environment, each thread requires memory from the same JVM, and
therefore the total amount of memory is restricted.
If you set this property to an illegal value (such as a negative number, 0,
a character, or a string), it will automatically reset to the default value
(1).
Note: You must change this value after migration from an earlier hub version
or all values will default to one (1) thread.

CPU Rating: Specifies the relative CPU performance of the machines in the
cleanse server pool. This value is used when deciding how to distribute work
during distributed job processing. If all of the machines are the same, this
number should remain set to the default (1). However, if one machine’s CPU
were twice as powerful as the others, for example, consider setting this
rating to 2.

Adding a New Cleanse Match Server
To add a new Cleanse Match Server:
1. Start the Cleanse Match Server tool. For more information, see "Starting
the Cleanse Match Server Tool" on page 310.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.

3. In the right pane of the Cleanse Match Server tool, click the button to
add a new Cleanse Match Server.
The Cleanse Match Server tool displays the Add/Edit Match Cleanse Server
dialog.

4. Set the properties for this new Cleanse Match Server. For more
information, see "Cleanse Match Server Properties" on page 310.
If proxy users have been configured for your Informatica MDM Hub
implementation, see "Cleanse Match Servers and Proxy Users" on page
309.
5. Click OK.

6. Click the Save button to save your changes.

Editing Cleanse Match Server Properties


To edit Cleanse Match Server properties:

1. Start the Cleanse Match Server tool. For more information, see "Starting
the Cleanse Match Server Tool" on page 310.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Select the Cleanse Match Server that you want to configure.

4. Click the Edit button.


The Cleanse Match Server tool displays the Add/Edit Match Cleanse Server
dialog for the selected Cleanse Match Server tool.

5. Change the properties you want for this Cleanse Match Server. For more
information, see "Cleanse Match Server Properties" on page 310.
If proxy users have been configured for your Informatica MDM Hub
implementation, see "Cleanse Match Servers and Proxy Users" on page
309.
6. Click OK to apply your changes.

7. Click the Save button to save your changes.

Deleting a Cleanse Match Server


To delete a Cleanse Match Server:
1. Start the Cleanse Match Server tool. For more information, see "Starting
the Cleanse Match Server Tool" on page 310.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. Select the Cleanse Match Server that you want to delete.

4. Click the Delete button.


5. The Cleanse Match Server tool prompts you to confirm deletion. Click OK
to delete the server.

Testing the Cleanse Match Server Configuration


Whenever you add or change your Cleanse Match Server information, it is
recommended that you check the configuration to make sure that the
connection works properly.

To test the Cleanse Match Server configuration:


1. Start the Cleanse Match Server tool. For more information, see "Starting
the Cleanse Match Server Tool" on page 310.
2. Select the Cleanse Match Server that you want to test.

3. Click the button to test the configuration.


If the test succeeds, the Cleanse Match Server tool displays a window
showing the connection information and a success message.

If there was a problem, Informatica MDM Hub will display a window with
information about the connection problem.
4. Click OK.

Using Cleanse Functions


This section describes how to use cleanse functions to clean data in your
Informatica MDM Hub implementation. For more information, see "About Data
Cleansing in Informatica MDM Hub" on page 307.

About Cleanse Functions

In Informatica MDM Hub, you can build and execute cleanse functions that
cleanse data. A cleanse function is a function that is applied to a data value in

a record to standardize or verify it. For example, if your data has a column for
salutation, you could use a cleanse function to standardize all instances of
“Doctor” to “Dr.” You can apply cleanse functions successively, or simply
assign the output value to a column in the staging table.
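
As a purely illustrative analogy (a cleanse function runs inside the stage
process, not as SQL against the landing table), the salutation example
corresponds to a transformation like the following, where the table name
C_CUSTOMER_LND is hypothetical:

-- Standardize 'Doctor' to 'Dr.' in the salutation column:
SELECT CASE WHEN SALUTATION = 'Doctor' THEN 'Dr.'
            ELSE SALUTATION
       END AS SALUTATION
FROM C_CUSTOMER_LND;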

Types of Cleanse Functions

In Informatica MDM Hub, each cleanse function is one of the following types:
• an Informatica MDM Hub-defined function
• a function defined by your cleanse engine
• a custom cleanse function you define

The pre-defined functions provide access to specialized cleansing
functionality, such as name and address standardization, address
decomposition, gender determination, and so on. See the Cleanse Functions
tool in the Hub Console for more information.

Libraries

Functions are organized into libraries—Java libraries and user libraries, which
are folders used to organize the functions that you can use in the Cleanse
Functions tool in the Model workbench. For more information, see
"Configuring Cleanse Libraries" on page 317.

Cleanse Functions are Secure Resources

Cleanse functions can be configured as secure resources and made SECURE or
PRIVATE. For more information, see "Securing Informatica MDM Hub
Resources" on page 629.

Available Functions Subject to Cleanse Engine

The functions you see in the Hub Console depend on the cleanse engine that
you are using. Informatica MDM Hub shows the cleanse functions that your
cleanse engine makes available. Regardless of which cleanse engine you use,
the overall process of data cleansing in Informatica MDM Hub is the same.

Starting the Cleanse Functions Tool


The Cleanse Functions tool provides the interface for defining how you cleanse
your data.

To start the Cleanse Functions tool:

• In the Hub Console, expand the Model workbench and then click Cleanse
Functions.

The Hub Console displays the Cleanse Functions tool.

The Cleanse Functions tool is divided into two panes:


Navigation pane: Shows the cleanse functions in a tree view. Clicking on any
node in the tree shows you the appropriate properties page in the right-hand
pane.
Properties pane: Shows the properties for the selected function. For any of
the custom cleanse functions, you can edit properties in the right-hand
pane.

The functions you see in the left pane depend on the cleanse engine you are
using. Your functions may differ from the ones shown in the previous figure.

Cleanse Function Types

Cleanse functions are grouped in the tree according to their type. Cleanse
function types are high-level categories that are used to group similar cleanse
functions for easier management and access.

Cleanse Function Properties

If you expand the list of cleanse function types in the navigation pane, you can
select a cleanse function to display its particular properties.

In addition to specific cleanse functions, the Misc Functions include Read
Database and Reject functions that provide efficiencies in data management.

Read Database: Allows a map to look up records directly from a database
table. Note: This function is designed to be used when there are many
references to the same limited number of data items.
Reject: Allows the creator of a map to identify incorrect data and reject
the record, noting the reason.

Overview of Configuring Cleanse Functions


To define cleanse functions, you complete the following tasks:
1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click Refresh to refresh your cleanse library.
4. Create your own cleanse library, which is simply a folder where you keep
your custom cleanse functions. See "Configuring Cleanse Libraries" on
page 317.
5. Define regular expression functions in the new library, if applicable. See
"Configuring Regular Expression Functions" on page 320.
6. Define graph functions in the new library, if applicable. See "Configuring
Graph Functions" on page 321.
7. Add cleanse functions to your graph function. See "Adding Functions to a
Graph Function" on page 323.
8. Test your functions. See "Testing Functions" on page 330.

Configuring Cleanse Libraries


You can configure either user libraries or Java libraries.

Configuring User Libraries

You can add a User Library when you want to create a customized cleanse
function from existing internal or external Informatica cleanse functions.

To add a user cleanse library:

1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click Refresh to refresh your cleanse library.
4. In the tree, select the Cleanse Functions node.
5. Right-click and choose Add User Library from the pop-up menu.
The Cleanse Functions tool displays the Add User Library dialog.

6. Specify the following properties:

Name: Unique, descriptive name for this library.
Description: Optional description of this library.
7. Click OK.
The Cleanse Functions tool displays the new library you added in the list
under Cleanse libraries in the navigation pane.

Configuring Java Libraries

To add a Java cleanse library:


1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click Refresh to refresh your cleanse library.
4. In the tree, select the Cleanse Functions node.
5. Right-click and choose Add Java Library from the pop-up menu.
The Cleanse Functions tool displays the Add Java Library dialog.

6. Specify the JAR file for this library. You can click the Browse button to
look for the JAR file.
7. Specify the following properties:

Field Description
Name Unique, descriptive name for this library.
Description Optional description of this library.
8. If applicable, click the Parameters button to specify any parameters for
this library.
The Cleanse Functions tool displays the Parameters dialog.

You can add as many parameters as needed for this library.

• To add a parameter, click the add button. The Cleanse Functions tool displays the Add Value dialog. Type a name and value, and then click OK.
• To import parameters, click the import button. The Cleanse Functions tool displays the Open dialog, prompting you to select a properties file containing the parameter(s) you want.

The name-value pairs that are imported from the file are available to the user-defined Java function at run time as elements of its Java properties. This allows you to provide customized values in a generic function, such as “userid” or “target URL”. For an illustration of how such parameters might be consumed, see the sketch after this procedure.
9. Click OK.
The Cleanse Functions tool displays the new library in the list under
Cleanse libraries in the navigation pane.
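The following sketch illustrates how a user-defined Java cleanse function might consume the imported name-value pairs at run time. This is a minimal illustration only: the actual entry point and base class are defined by the Informatica MDM Hub cleanse adapter API, and the parameter name target.url is a hypothetical example.

import java.util.Properties;

// Hypothetical sketch only: the real cleanse-function contract is defined
// by the Informatica MDM Hub cleanse adapter API. This shows only the idea
// of consuming imported library parameters as java.util.Properties.
public class TargetUrlCleanser {

    private final String targetUrl;

    public TargetUrlCleanser(Properties libraryParameters) {
        // "target.url" is an assumed parameter name imported from the
        // properties file; the default keeps the function usable without it.
        this.targetUrl = libraryParameters.getProperty("target.url", "http://localhost");
    }

    // A generic function can change behavior per environment without code
    // changes, because the value comes from the library parameters.
    public String cleanse(String input) {
        return targetUrl + "/" + input.trim();
    }
}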

To learn about adding graph functions to your library, see "Configuring Graph
Functions" on page 321.

Configuring Regular Expression Functions


This section describes how to configure regular expression functions for your
Informatica MDM Hub implementation.

About Regular Expression Functions

In Informatica MDM Hub, a regular expression function allows you to use regular expressions for cleanse operations. Regular expressions are computational expressions that are used to match and manipulate text data according to commonly-used syntactic conventions and symbolic patterns. To learn more about regular expressions, including syntax and patterns, refer to the Javadoc for java.util.regex.Pattern. Alternatively, to define a graph function instead, see "Configuring Graph Functions" on page 321.
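For example, the following sketch (illustrative only, using the java.util.regex API mentioned above) shows the kind of matching a regular expression function performs, in this case stripping all non-digit characters from a phone number:

import java.util.regex.Pattern;

public class PhoneDigits {
    // Precompiled pattern that matches any character that is not a digit.
    private static final Pattern NON_DIGIT = Pattern.compile("[^0-9]");

    public static void main(String[] args) {
        // Replace every non-digit character with the empty string.
        String cleansed = NON_DIGIT.matcher("(555) 123-4567").replaceAll("");
        System.out.println(cleansed); // prints 5551234567
    }
}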

Adding Regular Expression Functions

To add a regular expression function:


1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Right-click a User Library name and choose Add Regular Expression
Function.
The Cleanse Functions tool displays the Add Regular Expression dialog.

4. Specify the following properties:

Field Description
Name Unique, descriptive name for this regular expression function.
Description Optional description of this regular expression function.

5. Click OK.
The Cleanse Functions tool displays the new regular expression function
under the user library in the list in the left pane, with the properties in the
right pane.

6. Click the Details tab.

7. If you want, specify an input or output expression by clicking the icon to edit the field, entering a regular expression, and then clicking the icon to apply the change.
8. Click the icon to save your changes.

Configuring Graph Functions


This section describes how to configure graph functions for your Informatica
MDM Hub implementation.

About Graph Functions

In Informatica MDM Hub, a graph function is a cleanse function that you can
visualize and configure graphically using the Cleanse Functions tool in the Hub
Console. You can add any pre-defined functions to a graph function.
Alternatively, to define a regular expression function, see "Configuring
Regular Expression Functions" on page 320.

Inputs and Outputs

Graph functions have:
• one or more inputs (input parameters)
• one or more outputs (output parameters)

For each graph function, you must configure all required inputs and outputs.
Inputs and outputs have the following properties.

Name: Unique, descriptive name for this input or output.
Description: Optional description of this input or output.
Data Type: The data type, which must match exactly. One of the following values:
• Boolean—accepts Boolean values only
• Date—accepts date values only
• Float—accepts float values only
• Integer—accepts integer values only
• String—accepts any data

Adding Graph Functions

To add a graph function:


1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Right-click on a User Library name and choose Add Graph Function.
The Cleanse Functions tool displays the Add Graph Function dialog.

4. Specify the following properties:

Field Description
Name Unique, descriptive name for this graph function.
Description Optional description of this graph function.
5. Click OK.
The Cleanse Functions tool displays the new graph function under the
library in the list in the left pane, with the properties in the right pane.

This graph function is empty. To configure it and add functions, see "Adding Functions to a Graph Function" on page 323.

Adding Functions to a Graph Function

You can add as many functions as you want to a graph function. The example
in this section shows adding only a single function.

If you already have graph functions defined, you can treat them just like any
other function in the cleanse libraries. This means that you can add a graph
function inside another graph function. This approach allows you to reuse
functions.

To add functions to a graph function:


1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click your graph function, and then click the Details tab to see the function
represented in graphical format.

The area in this tab is referred to as the workspace. You might need to
resize the window to see both the input and output on the workspace.

By default, graph functions have one input and one output that are of type
string (gray circle). The function that you are defining might require more
inputs and/or outputs and different data types. For more information, see
"Configuring Inputs" on page 328 and "Configuring Outputs" on page 329.
4. Right-click on the workspace and choose Add Function from the pop-up
menu.
For more on the other commands on this pop-up menu, see "Workspace
Commands" on page 327. You can also add or delete these functions using
the toolbar buttons.
The Cleanse Functions tool displays the Choose Function to Add dialog.

5. Expand the folder containing the function you want to add, select the
function to add, and then click OK.
Note: The functions that are available for you to add depend on your
cleanse engine and its configuration. Therefore, the functions that you see
might differ from the cleanse functions shown in the previous figure.
The Cleanse Functions tool displays the added function in your workspace.

Note: Although this example shows a single graph function on the
workspace, you can add multiple functions to a cleanse function.
To move a function, click it and drag it wherever you need it on the
workspace.

6. Right-click on the function and choose Expanded Mode.


The expanded mode shows the labels for all available inputs and outputs
for this function.

For more on the modes, see "Function Modes" on page 327.


The color of the circle indicates the data type of the input or output. The
data types must match. In the following example, for the Round function,
the input is a Float value and the output is an Integer. Therefore, the
Inputs and Outputs have been changed to reflect the corresponding data
types.

For more information, see "Configuring Inputs" on page 328 and
"Configuring Outputs" on page 329.
7. Mouse-over the input connector, which is the little circle on the right side
of the input box. It turns red when ready for use.

8. Click the node and draw a line to one of the function input nodes.

9. Draw a line from one of the function output nodes to the output box node.

10. Click the button to save your changes. To learn about testing your new
function, see "Testing Functions" on page 330.

Workspace Commands

There are several ways to complete common tasks on the workspace.


• One way is to use the buttons on the toolbar. To learn more about these
buttons, see "Workspace Buttons" on page 327.
• Another method to access many of the same features is to right-click on the workspace. The right-click menu provides commands for the same common tasks as the toolbar buttons.

Function Modes

Function modes determine how the function is displayed on the workspace.


Each function has the following modes, which are accessible by right-clicking
the function:
Compact: Displays the function as a small box, with just the function name.
Standard: Displays the function as a larger box, with the name and the nodes for the input and output, but the nodes are not labeled. This is the default mode.
Expanded: Displays the function as a large box, with the name, the input and output nodes, and the names of those nodes.
Logging Enabled: Used for debugging. Choosing this option generates a log file for this function when you run a Stage job (see "Stage Jobs" on page 556). The log file records the input and output each time the function is called during the stage job. A new log file is created for each stage job. The log file is named <jobID><graph function name>.log and is stored in:
\<infamdm_install_dir>\hub\cleanse\tmp\<ORS>
Note: Do not use this option in production, as it consumes disk space and incurs the performance overhead associated with disk I/O. To disable this logging, right-click the function and uncheck Logging Enabled.
Delete Object: Deletes the function from the graph function.

You can cycle through the display modes (compact, standard, and expanded)
by double-clicking on the function.

Workspace Buttons

The toolbar on the right side of the workspace provides the following buttons:
• Save changes.
• Edit the function inputs.
• Edit the function outputs.
• Add a function. For more information, see "Adding Functions to a Graph Function" on page 323.
• Add a constant. For more information, see "Using Constants" on page 328.
• Add a conditional execution component. For more information, see "Using Conditions in Cleanse Functions" on page 331.
• Edit the selected component.
• Delete the selected component.
• Expand the graph. This makes more room for the workspace on the screen by hiding the left pane.

Using Constants

Constants are useful in cases where you know that you have standardized
input. For example, if you have a data set that you know consists entirely of
doctors, then you can use a constant to put Dr. in the title. When you use
constants in your graph function, they are differentiated visually from other
functions by their grey background color.

Configuring Inputs

To add more inputs:


1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the cleanse function that you want to configure.
4. Click the Details tab.
5. Right-click on the input and choose Edit inputs.
The Cleanse Functions tool displays the Inputs dialog.

Note: Once you create an input, you cannot later edit the input to change
its type. If you must change the type of an input, create a new one of the
correct type and delete the old one.

6. Click the button to add another input. The Cleanse Functions tool displays the Add Parameter dialog.

7. Specify the following properties:

Field Description
Name Unique, descriptive name for this parameter.
Data Type Data type of this parameter.
Description Optional description of this parameter.
8. Click OK.
Add as many inputs as you need for your functions.

Configuring Outputs

To add more outputs:


1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the cleanse function that you want to configure.

4. Click the Details tab.
5. Right-click on the output and choose Edit outputs.
The Cleanse Functions tool displays the Outputs dialog.

Note: Once you create an output, you cannot later edit the output to
change its type. If you must change the type of an output, create a new
one of the correct type and delete the old one.

6. Click the button to add another output. The Cleanse Functions tool displays the Add Parameter dialog.

Field Description
Name Unique, descriptive name for this parameter.
Data Type Data type of this parameter.
Description Optional description of this parameter.
7. Click OK.
Add as many outputs as you need for your functions.

Testing Functions
Once you have added and configured a graph or regular expression function, it
is recommended that you test it to make sure it is behaving as expected. This
test process mimics a single record coming into the function.

To test your function:

1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the cleanse function that you want to test.
4. Click the Test tab.
The Cleanse Functions tool displays the test screen.

5. For each input, specify the value that you want to test by clicking the cell
in the Value column and typing a value that complies with the data type of
the input.
• For Boolean inputs, the Cleanse Functions tool displays a true/false
drop-down list.
• For Calendar inputs, the Cleanse Functions tool displays a Calendar
button that you can click to select a date from the Date dialog.

6. Click Test.
If the test completed successfully, the output is displayed in the output
section.

Using Conditions in Cleanse Functions


This section describes how to add conditions to graph functions.

About Conditional Execution Components

Conditional execution components are similar to the construct of a case (or switch) statement in a programming language. The cleanse function evaluates the condition and, based on this evaluation, applies the appropriate graph function associated with the case that matches the condition. If no case matches the condition, then the default case is used—the case flagged with an asterisk (*).

When to Use Conditional Execution Components

Conditional execution components are useful when, for example, you have segmented data. Suppose a table has several distinct groups of data, such as customers and prospects. You could create a column that indicates the group of which each record is a member. Each group is called a segment. In this example, customers might have C in this column, while prospects would have P. You could use a conditional execution component to cleanse the data differently for each segment. If the conditional value does not meet any of the conditions you specify, then the default case will be executed.
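As an illustration (not MDM Hub code), the following Java sketch expresses the same logic as a switch statement: each segment value routes the record to a different cleanse path, and unmatched values fall through to the default case.

public class SegmentRouter {
    // Assumed helper cleanse paths, invented for this illustration.
    static String cleanseCustomer(String value) { return value.toUpperCase(); }
    static String cleanseProspect(String value) { return value.trim(); }

    static String cleanse(String segment, String value) {
        switch (segment) {
            case "C": return cleanseCustomer(value); // customer segment
            case "P": return cleanseProspect(value); // prospect segment
            default:  return value; // default case (*): pass through unchanged
        }
    }

    public static void main(String[] args) {
        System.out.println(cleanse("C", "acme corp")); // ACME CORP
        System.out.println(cleanse("X", "unknown"));   // unchanged
    }
}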

Adding Conditional Execution Components

To add a conditional execution component:


1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the cleanse function that you want to configure.
4. Right-click on the workspace and choose Add Condition.
The Cleanse Functions tool displays the Edit Condition dialog.

5. Click the button to add a value.

The Cleanse Functions tool displays the Add Value dialog.

6. Enter a value for the condition. Using the customer and prospect example,
you would enter C or P. Click OK.
The Cleanse Functions tool displays the new condition in the list of
conditions on the left, as well as in the input box.
Add as many conditions as you require. You do not need to specify a default condition—the default case, flagged with an asterisk (*), is created automatically when you create a new conditional execution component. The default case will be executed for all cases that are not covered by the cases you specify.
7. Add as many functions as you require to process all of the conditions. For
more information, see "Adding Functions to a Graph Function" on page
323.
8. For each condition—including the default condition—draw a link from the input node to the input of the function. In addition, draw links between the outputs of the functions and the output of your cleanse function.

Note: You can specify nested processing logic in graph functions. For example, you can nest conditional components within other conditional components (such as nested case statements). You can define an entire process containing many conditional tests, each of which can itself contain any level of complexity.

Configuring Cleanse Lists


This section describes how to configure cleanse lists in your Informatica MDM
Hub implementation.

About Cleanse Lists


A cleanse list is a logical grouping of string functions that are executed at run
time in a predefined order. Use cleanse lists to standardize known string
values and to remove extraneous characters (such as punctuation) from input
strings.

Adding Cleanse Lists


To add a new cleanse list:

1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click Refresh to refresh your cleanse library. This step applies when you use an external cleanse engine.
Important: You must click Refresh after acquiring a write lock and before processing any records. Otherwise, your external cleanse engine will throw an error.
4. Right-click your cleanse library in the list under Cleanse Functions and choose Add Cleanse List.
The Cleanse Functions tool displays the Add Cleanse List dialog.

5. Specify the following properties:

Field Description
Name Unique, descriptive name for this cleanse list.
Description Optional description of this cleanse list.
6. Click OK.
The Cleanse Functions tool displays the details pane for the new (empty)
cleanse list on the right side of the screen.

Cleanse List Properties


This section describes the input and output properties of cleanse lists.

Input Properties

The following table describes input properties for cleanse lists.


Input Properties for Cleanse Lists

Input string: String value from the source system. Used as the target of the search.

searchType: Specifies the type of match (comparing cleanse list items with the input string) to be executed against the input string. One of the following values:
• ENTIRE—Compares cleanse list items with the entire string. A match succeeds only when the entire input string is the same as a cleanse list item. Default setting if this parameter is not specified.
• WORD—Compares the cleanse list items with each word substring in the input string. A match succeeds only if a cleanse list item is a substring flanked by the following word boundaries in the input string: beginning of string, end of string, or a space character.
• ANYWHERE—Compares the cleanse list items with any part of the input string. A match succeeds if a cleanse list item is a substring of the input string, regardless of where the substring appears in the input string.
Note: String comparisons are case sensitive.

replaceAllOccurrences: Specifies the degree to which matched substrings in the input string are replaced with the matching cleanse list item. One of the following values:
• TRUE—Replaces all occurrences of the matching substring in the input string with the matching cleanse list item.
• FALSE—Replaces only the first occurrence of the matching substring in the input string with the matching cleanse list item. Default setting if replaceAllOccurrences is not specified.
Note: If the Strip parameter is TRUE, then occurrences of the matching substring are removed rather than replaced.

stopOnHit: Specifies whether to continue processing the rest of the cleanse list after one matching item has been found in the input string. One of the following values:
• TRUE—Stops processing the cleanse list as soon as the first cleanse list item is found in the input string (as long as the searchType condition is met). Default setting if stopOnHit is not specified.
• FALSE—Continues to search the input string for the rest of the items in the cleanse list (in order to find any further matching substrings).

Strip: Specifies whether the matched text in the input string will be stripped from—or replaced in—the input string. One of the following values:
• TRUE—Removes (rather than replaces) the matched text in the input string.
• FALSE—Replaces the matched text in the input string. Default setting if Strip is not specified.
Note: The replaceAllOccurrences parameter determines whether replacement or removal affects all matches in the input string or just the first match.

defaultValue: Value to use for the output if none of the cleanse list items was found in the input string. If this property is not specified and no match was found, then the original input string is used as the output.
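The following sketch approximates the three searchType modes with java.util.regex. It is an illustration of the semantics described above, not the internal cleanse-list implementation; note that, as in the table, comparisons are case sensitive.

import java.util.regex.Pattern;

public class SearchTypeDemo {
    static boolean matches(String searchType, String input, String item) {
        switch (searchType) {
            case "ENTIRE":   // the whole input string must equal the list item
                return input.equals(item);
            case "WORD":     // the item must be flanked by string boundaries or spaces
                return Pattern.compile("(^|\\s)" + Pattern.quote(item) + "(\\s|$)")
                              .matcher(input).find();
            case "ANYWHERE": // the item may appear anywhere in the input string
                return input.contains(item);
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(matches("WORD", "ACME Corp", "Corp"));   // true
        System.out.println(matches("ENTIRE", "ACME Corp", "Corp")); // false
        System.out.println(matches("ANYWHERE", "ACME Corp", "ME")); // true
    }
}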

Output Properties

The following table describes output properties for cleanse lists.


output: Output value of the cleanse list function.
matched: Last matched value of the cleanse list.
matchFlag: Indicates whether a match was found in the list (true) or not (false).

Editing Cleanse List Properties


New cleanse lists are empty lists. You need to edit the cleanse list to add
match and output strings.

To edit your cleanse list to add match and output strings:


1. Start the Cleanse Functions tool according to the instructions in "Starting
the Cleanse Functions Tool" on page 315.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the cleanse list that you want to configure.
The Cleanse Functions tool displays information about the cleanse list in
the right pane.

4. Change the display name and description in the right pane, if you want, by clicking the Edit button next to a value that you want to change.
5. Click the Details tab.
The Cleanse Functions tool displays the details for the cleanse list.

6. Click the button in the right hand pane.


The Cleanse Functions tool displays the Output String dialog.

7. Specify a search string, an output string, a match type, and click OK.

The search string is the input that you want to cleanse, resulting in the
output string.
Important: Informatica MDM Hub will search through the strings in the
order in which they are entered. The order in which you specify the items
can therefore affect the results obtained. To learn more about the types of
matches available, see "Types of String Matches" on page 338.
Note: As soon as you add strings to a cleanse list, the cleanse list is
saved.
The strings that you specified are shown in the Cleanse List Details section.
8. You can add and remove strings. You can also move strings forward or backward in the cleanse list, which affects their order in the run-time execution sequence and, therefore, the results obtained.
9. You can also specify the “Default value” for every input string that does not
match any of the search strings.
If you do not specify a default value, every input string that does not
match a search string is passed to the output string with no changes.

Types of String Matches

For the output string, you can specify one of the following match types:

Exact Match: Text string (for example, “IBM”). Note that string matches are not case sensitive. For example, the string test will also match TEST or Test.

Regular Expression: Pattern using the Java syntax for regular expressions. For example, “I.M.*” would match “IBM”, “IBM Corp”, and “IXM Inc.”. To parse a name field that consists of first, middle, and last names, you could use the regular expression (\S+$), which returns the last name regardless of the name supplied. The regular expression that is typed in as a parameter is applied against the string, and the matched output is sent to the output. You can also specify a group number to match an inner group of the regular expression. Refer to the Javadoc for java.util.regex.Pattern for documentation on regular expression construction and how groups work.

SQL Match: Pattern using the SQL syntax for the LIKE operator (for example, “I_M%” would match “IBM”, “IBM Corp”, and “IXM Inc.”).
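For instance, the last-name expression mentioned above can be verified with a few lines of Java (illustrative only):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastNameMatch {
    public static void main(String[] args) {
        // (\S+)$ captures the final run of non-space characters: the last name.
        Pattern lastName = Pattern.compile("(\\S+)$");
        Matcher m = lastName.matcher("John Q. Public");
        if (m.find()) {
            System.out.println(m.group(1)); // prints "Public" (group 1 is the inner group)
        }
    }
}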

Importing Match Strings

To import match strings from a source such as a file or a database table:

1. Click the button in the right hand pane.


The Import Match Strings wizard opens.

2. Specify the connection properties for the source of the data and click
Next.
The Cleanse Functions tool displays a list of tables available for import.

3. Select the table you want to import and click Next.


The Cleanse Functions tool displays a list of columns available for import.

4. Select the columns you want to import and click Next.


The Cleanse Functions tool displays a list of match strings available for
import.

You can import the records of the sample data either as phrases (one entry
for each record) or as words (one entry for each word in each record).
Choose whether to import the match strings as words or phrases and then
click Finish.
The Cleanse List Details box is now populated with data from the specified
source.

Note: The imported match strings are not part of the match list. To add
them to the match list, you need to move them to the Search Strings on
the right hand side.

• To add match strings to the match list with the match string value in both the Search String and Output String, select the strings in the Match Strings list, and click the button.
• To add match strings to the match list with an Output String value that you want to define, click the record you added and specify a new Search and Output String.
• To add all Match Strings to the match list, click the button.
• To clear all Match Strings from the match list, click the button.
• Repeat these steps until you have constructed a complete match list.

5. When you have finished changing the match list properties, click the
button to save your changes.

Importing Match Output Strings

To import match output strings from a source such as a file or a database table:

1. Click the button in the right hand pane.


The Import Match Output Strings wizard opens.

2. Specify the connection properties for the source of the data.


3. Click Next.
The Cleanse Functions tool displays a list of tables available for import.

4. Select the table that you want to import.


5. Click Next.
The Cleanse Functions tool displays a list of columns available for import.

6. Select the columns that you want to import.
7. Click Next.
The Cleanse Functions tool displays a list of match strings available for
import.

8. Click Finish.
The Cleanse List Details box is now populated with data from the specified
source.

9. When you have finished changing the match list properties, click the
button to save your changes.

Chapter 13: Configuring the Load
Process

This chapter explains how to configure the load process in your Informatica
MDM Hub implementation. For an introduction, see "Load Process" on page
227.

Chapter Contents
• "Before You Begin" on page 343
• "Configuration Tasks for Loading Data" on page 343
• "Configuring Trust for Source Systems" on page 344
• "Configuring Validation Rules" on page 353

Before You Begin


Before you begin to configure the load process, you must have completed the
following tasks:
• Installed Informatica MDM Hub and created the Hub Store according to the
instructions in the Informatica MDM Hub Installation Guide
• Built the schema according to the instructions in "Building the Schema" on
page 73
• Defined source systems according to the instructions in "Configuring
Source Systems" on page 264
• Created landing tables according to the instructions in "Configuring Landing
Tables" on page 269
• Created staging tables according to the instructions in "Configuring Staging
Tables" on page 275
• Learned about the load process described in "Load Process" on page 227

Configuration Tasks for Loading Data


In addition to the prerequisites described in "Before You Begin" on page 343,
to set up the process of loading data in your Informatica MDM Hub
implementation, you must complete the following tasks in the Hub Console:
• "Configuring Trust for Source Systems" on page 344

• "Configuring Validation Rules" on page 353

For additional configuration settings that can affect the load process, see:
• "Loading by RowID" on page 296
• "Distinct Systems" on page 445
• "Generate Match Tokens on Load" on page 92
• "Load Process" on page 227

Configuring Trust for Source Systems


This section describes how to configure trust in your Informatica MDM Hub
implementation. For an introduction, see "Trust Settings" on page 229.

About Trust
Several source systems may contain attributes that correspond to the same
column in a base object table. For example, several systems may store a
customer’s address. However, one system might be a more reliable source for
that data than others. If these systems disagree, then Informatica MDM Hub
must decide which value is the best one to use.

To help with comparing the relative reliability of column data from different
source systems, Informatica MDM Hub allows you to configure trust for a
column. Trust is a designation of confidence in the relative accuracy of a
particular piece of data. For each column from each source, you can define a
trust level represented by a number between 0 and 100, with zero being the
least trustworthy and 100 being the most trustworthy. By itself, this number
has no meaning. It becomes meaningful only when compared with another
trust number to determine which is higher.

Trust takes into account the age of data, how much its reliability has decayed
over time, and the validity of the data. Trust is used to determine survivorship
(when two records are consolidated), and whether updates from a source
system are sufficiently reliable to update the master record.

Trust Levels

A trust level is a number between 0 and 100. By itself, this number has no
meaning. It has meaning only when compared with another trust number.

Data Reliability Decays Over Time

The reliability of data from a given source system can decay (diminish) over
time. In order to reflect this fact in trust calculations, Informatica MDM Hub
allows you to configure decay characteristics for trust-enabled columns. The

decay period is the amount of time that it takes for the trust level to decay
from the maximum trust level (see "Maximum Trust" on page 347) to the
minimum trust level (see "Minimum Trust" on page 347). For more
information, see "Units" on page 348, "Decay" on page 348, and "Graph Type"
on page 348.

Trust Calculations

The load process calculates trust for trust-enabled columns in the base object.
For records with trust-enabled columns, the load process assigns a trust score
to cell data. This trust score is initially based on the configured trust settings
for that column. The trust score may be subsequently downgraded when the
load process applies validation rules—if configured for a trust-enabled
column—after the trust calculations. For more information, see "Run-time
Execution Flow of the Load Process" on page 231.

Trust Calculations for Load Update Operations

During the load process, if a record in the staging table will be used for a load
update operation, and if that record contains a changed cell value in a trust-
enabled column, the load process calculates trust scores for:
• the cell data in the source record in the staging table (which contains the
updated information)
• the cell data in the target record in the base object (which contains the
existing information)

If the cell data in the source record has a higher trust score than the cell data
in the target record, then Informatica MDM Hub updates the cell in the base
object record with the cell data in the staging table record.
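Conceptually, the decision for each trust-enabled cell reduces to a comparison of two scores, as in this simplified sketch (the actual calculation also factors in decay and validation downgrades):

public class CellSurvivorship {
    // Returns the value that survives in the base object for one cell:
    // the source (staging) value wins only if its trust score is higher.
    static String survivingValue(String sourceValue, double sourceTrust,
                                 String targetValue, double targetTrust) {
        return sourceTrust > targetTrust ? sourceValue : targetValue;
    }

    public static void main(String[] args) {
        // Staging trust 70 beats base-object trust 55, so the update is applied.
        System.out.println(survivingValue("555-4321", 70, "555-1234", 55));
    }
}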

Trust Calculations When Consolidating Two Base Object Records

When two records in a base object are consolidated, Informatica MDM Hub
calculates the trust score for each trusted column in the two records being
merged. Cells with the highest trust scores survive in the final consolidated
record. If the trust scores are the same, then Informatica MDM Hub compares
records according to an order of precedence, as described in "Survivorship
and Order of Precedence" on page 221.

Control Tables for Trust-Enabled Columns

The following figure shows control tables associated with trust-enabled columns in a base object.

For each trust-enabled column in a base object record, Informatica MDM Hub
maintains a record in a corresponding control table that contains the last
update date and an identifier of the source system. Based on these settings,
Informatica MDM Hub can always calculate the current trust for the column
value.

If history is enabled for a base object, Informatica MDM Hub also maintains a
separate history table for the control table, in addition to history tables for the
base object and its cross-reference table.

Cell Values in Base Object Records and Cross-Reference Records

The cross-reference table for a base object contains the most recent value
from each source system. By default (without trust settings), the base object
contains the most recent value no matter which source system it comes from.

For trust-enabled columns, the cell value in a base object record might not
have the same value as its corresponding record in the cross-reference table.
Validation rules, which are run during the load process after trust calculations,
can downgrade trust for a cell so that a source that had previously provided
the cell value might not update the cell. For more information about validation
rules, see "Configuring Validation Rules" on page 353.

Overriding Trust Scores

Data stewards can manually override a calculated trust setting if they have
direct knowledge that a particular value is correct. Data stewards can also
enter a value directly into a record in a base object. For more information, see
the Informatica MDM Hub Data Steward Guide.

Trust for State-Enabled Base Objects

For state-enabled base objects, trust is calculated for records with a PENDING
or ACTIVE state, but records with a DELETE state are ignored. For more
information, see "State Management" on page 159.

Batch Job Constraints on Number of Trust-Enabled Columns

Synchronize batch jobs can fail for base objects with a large number of trust-
enabled columns. Similarly, Automerge jobs can fail if there is a large number
of trust-enabled or validation-enabled columns. The exact number of columns
that cause the job to fail is variable and is based on the length of the column
names and the number of trust-enabled columns (or, for Automerge jobs,
validation-enabled columns as well). Long column names are at—or close to—the maximum allowable length of 26 characters. To avoid this problem, keep the number of trust-enabled columns below 60 and/or keep the column names short. A workaround is to enable all trust and validation columns before saving the base object to avoid running the Synchronize job.

Trust Properties
This section describes the trust properties that you can configure for trust-
enabled columns. Trust properties are configured separately for each source
system that could provide records for trust-enabled columns in a base object.

Maximum Trust

The maximum trust (starting trust) is the trust level that a data value will
have if it has just been changed. For example, if source system X changes a
phone number field from 555-1234 to 555-4321, the new value will be given
system X’s maximum trust level for the phone number field. By setting the
maximum trust level relatively high, you can ensure that changes in the
source systems will usually be applied to the base object.

Minimum Trust

The minimum trust is the trust level that a data value will have when it is old
(after the decay period has elapsed). This value must be less than or equal to

the maximum trust.

Note: If the maximum and minimum trust are equal, then the decay curve is
a flat line and the decay period and decay type have no effect.

Units

Specifies the units used in calculating the decay period—day, week, month,
quarter, or year.

Decay

Specifies the number (of days, weeks, months, quarters, or years) used in
calculating the decay period.

Note: For the best graph view, limit the decay period you specify to between
1 and 100.

Graph Type

Decay follows a pattern in which the trust level decreases during the decay period. The graph type determines the shape of the decay curve and has one of the following settings:

Linear: Simplest decay. Decay follows a straight line from the maximum trust to the minimum trust.

Rapid Initial Slow Later (RISL): Most of the decrease occurs toward the beginning of the decay period. Decay follows a concave curve. If a source system has this graph type, then a new value from the system will probably be trusted, but this value will soon become much more likely to be overridden.

Slow Initial Rapid Later (SIRL): Most of the decrease occurs toward the end of the decay period. Decay follows a convex curve. If a source system has this graph type, it will be relatively unlikely for any other system to override the value that it sets until the value is near the end of its decay period.

Test Offset Date

By default, the start date for trust decay shown in the Trust Decay Graph is the
current system date. To see the impact of trust decay based on a different

start date for a given source system, specify a different test offset date
according to the instructions in "Changing the Offset Date for a Trust-Enabled
Column" on page 352.

Considerations for Setting Trust Values


Choosing the correct trust values can be a complex process. It is not enough
to consider one system in isolation. You must ensure that the combinations of
trust settings for all of the source systems that contribute to a particular
column produce the behavior that you want. Trust levels for a source system
are not absolute—they are meaningful only in relation to the trust levels of
other source systems that contribute data for the trust-enabled column.

When determining trust, consider the following questions.


• Does the source system validate this data value? How reliably does it do
this?
• How important is this data value to the users of the source system, as
compared with other data values? Users are likely to put the most effort
into validating the data that is central to their work.
• How frequently is the source system updated?
• How frequently is a particular attribute likely to be updated?

Enabling Trust for a Column


Trust is enabled and configured on a per-column basis for base objects in the
Schema Manager. Trust does not apply to columns in any other tables in an
ORS. For more information, see "Configuring Columns in Tables" on page 102.

Trust is disabled by default. When trust is disabled, Informatica MDM Hub uses
the value from the most recently-executed load process regardless of which
source system it comes from. If column data for a base object comes from
only one system, then trust should remain disabled for that column.

Trust should be enabled, however, for columns in which data can come from
multiple source systems. If you enable trust for a column, you also assign
trust levels to specify the relative reliability of any source systems that could
provide records that update the column.

Assigning Trust Levels to Trust-Enabled Columns


This section describes how to configure trust levels for trust-enabled columns.

Before You Configure Trust for Trust-Enabled Columns

Before you configure trust for trust-enabled columns, you must have:
• enabled trust for base object columns according to the instructions in
"Enabling Trust for a Column" on page 349
• configured staging tables in the Schema Manager, including associated
source systems and staging table columns that correspond to base object
columns, according to the instructions in "Configuring Staging Tables" on
page 275

Specifying Trust for the Administration Source System

At a minimum, you must specify trust settings for trust-enabled columns in the administration source system (called Admin by default). This source system represents manual updates that you make within Informatica MDM Hub. This source system can contribute data to any trust-enabled column. Set the trust settings for this source system to high values (relative to other source systems) to ensure that manual updates override any existing values from other source systems. For more information, see "Administration Source System" on page 265.

Assigning Trust Levels to Trust-Enabled Columns in a Base Object

To assign trust levels to trust-enabled columns in a base object:


1. Start the Systems and Trust tool according to the instructions in "Starting
the Systems and Trust Tool" on page 265.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the navigation pane, expand the Trust node.
The Systems and Trust tool displays all base objects with trust-enabled
columns.

4. Select a base object.

The Systems and Trust tool displays a read-only view of the trust-enabled
columns in the selected base object, indicating with a check mark whether
a given source system supplies data for that column.
Note: The association between trust-enabled columns and source systems
is specified in the staging tables for this base object. For more
information, see "Configuring Staging Tables" on page 275.
5. Expand a base object to see its trust-enabled columns.

6. Select the trust-enabled column that you want to configure.


For the selected trust-enabled column, the Systems and Trust tool displays
the list of source systems associated with the column, along with editable
trust settings to be configured per source system, and a trust decay graph.
7. Specify the trust properties for each column. For more information, see
"Trust Properties" on page 347.
8. Optionally, you can change the offset date, as described in "Changing the Offset Date for a Trust-Enabled Column" on page 352.

9. Click the button to save your changes.


The Systems and Trust tool refreshes the Trust Decay Graph based on the
trust settings you specified for each source system for this trust-enabled
column.

The X-axis is the trust score and the Y-axis is the time.

Changing the Offset Date for a Trust-Enabled Column

By default, the Trust Decay Graph shows the trust decay across all source
systems from the current system date. You can specify a different date (such
as a future date) to test your current trust settings and see how trust would
decay from that date. Note that offset dates are not saved.

To change the offset date for a trust-enabled column:


1. In the Systems and Trust tool, select a trust-enabled column according to
the instructions in "Assigning Trust Levels to Trust-Enabled Columns in a
Base Object" on page 350.

2. Click the Calendar button next to the source system for which you want
to specify a different offset date.
The Systems and Trust tool prompts you to specify a date.

3. Select a different date.


4. Choose OK.
The Systems and Trust tool updates the Trust Decay Graph based on your
current trust settings and the Offset Date you specified.

To remove the Offset Date:


• Click the Delete button next to the source system for which you want
to remove the Offset Date.
The Systems and Trust tool updates the Trust Decay Graph based on your
current trust settings and the current system date.

Running Synchronize Batch Jobs After Changes to Trust Settings

After records have been loaded into a base object, if you enable trust for any
column, or if you change trust settings for any trust-enabled column(s) in that

base object, then you must run the Synchronize batch job (see "Synchronize
Jobs" on page 557) before running the consolidation process. If this batch job
is not run, then errors will occur during the consolidation process.

Configuring Validation Rules


This section describes how to configure validation rules in your Informatica
MDM Hub implementation. For an introduction, see "Validation Rules" on page
231.

About Validation Rules


A validation rule downgrades trust for a cell value when the cell value matches
a given condition. Each validation rule specifies:
• a condition that determines whether the cell value is valid
• an action to take if the condition is met (downgrade trust by a certain
percentage)

For example, the following validation rule:


Downgrade trust on First_Name by 50% if Length < 3

consists of:
• Condition: Length < 3
• Action: Downgrade trust on First_Name by 50%

If the Reserve Minimum Trust flag is set for the column, then the trust cannot
be downgraded below the column’s minimum trust. You use the Schema
Manager to configure validation rules for a base object.

Validation rules are executed during the load process, after trust has been
calculated for trust-enabled columns in the base object. If validation rules
have been defined, then the load process applies them to determine the final
trust scores, and then uses the final trust values to determine whether to
update records in the base object with cell data from the updated records. For
more information, see "Run-time Execution Flow of the Load Process" on page
231.

Validation Checks

A validation check can be done on any column in a base object. The downgrade resulting from the validation check can be applied to the same column, as well as to any other columns that can be validated. Invalid data in one column can therefore result in trust downgrades on many columns.

For example, suppose you use an address verification flag in which the flag is OK if the address is complete and BAD if the address is not complete. You could configure a validation rule that downgrades the trust on all address fields if the verification flag is not OK. Note that, in this case, the verification flag should also be downgraded.

Required Columns

Validation rules are applied regardless of the source of the incoming data. However, validation rules are applied only if the staging table record or the input—a Services Integration Framework (SIF) request—contains all of the required columns. If any required columns are missing, validation rules are not applied.

Recalculating Trust Scores After Changing Validation Rules

If a base object contains existing data and you change validation rules, you
must run the Revalidate job to recalculate trust scores for new and existing
data, as described in "Revalidate Jobs" on page 556.

Validation Rules and State-Enabled Base Objects

For state-enabled base objects, validation rules are applied to records with a
PENDING or ACTIVE state, but records with a DELETE state are ignored. For
more information, see "State Management" on page 159.

Automerge Job Constraints on Number of Validation Columns

Automerge jobs can fail if there is a large number of validation-enabled columns. The exact number of columns that cause the job to fail is variable and is based on the length of the column names and the number of validation-enabled columns. Long column names are at—or close to—the maximum allowable length of 26 characters. To avoid this problem, keep the number of validation-enabled columns below 60 and/or keep the column names short. A workaround is to enable all trust and validation columns before saving the base object to avoid running the Synchronize job.

Enabling Validation Rules for a Column


A validation rule is enabled and configured on a per-column basis for base
objects in the Schema Manager. Validation rules do not apply to columns in
any other tables in an ORS. For more information, see "Configuring Columns
in Tables" on page 102.

Validation rules are disabled by default. Validation rules should be enabled,
however, for any trust-enabled columns that will use validation rules for trust
downgrades.

How the Downgrade Percentage is Applied

Validation rules downgrade trust scores according to the following algorithm:


Final trust = Trust - (Trust * Validation_Downgrade / 100)

For example, with a validation downgrade percentage of 50% and a trust level calculated at 60:

Final Trust Score = 60 - (60 * 50 / 100)

The final trust score is:

Final Trust Score = 60 - 30 = 30
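The same algorithm, including the optional Reserve Minimum Trust floor described earlier, can be sketched as follows (illustrative only):

public class TrustDowngrade {
    // Applies the downgrade percentage; if reserveMinimumTrust is set,
    // the result is never allowed to fall below the column's minimum trust.
    static double finalTrust(double trust, double downgradePercent,
                             boolean reserveMinimumTrust, double minimumTrust) {
        double downgraded = trust - (trust * downgradePercent / 100);
        return reserveMinimumTrust ? Math.max(downgraded, minimumTrust) : downgraded;
    }

    public static void main(String[] args) {
        System.out.println(finalTrust(60, 50, false, 0));  // 30.0, as in the example
        System.out.println(finalTrust(60, 100, true, 25)); // floored at 25.0
    }
}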

Execution Sequence of Validation Rules

Validation rules are executed in sequence. If multiple validation rules are configured for a column, only one validation rule—the rule with the greatest downgrade percentage—is applied to the column. Downgrade percentages are not cumulative—rather, the “winning” validation rule overwrites any previously applied changes.

Therefore, when configuring multiple validation rules for a column, specify an execution order of increasing downgrade percentage, starting with the validation rule that has the lowest impact (downgrade percentage) first, and ending with the validation rule that has the highest impact (downgrade percentage) last.

Note: The execution sequence for validation rules differs between the load
process described in this chapter and PUT requests invoked by external
applications using the Services Integration Framework (SIF). For PUT
requests, validation rules are executed in order of decreasing downgrade
percentage. For more information, see the Informatica MDM Hub Services
Integration Framework Guide and the Informatica MDM Hub Javadoc.

Navigating to the Validation Rules Node


To configure validation rules, you navigate to the Validation Rules node for a
base object in the Schema Manager:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.

2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Expand the tree for the base object that you want to configure, and then
click its Validation Rules Setup node.
The Schema Manager displays the Validation Rules editor.

The Validation Rules editor is divided into the following sections:

Number of Rules: Number of configured validation rules for the selected base object.
Validation Rules: List of configured validation rules for the selected base object.
Properties Pane: Properties for the selected validation rule. For more information, see "Validation Rule Properties" on page 356.

Validation Rule Properties


Validation rules have the following properties.

Rule Name

A unique, descriptive name for this validation rule.

Rule Type

The type of validation rule. One of the following values.

Existence Check: Trust will be downgraded if the cell has a null value (the cell value does not exist).

Domain Check: Trust will be downgraded if the cell value does not fall within a list or range of allowed values.

Referential Integrity: Trust will be downgraded if the value in a cell does not exist in the set of values in a column on a different table. This rule is for use in cases where an explicit foreign key has not been defined, and an incorrect cell value can be allowed if there is no correct cell value that has higher trust.

Pattern Validation: Trust will be downgraded if the value in a cell conforms (LIKE) or does not conform (NOT LIKE) to the specified pattern.

Custom: Used for entering complex validation rules. This rule type should only be used when SQL functions (such as LENGTH, ABS, etc.) might be required, or if a complex join is required. Note: Custom SQL code must conform to the SQL syntax for your database platform. SQL entered in this pane is not validated at design time. Invalid SQL syntax errors cause problems when the load process executes.

Rule Columns

For each column, you specify the downgrade percentage and whether to
reserve minimum trust.

Downgrade Percentage

Percentage by which the trust level of the specified column will be decreased
if this validation rule condition is met. The larger the percentage, the greater
the downgrade. For example, 0% has no effect on the trust, while 100%
downgrades the trust completely (unless the reserve minimum trust is
specified, in which case 100% downgrades the trust so that it equals minimum
trust).

If trust is downgraded by 100% and you have not enabled Reserve Minimum Trust for the column, then the value of that column will not be populated into the base object.

Reserve Minimum Trust

Specifies what will happen if the downgrade causes the trust level to fall below
the column’s minimum trust level. You can retain the minimum trust (so that
the trust level will be reduced to the minimum trust but no lower). If this box
is cleared (unchecked), then the trust level will be reduced by the specified
percentage even if this means going below the minimum trust.

Rule SQL

Specifies the SQL WHERE clause representing the condition for this validation
rule. During the load process, the validation rule is executed. If data meets
the criteria specified in the Rule SQL field, then the trust value is downgraded
by the downgrade percentage configured for this validation rule.

SQL WHERE Clause Based on the Rule Type

The Validation Rules editor prompts you to configure the SQL WHERE clause
based on the selected Rule Type for this validation rule.

During the load process, this query is used to check the validity of the data in
the staging table.

Example SQL WHERE Clauses

The following table provides examples of SQL WHERE clauses based on the
selected rule type.

Examples of WHERE Clauses for Each Rule Type

Existence Check
WHERE clause: WHERE S.ColumnName IS NULL
Example: WHERE S.MIDDLE_NAME IS NULL
Result: Affected columns will be downgraded for records with middle names that are null. The records that do not meet the condition will not be affected.

Domain Check
WHERE clause: WHERE S.ColumnName IN ('?', '?', '?')
Example: WHERE S.Gender NOT IN ('M', 'F', 'U')
Result: Affected columns will be downgraded if the Gender is any value other than M, F, or U.

Referential Integrity
WHERE clause: WHERE NOT EXISTS (SELECT 'a' FROM <Ref_Table> WHERE <Ref_Table>.<Ref_Column> = S.<Column_Name>)
Example: WHERE NOT EXISTS (SELECT DISTINCT 'a' FROM ACCOUNT_TYPE WHERE ACCOUNT_TYPE.Account_Type = S.Account_Type)
Result: Affected columns will be downgraded for records with Account Type values that are not in the Account Type table.

Pattern Validation
WHERE clause: WHERE S.ColumnName LIKE 'Pattern'
Example: WHERE S.eMail_Address NOT LIKE '%@%'
Result: Downgrade will be applied if the e-mail address does not contain an @ character.

Custom
WHERE clause: WHERE <custom condition>
Example: WHERE LENGTH(S.ZIP_CODE) < 4
Result: Downgrade will be applied if the length of the zip code column is less than 4.

Table Aliases and Wildcards

You can use the wildcard character (*) to reference tables via an alias.
• s.* aliases the staging table
• I.* aliases a temporary table and provides ROWID_OBJECT, PKEY_SRC_OBJECT, and ROWID_SYSTEM information for the records being updated.

Custom Rule Types and SQL WHERE Syntax

For Custom rule types, write SQL statements that are well formed and well
tuned. If you need more information about SQL WHERE clause syntax and wild
card patterns, refer to the product documentation for the database platform
used in your Informatica MDM Hub implementation.

Note: Be sure to specify precedence correctly using parentheses according to


the SQL syntax for your database platform. Incorrect or omitted parentheses
can have unexpected results and long-running queries. For example, the
following statement is ambiguous and leaves it up to the database server to
determine precedence:

WHERE conditionA OR conditionB or conditionC

The following statements use parentheses to explicitly specify precedence:


WHERE (conditionA AND conditionB) OR conditionC
WHERE conditionA AND (conditionB OR conditionC)

These two statements will yield very different results when evaluating
records.

Adding Validation Rules


To add a validation rule:
1. Navigate to the Validation Rules editor. For more information, see
"Navigating to the Validation Rules Node" on page 355.

2. Click the button.


The Schema Manager displays the Add Validation Rule dialog.

3. Specify the properties for this validation rule. For more information, see
"Validation Rule Properties" on page 356.

4. If you want, select the rule column(s) for this validation rule by clicking
the Edit button.
The Validation Rules editor displays the Select Rule Columns dialog.
The available columns are those that have the Validate flag enabled (see
"Column Properties" on page 103). For more information, see "Configuring
Columns in Tables" on page 102.
Select the column(s) for which the trust level will be downgraded if the
condition specified in the WHERE clause for this validation rule is met, and
then click OK.
5. Click OK.
The Schema Manager adds the new rule to the list of validation rules.
Note: If a base object contains existing data and you change validation
rules, you must run the Revalidate job to recalculate trust scores for new
and existing data, as described in "Revalidate Jobs" on page 556.

Editing Validation Rule Properties


To edit a validation rule:
1. Navigate to the Validation Rules editor in the Schema Manager. For more
information, see "Navigating to the Validation Rules Node" on page 355.
2. In the Validation Rules list, select the validation rule that you want to
configure.
The Validation Rules editor displays the properties for the selected
validation rule.

3. Specify the editable properties for this validation rule. You cannot change
the rule type. For more information, see "Validation Rule Properties" on
page 356.
4. If you want, select the rule column(s) for this validation rule by clicking
the Edit button.
The Validation Rules editor displays the Select Rule Columns dialog.

The available columns are those that have the Validate flag enabled (see
"Column Properties" on page 103). For more information, see "Configuring
Columns in Tables" on page 102.
Select the column(s) for which the trust level will be downgraded if the
condition specified in the WHERE clause for this validation rule is met, and
then click OK.

5. Click the Save button to save your changes.


Note: If a base object contains existing data and you change validation
rules, you must run the Revalidate job to recalculate trust scores for new
and existing data, as described in "Revalidate Jobs" on page 556.

Changing the Sequence of Validation Rules
The execution order for validation rules is extremely important. For more
information, see "Execution Sequence of Validation Rules" on page 355.

Use the following buttons to change the sequence of validation rules in the list:
• Move Up button: moves the selected validation rule higher in the sequence.
• Move Down button: moves the selected validation rule lower in the sequence.

Removing Validation Rules


To remove a validation rule:
1. Navigate to the Validation Rules editor in the Schema Manager. For more
information, see "Navigating to the Validation Rules Node" on page 355.
2. In the Validation Rules list, select the validation rule that you want to
remove.

3. Click the Delete button.


The Schema Manager prompts you to confirm deletion.
4. Click Yes.
Note: If a base object contains existing data and you change validation
rules, you must run the Revalidate job to recalculate trust scores for new
and existing data, as described in "Revalidate Jobs" on page 556.

Chapter 14: Configuring the Match
Process

This chapter describes how to configure your Hub Store to identify and handle
potential duplicate records. For an introduction to the match process, see
"Match Process" on page 245.

Chapter Contents
• "Configuration Tasks for the Match Process" on page 363
• "Navigating to the Match/Merge Setup Details Dialog" on page 365
• "Configuring Match Properties for a Base Object" on page 366
• "Configuring Match Paths for Related Records" on page 373
• "Configuring Match Columns" on page 387
• "Configuring Match Rule Sets" on page 399
• "Configuring Match Column Rules for Match Rule Sets" on page 407
• "Configuring Primary Key Match Rules" on page 434
• "Investigating the Distribution of Match Keys" on page 438
• "Excluding Records from the Match Process" on page 441

Before You Begin


Before you begin, you must have installed Informatica MDM Hub, created the
Hub Store according to the instructions in Informatica MDM Hub Installation
Guide, and built the schema according to the instructions in "Building the
Schema" on page 73.

Configuration Tasks for the Match Process


This section provides an overview of the configuration tasks associated with
the match process. For an introduction to the match process, see "Match
Process" on page 245.

Understanding Your Data
Before you define match rules, you must be very familiar with your data and
understand:
• the distribution of the values in the columns you intend to use to determine
duplicate records, and
• the general proportion of the total number of records that are duplicates.

Base Object Properties Associated with the Match Process

The following base object properties affect the behavior of the match process.
Duplicate Match Threshold
Used only with the Match for Duplicate Data job for initial data loads. For
more information, see "Duplicate Match Threshold" on page 91.

Max Elapsed Match Minutes
Timeout (in minutes) when executing a match rule. If exceeded, the match
process exits. For more information, see "Max Elapsed Match Minutes" on
page 91.

Match Flag audit table
If enabled, an audit table (BusinessObjectName_FMHA) is created and
populated with the userID of the user who, in Merge Manager, queued a
manual match record for automerging. For more information, see "Match
Flag Audit Table" on page 92 and the Informatica MDM Hub Data Steward
Guide.

Configuration Steps for Defining Match Rules


To define match rules:
1. Configure the match properties for the base object. For more information,
see "Setting Match Properties" on page 366.
2. Define your match columns. For more information, see "Match Columns
Depend on the Search Strategy" on page 388.
3. Define a match rule set for your match rules. For more information, see
"Adding Match Rule Sets" on page 405.
4. Define your match rules for the rule set. For more information, see
"Adding Match Column Rules" on page 424.
5. Repeat steps 3 and 4 until you are finished creating match rules.
6. Based on your knowledge of your data, determine whether you require
matching based on primary keys. For more information, see "Configuring
Primary Key Match Rules" on page 434.
7. If your data is appropriate for primary key matching, create your primary
key match rules. For more information, see "Adding Primary Key Match
Rules" on page 434.

8. Tune your rules. This is an iterative process by which you apply your
match rules to a representative data set, analyze the results, and adjust
your settings to optimize the match performance.

Configuring Base Objects with International Data


Informatica MDM Hub supports matching for base objects that contain data
from non-United States populations, as well as base objects that contain data
from different populations (for example, the United States and China). For
more information, see "Configuring Match Settings for Non-US Populations" on
page 699.

Navigating to the Match/Merge Setup Details Dialog
To set up the match and merge process for a base object, begin by completing
the following steps:
1. Start the Schema Manager. For more information, see "Starting the
Schema Manager" on page 81.
2. In the schema navigation tree, expand the base object for which you want
to define match properties.
3. In the schema navigation tree, select Match/Merge Setup.
The Schema Manager displays the Match/Merge Setup Details dialog.

If you want to change settings, you must acquire a write lock according to
the instructions in "Acquiring a Write Lock" on page 36.

The Match/Merge Setup Details dialog contains the following tabs:

Properties
Summarizes the match/merge setup and provides various configurable
match/merge settings. For more information, see "Configuring Match
Properties for a Base Object" on page 366.

Paths
Allows you to configure the match path for parent/child relationships for
records in different base objects or in the same base object. For more
information, see "Configuring Match Paths for Related Records" on page 373.

Match Columns
Allows you to configure match columns for match column rules. For more
information, see "Configuring Match Columns" on page 387 and "Configuring
Match Column Rules for Match Rule Sets" on page 407.

Match Rule Sets
Allows you to define a search strategy and rules using match rule sets. For
more information, see "Configuring Match Rule Sets" on page 399.

Primary Key Match Rules
Allows you to define primary key match rules. For more information, see
"Configuring Primary Key Match Rules" on page 434.

Match Key Distribution
Shows the distribution of match keys. For more information, see
"Investigating the Distribution of Match Keys" on page 438.

Merge Settings
Allows you to configure merge and link settings. For more information, see
"Configuring the Consolidate Process" on page 443.

Configuring Match Properties for a Base Object
You must set the match properties for a base object before you can configure
other match features, such as match columns and match rules. These match
properties apply to all rules for the base object.

Setting Match Properties


You configure match properties for each base object. These settings apply to
all of its match rules and rule sets.

To configure match properties for a base object:


1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure according to the instructions in
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
2. In the Match/Merge Setup Details pane, click the Properties tab.
The Schema Manager displays the Properties tab.

3. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

For a description of each property, see the next section, "Match
Properties" on page 367.
4. Edit the property settings that you want to change, clicking the Edit button

next to the field if applicable.

5. Click the Save button to save your changes.

Match Properties
This section describes the configuration settings on the Match Properties tab.

Calculated, Read-Only Fields

The Match Properties tab displays the following read-only fields.


Read-Only Match Properties

Match Columns
Number of match columns configured for this base object.

Match Rule Sets
Number of match rule sets configured for this base object.

Match Rules in Active Set
Number of match rules configured for this base object in the rule set
currently selected as active.

Primary key match rules
Number of primary key match rules configured for this base object.

Maximum Matches for Manual Consolidation

This setting helps prevent data stewards from being overwhelmed with
thousands of matches for manual consolidation. This sets the limit on the list
of possible matches that must be decided upon by a data steward (default is
1000). Once this limit is reached, Informatica MDM Hub stops the match
process until the number of records for manual consolidation has been
reduced.

This value is calculated by checking the count of records with a
consolidation_ind=2. At the end of each automatch and merge cycle, this
count is checked and, if the count exceeds the maximum number of matches
for manual consolidation, then the automatch-and-merge process will exit.
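
As a rough sketch (the base object name C_PARTY is hypothetical), you can
check this backlog of records awaiting manual consolidation with a query
such as:

SELECT COUNT(*)
FROM C_PARTY
WHERE CONSOLIDATION_IND = 2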

Number of Rows per Match Job Batch Cycle

This setting specifies an upper limit on the number of records that Informatica
MDM Hub will process for matching during match process execution (Match or
Auto Match and Merge jobs). When the match process starts executing, it
begins by flagging records to be included in the match job batch. From the
pool of new/unconsolidated records that are ready for match
(CONSOLIDATION_IND=4, as described in "Consolidation Indicator" on page
219), the match process changes CONSOLIDATION_IND to 3. The number of
records flagged is determined by the Number of Rows per Match Job Batch
Cycle. The match process then matches those records in the match job batch
against all of the records in the base object.
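
Conceptually, the flagging step behaves like the following sketch (an
illustration in Oracle syntax, not the Hub's internal implementation; the
C_PARTY table name and batch size of 1000 are hypothetical):

UPDATE C_PARTY
SET CONSOLIDATION_IND = 3
WHERE CONSOLIDATION_IND = 4
AND ROWNUM <= 1000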

The number of records in the match job batch affects how long the match
process takes to execute. The value to specify depends on the size of your
data set, the complexity of your match rules, and the length of the time
window you have available to run the match process. The default match batch
size is low (10). Increase this value based on the number of records in the
base object, as well as the number of matches generated for those records
based on its match rules.
• The lower your match batch size, the more times you will need to run the
match and consolidation processes.
• The higher your match batch size, the more work each match and
consolidation process does.

For each base object, there is a middle ground where you reach the optimal
match batch size. You need to identify this optimal batch size as part of
performance tuning in your environment. Start with a match batch size of 10%
of the volume of records to be matched and merged, run the match job only,
see how many matches are generated by your match rules, and then adjust
upwards or downwards accordingly.

Accept All Unmatched Rows as Unique

Enable (set to Yes) this feature to have Informatica MDM Hub mark as unique
(CONSOLIDATION_IND=1) any records that have been through the match
process, but for which no matches were identified. If enabled, for such
records, Informatica MDM Hub automatically changes their state to
consolidated (changes the consolidation indicator from 2 to 1). Consolidated
records are removed from the data steward’s queue via the Automerge batch
job.

By default, this option is disabled. In a development environment, you might
want this option disabled, for example, while iteratively testing and tuning
match rules to determine which records are found to be unique for a given set
of match rules.

This option should always be enabled in a production environment. Otherwise,
you can end up with a large number of records with a consolidation indicator
of 2. If this backlog of records exceeds the Maximum Matches for Manual
Consolidation setting (see "Maximum Matches for Manual Consolidation" on
page 368), then you will need to process these records first before you can
continue matching and consolidating other records.

For more information, see:


• "Initial Data Loads and Incremental Loads" on page 229
• "Consolidation Indicator" on page 219
• "Accept Non-Matched Records As Unique " on page 532
• "Automerge Jobs" on page 534
• "Autolink Jobs" on page 532

Match/Search Strategy

Select the match/search strategy to specify the reliability of the match versus
the performance you require. Select one of the following options.

Fuzzy
Probabilistic match that takes into account spelling variations, possible
misspellings, and other differences that can make matching records
non-identical. This is the primary means of matching data in a base object.
Base objects with this strategy are referred to in this document as
fuzzy-match base objects.
Note: If you specify a Fuzzy match/search strategy, you must specify a
fuzzy match key.

Exact
Matches only records with identical values in the match column(s). If you
specify an exact match, you can define only exact-match columns for this
base object (exact-match base objects cannot have fuzzy-match columns).
Base objects with this strategy are referred to in this document as
exact-match base objects.

An exact strategy is faster, but an exact match will miss some matches if the
data is imperfect. The best option to choose depends on the characteristics of
the data, your knowledge of the data, and your particular match and
consolidation requirements.

Certain configuration settings on the Match/Merge Setup tab apply to only one
type of base object. In this document, such features are indicated with a
graphic that shows whether the feature applies to fuzzy-match base objects
only or to exact-match base objects only. No graphic means that the feature
applies to both.

Note: The match / search strategy is configured at the base object level. For
more information about the match / search strategy configured at the match
rule level, see "Match / Search Strategy" on page 409.

Fuzzy Population

If the match/search strategy is Fuzzy, then you must select a population,
which defines certain characteristics about the records that you are matching.
Data characteristics can vary from country to country. By default, Informatica
MDM Hub comes with the US population, but Informatica provides standard
populations per country. If you require another population, contact
Informatica support. If you choose an exact match/search strategy, then this
value is ignored.

Populations perform the following functions for matching:


• account for the inevitable variations and errors that are likely to exist in
name, address, and other identification data
For example, the population for the US has some intelligence about the
typical identification numbers used in US data, such as the social security
number. Populations also have some intelligence about the distribution of
common names. For example, the US population has a relatively high
percentage of the surname Smith. But a population for a non-English-
speaking country would not have Smith among the common names.
• specify how Informatica MDM Hub builds match tokens, which are
described in "Tokenize Process" on page 240
• specify how search strategies and match purposes operate on the
population of data to be matched

Match Only Previous Rowid Objects

If this setting is enabled (checked), then Informatica MDM Hub matches the
current records against records with lower ROWID_OBJECT values. For
example, if the current record has a ROWID_OBJECT value of 100, then the
record will be matched only against other records in the base object with a
ROWID_OBJECT value that is less than 100 (ignoring all records with a
ROWID_OBJECT value that is higher than 100).

Using this feature can reduce the number of matches required and speed
performance. However, if PUTs are executed, or if records are inserted out of
rowid order, then records might not be fully matched. You must assess the
trade-off between performance and match quantity based on the
characteristics of your data and your particular match requirements. By
default, this option is disabled (unchecked).

Match Only Once

Available only for fuzzy key matching and only if "Match Only Previous Rowid
Objects" on page 371 is checked (selected). If Match Only Once is enabled
(checked), then once a record has found a match, Informatica MDM Hub will
not match it any further within this search range (the set of similar match key
values). Using this feature can reduce duplicates and increase performance.

Instead of finding every match for a record in a search range, Informatica
MDM Hub can find a single match for each. In subsequent match cycles, the
merge process will put these into large groups of XREF records associated
with the base object.

By default, this option is unchecked (disabled). If this feature is enabled,
however, you can miss matches. For example, suppose record A matches
record B, and record A matches record C, but records B and C do not match.
You must assess the trade-off between performance and match quantity based
on the characteristics of your data and your particular match requirements.

Dynamic Match Analysis Threshold

During the match process, dynamic match analysis determines whether the
match process will take an unacceptably long period of time. This threshold
value specifies the maximum acceptable number of comparisons.

To enable the dynamic match threshold, specify a non-zero value. Enable this
feature if you have data that is very similar (with high concentrations of
matches) to reduce the amount of work expended for a hot spot in your data.
A hotspot is a group of records representing overmatched data—a large
intersection of matches. If Dynamic Match Analysis Threshold is enabled, then
records that produce more than the specified number of potential match
candidates will be skipped during the match process. By default, this option is
zero (disabled).

Before conducting a match on a given search range, Informatica MDM Hub
calculates the number of search records (records being searched for
matches), and multiplies it by the number of file records (the number of
records returned from the match key table that need to be compared). If the
result is greater than the specified Dynamic Match Analysis Threshold, then no
comparisons are performed on that range of data, and the range is noted in
the application server log for further investigation.
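
As a hypothetical illustration: if a search range contains 400 search records
and the match key table returns 600 file records for that range, the analysis
count is 400 x 600 = 240,000 comparisons. With a Dynamic Match Analysis
Threshold of 200,000, that range would be skipped and noted in the log.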

Enable Match on Pending Records

By default, the match process includes only ACTIVE records and ignores
PENDING records. For state management-enabled objects, select this check
box to include PENDING records in the match process. Note that, regardless of
this setting, DELETED records are ignored by the match process. For more
information, see "Enabling Match on Pending Records" on page 163.

Reset Link Properties for Link-style Base Objects

For link-style base objects only, you can unlink consolidated records and
requeue them for match. This can be configured to occur automatically on load
update, or manually via the Reset Links batch job. For more information, see
"Reset Links Jobs" on page 555.

For link-style base objects only, the Schema Manager displays the following
properties.
Allow prompt for reset of match links when match rules / columns are changed
Specifies whether to prompt for a reset of match links when configuration
settings for match rules or match columns are changed.

Allow reset of match links for updated data
Specifies whether the reset links prompt applies to updated data (load
updates). This prompt is triggered automatically upon load update.

Allow reset of links to include consolidated records
Specifies whether the reset links process applies to consolidated records.
Note: The reset links process always applies to unconsolidated records.

Allow reset of links to include manually linked records
Specifies whether manually-linked records are included by the reset links
process. Autolinked records are always included.
Note: This setting affects the scope of all other reset links settings.

Supporting Long ROWID_OBJECT Values


If a base object has such a large number of records that its ROWID_OBJECT
values might exceed 12 digits, you need to explicitly enable support for longer
values in the Cleanse Match Server. To enable the Cleanse Match Server to
use long ROWID_OBJECT values, edit the cmxcleanse.properties file and
configure the cmx.server.bmg.use_longs setting:
cmx.server.bmg.use_longs=1

By default, this option is disabled.

Configuring Match Paths for Related Records


This section describes how to configure match paths for related records, which
are used for matching in your Informatica MDM Hub implementation.

About Match Paths


This section describes match paths and related concepts.

Match Paths

A match path allows you to traverse the hierarchy between records—whether
that hierarchy exists between base objects (inter-table paths) or within a
single base object (intra-table paths). Match paths are used for configuring
match column rules involving related records in either separate tables or in
the same table.

Foreign Key Relationships and Filters

Configuring match paths that point to other records involves two main
components:

foreign key relationships
Used to traverse the relationships to other records. Allows you to specify
parent-to-child and child-to-parent relationships.

filters (optional)
Allow you to selectively include or exclude records based on values in a
given column, such as ADDRESS_TYPE or PARTY_TYPE. For more information,
see "Configuring Filters for Match Paths" on page 384.

Relationship Base Objects

In order to configure match rules for these kinds of relationships, particularly
many-to-many relationships, you need to create a separate base object that
serves as a relationship base object to describe to Informatica MDM Hub the
relationships between records. You populate this relationship base object with
information about the relationships using a data management tool (outside of
Informatica MDM Hub) rather than using the Informatica MDM Hub processes
(land, stage, and load, as described in "Informatica MDM Hub Processes" on
page 218).

You configure a separate relationship base object for each type of
relationship. You can include additional attributes of the relationship type,
such as start date, end date, and other relationship details. The relationship
base object defines a match path that enables you to configure match column
rules.

Important: Do not run the match and consolidation processes on a base
object that is used to define relationships between records in inter-table or
intra-table match paths. Doing so will change the relationship data, resulting
in the loss of the associations between records.

Inter-Table Paths

An inter-table path defines the relationship between records in two different
base objects. In many cases, this relationship can be defined simply by
configuring a foreign key relationship: a key column in the child base object
points to the primary key of the parent base object. For more information, see
"Configuring Foreign-Key Relationships Between Base Objects" on page 113.

In some cases, however, the relationship between records can be more
complex, requiring an intermediary base object that defines the relationship
between records in the two tables.

Example Base Objects for Inter-Table Paths

Consider the following example in which an Informatica MDM Hub
implementation has two base objects:

Person
Contains any type of person, such as employees for your organization,
employees for other organizations (prospects, customers, vendors, or
partners), contractors, and so on.

Address
Contains any type of address—mailing, shipping, home, work, and so on.

In this example, there is the potential for many-to-many relationships:


• A person could have multiple addresses, such as a home and work
address.
• A single address could have multiple persons, such as a workplace or
home.

In order to configure match rules for this kind of relationship between records
in different base objects, you would create a separate base object (such as
PersonAddrRel) that describes to Informatica MDM Hub the relationships
between records in the two base objects.

Columns in the Example Base Objects

Suppose the Person base object had the following columns:

Column        Type         Description
ROWID_OBJECT  CHAR(14)     Primary key. Uniquely identifies this person
                           in the base object.
TYPE          CHAR(14)     Type of person, such as an employee or
                           customer contact.
NAME          VARCHAR(50)  Person’s name (simplified for this example).
EMPLOYER      VARCHAR(50)  Person’s employer.
...           ...          ...

Suppose the Address base object had the following columns:

Column        Type         Description
ROWID_OBJECT  CHAR(14)     Primary key. Uniquely identifies this address.
TYPE          CHAR(14)     Type of address, such as a home, work, mailing,
                           or shipping address.
NAME          VARCHAR(50)  Name of the individual or organization residing
                           at this address.
ADDRESS_1     VARCHAR(50)  First address line.
ADDRESS_2     VARCHAR(50)  Second address line.
CITY          VARCHAR(50)  City.
STATE_PROV    VARCHAR(50)  State or province.
POSTAL_CODE   VARCHAR(50)  Postal code.
...           ...          ...

To define the relationship between records in the two base objects, the
PersonAddrRel base object could have the following columns:

Column        Type         Description
ROWID_OBJECT  CHAR(14)     Primary key. Uniquely identifies this
                           relationship record.
PERS_FK       CHAR(14)     Foreign key to the ROWID_OBJECT column in the
                           Person base object.
ADDR_FK       CHAR(14)     Foreign key to the ROWID_OBJECT column in the
                           Address base object.

Note that the column type of the foreign key columns—CHAR(14)—matches
the primary key to which they point.
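
Because relationship base objects are populated outside the Informatica MDM
Hub land, stage, and load processes, you might load them with plain SQL. The
following is a minimal sketch only; the physical table name C_PERS_ADDR_REL
is hypothetical:

INSERT INTO C_PERS_ADDR_REL (ROWID_OBJECT, PERS_FK, ADDR_FK)
VALUES ('1', '380', '132');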

Example Configuration Steps

After you have configured the relationship base object (PersonAddrRel), you
would complete the following tasks:
1. Configure foreign keys from this base object to the ROWID_OBJECT of the
Person and Address base objects. For more information, see "Configuring
Foreign-Key Relationships Between Base Objects" on page 113.

2. Load the PersonAddrRel base object with data that describes the
relationships between records.

ROWID_OBJECT  PERS_FK  ADDR_FK
1             380      132
2             480      920
3             786      432
4             786      980
5             12       1028
6             922      1028
7             1302     110
...           ...      ...

In this example, note that Person #786 has two addresses, and that
Address #1028 has two persons.
3. Use the PersonAddrRel base object when configuring match column rules
for the related records. For more information, see "Configuring Match
Column Rules for Match Rule Sets" on page 407.

Intra-Table Paths

Within a base object, parent/child relationships can exist between individual
records. Informatica MDM Hub allows you to clarify relationships between
records in the same base object, and then use those relationships when
configuring column match rules.

Example Base Object for Intra-Table Paths

Consider the following example of an Employee base object in which reporting
relationships exist between employees.

The relationships among employees are hierarchical. The CEO is at the top of
the hierarchy, representing what is called the global ultimate parent record.

Columns in the Example Base Object

Suppose the Employee base object had the following columns:

Column        Type         Description
ROWID_OBJECT  CHAR(14)     Primary key. Uniquely identifies this employee
                           in the base object.
NAME          VARCHAR(50)  Employee name.
TITLE         VARCHAR(50)  Employee’s job title.
...           ...          ...

Create a Relationship Base Object

In order to configure match rules for this kind of object, you would create a
separate base object to describe to Informatica MDM Hub the relationships
between records. For example, you could create and configure an EmplRepRel
base object with the following columns:

Column         Type        Description
ROWID_OBJECT   CHAR(14)    Primary key. Uniquely identifies this
                           relationship record.
EMPLOYEE_FK    CHAR(14)    Foreign key to the ROWID_OBJECT of the
                           employee record.
REPORTS_TO_FK  CHAR(14)    Foreign key to the ROWID_OBJECT of a manager
                           record.

Note that the column type of the foreign key columns—CHAR(14)—matches
the primary key to which they point.

Example Configuration Steps

After you have configured this base object, you must complete the following
tasks:
1. Configure foreign keys from this base object to the ROWID_OBJECT of the
Employee base object. For more information, see "Configuring Foreign-
Key Relationships Between Base Objects" on page 113.

2. Load this base object with data that describes the relationships between
records.

ROWID_OBJECT  EMPLOYEE_FK  REPORTS_TO_FK
1             7            93
2             19           71
3             24           82
4             29           82
5             31           82
6             31           71
7             48           16
8             53           12

Note that you can define many-to-many relationships between records.
For example, the employee whose ROWID_OBJECT is 31 reports to two
different managers (ROWID_OBJECT=82 and ROWID_OBJECT=71), while
one manager (ROWID_OBJECT=82) has three reports (ROWID_OBJECT=24,
29, and 31).
3. Use the EmplRepRel base object when configuring match column rules for
the related records according to the instructions in "Configuring Match
Column Rules for Match Rule Sets" on page 407.
For example, you could create a match rule that takes into account the
employee’s manager to produce more accurate matches.

Note: This example used a REPORTS_TO field to define the relationship, but
you could use any piece of information to associate the records—even
something more generic and flexible like RELATIONSHIP_TYPE.

Navigating to the Paths Tab
To navigate to the Paths tab for a base object:
1. In the Schema Manager, navigate to the Match/Merge Setup Details dialog
for the base object that you want to configure. For more information, see
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
2. Click the Paths tab.
The Schema Manager displays the Paths tab.

Sections of the Paths Tab

The Paths tab has two sections:

Path Components
Configure the foreign keys used to traverse the relationships. For more
information, see "Configuring Path Components" on page 381.

Filters
Configure filters used to include or exclude records for matching. For more
information, see "Configuring Filters for Match Paths" on page 384.

Root Base Object

The root base object is displayed automatically in the Path Components
section of the screen and is always available. The root base object represents
an entity without child or parent relationships. If you want to configure match
rules that involve parent or child records, you need to explicitly add path
components to the root base object, and these relationships must have been
configured beforehand (see "Configuring Foreign-Key Relationships Between
Base Objects" on page 113).

Configuring Path Components
This section describes how to configure path components in the Schema
Manager. Path components provide a way to define the connection between
parent and child tables using foreign keys for the purpose of using columns
from that table in a match column.

Properties of Path Components

This section describes properties of path components.

Display Name

The name of this path component as it will be displayed in the Hub Console.

Physical Name

Actual name of the path component in the database. Informatica MDM Hub will
suggest a physical name for the path component based on the display name
that you enter.

Check For Missing Children

The Check for Missing Children check box instructs Informatica MDM Hub
either to allow for missing child records (enabled, the default) or to require
all parent records to have child records.

Enabled (Checked)
Use if you might have some missing child records and you have rules that
do not include columns in the tables that might be missing records.

Disabled (Unchecked)
Use if all of your rules use the child columns and do not have null match
enabled. In this case, checking for missing children does not add any
value, and it can have a negative impact on performance.

If you are certain that your data is complete (parent records have child
records), and you include the parent in the child match rule, then inter-table
matching works as expected. However, if your data tends to contain parent
records that are missing child records, or if you do not include the parent
column in the child match rule, you must check (select) the Check for Missing
Children check box in the path component associated with this match column
rule to ensure that an outer join occurs when Informatica MDM Hub checks for
records to match.

Note: If the Check for Missing Children option is enabled, Informatica MDM
Hub performs an outer join between the parent and child tables, which can
have a performance impact. Therefore, when not needed, it is more efficient
to disable this option.

Constraints

Table
List of tables in the schema.

Direction
Direction of the foreign key:
• Parent-to-Child
• Child-to-Parent
• N/A

Foreign Key On
Column to which the foreign key points. This column can be either in a
different base object or in the same base object.

Adding Path Components

To add a path component:


1. In the Schema Manager, navigate to the Paths tab according to the
instructions in "Navigating to the Paths Tab" on page 380.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. In the Path Components section, click the Add button.


The Schema Manager displays the Add Path Component dialog.

4. Specify the properties for this path component. For more information, see
"Properties of Path Components" on page 381.
5. Click OK.

6. Click the Save button to save your changes.

Editing Path Components

To edit a path component:


1. In the Schema Manager, navigate to the Paths tab according to the
instructions in "Navigating to the Paths Tab" on page 380.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the Path Components tree, select the path component that you want to
edit.

4. In the Path Components section, click the Edit button.


The Schema Manager displays the Edit Path Component dialog.

5. Specify the properties for this path component. You can change the
following values:
• Display Name (see "Display Name" on page 381)
• Check for Missing Children (see "Check For Missing Children" on page
381)
6. Click OK.

7. Click the Save button to save your changes.

Deleting Path Components

You can delete path components but not the root base object. To delete a path
component:
1. In the Schema Manager, navigate to the Paths tab according to the
instructions in "Navigating to the Paths Tab" on page 380.

2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the Path Components tree, select the path component that you want to
delete.

4. In the Path Components section, click the Delete button.


The Schema Manager prompts you to confirm deletion.
5. Click Yes.

6. Click the Save button to save your changes.

Configuring Filters for Match Paths


This section describes how to configure filters for match paths in the Schema
Manager.

About Filters

In match paths, a filter allows you to selectively determine whether to include
or exclude records for matching based on values in a given column. When you
define a filter for a column, you specify the filter condition with one or more
values that determine which records qualify for match processing. For
example, if you have an Address base object that contains both shipping and
billing addresses, you might configure a filter that includes only billing
addresses for matching and ignores the shipping addresses. During execution,
the match process will match records in the match batch with billing address
records only.

Filter Properties

In Informatica MDM Hub, filters have the following properties.

Column
Column to configure in the currently-selected base object.

Operator
Operator to use for this filter. One of the following values:
• IN—Include records that contain the specified values.
• NOT IN—Exclude records that contain the specified values.

Values
One or more values to use for this filter.

Example Filter

For example, if you wanted to match only on mailing addresses in an Address
base object, you could specify:

Setting   Example Value
Column    ADDR_TYPE
Operator  IN
Values    MAILING

In this example, only mailing addresses would qualify for matching: records
in which the ADDR_TYPE column contains “MAILING”. All other records would
be ignored.
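
Conceptually, this filter restricts the records that participate in matching
in the same way as a SQL predicate (shown here as an illustrative sketch
only):

WHERE ADDR_TYPE IN ('MAILING')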

Adding Filters

If you add multiple filters, Informatica MDM Hub evaluates the entire
expression using the logical AND operator. For example,
xExpr AND yExpr AND zExpr

To add a filter:
1. In the Schema Manager, navigate to the Paths tab according to the
instructions in "Navigating to the Paths Tab" on page 380.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. In the Filters section, click the Add button.


The Schema Manager displays the Add Filter dialog.

4. Specify the properties for this filter. For more information, see "Filter
Properties" on page 384.
5. Specify the value(s) for this filter according to the instructions in "Editing
Values for a Filter" on page 385.

6. Click the Save button to save your changes.

Editing Values for a Filter

To edit values for a filter:


1. Do one of the following:
• Add a filter. For more information, see "Adding Filters" on page 385.

• Edit filter properties. For more information, see "Editing Filter
Properties" on page 386.

2. In either the Add Filter or Edit Filter dialog, click the Edit button next to
the Values field.
The Schema Manager displays the Edit Values dialog.
3. Configure the values for this filter.

• To add a value, click the Add button. When prompted, specify a value
and then click OK.
• To delete a value, select it in the Edit Values dialog, click the Delete
button, and then click Yes when prompted to delete the value.
4. Click OK.

5. Click the Save button to save your changes.

Editing Filter Properties

To edit filter properties:


1. In the Schema Manager, navigate to the Paths tab according to the
instructions in "Navigating to the Paths Tab" on page 380.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. In the Filters section, click the Edit button.


The Schema Manager displays the Edit Filter dialog.

4. Specify the properties for this filter. For more information, see "Filter
Properties" on page 384.
5. Specify the value(s) for this filter according to the instructions in "Editing
Values for a Filter" on page 385.

6. Click the Save button to save your changes.

Deleting Filters

To delete a filter:
1. In the Schema Manager, navigate to the Paths tab according to the
instructions in "Navigating to the Paths Tab" on page 380.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the Filters section, select the filter that you want to delete, and then
click the Delete button.
The Schema Manager prompts you to confirm deletion.
4. Click Yes.

Configuring Match Columns


This section describes how to configure match columns so that you can use
them in match column rules (see "Configuring Match Column Rules for Match
Rule Sets" on page 407). If you want to configure primary key match rules
instead, see the instructions in "Configuring Primary Key Match Rules" on page
434.

About Match Columns


A match column is a column that you want to use in a match rule, such as
name or address columns. Before you can use a column in rule definitions,
you must first designate it as a column that can be used in match rules, and
provide information about the data it contains. For more information, see
"Match Columns Depend on the Search Strategy" on page 388.

Match Column Types

There are two types of columns used in match rules:

Fuzzy
Probabilistic match. Suitable for columns containing data that varies in
spelling, abbreviations, word sequence, completeness, reliability, and other
inconsistencies. Examples include street addresses and names of people or
organizations.

Exact
Deterministic match. Suitable for columns containing consistent and
predictable patterns. Exact match columns match only on identical data.
Examples include IDs, postal codes, industry codes, or any other
well-defined piece of information.

Match Columns Depend on the Search Strategy

The types of match columns that you can configure depend on the type of the
base object that you are configuring (see "Exact-match and Fuzzy-match Base
Objects" on page 247). The type of base object is defined by the selected
match / search strategy (see "Match/Search Strategy" on page 370).

Fuzzy-match base objects
Allow you to configure fuzzy-match columns as well as exact-match
columns. For more information, see "Configuring Match Columns for
Fuzzy-match Base Objects" on page 390.

Exact-match base objects
Allow you to configure exact-match columns but not fuzzy-match columns.
For more information, see "Configuring Match Columns for Exact-match
Base Objects" on page 396.

Path Component

The path component is either the source table to use for a match column
definition, or the match path used to navigate a hierarchy of records. Match
paths are used for configuring match column rules involving related records in
either separate tables or in the same table. Before you can specify a path
component, the match path must be configured. For more information, see
"Configuring Match Paths for Related Records" on page 373.

To specify a path component for a match column:

1. Click the button next to the Path Component field.


The Schema Manager displays the Select Match Path Component dialog.

2. Select the match path component.


3. Click OK.

Field Types

For fuzzy-match columns, the field name drop-down list displays the following
field types. For more information, see "Adding Exact-match Columns for
Fuzzy-match Base Objects" on page 395.

Field Types

Address_Part1
Includes the part of the address up to, but not including, the locality last
line. The position of the address components should be the normal word
order used in your data population. Pass this data in one field. Depending
on your base object, you may concatenate these attributes into one field
before matching. For example, in the US, an Address_Part1 string includes
the following fields: Care-of + Building Name + Street Number + Street
Name + Street Type + Apartment Details. Address_Part1 uses methods and
options designed specifically for addresses.

Address_Part2
Locality line in an address. For example, in the US, a typical Address_Part2
includes: City + State + Zip (+ Country). Matching on Address_Part2 uses
methods and options designed specifically for addresses.

Attribute1, Attribute2
Two general-purpose fields. These fields are matched using a general
purpose, string-matching algorithm that compensates for transpositions
and missing characters or digits.

Date
Matches any type of date, such as date of birth, expiry date, date of
contract, date of change, creation date, and so on. It expects the date to
be passed in Day+Month+Year format. It supports the use or absence of
delimiters between the date components. Matching on dates uses methods
and options designed specifically for dates. It overcomes the typical error
and variation found in this data type.

ID
Matches any type of ID number, such as: Account number, Customer
number, Credit Card number, Drivers License number, Passport, Policy
number, SSN or other identity code, VIN, and so on. It uses a
string-matching algorithm that compensates for transpositions and missing
characters or digits.

Organization_Name
Matches the names of organizations, such as company names, business
names, institution names, department names, agency names, trading
names, and so on. This field supports matching on a single name or on a
compound name (such as a legal name and its trading style). You may also
use multiple names (for example, a legal name and a trading style) in a
single Organization_Name column for the match.

Person_Name
Matches the names of people. Use the full person name. The position of
the first name, middle names, and family names should be the normal word
order used in your population. For example, in English-speaking countries,
the normal order is: First Name + Middle Name(s) + Family Name(s).
Depending on your base object design, you can concatenate these fields
into one field before matching. This field supports matching on a single
name or an account name (such as JOHN & MARY SMITH). You may also
use multiple names, such as a married name and a former name.

Postal_Area
Can be used to place more emphasis on the postal code than if it were
included in the Address_Part2 field. It is for all types of postal codes,
including Zip codes. It uses a string-matching algorithm that compensates
for transpositions and missing characters or digits.

Telephone_Number
Used to match telephone numbers. It uses a string-matching algorithm that
compensates for transpositions and missing digits or area codes.

Selecting Multiple Columns for Matching

If you specify more than one column for matching:


• Values are concatenated into the field used by the match purpose, with a
space inserted between each value. For example, you can select first,
middle, last, and suffix columns in your base object. The concatenated
fields will look like this (a space follows the last word in the string):
first middle last suffix

For example:
Anna Maria Gonzales MD

• For data containing spaces or null data:
• If there are spaces in the data, then the spaces remain and the field is
not NULL.
• If all the fields are null, then the combined value is null.
• If any component of the combined field is null, then no extra space is
added to replace the null.
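
For example, if the middle-name column were null for the record above, the
concatenated value would be "Anna Gonzales MD " with single spaces only,
rather than a string containing a double space where the middle name would
have been.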

Note: Concatenating columns is not recommended for exact match columns.

Configuring Match Columns for Fuzzy-match Base Objects

Fuzzy-match base objects can have both fuzzy-match and exact-match
columns. For exact-match base objects instead, see "Configuring Match
Columns for Exact-match Base Objects" on page 396.

Navigating to the Match Columns Tab for a Fuzzy-match Base Object

To define match columns for a fuzzy-match base object:


1. In the Schema Manager, select the fuzzy-match base object that you want
to configure.
2. Click the Match/Merge Setup node. For more information, see
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
3. Click the Match Columns tab.
The Schema Manager displays the Match Columns tab for the fuzzy-match
base object.

The Match Columns tab for a fuzzy-match base object has the following
sections.

Fuzzy Match Key
Properties for the fuzzy match key. For more information, see "Configuring
Fuzzy Match Key Properties" on page 391.

Match Columns
Match columns and their properties:
• Field Name (see "Field Types" on page 388)
• Column Type (see "Match Column Types" on page 387)
• Path Component (see "Path Component" on page 388)
• Source Table—table referenced in the path component, or the base
object (if the path component is root)

Match Column Contents
List of available columns in the base object, as well as columns that have
been selected for match.

Configuring Fuzzy Match Key Properties

This section describes how to configure the match column properties for
fuzzy-match base objects (see "Match/Search Strategy" on page 370). The
Fuzzy Match Key is a special column in the base object that the Schema
Manager adds if a match column uses the fuzzy match / search strategy. This
column is the primary field used during searching and matching to generate
match candidates for this base object. All fuzzy base objects have one and
only one Fuzzy Match Key.

Key Types

The match key type describes important characteristics about a column to
Informatica MDM Hub. Informatica MDM Hub has some intelligence about
names and addresses, so this information helps Informatica MDM Hub
generate keys correctly and conduct better searches. This is the main criterion
for the search that builds the initial list of potential match candidates. This key
type should be based on the main type of data that is in physical column(s)
that make up the fuzzy match key.

For a fuzzy-match base object, you can select one of the following key types:

Person_Name
Use if your fuzzy match key contains data for individuals only.

Organization_Name
Use if your fuzzy match key contains data for organizations only, or if it
contains data for both organizations and individuals.

Address_Part1
Use if your fuzzy match key contains address data to be consolidated.

Note: Key types are based on the population you select. The above list of key
types applies to the default population (US). Other populations might have
different key types. If you require another population, contact Informatica
support.

Key Widths

The match key width determines the thoroughness of the analysis of the fuzzy
match key, the number of possible match candidates returned, and how much
disk space the keys consume. Key widths apply to fuzzy-match objects only.

Standard
Appropriate for most fuzzy match keys, balancing reliability and space
usage.

Extended
Might result in more match candidates, but at the cost of longer processing
time to generate keys. This option provides some additional matching
capability due to the concatenation of columns. This key width works best
when:
• your data set is not extremely large
• your data set is not complete
• you have sufficient resources to handle the processing time and disk
space requirements

Limited
Trades some match reliability for disk space savings. This option might
result in fewer match candidates, but searches can be faster. This option
works well if you are willing to undermatch for faster searches that use
less disk space for the keys. Limited keys match fewer records with
word-order variations than standard keys. This choice provides a subset of
the Standard key set, but might be the best option if disk space is
restricted or the data volume is extremely large.

Preferred
Generates a single key per base object record. This option trades some
match reliability for performance (it reduces the number of matches that
need to be performed) and disk space savings (it reduces the size of the
match key table). Depending on the characteristics of the data, a preferred
key width might result in fewer match candidates.

Steps to Configure Fuzzy Match Key Properties

To configure fuzzy match key properties for a fuzzy-match base object:


1. In the Schema Manager, navigate to the Match Columns tab according to
the instructions in "Navigating to the Match Columns Tab for a Fuzzy-
match Base Object" on page 390.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Configure the following settings for this fuzzy-match base object.

Key Type
Type of field primarily used in the match. This is the main criterion for the
search that builds the initial list of potential match candidates. This key
type should be based on the main type of data stored in the base object.
For more information, see "Key Types" on page 392.

Key Width
Size of the search range for which keys are generated. For more
information, see "Key Widths" on page 392.

Path Component
Path component for this fuzzy match key. This is a table containing the
column(s) to designate as the key type: Base Object, Child Base Object
table, or Cross-reference table. For more information, see "Path
Component" on page 388.

4. Click the Save button to save your changes.

Adding a Fuzzy-match Column for Fuzzy-match Base Objects

To define a fuzzy-match column for a fuzzy-match base object:


1. In the Schema Manager, navigate to the Match Columns tab. For more
information, see "Navigating to the Match Columns Tab for a Fuzzy-match
Base Object" on page 390.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. To add a fuzzy-match column, click the Add button.


The Schema Manager displays the Add Fuzzy-match Column dialog.

4. Specify the following settings.

Match Path Component
Match path component for this fuzzy-match column. For a fuzzy-match
column, the source table can be the parent table, a parent cross-reference
table, or any child base object table. For more information, see "Path
Component" on page 388.

Field Name
Name of this field as it will be displayed in the Hub Console. For fuzzy
match columns, this is a drop-down list where you can select the type of
data in the match column being defined, as described in "Field Types" on
page 388.
5. Specify the base object column(s) for the fuzzy match.
To add a column to the Selected Columns list, select a column name and
then click the right arrow button.
Note: If you add multiple columns, the values are concatenated, with a
separator space between values. For more information, see "Selecting
Multiple Columns for Matching" on page 390.
6. Click OK.
The Schema Manager adds the match column to the Match Columns list.

7. Click the Save button to save your changes.

Adding Exact-match Columns for Fuzzy-match Base Objects

To define an exact-match column for a fuzzy-match base object:


1. In the Schema Manager, navigate to the Match Columns tab. For more
information, see "Navigating to the Match Columns Tab for a Fuzzy-match
Base Object" on page 390.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. To add an exact-match column, click the Add button.


The Schema Manager displays the Add Exact-match Column dialog.

4. Specify the following settings.

Match Path Component
Match path component for this exact-match column. For an exact-match
column, the source table can be the parent table and / or child physical
columns. For more information, see "Path Component" on page 388.

Field Name
Name of this field as it will be displayed in the Hub Console.
5. Specify the base object column(s) for the exact match.
To add a column to the Selected Columns list, select a column name and
then click the right arrow.
Note: If you add multiple columns, the values are concatenated, with a
separator space between values. For more information, see "Selecting
Multiple Columns for Matching" on page 390.
Note: Concatenating columns is not recommended for exact match
columns.
6. Click OK.

The Schema Manager adds the match column to the Match Columns list.

7. Click the Save button to save your changes.

Editing Match Column Properties for Fuzzy-match Base Objects

Instead of editing match column properties, you must:


• delete the match column, as described in "Deleting Match Columns for
Fuzzy-match Base Objects" on page 396
• add a new match column, specifying the settings that you want, as
described in "Adding Exact-match Columns for Fuzzy-match Base Objects"
on page 395

Deleting Match Columns for Fuzzy-match Base Objects

To delete a match column for a fuzzy-match base object:


1. In the Schema Manager, navigate to the Match Columns tab. For more
information, see "Navigating to the Match Columns Tab for a Fuzzy-match
Base Object" on page 390.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the Match Columns list, select the match column that you want to
delete.

4. Click the Delete button.


The Schema Manager prompts you to confirm deletion.
5. Click Yes.

6. Click the Save button to save your changes.

Configuring Match Columns for Exact-match Base Objects

Before you define match column rules, you must define the match columns on
which they will be based. Exact-match base objects can have only exact-
match columns. For more information about configuring match columns for
fuzzy-match base objects instead, see "Configuring Match Columns for Fuzzy-
match Base Objects" on page 390.

Navigating to the Match Columns Tab for an Exact-match Base
Object

To define match columns for an exact-match base object:


1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the exact-match base object that you want to configure. For more
information, see "Navigating to the Match/Merge Setup Details Dialog" on
page 365.
2. Click the Match Columns tab.
The Schema Manager displays the Match Columns tab for the exact-match
base object.

The Match Columns tab for an exact-match base object has the following
sections.

• Match Columns: Match columns and their properties:
• Field Name
• Column Type (see "Match Column Types" on page 387)
• Path Component (see "Path Component" on page 388)
• Source Table—the table referenced in the path component, or the base
object (if the path component is root)
• Match Column Contents: List of available columns and columns selected
for matching.

Adding Match Columns for Exact-match Base Objects

You can add only exact-match columns for exact-match base objects. Fuzzy-
match columns are not allowed.

To add an exact-match column for an exact-match base object:
1. In the Schema Manager, navigate to the Match Columns tab. For more
information, see "Navigating to the Match Columns Tab for an Exact-match
Base Object" on page 397.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. To add an exact-match column, click the Add button.


The Schema Manager displays the Add Exact-match Column dialog.

4. Specify the following settings.

• Match Path Component: The match path component for this exact-match
column. For an exact-match column, the source table can be the parent
table and/or child physical columns. For more information, see "Path
Component" on page 388.
• Field Name: The name of this field as it will be displayed in the Hub
Console.
5. Specify the base object column(s) for the exact match.
To add a column to the Selected Columns list, select a column name and
then click the right arrow.
Note: If you add multiple columns, the values are concatenated, with a
separator space between values. For more information, see "Selecting
Multiple Columns for Matching" on page 390.
Note: Concatenating columns is not recommended for exact match
columns.
6. Click OK.
The Schema Manager adds the selected match column(s) to the Match
Columns list.

7. Click the Save button to save your changes.

Editing Match Column Properties for Exact-match Base Objects

Instead of editing match column properties, you must:


1. Delete the match column, as described in "Deleting Match Columns for
Exact-match Base Objects" on page 399.
2. If you want to add a match column with the same name, click the Save
button to save your changes first.
3. Add a new match column, specifying the settings that you want, as
described in "Adding Match Columns for Exact-match Base Objects" on
page 397.

Deleting Match Columns for Exact-match Base Objects

To delete a match column for an exact-match base object:


1. In the Schema Manager, navigate to the Match Columns tab. For more
information, see "Navigating to the Match Columns Tab for an Exact-match
Base Object" on page 397.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the Match Columns list, select the match column that you want to
delete.

4. Click the Delete button.


The Schema Manager prompts you to confirm deletion.
5. Click Yes.

6. Click the Save button to save your changes.

Configuring Match Rule Sets


This section describes how to configure match rule sets for your Informatica
MDM Hub implementation.

About Match Rule Sets


A match rule set is a logical collection of match column rules (see "Configuring
Match Column Rules for Match Rule Sets" on page 407) that have some
properties in common. Match rule sets are associated with match column rules
only—not primary key match rules (which are described in "Configuring
Primary Key Match Rules" on page 434).

Match rule sets allow you to execute different sets of match column rules at
different times. The match process uses only one match rule set per
execution. To match using a different match rule set, the match rule set must
be selected and the match process must be executed again.

Note: Only one match column rule in the match rule set needs to succeed in
order to declare a match between records.

What Match Rule Sets Specify

Match rule sets include:


• a search level that dictates the search strategy
• any number of automatic and manual match column rules
• optionally, a filter that allows you to selectively include or exclude records
from the match batch during the match process

Multiple Match Rule Sets and the Specified Default

You can configure any number of rule sets. When users want to run the Match
batch job, they select one rule set from the list of rule sets that have been
defined for the base object.

For more information about choosing match rule sets, see "Selecting a Match
Rule Set" on page 549.

In the Schema Manager, you designate one match rule set as the default.

When to Use Match Rule Sets

Match rule sets allow you to accommodate different match column rule
requirements at different times. For example, you might use one match rule
set for an initial data load and a different match rule set for subsequent
incremental loads. Similarly, you might use one match rule set to process all
records, and another match rule set with a filter to process just a subset of
records (see "Filtering SQL" on page 403).

Rule Set Evaluation

Before saving any changes to a match rule set (including any changes to
match rules in the match rule set), the Schema Manager analyzes the match
rule set and prompts you with a warning message if the match rule set has
any issues.

Note: This is only a warning message. You can choose to ignore the message
and save changes anyway.

Example issues include a match rule set that:


• is identical to an already existing match rule set
• is empty—no match column rules have been added
• contains no fuzzy-match column rules for a fuzzy-match base object
• contains one or more fuzzy-match columns but no exact-match column
(can impact match performance)
• contains fuzzy and exact-match columns with the same source columns

Match Rule Set Properties


This section describes the properties for match rule sets.

Name

The name of the rule set. Specify a unique, descriptive name.

Search Levels

Used with fuzzy-match base objects only. When you configure a match rule
set, you define a search level that instructs Informatica MDM Hub on how
stringently and thoroughly to search for candidate matches.

The goal of the match process is to find the optimal number of matches for
your data:
• not too few (called undermatching), which misses relevant matches, or

• not too many (called overmatching), which generates too many matches,
including matches that are not relevant

For any name or address in a fuzzy match key, Informatica MDM Hub uses the
defined search level to generate different key ranges for the purpose of
determining which records are possible match candidates—and to which
records the match column rules will be applied.

You can choose one of the following search levels:


• Narrow: The most stringent level in searching for possible match
candidates. This search level is fast, but it can result in fewer matches
than other search levels might generate and can result in undermatching.
Narrow can be appropriate if your data set is relatively correct and
complete, or for very large data sets whose data contains many similar
records.
• Typical: Appropriate for most rule sets.
• Exhaustive: Generates a larger set of possible match candidates than the
Typical level. This can result in more matches than other search levels
might generate, possibly result in overmatching, and take more time. This
level might be appropriate for smaller data sets that are less complete.
• Extreme: Generates a still larger set of possible match candidates, which
can result in overmatching and take much more time. This level might be
appropriate for smaller data sets that are less complete, or to identify the
highest possible number of matching records.

The search level you choose should be determined by the size of your data
set, your time constraints, and how critical the matches are. Depending on
your circumstances and requirements, it is sometimes more appropriate to
undermatch, while at other times, it is more appropriate to overmatch.
Implementations dealing with relatively reliable and complete data can use
the Narrow level, while implementations dealing with less reliable data or with
more critical problems should use Exhaustive or Extreme.

The search level might also differ depending on the phase of a project. It
might be necessary to have a looser level (Exhaustive or Extreme) for initial
matching, and tighten as the data is deduplicated.

Enable Search by Rules

This setting specifies whether searching by rules is enabled (checked) or not
(unchecked, the default). It is used with fuzzy-match base objects only and
applies only to the SIF searchMatch request. The searchMatch request
searches for records in a package based on match column and rule
definitions. The
searchMatch request uses the columns in these records to generate match
columns that are used by the match server to find match candidates. For more
information about searchMatch, see the Informatica MDM Hub Services
Integration Framework Guide and the Informatica MDM Hub Javadoc.

By default, when an application calls the SIF searchMatch request, all possible
match columns are generated from the package or mapping records specified
in the request, and the match is performed by treating all columns with equal
weight. You can enable this option, however, to allow applications to specify
input match columns, in which case the searchMatch API ignores any columns
that were not passed as part of the request. You might use this feature if, for
example, you were using a custom population definition and wanted to call the
searchMatch API with a particular set of rules.

Enable Filtering

Specifies whether filtering is enabled for this match rule set.


• If checked (selected), allows you to define a filter (see "Filtering SQL" on
page 403) for this match rule set. When running a Match job, users can
select the match rule set (see "Selecting a Match Rule Set" on page 549)
with a filter defined so that the Match job processes only the subset of
records that meet the filter criteria.
• If unchecked (not selected), then all records will be processed by the
match rule set when the Match batch job runs.

For example, if you had an Organization base object that contained multiple
types of organizations (customers, vendors, prospects, partners, and so on),
you could define different match rule sets that selectively processed only the
type of records you want to match: MatchAll (no filter), MatchCustomersOnly,
MatchVendorsOnly, and so on.

Filtering SQL

By default, when the Match batch job is run (see "Match Jobs" on page 547),
the match rule set processes all records. If the Enable Filtering check box (see
"Enable Filtering" on page 403) is selected (checked), you can specify a filter
condition to restrict processing to only those records that meet the filter
condition. A filter is analogous to a WHERE clause in a SQL statement. The
filter expression can be any expression that is valid for the WHERE clause
syntax used in your database platform.

Note: The match rule set filter is applied to the base object records that are
selected for the match batch only (the records to match from)—not the
records in the match pool (the records to match to). For more information,
see "Flagging the Match Batch" on page 251.

For example, suppose your implementation had an Organization base object
that contained multiple types of organizations (customers, vendors,
prospects, partners, and so on). Using filters, you could define a match rule
set (MatchCustomersOnly) that processed customer data only:
org_type='C'

All other, non-customer records would be ignored and not processed by the
Match job.
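
A filter can also combine several conditions using any syntax that is valid
in a WHERE clause on your database platform. For example, the following
filter would restrict the match batch to US customer records (the org_type
and country_code columns here are illustrative only):

org_type='C' AND country_code='US'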

Note: It is the administrator’s responsibility to specify an appropriate SQL
expression that correctly filters records during the Match job. The Schema
Manager validates the SQL syntax according to your database platform, but it
does not check the logic or suitability of your filter condition.

Match Rules

This area of the window displays a list of match column rules that have been
configured for the selected match rule set. For more information, see
"Configuring Match Column Rules for Match Rule Sets" on page 407.

Navigating to the Match Rule Set Tab


To navigate to the Match Rule Set tab:
1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more information, see
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
2. Click the Match Rule Sets tab.
The Schema Manager displays the Match Rule Sets tab for the selected
base object.

The Match Rule Sets tab consists of the following sections:

• Match Rule Sets: List of configured match rule sets.
• Properties: Properties for the selected match rule set.

Adding Match Rule Sets


To add a new match rule set:
1. In the Schema Manager, display the Match Rule Sets tab in the
Match/Merge Setup Details dialog for the base object that you want to
configure. For more information, see "Navigating to the Match Rule Set
Tab" on page 404.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. Click the Add button.


The Schema Manager displays the Add Match Rule Set dialog.

4. Enter a unique, descriptive name for this new match rule set.
5. Click OK.
The Schema Manager adds the new match rule set to the list.
6. Configure the match rule set according to the instructions in the next
section, "Editing Match Rule Set Properties" on page 405.

Editing Match Rule Set Properties


To edit the properties of a match rule set:
1. In the Schema Manager, display the Match Rule Sets tab in the
Match/Merge Setup Details dialog for the base object that you want to
configure. For more information, see "Navigating to the Match Rule Set
Tab" on page 404.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the match rule set that you want to configure.
The Schema Manager displays its properties in the properties panel.
• The following example shows the properties for a fuzzy-match base
object.

• The following example shows the properties for an exact-match base
object.

4. Configure properties for this match rule set. For more information, see
"Match Rule Set Properties" on page 401.
5. Configure match columns for this match rule set according to the
instructions in "Configuring Match Column Rules for Match Rule Sets" on
page 407.

6. Click the Save button to save your changes.


Before saving changes, the Schema Manager analyzes the match rule set
and prompts you with a message if the match rule set contains certain
incongruences. For more information, see "Rule Set Evaluation" on page
400.
7. If you are prompted to confirm saving changes, click OK to save your
changes.

Renaming Match Rule Sets


To rename a match rule set:

1. In the Schema Manager, display the Match Rule Sets tab in the
Match/Merge Setup Details dialog for the base object that you want to
configure. For more information, see "Navigating to the Match Rule Set
Tab" on page 404.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the match rule set that you want to rename.

4. Click the Edit button.


The Schema Manager displays the Edit Rule Set Name dialog.

5. Specify a unique, descriptive name for this match rule set.


6. Click OK.
The Schema Manager updates the name of the match rule set in the list.

Deleting Match Rule Sets


To delete a match rule set:
1. In the Schema Manager, display the Match Rule Sets tab in the
Match/Merge Setup Details dialog for the base object that you want to
configure. For more information, see "Navigating to the Match Rule Set
Tab" on page 404.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Select the name of the match rule set that you want to delete.

4. Click the Delete button.


The Schema Manager prompts you to confirm deletion.
5. Click Yes.
The Schema Manager removes the deleted match rule set, along with all of
the match column rules it contains, from the list.

Configuring Match Column Rules for Match Rule Sets
This section describes how to configure match column rules for a match rule
set in your Informatica MDM Hub implementation. For more information about
match rule sets, see "Configuring Match Rule Sets" on page 399. For more
information about the difference between match column rules and primary
key rules, see "Configuring Primary Key Match Rules" on page 434.

About Match Column Rules


A match column rule determines what constitutes a match during the match
process. Match column rules determine whether two records are similar
enough to consolidate. Each match rule is defined as a set of one or more
match columns that it needs to examine for points of similarity. Match rules
are configured by setting the conditions for identifying matching records
within and across source systems. For more information, see "About the
Match Process" on page 245.

Prerequisites for Configuring Match Column Rules

You can configure match column rules only after you have:


• configured the columns that you intend to use in your match rules, as
described in "Configuring Match Columns" on page 387
• created at least one match rule set, as described in "Configuring Match
Rule Sets" on page 399

Match Column Rules Differ Between Exact-Match and Fuzzy-Match Base Objects

The properties for match column rules differ between exact-match and fuzzy-
match base objects (see "Exact-match and Fuzzy-match Base Objects" on
page 247).
• For exact-match base objects, you can configure only exact column types.
• For fuzzy-match base objects, you can configure fuzzy or exact column
types. For more information, see "Match Rule Properties for Fuzzy-match
Base Objects Only" on page 409.

Specifying Consolidation Options for Matched Records

For each match column rule, decide whether matched records should be
automatically or manually consolidated. For more information, see
"Specifying Consolidation Options for Match Column Rules " on page 431 and
"Consolidating Records Automatically or Manually" on page 256.

Match Rule Properties for Fuzzy-match Base
Objects Only

This section describes match rule properties for fuzzy-match base objects.
These properties do not apply to exact-match base objects.

Match / Search Strategy

For fuzzy-match base objects, the match / search strategy defines the
strategy that Informatica MDM Hub uses for searching and matching in the
match rule. Select one of the following options:
• Fuzzy: A probabilistic match that takes into account spelling variations,
possible misspellings, and other differences that can make matching
records non-identical.
• Exact: Matches only records that are identical.

Certain configuration settings on the Match / Merge Setup tab apply to only
one type of column. In this document, such features are indicated with a
graphic that shows whether the feature applies to fuzzy-match columns only
or to exact-match columns only. No graphic means that the feature applies to
both.

The match / search strategy determines how to match candidate A with
candidate B using fuzzy or exact methods. The match / search strategy can
affect the quantity and quality of the match candidates. An exact match /
search strategy requires clean and complete data—it might miss some
matches if the data is less clean, incomplete, or full of duplicates. When
defining match rule properties, you must find the optimal balance between
finding all possible candidates and not encumbering the process with too
many irrelevant candidates.

Note: This match / search strategy is configured at the match rule level. For
more information about the match / search strategy configured at the base
object level (which determines whether it is a fuzzy-match base object or
exact-match base object), see "Match/Search Strategy" on page 370.

When specifying the match / search strategy for a fuzzy-match base object,
consider the implications of configuring the following types of match rules:
• Fuzzy (Fuzzy Search Strategy): Applies to fuzzy and exact-match columns.
• Exact (Exact Search Strategy): Applies to exact-match columns only. This
option bypasses the fuzziness of the base object and executes a simple
exact match rule on a fuzzy base object.
• Filtered (Fuzzy Search Strategy): Applies to exact-match columns only.
This option uses the fuzzy match key as a filter, and then applies the exact
match rule.

Match Purpose

For fuzzy-match base objects, the match purpose defines the primary goal
behind a match rule. For example, if you're trying to identify matches for
people where address is an important part of determining whether two
records are for the same person, then you would choose the Match Purpose
called Resident.

For every match rule you define, you must choose the purpose of the rule
from a list of predefined match purposes provided by Informatica. Each match
purpose contains knowledge about how best to compare two records to
achieve the purpose of the match. Informatica MDM Hub uses the selected
match purpose as a basis for applying the match rules to determine matched
records. The behavior of the rules is dependent on the selected purpose. The
list of available match purposes depends on the population used, as described
in "Fuzzy Population" on page 370.

What the Match Purpose Determines

The match purpose determines:


• how your match rules behave
• which columns are required
• how much attention Informatica MDM Hub pays to each of the columns
used in the match process

Two rules with all attributes identical (except for the purpose) will return
different sets of matches because of the different purpose.

Mandatory and Optional Fields

Each match purpose supports a combination of mandatory and optional fields.
Each field is weighted according to its influence in the match decision. Some
fields in some purposes may be grouped. There are two types of groupings:
• Required—requires at least one of the field members to be non-null
• Best of—contributes only the best score from the fields in the group to the
overall match score

For example, in the Individual match purpose:


• Person_Name is a mandatory field
• One of either ID Number or Date of Birth is required
• Other attributes are optional

The overall score returned by each purpose is calculated by adding the
participating field scores multiplied by their respective weights and divided
by the total of all field weights. If a field is optional and is not provided,
it is not included in the weight calculation.
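
As an illustration, the calculation has the general form:

overall score = sum(field score x field weight) / sum(field weights)

For example, if a purpose used two fields with assumed internal weights of
40 (Person_Name) and 30 (ID), and the field comparisons scored 90 and 80,
the overall score would be (90 x 40 + 80 x 30) / (40 + 30) = 6000 / 70, or
approximately 86. The weights and scores in this example are illustrative
only; the actual weights are internal to each match purpose and population.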

Name Formats

Informatica MDM Hub match has the concept of a default name format which
tells it where to expect the last name. The options are:
• Left—last name is at the start of the full name, for example Smith Jim
• Right—last name is at the end of the full name, for example, Jim Smith

The name format used by Informatica MDM Hub depends on the purpose that
you're using. If you are using Organization, then the default is Last name,
First name, Middle name. If using Person/Resident then the default is First
Middle Last.

Bear this in mind when formatting data for matching. It might not make a big
difference, but there are edge cases where it helps, particularly for names
that do not fall within the selected population.

List of Match Purposes

Informatica supplies the following match purposes:

Match Purpose Settings
Match Purpose Description
Person_Name This purpose is for matches intended to identify a person by
name. This purpose is best suited to online searches when a
name-only lookup is required and a human is available to make
the choice. Matching in batch typically requires other attributes
in addition to name to make match decisions. Use this purpose
only when the rule does not contain address fields. This purpose
will allow matches between people with an address and those
without an address. If the rules contain address fields, use the
Resident purpose instead.
This purpose uses the following fields:
• Person_Name (Required)
• Address_Part1
• Address_Part2
• Postal_Area
• Telephone_Number
• ID
• Date
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
To achieve a “best of” score between Address_Part2 and Postal_
Area, use Postal_Area as a repeat value in the Address_Part2
field.
Individual This purpose is intended to identify a specific individual by name
and with either the same ID number or date of birth attributes.
Since this purpose requires additional information, it is typically
used after a search by Person_Name.
This purpose uses the following fields:
• Person_Name (Required)
• ID (either ID or Date is required; using both is acceptable)
• Date
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
Resident Intended to identify a person at an address. This purpose is
typically used after a search by either Person_Name or Address_
Part1. Optional input fields help qualify or rank a match if more
information is available.
To achieve a “best of” score between Address_Part2 and Postal_
Area, pass Postal_Area as a repeat value in the Address_Part2
field.
This purpose uses the following fields:
• Person_Name (Required)
• Address_Part1 (Required)
• Address_Part2
• Postal_Area
• Telephone_Number
• ID
• Date
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
Household Designed to identify matches where individuals with the same or
similar family names share the same address.
This purpose is typically used after a search by Address_Part1.
(Note: it is not practical to search by Person_Name because
ultimately only one word from the Person_Name must match,
and a one-word search will not perform well in most situations).
Emphasis is placed on the Last Name, the major word of the
Person_Name field, so this is one of the few cases where word
order is important in the way the records are presented for
matching.
However, a reasonable score will be generated provided that a
match occurs between the major word in one name and any
other word in the other name.
This purpose uses the following fields:
• Person_Name (Required)
• Address_Part1 (Required)
• Address_Part2
• Postal_Area
• Telephone_Number
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
To achieve a “best of” score between Address_Part2 and Postal_
Area, pass Postal_Area as a repeat value in the Address_Part2
field.

Family Designed to identify matches where individuals with the same or
similar family names share the same address or the same
telephone number.
This purpose is typically used after a tiered search (multi-
search) by Address_Part1 and Telephone_Number. (Note: it is
not practical to search by Person_Name because ultimately only
one word from the Person_Name needs to match, and a one-
word search will not perform well in most situations).
Emphasis is placed on the Last Name, the major word of the
Person_Name field, so this is one of the few cases where word
order is important in the way the records are presented for
matching.
However, a reasonable score will be generated provided that a
match occurs between the major word in one name and any
other word in the other name.
This purpose uses the following fields:
• Person_Name (Required)
• Address_Part1 (Required)
• Telephone_Number (Required) (Score will be based on best
of Address_Part1 and Telephone_Number)
• Address_Part2
• Postal_Area
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
To achieve a “best of” score between Address_Part2 and Postal_
Area, pass Postal_Area as a repeat value in the Address_Part2
field.

Wide_Household Designed to identify matches where the same address is shared
by individuals with the same family name or with the same
telephone number.
This purpose is typically used after a search by Address_Part1.
(Note: it is not practical to search by Person_Name because
ultimately only one word from the Person_Name needs to
match, and a one-word search will not perform well in most
situations).
Emphasis is placed on the last name, the major word of the
Person_Name field, so this is one of the few cases where word
order is important in the way the records are presented for
matching.
However, a reasonable score will be generated provided that a
match occurs between the major word in one name and any
other word in the other name.
This purpose uses the following fields:
• Address_Part1 (Required)
• Person_Name (Required)
• Telephone_Number (Required) (Score will be based on best of
Person_Name and Telephone_Number)
• Address_Part2
• Postal_Area
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
To achieve a “best of” score between Address_Part2 and Postal_
Area, pass Postal_Area as a repeat value in the Address_Part2
field.
Address Designed to identify an address match. The address might be
postal, residential, delivery, descriptive, formal, or informal.
The only required field is Address_Part1. The fields Address_
Part2, Postal_Area, Telephone_Number, ID, Date, Attribute1 and
Attribute2 are available as optional input fields to further
differentiate an address. For example if the name of a City
and/or State is provided as Address_Part2, it will help
differentiate between a common street address [100 Main
Street] in different locations.
This purpose uses the following fields:
• Address_Part1 (Required)
• Address_Part2
• Postal_Area
• Telephone_Number
• ID
• Date
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
To achieve a “best of” score between Address_Part2 and Postal_
Area, pass Postal_Area as a repeat value in the Address_Part2 field. In that
case, the Address_Part2 score used will be the higher of
the two scored fields.
Organization Designed to match organizations primarily by name. It is
targeted at online searches when a name-only lookup is required
and a human is available to make the choice. Matching in batch
typically requires other attributes in addition to name to make
match decisions. Use this purpose only when the rule does not
contain address fields. This purpose will allow matches between
organizations with an address and those without an address. If
the rules contain address fields, use the Division purpose.
This purpose uses the following fields:
• Organization_Name (Required)
• Address_Part1
• Address_Part2
• Postal_Area
• Telephone_Number
• ID
• Date
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required. Any optional
input fields you provide refine the ranking of matches.
To achieve a “best of” score between Address_Part2 and Postal_
Area, pass Postal_Area as a repeat value in the Address_Part2
field.

Division Designed to identify an Organization at an Address. It is typically
used after a search by Organization_Name or by Address_Part1,
or both.
It is in essence the same purpose as Organization, except that
Address_Part1 is a required field. Thus, this Purpose is designed
to match company X at an address of Y (or Z, etc., if multiple
addresses are supplied).
This purpose uses the following fields:
• Organization_Name (Required)
• Address_Part1 (Required)
• Address_Part2
• Postal_Area
• Telephone_Number
• ID
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
To achieve a “best of” score between Address_Part2 and Postal_
Area, pass Postal_Area as a repeat value in the Address_Part2
field.
Contact Designed to identify a contact within an organization at a specific
location.
This Match purpose is typically used after a search by Person_
Name. However, either Organization_Name or Address_Part1
may be used as the search criteria.
This purpose uses the following fields:
• Person_Name (Required)
• Organization_Name (Required)
• Address_Part1 (Required)
• Address_Part2
• Postal_Area
• Telephone_Number
• ID
• Date
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
To achieve a “best of” score between Address_Part2 and Postal_
Area, pass Postal_Area as a repeat value in the Address_Part2
field.

Corporate_Entity Designed to identify an Organization by its legal corporate
name, including the legal endings such as INC, LTD, etc. It is
designed for applications that need to honor the differences
between such names as ABC TRADING INC and ABC TRADING
LTD.
This purpose is typically used after a search by Organization_
Name. It is in essence the same purpose as Organization, except
that tighter matching is performed and legal endings are not
treated as noise.
This purpose uses the following fields:
• Organization_Name (Required)
• Address_Part1
• Address_Part2
• Postal_Area
• Telephone_Number
• ID
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
To achieve a “best of” score between Address_Part2 and Postal_
Area, pass Postal_Area as a repeat value in the Address_Part2
field.
Wide_Contact Designed to loosely identify a contact within an organization—
that is, without regard to actual location.
It is typically used after a search by Person_Name.
In addition to the required fields, ID, Attribute1 and Attribute2
may be optionally provided for matching to further qualify a
contact.
This purpose uses the following fields:
• Person_Name (Required)
• Organization_Name (Required)
• ID
• Attribute1
• Attribute2
Unless otherwise indicated, fields are not required.
Fields Provided for general, non-specific use. It is designed in such a
way that there are no required fields. All field types are
available as optional input fields.

Match Levels

For fuzzy-match base objects, the match level determines how precise the
match is. You can specify one of the following match levels for a fuzzy-match
base object:

• Typical: Appropriate for most matches.
• Conservative: Produces fewer matches than the Typical level. Some data
that actually matches may pass through the match process without being
flagged as a match. This situation is called undermatching.
• Loose: Produces more matches than the Typical level. Loose matching may
produce a significant number of match candidates that are not really
matches. This situation is called overmatching. You might choose to use
this in a match rule for manual merges, to make sure that other, tighter
match rules have not missed any potential matches.

Select the level based on your knowledge of the data to be matched: Typical,
Conservative (fewer matches), or Loose (more matches). When in doubt, use
Typical.

Accept Limit Adjustment

For fuzzy-match base objects, the accept limit is a number that determines
the acceptability of a match. This setting does the exact same thing as the
match level (see "Match Levels" on page 418), but to a more granular degree.
The accept limit is defined by Informatica within a population in accordance
with its match purpose. The Accept Limit Adjustment allows a coarse
adjustment to what is considered to be a match for this match rule.
• A positive adjustment results in more conservative matching.
• A negative adjustment results in looser matching.

For example, suppose that, for a given field and a given population, the accept
limit for a typical match level is 80, for a loose match level is 70, and for a
conservative match level is 90. If you specify a positive number (such as 3)
for the adjustment, then the accept level becomes slightly more conservative.
If you specify a negative number (such as -2), then the accept level becomes
looser.
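
For illustration, and assuming the adjustment is simply added to the
population-defined accept limit (the numbers here are hypothetical):

80 + 3 = 83 (effective accept limit; slightly more conservative)
80 - 2 = 78 (effective accept limit; slightly looser)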

Configuring this setting provides an optional refinement to your match
settings that might be helpful in certain circumstances. Adjusting the accept
limit even
a few points can have a dramatic effect on your matches, resulting in
overmatching or undermatching. Therefore, it is recommended that you test
different settings iteratively, with small increments, to determine the best
setting for your data.

Match Column Properties for Match Rules
This section describes the match column properties that you can configure for
match rules.

Match Subtype

For base objects containing different types of data, the match subtype option
allows you to apply match rules to specific types of data within the same base
object. You have the option to enable or disable match subtyping for exact-
match columns that have parent/child path components. Match subtype is
available only for:
• exact-match column types that are based on a non-root Path Component,
and
• match rules that have a fuzzy match / search strategy

To use match subtyping, for each match rule, specify one or more exact-
match column(s) that will serve as the “subtyping” column(s). The subtype
indicator can be set for any of the exact-match columns regardless of
whether they are used for segment match or not. During the match process,
evaluation of the subtype column precedes evaluation of the other match
columns. Use match subtyping judiciously, because it can have a performance
impact on the match process.

Match Subtype behaves just like a standard parent/child matching scenario
with the additional requirement that the match column marked as Match
Subtype must be the same across all records being matched. In the following
example, the Match Subtype column is Address Type and the match rule
consists of Address Line1, City, and State.
Parent ID Address Line 1 City State Address Type
3 123 Main NYC ON Billing
3 50 John St Toronto NY Shipping
5 123 Main Toronto BC Billing
5 20 Adelaide St Markham AB Shipping
5 50 John St Ottawa ON Billing
7 50 John St Barrie BC Billing
7 20 Adelaide St Toronto NB Shipping
7 90 Yonge St Toronto ON Billing

Without Match Subtype, Parent ID 3 would match with 5 and 7. With Match
Subtype, however, Parent ID 3 will not match with either 5 or 7, because the
matching rows are distributed between different Address Types. Parent IDs 5
and 7 will match with each other, however, because the matching rows all
fall within the 'Billing' Address Type.

Non-Equal Matching

Note: Non-Equal Matching and Segment Matching are mutually exclusive. If
one is selected, then the other cannot be selected.

Use non-equal matching in match rules to prevent equal values in a column
from matching each other. Non-equal matching applies only to exact-match
columns.

NULL Matching

Note: Null Matching and Segment Matching are mutually exclusive. If one is
selected, then the other cannot be selected.

Use NULL matching to specify how the match process should behave when null
values match other null values. NULL matching applies only to exact-match
columns.

By default, null matching is disabled, meaning that Informatica MDM Hub
treats nulls as unequal values when it searches for matches (a null value will
not match with anything). To enable null matching, you must explicitly select
a null matching option for the match columns.

A match column containing a NULL value is identified as matching based on
the following settings:
• Disabled: Regardless of the other value, nothing will match (nulls are
unequal values). This is the default setting. A NULL is seen as a
placeholder for an unknown value.
• NULL Matches NULL: If both values are NULL, then it is considered a
match.
• NULL Matches Non-NULL: If one value is NULL and the other value is not
NULL, or if cell values are identical between records, then it is considered
a match.

Once null matching is configured, Build Match Groups will allow only a single
“Null to non NULL” match into any group, thereby reducing the possibility of
unwanted transitive matching. For more information, see "Build Match Groups
and Transitive Matches" on page 249.

Note: Null matching is exclusive of exact matching. For example, if you
enable NULL Matches Non-NULL, the match rule returns only those matches in
which one of the cell values is NULL. It will not also return exact matches in
which both cell values are equal. Therefore, if you need both behaviors, you
must create two exact match rules—one with NULL matching enabled, and
the other with NULL matching disabled.
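
For illustration only, these settings correspond roughly to the following
WHERE-clause-style predicates for two cell values a and b (keeping in mind,
per the note above, that NULL Matches Non-NULL does not also return exact
matches):

Disabled:               a = b   -- a NULL on either side never matches
NULL Matches NULL:      a = b OR (a IS NULL AND b IS NULL)
NULL Matches Non-NULL:  (a IS NULL AND b IS NOT NULL) OR (a IS NOT NULL AND b IS NULL)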

Segment Matching

Note: Segment Matching and Non-Equal Matching are mutually exclusive, as
are Segment Matching and NULL Matching. If one is selected, then the other
cannot be selected.

For exact-match columns only, you can use segment matching to limit match
rules to specific subsets of data. For example, you could define different
match rules for customers in different countries by using segment matching to
limit certain rules to specific country codes. Segment matching applies to both
exact-match and fuzzy-match base objects. For more information, see
"Configuring Segment Matching for a Column" on page 433.

If the Segment Matching check box is checked (selected), you can configure
two other options: Segment Matches All Data and Segment Match Values.

Segment Matches All Data

When unchecked (the default), Informatica MDM Hub will only match records
within the set of values defined in Segment Match Values. For example,
suppose a base object contained Leads, Partners, Customers, and Suppliers. If
Segment Match Values contained the values Leads and Partners, and Segment
Matches All Data were unchecked, then Informatica MDM Hub would only
match within records that contain Leads or Partners. All Customers and
Suppliers records will be ignored.

With Segment Matches All Data checked (selected), then Leads and Partners
would match with Customers and Suppliers, but Customers and Suppliers
would not match with each other.
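
In pair terms, if S is the set of segment match values (here, Leads and
Partners), a candidate pair of records a and b is considered roughly as
follows (illustrative pseudo-predicates; the org_type column is assumed):

Segment Matches All Data unchecked:  a.org_type IN ('Lead','Partner') AND b.org_type IN ('Lead','Partner')
Segment Matches All Data checked:    a.org_type IN ('Lead','Partner') OR b.org_type IN ('Lead','Partner')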

Segment Match Values

For segment matching, specifies the list of segment values to use for segment
matching. You must specify one or more values (for a match column) that
defines the segment matching. For example, for a given match rule, suppose
you wanted to define segment matching by Gender. If you specified a segment
match value of M (for male), then, for that match rule, Informatica MDM Hub
searches for matches (based on the other match columns) only on male
records—and can only match to other male records, unless you also enabled
Segment Matches All Data.

Note: Segment match values are case-sensitive. On both fuzzy and exact
base objects, the values that you set are treated as case-sensitive when the
Match batch job executes.

Concatenation of Values in Multiple Columns

For exact matches with segment matching enabled on concatenated columns,
a space character must be added to each piece of data present in the
concatenated fields.

Note: Concatenating columns is not recommended for exact match columns.

Requirements for Exact-match Columns in Match Column Rules

Exact-match columns are subject to the following rules:


• The names of exact-match columns cannot be longer than 26 characters.
• Exact-match columns must be of type VARCHAR or CHAR.
• Match columns can be used to match on any text column or combination of
text columns from a base object.
• If you want to use numerics or dates, you must convert them to VARCHAR
using cleanse functions before they are loaded into your base object (see
the example after this list). For more information, see "Using Cleanse
Functions" on page 314.
• Match columns can also be used to match on a column from a child base
object, which in turn can be based on any text column or combination of
text columns in the child base object. Matching on the match columns of a
child base object is called intertable matching.
• When using intertable match and creating match rules for the child table
(via a foreign key), you must include the foreign key from the parent table
in each match rule on the child. If you do not, when the child is merged,
the parent records would lose the child records that had previously
belonged to them.

For more information, see "Match Columns Depend on the Search Strategy" on
page 388.
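
For example, a date column can be rendered as text in a canonical format
before load. The Oracle-style expression below merely illustrates the kind of
conversion a cleanse function performs; the column name and format are
assumptions:

TO_CHAR(birth_date, 'YYYY-MM-DD')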

Command Buttons for Configuring Column Match Rules

In the Match Rule Sets tab, if you select a match rule set in the list, the
Schema Manager displays command buttons for the following actions:
• Adds a match rule. For more information, see "Adding Match Column
Rules" on page 424.
• Edits properties for the selected match rule. For more information, see
"Editing Match Column Rules" on page 428.
• Deletes the selected match rule. For more information, see "Deleting
Match Column Rules" on page 429.
• Moves the selected match rule up in the sequence. For more information,
see "Changing the Execution Sequence of Match Column Rules" on page
430.
• Moves the selected match rule down in the sequence. For more
information, see "Changing the Execution Sequence of Match Column
Rules" on page 430.
• Changes a manual consolidation rule to an automatic consolidation rule.
Select a manual consolidation record and then click the button. For more
information, see "Specifying Consolidation Options for Match Column
Rules" on page 431.
• Changes an automatic consolidation rule to a manual consolidation rule.
Select an automatic consolidation record and then click the button. For
more information, see "Specifying Consolidation Options for Match Column
Rules" on page 431.

Important: If you change your match rules after matching, you are prompted
to reset your matches. When you reset your matches, it deletes everything in
the match table and, in records where the consolidation indicator is 2, resets
the consolidation indicator to 4. For more information, see "About the
Consolidate Process" on page 255 and "Reset Match Table Jobs" on page 555.

Adding Match Column Rules


To add a new match rule using match columns:

1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more information, see
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Match Rule Sets tab. For more information, see "Navigating to
the Match Rule Set Tab" on page 404.
4. Select a match rule set in the list.
The Schema Manager displays the properties for the selected match rule
set.

5. In the Match Rules section of the screen, click the plus button.
The Schema Manager displays the Edit Match Rule dialog. This dialog
differs slightly between exact-match and fuzzy-match base objects.
Exact-match Base Objects

Fuzzy-match Base Objects

6. For fuzzy-match base objects, configure the match rule properties at the
top of the dialog box. For more information, see "Match Rule Properties for
Fuzzy-match Base Objects Only" on page 409.
7. Configure the match column(s) for this match rule.
Only columns you have previously defined as match columns are shown.
• For exact-match base objects or match rules with an exact match /
search strategy, only exact column types are available.
• For fuzzy-match base objects, you can choose fuzzy or exact column
types.

For more information, see "Match Columns Depend on the Search
Strategy" on page 388.

a. Click the Edit button next to the Match Columns list.


The Schema Manager displays the Add/Remove Match Columns dialog.

b. Check (select) the check box next to any column that you want to
include.
c. Uncheck (clear) the check box next to any column that you want to
omit.
d. Click OK.
The Schema Manager displays the selected columns in the Match Columns
list.

8. Configure the match properties for each match column in the Match
Columns list. For more information, see:
• "Match Column Properties for Match Rules" on page 420
• "Configuring the Match Weight of a Column" on page 432
• "Configuring Segment Matching for a Column" on page 433
• "NULL Matching" on page 421
• "Match Subtype" on page 420
9. Click OK.

10. If this is an exact match, specify the match properties for this match rule.
For more information, see "Requirements for Exact-match Columns in
Match Column Rules" on page 423. Click OK.

11. Click the Save button to save your changes.


Before saving changes, the Schema Manager analyzes the match rule set
and prompts you with a message if the match rule set contains certain
incongruences. For more information, see "Rule Set Evaluation" on page
400.
12. If you are prompted to confirm saving changes, click OK to save your
changes.

Editing Match Column Rules


To edit the properties for an existing match rule:
1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more
information, see "Navigating to the Match/Merge Setup Details Dialog" on
page 365.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Match Rule Sets tab. For more information, see "Navigating to
the Match Rule Set Tab" on page 404.
4. Select a match rule set in the list.
The Schema Manager displays the properties for the selected match rule
set.

5. In the Match Rules section of the screen, click the Edit button.
The Schema Manager displays the Edit Match Rule dialog. This dialog
differs slightly between exact-match and fuzzy-match base objects. For
more information, see "Adding Match Column Rules" on page 424.
6. For fuzzy-match base objects, change the match rule properties at the top
of the dialog box, if you want. For more information, see "Match Rule
Properties for Fuzzy-match Base Objects Only" on page 409.
7. Configure the match column(s) for this match rule, if you want.
Only columns you have previously defined as match columns are shown.
• For exact-match base objects or match rules with an exact match /
search strategy, only exact column types are available.
• For fuzzy-match base objects, you can choose fuzzy or exact column
types.
For more information, see "Match Columns Depend on the Search
Strategy" on page 388.

a. Click the Edit button next to the Match Columns list.
The Schema Manager displays the Add/Remove Match Columns dialog.

b. Check (select) the check box next to any column that you want to
include.
c. Uncheck (clear) the check box next to any column that you want to
omit.
d. Click OK.
The Schema Manager displays the selected columns in the Match Columns
list.
8. Change the match properties for any match column that you want to edit.
For more information, see:
• "Match Column Properties for Match Rules" on page 420
• "Configuring the Match Weight of a Column" on page 432
• "Configuring Segment Matching for a Column" on page 433
• "NULL Matching" on page 421
• "Match Subtype" on page 420
9. Click OK.
10. If this is an exact match, specify the match properties for this match rule.
For more information, see "Requirements for Exact-match Columns in
Match Column Rules" on page 423. Click OK.

11. Click the Save button to save your changes.


Before saving changes, the Schema Manager analyzes the match rule set
and prompts you with a message if the match rule set contains certain
incongruences. For more information, see "Rule Set Evaluation" on page
400.
12. If you are prompted to confirm saving changes, click OK to save your
changes.

Deleting Match Column Rules


To delete a match column rule:

1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more
information, see "Navigating to the Match/Merge Setup Details Dialog" on
page 365.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Match Rule Sets tab. For more information, see "Navigating to
the Match Rule Set Tab" on page 404.
4. Select a match rule set in the list.
5. In the Match Rules section, select the match rule that you want to delete.

6. Click the Remove button.


The Schema Manager prompts you to confirm deletion.
7. Click Yes.

Changing the Execution Sequence of Match Column Rules
To change the execution sequence of match column rules:
1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more
information, see "Navigating to the Match/Merge Setup Details Dialog" on
page 365.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Match Rule Sets tab. For more information, see "Navigating to
the Match Rule Set Tab" on page 404.
4. Select a match rule set in the list.
5. In the Match Rules section, select the match rule that you want to move up
or down.
6. Do one of the following:

• Click the button to move the selected match rule up in the execution
sequence.
• Click the button to move the selected match rule down in the execution
sequence.

7. Click the Save button to save your changes.


Before saving changes, the Schema Manager analyzes the match rule set
and prompts you with a message if the match rule set contains certain
incongruences. For more information, see "Rule Set Evaluation" on page
400.
8. If you are prompted to confirm saving changes, click OK to save your
changes.

Specifying Consolidation Options for Match Column Rules
During the match process, a match column rule must determine whether
matched records should be queued for manual or automatic consolidation. For
more information, see "About the Consolidate Process" on page 255.

Note: A base object cannot have more than 200 user-defined columns if it will
have match rules that are configured for automatic consolidation.

To toggle between manual and automatic consolidation for a match rule:


1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more
information, see "Navigating to the Match/Merge Setup Details Dialog" on
page 365.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Match Rule Sets tab. For more information, see "Navigating to
the Match Rule Set Tab" on page 404.
4. Select a match rule set in the list.
5. In the Match Rules section, select the match rule that you want to
configure.
6. Do one of the following:

• Click the button to change a manual consolidation rule to an automatic
consolidation rule.

• Click the button to change an automatic consolidation rule to a manual
consolidation rule.

7. Click the Save button to save your changes.


Before saving changes, the Schema Manager analyzes the match rule set
and prompts you with a message if the match rule set contains certain
incongruences. For more information, see "Rule Set Evaluation" on page
400.
8. If you are prompted to confirm saving changes, click the OK button to save
your changes.

Configuring the Match Weight of a Column

For a fuzzy-match column, you can change its match weight in the Edit Match
Rule dialog box. For each column, Informatica MDM Hub assigns an internal
match weight, which is a number that indicates the importance of this column
(relative to other columns in the table) for matching. The match weight varies
according to the selected match purpose and population. For example, if the
match purpose is Person_Name, then Informatica MDM Hub, when evaluating
matches, gives a data match in the name column greater importance
than a data match in a different column (such as the address).

By adjusting the match weight of a column, you give added weight to, and
elevate the significance of, that column (relative to other columns) when
Informatica MDM Hub analyzes values for matches.

To configure the match weight of a column:


1. In the Edit Match Rule dialog box, select a column in the list.

2. Click the Match Weight Adjustment button.


If adjusted, the name of the selected column shows in a bold font.

3. Click the Save button to save your changes.


Before saving changes, the Schema Manager analyzes the match rule set
and prompts you with a message if the match rule set contains certain
incongruences. For more information, see "Rule Set Evaluation" on page
400.
4. If you are prompted to confirm saving changes, click the OK button to save
your changes.

Configuring Segment Matching for a Column

As described in "Segment Matching" on page 422, segment matching is used
with exact-match columns to limit match rules to specific subsets of data.

To configure segment matching for an exact-match column:


1. In the Edit Match Rule dialog box, select an exact-match column in the
Match Columns list.
2. Check (select) the Segment Matching check box to enable this feature.
3. Check (select) the Segment Matches All Data check box, if you want. For
more information, see "Segment Matches All Data" on page 422.
4. Specify the segment match values for segment matching. For more
information, see "Segment Match Values" on page 423.

a. Click the Edit button.


The Schema Manager displays the Edit Values dialog.

b. Do one of the following:

• To add a value, click the Plus button, type the value you want to add,
and click OK.

• To delete a value, select it in the list, click the Delete button, and
choose Yes when prompted to confirm deletion.
5. Click OK.

6. Click the Save button to save your changes.

Before saving changes, the Schema Manager analyzes the match rule set
and prompts you with a message if the match rule set contains certain
incongruences. For more information, see "Rule Set Evaluation" on page
400.
7. If you are prompted to confirm saving changes, click the OK button to save
your changes.

Configuring Primary Key Match Rules


This section describes how to configure primary key match rules for your
Informatica MDM Hub implementation. If you want to configure match column
match rules instead, see the instructions in "Configuring Match Columns" on
page 387.

About Primary Key Match Rules


Matching on primary keys can be used when two or more different source
systems for a base object have identical primary key values. This situation
occurs infrequently in source systems, but when it does occur, you can make
use of the primary key matching option in Informatica MDM Hub to rapidly
match and automatically consolidate records from the source systems that
have the matching primary keys.

For example, two systems might use the same set of customer IDs. If both
systems provide information about customer XYZ123 using identical primary
key values, the two systems are certainly referring to the same customer and
the records should be automatically consolidated.

When you specify a primary key match, you simply specify which source
systems have the same primary key values. You also check the Auto-
merge matching records check box to have Informatica MDM Hub
automatically consolidate matching records when a Merge or Link batch job is
run. For more information, see "Automerge Jobs" on page 534 and "Autolink
Jobs" on page 532.

Adding Primary Key Match Rules


To add a new primary key match rule:
1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more information, see
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Primary Key Match Rules tab.
The Schema Manager displays the Primary Key Match Rules tab.

The Primary Key Match Rules tab has the following columns.

Column Description
Key Combination Two source systems for which this primary key match rule
will be used for matching. These source systems must already be defined in
Informatica MDM Hub (see "Configuring Source Systems" on page 264), and
staging tables for this base object must be associated with these source
systems (see "Configuring Staging Tables" on page 275).
Auto-Merge Specifies whether this primary key match rule results in
automatic or manual consolidation. For more information, see "About the
Consolidate Process" on page 255.

4. Click the Plus button to add a primary key match rule.


The Add Primary Key Match Rule dialog is displayed.

5. Check (select) the check box next to two source systems for which you
want to match records based on the primary key.
6. Check (select) the Auto-merge matching records check box if you are
certain that records with identical primary keys are matches.
You can change your choice for Auto-merge matching records later, if
you want.
7. Click OK.
The Schema Manager displays the new rule in the Primary Key Match Rules tab.

8. Click the Save button to save your changes.
The Schema Manager asks you whether you want to reset existing
matches.

9. Choose Yes to delete all matches currently stored in the match table, if
you want.

Editing Primary Key Match Rules


Once you have defined a primary key match rule, you can change the value of
the Auto-merge matching records check box.

To edit an existing primary key match rule:


1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more information, see
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Primary Key Match Rules tab.
The Schema Manager displays the Primary Key Match Rules tab.

4. Scroll to the primary key match rule that you want to edit.

5. Check or uncheck the Auto-merge matching records check box to
enable or disable auto-merging, respectively.

6. Click the Save button to save your changes.


The Schema Manager asks you whether you want to reset existing
matches.

7. Choose Yes to delete all matches currently stored in the match table, if
you want.

Deleting Primary Key Match Rules


To delete an existing primary key match rule:
1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more information, see
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Primary Key Match Rules tab.
The Schema Manager displays the Primary Key Match Rules tab.

4. Select the primary key match rule that you want to delete.

5. Click the Delete button.


The Schema Manager prompts you to confirm deletion.
6. Choose Yes.
The Schema Manager removes the deleted rule from the Primary Key
Match Rules tab.

7. Click the Save button to save your changes.

The Schema Manager asks you whether you want to reset existing
matches.

8. Choose Yes to delete all matches currently stored in the match table, if
you want.

Investigating the Distribution of Match Keys


This section describes how to investigate the distribution of match keys in the
match key table.

About Match Keys Distribution


As described in "Match Tokens and Match Keys" on page 240, match keys are
strings that encode data in the fuzzy match key column used to identify
candidates for matching. The tokenize process generates match keys for all
the records in a base object and stores them in its match key table. Depending
on the nature of the data in the base object record, the tokenize process
generates at least one match key—and possibly multiple match keys—for each
base object record. Match keys are used subsequently in the match process to
help determine possible matches between base object records.

In the Match/Merge Setup Details pane of the Schema Manager, the Match
Keys Distribution tab allows you to investigate the distribution of match keys
in the match key table. This tool can assist you with identifying potential hot
spots in your data—high concentrations of match keys that could result in
overmatching—where the match process generates too many matches,
including matches that are not relevant. By knowing where hot spots occur in
your data, you can refine data cleansing and match rules to reduce hot spots
and generate an optimal distribution of match keys for use in the match
process. Ideally, you want to have a relatively even distribution across all
keys.

Navigating to the Match Keys Distribution Tab


To navigate to the Match Keys Distribution tab:
1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more information, see
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
2. Click the Match Keys Distribution tab.
The Schema Manager displays the Match Keys Distribution tab.

Components of the Match Keys Distribution Tab
The Match Keys Distribution tab displays a histogram, match keys, and match
columns.

Histogram

The histogram displays the statistical distribution of match keys in the match
key table.
Axis Description
Key (X-axis) Starting character(s) of the match key. If no filter is applied
(the default), this is the starting character of the match key. If a filter is
applied, this is the starting sequence of characters in the match key,
beginning with the left-most character. For more information, see
"Filtering Match Keys" on page 440.
Count (Y-axis) Number of match keys in the match key table that begin with
the starting character(s). Hotspots in the match key table show up as
disproportionately tall spikes (high number of match keys), relative to
other characters in the histogram.

Match Keys List


The Match Keys List on the Match Keys Distribution tab displays records in the
match key table. For each record, it displays cell data for the following
columns:
Column Name Description
ROWID ROWID_OBJECT that uniquely identifies the record in the base
object that is associated with this match key.
KEY Generated match key. SSA_KEY column in the match key table.

Depending on the configured match rules and the nature of the data in a
record, a single record in the base object table can have multiple generated
match keys.

Paging Through Records in the Match Key Table

Use the following command buttons to navigate the records in the match key
table.
Button Description
Displays the first page of records in the match key table.

Displays the previous page of records in the match key table.

Displays the next page of records in the match key table.

Jumps to the page number you enter.

Match Columns

The Match Columns area on the Match Keys Distribution tab displays match
column data for the selected record in the match keys list. This is the
SSA_DATA column in the match key table. For each match column that is
configured for this base object (see "Configuring Match Columns" on page
387), it displays the column name and cell data.

Filtering Match Keys


You can use a match key filter to focus your investigation on hotspots or other
match key distribution patterns. A match key filter restricts the data in the
Histogram and the Match Keys List to the subset of match keys that meets the
filter condition. By default, no filter is defined—all records in the match key
table are displayed.

The filter condition specifies the beginning string sequence for qualified match
keys, evaluated from left to right. For example, to view only match keys
beginning with the letter M, you would select M for the filter. To further restrict
match keys and view data for only the match keys that start with the letters
MD, you would add the letter D to the filter. The longer the filter expression, the
more restrictive the display.

Setting a Filter

To set a filter:
• Click the vertical bar in the Histogram associated with the character you
want to add to the filter.

For example, suppose you started with the following default view in the
Histogram.

If you click the vertical bar above a character (such as the M character), the
Histogram refreshes and displays the distribution for all match keys beginning
with that character.

Note that the Match Keys List now displays only those match keys that meet
the filter condition.

Navigating Filters

Use the following command buttons to navigate filters.


Button Description
Clears the filter. Displays the default view (no filter).

Displays the previously selected filter (removes the right-most
character from the filter).

Excluding Records from the Match Process

Informatica MDM Hub provides a mechanism for selectively excluding records
from the match process. You might want to do this if, for example, your data
contains records that you want the match process to ignore.

To configure this feature, in the Schema Manager, you add a column named
EXCLUDE_FROM_MATCH to a base object. This column must be an integer type
with a default value of zero (0), as described in "Adding Columns" on page
108.

Once the table is populated and before running the Match job, to exclude a
record from matching, change its value in the EXCLUDE_FROM_MATCH column
to one (1) in the Data Manager. When the Match job runs, only those records
with an EXCLUDE_FROM_MATCH value of zero (0) will be tokenized and
processed—all other records will be ignored. When the cell value is changed,
the DIRTY_IND for this record is set to 1 so that match keys will be
regenerated when the tokenize process is executed, as described in "Match
Tokens and Match Keys" on page 240.
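
The following sketch is illustrative only and is not the documented procedure
(the procedure above uses the Data Manager): it shows the column semantics
through a direct JDBC update against a hypothetical base object table named
C_CUSTOMER. The connection URL, credentials, table name, and record key are
all placeholder assumptions; adjust them for your environment.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ExcludeFromMatchExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; the Oracle JDBC driver is assumed
        // to be on the classpath.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/ORS", "ors_user", "password")) {
            // Setting EXCLUDE_FROM_MATCH to 1 excludes the record from the
            // next Match job; records with a value of 0 are tokenized and
            // processed as usual.
            String sql = "UPDATE C_CUSTOMER SET EXCLUDE_FROM_MATCH = 1 "
                       + "WHERE ROWID_OBJECT = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, "42"); // hypothetical ROWID_OBJECT value
                ps.executeUpdate();
            }
        }
    }
}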

Excluding records from the match process is available for:


• fuzzy-match base objects only (see "Exact-match and Fuzzy-match Base
Objects" on page 247)
• match column rules only (not primary key match rules) that do not match
for duplicates (see "Match for Duplicate Data Jobs" on page 552)

Chapter 15: Configuring the Consolidate Process

This chapter describes how to configure the consolidate process for your
Informatica MDM Hub implementation.

Chapter Contents
• "Before You Begin" on page 443
• "About Consolidation Settings" on page 443
• "Changing Consolidation Settings" on page 447

Before You Begin


Before you begin, you must have installed Informatica MDM Hub, created the
Hub Store according to the instructions in Informatica MDM Hub Installation
Guide, and built the schema according to the instructions in "Building the
Schema" on page 73. To learn about the consolidate process, see "Consolidate
Process" on page 255.

About Consolidation Settings


Consolidation settings affect the behavior of the consolidate process in
Informatica MDM Hub. This section describes the settings that you can
configure on the Merge Settings tab in the Match/Merge Setup Details dialog.
For more information, see "About the Consolidate Process" on page 255.

Immutable Rowid Object


For a given base object, you can designate a source system as an immutable
source, which means that records from that source system will be accepted as
unique (CONSOLIDATION_IND = 1)—even in the event of a merge. Once a
record from that source has been fully consolidated, it will not be changed
subsequently, nor will it be matched to any other record (although other
records can be matched to it). Only one source system can be configured as
an immutable source.

Note: If the Requeue on Parent Merge setting for a child base object is set to
2, in the event of a merging parent, the consolidation indicator will be set to 4
for the child record. For more information, see "Requeue On Parent Merge" on
page 91.

Immutable sources are also distinct systems, as described in "Distinct Source
Systems" on page 445. All records are stored in the Informatica MDM Hub as
master records. For all source records from an immutable source system, the
consolidation indicator for Load and PUT is always 1 (consolidated record). If
the Requeue on Parent Merge setting for a child base object is set to 2, then in
the event of a merging parent, the consolidation indicator will be set to 4 for
the child record. For more information, see "Consolidation Status for Base
Object Records" on page 219.

To specify an immutable source for a base object, click the drop-down list
next to Immutable Rowid Object and select a source system.

This list displays the source system(s) associated with this base object. Only
one source system can be designated an immutable source system. For more
information, see "Configuring Source Systems" on page 264.

Immutable source systems are applicable when, for example, Informatica
MDM Hub is the only persistent store for the source data. Designating an
immutable source system streamlines the load, match, and merge processes
by preventing intra-source matches and automatically accepting records from
immutable sources as unique. If two immutable records must be merged, then
a data steward needs to perform a manual verification in order to allow that
change. At that point, Informatica MDM Hub allows the data steward to choose
the key that remains.

Distinct Systems
A distinct system provides data that gets inserted into the base object without
being consolidated. Records from a distinct system will never match with
other records from the same system, but they can be matched to and from
other records in other systems (their CONSOLIDATION_IND is set to 4 on
load). You can specify distinct source systems and configure whether, for each
source system, records are consolidated automatically or manually.

Distinct Source Systems

You can designate a source system as a distinct source (also known as a
golden source), which means that records from that source will not be
merged. For example, if the ABC source has been designated as a distinct
source, then the match rules will never match (or merge) two records that
come from the same source. Records from a distinct source will not match
through a transient match in an Auto Match and Merge process (see "Auto
Match and Merge Jobs" on page 532). Such records can be merged only
manually by flagging them as matches.

To designate a distinct source system:


1. From the list of source systems on the Merge Settings tab, select (check)
any source system that should not allow intra-system merges; this prevents
records from that system from merging with each other.
2. For each distinct source system, designate whether you want it to use Auto
Rules only (see "Auto Rules Only" on page 446).

The following example shows both options selected for the Billing system.

Auto Rules Only

For distinct systems only, this option lets you configure which types of
consolidation rules are executed for the associated distinct source system.
Check (select) this check box if you want Informatica MDM Hub to apply only
the automatic consolidation rules (not the manual consolidation rules) for this
distinct system. By default, this option is disabled (unchecked).

Unmerge Child When Parent Unmerges (Cascade Unmerge)
Important: This feature applies only to child base objects with configured
match rules and foreign keys.

For child base objects, Informatica MDM Hub provides a cascade unmerge
feature that allows you to specify what happens if records in the parent base
object are unmerged. By default, this feature is disabled, so that unmerging
parent records does not unmerge associated child records. In the Unmerge
Child When Parent Unmerges portion near the bottom of the Merge Settings
tab, if you check (select) the Cascade Unmerge check box for a child base
object, when records in the parent object are unmerged, Informatica MDM
Hub also unmerges affected records in the child base object.

Prerequisites for Cascade Unmerge

To enable cascade unmerge:


• the parent-child relationship must already be configured in the child base
object
• the foreign key column in the child base object must be a match-enabled
column

In the Unmerge Child When Parent Unmerges portion near the bottom of the
Merge Settings tab, the Schema Manager displays only those match-enabled
columns in the child base object that are configured with a foreign key. For
more information, see "Configuring Foreign-Key Relationships Between Base
Objects" on page 113.

Parents with Multiple Children

In situations where a parent base object has multiple child base objects, you
can explicitly enable cascade unmerge for each child base object. Once
configured, when the parent base object is unmerged, then all affected
records in all associated child base objects are unmerged as well.

Considerations for Using Cascade Unmerge

A full unmerge of affected records is not required in all implementations, and
it can have a performance overhead on the unmerge because many child
records can be affected. In addition, it does not always make sense to enable
this property. One example is when Customer is a child of Customer Type. In
this situation, you might not want to unmerge Customers if Customer Type is
unmerged. However, in most cases, it is a good idea to unmerge addresses
linked to customers if Customer unmerges.

Note: When cascade unmerge is enabled, the child record may not be
unmerged if a previous manual unmerge was done on the child base object.

When you enable the unmerge feature, it applies to the child table and the
child cross-reference table. Once enabled, if you then unmerge the parent
cross-reference, the original child cross-reference should be unmerged as
well. This feature has no impact on the parent—the feature operates on the
child tables to provide additional flexibility.

Changing Consolidation Settings


To change consolidation settings on the Merge Settings tab:
1. In the Schema Manager, display the Match/Merge Setup Details dialog for
the base object that you want to configure. For more information, see
"Navigating to the Match/Merge Setup Details Dialog" on page 365.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Click the Merge Settings tab.
The Schema Manager displays the Merge Settings tab for the selected base
object.

4. Change any of the following settings:
• "Immutable Rowid Object" on page 443
• "Distinct Systems" on page 445
• "Unmerge Child When Parent Unmerges (Cascade Unmerge)" on page
446

5. Click the Save button to save your changes.

Chapter 16: Configuring the Publish Process

This chapter describes how to configure the publish process for Informatica
MDM Hub data using message triggers and embedded message queues. For an
introduction, see "Publish Process" on page 260.

Chapter Contents
• "Before You Begin" on page 449
• "Configuration Steps for the Publish Process" on page 450
• "Starting the Message Queues Tool" on page 450
• "Configuring Global Message Queue Settings" on page 451
• "Configuring Message Queue Servers" on page 452
• "Configuring Outbound Message Queues" on page 454
• "Configuring Message Triggers" on page 456
• "JMS Message XML Reference" on page 464

Before You Begin


Before you begin, you must have completed the following tasks:
• Installed Informatica MDM Hub, created the Hub Store, and successfully
set up message queues according to the instructions in the Informatica
MDM Hub Installation Guide
• Completed the tasks in the Informatica MDM Hub Installation Guide to
configure Informatica MDM Hub to handle asynchronous Services
Integration Framework (SIF) requests, if applicable
Note: SIF uses a message-driven bean (MDB) on the JMS message queue
(named siperian.sif.jms.queue) to process incoming asynchronous SIF
requests. This required queue is set up during the installation process, as
described in the Informatica MDM Hub Installation Guide for your
platform. If your Informatica MDM Hub implementation does not require
any additional message queues, then you can skip this chapter.
• Built the schema according to the instructions in "Building the Schema" on
page 73

• Read the introduction to the publish process in "Publish Process" on page
260

Configuration Steps for the Publish Process


After installing Informatica MDM Hub, you use the Message Queues tool in the
Hub Console to configure message queues for your Informatica MDM Hub
implementation. The following tasks are mandatory if you want to publish
events in the outbound message queue:
1. Configure the message queues on your application server.
The Informatica MDM Hub installer automatically sets up message queues
and the connection factory configuration. For more information, see the
Informatica MDM Hub Installation Guide for your platform.
2. Configure global message queue settings. For more information, see
"Configuring Global Message Queue Settings" on page 451.
3. Add at least one message queue server. For more information, see
"Configuring Message Queue Servers" on page 452.
4. Add at least one message queue to the message queue server. For more
information, see "Configuring Outbound Message Queues" on page 454.
5. Generate the JMS event message schema for each ORS that has data that
you want to publish. For more information, see "Generating and Deploying
ORS-specific Schemas" on page 617.
6. Configure message triggers for your message queues. For more
information, see "Configuring Message Triggers" on page 456.

After you have configured message queues, you can review run-time activities
using the Audit Manager according to the instructions in "Auditing Message
Queues" on page 689.

Starting the Message Queues Tool


To start the Message Queues tool:
1. In the Hub Console, connect to the Master Database.
Message queues are defined in the Master Database.
2. In the Hub Console, expand the Configuration workbench, and then click
Message Queues.
The Hub Console displays the Message Queues tool.

The Message Queues tool is divided into two panes.


Pane Description
Navigation pane Shows (in a tree view) the message queues that are defined
for this Informatica MDM Hub implementation.
Properties pane Shows the properties for the selected message queue.

Configuring Global Message Queue Settings


To configure the global message queue settings for your Informatica MDM Hub
implementation:
1. In the Hub Console, start the Message Queues tool. For more information,
see "Starting the Message Queues Tool" on page 450.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Specify settings for Data Changes Monitoring, which monitors the queue
for outgoing messages.
To enable or disable Data Changes Monitoring, click the Toggle Data
Changes Monitoring Status button.
4. Specify the following monitoring settings:

Monitoring Setting Description
Receive Timeout (milliseconds) Default is 0. Amount of time allowed to
receive the messages from the queue.
Receive Batch Size Default is 100. Maximum number of events processed and
placed in the message queue in a single pass.
Message Check Interval (milliseconds) Default is 300000. Amount of time to
pause before polling for inbound messages or processing outbound messages.
The same value applies to both inbound and outbound message queues.
Out of sync check interval (milliseconds) If configured, periodically polls for
ORS metadata and regenerates the XML message schema if subsequent
changes have been made to design objects in the ORS. For more information,
see "Generating and Deploying ORS-specific Schemas" on page 617.
By default, this feature is disabled (set to zero) and is available only if:
• Data Changes Monitoring is enabled.
• The ORS-specific XML message schema has been generated using the JMS
Event Schema Manager.
Note: Make sure that this value is greater than or equal to the Message
Check Interval.

Click the Edit button next to any property that you want to change.

5. Click the Save button to save your changes.

Configuring Message Queue Servers
This section describes how to configure message queue servers for your
Informatica MDM Hub implementation.

About Message Queue Servers


Before you can define message queues in Informatica MDM Hub, you must
define the message queue server(s) that Informatica MDM Hub will use for
handling message queues. Before you can define a message queue server in
Informatica MDM Hub, it must already be defined on your application server
according to the documented instructions for your application server. You will
need the connection factory name.
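
As an illustration of why the connection factory name matters, the following
minimal sketch shows how an external JMS client might look up the connection
factory and a queue through JNDI. The JNDI names jms/ConnectionFactory and
jms/OutboundQueue are assumptions; use the names configured on your
application server.

import javax.jms.Queue;
import javax.jms.QueueConnectionFactory;
import javax.naming.InitialContext;

public class QueueLookupExample {
    public static void main(String[] args) throws Exception {
        // Assumes a jndi.properties file on the classpath that points at
        // the application server's naming service.
        InitialContext ctx = new InitialContext();
        QueueConnectionFactory factory =
                (QueueConnectionFactory) ctx.lookup("jms/ConnectionFactory"); // hypothetical name
        Queue queue = (Queue) ctx.lookup("jms/OutboundQueue"); // hypothetical JNDI queue name
        System.out.println("Found factory " + factory + " and queue " + queue);
    }
}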

Message Queue Server Properties


This section describes the settings that you can configure for message queue
servers.

WebLogic and JBoss Properties

You can configure the following message queue server properties.


Property Description
Connection Factory Name Name of the connection factory for this message
queue server.
Display Name Name of this message queue server as it will be displayed in
the Hub Console.
Description Descriptive information for this message queue server.

WebSphere Properties

IBM WebSphere implementations have the following properties.


Property Description
Server Name Name of the server where the message queue is defined.
Channel Channel of the server where the message queue is defined.
Port Port on the server where the message queue is defined.

Adding Message Queue Servers


To add a message queue server:
1. In the Hub Console, start the Message Queues tool. For more information,
see "Starting the Message Queues Tool" on page 450.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. Right-click anywhere in the Navigation pane and choose Add Message
Queue Server.
The Message Queues tool displays the Add Message Queue Server dialog.

4. Specify the properties for this message queue server. For more
information, see "Message Queue Server Properties" on page 452.

Editing Message Queue Server Properties


To edit the properties of an existing message queue server:
1. In the Hub Console, start the Message Queues tool. For more information,
see "Starting the Message Queues Tool" on page 450.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the navigation pane, select the name of the message queue server that
you want to configure.
4. Change the editable properties for this message queue server. For more
information, see "Message Queue Server Properties" on page 452.

Click the Edit button next to any property that you want to change.

5. Click the Save button to save your changes.

Deleting Message Queue Servers


To delete an existing message queue server:
1. In the Hub Console, start the Message Queues tool. For more information,
see "Starting the Message Queues Tool" on page 450.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the navigation pane, right-click the name of the message queue server
that you want to delete, and then choose Delete from the pop-up menu.
4. The Message Queues tool prompts you to confirm deletion.
5. Click Yes.

Configuring Outbound Message Queues
This section describes how to configure outbound JMS message queues for
your Informatica MDM Hub implementation.

About Message Queues


Before you can define outbound JMS message queues in Informatica MDM
Hub, you must define the message queue server(s) that will service the
message queue. For more information, see "Configuring Message Queue
Servers" on page 452. In JMS, a message queue is a staging area for XML
messages. Informatica MDM Hub publishes XML messages to the message
queue. External applications retrieve these published XML messages from the
message queue.

Message Queue Properties


You can configure the following message queue properties.
Property Description
Queue Name Name of this message queue. This must match the JNDI queue
name as configured on your application server.
Display Name Name of this message queue as it will be displayed in the Hub
Console.
Description Descriptive information for this message queue.

Adding Message Queues to a Message Queue Server


To add a message queue to a message queue server:
1. In the Hub Console, start the Message Queues tool. For more information,
see "Starting the Message Queues Tool" on page 450.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the navigation pane, right-click the name of the message queue server to
which you want to add a message queue, and choose Add Message
Queue.
The Message Queues tool displays the Add Message Queue dialog.

4. Specify the message queue properties. For more information, see
"Message Queue Properties" on page 454.
5. Click OK.
The Message Queues tool prompts you to choose the queue assignment.

6. Select one of the following options:

Assignment Description
Leave Unassigned Queue is currently unassigned and not in use. Select this
option to use this queue as the outbound queue for Informatica MDM Hub API
responses, or to indicate that the queue is currently unassigned and is not in
use.
Use with Message Queue Triggers Queue is currently assigned and is
available for use by message triggers that are defined in the Schema Manager
according to the instructions in "Configuring Message Triggers" on page 456.
Use Legacy XML Select (check) this option only if your Informatica MDM Hub
implementation requires that you use the legacy XML message format
(Informatica MDM Hub XU version) instead of the current version of the XML
message format. For more information, see "Legacy JMS Message XML
Reference" on page 479.

7. Click the Save button to save your changes.

Editing Message Queue Properties


To edit the properties of an existing message queue:
1. In the Hub Console, start the Message Queues tool. For more information,
see "Starting the Message Queues Tool" on page 450.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the navigation pane, select the name of the message queue that you
want to configure.

4. Change the editable properties for this message queue. For more
information, see "Message Queue Properties" on page 454.

Click the Edit button next to any property that you want to change.
5. Change the queue assignment, if you want.

6. Click the Save button to save your changes.

Deleting Message Queues


To delete an existing message queue:
1. In the Hub Console, start the Message Queues tool. For more information,
see "Starting the Message Queues Tool" on page 450.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. In the navigation pane, right-click the name of the message queue that
you want to delete, and then choose Delete from the pop-up menu.
4. The Message Queues tool prompts you to confirm deletion.
5. Click Yes.

Configuring Message Triggers


This section describes how to configure message triggers for your Informatica
MDM Hub implementation. You configure message triggers in the Schema
Manager tool.

About Message Triggers


Use message triggers to identify which actions within Informatica MDM Hub
are communicated to external applications, and where to publish XML
messages. When an action occurs for which a rule is defined, an XML message
is placed in a JMS message queue. A message trigger specifies the JMS
message queue in which messages are placed. For example:
1. A user inserts a record in a base object.
2. This insert action initiates a message trigger.
3. Informatica MDM Hub evaluates the message trigger and sends a message
to the appropriate message queue.
4. An outside application polls the message queue, picks up the message, and
processes it.

You can use the same message queue for all triggers, or you can use a
different message queue for each trigger. In order for an action to trigger a
message trigger, the message queues must be configured, and a message
trigger must be defined for that base object and action.
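
The following minimal sketch, using the standard javax.jms API, illustrates
step 4 of the example above: an outside application that polls the message
queue, picks up a published XML message, and processes it. The JNDI names
jms/ConnectionFactory and jms/OutboundQueue are assumptions and must
match your application server configuration.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class MessageTriggerConsumer {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory =
                (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/OutboundQueue");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(queue);
        // Block for up to five seconds waiting for the next published message.
        TextMessage message = (TextMessage) consumer.receive(5000);
        if (message != null) {
            System.out.println("Received XML message: " + message.getText());
        }
        connection.close();
    }
}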

Types of Events for Message Triggers

The following types of events can cause a message trigger to be fired and a
message placed in the queue.
Events for Which Message Queue Rules Can Be Defined

Event Description
Add new data
• Add the data through the load process
• Add the data through the Data Manager
• Add the data through the API verb using PUT or CLEANSE_PUT (either
through HTTP, SOAP, MQ, and so on)
Add new pending data
A new record with a PENDING state is created. Applies to state-enabled base
objects only.
Update existing data
• Update the data through the load process
• Update the data through the Data Manager
• Update the data through the API verb using PUT or CLEANSE_PUT (either
through HTTP, SOAP, MQ, and so on)
Note:
• If trust rules prevent the base object columns from being updated, no
message is generated.
• If one or more of the specified columns are updated, a single message is
generated. This single message includes data from all of the
cross-references in all output systems.
Update existing pending data
An existing record with a PENDING state is updated. Applies to state-enabled
base objects only. For more information, see "State Management" on page 159.
Update, only XREF changed
• Update data when only the XREF has changed through the load process
• Update data when only the XREF has changed through the API using PUT or
CLEANSE_PUT (either through HTTP, SOAP, MQ, and so on)
Pending update, only XREF changed
An XREF record with a PENDING state is updated. This includes promotion of
a record. Applies to state-enabled base objects only. For more information,
see "State Management" on page 159.
Merging data
• Manual Merge via the Merge Manager
• Merge via the API verb (either through HTTP, SOAP, MQ, and so on)
• Auto Match and Merge
Merging data, Base object updated
Merging data when the base object has been updated.
Unmerging data
• Unmerge the data through the Data Manager
• Unmerge the data through the API verb using UNMERGE (either through
HTTP, SOAP, EJB, and so on)
Accepting data as unique
• Accepting a single record as unique via the Merge Manager
• Accepting multiple records as unique via the Merge Manager
• Having Accept as Unique turned on in the base object's match rules (this
happens during the match/merge process)
Note: When a record is accepted as unique, either automatically through a
match rule or manually by a data steward, Informatica MDM Hub generates a
message with the record information, including the cross-reference
information for all output systems. This message is placed in the queue.
Delete BO data
A base object record is soft deleted (state changed to DELETED). Applies to
state-enabled base objects only. For more information, see "State
Management" on page 159.
Delete XREF data
An XREF record is soft deleted (state changed to DELETED). Applies to
state-enabled base objects only. For more information, see "State
Management" on page 159.
Delete pending BO data
A base object record with a PENDING state is hard deleted. Applies to
state-enabled base objects only. For more information, see "State
Management" on page 159.
Delete pending XREF data
An XREF record with a PENDING state is hard deleted. Applies to
state-enabled base objects only. For more information, see "State
Management" on page 159.
No action
Applies only to Activity Manager. Returned only by a cleanse_put operation
and only if delta detection is enabled. If delta detection is not enabled, then
an Update action type is returned.

Considerations for Message Triggers

Consider the following issues when setting up message triggers for your
implementation:
• If a message queue is used in any message trigger definition under a base
object in any Hub Store, the message queue displays the following
message: “The message queue is currently in use by message triggers.” In
this case, you cannot edit the properties of the message queue. Instead,
you must create another message queue to make the necessary changes.
• Message triggers apply to one base object only, and they fire only when a
specific action occurs directly on that base object. If you have two tables
that are in a parent-child relationship, then you need to explicitly define
message queues separately, for each table. Change detection is based on
specific changes to each base object (such as a load INSERT, load UPDATE,
MERGE, or PUT). Changes to a record of the parent table can fire a
message trigger for the parent record only. If changes in the parent record
affect one or more associated child records, then a message trigger for the
child table must be explicitly configured to fire when such an action occurs
in the child records.

Adding Message Triggers


To add a message trigger for a base object:
1. Configure the message queue to be usable with message triggers. For
more information, see "Editing Message Queue Properties" on page 455.

2. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
3. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
4. Expand the base object that will be monitored, and select the Message
Trigger Setup node.
If no message triggers have been set up, then the Schema Manager displays
an empty screen.

5. Do one of the following:

• If no message triggers have been defined, click Add Message
Trigger.
OR

• If message triggers have been defined, click the Plus button.


The Schema Manager displays the Add Message Trigger wizard.

6. Specify a name and description for the new message trigger.
7. Click Next.
The Add Message Trigger wizard prompts you to specify the messaging
package.

8. Select the package that will be used to build the message. For more
information, see "Configuring Packages" on page 151.
9. Click Next.
The Add Message Trigger wizard prompts you to specify the target
message queue.

10. Select the message queue to which the message will be written.
11. Click Next.
The Add Message Trigger wizard prompts you to specify the rules for this
message trigger.

12. Select the event type(s) for this message trigger.

For more information, see "Types of Events for Message Triggers" on page
457.
13. Configure the system properties for this message trigger:

Check Box Description
Triggering System(s) that will trigger the action.
In Message For each message that is placed on a message queue due to the
trigger, the message includes the pkey_src_object value for each
cross-reference that it has in one of the 'In Message' systems.
Note: You must select at least one Triggering system and one In Message
system.
For example, suppose your implementation had three source systems (A,
B, and C) and a base object record had cross-reference records for A and
B. Suppose the cross-reference in system A for this base object record
were updated. The following table shows possible message trigger
configurations and the resulting message:

In Message Systems Resulting Message
A Message with cross-reference for system A
B Message with cross-reference for system B
C No message (no cross-references from In Message systems)
A & B Message with cross-references for systems A and B
A & C Message with cross-reference for system A
B & C Message with cross-reference for system B
A & B & C Message with cross-references for systems A and B
14. Identify the system to which the event applies, columns to listen to for
changes, and the package used to construct the message.
All events send the base object record—and all corresponding cross-
references that make up that record—to the message, based on the
specified package.
15. Click Next if you have selected an Update option. Otherwise click Finish.
16. If you have selected an Update action, the Schema Manager prompts you
to select the columns to monitor for update actions.

17. Do one of the following:


• Select the column(s) to monitor for the events associated with this
message trigger, or
• Select the Trigger message if change on any column check box to
monitor all columns for updates.
18. Click Finish.

Editing Message Triggers


To edit the properties of an existing message trigger:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.

3. Expand the base object that will be monitored, and select the Message
Trigger Setup node.
4. In the Message Triggers list, click the message trigger that you want to
configure.
The Schema Manager displays the settings for the selected message
trigger.

5. Change the settings you want. For more information, see "Adding Message
Triggers" on page 458 and "Types of Events for Message Triggers" on page
457.

Click the Edit button next to any editable property that you want to change.

6. Click the Save button to save your changes.

Deleting Message Triggers


To delete an existing message trigger:
1. Start the Schema Manager according to the instructions in "Starting the
Schema Manager" on page 81.
2. Acquire a write lock according to the instructions in "Acquiring a Write
Lock" on page 36.
3. Expand the base object that will be monitored, and select the Message
Trigger Setup node.
4. In the Message Triggers list, click the message trigger that you want to
delete.

5. Click the button.


The Schema Manager prompts you to confirm deletion.
6. Click Yes.

JMS Message XML Reference
This section describes the structure of Informatica MDM Hub XML messages
and provides example messages.

Note: If your Informatica MDM Hub implementation requires that you use the
legacy XML message format (Informatica MDM Hub XU version) instead of the
current version of the XML message format (described in this section), see
"Legacy JMS Message XML Reference" on page 479 instead.

Generating ORS-specific XML Message Schemas


As described in "ORS-specific XML Message Schemas" on page 262, to create
XML messages, the publish process relies on an ORS-specific schema file
(<ors-name>-siperian-mrm-event.xsd) that you generate using the JMS Event
Schema Manager tool in the Hub Console. For more information, see
"Generating and Deploying ORS-specific Schemas" on page 617.

Elements in an XML Message


The following table describes the elements in an XML message.
Field Description
Root Node
<siperianEvent> Root node in the XML message.
Event Metadata
<eventMetadata> Root node for event metadata.
<messageId> Unique ID for siperianEvent messages.
<eventType> Type of event, as described in "Types of Events for
Message Triggers" on page 457. One of the following
values:
• Insert
• Update
• Update XREF
• Accept as Unique
• Merge
• Unmerge
• Merge Update
<baseObjectUid> UID of the base object affected by this action.
<packageUid> UID of the package associated with this action.
<messageDate> Date/time when this message was generated.
<orsId> ID of the Operational Reference Store (ORS) associated
with this event.
<triggerUid> UID of the rule that triggered the event that generated this
message.
Event Details
<eventTypeEvent> Root node for event details.

<sourceSystemName> Name of the source system associated with this event.
<sourceKey> Value of the PKEY_SRC_OBJECT associated with this event.
<eventDate> Date/time when the event was generated.
<rowid> RowID of the base object record that was affected by the
event.
<xrefKey> Root node of a cross-reference record affected by this
event.
<systemName> System name of the cross-reference record affected by
this event.
<sourceKey> PKEY_SRC_OBJECT of the cross-reference record affected
by this event.
<packageName> Name of the secure package associated with this event.
<columnName> Each column in the package is represented by an element
in the XML file. Examples: rowidObject and
consolidationInd. Defined in the ORS-specific XSD that is
generated using the JMS Event Schema Manager tool. For
more information, see "Generating and Deploying ORS-
specific Schemas" on page 617.
<mergedRowid> List of ROWID_OBJECT values for the losing records in the
merge. This field is included in messages for Merge events
only.

Filtering Messages
You can use the custom JMS header named MessageType to filter incoming
messages based on the message type. The following message types are
indicated in the message header.
Message Type Description
siperianEvent Event notification message.
<serviceNameReturn> For Services Integration Framework (SIF) responses,
the response begins with the name of the SIF request, as in the following
fragment of a response to a get request:
<getReturn>
 <message>The GET was executed successfully - retrieved 1
records</message>
 <recordKey>
 <ROWID>2</ROWID>
 </recordKey>
...
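
For example, a consumer can pass a standard JMS message selector on the
MessageType header so that it receives only event notification messages. The
following minimal sketch assumes the same hypothetical JNDI names used in
the earlier examples.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.naming.InitialContext;

public class MessageTypeFilterExample {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory =
                (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/OutboundQueue");
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        // The selector is evaluated by the JMS provider against message
        // headers and properties; only siperianEvent messages are delivered.
        MessageConsumer consumer =
                session.createConsumer(queue, "MessageType = 'siperianEvent'");
        connection.start();
    }
}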

Example XML Messages


This section provides listings of example XML messages.

Accept As Unique Message

The following is an example of an Accept As Unique message:

<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Accept as Unique</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>192</messageId>
<messageDate>2008-09-10T16:33:14.000-07:00</messageDate>
</eventMetadata>
<acceptAsUniqueEvent>
<sourceSystemName>Admin</sourceSystemName>
<sourceKey>SVR1.1T1</sourceKey>
<eventDate>2008-09-10T16:33:14.000-07:00</eventDate>
<rowid>2 </rowid>
<xrefKey>
<systemName>Admin</systemName>
<sourceKey>SVR1.1T1</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>2 </rowidObject>
<creator>admin</creator>
<createDate>2008-08-13T20:28:02.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-10T16:33:14.000-07:00</lastUpdateDate>
<consolidationInd>1</consolidationInd>
<lastRowidSystem>SYS0 </lastRowidSystem>
<dirtyInd>0</dirtyInd>
<firstName>Joey</firstName>
<lastName>Brown</lastName>
</contactPkg>
</acceptAsUniqueEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

AMRule Message

The following is an example of an AMRule message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>AM Rule Event</eventType>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<interactionId>12</interactionId>
<activityName>Changed Contact and Address </activityName>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdateLegacy</triggerUid>
<messageId>291</messageId>
<messageDate>2008-09-19T11:43:42.979-07:00</messageDate>
</eventMetadata>
<amRuleEvent>

<eventDate>2008-09-19T11:43:42.979-07:00</eventDate>
<contactPkgAmEvent>
<amRuleUid>AM_RULE.RuleSet1|Rule1</amRuleUid>
<contactPkg>
<rowidObject>64 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-08T16:24:35.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-18T16:26:45.000-07:00</lastUpdateDate>
<consolidationInd>2</consolidationInd>
<lastRowidSystem>SYS0 </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>Johnny</firstName>
<lastName>Brown</lastName>
<hubStateInd>1</hubStateInd>
</contactPkg>
<cContact>
<event>
<eventType>Update</eventType>
<system>Admin</system>
</event>
<event>
<eventType>Update XREF</eventType>
<system>Admin</system>
</event>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>PK1265</sourceKey>
</xrefKey>
<xrefKey>
<systemName>Admin</systemName>
<sourceKey>64</sourceKey>
</xrefKey>
</cContact>
</contactPkgAmEvent>
</amRuleEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

BoDelete Message

The following is an example of a BoDelete message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>BO Delete</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>328</messageId>
<messageDate>2008-09-19T14:35:53.000-07:00</messageDate>
</eventMetadata>

- 467 -
<boDeleteEvent>
<sourceSystemName>Admin</sourceSystemName>
<eventDate>2008-09-19T14:35:53.000-07:00</eventDate>
<rowid>107 </rowid>
<xrefKey>
<systemName>CRM</systemName>
</xrefKey>
<xrefKey>
<systemName>Admin</systemName>
</xrefKey>
<xrefKey>
<systemName>WEB</systemName>
</xrefKey>
<contactPkg>
<rowidObject>107 </rowidObject>
<creator>sifuser</creator>
<createDate>2008-09-19T14:35:28.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-19T14:35:53.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>John</firstName>
<lastName>Smith</lastName>
<hubStateInd>-1</hubStateInd>
</contactPkg>
</boDeleteEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

BoSetToDelete Message

The following is an example of a BoSetToDelete message:

<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>BO set to Delete</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>319</messageId>
<messageDate>2008-09-19T14:21:03.000-07:00</messageDate>
</eventMetadata>
<boSetToDeleteEvent>
<sourceSystemName>Admin</sourceSystemName>
<eventDate>2008-09-19T14:21:03.000-07:00</eventDate>
<rowid>102 </rowid>
<xrefKey>
<systemName>CRM</systemName>
</xrefKey>
<xrefKey>
<systemName>Admin</systemName>
</xrefKey>
<xrefKey>

- 468 -
<systemName>WEB</systemName>
</xrefKey>
<contactPkg>
<rowidObject>102 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-19T13:57:09.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-19T14:21:03.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>SYS0 </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<hubStateInd>-1</hubStateInd>
</contactPkg>
</boSetToDeleteEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Delete Message

The following is an example of a Delete message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Delete</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>328</messageId>
<messageDate>2008-09-19T14:35:53.000-07:00</messageDate>
</eventMetadata>
<deleteEvent>
<sourceSystemName>Admin</sourceSystemName>
<eventDate>2008-09-19T14:35:53.000-07:00</eventDate>
<rowid>107 </rowid>
<xrefKey>
<systemName>CRM</systemName>
</xrefKey>
<xrefKey>
<systemName>Admin</systemName>
</xrefKey>
<xrefKey>
<systemName>WEB</systemName>
</xrefKey>
<contactPkg>
<rowidObject>107 </rowidObject>
<creator>sifuser</creator>
<createDate>2008-09-19T14:35:28.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-19T14:35:53.000-07:00</lastUpdateDate>

- 469 -
<consolidationInd>4</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>John</firstName>
<lastName>Smith</lastName>
<hubStateInd>-1</hubStateInd>
</contactPkg>
</deleteEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Insert Message

The following is an example of an Insert message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Insert</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdateLegacy</triggerUid>
<messageId>114</messageId>
<messageDate>2008-09-08T16:02:11.000-07:00</messageDate>
</eventMetadata>
<insertEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>PK12658</sourceKey>
<eventDate>2008-09-08T16:02:11.000-07:00</eventDate>
<rowid>66 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>PK12658</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>66 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-08T16:02:11.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-08T16:02:11.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>Joe</firstName>
<lastName>Brown</lastName>
</contactPkg>
</insertEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Merge Message

The following is an example of a Merge message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Merge</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdateLegacy</triggerUid>
<messageId>130</messageId>
<messageDate>2008-09-08T16:13:28.000-07:00</messageDate>
</eventMetadata>
<mergeEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>PK126566</sourceKey>
<eventDate>2008-09-08T16:13:28.000-07:00</eventDate>
<rowid>65 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>PK126566</sourceKey>
</xrefKey>
<xrefKey>
<systemName>Admin</systemName>
<sourceKey>SVR1.28E</sourceKey>
</xrefKey>
<mergedRowid>62 </mergedRowid>
<contactPkg>
<rowidObject>65 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-08T15:49:17.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-08T16:13:28.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>SYS0 </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>Joe</firstName>
<lastName>Brown</lastName>
</contactPkg>
</mergeEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Merge Update Message

The following is an example of a Merge Update message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Merge Update</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>269</messageId>
<messageDate>2008-09-10T17:25:42.000-07:00</messageDate>
</eventMetadata>
<mergeUpdateEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>P45678</sourceKey>
<eventDate>2008-09-10T17:25:42.000-07:00</eventDate>
<rowid>83 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>P45678</sourceKey>
</xrefKey>
<mergedRowid>58 </mergedRowid>
<contactPkg>
<rowidObject>83 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-10T16:44:56.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-10T17:25:42.000-07:00</lastUpdateDate>
<consolidationInd>1</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>Thomas</firstName>
<lastName>Jones</lastName>
</contactPkg>
</mergeUpdateEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

No Action Message

The following is an example of a No Action message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>No Action</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>267</messageId>
<messageDate>2008-09-10T17:25:42.000-07:00</messageDate>
</eventMetadata>
<noActionEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>P45678</sourceKey>
<eventDate>2008-09-10T17:25:42.000-07:00</eventDate>
<rowid>83 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>P45678</sourceKey>
</xrefKey>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>P45678</sourceKey>
</xrefKey>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>P45678</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>83 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-10T16:44:56.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-10T17:25:42.000-07:00</lastUpdateDate>
<consolidationInd>1</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>Thomas</firstName>
<lastName>Jones</lastName>
</contactPkg>
</noActionEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

PendingInsert Message

The following is an example of a PendingInsert message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Pending Insert</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>302</messageId>
<messageDate>2008-09-19T13:57:10.000-07:00</messageDate>
</eventMetadata>
<pendingInsertEvent>
<sourceSystemName>Admin</sourceSystemName>
<sourceKey>SVR1.2V3</sourceKey>
<eventDate>2008-09-19T13:57:10.000-07:00</eventDate>
<rowid>102 </rowid>
<xrefKey>
<systemName>Admin</systemName>
<sourceKey>SVR1.2V3</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>102 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-19T13:57:09.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-19T13:57:09.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>SYS0 </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>John</firstName>
<lastName>Smith</lastName>
<hubStateInd>0</hubStateInd>
</contactPkg>
</pendingInsertEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

PendingUpdate Message

The following is an example of a PendingUpdate message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Pending Update</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>306</messageId>
<messageDate>2008-09-19T14:01:36.000-07:00</messageDate>
</eventMetadata>
<pendingUpdateEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>CPK125</sourceKey>
<eventDate>2008-09-19T14:01:36.000-07:00</eventDate>
<rowid>102 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>CPK125</sourceKey>
</xrefKey>
<xrefKey>
<systemName>Admin</systemName>
<sourceKey>SVR1.2V3</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>102 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-19T13:57:09.000-07:00</createDate>
<updatedBy>sifuser</updatedBy>
<lastUpdateDate>2008-09-19T14:01:36.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>John</firstName>
<lastName>Smith</lastName>
<hubStateInd>1</hubStateInd>
</contactPkg>
</pendingUpdateEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

PendingUpdateXref Message

The following is an example of a PendingUpdateXref message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Pending Update XREF</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>306</messageId>
<messageDate>2008-09-19T14:01:36.000-07:00</messageDate>
</eventMetadata>
<pendingUpdateXrefEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>CPK125</sourceKey>
<eventDate>2008-09-19T14:01:36.000-07:00</eventDate>
<rowid>102 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>CPK125</sourceKey>
</xrefKey>
<xrefKey>
<systemName>Admin</systemName>
<sourceKey>SVR1.2V3</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>102 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-19T13:57:09.000-07:00</createDate>
<updatedBy>sifuser</updatedBy>
<lastUpdateDate>2008-09-19T14:01:36.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>John</firstName>
<lastName>Smith</lastName>
<hubStateInd>1</hubStateInd>
</contactPkg>
</pendingUpdateXrefEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Unmerge Message

The following is an example of an unmerge message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>UnMerge</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>145</messageId>
<messageDate>2008-09-08T16:24:36.000-07:00</messageDate>
</eventMetadata>
<unmergeEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>PK1265</sourceKey>
<eventDate>2008-09-08T16:24:36.000-07:00</eventDate>
<rowid>65 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>PK1265</sourceKey>
</xrefKey>
<mergedRowid>64 </mergedRowid>
<contactPkg>
<rowidObject>65 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-08T15:49:17.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-08T16:24:35.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>SYS0 </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>Joe</firstName>
<lastName>Brown</lastName>
</contactPkg>
</unmergeEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Update Message

The following is an example of an update message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Update</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>120</messageId>
<messageDate>2008-09-08T16:05:13.000-07:00</messageDate>
</eventMetadata>
<updateEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>PK12658</sourceKey>
<eventDate>2008-09-08T16:05:13.000-07:00</eventDate>
<rowid>66 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>PK12658</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>66 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-08T16:02:11.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-08T16:05:13.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>Joe</firstName>
<lastName>Black</lastName>
</contactPkg>
</updateEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Update XREF Message

The following is an example of an Update XREF message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>Update XREF</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>121</messageId>
<messageDate>2008-09-08T16:05:13.000-07:00</messageDate>
</eventMetadata>
<updateXrefEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>PK12658</sourceKey>
<eventDate>2008-09-08T16:05:13.000-07:00</eventDate>
<rowid>66 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>PK12658</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>66 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-08T16:02:11.000-07:00</createDate>
<updatedBy>admin</updatedBy>
<lastUpdateDate>2008-09-08T16:05:13.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<firstName>Joe</firstName>
<lastName>Black</lastName>
</contactPkg>
</updateXrefEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

XRefDelete Message

The following is an example of an XRefDelete message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>XREF Delete</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>314</messageId>
<messageDate>2008-09-19T14:14:51.000-07:00</messageDate>
</eventMetadata>
<XrefDeleteEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>CPK1256</sourceKey>
<eventDate>2008-09-19T14:14:51.000-07:00</eventDate>
<rowid>102 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>CPK1256</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>102 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-19T13:57:09.000-07:00</createDate>
<updatedBy>sifuser</updatedBy>
<lastUpdateDate>2008-09-19T14:14:54.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<hubStateInd>1</hubStateInd>
</contactPkg>
</XrefDeleteEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

XRefSetToDelete Message

The following is an example of an XRefSetToDelete message:


<?xml version="1.0" encoding="UTF-8"?>
<siperianEvent>
<eventMetadata>
<eventType>XREF set to Delete</eventType>
<baseObjectUid>BASE_OBJECT.C_CONTACT</baseObjectUid>
<packageUid>PACKAGE.CONTACT_PKG</packageUid>
<orsId>localhost-mrm-CMX_ORS</orsId>
<triggerUid>MESSAGE_QUEUE_RULE.ContactUpdate</triggerUid>
<messageId>314</messageId>
<messageDate>2008-09-19T14:14:51.000-07:00</messageDate>
</eventMetadata>
<XrefSetToDeleteEvent>
<sourceSystemName>CRM</sourceSystemName>
<sourceKey>CPK1256</sourceKey>
<eventDate>2008-09-19T14:14:51.000-07:00</eventDate>
<rowid>102 </rowid>
<xrefKey>
<systemName>CRM</systemName>
<sourceKey>CPK1256</sourceKey>
</xrefKey>
<contactPkg>
<rowidObject>102 </rowidObject>
<creator>admin</creator>
<createDate>2008-09-19T13:57:09.000-07:00</createDate>
<updatedBy>sifuser</updatedBy>
<lastUpdateDate>2008-09-19T14:14:54.000-07:00</lastUpdateDate>
<consolidationInd>4</consolidationInd>
<lastRowidSystem>CRM </lastRowidSystem>
<dirtyInd>1</dirtyInd>
<hubStateInd>1</hubStateInd>
</contactPkg>
</XrefSetToDeleteEvent>
</siperianEvent>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.
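
A subscribing application typically branches on the eventType value in the
eventMetadata block. The following is a minimal sketch, in Java, of parsing
one of the messages above with the standard JAXP DOM API; the class and
method names are illustrative only and are not part of the Informatica MDM
Hub API.

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative sketch: dispatch on the eventType of a siperianEvent
// message. Element names match the examples above; the class and
// method names are hypothetical.
public class SiperianEventParser {
    public static void handle(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        // eventMetadata identifies what happened and where.
        String eventType = doc.getElementsByTagName("eventType")
            .item(0).getTextContent();
        System.out.println("Event type: " + eventType);

        // Each xrefKey lists a source system holding a cross-reference
        // for the affected record.
        NodeList systems = doc.getElementsByTagName("systemName");
        for (int i = 0; i < systems.getLength(); i++) {
            System.out.println("XREF system: "
                + systems.item(i).getTextContent());
        }
    }
}

Because the fields in the data area depend on the package configured for the
message queue rule, a consumer should treat package elements such as
contactPkg generically rather than hard-coding them.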

Legacy JMS Message XML Reference


This section describes the structure of legacy Informatica MDM Hub XML
messages and provides example messages. This section applies only if you
have selected the Use Legacy XML check box in the Message Queues tool (see
"Configuring Outbound Message Queues" on page 454). Use this option only
when your Informatica MDM Hub implementation requires that you use the
legacy XML message format (Informatica MDM Hub XU version) instead of the
current version of the XML message format (described in "JMS Message XML
Reference" on page 464).

Message Fields for Legacy XML


The contents of the data area of the message are determined by the package
specified in the trigger. The data area can contain the following fields:
Message Fields

Field           Description
ACTION          Action type: Insert, Update, Update XREF, Accept as Unique,
                Merge, Unmerge, or Merge Update.
MESSAGE_DATE    Time when the event was generated.
TABLE_NAME      Name of the base object table or cross-reference object
                table affected by this action.
RULE_NAME       Name of the rule that triggered the event that generated
                this message.
RULE_ID         ID of the rule that triggered the event that generated this
                message.
ROWID_OBJECT    Unique key for the base object affected by this action.
MERGED_OBJECTS  List of ROWID_OBJECT values for the losing records in the
                merge. This field is included in messages for MERGE events
                only.
SOURCE_XREF     The SYSTEM and PKEY_SRC_OBJECT values for the
                cross-reference that triggered the UPDATE event. This field
                is included in messages for UPDATE events only.
XREFS           List of SYSTEM and PKEY_SRC_OBJECT values for all of the
                cross-references in the output systems for this base object.

Filtering Messages for Legacy XML


You can use the custom JMS header named MessageType to filter incoming
messages based on the message type. The following message types are
indicated in the message header.
Message Type          Description
SIP_EVENT             Event notification message.
<serviceNameReturn>   For Services Integration Framework (SIF) responses,
                      the response begins with the name of the SIF request,
                      as in the following fragment of a response to a get
                      request:
                      <getReturn>
                        <message>The GET was executed successfully -
                        retrieved 1 records</message>
                        <recordKey>
                          <ROWID>2</ROWID>
                        </recordKey>
                      ...
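
For example, a JMS consumer can supply this header in a message selector so
that the JMS provider delivers only event notification messages. The
following is a minimal sketch using the standard JMS API; the JNDI names
jms/ConnectionFactory and jms/SiperianQueue are placeholders for the
connection factory and outbound queue defined in your application server.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

// Illustrative sketch: receive only SIP_EVENT messages by filtering on
// the custom MessageType JMS header. JNDI names are placeholders.
public class SipEventListener {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory =
            (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/SiperianQueue");

        Connection connection = factory.createConnection();
        Session session =
            connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // The selector is evaluated by the JMS provider, so messages with
        // other MessageType values are never delivered to this consumer.
        MessageConsumer consumer =
            session.createConsumer(queue, "MessageType = 'SIP_EVENT'");
        connection.start();

        TextMessage message = (TextMessage) consumer.receive();
        System.out.println(message.getText()); // the SIP_EVENT XML payload
        connection.close();
    }
}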

Example Messages for Legacy XML
This section provides listings of example messages.

Accept as Unique Message

The following is an example of an accept as unique message:


<SIP_EVENT>
 <CONTROLAREA>
 <ACTION>Accept as Unique</ACTION>
 <MESSAGE_DATE>2005-07-21 16:37:00.0</MESSAGE_DATE>
 <TABLE_NAME>C_CUSTOMER</TABLE_NAME>
 <RULE_NAME>CustomerRule1</RULE_NAME>
 <RULE_ID>SVR1.8EO</RULE_ID>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <XREFS>
 <XREF>
 <SYSTEM>CRM</SYSTEM>
 <PKEY_SRC_OBJECT>196 </PKEY_SRC_OBJECT>
 </XREF>
 <XREF>
 <SYSTEM>SFA</SYSTEM>
 <PKEY_SRC_OBJECT>49 </PKEY_SRC_OBJECT>
 </XREF>
 </XREFS>
 </CONTROLAREA>
 <DATAAREA>
 <DATA>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <CONSOLIDATION_IND>1</CONSOLIDATION_IND>
 <FIRST_NAME>Jimmy</FIRST_NAME>
 <MIDDLE_NAME>Neville</MIDDLE_NAME>
 <LAST_NAME>Darwent</LAST_NAME>
 <SUFFIX>Jr</SUFFIX>
 <GENDER>M </GENDER>
 <BIRTH_DATE>1938-06-22</BIRTH_DATE>
 <SALUTATION>Mr</SALUTATION>
 <SSN_TAX_NUMBER>659483774</SSN_TAX_NUMBER>
 <FULL_NAME>Jimmy Darwent, Stony Brook Ny</FULL_NAME>
 </DATA>
 </DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

BO Delete Message

The following is an example of a BO delete message:


<?xml version="1.0" encoding="UTF-8"?>
<SIP_EVENT>
<CONTROLAREA>
<ACTION>BO Delete</ACTION>
<MESSAGE_DATE>2008-09-19 14:35:53.0</MESSAGE_DATE>
<TABLE_NAME>C_CONTACT</TABLE_NAME>
<PACKAGE>CONTACT_PKG</PACKAGE>
<RULE_NAME>ContactUpdateLegacy</RULE_NAME>
<RULE_ID>SVR1.28D</RULE_ID>
<ROWID_OBJECT>107 </ROWID_OBJECT>
<DATABASE>localhost-mrm-CMX_ORS</DATABASE>
<XREFS>
<XREF>
<SYSTEM>CRM</SYSTEM>
<PKEY_SRC_OBJECT />
</XREF>
<XREF>
<SYSTEM>Admin</SYSTEM>
<PKEY_SRC_OBJECT />
</XREF>
<XREF>
<SYSTEM>WEB</SYSTEM>
<PKEY_SRC_OBJECT />
</XREF>
</XREFS>
</CONTROLAREA>
<DATAAREA>
<DATA>
<ROWID_OBJECT>107 </ROWID_OBJECT>
<CREATOR>sifuser</CREATOR>
<CREATE_DATE>19 Sep 2008 14:35:28</CREATE_DATE>
<UPDATED_BY>admin</UPDATED_BY>
<LAST_UPDATE_DATE>19 Sep 2008 14:35:53</LAST_UPDATE_DATE>
<CONSOLIDATION_IND>4</CONSOLIDATION_IND>
<DELETED_IND />
<DELETED_BY />
<DELETED_DATE />
<LAST_ROWID_SYSTEM>CRM </LAST_ROWID_SYSTEM>
<DIRTY_IND>1</DIRTY_IND>
<INTERACTION_ID />
<FIRST_NAME>John</FIRST_NAME>
<LAST_NAME>Smith</LAST_NAME>
<HUB_STATE_IND>-1</HUB_STATE_IND>
</DATA>
</DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

BO set to Delete

The following is an example of a BO set to delete message:


<?xml version="1.0" encoding="UTF-8"?>
<SIP_EVENT>
<CONTROLAREA>
<ACTION>BO set to Delete</ACTION>
<MESSAGE_DATE>2008-09-19 14:21:03.0</MESSAGE_DATE>
<TABLE_NAME>C_CONTACT</TABLE_NAME>
<PACKAGE>CONTACT_PKG</PACKAGE>
<RULE_NAME>ContactUpdateLegacy</RULE_NAME>
<RULE_ID>SVR1.28D</RULE_ID>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<DATABASE>localhost-mrm-CMX_ORS</DATABASE>
<XREFS>
<XREF>
<SYSTEM>CRM</SYSTEM>
<PKEY_SRC_OBJECT />
</XREF>
<XREF>
<SYSTEM>Admin</SYSTEM>
<PKEY_SRC_OBJECT />
</XREF>
<XREF>
<SYSTEM>WEB</SYSTEM>
<PKEY_SRC_OBJECT />
</XREF>
</XREFS>
</CONTROLAREA>
<DATAAREA>
<DATA>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<CREATOR>admin</CREATOR>
<CREATE_DATE>19 Sep 2008 13:57:09</CREATE_DATE>
<UPDATED_BY>admin</UPDATED_BY>
<LAST_UPDATE_DATE>19 Sep 2008 14:21:03</LAST_UPDATE_DATE>
<CONSOLIDATION_IND>4</CONSOLIDATION_IND>
<DELETED_IND />
<DELETED_BY />
<DELETED_DATE />
<LAST_ROWID_SYSTEM>SYS0 </LAST_ROWID_SYSTEM>
<DIRTY_IND>1</DIRTY_IND>
<INTERACTION_ID />
<FIRST_NAME />
<LAST_NAME />
<HUB_STATE_IND>-1</HUB_STATE_IND>
</DATA>
</DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Delete Message

The following is an example of a delete message:


<?xml version="1.0" encoding="UTF-8"?>
<SIP_EVENT>
<CONTROLAREA>
<ACTION>Delete</ACTION>
<MESSAGE_DATE>2008-09-19 14:35:53.0</MESSAGE_DATE>
<TABLE_NAME>C_CONTACT</TABLE_NAME>
<PACKAGE>CONTACT_PKG</PACKAGE>
<RULE_NAME>ContactUpdateLegacy</RULE_NAME>
<RULE_ID>SVR1.28D</RULE_ID>
<ROWID_OBJECT>107 </ROWID_OBJECT>
<DATABASE>localhost-mrm-CMX_ORS</DATABASE>
<XREFS>
<XREF>
<SYSTEM>CRM</SYSTEM>
<PKEY_SRC_OBJECT />
</XREF>
<XREF>
<SYSTEM>Admin</SYSTEM>
<PKEY_SRC_OBJECT />
</XREF>
<XREF>
<SYSTEM>WEB</SYSTEM>
<PKEY_SRC_OBJECT />
</XREF>
</XREFS>
</CONTROLAREA>
<DATAAREA>
<DATA>
<ROWID_OBJECT>107 </ROWID_OBJECT>
<CREATOR>sifuser</CREATOR>
<CREATE_DATE>19 Sep 2008 14:35:28</CREATE_DATE>
<UPDATED_BY>admin</UPDATED_BY>
<LAST_UPDATE_DATE>19 Sep 2008 14:35:53</LAST_UPDATE_DATE>
<CONSOLIDATION_IND>4</CONSOLIDATION_IND>
<DELETED_IND />
<DELETED_BY />
<DELETED_DATE />
<LAST_ROWID_SYSTEM>CRM </LAST_ROWID_SYSTEM>
<DIRTY_IND>1</DIRTY_IND>
<INTERACTION_ID />
<FIRST_NAME>John</FIRST_NAME>
<LAST_NAME>Smith</LAST_NAME>
<HUB_STATE_IND>-1</HUB_STATE_IND>
</DATA>
</DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Insert Message

The following is an example of an insert message:


<SIP_EVENT>
 <CONTROLAREA>
 <ACTION>Insert</ACTION>
 <MESSAGE_DATE>2005-07-21 16:07:26.0</MESSAGE_DATE>
 <TABLE_NAME>C_CUSTOMER</TABLE_NAME>
 <RULE_NAME>CustomerRule1</RULE_NAME>
 <RULE_ID>SVR1.8EO</RULE_ID>
 <ROWID_OBJECT>33 </ROWID_OBJECT>
 <XREFS>
 <XREF>
 <SYSTEM>CRM</SYSTEM>
 <PKEY_SRC_OBJECT>49 </PKEY_SRC_OBJECT>
 </XREF>
 </XREFS>
 </CONTROLAREA>
 <DATAAREA>
 <DATA>
 <ROWID_OBJECT>33 </ROWID_OBJECT>
 <CONSOLIDATION_IND>4</CONSOLIDATION_IND>
 <FIRST_NAME>James</FIRST_NAME>
 <MIDDLE_NAME>Neville</MIDDLE_NAME>
 <LAST_NAME>Darwent</LAST_NAME>
 <SUFFIX>Unknown</SUFFIX>
 <GENDER>M </GENDER>
 <BIRTH_DATE>1938-06-22</BIRTH_DATE>
 <SALUTATION>Mr</SALUTATION>
 <SSN_TAX_NUMBER>216275400</SSN_TAX_NUMBER>
 <FULL_NAME>James Darwent,Stony Brook Ny</FULL_NAME>
 </DATA>
 </DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Merge Message

The following is an example of a merge message:


<SIP_EVENT>
 <CONTROLAREA>
 <ACTION>Merge</ACTION>
 <MESSAGE_DATE>2005-07-21 16:34:28.0</MESSAGE_DATE>
 <TABLE_NAME>C_CUSTOMER</TABLE_NAME>
 <RULE_NAME>CustomerRule1</RULE_NAME>
 <RULE_ID>SVR1.8EO</RULE_ID>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <XREFS>
 <XREF>
 <SYSTEM>CRM</SYSTEM>
 <PKEY_SRC_OBJECT>196 </PKEY_SRC_OBJECT>
 </XREF>
 <XREF>
 <SYSTEM>SFA</SYSTEM>
 <PKEY_SRC_OBJECT>49 </PKEY_SRC_OBJECT>
 </XREF>
 </XREFS>
 <MERGED_OBJECTS>
 <ROWID_OBJECT>7 </ROWID_OBJECT>
 </MERGED_OBJECTS>
 </CONTROLAREA>
 <DATAAREA>
 <DATA>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <CONSOLIDATION_IND>4</CONSOLIDATION_IND>
 <FIRST_NAME>Jimmy</FIRST_NAME>
 <MIDDLE_NAME>Neville</MIDDLE_NAME>
 <LAST_NAME>Darwent</LAST_NAME>
 <SUFFIX>Jr</SUFFIX>
 <GENDER>M </GENDER>
 <BIRTH_DATE>1938-06-22</BIRTH_DATE>
 <SALUTATION>Mr</SALUTATION>
 <SSN_TAX_NUMBER>659483774</SSN_TAX_NUMBER>
 <FULL_NAME>Jimmy Darwent, Stony Brook Ny</FULL_NAME>
 </DATA>
 </DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Merge Update Message

The following is an example of a merge update message:


<SIP_EVENT>
 <CONTROLAREA>
 <ACTION>Merge Update</ACTION>
 <MESSAGE_DATE>2005-07-21 16:34:28.0</MESSAGE_DATE>
 <TABLE_NAME>C_CUSTOMER</TABLE_NAME>
 <RULE_NAME>CustomerRule1</RULE_NAME>
 <RULE_ID>SVR1.8EO</RULE_ID>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <XREFS>
 <XREF>
 <SYSTEM>CRM</SYSTEM>
 <PKEY_SRC_OBJECT>196 </PKEY_SRC_OBJECT>
 </XREF>
 <XREF>
 <SYSTEM>SFA</SYSTEM>
 <PKEY_SRC_OBJECT>49 </PKEY_SRC_OBJECT>
 </XREF>
 </XREFS>
 <MERGED_OBJECTS>
 <ROWID_OBJECT>7 </ROWID_OBJECT>
 </MERGED_OBJECTS>
 </CONTROLAREA>
 <DATAAREA>
 <DATA>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <CONSOLIDATION_IND>4</CONSOLIDATION_IND>
 <FIRST_NAME>Jimmy</FIRST_NAME>
 <MIDDLE_NAME>Neville</MIDDLE_NAME>
 <LAST_NAME>Darwent</LAST_NAME>
 <SUFFIX>Jr</SUFFIX>
 <GENDER>M </GENDER>
 <BIRTH_DATE>1938-06-22</BIRTH_DATE>
 <SALUTATION>Mr</SALUTATION>
 <SSN_TAX_NUMBER>659483774</SSN_TAX_NUMBER>
 <FULL_NAME>Jimmy Darwent, Stony Brook Ny</FULL_NAME>
 </DATA>
 </DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Pending Insert Message

The following is an example of a pending insert message:


<?xml version="1.0" encoding="UTF-8"?>
<SIP_EVENT>
<CONTROLAREA>
<ACTION>Pending Insert</ACTION>
<MESSAGE_DATE>2008-09-19 13:57:10.0</MESSAGE_DATE>
<TABLE_NAME>C_CONTACT</TABLE_NAME>
<PACKAGE>CONTACT_PKG</PACKAGE>
<RULE_NAME>ContactUpdateLegacy</RULE_NAME>
<RULE_ID>SVR1.28D</RULE_ID>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<DATABASE>localhost-mrm-CMX_ORS</DATABASE>
<XREFS>
<XREF>
<SYSTEM>Admin</SYSTEM>
<PKEY_SRC_OBJECT>SVR1.2V3</PKEY_SRC_OBJECT>
</XREF>
</XREFS>
</CONTROLAREA>
<DATAAREA>
<DATA>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<CREATOR>admin</CREATOR>
<CREATE_DATE>19 Sep 2008 13:57:09</CREATE_DATE>
<UPDATED_BY>admin</UPDATED_BY>
<LAST_UPDATE_DATE>19 Sep 2008 13:57:09</LAST_UPDATE_DATE>
<CONSOLIDATION_IND>4</CONSOLIDATION_IND>
<DELETED_IND />
<DELETED_BY />
<DELETED_DATE />
<LAST_ROWID_SYSTEM>SYS0 </LAST_ROWID_SYSTEM>
<DIRTY_IND>1</DIRTY_IND>
<INTERACTION_ID />
<FIRST_NAME>John</FIRST_NAME>
<LAST_NAME>Smith</LAST_NAME>
<HUB_STATE_IND>0</HUB_STATE_IND>
</DATA>
</DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Pending Update Message

The following is an example of a pending update message:


<?xml version="1.0" encoding="UTF-8"?>
<SIP_EVENT>
<CONTROLAREA>
<ACTION>Pending Update</ACTION>
<MESSAGE_DATE>2008-09-19 14:01:36.0</MESSAGE_DATE>
<TABLE_NAME>C_CONTACT</TABLE_NAME>
<PACKAGE>CONTACT_PKG</PACKAGE>
<RULE_NAME>ContactUpdateLegacy</RULE_NAME>
<RULE_ID>SVR1.28D</RULE_ID>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<DATABASE>localhost-mrm-CMX_ORS</DATABASE>
<XREFS>
<XREF>
<SYSTEM>CRM</SYSTEM>
<PKEY_SRC_OBJECT>CPK125</PKEY_SRC_OBJECT>
</XREF>
<XREF>
<SYSTEM>Admin</SYSTEM>
<PKEY_SRC_OBJECT>SVR1.2V3</PKEY_SRC_OBJECT>
</XREF>
</XREFS>
</CONTROLAREA>
<DATAAREA>
<DATA>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<CREATOR>admin</CREATOR>
<CREATE_DATE>19 Sep 2008 13:57:09</CREATE_DATE>
<UPDATED_BY>sifuser</UPDATED_BY>
<LAST_UPDATE_DATE>19 Sep 2008 14:01:36</LAST_UPDATE_DATE>
<CONSOLIDATION_IND>4</CONSOLIDATION_IND>
<DELETED_IND />
<DELETED_BY />
<DELETED_DATE />
<LAST_ROWID_SYSTEM>CRM </LAST_ROWID_SYSTEM>
<DIRTY_IND>1</DIRTY_IND>
<INTERACTION_ID />
<FIRST_NAME>John</FIRST_NAME>
<LAST_NAME>Smith</LAST_NAME>
<HUB_STATE_IND>1</HUB_STATE_IND>
</DATA>
</DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Pending Update XREF Message

The following is an example of a pending update XREF message:


<?xml version="1.0" encoding="UTF-8"?>
<SIP_EVENT>
<CONTROLAREA>
<ACTION>Pending Update XREF</ACTION>
<MESSAGE_DATE>2008-09-19 14:01:36.0</MESSAGE_DATE>
<TABLE_NAME>C_CONTACT</TABLE_NAME>
<PACKAGE>CONTACT_ADDRESS_PKG</PACKAGE>
<RULE_NAME>ContactAM</RULE_NAME>
<RULE_ID>SVR1.1VU</RULE_ID>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<DATABASE>localhost-mrm-CMX_ORS</DATABASE>
<XREFS>
<XREF>
<SYSTEM>CRM</SYSTEM>
<PKEY_SRC_OBJECT>CPK125</PKEY_SRC_OBJECT>
</XREF>
<XREF>
<SYSTEM>Admin</SYSTEM>
<PKEY_SRC_OBJECT>SVR1.2V3</PKEY_SRC_OBJECT>
</XREF>
</XREFS>
</CONTROLAREA>
<DATAAREA>
<DATA>
<ROWID_CONTACT>102 </ROWID_CONTACT>
<CREATOR>admin</CREATOR>
<CREATE_DATE>19 Sep 2008 13:57:09</CREATE_DATE>
<UPDATED_BY>sifuser</UPDATED_BY>
<LAST_UPDATE_DATE>19 Sep 2008 14:01:36</LAST_UPDATE_DATE>
<CONSOLIDATION_IND>4</CONSOLIDATION_IND>
<DELETED_IND />
<DELETED_BY />
<DELETED_DATE />
<LAST_ROWID_SYSTEM>CRM </LAST_ROWID_SYSTEM>
<DIRTY_IND>1</DIRTY_IND>
<INTERACTION_ID />
<FIRST_NAME>John</FIRST_NAME>
<LAST_NAME>Smith</LAST_NAME>
<HUB_STATE_IND>1</HUB_STATE_IND>
<CITY />
<STATE />
</DATA>
</DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Update Message

The following is an example of an update message:


<SIP_EVENT>
 <CONTROLAREA>
 <ACTION>Update</ACTION>
 <MESSAGE_DATE>2005-07-21 16:44:53.0</MESSAGE_DATE>
 <TABLE_NAME>C_CUSTOMER</TABLE_NAME>
 <RULE_NAME>CustomerRule1</RULE_NAME>
 <RULE_ID>SVR1.8EO</RULE_ID>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <SOURCE_XREF>
 <SYSTEM>Admin</SYSTEM>
 <PKEY_SRC_OBJECT>196 </PKEY_SRC_OBJECT>
 </SOURCE_XREF>
 <XREFS>
 <XREF>
 <SYSTEM>CRM</SYSTEM>
 <PKEY_SRC_OBJECT>196 </PKEY_SRC_OBJECT>
 </XREF>
 <XREF>
 <SYSTEM>SFA</SYSTEM>
 <PKEY_SRC_OBJECT>49 </PKEY_SRC_OBJECT>
 </XREF>
 <XREF>
 <SYSTEM>Admin</SYSTEM>
 <PKEY_SRC_OBJECT>74 </PKEY_SRC_OBJECT>
 </XREF>
 </XREFS>
 </CONTROLAREA>
 <DATAAREA>
 <DATA>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <CONSOLIDATION_IND>1</CONSOLIDATION_IND>
 <FIRST_NAME>Jimmy</FIRST_NAME>
 <MIDDLE_NAME>Neville</MIDDLE_NAME>
 <LAST_NAME>Darwent</LAST_NAME>
 <SUFFIX>Jr</SUFFIX>
 <GENDER>M </GENDER>
 <BIRTH_DATE>1938-06-22</BIRTH_DATE>
 <SALUTATION>Mr</SALUTATION>
 <SSN_TAX_NUMBER>659483773</SSN_TAX_NUMBER>
 <FULL_NAME>Jimmy Darwent, Stony Brook Ny</FULL_NAME>
 </DATA>
 </DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Update XREF Message

The following is an example of an update XREF message:


<SIP_EVENT>
 <CONTROLAREA>
 <ACTION>Update XREF</ACTION>
 <MESSAGE_DATE>2005-07-21 16:44:53.0</MESSAGE_DATE>
 <TABLE_NAME>C_CUSTOMER</TABLE_NAME>
 <RULE_NAME>CustomerRule1</RULE_NAME>
 <RULE_ID>SVR1.8EO</RULE_ID>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <SOURCE_XREF>
 <SYSTEM>Admin</SYSTEM>
 <PKEY_SRC_OBJECT>196 </PKEY_SRC_OBJECT>
 </SOURCE_XREF>
 <XREFS>
 <XREF>
 <SYSTEM>CRM</SYSTEM>
 <PKEY_SRC_OBJECT>196 </PKEY_SRC_OBJECT>
 </XREF>
 <XREF>
 <SYSTEM>SFA</SYSTEM>
 <PKEY_SRC_OBJECT>49 </PKEY_SRC_OBJECT>
 </XREF>
 <XREF>
 <SYSTEM>Admin</SYSTEM>
 <PKEY_SRC_OBJECT>74 </PKEY_SRC_OBJECT>
 </XREF>
 </XREFS>
 </CONTROLAREA>
 <DATAAREA>
 <DATA>
 <ROWID_OBJECT>74 </ROWID_OBJECT>
 <CONSOLIDATION_IND>1</CONSOLIDATION_IND>
 <FIRST_NAME>Jimmy</FIRST_NAME>
 <MIDDLE_NAME>Neville</MIDDLE_NAME>
 <LAST_NAME>Darwent</LAST_NAME>
 <SUFFIX>Jr</SUFFIX>
 <GENDER>M </GENDER>
 <BIRTH_DATE>1938-06-22</BIRTH_DATE>
 <SALUTATION>Mr</SALUTATION>
 <SSN_TAX_NUMBER>659483773</SSN_TAX_NUMBER>
 <FULL_NAME>Jimmy Darwent, Stony Brook Ny</FULL_NAME>
 </DATA>
 </DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Unmerge Message

The following is an example of an unmerge message:


<SIP_EVENT>
 <CONTROLAREA>
 <ACTION>UnMerge</ACTION>
 <MESSAGE_DATE>2006-11-07 21:37:56.0</MESSAGE_DATE>
 <TABLE_NAME>C_CONSUMER</TABLE_NAME>
 <PACKAGE>CONSUMER_PKG</PACKAGE>
 <RULE_NAME>Unmerge</RULE_NAME>
 <RULE_ID>SVR1.97S</RULE_ID>
 <ROWID_OBJECT>10</ROWID_OBJECT>
 <DATABASE>edsel-edselsp2-CMX_AT</DATABASE>
 <XREFS>
 <XREF>
 <SYSTEM>Retail System</SYSTEM>
 <PKEY_SRC_OBJECT>8</PKEY_SRC_OBJECT>
 </XREF>
 </XREFS>
 <MERGED_OBJECTS>
 <ROWID_OBJECT>0</ROWID_OBJECT>
 </MERGED_OBJECTS>
 </CONTROLAREA>
 <DATAAREA>
 <DATA>
 <ROWID_OBJECT>10</ROWID_OBJECT>
 <CONSOLIDATION_IND>4</CONSOLIDATION_IND>
 <LAST_ROWID_SYSTEM>SVR1.7NK</LAST_ROWID_SYSTEM>
 <DIRTY_IND>1</DIRTY_IND>
 <INTERACTION_ID />
 <CONSUMER_ID>8</CONSUMER_ID>
 <FIRST_NAME>THOMAS</FIRST_NAME>
 <MIDDLE_NAME>L</MIDDLE_NAME>
 <LAST_NAME>KIDD</LAST_NAME>
 <SUFFIX />
 <TELEPHONE>2178952323</TELEPHONE>
 <GENDER>M</GENDER>
 <DOB>1940</DOB>
 </DATA>
 </DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

XREF Delete Message

The following is an example of an XREF delete message:


<?xml version="1.0" encoding="UTF-8"?>
<SIP_EVENT>
<CONTROLAREA>
<ACTION>XREF Delete</ACTION>
<MESSAGE_DATE>2008-09-19 14:14:51.0</MESSAGE_DATE>
<TABLE_NAME>C_CONTACT</TABLE_NAME>
<PACKAGE>CONTACT_PKG</PACKAGE>
<RULE_NAME>ContactUpdateLegacy</RULE_NAME>
<RULE_ID>SVR1.28D</RULE_ID>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<DATABASE>localhost-mrm-CMX_ORS</DATABASE>
<XREFS>
<XREF>
<SYSTEM>CRM</SYSTEM>
<PKEY_SRC_OBJECT>CPK1256</PKEY_SRC_OBJECT>
</XREF>
</XREFS>
</CONTROLAREA>
<DATAAREA>
<DATA>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<CREATOR>admin</CREATOR>
<CREATE_DATE>19 Sep 2008 13:57:09</CREATE_DATE>
<UPDATED_BY>sifuser</UPDATED_BY>
<LAST_UPDATE_DATE>19 Sep 2008 14:14:54</LAST_UPDATE_DATE>
<CONSOLIDATION_IND>4</CONSOLIDATION_IND>
<DELETED_IND />
<DELETED_BY />
<DELETED_DATE />
<LAST_ROWID_SYSTEM>CRM </LAST_ROWID_SYSTEM>
<DIRTY_IND>1</DIRTY_IND>
<INTERACTION_ID />
<FIRST_NAME />
<LAST_NAME />
<HUB_STATE_IND>1</HUB_STATE_IND>
</DATA>
</DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

XREF set to Delete

The following is an example of an XREF set to delete message:


<?xml version="1.0" encoding="UTF-8"?>
<SIP_EVENT>
<CONTROLAREA>
<ACTION>XREF set to Delete</ACTION>
<MESSAGE_DATE>2008-09-19 14:14:51.0</MESSAGE_DATE>
<TABLE_NAME>C_CONTACT</TABLE_NAME>
<PACKAGE>CONTACT_PKG</PACKAGE>
<RULE_NAME>ContactUpdateLegacy</RULE_NAME>
<RULE_ID>SVR1.28D</RULE_ID>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<DATABASE>localhost-mrm-CMX_ORS</DATABASE>
<XREFS>
<XREF>
<SYSTEM>CRM</SYSTEM>
<PKEY_SRC_OBJECT>CPK1256</PKEY_SRC_OBJECT>
</XREF>
</XREFS>
</CONTROLAREA>
<DATAAREA>
<DATA>
<ROWID_OBJECT>102 </ROWID_OBJECT>
<CREATOR>admin</CREATOR>
<CREATE_DATE>19 Sep 2008 13:57:09</CREATE_DATE>
<UPDATED_BY>sifuser</UPDATED_BY>
<LAST_UPDATE_DATE>19 Sep 2008 14:14:54</LAST_UPDATE_DATE>
<CONSOLIDATION_IND>4</CONSOLIDATION_IND>
<DELETED_IND />
<DELETED_BY />
<DELETED_DATE />
<LAST_ROWID_SYSTEM>CRM </LAST_ROWID_SYSTEM>
<DIRTY_IND>1</DIRTY_IND>
<INTERACTION_ID />
<FIRST_NAME />
<LAST_NAME />
<HUB_STATE_IND>1</HUB_STATE_IND>
</DATA>
</DATAAREA>
</SIP_EVENT>

Your messages will not look exactly like this. The data will reflect your data,
and the fields will reflect your packages.

Part 4: Executing Informatica MDM Hub Processes

Contents
• "Using Batch Jobs " on page 496
• "Writing Custom Scripts to Execute Batch Jobs " on page 559

Chapter 17: Using Batch Jobs

This chapter describes how to configure and execute Informatica MDM Hub
batch jobs using the Batch Viewer and Batch Group tools in the Hub Console.
For more information about creating batch jobs using job execution scripts,
see "Writing Custom Scripts to Execute Batch Jobs " on page 559.

Chapter Contents
• "Before You Begin" on page 496
• "About Informatica MDM Hub Batch Jobs" on page 496
• "Running Batch Jobs Using the Batch Viewer Tool" on page 501
• "Running Batch Jobs Using the Batch Group Tool" on page 512
• "Batch Jobs Reference" on page 530

Before You Begin


Before you begin working with batch jobs, you must have performed the
following prerequisites:
• installed Informatica MDM Hub and created the Hub Store according to the
instructions in the Informatica MDM Hub Installation Guide for your
platform
• built the schema; see "About the Schema" on page 73

About Informatica MDM Hub Batch Jobs


In Informatica MDM Hub, a batch job is a program that, when executed,
completes a discrete unit of work (a process). For example, the Match job
carries out the match process: it searches for match candidates (records that
are possible matches), applies the match rules to the match candidates,
generates the matches, and then queues the matches for either automatic or
manual consolidation. For merge-style base objects, automatic consolidation
is handled by the Automerge job, and manual consolidation is handled by the
Manual Merge job.

Ways to Execute Batch Jobs


You can execute batch jobs in the following ways:
• Hub Console tools:
• Batch Viewer tool—Execute batch jobs individually. For more
information, see "Running Batch Jobs Using the Batch Viewer Tool" on
page 501.

• Batch Group tool—Execute batch jobs in a group. The Batch Group
tool allows you to configure the execution sequence for batch jobs and
to execute batch jobs in parallel. For more information, see "Running
Batch Jobs Using the Batch Group Tool" on page 512.
• Stored procedures—Execute public Informatica MDM Hub processes
(batch jobs and batch groups) through stored procedures using any job
scheduling software (such as Tivoli, CA Unicenter, and so on), as shown
in the sketch after this list. For more information, see "About Executing
Informatica MDM Hub Batch Jobs" on page 559. You can also create and run
stored procedures using the SIF API
(using Java, SOAP, or HTTP/XML). For more information, see the
Informatica MDM Hub Services Integration Framework Guide.
• Services Integration Framework (SIF) requests—Applications can
invoke the SIF ExecuteBatchGroupRequest request to execute batch groups
directly. For more information, see the Informatica MDM Hub Services
Integration Framework Guide.
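
The following is a hedged sketch of the stored-procedure option: a
scheduler-launched Java program calling a batch stored procedure through
JDBC. CMXLD.LOAD_MASTER is the type code cited later in this chapter; the
connection URL, credentials, and the two-parameter call shown here are
illustrative assumptions only — see "Writing Custom Scripts to Execute Batch
Jobs " on page 559 for the actual procedure names and signatures.

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

// Hedged sketch: a scheduler-launched program calling a batch stored
// procedure over JDBC. The URL, credentials, and parameters are
// illustrative assumptions, not documented signatures.
public class ScheduledBatchJob {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:orcl",  // placeholder ORS
                "cmx_ors", "password")) {
            try (CallableStatement call =
                     conn.prepareCall("{call CMXLD.LOAD_MASTER(?, ?)}")) {
                call.setString(1, "C_CONTACT");              // hypothetical IN parameter
                call.registerOutParameter(2, Types.VARCHAR); // hypothetical OUT message
                call.execute();
                System.out.println("Result: " + call.getString(2));
            }
        }
    }
}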

Support Tables Used By Batch Jobs


The following graphic shows the various support tables used by Informatica
MDM Hub batch jobs:

Running Batch Jobs in Sequence


Certain batch jobs require that other batch jobs be completed first. For
example, the landing tables for a base object must be populated before
running any batch jobs. Similarly, before you can run a Match job for a base
object, you must run its corresponding Stage and Load jobs. Finally, when a
base object has dependencies (for example, it is the child of a parent
table, or it has foreign key relationships that point to other base
objects), batch jobs must be run first for the tables on which the base
object depends. As a best practice, develop an administration or operations
plan that specifies which batch processes and dependencies should be
completed before running batch jobs.

Populating Landing Tables Before Running Batch Jobs

One of the tasks Informatica MDM Hub batch jobs perform is to move data
from landing tables to the appropriate target location in Informatica MDM Hub.
Therefore, before you run Informatica MDM Hub batch jobs, you must first
have your source systems or an ETL tool write data into the landing tables.
The landing tables are Informatica MDM Hub’s interface for batch loads. You
deliver the data to the landing tables, and Informatica MDM Hub batch
procedures manipulate the data and copy it to the appropriate location(s). For
more information, see the description of the Informatica MDM Hub data
management process in the Informatica MDM Hub Overview.
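
As an illustration only, an ETL step that writes directly to a landing table
might look like the following JDBC sketch. The landing table name
C_CONTACT_LND and its columns are hypothetical; use the landing tables and
columns defined in your own schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Illustrative sketch: an ETL step delivering one source record to a
// landing table before Stage jobs run. The table and column names are
// hypothetical; use the landing tables defined in your schema.
public class LandingTableLoader {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@localhost:1521:orcl",
                 "cmx_ors", "password");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO C_CONTACT_LND "
                 + "(PKEY_SRC_OBJECT, FIRST_NAME, LAST_NAME) "
                 + "VALUES (?, ?, ?)")) {
            ps.setString(1, "PK12658"); // source key, as in the examples above
            ps.setString(2, "Joe");
            ps.setString(3, "Brown");
            ps.executeUpdate();
        }
    }
}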

Match Jobs and Subsequent Consolidation Jobs

Batch jobs need to be executed in a certain sequence. For example, a Match
job must be run for a base object before running the consolidation process.
For merge-style base objects, you can run the Auto Match and Merge job,
which executes the Match job and then Automerge job repeatedly, until either
all records in the base object have been checked for matches, or until the
maximum number of records for manual consolidation limit is reached (see
"Maximum Matches for Manual Consolidation" on page 368).

Loading Data from Parent Tables First

The general rule of thumb is that all parent tables (tables that other tables
reference) must be loaded first.

Loading Data for Objects With Foreign Key Relationships

If two tables have a foreign key relationship between them, the table that
is being referenced must be loaded first, and the table doing the
referencing must be loaded second. For example, load an Address base object
before loading a Customer base object that references it. The following
foreign key relationships can exist in Informatica MDM Hub: from one base
object (child with foreign key) to another base object (parent with primary
key).

In most cases, you will schedule these jobs to run on a regular basis.

Best Practices for Working With Batch Jobs


While you design and plan your batch jobs, consider the following issues:
• Define your schema.

The schema is fundamental to all your Informatica MDM Hub tasks.
Without a schema, your batch jobs have nothing to do. For more
information about defining the schema, see "About the Schema" on page
73.
• Define mappings before executing Stage jobs.
Mappings define the transformations performed in Stage jobs. If you have
no mappings defined, then the Stage job will not perform any
transformations in the staging process. For more information about
mappings, see "Mapping Columns Between Landing and Staging Tables" on
page 286.
• Define match rules before executing Match jobs.
If you have no match rules, then the Match job will produce no matches.
For more information, see "Configuring Primary Key Match Rules" on page
434.
• Before running production jobs:
• Run tests with small data sets.
• Run tests of your cleanse engine and other components to determine
whether each component is working as expected.
• After testing each of the components separately, test the integrated
system in its entirety to determine whether the overall system is
working as expected.

Batch Job Creation


Batch jobs are created in either of two ways:
• automatically, when you configure the Hub Store, or
• when certain changes occur in your Informatica MDM Hub configuration,
such as changes to trust settings for a base object

Batch Jobs That Are Created Automatically

When you configure your Hub Store, the following types of batch jobs are
automatically created:
• "Auto Match and Merge Jobs" on page 532
• "Autolink Jobs" on page 532
• "Automerge Jobs" on page 534
• "BVT Snapshot Jobs" on page 535
• "External Match Jobs" on page 535
• "Generate Match Tokens Jobs" on page 540
• "Load Jobs" on page 542
• "Manual Link Jobs" on page 545

• "Manual Merge Jobs" on page 545
• "Manual Unlink Jobs" on page 546
• "Manual Unmerge Jobs" on page 546
• "Match Jobs" on page 547
• "Match Analyze Jobs" on page 550
• "Migrate Link Style To Merge Style Jobs" on page 552
• "Promote Jobs" on page 552
• "Reset Links Jobs" on page 555
• "Stage Jobs" on page 556

Batch Jobs That Are Created When Changes Occur

The following batch jobs are created when you make changes to the match
and merge setup, set properties, or enable trust settings after initial loads:
• "Accept Non-Matched Records As Unique " on page 532
• "Key Match Jobs" on page 541
• "Reset Links Jobs" on page 555
• "Reset Match Table Jobs" on page 555
• "Revalidate Jobs" on page 556 (if you enable validation for a column)
• "Synchronize Jobs" on page 557

Information-Only Batch Jobs (Not Run in the Hub Console)

The following batch jobs are for information only and cannot be manually run
from the Hub Console:
• "Accept Non-Matched Records As Unique " on page 532
• "BVT Snapshot Jobs" on page 535
• "Manual Link Jobs" on page 545
• "Manual Merge Jobs" on page 545
• "Manual Unlink Jobs" on page 546
• "Manual Unmerge Jobs" on page 546
• "Migrate Link Style To Merge Style Jobs" on page 552
• "Multi Merge Jobs" on page 552
• "Reset Match Table Jobs" on page 555

Other Batch Jobs


• "Hub Delete Jobs" on page 541

Running Batch Jobs Using the Batch Viewer
Tool
This section describes how to use the Batch Viewer tool in the Hub Console to
run batch jobs individually. To run batch jobs in a group, see "Running Batch
Jobs Using the Batch Group Tool" on page 512.

Batch Viewer Tool


The Batch Viewer tool provides a way to execute batch jobs individually and to
view the job execution logs. The Batch Viewer is useful for starting the run of
a single job, or for running jobs that do not need to run often, such as the
Synchronize job that is run after trust settings change. The job execution log
shows job completion status with any associated messages, such as success,
failure, or warning. The Batch Viewer tool also shows job statistics, if
applicable.

Note: The Batch Viewer does not provide automated scheduling. For more
information about how to create custom scripts to execute batch jobs and
batch groups, see "About Executing Informatica MDM Hub Batch Jobs" on page
559.

Starting the Batch Viewer Tool


To start the Batch Viewer tool:
• In the Hub Console, expand the Utilities workbench, and then click Batch
Viewer.

The Hub Console displays the Batch Viewer tool.

Grouping by Table, Date, or Procedure Type


You can change the top-level view of the navigation tree by right-clicking
the Group By control at the bottom of the tree. Note that the grayed-out
item with the check mark represents the current selection.

Select one of the following options:

Group By Option Description
Table Displays items in the hierarchy at the following levels:
• top level: tables
• second level: procedure type
• third level: batch job
• fourth level: date / timestamp
Date Displays items in the hierarchy at the following levels:
• top level: date / timestamp
• second level: batch jobs by date/timestamp
Procedure Type Displays items in the hierarchy at the following levels:
• top level: procedure type
• second level: batch job
• third level: date / timestamp

The following example shows batch jobs grouped by table.

Running Batch Jobs Manually


To run a batch job manually:
1. Select the batch job to run.
2. Execute the batch job.

Selecting a Batch Job

To select a batch job to run:


1. Start the Batch Viewer tool, as described in "Starting the Batch Viewer
Tool" on page 501.
In the following example, the tree displays a list of batch jobs (the list is
grouped by procedure type).

2. Expand the tree to display the batch job that you want to run, and then
click it to select it.

The Batch Viewer displays a screen for the selected batch job with properties
and command buttons.

Batch Job Properties

The following batch job properties are read-only.

Field           Description
Identity        Identification information for this batch job. Stored in the
                C_REPOS_TABLE_OBJECT_V table.
Name            Type code for this batch job. For example, Load jobs have
                the CMXLD.LOAD_MASTER type code. Stored in the OBJECT_NAME
                column of the C_REPOS_TABLE_OBJECT_V table.
Description     Description for this batch job in the format:
                JobName for | from BaseObjectName
                Examples:
                • Load from Consumer_Credit_Stg
                • Match for Address
                This description is stored in the OBJECT_DESC column of the
                C_REPOS_TABLE_OBJECT_V table.
Status          Status information for this batch job.
Current Status  Current status of the job. Examples:
                • Executing
                • Incomplete
                • Completed
                • Not Executing
                • <Batch Job> Successful
                • Description of failure

Options to Set Before Executing Batch Jobs

Certain types of batch jobs have additional fields that you can configure before
running the batch job.
Field            Only For      Description
Re-generate All  Generate      Controls the scope of match token generation:
Match Tokens     Match Token   tokenizes the entire base object (checked) or
                 Jobs          tokenizes only those records that are flagged
                               in the base object as requiring
                               re-tokenization (unchecked). For more
                               information, see "Regenerating All Match
                               Tokens" on page 540.
Force Update     Load Jobs     If selected, the Load job forces a refresh and
                               loads records from the staging table to the
                               base object regardless of whether the records
                               have already been loaded. For more
                               information, see "Forcing Updates in Load
                               Jobs" on page 544.
Match Set        Match Jobs    Enables you to choose which match rule set to
                               use for this match job. For more information,
                               see "Selecting a Match Rule Set" on page 549.

Command Buttons for Batch Jobs

After you have selected a batch job, you can click the following command
buttons.

Button                    Description
Execute Batch             Executes the selected batch job.
Clear History             Clears the job execution history in the Batch
                          Viewer. For more information, see "Clearing the
                          Job Execution History" on page 512.
Set Status to Incomplete  Sets the status of the currently-executing batch
                          job to Incomplete. For more information, see
                          "Setting the Job Status to Incomplete" on page
                          505.
Refresh Status            Refreshes the status display of the
                          currently-executing batch job. For more
                          information, see "Refreshing the Status" on page
                          505.

Executing a Batch Job

Important: You must have the application server running for the duration of
an executing batch job.

To execute a batch job in the Batch Viewer:


1. In the Batch Viewer, select the batch job that you want to run. For more
information, see "Selecting a Batch Job" on page 502.
2. In the right panel, click Execute Batch (or right-click on the job in the left
panel and select Execute from the pop-up menu)
If the current status of the job is Executing, then the Execute Batch
button is disabled. You must wait for the batch job to finish before you can
run it again.

To execute batch jobs in other ways, see "Ways to Execute Batch Jobs" on
page 496.

Refreshing the Status

While a batch job is running, you can click Refresh Status to check if the
status has changed.

Setting the Job Status to Incomplete

In very rare circumstances, you might want to change the status of a running
job by clicking Set Status to Incomplete and execute the job again. Only do
this if the batch job has stopped executing (due to an error, such as a server
reboot or crash) but Informatica MDM Hub has not detected that the job has
stopped due to a job application lock in the metadata. You will know this is a
problem if the current status is Executing but the database, application
server, and logs show no activity. If this occurs, click this button to clear the
job application lock so that you can run the batch job again; otherwise, you
will not be able to execute the batch job. Setting the status to Incomplete just
updates the status of the batch job—it does not abort the job.

Note: This option is available only if your user ID has Informatica
Administrator rights.

Viewing Job Execution Logs
Informatica MDM Hub creates a job execution log each time that it executes a
batch job.

Job Execution Status

Each job execution log entry has one of the following status values, each
shown with its own icon in the Batch Viewer:
• Batch job is currently running.
• Batch job completed successfully.
• Batch job completed successfully, but additional information is
available. For example, for Stage and Load jobs, this can indicate that
some records were rejected (see "Viewing Rejected Records" on page 510).
For Match jobs, this can indicate that the base object is empty or that
there are no more records to match.
• Batch job failed. For more information, see "Handling the Failed
Execution of a Batch Job" on page 511.
• Batch job status was manually changed from "Executing" to "Incomplete."
For more information, see "Setting the Job Status to Incomplete" on page
505.

Viewing the Job Execution Log for a Batch Job

To view the job execution log for a batch job:


1. Start the Batch Viewer tool, as described in "Starting the Batch Viewer
Tool" on page 501.
2. Expand the tree to display the job execution log that you want to view, and
then click it.
The Batch Viewer displays a screen for the selected job execution log.

Job Execution Log Entry Properties

For each job execution log entry, the Batch Viewer displays the following
information:
Field           Description
Identity        Identification information for this batch job. Stored in the
                C_REPOS_TABLE_OBJECT_V table.
Name            Name of this job execution log: the date / time when the
                batch job started.
Description     Description for this batch job in the format:
                JobName for / from BaseObjectName
                Examples:
                • Load from Consumer_Credit_Stg
                • Match for Address
Source system   One of the following:
                • source system of the processed data
                • Admin
Source table    Source table of the processed data.
Status          Status information for this batch job.
Current Status  Current status of this batch job. If an error occurred,
                displays information about the error. For more information,
                see "Job Execution Status" on page 506.
Metrics         Metrics for this batch job.
[Various]       Statistics collected during the execution of the batch job
                (if applicable). For more information, see:
                • "About Batch Job Metrics" on page 508
                • "Auto Match and Merge Metrics" on page 533
                • "Automerge Metrics" on page 534
                • "Load Job Metrics" on page 544
                • "Match Job Metrics" on page 549
                • "Match Analyze Job Metrics" on page 551
                • "Stage Job Metrics" on page 557
                • "Promote Job Metrics" on page 554
Time            Timestamp information for this batch job.
Start           Date / time when this batch job started.
Stop            Date / time when this batch job ended.
Elapsed time    Elapsed time for the execution of this batch job.

About Batch Job Metrics

Informatica MDM Hub collects various statistics during the execution of a


batch job. The actual metrics returned depends on the specific batch job.
When a batch job has completed, it registers its statistics in C_REPOS_JOB_
METRIC. There can be multiple statistics for each job. The possible job metrics
include:
• Total records: Total number of records processed by the batch job.
• Inserted: Number of records inserted by the batch job into the target object.
• Updated: Number of records updated by the batch job in the target object.
• No action: Number of records on which no action was taken (the records already existed in the base object).
• Matched records: Number of records that were matched by the batch job.
• Average matches: Average number of matches.
• Updated XREF: Number of records that updated the cross-reference table for this base object. If you are loading a record during an incremental load, that record has already been consolidated (exists only in the XREF and not in the base object).
• Records tokenized: Number of records tokenized by the batch job. Applies only if the Generate Match Tokens on Load check box is selected in the Schema tool. For more information, see "Generating Match Tokens During Load Jobs" on page 544.
• Records flagged for match: Number of records flagged for match.
• Automerged records: Number of records that were merged by the Automerge batch job.
• Rejected records: Number of records rejected by the batch job. For more information, see "Viewing Rejected Records" on page 510.
• Unmerged source records: Number of source records that were not merged by the batch job.
• Accepted as unique records: Number of records that were accepted as unique records by the batch job. For more information, see "Automerge Jobs" on page 534. Applies only if this base object has Accept All Unmatched Rows as Unique enabled (set to Yes) in the Match / Merge Setup configuration. For more information, see "Accept All Unmatched Rows as Unique" on page 369.
• Queued for automerge: Number of records that were queued for automerge by a Match job that was executed by the Auto Match and Merge job. For more information, see "Automerge Jobs" on page 534.
• Queued for manual merge: Number of records that were queued for manual merge. Use the Merge Manager in the Hub Console to process these records. For more information, see the Informatica MDM Hub Data Steward Guide.
• Backfill trust records
• Missing lookup / Invalid rowid_object records: Number of source records that were missing lookup information or had invalid rowid_object records.
• Records moved to Hold status: Number of records placed on Hold status.
• Records analyzed (to be matched): Number of records to be matched.
• Match comparisons: Number of match comparisons.
• Total cleansed records: Number of cleansed records.
• Total landing records: Number of records placed in the landing table.
• Invalid supplied rowid_object records: Number of records with an invalid rowid_object.
• Auto-linked records: Number of auto-linked records.
• BVT snapshot: Snapshot of the BVT.
• Duplicate matched records: Number of duplicate matched records.
• Links removed: Number of links removed.
• Revalidated records: Number of records revalidated.
• Base object records reset to New status: Number of base object records reset to "new" status.
• Links converted to matches: Number of links converted to matches.
• Auto-promoted records: Number of auto-promoted records.
• Deleted XREF records: Number of XREF records deleted.
• Deleted record: Number of records deleted.
• Invalid records: Number of invalid records.
• Not promoted active records: Number of active records not promoted.
• Not promoted protected records: Number of protected records not promoted.
• Deleted BO records: Number of base object records deleted.
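If you want to inspect registered metrics outside the Batch Viewer, you can query the repository table directly. The table name comes from the text above; its column layout is not spelled out here, so this minimal sketch deliberately avoids naming columns. Run it against the Hub Store schema:

  -- C_REPOS_JOB_METRIC is named above; columns are intentionally
  -- not assumed, so the query selects everything.
  SELECT *
    FROM c_repos_job_metric;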

Viewing Rejected Records

For Stage jobs or Load jobs only, if the batch job resulted in records being written to the rejects table, then the job execution log displays a View Rejects button.

Note: Records are rejected if the HUB_STATE_IND value is not valid.

To view the rejected records and the reason why each was rejected:
1. Click the View Rejects button.
The Batch Viewer displays a table of rejected records.

2. Click Close.
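The same rejected records can also be inspected with SQL outside the Hub Console. A minimal sketch, assuming a rejects table that accompanies the staging table and follows a <staging table>_REJ naming pattern; both the suffix and the staging table name (taken from the earlier Consumer_Credit_Stg example) are assumptions to verify in your schema:

  -- The _REJ suffix and the staging table name are assumptions.
  SELECT *
    FROM c_consumer_credit_stg_rej;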

Handling the Failed Execution of a Batch Job

If a batch job fails, perform the following steps:


• Display the execution log entry for this batch job.
• Read the error text in the Current Status field for diagnostic information.
• Take corrective action as necessary.

Copying the Current Status to the Windows Clipboard

To copy the current status of a batch job to the Windows Clipboard (to paste into a document or e-mail, for example):

• Click the copy button.

Deleting Job Execution Log Entries

To delete the selected job execution log:

• Click the delete button in the top right-hand corner of the job properties page.

Clearing the Job Execution History
After running batch jobs over time, the list of executed jobs can become very
large. You should periodically remove the extraneous job execution logs from
this list.

Note: The actual procedure steps to clear job history will be slightly different
depending on the view (By Table, By Date, or By Procedure Type); the
following procedure assumes you are using the By Table view.

To clear the job history:


1. Start the Batch Viewer tool, as described in "Starting the Batch Viewer
Tool" on page 501.
2. In the Batch Viewer, expand the tree underneath your base object.
3. Expand the tree under the type of batch job.
4. Select the job for which you want to clear the history.

5. Click Clear History.


6. Click Yes to confirm that you want to delete all the execution history for
this batch job.

Running Batch Jobs Using the Batch Group Tool

This section describes how to use the Batch Group tool in the Hub Console to run batch jobs in groups. To run batch jobs individually, see "Running Batch Jobs Using the Batch Viewer Tool" on page 501.

The Batch Viewer does not provide automated scheduling. For more information about how to create custom scripts to execute batch jobs and batch groups, see "Writing Custom Scripts to Execute Batch Jobs" on page 559.

About Batch Groups


A batch group is a collection of individual batch jobs (for example, Stage,
Load, and Match jobs) that can be executed with a single command. Each
batch job in a batch group can be executed sequentially or in parallel with
other jobs. You use the Batch Group tool to configure and run batch groups.
For more information about batch jobs, see "Batch Jobs Reference" on page
530.

For more information about developing custom batch jobs and batch groups
that can be made available in the Batch Group tool, see "Developing Custom
Stored Procedures for Batch Jobs" on page 604.

Note: If you delete an object from the Hub Console (for example, if you
delete a mapping), the Batch Group tool highlights any batch jobs that depend
on that object (for example, a stage job) in red. You must resolve this issue
prior to re-executing the batch group.

Sequential and Parallel Execution

Batch jobs can be executed in the following ways:


• Sequential: Only one batch job in the batch group is executed at a time.
• Parallel: Multiple batch jobs in the batch group are executed concurrently.

Execution Paths

An execution path is the sequence in which batch jobs are executed when the
entire batch group is executed. The execution path begins with the Start node
and ends with the End node. The Batch Group tool does not validate the
execution sequence for you—it is up to you to ensure that the execution
sequence is correct. For example, the Batch Group tool would not notify you of
an error if you incorrectly specified the Load job for a base object ahead of its
Stage job.

Levels

In a batch group, the execution path consists of a series of one or more levels
that are executed in sequence (see "Running Batch Jobs in Sequence" on page
497).

A level is a collection of one or more batch jobs.


• If a level contains multiple batch jobs, then these batch jobs are executed
in parallel.
• If a level contains only a single batch job, then this batch job is executed
singly.

All batch jobs in the level must complete before the batch group proceeds to
the next task in the sequence.

Note: Because all of the batch jobs in a level are executed in parallel, none of
the batch jobs in the same level should have any dependencies. For example,
the Stage and Load jobs for a base object should be in separate levels that are
executed in the proper sequence. For more information, see "Running Batch
Jobs in Sequence" on page 497.

Other Ways to Execute Batch Groups

In addition to using the Batch Group tool, you can execute batch groups in the
following ways:
• Services Integration Framework (SIF) requests—Applications can
invoke the SIF ExecuteBatchGroupRequest request to execute batch groups
directly. For more information, see the Informatica MDM Hub Services
Integration Framework Guide.
• Stored procedures—Execute batch groups through stored procedures using any job scheduling software (such as Tivoli, CA Unicenter, and so on); a schematic call is sketched below. For more information, see "Executing Batch Groups Using Stored Procedures" on page 598.

Starting the Batch Group Tool


To start the Batch Group tool:
• In the Hub Console, expand the Utilities workbench, and then click Batch
Group.

The Hub Console displays the Batch Group tool.

The Batch Group tool consists of the following areas:

• Navigation Tree: Hierarchical list of batch groups and execution logs.
• Properties Pane: Properties and command buttons for the selected item.

Configuring Batch Groups


This section describes how to add, edit, and delete batch groups. For more
information, see "About Batch Groups" on page 512.

Adding Batch Groups

To add a batch group:

1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Right-click the Batch Groups node in the Batch Group tree and choose Add
Batch Group from the pop-up menu.
The Batch Group tool adds a “New Batch Group” to the Batch Group tree.

Note the empty execution sequence. You will configure this after adding
the new batch group. For more information, see "Configuring Levels for
Batch Groups" on page 516.
4. Specify the following information:

Field Description
Name Specify a unique, descriptive name for this batch group.
Description Enter a description for this batch group.

5. Click the Save button to save your changes.


The Batch Group tool saves your changes and updates the navigation tree.
To add batch jobs to the new batch group, see "Assigning Batch Jobs to
Batch Group Levels" on page 519.

Editing Batch Group Properties

To edit batch group properties:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to edit.
4. Specify a different batch group name, if you want.
5. Specify a different description, if you want.

6. Click the Save button to save your changes.

Deleting Batch Groups

To delete a batch group:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.

3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to delete.
4. Right-click the batch group that you want to delete, and then click Delete
Batch Group.
The Batch Group tool prompts you to confirm deletion.
5. Click Yes.
The Batch Group tool removes the deleted batch group from the navigation
tree.

Configuring Levels for Batch Groups

As described in "About Batch Groups" on page 512, a batch group contains one
or more levels that are executed in sequence. This section describes how to
specify the execution sequence by configuring the levels in a batch group.

Adding Levels to a Batch Group

To add a level to a batch group:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to configure.
4. In the batch groups tree, right click on any level, and choose one of the
following options:

• Add Level Above: Add a level to this batch group above the selected item.
• Add Level Below: Add a level to this batch group below the selected item.
• Move Level Up: Move this batch group level above the prior level.
• Move Level Down: Move this batch group level below the next level.
• Remove this Level: Remove this batch group level.
The Batch Group tool displays the Choose Jobs to Add to Batch Group dialog.

5. Expand the base object(s) for the job(s) that you want to add.

6. Select the job(s) that you want to add.


To select jobs that you want to execute in parallel, hold down the CTRL key
and click each job that you want to select.
7. Click OK. The Batch Group tool adds the selected job(s) to the batch group.

8. Click the Save button to save your changes.

Removing Levels From a Batch Group

To remove a level from a batch group:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to configure.
4. In the batch group, right click on the level that you want to delete, and
choose Remove this Level.
The Hub Console displays the delete confirmation dialog.

5. Click Yes.
The Batch Group tool removes the deleted level from the batch group.

To Move a Level Up Within a Batch Group

To move a level up within a batch group:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.

2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to configure.
4. In the batch groups tree, right click on the level you want to move up, and
choose Move Level Up.
The Batch Group tool moves the level up within the batch group.

To Move a Level Down Within a Batch Group

To move a level down within a batch group:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to configure.
4. In the batch groups tree, right click on the level you want to move down,
and choose Move Level Down.
The Batch Group tool moves the level down within the batch group.

Assigning Batch Jobs to Batch Group Levels

In the Batch Group tool, a job is an Informatica MDM Hub batch job. Each level contains one or more batch jobs. If a level contains multiple batch jobs, then all of those batch jobs are executed in parallel.

Adding a Batch Job to a Batch Group Level

To add a batch job to a batch group:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to configure.
4. In the batch groups tree, right click on the level to which you want to add
jobs, and choose Add jobs to this level....
The Batch Group tool displays the Choose Jobs to Add to Batch Group
dialog.

5. Expand the base object(s) for the job(s) that you want to add.

6. Select the job(s) that you want to add.


To select multiple jobs at once (to execute them in parallel), hold down the
CTRL key while clicking jobs.
7. Click OK.
8. Save your changes.
The Batch Group tool adds the selected jobs to the target level box.
Informatica MDM Hub executes all batch jobs in a group level in parallel.

Configuring Options for Batch Jobs

When configuring a batch group, you can configure job options for certain
kinds of batch jobs. For more information about these job options, see "Options to Set Before Executing Batch Jobs" on page 504.

Removing a Batch Job From a Level

To remove a batch job from a level:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to configure.
4. In the batch group, right click on the job that you want to delete, and
choose Remove Job.
The Batch Group tool displays the delete confirmation dialog.

5. Click Yes to delete the selected job.


The Batch Group tool removes the deleted job from this level in the batch
group.

To Move a Batch Job Up a Level

To move a batch job up a level:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to configure.
4. In the batch group, right click on the job that you want to move up, and
choose Move job up.
The Batch Group tool moves the selected job up one level in the batch
group.

To Move a Batch Job Down a Level

To move a batch job down a level:

1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to configure.
4. In the batch group, right click on the job that you want to move down, and choose Move job down.
The Batch Group tool moves the selected job down one level in the batch
group.

Refreshing the Batch Groups List


To refresh the batch groups list:
• Right-click anywhere in the navigation pane and choose Refresh.

Executing Batch Groups Using the Batch Group Tool


This section describes how to manage batch group execution in the Batch Group tool. For more information about executing batch jobs in other ways, such as using stored procedures or the Services Integration Framework, see "Ways to Execute Batch Jobs" on page 496.

Important: You must have the application server running for the duration of
an executing batch group.

Note: If you delete an object from the Hub Console (for example, if you
delete a mapping), the Batch Group tool highlights any batch jobs that depend
on that object (for example, a stage job) in red. You must resolve this issue
prior to re-executing the batch group.

Navigating to the Control & Logs Screen

The Control & Logs screen is where you can control the execution of a batch
group and view its execution logs.

To navigate to the Control & Logs screen for a batch group:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Expand the Batch Group tree to display the batch group that you want to
execute.

3. Expand the batch group and click the Control & Logs node.
The Batch Group tool displays the Control & Logs screen for this batch
group.

Components of the Control & Logs Screen

This screen contains the following components:

• Toolbar: Command buttons for managing batch group execution. For more information, see "Command Buttons for Batch Groups" on page 523.
• Logs for the Batch Group: Execution logs for this batch group.
• Logs for Batch Jobs: Execution logs for individual batch jobs in this batch group.

Command Buttons for Batch Groups

Use the following command buttons to manage batch group execution.


• Execute: Executes this batch group.
• Set to Restart: Sets the execution status of a failed batch group to restart. For more information, see "Restarting a Batch Group That Failed Execution" on page 526.
• Set to Incomplete: Sets the execution status of a running batch group to incomplete. For more information, see "Handling Incomplete Batch Group Execution" on page 527.
• Clear Selected: Removes the selected group or job execution log.
• Clear All: Removes all group and job execution logs.
• Refresh: Refreshes the screen for this batch group.

Executing a Batch Group

To execute a batch group:


1. Navigate to the Control & Logs screen for the batch group.

For more information, see "Navigating to the Control & Logs Screen" on
page 522.
2. Click on the node and then select Batch Group > Execute, or click on the
Execute button.
The Batch Group tool executes the batch group and updates the logs panel
with the status of the batch group execution.
3. Click the Refresh button to see the execution result.
The Batch Group tool displays progress information.

When finished, the Batch Group tool adds entries to:


• the group execution log for this batch group
• the job execution log for individual batch jobs

Note: When you execute a batch group in FAILED status, you are re-executing the failed instance: the status is set to whatever the final outcome of the re-run is, and the Hub does not generate a new group log. In the detailed logs (the lower log table), however, each batch job executes as a new instance, so the Hub generates a new job log and displays it there.

Group Execution Status

Each execution log has one of the following status values:


• Processing: The batch group is currently running.
• Completed successfully: Batch group execution completed successfully.
• Completed with information: Batch group execution completed with additional information. For example, for Stage and Load jobs, this can indicate that some records were rejected (see "Viewing Rejected Records" on page 528). For Match jobs, this can indicate that the base object is empty or that there are no more records to match.
• Failed: Batch group execution failed. For more information, see "Restarting a Batch Group That Failed Execution" on page 526.
• Incomplete: Batch group execution is incomplete. For more information, see "Handling Incomplete Batch Group Execution" on page 527.
• Restarted: Batch group execution has been reset to start over. For more information, see "Restarting a Batch Group That Failed Execution" on page 526.

Viewing the Group Execution Log for a Batch Group

Each time that it executes a batch group, the Batch Group tool generates a
group execution log entry. Each log entry has the following properties:
• Status: Current status of this batch job. If batch group execution failed, displays a description of the problem. For more information, see "Group Execution Status" on page 525.
• Start: Date/time when this batch job started.
• End: Date/time when this batch job ended.
• Message: Any messages regarding batch group execution.

Viewing the Job Execution Log for a Batch Job

Each time that it executes a batch job within a batch group, the Batch Group
tool generates a job execution log entry.

Each log entry has the following properties:


• Job Name: Name of this batch job.
• Status: Current status of this batch job. For more information, see "Job Execution Status" on page 506.
• Start: Date/time when this batch job started.
• End: Date/time when this batch job ended.
• Message: Any messages regarding batch group execution.

Note: If you want to view the metrics for a completed batch job, you can use
the Batch Viewer. For more information, see "Viewing Job Execution Logs" on
page 506.

Restarting a Batch Group That Failed Execution

If batch group execution fails, you can resolve any problems that may have caused the failure, and then restart the batch group from the beginning.

To execute the batch group again:


1. In the Logs for My Batch Group list, select the execution log entry for the
batch group that failed.

2. Click Set to Restart.


The Batch Group tool changes the status of this batch job to Restart.

3. Resolve any problems that may have caused the failure to occur and
execute the batch group again. For more information, see "Executing a
Batch Group" on page 524.
The Batch Group tool executes the batch group and creates a new
execution log entry.

Note: If a batch group fails and you do not click either the Set to Restart
button (see "Restarting a Batch Group That Failed Execution" on page 526) or
the Set to Incomplete button (see "Handling Incomplete Batch Group
Execution" on page 527) in the Logs for My Batch Group list, Informatica MDM
Hub restarts the batch job from the prior failed level.

Handling Incomplete Batch Group Execution

In very rare circumstances, you might want to change the status of a running
batch group.
• If the batch group status says it is still executing, you can click Set Status
to Incomplete and execute the batch group again. You do this only if the
batch group has stopped executing (due to an error, such as a server
reboot or crash) but Informatica MDM Hub has not detected that the batch
group has stopped due to a job application lock in the metadata.
You will know this is a problem if the current status is Executing but the
database, application server, and logs show no activity. If this occurs,
click this button to clear the job application lock so that you can run the
batch group again; otherwise, you will not be able to execute the batch
group. Setting the status to Incomplete just updates the status of the batch
group (as well as all batch jobs within the batch group)—it does not
terminate processing.
Note that, if the job status is Incomplete, you cannot set the job status to
Restart.
• If the job status is Failed, you can click Set to Restart. Note that, if the
job status is Restart, you cannot set the job status to Incomplete.

Changing the status allows you to continue doing something else while the
batch group completes.

To set the status of a running batch group to incomplete:
1. In the Logs for My Batch Group list, select the execution log entry for the
running batch group that you want to mark as incomplete.

2. Click Set to Incomplete.


The Batch Group tool changes the status of this batch job to Incomplete.

3. Execute the batch group again. For more information, see "Executing a
Batch Group" on page 524.

Note: If a batch group fails and you do not click either the Set to Restart
button (see "Restarting a Batch Group That Failed Execution" on page 526) or
the Set to Incomplete button (see "Handling Incomplete Batch Group
Execution" on page 527) in the Logs for My Batch Group list, Informatica MDM
Hub restarts the batch job from the prior failed level.

Viewing Rejected Records

If batch group execution resulted in records being written to the rejects table
(during the execution of Stage jobs or Load jobs), then the job execution log
enables the View Rejects button.

To view rejected records:


1. Click the View Rejects button.
The Batch Group tool displays the Rejects window.

2. Navigate and inspect the rejected records as needed.
3. Click Close.

Filtering Execution Logs By Status


You can view history logs across all batch groups, based on their execution status, by clicking the appropriate node under the Logs By Status node.

To filter execution logs by status:


1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. In the Batch Group tree, expand the Logs by Status node.
The Batch Group tool displays the log status list.

3. Click the particular batch group log entry you want to review in the upper
half of the logs panel.
Informatica MDM Hub displays the detailed job execution logs for that
batch group in the lower half of the panel. For additional information, see:
• "Group Execution Status" on page 525
• "Viewing the Group Execution Log for a Batch Group" on page 525
• "Viewing the Job Execution Log for a Batch Job" on page 526

Note: Batch group logs can be deleted by selecting a batch group log and
clicking the Clear Selected button. To delete all logs shown in the panel,
click the Clear All button.

Deleting Batch Groups
To delete a batch group:
1. Start the Batch Group tool. For more information, see "Starting the Batch
Group Tool" on page 514.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the navigation tree, expand the Batch Group node to show the batch
group that you want to delete.
4. Right-click the batch group that you want to delete, and choose Delete Batch Group (or select Batch Group > Delete Batch Group).

Batch Jobs Reference


This section describes each Informatica MDM Hub batch job.

Alphabetical List of Batch Jobs


• "Accept Non-Matched Records As Unique" on page 532: For records that have undergone the match process but had no matching data, sets the consolidation indicator to 1 (consolidated), meaning that the record was unique and did not require consolidation.
• "Autolink Jobs" on page 532: Automatically links records that have qualified for autolinking during the match process and are flagged for autolinking (Automerge_ind=1).
• "Auto Match and Merge Jobs" on page 532: Executes a continual cycle of a Match job, followed by an Automerge job, until there are no more records to match, or until the number of matches ready for manual consolidation exceeds the configured threshold. Used with merge-style base objects only.
• "Automerge Jobs" on page 534: Automatically merges records that have qualified for automerging during the match process and are flagged for automerging (Automerge_ind = 1). Used with merge-style base objects only.
• "BVT Snapshot Jobs" on page 535: Generates a snapshot of the best version of the truth (BVT) for a base object. Used with link-style base objects only.
• "External Match Jobs" on page 535: Matches "externally managed/prepared" records with an existing base object, yielding the results based on the current match settings—all without actually modifying the data in the base object.
• "Generate Match Tokens Jobs" on page 540: Prepares data for matching by generating match tokens according to the current match settings. Match tokens are strings that encode the columns used to identify candidates for matching.
• "Hub Delete Jobs" on page 541: Deletes data from the Hub based on base object / XREF level input.
• "Key Match Jobs" on page 541: Matches records from two or more sources when these sources use the same primary key. Compares new records to each other and to existing records, and identifies potential matches based on the comparison of source record keys as defined by the match rules.
• "Load Jobs" on page 542: Copies records from a staging table to the corresponding target base object in the Hub Store. During the load process, applies the current trust and validation rules to the records.
• "Manual Link Jobs" on page 545: Shows logs for records that have been manually linked in the Merge Manager tool. Used with link-style base objects only.
• "Manual Merge Jobs" on page 545: Shows logs for records that have been manually merged in the Merge Manager tool. Used with merge-style base objects only.
• "Manual Unlink Jobs" on page 546: Shows logs for records that have been manually unlinked in the Merge Manager tool. Used with link-style base objects only.
• "Manual Unmerge Jobs" on page 546: Shows logs for records that have been manually unmerged in the Data Manager tool.
• "Match Jobs" on page 547: Finds duplicate records in the base object, based on the current match rules.
• "Match Analyze Jobs" on page 550: Conducts a search to gather match statistics but does not actually perform the match process. If areas of data with the potential for huge match requirements are discovered, Informatica MDM Hub moves the records to a hold status, which allows a data steward to review the data manually before proceeding with the match process.
• "Match for Duplicate Data Jobs" on page 552: For data with a high percentage of duplicate records, compares new records to each other and to existing records, and identifies exact duplicates. The maximum number of exact duplicates is based on the Duplicate Match Threshold setting for this base object.
• "Migrate Link Style To Merge Style Jobs" on page 552: Used with link-style base objects only. Migrates link-style base objects to merge-style base objects.
• "Multi Merge Jobs" on page 552: Allows the merge of multiple records in one job.
• "Promote Jobs" on page 552: Reads the PROMOTE_IND column from an XREF table and changes the state to ACTIVE on all rows where the column's value is 1.
• "Recalculate BO Jobs" on page 554: Recalculates all base objects identified by the ROWID_OBJECT column in the table/inline view if you include the ROWID_OBJECT_TABLE parameter. If you do not include the parameter, this batch job recalculates all records in the base object, in batches of MATCH_BATCH_SIZE or one quarter of the number of records in the table, whichever is less.
• "Recalculate BVT Jobs" on page 555: Recalculates the BVT for the specified ROWID_OBJECT.
• "Reset Links Jobs" on page 555: Updates the records in the _LINK table to account for changes in the data. Used with link-style base objects only.
• "Reset Match Table Jobs" on page 555: Shows logs of the operation where all matched records have been reset to be queued for match.
• "Revalidate Jobs" on page 556: Executes the validation logic/rules for records that have been modified since the initial validation during the Load Process.
• "Stage Jobs" on page 556: Copies records from a landing table into a staging table. During execution, cleanses the data according to the current cleanse settings.
• "Synchronize Jobs" on page 557: Updates metadata for base objects. Used after a base object has been loaded but not yet merged, and subsequent trust configuration changes (such as enabling trust) have been made to columns in that base object. This job must be run before merging data for this base object.

Accept Non-Matched Records As Unique


Accept Non-matched Records As Unique jobs change the status of records that
have undergone the match process but had no matching data. This job sets the
consolidation indicator to 1, meaning that the record is consolidated or (in this
case) did not require consolidation. The Automerge job adheres to this setting
and treats these as unique records.

The Accept Non-matched Records As Unique job is created:


• only if the base object has Accept All Unmatched Rows as Unique
enabled (set to Yes) in the Match / Merge Setup configuration. For more
information, see "Accept All Unmatched Rows as Unique" on page 369.
• only after a merge job is run, as described in "Batch Jobs That Are Created
When Changes Occur" on page 500.

Note: This job cannot be executed from the Batch Viewer.

Autolink Jobs
For link-style base objects only, after the Match job has been run, you can run
the Autolink job to automatically link any records that qualified for autolinking
during the match process.

Auto Match and Merge Jobs


Auto Match and Merge batch jobs execute a continual cycle of a Match job,
followed by an Automerge job, until there are no more records to match, or
until the maximum number of records for manual consolidation limit is
reached (see "Maximum Matches for Manual Consolidation" on page 368). The
match batch size parameter (see "Number of Rows per Match Job Batch Cycle"
on page 368) controls the number of records per cycle that this process goes
through to finish the match and merge cycles. For more information, see
"Match Jobs" on page 547 and "Automerge Jobs" on page 534.

Important: Do not run an Auto Match and Merge job on a base object that is
used to define relationships between records in inter-table or intra-table
match paths. Doing so will change the relationship data, resulting in the loss of
the associations between records. For more information, see "Relationship
Base Objects" on page 374.

Second Jobs Shown After Application Server Restart

If you execute an Auto Match and Merge job, it completes successfully with
one job shown in the status. However, if you stop and restart the application
server and return to the Batch Viewer, you see a second job (listed under
Match jobs) with a warning a few seconds later. The second job is to ensure
that either the base object is empty or there are no more records to match.

Auto Match and Merge Metrics

After running an Auto Match and Merge job, the Batch Viewer displays the
following metrics (if applicable) in the job execution log.

• Matched records: Number of records that were matched by the Auto Match and Merge job.
• Records tokenized: Number of records that were tokenized prior to the Auto Match and Merge job.
• Automerged records: Number of records that were merged by the Auto Match and Merge job.
• Accepted as unique records: Number of records that were accepted as unique records by the Auto Match and Merge job. For more information, see "Automerge Jobs" on page 534. Applies only if this base object has Accept All Unmatched Rows as Unique enabled (set to Yes) in the Match / Merge Setup configuration. For more information, see "Accept All Unmatched Rows as Unique" on page 369.
• Queued for automerge: Number of records that were queued for automerge by a Match job that was executed by the Auto Match and Merge job. For more information, see "Automerge Jobs" on page 534.
• Queued for manual merge: Number of records that were queued for manual merge. Use the Merge Manager in the Hub Console to process these records. For more information, see the Informatica MDM Hub Data Steward Guide.

Automerge Jobs
For merge-style base objects only, after the Match job has been run, you can
run the Automerge job to automatically merge any records that qualified for
automerging during the match process. When an Automerge job is run, it
processes all matches in the MATCH table that are flagged for automerging
(Automerge_ind=1).

Note: For state-enabled objects only, records that are PENDING (source and
target records) or DELETED are never automerged. When a record is deleted,
it is removed from the match table and its consolidation_ind is reset to 4. For
more information regarding how to manage the state of base object or XREF
records, refer to "Configuring State Management for Base Objects" on page
162.

Automerge Jobs and Auto Match and Merge

Auto Match and Merge batch jobs execute a continual cycle of a Match job,
followed by an Automerge job, until there are no more records to match, or
until the maximum number of records for manual consolidation limit is
reached (see "Maximum Matches for Manual Consolidation" on page 368). For
additional information, see "Auto Match and Merge Jobs" on page 532.

Automerge Jobs and Trust-Enabled Columns

An Automerge job will fail if there is a large number of trust-enabled columns. The exact number of columns that causes the job to fail varies, based on the length of the column names and the number of trust-enabled columns. Long column names are at, or close to, the maximum allowable length of 26 characters. To avoid this problem, keep the number of trust-enabled columns below 40 and/or keep the column names short.

Automerge Metrics

After running an Automerge job, the Batch Viewer displays the following
metrics (if applicable) in the job execution log:

• Automerged records: Number of records that were automerged by the Automerge job.
• Accepted as unique records: Number of records that were accepted as unique records by the Automerge job. Applies only if this base object has Accept All Unmatched Rows as Unique enabled (set to Yes) in the Match / Merge Setup configuration. For more information, see "Accept All Unmatched Rows as Unique" on page 369.

BVT Snapshot Jobs


For a base object table, the best version of the truth (BVT) is a record that has
been consolidated with the best cells of data from the source records. For
more information, see "Best Version of the Truth" on page 259.

Note: For state-enabled base objects only, the BVT logic uses the HUB_STATE_IND to ignore non-contributing base object records whose HUB_STATE_IND is -1 or 0 (the PENDING or DELETED state). For the online BUILD_BVT call, provide the INCLUDE_PENDING_IND parameter.

Possible scenarios include:


1. If this parameter is 0 then include only ACTIVE base object records.
2. If this parameter is 1 then include ACTIVE and PENDING base object
records.
3. If this parameter is 2 then calculate based on ACTIVE and PENDING XREF
records to provide “what-if” functionality.
4. If this parameter is 3 then calculate based on ACTIVE XREF records to provide the current BVT based on XREFs, which may be different from scenario 1.

For more information regarding how to manage the state of base object or XREF records, refer to "State Management" on page 159.

External Match Jobs


External match jobs match “externally managed/prepared” records with an
existing base object, yielding the results based on the current match
settings—all without actually loading the data from the input table into the
base object, changing data in the base object in any way, or changing the
match table associated with the base object. You can use external matching to
pretest data, test match rules, and inspect the results before running the
actual Match job.

External Match jobs can process both fuzzy-match and exact-match rules, and
can be used with fuzzy-match and exact-match base objects. For more
information, see "Exact-match and Fuzzy-match Base Objects" on page 247.

Note: Exact-match columns that contain concatenated physical columns require a space at the end of each column. For example, "John" concatenated with "Smith" will only match "John Smith ".

The External Match job executes as a batch job only—there is no corresponding SIF request that external applications can invoke. For more information, see "Running External Match Jobs" on page 539.

Input and Output Tables Used for External Match Jobs

In addition to the base object and its associated match key table, the External
Match job uses the following input and output tables.

External Match Input (EMI) Table

Each base object has an External Match Input (EMI) table for External Match
jobs. This table uses the following naming pattern:

C_BaseObject_EMI

where BaseObject is the name of the base object associated with this External
Match job.

When you create a base object, the Schema Manager automatically creates
the associated EMI table, and automatically adds the following system
columns:

• SOURCE_KEY (VARCHAR, 50): Used as part of a three-column composite primary key to uniquely identify this record and to map to records in the C_BaseObject_EMO table.
• SOURCE_NAME (VARCHAR, 50): Used as part of a three-column composite primary key to uniquely identify this record and to map to records in the C_BaseObject_EMO table.
• FILE_NAME (VARCHAR, 50): Used as part of a three-column composite primary key to uniquely identify this record and to map to records in the C_BaseObject_EMO table.

When populating the EMI table (see "Populating the Input Table" on page 539),
at least one of these columns must contain data. Note that the column names
are non-restrictive—they can contain any identifying data, as long as the
composite three-column primary key is unique.

In addition, when you configure match rules for a particular column (for
example, Person_Name, Address_Part1, or Exact_Cust_ID), the Schema
Manager adds that column automatically to the C_BaseObject_EMI table.

You can view the columns of an external match table in the Schema Manager
by expanding the External Match Table node.

The records in the EMI table are analogous to the match batch used in Match
jobs. As described in "Flagging the Match Batch" on page 251, the match batch
contains the set of records that are matched against the rest of records in the
base object. The difference is that, for Match jobs, the match batch records
reside in the base object, while for External Match, these records reside in a
separate input table.

External Match Output (EMO) Table

Each base object has an External Match Output (EMO) table that contains the
output data for External Match jobs. This table uses the following naming
pattern:

C_BaseObject_EMO

where BaseObject is the name of the base object associated with this External
Match job.

Before the External Match job is executed, Informatica MDM Hub drops and
re-creates this table.

An EMO table contains the following columns:


• SOURCE_KEY (VARCHAR, 50): Used as part of a three-column composite primary key to uniquely identify this record. Maps back to the source record in the C_BaseObject_EMI table.
• SOURCE_NAME (VARCHAR, 50): Used as part of a three-column composite primary key to uniquely identify this record. Maps back to the source record in the C_BaseObject_EMI table.
• FILE_NAME (VARCHAR, 50): Used as part of a three-column composite primary key to uniquely identify this record. Maps back to the source record in the C_BaseObject_EMI table.
• ROWID_OBJECT_MATCHED (CHAR, 14, NOT NULL): ROWID_OBJECT of the record in the base object that matched the record in the EMI table.
• ROWID_MATCH_RULE (CHAR, 14): Identifies the match rule that was used to determine whether the two rows matched.
• AUTOMERGE_IND (NUMBER, 38, NOT NULL): Specifies whether a record qualifies for automatic consolidation during the match process. One of the following values:
  • Zero (0): Record does not qualify for automatic consolidation.
  • One (1): Record qualifies for automatic consolidation.
  • Two (2): Records are pending. For Build Match Group (BMG), do not build groups with PENDING records. PENDING records are to be left as individual matches.
  The Automerge and Autolink jobs process any records with an AUTOMERGE_IND of 1. For more information, see "Automerge Jobs" on page 534.
• CREATOR (VARCHAR2, 50): User or process responsible for creating the record.
• CREATE_DATE (DATE): Date on which the record was created.

Instead of populating the match table for the base object, the External Match job populates this EMO table with match pairs. Each row in the EMO represents a pair of matched records, one from the EMI table and one from the base object (see the query sketch after this list):
• The primary key (SOURCE_KEY + SOURCE_NAME + FILE_NAME) uniquely
identifies the record in the EMI table.
• ROWID_OBJECT_MATCHED uniquely identifies the record in the base
object.
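A minimal query sketch for reviewing these match pairs. The column names are the documented EMO columns above, while the base object name (and therefore the table name C_PARTY_EMO) is an illustrative assumption:

  -- All columns below are documented EMO columns; C_PARTY_EMO is an
  -- assumed table name for a hypothetical Party base object.
  SELECT source_key, source_name, file_name,
         rowid_object_matched, rowid_match_rule, automerge_ind
    FROM c_party_emo
   ORDER BY automerge_ind DESC;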

Populating the Input Table

Before running an External Match job, the EMI table must be populated with
records to match against the records in the base object. The process of
loading data into an EMI table is external to Informatica MDM Hub—you must
use a data loading tool that works with your database platform (such as
SQL*Loader).

Important: When you populate this table, you must supply data for at least
one of the system columns (SOURCE_KEY, SOURCE_NAME, and FILE_NAME) to
help link back from the _EMI table. In addition, the C_BaseObject_EMI table
must contain flat records—like the output of a JOIN, with unique source keys
and no foreign keys to other tables.
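For a small test data set, you can also populate the EMI table with plain SQL instead of a bulk loader. A minimal sketch, assuming a hypothetical Party base object (table C_PARTY_EMI) whose match rules use a Person_Name column, one of the example match columns mentioned above:

  -- C_PARTY_EMI and PERSON_NAME are illustrative names; SOURCE_KEY,
  -- SOURCE_NAME, and FILE_NAME are the documented system columns.
  INSERT INTO c_party_emi (source_key, source_name, file_name, person_name)
  VALUES ('CUST-0001', 'CRM', 'test_batch_01.dat', 'JOHN SMITH');
  COMMIT;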

Running External Match Jobs

To run an external match job for a base object:


1. Populate the data in the C_BaseObject_EMI table using a data loading
process that is external to Informatica MDM Hub. For requirements, see
"Populating the Input Table" on page 539.
2. In the Hub Console, start either of the following tools:
• Batch Viewer according to the instructions in "Starting the Batch
Viewer Tool" on page 501
• Batch Group according to the instructions in "Starting the Batch Group
Tool" on page 514
3. Select the External Match job for the base object.
4. Select the match rule set that you want to use for external match.
The default match rule set is automatically selected. For more
information, see "Configuring Match Rule Sets" on page 399.
5. Execute the External Match job according to the instructions in "Running
Batch Jobs Manually" on page 502 or "Executing Batch Groups Using the
Batch Group Tool" on page 522.
• The External Match job matches all records in the C_BaseObject_EMI
table against the records in the base object. There is no concept of a
consolidation indicator in the input or output tables.
• The Build Match Group is not run for the results.
6. Inspect the results in the C_BaseObject_EMO table using a data
management tool (external to Informatica MDM Hub).
7. If you want to save the results, make a backup copy of the data before
running the External Match job again.

Note: The C_BaseObject_EMO table is dropped and recreated after every
External Match Job execution.

Generate Match Tokens Jobs


The Generate Match Tokens job runs the tokenize process, which generates
match tokens and stores them in a match key table associated with the base
object so that they can be used subsequently by the match process to identify
candidates for matching. For an overview, see "Tokenize Process" on page
240. Generate Match Tokens jobs apply to fuzzy-match base objects only—not
to exact-match base objects—as described in "Exact-match and Fuzzy-match
Base Objects" on page 247.

The match process depends on the match tokens in the match key table being
current. If match tokens need to be updated (for example, if records have
been added or updated during the load process), the match process
automatically runs the tokenize process at the start of a match job (see
"Regenerating Match Tokens If Needed" on page 251). To expedite the match
process, it is recommended that you run the tokenize process separately—
before the match process—either by:
• manually executing the Generate Match Tokens job, or
• configuring the tokenize process to run automatically after the completion
of the load process (see "Generating Match Tokens (Optional)" on page
239)

Tokenize Process for State-Enabled Base Objects

For state-enabled base objects only, the tokenize process skips records that
are in the DELETED state. These records can be tokenized through the
Tokenize API, but will be ignored in batch processing. PENDING records can be
matched on a per-base object basis by setting the MATCH_PENDING_IND
(default off). For more information about how to manage the state of base
object or XREF records, see "Configuring State Management for Base Objects"
on page 162.

Regenerating All Match Tokens

Before you run a Generate Match Tokens job, you can use the Re-generate
All Match Tokens check box to specify the scope of match token generation.

Do one of the following:
• Check (select) this check box to have the Generate Match Tokens job
tokenize all records in the base object.
• Uncheck (clear) this check box to have the Generate Match Tokens job generate match tokens for only new or updated records in the base object (those whose DIRTY_IND=1, as described in "Base Object Records Flagged for Tokenization" on page 243); a query sketch for counting these records follows.
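To estimate how much work an incremental Generate Match Tokens run will do, you can count the flagged records directly. A minimal sketch, assuming an illustrative base object table named C_PARTY; the DIRTY_IND flag is the one cited above:

  -- DIRTY_IND = 1 marks records awaiting tokenization; C_PARTY is
  -- an assumed base object name.
  SELECT COUNT(*) AS records_to_tokenize
    FROM c_party
   WHERE dirty_ind = 1;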

After Generating Match Tokens

After the match tokens are generated, you can run the Match job for the base
object.

Hub Delete Jobs


Hub Delete jobs remove data from the Hub based on base object / XREFs input
to the cmxdm.hub_delete_batch stored procedure. You can use the Hub Delete
job to remove an entire source system from the Hub.

Note: Hub Delete jobs execute as a batch-only stored procedure. You cannot call a Hub Delete job from the Batch Viewer or Batch Group tools, and there is no corresponding SIF request that external applications can invoke. For more information, see "Hub Delete Jobs" on page 575.

Key Match Jobs


Used only with primary key match rules (see "About Primary Key Match Rules"
on page 434), Key Match jobs run the match process on records from two or
more source systems when those sources use the same primary key values.
Key Match jobs compare new records to each other and to existing records,
and then identify potential matches based on the comparison of source record
keys (as defined by the primary key match rules). For an overview, see
"Match Process" on page 245.

A Key Match job is automatically created after a primary key match rule for a
base object has been created or changed in the Schema Manager (Match /
Merge Setup configuration). For more information, see "Configuring Primary
Key Match Rules" on page 434.

Load Jobs
Load jobs move data from a staging table to the corresponding target base
object in the Hub Store. Load jobs also calculate trust values for base objects
with defined trusted columns, and they apply validation rules (if defined) to
determine the final trust values. For more information about loading data,
including trust, validation, and delta detection, see "Configuration Tasks for
Loading Data" on page 343.

Load Jobs and State-enabled Base Objects

For state-enabled base objects, the load batch process can load records in any
state. The state is specified as an input column on the staging table. The input
state can be specified in the mapping via a landing table column, or it can be
derived. If an input state is not specified in the mapping, then the state is
assumed to be ACTIVE. For more information regarding how to manage the
state of base object or XREF records, refer to "Configuring State Management
for Base Objects" on page 162.

The following table describes how input states affect the states of existing
XREFs.
Incoming XREF state ACTIVE:
• Existing ACTIVE XREF: Update
• Existing PENDING XREF: Update + Promote
• Existing DELETED XREF: Update + Restore
• No XREF (load by rowid): Insert
• No base object: Insert

Incoming XREF state PENDING:
• Existing ACTIVE XREF: Pending Update
• Existing PENDING XREF: Pending Update
• Existing DELETED XREF: Pending Update + Restore
• No XREF (load by rowid): Pending Update
• No base object: Pending Insert

Incoming XREF state DELETED:
• Existing ACTIVE XREF: Soft Delete
• Existing PENDING XREF: Hard Delete
• Existing DELETED XREF: Hard Delete
• No XREF (load by rowid): Error
• No base object: Error

Incoming XREF state undefined:
• Existing ACTIVE XREF: Treat as Active
• Existing PENDING XREF: Treat as Pending
• Existing DELETED XREF: Treat as Deleted
• No XREF (load by rowid): Treat as Active
• No base object: Treat as Active

Note: Records are rejected if the HUB_STATE_IND value is not valid.
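Because the state arrives in the HUB_STATE_IND column on the staging table, you can inspect the incoming states before running the Load job. A minimal sketch, assuming the staging table from the earlier example and the conventional indicator mapping (1 = ACTIVE, 0 = PENDING, -1 = DELETED), both of which you should verify against your installation:

  -- Staging table name and the 1/0/-1 state mapping are assumptions.
  SELECT hub_state_ind, COUNT(*) AS record_count
    FROM c_consumer_credit_stg
   GROUP BY hub_state_ind;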

The following table provides a matrix of how Informatica MDM Hub processes
records (for state-enabled base objects) during Load (and Put) for certain
operations based on the record state:

Update the XREF record when:
• Incoming ACTIVE, existing ACTIVE.
• Incoming DELETED, existing ACTIVE.
• Incoming PENDING, existing PENDING.
• Incoming ACTIVE, existing PENDING.
• Incoming DELETED, existing DELETED.
• Incoming PENDING, existing DELETED.
• Incoming DELETED (base object rowid delete): when a base object rowid delete record comes in, Informatica MDM Hub updates the base object and all XREF records (regardless of ROWID_SYSTEM) to the DELETED state.

Insert the XREF record when:
• Incoming PENDING, existing ACTIVE: the second record for the pair is created.
• Incoming ACTIVE, no existing record.
• Incoming PENDING, no existing record.

Delete the XREF record when:
• Incoming ACTIVE, existing PENDING (for paired records): the ACTIVE record in the pair is deleted, and the PENDING record is then updated. Paired records are two records with the same PKEY_SRC_OBJECT and ROWID_SYSTEM.
• Incoming DELETED, existing PENDING.

Informatica MDM Hub displays an error when:
• Incoming PENDING, existing ACTIVE (for paired records). Paired records are two records with the same PKEY_SRC_OBJECT and ROWID_SYSTEM.

Additional notes:
• If the incoming state is not specified (for a Load update), then the
incoming state is assumed to be the same as the current state. For
example if the incoming state is null and the existing state of the XREF or
base object to update is PENDING, then the incoming state is assumed to
be PENDING instead of null.
• Informatica MDM Hub deletes XREF records using the Hub Delete batch
job. The Hub Delete batch job removes specified data—up to and including
an entire source system—from Informatica MDM Hub based on your base
object/XREF input to the cmxdm.hub_delete_batch stored procedure. For
more information, see "Hub Delete Jobs" on page 575.

For more information regarding how to manage the state of base object or
XREF records, refer to "Configuring State Management for Base Objects" on
page 162.

Rules for Running Load Jobs

The following rules apply to Load jobs:

• Run a Load job only if the Stage job that loads the staging table used by
the Load job has completed successfully.
• Run the Load job for a parent table before you run the Load job for a child
table.
• If a lookup on the child object is not defined (the lookup table and column
were not populated), in order to successfully load data, you must repeat
the Stage job on the child object prior to running the Load job.
• Only one Load job at a time can be run for the same base object. Multiple
Load jobs for the same base object cannot be run concurrently.

Forcing Updates in Load Jobs

Before you run a Load job, you can use the Force Update check box to
configure how the Load job loads data from the staging table to the target
base object. By default, Informatica MDM Hub checks the Last Update Date for
each record in the staging table to ensure that it has not already loaded the
record. To override this behavior, check (select) the Force Update check
box, which ignores the Last Update Date, forces a refresh, and loads each
record regardless of whether it might have already been loaded from the
staging table. Use this approach prudently, however. Depending on the
volume of data to load, forcing updates can carry a price in processing time.

Generating Match Tokens During Load Jobs

When configuring the advanced properties of a base object in the Schema tool,
you can check (select) the Generate Match Tokens on Load check box to
generate match tokens during Load jobs, after the records have been loaded
into the base object. By default, this check box is unchecked (cleared), and
match tokens are generated during the Match process instead. For more
information, see "Editing Base Object Properties" on page 95 and "Run-time
Execution Flow of the Load Process" on page 231.

Load Job Metrics

After running a Load job, the Batch Viewer displays the following metrics (if
applicable) in the job execution log.

• Total records: Number of records processed by the Load job.
• Inserted: Number of records inserted by the Load job into the target object.
• Updated: Number of records updated by the Load job in the target object.
• No action: Number of records on which no action was taken (the records already existed in the base object).
• Updated XREF: Number of records that updated the cross-reference table for this base object. If you are loading a record during an incremental load, that record has already been consolidated (exists only in the XREF and not in the base object).
• Records tokenized: Number of records tokenized by the Load job. Applies only if the Generate Match Tokens on Load check box is selected in the Schema tool. For more information, see "Generating Match Tokens During Load Jobs" on page 544.
• Merge contributor XREF records: Number of updated cross-reference records that have been merged into other rowid_objects. Represents the difference between the total number of updated cross-reference records and the number of updated base object records.
• Missing Lookup / Invalid rowid_object records: Number of source records that were missing lookup information or had invalid rowid_object records.

Manual Link Jobs


For link-style base objects only, after the Match job has been run, data
stewards can use the Merge Manager to process records that have been
queued by a Match job for manual linking.

Manual Merge Jobs


After the Match job has been run, data stewards can use the Merge Manager to
process records that have been queued by a Match job for manual merge.
Manual Merge jobs are run in the Merge Manager—not in the Batch Viewer.
The Batch Viewer only allows you to inspect job execution logs for Manual
Merge jobs that were run in the Merge Manager.

Maximum Matches for Manual Consolidation

In the Schema Manager, you can configure the maximum number of matches
ready for manual consolidation to prevent data stewards from being
overwhelmed with thousands of manual merges for processing. Once this limit
is reached, the Match jobs and the Auto Match and Merge jobs will not run until
the number of matches has been reduced. For more information, see
"Maximum Matches for Manual Consolidation" on page 368.

Executing a Manual Merge Job in the Merge Manager

When you start a Manual Merge job, the Merge Manager displays a dialog with
a progress indicator. A manual merge can take some time to complete. If
problems occur during processing, an error message is displayed on
completion. This error also shows up in the job execution log for the Manual
Merge job in the Batch Viewer.

In the Merge Manager, the process dialog includes a button labeled Mark
process as incomplete that updates the status of the Manual Merge job but
does not abort the Manual Merge job. If you click this button, the merge
process continues in the background. At this point, there will be an entry in the
Batch Viewer for this process. When the process completes, the success or
failure is reported. For more information about the Merge Manager, see the
Informatica MDM Hub Data Steward Guide.

Manual Unlink Jobs


For link-style base objects only, after a Manual Link job has been run, data
stewards can use the Data Manager to manually unlink records that have been
manually linked.

Manual Unmerge Jobs


For merge-style base objects only, after a Manual Merge job has been run,
data stewards can use the Data Manager to manually unmerge records that
have been manually merged. Manual Unmerge jobs are run in the Data
Manager—not in the Batch Viewer. The Batch Viewer only allows you to
inspect job execution logs for Manual Unmerge jobs that were run in the Data
Manager. For more information about the Data Manager, see the Informatica
MDM Hub Data Steward Guide.

Executing a Manual Unmerge Job in the Data Manager

When you start a Manual Unmerge job, the Data Manager displays a dialog
with a progress indicator. A manual unmerge can take some time to complete,
especially when the record in question is the product of many constituent
records. If problems occur during processing, an error message is displayed on
completion. This error also shows up in the job execution log for the Manual
Unmerge job in the Batch Viewer.

In the Data Manager, the process dialog includes a button labeled Mark
process as incomplete that updates the status of the Manual Unmerge job
but does not abort the Manual Unmerge job. If you click this button, the
unmerge process continues in the background. At this point, there will be an
entry in the Batch Viewer for this process. When the process completes, the
success or failure is reported.

Match Jobs
A match job generates search keys for a base object, searches through the
data for match candidates (records that are possible matches), applies the
match rules to the match candidates, generates the matches, and then queues
the matches for either automatic or manual consolidation. For an introduction,
see "Match Process" on page 245.

When you create a new base object in an ORS, Informatica MDM Hub
automatically creates its Match job. Each Match job compares new or updated
records in a base object with all records in the base object. For a detailed
description, see "Run-Time Execution Flow of the Match Process" on page 251.

After running a Match job, the matched rows are flagged for automatic and
manual consolidation. Informatica MDM Hub creates jobs that automatically
consolidate the appropriate records (automerge or autolink). If a record is
flagged for manual consolidation (manual merge or manual link), data
stewards must use the Merge Manager to perform the manual consolidation.
For more information about manual consolidation, see the Informatica MDM
Hub Data Steward Guide. For more information about consolidation, see
"About the Consolidate Process" on page 255.

You configure Match jobs in the Match / Merge Setup node in the Schema
Manager. For more information, see "Configuration Tasks for the Match
Process" on page 363.

Important: Do not run a Match job on a base object that is used to define
relationships between records in inter-table or intra-table match paths. Doing
so will change the relationship data, resulting in the loss of the associations
between records. For more information, see "Relationship Base Objects" on
page 374.

Match Tables

When an Informatica MDM Hub Match job runs for a base object, it populates its
match table with pairs of matched records. Match tables are usually named
Base_Object_MTCH. For more information, see "Populating the Match Table
with Match Pairs" on page 252.

Match Jobs and State-enabled Base Objects

The following table describes the details of the match batch process behavior
given the incoming states for state-enabled base objects:
• ACTIVE source, ACTIVE target: The records are analyzed for matching.
• PENDING source, ACTIVE target: Whether PENDING records are ignored in Batch Match is a table-level parameter. If set, then batch match will include PENDING records for the specified base object, but the PENDING records can only be the source record in a match.
• DELETED source, any target state: DELETED records are ignored in Batch Match.
• Any source state, PENDING target: PENDING records cannot be the target of a match.

Note: For Build Match Group (BMG), do not build groups with PENDING
records. PENDING records are left as individual matches. PENDING matches
will have automerge_ind=2. For more information regarding how to manage
the state of base object or XREF records, refer to "Configuring State
Management for Base Objects" on page 162.

Auto Match and Merge Jobs

For merge-style base objects only, you can run the Auto Match and Merge job
for a base object. Auto Match and Merge batch jobs execute a continual cycle
of a Match job, followed by an Automerge job, until there are no more records
to match, or until the maximum number of records for manual consolidation
limit is reached (see "Maximum Matches for Manual Consolidation" on page
368). For more information, see "Auto Match and Merge Jobs" on page 532.

Match Stored Procedure

When executing the MATCH job stored procedure:

• CMXMA.MATCH runs just one batch.
• The Match job is dependent on the successful completion of all tokenization
jobs for the base object and any child tables used in intertable match. For
more information about the tokenization job, see "Generate Match Tokens
Jobs" on page 540. For more information about tokens for match, see
"About the Consolidate Process" on page 255.
• The Generate Match Tokens job need not be scheduled; Informatica MDM
Hub runs it automatically.

Setting Limits for Batch Jobs

The Match job for a base object does not attempt to match every record in the
base object against every other record in the base object. Instead, you specify
(in the Schema tool):
• how many records the job should match each time it runs. For more
information, see "Number of Rows per Match Job Batch Cycle" on page
368.
• how many matches are allowed for manual consolidation.
This feature helps to prevent data stewards from being overwhelmed with
manual merges for processing. Once this limit is reached, the Match job
will not run until the number of matches ready for manual consolidation
has been reduced. For more information, see "Maximum Matches for
Manual Consolidation" on page 368.

Selecting a Match Rule Set

For Match jobs, before executing the job, you can select the match rule set
that you want to use for evaluating matches.

The default match rule set for this base object is automatically selected. To
choose any other match rule set, click the drop-down list and select any other
match rule set that has been defined for this base object. For more
information, see "Configuring Match Rule Sets" on page 399.

Match Job Metrics

After running a Match job, the Batch Viewer displays the following metrics (if
applicable) in the job execution log:

• Matched records: Number of records that were matched by the Match job.
• Records tokenized: Number of records that were tokenized by the Match job.
• Queued for automerge: Number of records that were queued for automerge by the Match job. Use the Automerge job to process these records. For more information, see "Automerge Jobs" on page 534.
• Queued for manual merge: Number of records that were queued for manual merge by the Match job. Use the Merge Manager in the Hub Console to process these records. For more information, see the Informatica MDM Hub Data Steward Guide.

Match Analyze Jobs


Match Analyze jobs perform a search to gather metrics but do not conduct any
actual matching. If areas of data with the potential for huge match
requirements (hot spots) are discovered, Informatica MDM Hub moves these
records to an on-hold status to prevent overmatching. Records that are on
hold have a consolidation indicator of 9, which allows a data steward to review
the data manually in the Data Manager tool before proceeding with the match
and consolidation. Match Analyze jobs are typically used to tune match rules
or simply to determine whether data for a base object is overly “matchy” or
has large intersections of data (“hot spots”) that will result in overmatching.
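
For example, to see how many records are currently on hold for a given base object, you could count the records whose consolidation indicator is 9 (the base object name C_CUSTOMER is illustrative):

SELECT COUNT(*) FROM C_CUSTOMER WHERE CONSOLIDATION_IND = 9;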

Dependencies for Match Analyze Jobs

Each Match Analyze job is dependent on new / updated records in the base
object that have been tokenized and are thus queued for matching. For base
objects that have intertable match enabled, the Match Analyze job is also
dependent on the successful completion of the data tokenization jobs for all
child tables, which in turn is dependent on successful Load jobs for the child
tables.

Limiting the Number of On-Hold Records

You can limit the number of records that the Match Analyze job moves to the
on-hold status. By default, no limit is set. To configure a limit, edit the
cmxcleanse.properties file and add the following setting:
cmx.server.match.threshold_to_move_range_to_hold = n

where n is the maximum number of records that the Match Analyze job can
move to the on-hold status. For more information about the
cmxcleanse.properties file, see the Informatica MDM Hub Installation Guide.
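
For example, to allow the Match Analyze job to move at most 1000 records to the on-hold status, the entry in cmxcleanse.properties would look like this (the value 1000 is illustrative):

cmx.server.match.threshold_to_move_range_to_hold = 1000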

Match Analyze Job Metrics

After running a Match Analyze job, the Batch Viewer displays the following
metrics (if applicable) in the job execution log.
• Records tokenized: Number of records that were tokenized by the Match Analyze job.
• Records moved to hold status: Number of records that were moved to a “Hold” status (consolidation indicator = 9) to avert overmatching. These records typically represent a hot spot in the data and are not run through the match process. Data stewards can remove the hold status in the Data Manager.
• Records analyzed (to be matched): Number of records that were analyzed for matching.
• Match comparisons required: Number of actual matches that would be required to process this base object.

Metrics in Execution Log


• Records moved to Hold Status: Number of records moved to Hold.
• Records analyzed (to be matched): Number of records analyzed for match.
• Match comparisons required: Number of actual matches that would be required to process this base object.

Statistics
• Top 10 range count: Top ten number of records in a given search range.
• Top 10 range comparison count: Top ten number of match comparisons that will need to be performed for a given search range.
• Total records moved to hold: Count of the records moved to hold.
• Total matches moved to hold: Total number of matches required by the records that were moved to hold.
• Total ranges processed: Number of ranges required to process all the matches in the base object.
• Total candidates: Total number of match candidates required to process all matches for this base object.
• Time for analyze: Amount of time required to run the analysis.

Match for Duplicate Data Jobs
Match for Duplicate Data jobs search for exact duplicates to consider them
matched. The maximum number of exact duplicates is based on the base
object columns defined in the Duplicate Match Threshold property in the
Schema Manager for each base object. For more information, see "Duplicate
Match Threshold" on page 91 and "Matching for Duplicate Data" on page 249.

Note: The Match for Duplicate Data job does not display in the Batch Viewer
when the duplicate match threshold is set to 1 and non-equal matches are
enabled on the base object.

To match for duplicate data:


1. Execute the Match for Duplicate Data job right after the Load job is
finished.
2. Once the Match for Duplicate Data job is complete, run the Automerge job
to process the duplicates found by the Match for Duplicate Data job.
3. Once the Automerge job is complete, run the regular match and merge
process (Match job and then Automerge job, or the Auto Match and Merge
job).

Migrate Link Style To Merge Style Jobs


For link-style base objects only, this job migrates link-style base objects to
merge-style base objects.

Multi Merge Jobs


A Multi Merge job allows the merge of multiple records in a single job—
essentially incorporating the entire set of records to be merged as one batch.
This batch job is initiated only by external applications that invoke the SIF
MultiMergeRequest request. For more information, see the Informatica MDM
Hub Services Integration Framework Guide.

Promote Jobs
For state-enabled objects, the Promote job reads the PROMOTE_IND column
from an XREF table and changes the system state to ACTIVE for all rows
where the column’s value is 1. Informatica MDM Hub resets PROMOTE_IND
after the Promote job has run.

Note: The PROMOTE_IND column on a record is not changed to 0 during the
promote batch process if the record is not promoted.
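
For example, to see how many cross-reference records are currently flagged for promotion before running the job, you could run a count such as the following (the XREF table name C_CUSTOMER_XREF is illustrative):

SELECT COUNT(*) FROM C_CUSTOMER_XREF WHERE PROMOTE_IND = 1;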

Here are the behavior details for the Promote batch job:
• PENDING XREF, ACTIVE base object: Hub action on the XREF is Promote; action on the base object is Update; the BVT is refreshed; the resulting base object state is ACTIVE. Informatica MDM Hub promotes the pending XREF and recalculates the BVT to include the promoted XREF.
• PENDING XREF, PENDING base object: Hub action on the XREF is Promote; action on the base object is Promote; the BVT is refreshed; the resulting base object state is ACTIVE. Informatica MDM Hub promotes the pending XREF and base object. The BVT is then calculated based on the promoted XREF.
• DELETED XREF (this operation behaves the same way regardless of the state of the base object record): no action is taken on the XREF or the base object, the BVT is not refreshed, and the state of the resulting base object record is unchanged by this operation. Informatica MDM Hub ignores DELETED records in Batch Promote. This scenario can only happen if a record that had been flagged for promotion is deleted prior to running the Promote batch process.
• ACTIVE XREF (this operation behaves the same way regardless of the state of the base object record): no action is taken on the XREF or the base object, the BVT is not refreshed, and the state of the resulting base object record is unchanged by this operation. Informatica MDM Hub ignores ACTIVE records in Batch Promote. This scenario can only happen if a record that had been flagged for promotion is made ACTIVE prior to running the Promote batch process.

Note: Promote and delete operations will cascade to direct child records.

You can run the Promote job using the following methods:
• Using the Hub Console; for more information, see "Running Promote Jobs
Using the Hub Console" on page 553.
• Using the CMXSM.AUTO_PROMOTE stored procedure; for more
information, see "Promote Jobs" on page 592.
• Using the Services Integration Framework (SIF) API (and the associated
SiperianClient Javadoc); for more information, see the Informatica MDM
Hub Services Integration Framework Guide.

Running Promote Jobs Using the Hub Console

To run a Promote job:


1. In the Hub Console, start either of the following tools:

• Batch Viewer according to the instructions in "Starting the Batch
Viewer Tool" on page 501
• Batch Group according to the instructions in "Starting the Batch Group
Tool" on page 514
2. Select the Promote job for the desired base object.
3. Execute the Promote job according to the instructions in "Running Batch
Jobs Manually" on page 502 or "Executing Batch Groups Using the Batch
Group Tool" on page 522.
4. Display the results of the Promote job according to the instructions in
"Viewing Job Execution Logs" on page 506.

Informatica MDM Hub displays the results of the Promote job.

Promote Job Metrics

After running a Promote job, the Batch Viewer displays the following metrics
(if applicable) in the job execution log.
• Autopromoted records: Number of records that were promoted by the Promote job.
• Deleted XREF records: Number of XREF records that were deleted by the Promote job.
• Active records not promoted: Number of ACTIVE records that were not promoted.
• Protected records not promoted: Number of protected records that were not promoted.

Once the Promote job has run, you can view these statistics on the job
summary page in the Batch Viewer.

Recalculate BO Jobs
There are two versions of Recalculate BO:

• Using the ROWID_OBJECT_TABLE Parameter—Recalculates all base
objects identified by the ROWID_OBJECT column in the table/inline view
(note that brackets are required around an inline view).
• Without the ROWID_OBJECT_TABLE Parameter—Recalculates all
records in the base object, in batches of MATCH_BATCH_SIZE or 1/4 the
number of the records in the table, whichever is less.

For more information, see "Recalculate BO Jobs" on page 592.

Recalculate BVT Jobs


Recalculates the BVT for the specified ROWID_OBJECT.

For more information, see "Recalculate BVT Jobs" on page 593.

Reset Links Jobs


For link-style base objects only, allows you to remove links for an existing
base object.

Reset Match Table Jobs


The Reset Match Table job is created automatically after you run a Match job
if records have been updated to consolidation_ind = 2 and you then change
your match rules, as described in "Configuring Match Column Rules for Match
Rule Sets" on page 407.

If you change your match rules after matching, you are prompted to reset
your matches. When you reset matches, everything in the match table is
deleted. In addition, the Reset Match Table job resets the consolidation
indicator to 4 (consolidation_ind=4) wherever it is currently 2. For more
information, see "About the Consolidate Process" on page 255.

When you save changes to the schema match columns, a message box
prompts you to reset the existing matches.

Click Yes to reset the existing matches and create a Reset Match Table job in
the Batch Viewer.

Note: If you do not reset the existing matches, your next Match job will take
longer to execute because Informatica MDM Hub will need to regenerate the
match tokens before running the Match job.

Note: This job cannot be run from the Batch Viewer.

Revalidate Jobs
Revalidate jobs execute the validation logic/rules for records that have been
modified since the initial validation during the Load process. You can run
Revalidate if records change after the initial Load process's validation step.
If no records change, no records are updated. If some records have changed
and are caught by the existing validation rules, the metrics show the results.

Note: Revalidate jobs can be run only if validation is enabled on a column
after an initial load and prior to merge, on base objects that have validation
rules set up.

Revalidate is executed manually using the batch viewer for base objects. For
more information, see "Running Batch Jobs Using the Batch Viewer Tool" on
page 501.

Stage Jobs
Stage jobs move data from a landing table to a staging table, performing any
cleansing that has been configured in the Informatica MDM Hub mapping
between the tables (see "Mapping Columns Between Landing and Staging
Tables" on page 286). Stage jobs have parallel cleanse jobs that you can run
(see "About Data Cleansing in Informatica MDM Hub" on page 307). The stage
status indicates which Cleanse Match Server is hit during a stage. For more
information about staging data, see "Configuration Tasks for the Stage
Process" on page 274.

For state-enabled base objects, records are rejected if the HUB_STATE_IND
value is not valid. For more information regarding how to manage the state of
base object or XREF records, refer to "About State Management in Informatica
MDM Hub" on page 159.

Note: If the Stage job is grayed out, then the mapping has become invalid
due to changes in the staging table, in a column mapping, or in a cleanse
function. Open the specific mapping using the Mappings tool, verify it, and
then save it. For more information, see "Mapping Columns Between Landing
and Staging Tables" on page 286.

Stage Job Stored Procedure

When executing the Stage job stored procedure:


• Run the Stage job only if the ETL process responsible for loading the
landing table used by the Stage job completes successfully.
• Make sure that there are no dependencies between Stage jobs.

• You can run multiple Stage jobs simultaneously if there are multiple
Cleanse Match Servers set up to run the jobs.

For more information, see "Stage Jobs" on page 596.

Stage Job Metrics

After running a Stage job, the Batch Viewer displays the following metrics in
the job execution log.

• Total records: Number of records processed by the Stage job.
• Inserted: Number of records inserted by the Stage job into the target object.
• Rejected: Number of records rejected by the Stage job. For more information, see "Viewing Rejected Records" on page 510.

Synchronize Jobs
You must run the Synchronize job after any changes are made to the schema
trust settings. The Synchronize job is created when any changes are made to
the schema trust settings, as described in "Batch Jobs That Are Created When
Changes Occur" on page 500. For more information, see "Configuring Trust for
Source Systems" on page 344.

Reminder Prompt for Running Synchronize Jobs

When you save changes to schema column trust settings in the Systems and
Trust tool, a message box reminds you to run the synchronize process.

Clicking OK does not synchronize the column trust settings—this is just an
information box that tells you to run the Synchronize job.

Running Synchronize Jobs

To run the Synchronize job, navigate to the Batch Viewer, find the correct
Synchronize job for the base object, and run it. Informatica MDM Hub updates
the metadata for the base objects that have trust enabled after initial load has
occurred.

Considerations for Running Synchronize Jobs


• If you do not run the Synchronize job, you will not be able to run a Load
job.
• This job can be run from the Batch Viewer only when a trust update is
required for the base object. For more information, see "Running
Synchronize Batch Jobs After Changes to Trust Settings" on page 352.
• A Synchronize job fails if a large number of trust-enabled columns are
defined. The exact number of columns that cause the job to fail is variable
and is based on the length of the column names and the number of trust-
enabled columns. Long column names are at—or close to—the maximum
allowable length of 26 characters. To avoid this problem, keep the number
of trust-enabled columns below 48 and/or the length of the column names
short. A workaround is to enable all trust/validation columns before saving
the base object to avoid running the Synchronize job.

Chapter 18: Writing Custom Scripts
to Execute Batch Jobs

This chapter explains how to create custom scripts to execute batch jobs and
batch groups in an Informatica MDM Hub implementation. The information in
this chapter is intended for implementation teams and system administrators.
For information about how to configure and execute Informatica MDM Hub
batch jobs using the Batch Viewer and Batch Group tools in the Hub Console,
see "About Informatica MDM Hub Batch Jobs" on page 496.

Important: You must have the application server running for the duration of
a batch job.

Chapter Contents
• "About Executing Informatica MDM Hub Batch Jobs" on page 559
• "Setting Up Job Execution Scripts" on page 560
• "Monitoring Job Results and Statistics" on page 563
• "Stored Procedure Reference" on page 566
• "Executing Batch Groups Using Stored Procedures" on page 598
• "Developing Custom Stored Procedures for Batch Jobs" on page 604

About Executing Informatica MDM Hub Batch Jobs

An Informatica MDM Hub batch job is a program that, when executed,
completes a discrete unit of work (a process). All public batch jobs in
Informatica MDM Hub can be executed as database stored procedures. For
more information about batch jobs, see "Using Batch Jobs" on page 496.

In the Hub Console, the Informatica MDM Hub Batch Viewer and Batch Group
tools provide simple mechanisms for executing Informatica MDM Hub batch
jobs. However, they do not provide a means for executing and managing jobs
on a scheduled basis. To execute and manage jobs according to a schedule,
you need to execute stored procedures that do the work of batch jobs or batch
groups. Most organizations have job management tools that are used to
control IT processes. Any such tool capable of executing Oracle PL/SQL or
DB2 SQL commands can be used to schedule and manage Informatica MDM
Hub batch jobs.

Setting Up Job Execution Scripts
This section describes how to set up job execution scripts for running
Informatica MDM Hub stored procedures.

About Job Execution Scripts


Execution scripts enable you to run stored procedures on a scheduled basis to
execute and manage jobs.

Use job execution scripts to perform the following tasks:


• determine whether stored procedures can be run using job scheduling
tools; for more information, see "Determining Available Execution Scripts"
on page 563
• retrieve identifiers for scripts that execute stored procedures; for more
information, see "Retrieving Values from C_REPOS_TABLE_OBJECT_V at
Execution Time" on page 563
• determine which batch jobs are available to be executed using stored
procedures; for more information, see "Determining Available Execution
Scripts" on page 563
• schedule stored procedures to run synchronously or asynchronously; for
more information, see "Running Scripts Asynchronously" on page 563

Informatica MDM Hub provides information regarding stored procedures, such
as whether a stored procedure can be run using job scheduling tools, or how to
retrieve identifiers that execute stored procedures in the C_REPOS_TABLE_
OBJECT_V view.

About the C_REPOS_TABLE_OBJECT_V View


The C_REPOS_TABLE_OBJECT_V view contains metadata and identifiers for
the Informatica MDM Hub stored procedures.

Metadata in the C_REPOS_TABLE_OBJECT_V View

Informatica MDM Hub populates the C_REPOS_TABLE_OBJECT_V view with
metadata about its stored procedures. You use this metadata to:
• determine whether a stored procedure can be run using job scheduling
tools, as described in "Determining Available Execution Scripts" on page
563
• retrieve identifiers in the job execution scripts that execute Informatica
MDM Hub stored procedures, as described in "Retrieving Values from C_
REPOS_TABLE_OBJECT_V at Execution Time" on page 563

C_REPOS_TABLE_OBJECT_V has the following columns:

• ROWID_TABLE_OBJECT: Uniquely identifies a batch job.
• ROWID_TABLE: Depending on the type of batch job, this is the table identifier for either the table affected by the job (target table) or the table providing the data for the job (source table).
  • For Stage jobs, ROWID_TABLE refers to the target table (staging table).
  • For Load jobs, ROWID_TABLE refers to the source table (staging table).
  • For Match, Match Analyze, Autolink, Automerge, Auto Match and Merge, External Match, Generate Match Tokens, and Key Match jobs, ROWID_TABLE refers to the base object table, which is both source and target for the jobs.
• OBJECT_NAME: Description of the type of batch job. Examples include:
  • Stage jobs: CMX_CLEANSE.EXE.
  • Load jobs: CMXLD.LOAD_MASTER.
  • Match and Match Analyze jobs: CMXMA.MATCH.
• OBJECT_DESC: Description of the batch job, including the type of batch job as well as the object affected by the batch job. Examples include:
  • Stage for C_STG_CUSTOMER_CREDIT
  • Load from C_STG_CUSTOMER_CREDIT
  • Match and Merge for C_CUSTOMER
• OBJECT_TYPE_CODE: Together with OBJECT_FUNCTION_TYPE_CODE, this is a foreign key to C_REPOS_OBJ_FUNCTION_TYPE. An OBJECT_TYPE_CODE of “P” indicates a procedure that can potentially be executed by a scheduling tool.
• OBJECT_FUNCTION_TYPE_CODE: Indicates the actual procedure type (stage, load, match, and so on).
• PUBLIC_IND: Indicates whether the procedure can be displayed in the Batch Viewer.
• PARAMETER: Describes the parameter list for the procedure. Where specific ROWID_TABLE values are required for the procedure, these are shown in the parameter list. Otherwise, the name of the parameter is simply displayed in the parameter list. An exception to this is the parameter list for Stage jobs (where OBJECT_NAME = CMX_CLEANSE.EXE). In this case, the full parameter list is not shown. For a list of parameters, see "Stage Jobs" on page 596.
• VALID_IND: If VALID_IND is not equal to 1, do not execute the procedure. It means that some repository settings have changed that affect the procedure. This usually applies to changes that affect Stage jobs if the mappings have not been checked and saved again. For more information, see "Determining Available Execution Scripts" on page 563.

Identifiers in the C_REPOS_TABLE_OBJECT_V View

Use the following identifier values in C_REPOS_TABLE_OBJECT_V to execute
stored procedures. Every entry below has an OBJECT_TYPE_CODE of P; the
values in parentheses are the OBJECT_FUNCTION_TYPE_CODE and the
OBJECT_FUNCTION_TYPE_DESC.

• CMXUT.ACCEPT_NON_MATCH_UNIQUE (U; Accept Non-matched Records As Unique): Change the status of records that have undergone the match process but had no matching data.
• CMXMM.AUTOLINK (I; Autolink): Link data in BaseObjectName (Procedure).
• CMXMM.AUTOMERGE (G; Automerge): Merge data in BaseObjectName (Procedure).
• CMXMM.BUILD_BVT (V; BVT snapshot): Generate BVT snapshot for BaseObjectName.
• CMXMA.EXTERNAL_MATCH (E; External match): External Match for BaseObjectName.
• CMXMA.GENERATE_MATCH_TOKENS (N; Generate match tokens): Generate Match Tokens for BaseObjectName.
• CMXMA.KEY_MATCH (K; Key match): Key Match for BaseObjectName.
• CMXLD.LOAD_MASTER (L; Load): Load from Link BaseObjectName.
• CMXMM.MERGE (Y; Manual merge): Process records that have been queued by a Match job for manual merge.
• CMXMA.MATCH (Z; Match analyze): Match Analyze for BaseObjectName.
• CMXMA.MATCH (M; Match): Match for BaseObjectName.
• CMXMA.MATCH_AND_MERGE (B; Auto match and merge): Match and Merge for BaseObjectName.
• CMXMA.MATCH_FOR_DUPS (D; Match for duplicate data): Match for Duplicate Data for BaseObjectName.
• CMXMM.MLINK (O; Manual link): Manual Link for BaseObjectName.
• CMXMA.MIGRATE_LINK_STYLE_TO_MERGE_STYLE (J; Migrate link style to merge style): Migrate Link Style to Merge Style for BaseObjectName.
• CMXMM.MULTI_MERGE (P; Multi merge): Multi Merge for BaseObjectName.
• CMXSM.AUTO_PROMOTE (PR; Promote): Reads the PROMOTE_IND column from an XREF table and, for all rows where the column’s value is 1, changes the state to ACTIVE.
• CMXMM.MUNLINK (Q; Manual unlink): Manual Unlink for BaseObjectName.
• CMXMA.RESET_LINKS (W; Reset links): Reset Links for BaseObjectName.
• CMXMA.RESET_MATCH (R; Reset match table): Reset Match table for BaseObjectName.
• CMXUT.REVALIDATE_BO (H; Revalidate BO): Revalidate BaseObjectName.
• CMXCL.START_CLEANSE (C; Stage): Stage for TargetStagingTableName.
• CMXUT.SYNC (S; Synchronize): Synchronize after changes are made to the schema trust settings.
• CMXMM.UNMERGE (X; Manual unmerge): Unmerge for BaseObjectName.

Determining Available Execution Scripts


To determine which batch jobs are available to be executed using stored
procedures, run a query using the standard Informatica MDM Hub view called
C_REPOS_TABLE_OBJECT_V:
SELECT *
FROM C_REPOS_TABLE_OBJECT_V
WHERE PUBLIC_IND = 1;
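
To narrow the list to a particular job type, filter on the metadata columns described earlier. For example, this query (a sketch that uses the function type code M for Match from the identifiers list above) returns only valid, public Match procedures:

SELECT OBJECT_NAME, OBJECT_DESC
FROM C_REPOS_TABLE_OBJECT_V
WHERE PUBLIC_IND = 1
AND VALID_IND = 1
AND OBJECT_FUNCTION_TYPE_CODE = 'M';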

Retrieving Values from C_REPOS_TABLE_OBJECT_V at Execution Time

Use SQL statements to retrieve values from C_REPOS_TABLE_OBJECT_V when
executing scripts at run time. The following example code retrieves the
STG_ROWID_TABLE and ROWID_TABLE_OBJECT for cleanse jobs.
SELECT A.ROWID_TABLE, A.ROWID_TABLE_OBJECT INTO IN_STG_ROWID_TABLE, IN_
ROWID_TABLE_OBJECT
FROM C_REPOS_TABLE_OBJECT_V A, C_REPOS_TABLE B
WHERE A.OBJECT_NAME = 'CMX_CLEANSE.EXE'
AND B.ROWID_TABLE = A.ROWID_TABLE
AND B.TABLE_NAME = 'C_HMO_ADDRESS'
AND A.VALID_IND = 1;

Running Scripts Asynchronously


By default, the execution scripts run synchronously (IN_RUN_SYNCH = ‘TRUE’
or IN_RUN_SYNCH = NULL). To run the execution scripts asynchronously,
specify IN_RUN_SYNCH = ‘FALSE’. Note that these Boolean values are case-
sensitive and must be specified in upper-case characters.

Monitoring Job Results and Statistics


This section describes how to monitor the results and view the associated
statistics of batch jobs run in job execution scripts.

Error Messages and Return Codes


Informatica MDM Hub stored procedures return an error message and return
code.

• OUT_ERROR_MSG: Error message if an error occurred.
• OUT_RETURN_CODE: Return code; zero (0) if no errors occurred, or one (1) if an error occurred.

Error handling code in job execution scripts can look for return codes and trap
any associated error messages.

Error Handling and Transaction Management


The stored procedures are transaction-enabled and can be rolled back if an
error occurs during execution. After you invoke a stored procedure, check the
return code (OUT_RETURN_CODE) in your error handling, as shown in the
sketch after this list:
• If any failure occurred during execution (OUT_RETURN_CODE <> 0),
immediately roll back any changes. Wait until after you have successfully
rolled back the changes before you invoke the stored procedure again.
• If no failure occurred during execution (OUT_RETURN_CODE = 0), commit
any changes.
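
A minimal sketch of this pattern, using the CMXMM.AUTOMERGE procedure documented later in this chapter (the table rowid and user name are placeholder values):

DECLARE
 OUT_ERROR_MESSAGE VARCHAR2(1024);
 OUT_RETURN_CODE NUMBER;
BEGIN
 -- Invoke the batch job stored procedure
 CMXMM.AUTOMERGE( 'SVR1.188', 'ADMIN', OUT_ERROR_MESSAGE, OUT_RETURN_CODE );
 IF OUT_RETURN_CODE = 0 THEN
 COMMIT; -- no failure occurred: keep the changes
 ELSE
 ROLLBACK; -- a failure occurred: undo changes before invoking the procedure again
 DBMS_OUTPUT.PUT_LINE( 'ERROR: ' || OUT_ERROR_MESSAGE );
 END IF;
END;
/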

Housekeeping for Temporary Tables


Informatica MDM Hub stored procedures, when invoked directly, generally
clean up any internal temporary files created during execution. However:
• Certain stored procedures have an OUT_TMP_TABLE_LIST return
parameter, which consists of temporary tables that contain data that could
be useful for debugging purposes. For such stored procedures, if OUT_
RETURN_CODE=0 is returned, pass the returned OUT_TMP_TABLE_LIST
parameter to the CMXUT.DROP_TEMP_TABLES stored procedure to clean
up the temporary tables that were returned in the parameter.
IF rc = 0 THEN
 COMMIT;
 cmxut.drop_table_in_list( out_tmp_table_list, out_error_message, rc
);
END IF;

• Certain stored procedures will also register the temporary tables that
remain so that a server-side process can periodically remove them.

Job Execution Status


Informatica MDM Hub stored procedures log their job execution status and
statistics in the Informatica MDM Hub repository. The following table describes
the repository tables that are used for monitoring job results and statistics.

Repository Tables Used for Monitoring Job Results and Statistics


• C_REPOS_JOB_CONTROL: As soon as a job starts to run, it registers itself in C_REPOS_JOB_CONTROL with a RUN_STATUS of 2 (Running/Processing). Once the job completes, its status is updated to one of the following values:
  • 0 (Completed Successfully)—Completed without any errors or warnings.
  • 1 (Completed with Errors)—Completed, but with some warnings or data rejections. See the RETURN_CODE for any error code and the STATUS_MESSAGE for a description of the error/warning.
  • 2 (Running/Processing)
  • 3 (Failed)—Job did not complete. Corrective action must be taken and the job must be run again. See the RETURN_CODE for any error code and the STATUS_MESSAGE for the reason for failure.
  • 4 (Incomplete)—The job failed before updating its job status and has been manually marked as incomplete. Corrective action must be taken and the job must be run again. RETURN_CODE and STATUS_MESSAGE will not provide any useful information. A job is marked as incomplete by clicking the Set Status to Incomplete button in the Batch Viewer.
• C_REPOS_JOB_METRIC: When a batch job has completed, it registers its statistics in C_REPOS_JOB_METRIC. There can be multiple statistics for each job. Join to C_REPOS_JOB_METRIC_TYPE to get a description for each statistic. For additional information, see "About Batch Job Metrics" on page 508.
• C_REPOS_JOB_METRIC_TYPE: Stores the descriptions of the types of metrics that can be registered in C_REPOS_JOB_METRIC.
• C_REPOS_JOB_STATUS_TYPE: Stores the descriptions of the RUN_STATUS values that can be registered in C_REPOS_JOB_CONTROL.
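
For example, a monitoring script might poll C_REPOS_JOB_CONTROL for jobs that completed with errors or failed, using the columns described above (a sketch; your repository may contain additional columns worth selecting):

SELECT RUN_STATUS, RETURN_CODE, STATUS_MESSAGE
FROM C_REPOS_JOB_CONTROL
WHERE RUN_STATUS IN (1, 3);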

Stored Procedure Reference


This section provides a reference for the stored procedures that represent
Informatica MDM Hub batch jobs. Informatica MDM Hub provides these stored
procedures, in compiled form, for each Operational Reference Store (ORS),
for Oracle databases. You can use any job scheduling software (such as Tivoli,
CA Unicenter, and so on) to execute these stored procedures.

Note: All the input parameters that need a delimited list require a trailing “~”
character. For example, a two-item list would be passed as 'ITEM1~ITEM2~'
(the item names are illustrative), including the trailing tilde.

Alphabetical List of Batch Jobs


• "Accept Non-matched Records As Unique" on page 568: For records that have undergone the match process but had no matching data, sets the consolidation indicator to 1 (consolidated), meaning that the record was unique and did not require consolidation.
• "Autolink Jobs" on page 570: Automatically links records that have qualified for autolinking during the match process and are flagged for autolinking (Autolink_ind=1). Used with link-style base objects only.
• "Auto Match and Merge Jobs" on page 570: Executes a continual cycle of a Match job, followed by an Automerge job, until there are no more records to match, or until the size of the manual merge queue exceeds the configured threshold. Used with merge-style base objects only.
• "Automerge Jobs" on page 571: Automatically merges records that have qualified for automerging during the match process and are flagged for automerging (Automerge_ind=1). Used with merge-style base objects only.
• "BVT Snapshot Jobs" on page 572: Generates a snapshot of the best version of the truth (BVT) for a base object. Used with link-style base objects only.
• "Execute Batch Group Jobs" on page 572: Constructs an XML message and sends it to the MRM Server SIF API (ExecuteBatchGroupRequest), which performs the operation. For more information, see "Stored Procedures for Batch Groups" on page 599.
• "External Match Jobs" on page 572: Matches “externally managed/prepared” records with an existing base object, yielding the results based on the current match settings—all without actually modifying the data in the base object.
• "Generate Match Token Jobs" on page 573: Prepares data for matching by generating match tokens according to the current match settings. Match tokens are strings that encode the columns used to identify candidates for matching.
• "Get Batch Group Status Jobs" on page 575: Returns the status of a batch group. For more information, see "Stored Procedures for Batch Groups" on page 599.
• "Hub Delete Jobs" on page 575: Deletes data from the Hub based on base object / XREF level input.
• "Key Match Jobs" on page 579: Matches records from two or more sources when these sources use the same primary key. Compares new records to each other and to existing records, and identifies potential matches based on the comparison of source record keys as defined by the match rules.
• "Load Jobs" on page 580: Copies records from a staging table to the corresponding target base object in the Hub Store. During the load process, it also applies the current trust and validation rules to the records.
• "Manual Link Jobs" on page 581: Shows logs for records that have been manually linked in the Merge Manager tool. Used with link-style base objects only.
• "Manual Merge Jobs" on page 581: Shows logs for records that have been manually merged in the Merge Manager tool. Used with merge-style base objects only.
• "Manual Unlink Jobs" on page 582: Shows logs for records that have been manually unlinked in the Data Manager tool. Used with link-style base objects only.
• "Manual Unmerge Jobs" on page 583: Shows logs for records that have been manually unmerged in the Merge Manager tool. Used with merge-style base objects only.
• "Match Jobs" on page 586: Finds duplicate records in the base object, based on the current match rules.
• "Match Analyze Jobs" on page 588: Conducts a search to gather match statistics but does not actually perform the match process. If areas of data with the potential for huge match requirements are discovered, Informatica MDM Hub moves the records to a hold status, which allows a data steward to review the data manually before proceeding with the match process.
• "Match for Duplicate Data Jobs" on page 589: For data with a high percentage of duplicate records, compares new records to each other and to existing records, and identifies exact duplicates. The maximum number of exact duplicates is based on the Duplicate Match Threshold setting for this base object. Note: The Match for Duplicate Data batch job has been deprecated.
• "Promote Jobs" on page 592: Reads the PROMOTE_IND column from an XREF table and changes the state to ACTIVE on all rows where the column’s value is 1.
• "Recalculate BO Jobs" on page 592: Recalculates all base objects identified by the ROWID_OBJECT column in the table/inline view if you include the ROWID_OBJECT_TABLE parameter. If you do not include the parameter, this batch job recalculates all records in the base object, in batches of MATCH_BATCH_SIZE or 1/4 the number of the records in the table, whichever is less.
• "Recalculate BVT Jobs" on page 593: Recalculates the BVT for the specified ROWID_OBJECT.
• "Reset Batch Group Status Jobs" on page 594: Resets a batch group. For more information, see "Stored Procedures for Batch Groups" on page 599.
• "Reset Links Jobs" on page 594: Updates the records in the _LINK table to account for changes in the data. Used with link-style base objects only.
• "Reset Match Table Jobs" on page 594: Shows logs of the operation where all matched records have been reset to be queued for match.
• "Revalidate Jobs" on page 595: Executes the validation logic/rules for records that have been modified since the initial validation during the Load process. You can run Revalidate if records change after the initial Load process's validation step. If no records change, no records are updated. If some records have changed and are caught by the existing validation rules, the metrics show the results.
• "Stage Jobs" on page 596: Copies records from a landing table into a staging table. During execution, cleanses the data according to the current cleanse settings.
• "Synchronize Jobs" on page 597: Updates metadata for base objects. Used after a base object has been loaded but not yet merged, and subsequent trust configuration changes (such as enabling trust) have been made to columns in that base object. This job must be run before merging data for this base object.

Accept Non-matched Records As Unique


Accept Non-matched Records As Unique jobs change the status of records that
have undergone the match process but had no matching data. This job sets the
consolidation indicator to 1, meaning that the record is consolidated or (in this
case) did not require consolidation. The Automerge job adheres to this setting
and treats these as unique records.

The Accept Non-matched Records As Unique job is created:


• only if the base object has Accept All Unmatched Rows as Unique
enabled (set to Yes) in the Match / Merge Setup configuration. For more
information, see "Accept All Unmatched Rows as Unique" on page 369.
• only after a merge job is run, as described in "Batch Jobs That Are Created
When Changes Occur" on page 500.

Note: This job cannot be executed from the Batch Viewer.

Stored Procedure Definition for Accept Non-matched Records As Unique Jobs
PROCEDURE CMXUT.ACCEPT_NON_MATCH_UNIQUE (
IN_ROWID_TABLE IN CHAR(14)
,IN_ROWID_USER IN CHAR(14)
,IN_ASSIGNMENT_IND INT
,OUT_ACCEPT_UNIQUE_CNT OUT INT
,OUT_ERROR_MSG OUT VARCHAR2(1024)
,RC OUT INT
)

Sample Job Execution Script for Accept Non-matched Records As Unique
-- ACCEPT RECORDS ASSIGNED TO ALL USERS
DECLARE
V_ROWID_TABLE CHAR( 14 );
OUT_ACCEPT_UNIQUE_CNT INTEGER;
OUT_ERROR_MESSAGE VARCHAR2( 1024 );
OUT_RETURN_CODE INTEGER;
BEGIN
SELECT ROWID_TABLE
INTO V_ROWID_TABLE
FROM C_REPOS_TABLE
WHERE TABLE_NAME = 'C_CUSTOMER';

CMXUT.ACCEPT_NON_MATCH_UNIQUE( V_ROWID_TABLE, NULL, 0,
OUT_ACCEPT_UNIQUE_CNT, OUT_ERROR_MESSAGE, OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( 'NUMBER OF RECORDS ACCEPTED AS UNIQUE: ' ||
OUT_ACCEPT_UNIQUE_CNT );
DBMS_OUTPUT.PUT_LINE( 'RETURN MESSAGE: ' || SUBSTR( OUT_ERROR_MESSAGE,
1, 255 ));
DBMS_OUTPUT.PUT_LINE( 'RETURN CODE: ' || OUT_RETURN_CODE );
END;
/
-- ACCEPT ONLY RECORDS ASSIGNED TO SPECIFIC USER
DECLARE
V_ROWID_TABLE CHAR( 14 );
V_ROWID_USER CHAR( 14 );
OUT_ACCEPT_UNIQUE_CNT INTEGER;
OUT_ERROR_MESSAGE VARCHAR2( 1024 );
OUT_RETURN_CODE INTEGER;
BEGIN
SELECT ROWID_TABLE
INTO V_ROWID_TABLE
FROM C_REPOS_TABLE
WHERE TABLE_NAME = 'C_CUSTOMER';

SELECT ROWID_USER
INTO V_ROWID_USER
FROM C_REPOS_USER
WHERE USER_NAME = 'ADMIN';

CMXUT.ACCEPT_NON_MATCH_UNIQUE( V_ROWID_TABLE, V_ROWID_USER, 1,
OUT_ACCEPT_UNIQUE_CNT, OUT_ERROR_MESSAGE, OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( 'NUMBER OF RECORDS ACCEPTED AS UNIQUE: ' ||
OUT_ACCEPT_UNIQUE_CNT );
DBMS_OUTPUT.PUT_LINE( 'RETURN MESSAGE: ' || SUBSTR( OUT_ERROR_MESSAGE,
1, 255 ));
DBMS_OUTPUT.PUT_LINE( 'RETURN CODE: ' || OUT_RETURN_CODE );
COMMIT;
END;
/

Autolink Jobs
Autolink jobs automatically link records that have qualified for autolinking
during the match process and are flagged for autolinking (Autolink_ind = 1).

Auto Match and Merge Jobs


Auto Match and Merge batch jobs execute a continual cycle of a Match job,
followed by an Automerge job, until there are no more records to match, or
until the size of the manual merge queue exceeds the configured threshold.
Auto Match and Merge jobs are used with merge-style base objects only. For
more information, see "Auto Match and Merge Jobs" on page 532.

Important: Do not run an Auto Match and Merge job on a base object that is
used to define relationships between records in inter-table or intra-table
match paths. Doing so will change the relationship data, resulting in the loss of
the associations between records. For more information, see "Relationship
Base Objects" on page 374.

Identifiers for Executing Auto Match and Merge Jobs

To learn about the identifiers used to execute the stored procedure associated
with this batch job, see "Identifiers in the C_REPOS_TABLE_OBJECT_V View"
on page 562.

Dependencies for Auto Match and Merge Jobs

The Auto Match and Merge jobs for a target base object can either be run on
successful completion of each Load job, or on successful completion of all
Load jobs for the object.

Successful Completion of Auto Match and Merge Jobs

Auto Match and Merge jobs must complete with a RUN_STATUS of 0
(Completed Successfully) or 1 (Completed with Errors) to be considered
successful.

Stored Procedure Definition for Auto Match and Merge Jobs


PROCEDURE CMXMA.MATCH_AND_MERGE (
IN_ROWID_TABLE IN CHAR(14) --Rowid of a table
,IN_USER_NAME IN VARCHAR2(50) --User name
,IN_MATCH_SET_NAME IN VARCHAR2(500) DEFAULT NULL
,OUT_ERROR_MSG OUT VARCHAR2(1024) --Error message, if any
,RC OUT INT
,IN_JOB_GRP_CTRL IN CHAR(14) DEFAULT NULL
,IN_JOB_GRP_ITEM IN CHAR(14) DEFAULT NULL
)

Sample Job Execution Script for Auto Match and Merge Jobs
DECLARE
IN_ROWID_TABLE CHAR(14);
IN_USER_NAME VARCHAR2(50);
IN_MATCH_SET_NAME VARCHAR(500);
OUT_ERROR_MSG VARCHAR2(1024);
OUT_RETURN_CODE NUMBER;
BEGIN
IN_ROWID_TABLE := 'SVR1.188';
IN_USER_NAME := 'CMX_ORS';
IN_MATCH_SET_NAME := 'MRS2';
OUT_ERROR_MSG := NULL;
OUT_RETURN_CODE := NULL;
CMXMA.MATCH_AND_MERGE ( IN_ROWID_TABLE, IN_USER_NAME,
IN_MATCH_SET_NAME, OUT_ERROR_MSG, OUT_RETURN_CODE );
DBMS_OUTPUT.Put_Line('OUT_ERROR_MSG = ' || OUT_ERROR_MSG);
DBMS_OUTPUT.Put_Line('RC = ' || TO_CHAR(OUT_RETURN_CODE));
COMMIT;
END;

Automerge Jobs
Automerge jobs automatically merge records that have qualified for
automerging during the match process and are flagged for automerging
(Automerge_ind = 1). Automerge jobs are used with merge-style base objects
only. For more information, see "Automerge Jobs" on page 534.

Identifiers for Executing Automerge Jobs

To learn about the identifiers used to execute the stored procedure associated
with this batch job, see "Identifiers in the C_REPOS_TABLE_OBJECT_V View"
on page 562.

Dependencies for Automerge Jobs

Each Automerge job is dependent on the successful completion of the match
process and the queuing of records for automerge.

Successful Completion of Automerge Jobs

Automerge jobs must complete with a RUN_STATUS of 0 (Completed
Successfully) or 1 (Completed with Errors) to be considered successful.

Stored Procedure Definition for Automerge Jobs


PROCEDURE CMXMM.AUTOMERGE (
IN_ROWID_TABLE IN CHAR(14) --Rowid of a table
,IN_USER_NAME IN VARCHAR2(50) --User name
,OUT_ERROR_MESSAGE OUT VARCHAR2(1024) --Error message, if any
,OUT_RETURN_CODE OUT NUMBER --Return code (if no errors, 0 is returned)
)

Sample Job Execution Script for Automerge Jobs


DECLARE
 IN_ROWID_TABLE CHAR(14);
 IN_USER_NAME VARCHAR2(50);
 OUT_ERROR_MESSAGE VARCHAR2(1024);
 OUT_RETURN_CODE NUMBER;

BEGIN
 IN_ROWID_TABLE := NULL;
 IN_USER_NAME := NULL;
 OUT_ERROR_MESSAGE := NULL;
 OUT_RETURN_CODE := NULL;

CMXMM.AUTOMERGE ( IN_ROWID_TABLE, IN_USER_NAME, OUT_ERROR_MESSAGE,
OUT_RETURN_CODE );
DBMS_OUTPUT.Put_Line('OUT_ERROR_MESSAGE = ' || OUT_ERROR_MESSAGE);
DBMS_OUTPUT.Put_Line('OUT_RETURN_CODE = ' || TO_CHAR(OUT_RETURN_CODE));
COMMIT;
END;

BVT Snapshot Jobs


The BVT Snapshot stored procedure generates a snapshot of the best version
of the truth (BVT) for a base object.

Execute Batch Group Jobs


Execute Batch Group jobs (CMXBG.EXECUTE_BATCHGROUP) execute a batch
group. Note that there are two other related batch group stored procedures:
• Reset Batch Group Jobs (CMXBG.RESET_BATCHGROUP)
• Get Batch Group Status Jobs (CMXBG.GET_BATCHGROUP_STATUS)

For more information, see "Stored Procedures for Batch Groups" on page 599.

External Match Jobs


Matches “externally managed/prepared” records with an existing base object,
yielding the results based on the current match settings—all without actually
loading the data from the input table into the base object, changing data in the
base object in any way, or changing the match table associated with the base
object. You can use external matching to pretest data, test match rules, and
inspect the results before running the actual Match job. For more information,
see "External Match Jobs" on page 535.

Note: The External Batch job executes as a batch job only—there is no
corresponding SIF request that external applications can invoke.

Stored Procedure Definition for External Match Jobs


PROCEDURE CMXMA.EXTERNAL_MATCH(
IN_ROWID_TABLE IN CHAR(14)
, IN_USER_NAME IN VARCHAR2(50)
, IN_MATCH_SET_NAME IN VARCHAR2(500) DEFAULT NULL
, OUT_ERROR_MSG OUT VARCHAR2(1024)
, RC OUT INT
, IN_JOB_GRP_CTRL IN CHAR(14) DEFAULT NULL
, IN_JOB_GRP_ITEM IN CHAR(14) DEFAULT NULL
)

Sample Job Execution Script for External Match Jobs


DECLARE
IN_ROWID_TABLE CHAR(14);
IN_USER_NAME VARCHAR2(50);
IN_MATCH_SET_NAME VARCHAR2(200);
OUT_ERROR_MSG VARCHAR2(1024);
RC NUMBER;
IN_JOB_GRP_CTRL CHAR(14);
IN_JOB_GRP_ITEM CHAR(14);
BEGIN
IN_ROWID_TABLE := NULL;
IN_USER_NAME := NULL;
IN_MATCH_SET_NAME := NULL;
OUT_ERROR_MSG := NULL;
RC := NULL;
IN_JOB_GRP_CTRL := NULL;
IN_JOB_GRP_ITEM := NULL;

CMXMA.EXTERNAL_MATCH ( IN_ROWID_TABLE, IN_USER_NAME,
IN_MATCH_SET_NAME, OUT_ERROR_MSG, RC, IN_JOB_GRP_CTRL,
IN_JOB_GRP_ITEM );
DBMS_OUTPUT.Put_Line('OUT_ERROR_MSG = ' || OUT_ERROR_MSG);
DBMS_OUTPUT.Put_Line('RC = ' || TO_CHAR(RC));
COMMIT;
END;

Generate Match Token Jobs


The Generate Match Tokens job runs the tokenize process, which generates
match tokens and stores them in a match key table associated with the base
object so that they can be used subsequently by the match process to identify
candidates for matching. For an overview, see "Tokenize Process" on page
240.

You should run Generate Match Tokens jobs whenever match tokens need to
be regenerated, as described in "When to Generate Match Tokens" on page
242. Generate Match Tokens jobs apply to fuzzy-match base objects only—not
to exact-match base objects—as described in "Exact-match and Fuzzy-match

- 573 -
Base Objects" on page 247. For more information, see "Generate Match
Tokens Jobs" on page 540.

Note: The Generate Match Tokens job generates the match tokens for the
entire base object (when IN_FULL_RESTRIP_IND is set to 1). Check (select)
the Re-generate All Match Tokens check box in the Batch Viewer to populate
the IN_FULL_RESTRIP_IND parameter.

Identifiers for Executing Generate Match Token Jobs

To learn about the identifiers used to execute the stored procedure associated
with this batch job, see "Identifiers in the C_REPOS_TABLE_OBJECT_V View"
on page 562.

Dependencies for Generate Match Token Jobs

Each Generate Match Tokens job is dependent on the successful completion of


the Load job responsible for loading data into the base object.

Successful Completion of Generate Match Token Jobs

Generate Match Tokens jobs must complete with a RUN_STATUS of 0


(Completed Successfully).

Stored Procedure Definition for Generate Match Token Jobs


PROCEDURE CMXMA.GENERATE_MATCH_TOKENS (
IN_ROWID_TABLE IN CHAR(14) --Rowid of a table
,IN_USER_NAME IN VARCHAR2(50) --User name
,OUT_ERROR_MSG OUT VARCHAR2(1024) --Error message, if any
,OUT_RETURN_CODE OUT NUMBER --Return code (if no errors, 0 is returned)
,IN_JOB_GRP_CTRL IN CHAR(14) DEFAULT NULL
,IN_JOB_GRP_ITEM IN CHAR(14) DEFAULT NULL
,IN_FULL_RESTRIP_IND IN NUMBER --Default 0, retokenize entire table if set
to 1 (strip_truncate_insert)
)

Sample Job Execution Script for Generate Match Token Jobs


DECLARE
 IN_ROWID_TABLE CHAR(14);
 IN_USER_NAME VARCHAR2(50);
 OUT_ERROR_MSG VARCHAR2(1024);
 OUT_RETURN_CODE NUMBER;
 IN_FULL_RESTRIP_IND NUMBER;
BEGIN
 IN_ROWID_TABLE := NULL;
 IN_USER_NAME := NULL;
 OUT_ERROR_MSG := NULL;
 OUT_RETURN_CODE := NULL;
 IN_FULL_RESTRIP_IND := NULL;

CMXMA.GENERATE_MATCH_TOKENS ( IN_ROWID_TABLE, IN_USER_NAME,
OUT_ERROR_MSG, OUT_RETURN_CODE,
IN_FULL_RESTRIP_IND => IN_FULL_RESTRIP_IND );
DBMS_OUTPUT.Put_Line('OUT_ERROR_MSG = ' || OUT_ERROR_MSG);
DBMS_OUTPUT.Put_Line('OUT_RETURN_CODE = ' || TO_CHAR(OUT_RETURN_CODE));
COMMIT;
END;

Get Batch Group Status Jobs


Get Batch Group Status jobs return the status of a batch group. Note that
there are two other related batch group stored procedures:
• Execute Batch Group Jobs (CMXBG.EXECUTE_BATCHGROUP)
• Reset Batch Group Jobs (CMXBG.RESET_BATCHGROUP)

For more information, see "Stored Procedures for Batch Groups" on page 599.

Hub Delete Jobs


The Hub Delete job removes specified data—up to and including an entire
source system—from Informatica MDM Hub based on your base object / XREF
input to the CMXDM.HUB_DELETE_BATCH stored procedure.

Although the Hub Delete job deletes the XREF record, a pointer to the deleted
record (actually to the parent base object of this XREF) could potentially be
present on the _HMXR table (on column ORIG_TGT_ROWID_OBJECT). The
Match Tree tool displays REMOVED (ID#: xxxx) for the removed record(s).

Important:
• The Hub Delete batch job will not delete the data if there are records
queued for an Automerge job.
• Do not run a Hub Delete job when there are automerge records in the
match table. Run the Hub Delete job after the automerge matches are
processed.

Cascade Delete

The Hub Delete job performs a cascade delete if you set the parameter IN_
ALLOW_CASCADE_DELETE_IND=1 for a base object in the stored procedure.
With cascade delete, when records in the parent object are deleted, Hub
Delete also removes the affected records in the child base object. Hub Delete
checks each child base object table for related data that should be deleted
given the removal of the parent base object record.

Important: With cascade delete enabled, the Hub Delete job may potentially
delete XREF records from other source systems. To ensure that Hub Delete does
not delete XREF records from other systems, do not use cascade delete. IN_
ALLOW_CASCADE_DELETE_IND forces Hub Delete to delete the child base
objects and cross-references (regardless of system) when the parent base
object is being deleted.

Notes:
• If you do not set IN_ALLOW_CASCADE_DELETE_IND=1, Informatica
MDM Hub generates an error message if there are child base objects
referencing the deleted base object record; Hub Delete fails, and
Informatica MDM Hub performs a rollback operation for the associated
data.
• IN_CASCADE_CHILD_SYSTEM_XREF=1 is not supported in XU SP1. If you
want to selectively cascade deletes to child records, perform the child
deletes first, and then perform the parent deletes with the cascade
delete feature disabled.

Hub Delete Impact on History Tables

Hub Delete jobs have the following impact on history tables:


• If you set IN_OVERRIDE_HISTORY_IND=1, Hub Delete does not write to
history tables when deleting.
• If you set IN_OVERRIDE_HISTORY_IND=1 and set IN_PURGE_HISTORY_
IND=1, then Hub Delete removes history tables to delete all traces of the
data.
• If IN_PURGE_HISTORY_IND=1 and IN_OVERRIDE_HISTORY_IND=0, there
is no effect.

Note: Informatica MDM Hub sets HUB_STATE_IND to -9 in the HXRF table when
XREFs are deleted, and to -9 in the HIST table when the base object record
is deleted.

Hub Delete Impact on Records on Hold

The Hub Delete job removes “records on hold”, that is, records that have had
their CONSOLIDATION_IND column set to 9.

Stored Procedure Definition for Hub Delete Jobs


PROCEDURE CMXDM.HUB_DELETE_BATCH (
IN_BO_TABLE_NAME IN VARCHAR2(30)
,IN_XREF_LIST_TO_BE_DELETED IN VARCHAR2(8)
,OUT_DELETED_XREF_COUNT OUT INT
,OUT_DELETED_BO_COUNT OUT INT
,OUT_ERROR_MSG OUT VARCHAR2(1024)
,OUT_RETURN_CODE OUT INT
,OUT_TMP_TABLE_LIST IN OUT VARCHAR2(32000)
,IN_RECALCULATE_BVT IN INT DEFAULT 1
,IN_ALLOW_CASCADE_DELETE IN INT DEFAULT 1
,IN_CASCADE_CHILD_SYSTEM_XREF IN INT DEFAULT 0
,IN_OVERRIDE_HISTORY_IND IN INT DEFAULT 0
,IN_PURGE_HISTORY_IND IN INT DEFAULT 0
,IN_USER_NAME IN VARCHAR2(50) DEFAULT NULL
,IN_ALLOW_COMMIT_IND IN INT DEFAULT 1
)

Parameters
IN_BO_TABLE_NAME: Name of the table that contains the list of base objects
to delete.
IN_XREF_LIST_TO_BE_DELETED: Name of the table that contains the list of
XREFs to delete.
IN_RECALCULATE_BVT_IND: If set to one (1), recalculates BVT following base
object and/or XREF delete.
IN_ALLOW_CASCADE_DELETE_IND: If set to one (1), specifies that when records
in the parent object are deleted, Hub Delete also removes the affected
records in the child base object. Hub Delete checks each child base object
table for related data that should be deleted given the removal of the
parent base object record.
IN_CASCADE_CHILD_SYSTEM_XREF: Not supported in XU SP1. Leave the value for
this parameter as the default (0) when executing the procedure.
IN_OVERRIDE_HISTORY_IND: If set to one (1), Hub Delete does not write to
history tables when deleting. If you also set IN_PURGE_HISTORY_IND=1, then
Hub Delete removes history tables to delete all traces of the data.
IN_PURGE_HISTORY_IND: If set to one (1), Hub Delete removes all history
records related to deleted XREF records.

Returns
OUT_DELETED_XREF_COUNT: Number of deleted XREFs.
OUT_DELETED_BO_COUNT: Number of deleted base objects.
OUT_TMP_TABLE_LIST: List of delimited tables that can be passed on to
CMXUT.DROP_TEMP_TABLES stored procedure calls to clean up the temporary
tables. For more information, see "Housekeeping for Temporary Tables" on
page 564.
OUT_ERROR_MSG: Error message text.
OUT_RETURN_CODE: Error code. If zero (0), then the stored procedure
completed successfully. The procedure returns a non-zero value in case of
an error.

Sample Job Execution Script for Hub Delete Jobs


DECLARE
IN_BO_TABLE_NAME VARCHAR2( 40 );
IN_XREF_LIST_TO_BE_DELETED VARCHAR2( 40 );
IN_RECALCULATE_BVT_IND NUMBER;
IN_ALLOW_CASCADE_DELETE_IND NUMBER;
IN_CASCADE_CHILD_SYSTEM_XREF NUMBER;
IN_OVERRIDE_HISTORY_IND NUMBER;
IN_PURGE_HISTORY_IND NUMBER;
IN_USER_NAME VARCHAR2( 100 );
IN_ALLOW_COMMIT_IND NUMBER;
OUT_DELETED_XREF_COUNT NUMBER;
OUT_DELETED_BO_COUNT NUMBER;
OUT_TMP_TABLE_LIST VARCHAR2 (32000);
OUT_ERROR_MESSAGE VARCHAR2( 1024 );
OUT_RETURN_CODE NUMBER;
BEGIN
IN_BO_TABLE_NAME := 'C_CUSTOMER';
IN_XREF_LIST_TO_BE_DELETED := 'TMP_DELETE_KEYS';
OUT_DELETED_XREF_COUNT := NULL;
OUT_DELETED_BO_COUNT := NULL;
OUT_TMP_TABLE_LIST := NULL;
OUT_ERROR_MESSAGE := NULL;
OUT_RETURN_CODE := NULL;
IN_RECALCULATE_BVT_IND := 1;
IN_ALLOW_CASCADE_DELETE_IND := 1;
IN_CASCADE_CHILD_SYSTEM_XREF := 0;
IN_OVERRIDE_HISTORY_IND := 0;
IN_PURGE_HISTORY_IND := 0;
IN_USER_NAME := 'ADMIN';
IN_ALLOW_COMMIT_IND := 0;

DELETE TMP_DELETE_KEYS;

INSERT INTO TMP_DELETE_KEYS
SELECT PKEY_SRC_OBJECT, ROWID_SYSTEM
FROM C_CUSTOMER_XREF
WHERE ROWID_SYSTEM = 'SALES';
COMMIT;
--
CMXDM.HUB_DELETE_BATCH( IN_BO_TABLE_NAME,
IN_XREF_LIST_TO_BE_DELETED, OUT_DELETED_XREF_COUNT,
OUT_DELETED_BO_COUNT, OUT_ERROR_MESSAGE, OUT_RETURN_CODE,
OUT_TMP_TABLE_LIST, IN_RECALCULATE_BVT_IND,
IN_ALLOW_CASCADE_DELETE_IND, IN_CASCADE_CHILD_SYSTEM_XREF,
IN_OVERRIDE_HISTORY_IND, IN_PURGE_HISTORY_IND, IN_USER_NAME,
IN_ALLOW_COMMIT_IND );
DBMS_OUTPUT.PUT_LINE( ' RETURN CODE IS ' || OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( ' MESSAGE IS ' || OUT_ERROR_MESSAGE );
DBMS_OUTPUT.PUT_LINE( ' XREF RECORDS DELETED: ' || OUT_DELETED_XREF_COUNT
);
DBMS_OUTPUT.PUT_LINE( ' BO RECORDS DELETED: ' || OUT_DELETED_BO_COUNT );
COMMIT;
END;
/

Key Match Jobs


Key Match jobs are used to match records from two or more sources when
these sources use the same primary key. Key Match jobs compare new
records to each other and to existing records, and identify potential matches
based on the comparison of source record keys as defined by the match rules.
For more information, see "Key Match Jobs" on page 541.

Identifiers for Executing Key Match Jobs

To learn about the identifiers used to execute the stored procedure associated
with this batch job, see "Identifiers in the C_REPOS_TABLE_OBJECT_V View"
on page 562.

Dependencies for Key Match Jobs

Key Match jobs are dependent on the successful completion of the Load job
responsible for loading data into the base object. If the data has changed
since a Key Match job was last run, the job must be run again for its
results to be current.

Successful Completion of Key Match Jobs

Key Match jobs must complete with a RUN_STATUS of 0 (Completed
Successfully).

Stored Procedure Definition for Key Match Jobs


PROCEDURE CMXMA.KEY_MATCH (
IN_ROWID_TABLE IN CHAR(14) --Rowid of a table
,IN_USER_NAME IN VARCHAR2(50) --User name
,OUT_ERROR_MSG OUT VARCHAR2(1024)--Error message, if any
,OUT_RETURN_CODE OUT NUMBER --Return code (if no errors, returns 0)
)

Sample Job Execution Script for Key Match Jobs


DECLARE
 IN_ROWID_TABLE VARCHAR2(14);
 IN_USER_NAME VARCHAR2(50);
 OUT_ERROR_MESSAGE VARCHAR2(1024);
 OUT_RETURN_CODE NUMBER;
BEGIN
 IN_ROWID_TABLE := NULL;
 IN_USER_NAME := 'myusername';
 OUT_ERROR_MESSAGE := NULL;
 OUT_RETURN_CODE := NULL;

CMXMA.KEY_MATCH ( IN_ROWID_TABLE, IN_USER_NAME, OUT_ERROR_MESSAGE, OUT_
RETURN_CODE);
DBMS_OUTPUT.Put_Line(' Row id table = ' || IN_ROWID_TABLE);
DBMS_OUTPUT.Put_Line('OUT_ERROR_MESSAGE = ' || OUT_ERROR_MESSAGE);
DBMS_OUTPUT.Put_Line('OUT_RETURN_CODE = ' || TO_CHAR(OUT_RETURN_CODE));
COMMIT;
END;

Load Jobs
Load jobs move data from staging tables to the final target objects, and apply
any trust and validation rules where appropriate. For more information about
Load jobs and the load process, see "Load Jobs" on page 542.

Identifiers for Executing Load Jobs

To learn about the identifiers used to execute the stored procedure associated
with this batch job, see "Identifiers in the C_REPOS_TABLE_OBJECT_V View"
on page 562.

Dependencies for Load Jobs

Each Load job is dependent on the success of the Stage job that precedes it. In
 addition, each Load job is governed by the demands of referential integrity
constraints and is dependent on the successful completion of all other Load
jobs responsible for populating tables referenced by the base object that is the
target of the load. Run the loads for parent tables before the loads for child
tables.

Successful Completion of Load Jobs

A Load job must complete with a RUN_STATUS of 0 (Completed Successfully)
or 1 (Completed with Errors) to be considered successful. The Auto Match and
Merge jobs for a target base object can either be run on successful completion
of each Load job, or on successful completion of all Load jobs for the base
object.

Stored Procedure Definition for Load Jobs


PROCEDURE CMXLD.LOAD_MASTER (
IN_STG_ROWID_TABLE IN CHAR(14) --Rowid of staging table
,IN_USER_NAME IN VARCHAR2(50) --Database user name
,OUT_ERROR_MSG OUT VARCHAR2 (1024) --Error message, if any
,OUT_RETURN_CODE OUT NUMBER --Return code (if no errors, 0 is returned)
,IN_FORCE_UPDATE_IND IN NUMBER --Forced update value. Default 0; 1 for forced update
,IN_ROWID_JOB_GRP_CTRL IN CHAR(14)
,IN_ROWID_JOB_GRP_ITEM IN CHAR(14)
)

Sample Job Execution Script for Load Jobs
DECLARE
 IN_STG_ROWID_TABLE CHAR(14);
 IN_USER_NAME VARCHAR2(50);
 OUT_ERROR_MSG VARCHAR2(1024);
 OUT_RETURN_CODE NUMBER;
 IN_FORCE_UPDATE_IND NUMBER;
 IN_ROWID_JOB_GRP_CTRL CHAR(14);
 IN_ROWID_JOB_GRP_ITEM CHAR(14);
BEGIN
 IN_STG_ROWID_TABLE := 'SVR1.1L9';
 IN_USER_NAME := 'ADMIN';
 IN_ROWID_JOB_GRP_CTRL := NULL;
 IN_ROWID_JOB_GRP_ITEM := NULL;
 OUT_ERROR_MSG := NULL;
 OUT_RETURN_CODE := NULL;
 IN_FORCE_UPDATE_IND := 1;
 CMXLD.LOAD_MASTER ( IN_STG_ROWID_TABLE, IN_USER_NAME, OUT_ERROR_MSG,
OUT_RETURN_CODE,IN_FORCE_UPDATE_IND, IN_ROWID_JOB_GRP_CTRL, IN_ROWID_JOB_
GRP_ITEM);
 DBMS_OUTPUT.Put_Line('OUT_ERROR_MSG = ' || OUT_ERROR_MSG);
 DBMS_OUTPUT.Put_Line('OUT_RETURN_CODE = ' || TO_CHAR(OUT_RETURN_CODE));
 COMMIT;
END;

Manual Link Jobs


Manual Link jobs execute manual linking performed in the Merge Manager tool.
Manual Link jobs are used with link-style base objects only. Results are
stored in the _LINK table. For more information, see "Manual Link Jobs" on
page 545.

Manual Merge Jobs


After the Match job has been run, data stewards can use the Merge Manager to
process records that have been queued by a Match job for manual merge.
Manual Merge jobs are run in the Merge Manager—not in the Batch Viewer.
The Batch Viewer only allows you to inspect job execution logs for Manual
Merge jobs that were run in the Merge Manager. For more information, see
"Executing a Manual Merge Job in the Merge Manager" on page 546.

Stored Procedure Definition for Manual Merge Jobs


PROCEDURE CMXMM.MERGE(
IN_ROWID_TABLE IN CHAR(14)
,IN_SRC_ROWID_OBJECT IN CHAR(14)
,IN_TGT_ROWID_OBJECT IN CHAR(14)
,IN_ROWID_MATCH_RULE IN CHAR(14)
,IN_AUTOMERGE_IND IN INT
,IN_PROMOTE_STRING IN VARCHAR2(4000)
,IN_ROWID_JOB_CTL IN CHAR(14)
,IN_INTERACTION_ID IN INT
,IN_USER_NAME IN VARCHAR2(50)
,OUT_MERGED_IS_UNIQUE_IND OUT INT
,OUT_ERROR_MESSAGE OUT VARCHAR2(1024)
,OUT_RETURN_CODE OUT INT
,CALLED_MANUALLY_IND IN INT DEFAULT 1
,OUT_TMP_TABLE_LIST OUT NOCOPY VARCHAR2(32000)
)

Sample Job Execution Script for Manual Merge Jobs


DECLARE
V_ROWID_TABLE CHAR( 14 );
V_SRC_ROWID_OBJECT CHAR( 14 );
V_TGT_ROWID_OBJECT CHAR( 14 );
V_PROMOTE_STRING VARCHAR2( 2000 );
V_INTERACTION_ID INT := NULL;
OUT_MERGE_COUNT INT;
OUT_MERGED_IS_UNIQUE_IND INT;
OUT_ERROR_MESSAGE VARCHAR2( 2000 );
OUT_RETURN_CODE INT;
BEGIN
SELECT ROWID_TABLE
INTO V_ROWID_TABLE
FROM C_REPOS_TABLE
WHERE TABLE_NAME = 'C_CUSTOMER';
V_TGT_ROWID_OBJECT := 1;
V_SRC_ROWID_OBJECT := 2;
V_PROMOTE_STRING := NULL;
-- Contains Rowid_column~winner~ pairs for trusted columns, to force the
-- winning cell for that column. Winner can be either "s" (source) or
-- "t" (target). Example: 'svr1.7sv~t~svr1.7sw~s~'
V_INTERACTION_ID := NULL;

CMXMM.MANUAL_MERGE( V_ROWID_TABLE, V_SRC_ROWID_OBJECT,
V_TGT_ROWID_OBJECT, V_PROMOTE_STRING, V_INTERACTION_ID, 'ADMIN',
OUT_MERGED_IS_UNIQUE_IND, OUT_ERROR_MESSAGE, OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( 'MERGED IS UNIQUE IND: ' ||
OUT_MERGED_IS_UNIQUE_IND );
DBMS_OUTPUT.PUT_LINE( 'RETURN MESSAGE: ' || SUBSTR( OUT_ERROR_MESSAGE, 1,
255 ));
DBMS_OUTPUT.PUT_LINE( 'RETURN CODE: ' || OUT_RETURN_CODE );
COMMIT;
END;

Manual Unlink Jobs


Manual Unlink jobs unlink records that were previously linked manually in
the Merge Manager tool or through one of these stored procedure jobs.

Manual Unmerge Jobs
The Unmerge job can unmerge already-consolidated records, whether those
records were consolidated using Automerge, Manual Merge, manual edit, Load
by Rowid_Object, or Put Xref. The Unmerge job succeeds or fails as a single
transaction: if the server fails while the Unmerge job is executing, the
unmerge process is rolled back.

Cascade Unmerge

The Unmerge job performs a cascade unmerge if this feature is enabled for
this base object in the Schema Manager in the Hub Console. With cascade
unmerge, when records in the parent object are unmerged, Informatica MDM
Hub also unmerges affected records in the child base object.

This feature applies to unmerging records across base objects. This is
configured per base object (using the Unmerge Child When Parent
Unmerges check box on the Merge Settings tab in the Schema Manager).
Cascade unmerge applies only when a foreign-key relationship exists between
two base objects.

For example: Customer A record (parent) in the Customer base object has
multiple address records (children) in the Address base object. The two tables
are linked by a unique key (Customer_ID).
• When cascade unmerge is enabled—Unmerging the parent record
(Customer A) in the Customer base object also unmerges Customer A's
child address records in the Address base object.
• When cascade unmerge is disabled—Unmerging the parent record
(Customer A) in the Customer base object has no effect on Customer A's
child records in the Address base object; they are NOT unmerged.

Unmerging All Records or One Record

In your job execution script, you can specify the scope of records to unmerge
by setting IN_UNMERGE_ALL_XREFS_IND.
• IN_UNMERGE_ALL_XREFS_IND=0: Default setting. Unmerges the single
record identified in the specified XREF to its state prior to the merge.
• IN_UNMERGE_ALL_XREFS_IND=1: Unmerges all XREFs to their state prior
to the merge. Use this option to quickly unmerge all XREFs for a single
consolidated record in a single operation.

Linear and Tree Unmerge

These features apply to unmerging contributing records from within a single
base object. There is a hierarchy of merges consisting of a root (top of the
tree, or BVT), branches (merged records), and leaves (the original
contributing records at end of the branches). This hierarchy can be many
levels deep.

In your job execution script, you can specify the type of unmerge (linear or
tree unmerge) by setting IN_TREE_UNMERGE_IND:
• IN_TREE_UNMERGE_IND=0: Default setting. Linear Unmerge
• IN_TREE_UNMERGE_IND=1: Tree Unmerge

Linear Unmerge

Linear unmerge is the default behavior. During a linear unmerge, a base
object record is unmerged and taken out of the existing merge tree structure.
Only the unmerged base object record itself comes out of the merge tree
structure, and all base object records below it in the merge tree will stay in
the original merge tree.

Tree Unmerge

Tree unmerge is an optional alternative. A tree of merged base object records
is a hierarchical structure of the merge history, reflecting the sequence of
merge operations that have occurred. Merge history is kept during the merge
process in these tables:
• The HMXR table provides the current state view of merges.
• The HMRG table provides a hierarchical view of the merge history, a tree
of merged base object records, as well as an interactive unmerge history.

During a tree unmerge, you unmerge a tree of merged base object records as
an intact sub-structure. A sub-tree having the unmerged base object record as
its root comes out of the original merge tree structure. (For example, merge
a1 and a2 into a, then merge b1 and b2 into b, and then finally merge a and
b into c. If you then perform a tree unmerge on a, the sub-tree consisting of
a, a1, and a2 comes out of the original merge tree c as an intact unit, with
a as the root of the unmerged tree.)

Identifiers for Executing Manual Unmerge Jobs

To learn about the identifiers used to execute the stored procedure associated
with this batch job, see "Identifiers in the C_REPOS_TABLE_OBJECT_V View"
on page 562.

Committing Unmerge Transactions After Error Checking

The Unmerge stored procedure is transaction-enabled and can be rolled back
if an error occurs during execution.

Important: After calling the Unmerge stored procedure, check the return code
(the RC output parameter) in your error handling.
• If any failure occurred during execution (return code <> 0), immediately
roll back any changes. Wait until after you have successfully rolled back
the unmerge changes before you invoke Unmerge again.
• If no failure occurred during execution (return code = 0), commit the
unmerge changes.
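
In script form, this error handling reduces to a few lines; the following
sketch assumes RC is the variable bound to the procedure's return-code
output:

IF RC <> 0 THEN
ROLLBACK; -- undo the partial unmerge changes before invoking Unmerge again
ELSE
COMMIT; -- the unmerge succeeded; make the changes permanent
END IF;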

Dependencies for Manual Unmerge Jobs

Each Manual Unmerge job is dependent on data having already been merged.

Successful Completion of Manual Unmerge Jobs

A Manual Unmerge job must complete with a RUN_STATUS of 0 (Completed
Successfully) or 1 (Completed with Errors) to be considered successful.

Note: If the manual unmerge job completes successfully, you need to pass
the OUT_TMP_TABLE_LIST parameter to the CMXUT.DROP_TEMP_TABLES
stored procedure to clean up the temporary tables that were returned in the
parameter, as described in "Housekeeping for Temporary Tables" on page
564.
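
The exact signature of CMXUT.DROP_TEMP_TABLES is documented in that section;
as a sketch only, assuming it follows the same parameter conventions as the
other CMXUT procedures (delimited table list in, error message and return
code out), the cleanup call would look like this:

-- Sketch only: parameter list assumed; verify against "Housekeeping for
-- Temporary Tables" on page 564.
CMXUT.DROP_TEMP_TABLES ( OUT_TMP_TABLE_LIST, OUT_ERROR_MESSAGE, RC );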

Stored Procedure Definition for Manual Unmerge Jobs


PROCEDURE CMXMM.UNMERGE (
IN_ROWID_TABLE IN CHAR(14)
,IN_ROWID_SYSTEM IN CHAR(14)
,IN_PKEY_SRC_OBJECT IN VARCHAR2(255)
,IN_TREE_UNMERGE_IND IN INT
,IN_ROWID_JOB_CTL IN CHAR(14)
,IN_INTERACTION_ID IN INT
,IN_USER_NAME IN VARCHAR2(50)
,OUT_UNMERGED_ROWID OUT CHAR(14)
,OUT_TMP_TABLE_LIST OUT VARCHAR2(32000)
,OUT_ERROR_MESSAGE OUT VARCHAR2(1024)
,RC OUT INT
,IN_UNMERGE_ALL_XREFS_IND IN INT DEFAULT 0 )

Sample Job Execution Script for Manual Unmerge Jobs


DECLARE
IN_ROWID_TABLE CHAR (14);
IN_ROWID_SYSTEM CHAR (14);
IN_PKEY_SRC_OBJECT VARCHAR2 (255);
IN_TREE_UNMERGE_IND NUMBER;
IN_ROWID_JOB_CTL CHAR (14);
IN_INTERACTION_ID NUMBER;
IN_USER_NAME VARCHAR2 (50);
OUT_UNMERGED_ROWID CHAR (14);
OUT_TMP_TABLE_LIST VARCHAR2 (32000);
OUT_ERROR_MESSAGE VARCHAR2 (1024);
RC NUMBER;
IN_UNMERGE_ALL_XREFS_IND NUMBER;
BEGIN
IN_ROWID_TABLE := 'SVR1.8ZC ';
IN_ROWID_SYSTEM := 'SVR1.7NJ ';
IN_PKEY_SRC_OBJECT := '6';
IN_TREE_UNMERGE_IND := 0; -- Default 0, 1 for tree unmerge
IN_ROWID_JOB_CTL := NULL;
IN_INTERACTION_ID := NULL;
IN_USER_NAME := 'XHE';
OUT_UNMERGED_ROWID := NULL;
OUT_TMP_TABLE_LIST := NULL;
OUT_ERROR_MESSAGE := NULL;
RC := NULL;
IN_UNMERGE_ALL_XREFS_IND := 0; -- default 0, 1 for unmerge_all

CMXMM.UNMERGE ( IN_ROWID_TABLE, IN_ROWID_SYSTEM,
IN_PKEY_SRC_OBJECT, IN_TREE_UNMERGE_IND, IN_ROWID_JOB_CTL,
IN_INTERACTION_ID, IN_USER_NAME, OUT_UNMERGED_ROWID,
OUT_TMP_TABLE_LIST, OUT_ERROR_MESSAGE, RC, IN_UNMERGE_ALL_XREFS_IND );
DBMS_OUTPUT.Put_Line('OUT_ERROR_MESSAGE = ' || OUT_ERROR_MESSAGE);
DBMS_OUTPUT.Put_Line('RC = ' || TO_CHAR(RC));
-- Commit only after checking the return code, per "Committing Unmerge
-- Transactions After Error Checking" above.
IF RC = 0 THEN
COMMIT;
ELSE
ROLLBACK;
END IF;
END;

Match Jobs
Match jobs find duplicate records in the base object, based on the current
match rules. For more information about Match jobs and the match process,
see "Match Jobs" on page 547.

Important: Do not run a Match job on a base object that is used to define
relationships between records in inter-table or intra-table match paths. Doing
so will change the relationship data, resulting in the loss of the associations
between records. For more information, see "Relationship Base Objects" on
page 374.

Identifiers for Executing Match Jobs

For a complete list of the identifiers used to execute the stored procedure
associated with this batch job, see "Identifiers in the C_REPOS_TABLE_
OBJECT_V View" on page 562.

Dependencies for Match Jobs

Each Match job is dependent on new / updated records in the base object that
have been tokenized and are thus queued for matching. For parent base
objects that have children, the Match job is also dependent on the successful
completion of the data tokenization jobs for all child tables, which in turn is
dependent on successful Load jobs for the child tables.

Successful Completion of Match Jobs

Match jobs must complete with a RUN_STATUS of 0 (Completed Successfully)
or 1 (Completed with Errors) to be considered successful.

Stored Procedure for Match Jobs


PROCEDURE CMXMA.MATCH (
IN_ROWID_TABLE IN CHAR(14) --Rowid of a table
,IN_USER_NAME IN VARCHAR2(50) --User name
,OUT_ERROR_MSG OUT VARCHAR2(1024) --Error message, if any
,RC OUT NUMBER --Return code (if no errors, 0 is returned)
,IN_VALIDATE_TABLE_NAME IN VARCHAR2(200) --Validate table name
,IN_MATCH_ANALYZE_IND IN NUMBER --Match analyze to check for match data
,IN_MATCH_SET_NAME IN VARCHAR2(500)
)

Sample Job Execution Script for Match Jobs


DECLARE
 IN_ROWID_TABLE CHAR(14);
 IN_USER_NAME VARCHAR2(50);
 OUT_ERROR_MSG VARCHAR2(1024);
 RC NUMBER;
 IN_VALIDATE_TABLE_NAME VARCHAR2(30);
 IN_MATCH_ANALYZE_IND NUMBER;
 IN_MATCH_SET_NAME VARCHAR2(500);
 IN_JOB_GRP_CTRL CHAR(14);
 IN_JOB_GRP_ITEM CHAR(14);
BEGIN
 IN_ROWID_TABLE := NULL;
 IN_USER_NAME := NULL;
 OUT_ERROR_MSG := NULL;
 RC := NULL;
 IN_VALIDATE_TABLE_NAME := NULL;
 IN_MATCH_ANALYZE_IND := NULL;
 IN_MATCH_SET_NAME := NULL;
 IN_JOB_GRP_CTRL := NULL;
 IN_JOB_GRP_ITEM := NULL;

CMXMA.MATCH ( IN_ROWID_TABLE, IN_USER_NAME, OUT_ERROR_MSG, RC,
IN_VALIDATE_TABLE_NAME, IN_MATCH_ANALYZE_IND, IN_MATCH_SET_NAME,
IN_JOB_GRP_CTRL, IN_JOB_GRP_ITEM );
DBMS_OUTPUT.Put_Line('OUT_ERROR_MSG = ' || OUT_ERROR_MSG);
DBMS_OUTPUT.Put_Line('RC = ' || TO_CHAR(RC));
COMMIT;
END;

Match Analyze Jobs


Match Analyze jobs perform a search to gather metrics about matching
without conducting any actual matching. Match Analyze jobs are typically used
to fine-tune match rules. For more information, see "Match Analyze Jobs" on
page 550.

Identifiers for Executing Match Analyze Jobs

For a complete list of the identifiers used to execute the stored procedure
associated with this batch job, see "Identifiers in the C_REPOS_TABLE_
OBJECT_V View" on page 562.

Dependencies for Match Analyze Jobs

Each Match Analyze job is dependent on new / updated records in the base
object that have been tokenized and are thus queued for matching. For parent
base objects, the Match Analyze job is also dependent on the successful
completion of the data tokenization jobs for all child tables, which in turn is
dependent on successful Load jobs for the child tables.

Successful Completion of Match Analyze Jobs

Match Analyze jobs must complete with a RUN_STATUS of 0 (Completed
Successfully) or 1 (Completed with Errors) to be considered successful.

Stored Procedure for Match Analyze Jobs


PROCEDURE CMXMA.MATCH (
IN_ROWID_TABLE IN CHAR(14) --Rowid of a table
,IN_USER_NAME IN VARCHAR2(50) --User name
,OUT_ERROR_MSG OUT VARCHAR2(1024) --Error message, if any
,RC OUT NUMBER --Return code (if no errors, 0 is returned)
,IN_VALIDATE_TABLE_NAME IN VARCHAR2(30) --Validate table name
,IN_MATCH_ANALYZE_IND IN NUMBER --Match analyze to check for match data
,IN_MATCH_SET_NAME IN VARCHAR2(500)
)

Sample Job Execution Script for Match Analyze Jobs


DECLARE
 IN_ROWID_TABLE CHAR(14);
 IN_USER_NAME VARCHAR2(50);
 OUT_ERROR_MSG VARCHAR2(1024);
 OUT_RETURN_CODE NUMBER;
 IN_VALIDATE_TABLE_NAME VARCHAR2(30);
 IN_MATCH_ANALYZE_IND NUMBER;
BEGIN
 IN_ROWID_TABLE := NULL;
 IN_USER_NAME := NULL;
 OUT_ERROR_MSG := NULL;
 OUT_RETURN_CODE := NULL;
 IN_VALIDATE_TABLE_NAME := NULL;
 IN_MATCH_ANALYZE_IND := 1;

-- NULL is passed for the IN_MATCH_SET_NAME parameter.
CMXMA.MATCH ( IN_ROWID_TABLE, IN_USER_NAME, OUT_ERROR_MSG, OUT_RETURN_
CODE, IN_VALIDATE_TABLE_NAME, IN_MATCH_ANALYZE_IND, NULL );
DBMS_OUTPUT.Put_Line('OUT_ERROR_MSG = ' || OUT_ERROR_MSG);
DBMS_OUTPUT.Put_Line('OUT_RETURN_CODE = ' || TO_CHAR(OUT_RETURN_CODE));
COMMIT;
END;

Match for Duplicate Data Jobs


A Match for Duplicate Data job searches for exact duplicates to consider them
matched. Use it to manually run the Match for Duplicate Data process when
you want to use your own rule as the match for duplicates criteria instead of
all the columns in the base object. The maximum number of exact duplicates
is based on the base object columns defined in the Duplicate Match Threshold
property in the Schema Manager for each base object. For more information,
see “Match for Duplicate Data Jobs” on page 737.

Note: The Match for Duplicate Data batch job has been deprecated.

Identifiers for Executing Match for Duplicate Jobs

To learn about the identifiers used to execute the stored procedure associated
with this batch job, see "Identifiers in the C_REPOS_TABLE_OBJECT_V View"
on page 562.

Dependencies for Match for Duplicate Data Jobs

Match for Duplicate Data jobs require the existence of unconsolidated data in
the base object.

Successful Completion of Match for Duplicate Data Jobs

Match for Duplicate Data jobs must complete with a RUN_STATUS of 0
(Completed Successfully).

Stored Procedure Definition for Match for Duplicate Data Jobs


PROCEDURE CMXMA.MATCH_FOR_DUPS (
IN_ROWID_TABLE IN CHAR(14) --Rowid of a table
,IN_USER_NAME IN VARCHAR2(200) --User name
,OUT_ERROR_MSG OUT VARCHAR2(2000) --Error message, if any
,OUT_RETURN_CODE OUT INT --Return code (if no errors, 0 is returned)
)

Sample Job Execution Script for Match for Duplicate Data Jobs
DECLARE
 IN_ROWID_TABLE CHAR(14);
 IN_USER_NAME VARCHAR2(200);
 OUT_ERROR_MSG VARCHAR2(2000);
 OUT_RETURN_CODE NUMBER;
BEGIN
 IN_ROWID_TABLE := NULL;
 IN_USER_NAME := NULL;
 OUT_ERROR_MSG := NULL;
 OUT_RETURN_CODE := NULL;

CMXMA.MATCH_FOR_DUPS ( IN_ROWID_TABLE, IN_USER_NAME, OUT_ERROR_MSG, OUT_
RETURN_CODE);
DBMS_OUTPUT.Put_Line('OUT_ERROR_MSG = ' || OUT_ERROR_MSG);
DBMS_OUTPUT.Put_Line('OUT_RETURN_CODE = ' || TO_CHAR(OUT_RETURN_CODE));
COMMIT;
END;

Multi Merge Jobs


Multi Merge jobs allow the merge of multiple records in a single job. This
batch job is initiated only by external applications that invoke the SIF
MultiMergeRequest. For more information, see the Informatica MDM Hub
Services Integration Framework Guide.

The Multi Merge stored procedure:


• calls group_merge based on the incoming list of base object records
• uses PUT_XREF to process user-selected winning values (new XREF record
from CMX Admin + merge lineage) into the base object record

When executing the Multi Merge stored procedure:


• Merges the rowid_objects in IN_MEMBER_ROWID_LIST into IN_
SURVIVING_ROWID, using the column values provided in IN_VAL_LIST
as the base object’s winning cell values. Values are delimited by ~. For
example: val1~val2~val3~
• The first rowid_object in IN_MEMBER_ROWID_LIST is selected as the
surviving rowid_object if IN_SURVIVING_ROWID is not provided.
• If IN_MEMBER_ROWID_LIST is NULL, IN_SURVIVING_ROWID is treated
as a group_id in the link table. In this case, all active member
rowid_objects belonging to this group_id are merged into IN_
SURVIVING_ROWID.
• Values in the IN_MEMBER_ROWID_LIST, IN_COL_LIST, and IN_VAL_LIST
parameters are delimited by ‘~’. For example: value1~value2~value3~

Identifiers for Executing Multi Merge Jobs

To learn about the identifiers used to execute the stored procedure associated
with this batch job, see "Identifiers in the C_REPOS_TABLE_OBJECT_V View"
on page 562.

Dependencies for Multi Merge Jobs

Each Multi Merge job is dependent on the successful completion of the match
process for this base object.

Successful Completion of Multi Merge Jobs

Multi Merge jobs must complete with a RUN_STATUS of 0 (Completed


Successfully) or 1 (Completed with Errors) to be considered successful.

Stored Procedure Definition for Multi Merge Jobs


PROCEDURE CMXMM.MULTI_MERGE (
IN_ROWID_TABLE IN CHAR(14)
,IN_SURVIVING_ROWID IN CHAR(14)
,IN_MEMBER_ROWID_LIST IN VARCHAR2(32000) --delimited by '~'
,IN_ROWID_MATCH_RULE IN CHAR(14)
,IN_COL_LIST IN VARCHAR2(32000) --delimited by '~'
,IN_VAL_LIST IN VARCHAR2(32000) --delimited by '~'
,IN_INTERACTION_ID IN INT
,IN_USER_NAME IN VARCHAR2(50)
,IN_WINNING_CELL_OVERRIDE VARCHAR2(4000)
,OUT_ERROR_MESSAGE OUT VARCHAR2(1024)
,OUT_RETURN_CODE OUT INT
)

Sample Job Execution Script for Multi Merge Jobs


DECLARE
IN_ROWID_TABLE CHAR(14);
IN_SURVIVING_ROWID CHAR(14);
IN_MEMBER_ROWID_LIST VARCHAR2(4000);
IN_ROWID_MATCH_RULE VARCHAR2(4000);
IN_COL_LIST VARCHAR2(4000);
IN_VAL_LIST VARCHAR2(4000);
IN_INTERACTION_ID NUMBER;
IN_USER_NAME VARCHAR2(200);
IN_WINNING_CELL_OVERRIDE VARCHAR2(4000);
OUT_ERROR_MESSAGE VARCHAR2(200);
OUT_RETURN_CODE NUMBER;

BEGIN
IN_ROWID_TABLE := 'SVR1.CP4 ';
IN_SURVIVING_ROWID := '40 ';
IN_MEMBER_ROWID_LIST := '42 ~44 ~45 ~47
~48 ~49 ~';
IN_ROWID_MATCH_RULE := NULL;
IN_COL_LIST := 'SVR1.CSB ~SVR1.CSE ~SVR1.CSG ~SVR1.CSH
~SVR1.CSA ~';
IN_VAL_LIST := 'INDU~THOMAS~11111111111~F~1000~';
IN_INTERACTION_ID := 0;
IN_USER_NAME := 'INDU';
IN_WINNING_CELL_OVERRIDE := NULL;
OUT_ERROR_MESSAGE := NULL;
OUT_RETURN_CODE := NULL;

CMXMM.MULTI_MERGE ( IN_ROWID_TABLE, IN_SURVIVING_ROWID, IN_MEMBER_
ROWID_LIST, IN_ROWID_MATCH_RULE, IN_COL_LIST, IN_VAL_LIST, IN_
INTERACTION_ID, IN_USER_NAME, IN_WINNING_CELL_OVERRIDE,
OUT_ERROR_MESSAGE, OUT_RETURN_CODE );

DBMS_OUTPUT.Put_Line('OUT_ERROR_MESSAGE = ' || OUT_ERROR_MESSAGE);


DBMS_OUTPUT.Put_Line('OUT_RETURN_CODE = ' || TO_CHAR(OUT_RETURN_CODE));
COMMIT;
END;

Promote Jobs
For state-enabled objects, a Promote job reads the PROMOTE_IND column
from an XREF table and, for all rows where the column’s value is 1, promotes
the record to the ACTIVE state. Informatica MDM Hub resets PROMOTE_IND
after the Promote job has run. For more information about managing the state
of base object or XREF records, see "About State Management in Informatica
MDM Hub" on page 159.

Note: The PROMOTE_IND column on a record is not changed to 0 during the
Promote batch process if the record is not promoted.

Stored Procedure Definition for Promote Jobs


PROCEDURE CMXSM.AUTO_PROMOTE(
IN_ROWID_TABLE IN CHAR(14)
,IN_USER_NAME IN VARCHAR2(50)
,OUT_ERROR_MESSAGE OUT VARCHAR2(1024)
,OUT_RETURN_CODE OUT INT
,IN_ROWID_JOB_GRP_CTRL IN CHAR(14) DEFAULT NULL
,IN_ROWID_JOB_GRP_ITEM IN CHAR(14) DEFAULT NULL
)
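
The guide does not include a sample execution script for Promote jobs; the
following minimal sketch follows the pattern of the other samples in this
chapter. The C_CUSTOMER table name and the ADMIN user name are illustrative:

DECLARE
V_ROWID_TABLE CHAR( 14 );
OUT_ERROR_MESSAGE VARCHAR2( 1024 );
OUT_RETURN_CODE INT;
BEGIN
SELECT ROWID_TABLE
INTO V_ROWID_TABLE
FROM C_REPOS_TABLE
WHERE TABLE_NAME = 'C_CUSTOMER'; --Illustrative base object name

CMXSM.AUTO_PROMOTE( V_ROWID_TABLE, 'ADMIN', OUT_ERROR_MESSAGE,
OUT_RETURN_CODE );
DBMS_OUTPUT.Put_Line('OUT_ERROR_MESSAGE = ' || OUT_ERROR_MESSAGE);
DBMS_OUTPUT.Put_Line('OUT_RETURN_CODE = ' || TO_CHAR(OUT_RETURN_CODE));
COMMIT;
END;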

Recalculate BO Jobs
There are two versions of Recalculate BO:
• Using the ROWID_OBJECT_TABLE Parameter—Recalculates all BOs
identified by the ROWID_OBJECT column in the table/inline view (note that
brackets are required around an inline view).
• Without the ROWID_OBJECT_TABLE Parameter—Recalculates all
records in the BO, in batches of MATCH_BATCH_SIZE or 1/4 the number of
the records in the table, whichever is less.

Stored Procedure Definition for Recalculate BO Jobs

PROCEDURE CMXBV.RECALCULATE_BO(
IN_TABLE_NAME IN VARCHAR2(128)
,IN_ROWID_OBJECT_TABLE IN VARCHAR2(128)
,IN_USER_NAME IN VARCHAR2(50)
,IN_LOCK_GROUP_STR IN VARCHAR2(100)
,OUT_TMP_TABLE_LIST OUT VARCHAR2(32000)
,OUT_ERROR_MESSAGE OUT VARCHAR2(1024)
,OUT_RETURN_CODE OUT INT
)

Sample Job Execution Script for Recalculate BO Jobs


DECLARE
OUT_TMP_TABLE_LIST VARCHAR2( 32000 );
OUT_ERROR_MESSAGE VARCHAR2( 1024 );
OUT_RETURN_CODE NUMBER;
BEGIN
DELETE TEST_RECALC_BO;
INSERT INTO TEST_RECALC_BO
SELECT ROWID_OBJECT
FROM C_CUSTOMER;

-- NULL is passed for IN_LOCK_GROUP_STR; OUT_TMP_TABLE_LIST receives the
-- temporary-table list for cleanup.
CMXBV.RECALCULATE_BO( 'C_CUSTOMER', 'TEST_RECALC_BO', 'TNEFF', NULL,
OUT_TMP_TABLE_LIST, OUT_ERROR_MESSAGE, OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( ' RETURN CODE = ' || OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( ' MESSAGE IS = ' || OUT_ERROR_MESSAGE );
COMMIT;
END;
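
For the inline-view variant, the same call can pass a bracketed query in
place of the table name. The following sketch is illustrative only; the
WHERE clause and user name are assumptions:

DECLARE
OUT_TMP_TABLE_LIST VARCHAR2( 32000 );
OUT_ERROR_MESSAGE VARCHAR2( 1024 );
OUT_RETURN_CODE NUMBER;
BEGIN
-- Brackets are required around the inline view. NULL is passed for
-- IN_LOCK_GROUP_STR.
CMXBV.RECALCULATE_BO( 'C_CUSTOMER',
'(SELECT ROWID_OBJECT FROM C_CUSTOMER WHERE LAST_UPDATE_DATE > SYSDATE - 1)',
'ADMIN', NULL, OUT_TMP_TABLE_LIST, OUT_ERROR_MESSAGE, OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( ' RETURN CODE = ' || OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( ' MESSAGE IS = ' || OUT_ERROR_MESSAGE );
COMMIT;
END;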

Recalculate BVT Jobs


Recalculates the BVT for the specified ROWID_OBJECT.

Stored Procedure Definition for Recalculate BVT Jobs


PROCEDURE CMXBV.RECALCULATE_BVT(
IN_TABLE_NAME IN VARCHAR2(128)
,IN_ROWID_OBJECT IN CHAR(14)
,IN_USER_NAME IN VARCHAR2(50)
,OUT_TMP_TABLE_LIST OUT VARCHAR2(32000)
,OUT_ERROR_MESSAGE OUT VARCHAR2(1024)
,OUT_RETURN_CODE OUT INT
)
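
No sample script accompanies this definition in the guide; a minimal sketch
based on the signature above follows. The table name and rowid value are
illustrative placeholders:

DECLARE
OUT_TMP_TABLE_LIST VARCHAR2( 32000 );
OUT_ERROR_MESSAGE VARCHAR2( 1024 );
OUT_RETURN_CODE INT;
BEGIN
-- 'C_CUSTOMER' and the rowid value are illustrative.
CMXBV.RECALCULATE_BVT( 'C_CUSTOMER', 'SVR1.ABC      ', 'ADMIN',
OUT_TMP_TABLE_LIST, OUT_ERROR_MESSAGE, OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( ' RETURN CODE = ' || OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( ' MESSAGE IS = ' || OUT_ERROR_MESSAGE );
COMMIT;
END;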

Reset Batch Group Status Jobs


Reset Batch Group Status jobs (CMXBG.RESET_BATCHGROUP) reset a batch
group. Note that there are two other related batch group stored procedures:
• Execute Batch Group Jobs (CMXBG.EXECUTE_BATCHGROUP)
• Get Batch Group Status Jobs (CMXBG.GET_BATCHGROUP_STATUS)

For more information, see "Stored Procedures for Batch Groups" on page 599.

Reset Links Jobs


Updates the records in the _LINK table to account for changes in the data.
Used with link-style base objects only.

Reset Match Table Jobs


The Reset Match Table job is created automatically after you run a Match job
when the following conditions exist: records have been updated to
CONSOLIDATION_IND=2, and you then change your match rules, as
described in "Configuring Match Column Rules for Match Rule Sets" on page
407.

Note: This job cannot be run from the Batch Viewer. For more information,
see "Reset Match Table Jobs" on page 555.

Stored Procedure Definition for Reset Match Table Jobs


PROCEDURE CMXMA.RESET_MATCH(
IN_ROWID_TABLE IN CHAR(14)
,IN_USER_NAME IN VARCHAR2(50)
,OUT_ERROR_MSG OUT VARCHAR2(1024)
,RC OUT INT
,IN_JOB_GRP_CTRL IN CHAR(14) DEFAULT NULL
,IN_JOB_GRP_ITEM IN CHAR(14) DEFAULT NULL
)

Sample Job Execution Script for Reset Match Table Jobs


DECLARE
V_ROWID_TABLE CHAR( 14 );
OUT_ERROR_MESSAGE VARCHAR2( 1024 );
OUT_RETURN_CODE INTEGER;
BEGIN
SELECT ROWID_TABLE
INTO V_ROWID_TABLE
FROM C_REPOS_TABLE
WHERE TABLE_NAME = 'C_CUSTOMER';

CMXMA.RESET_MATCH( V_ROWID_TABLE, 'ADMIN', OUT_ERROR_MESSAGE, OUT_RETURN_
CODE );
DBMS_OUTPUT.PUT_LINE( 'RETURN MESSAGE: ' || SUBSTR(
OUT_ERROR_MESSAGE, 1, 255 ));
DBMS_OUTPUT.PUT_LINE( 'RETURN CODE: ' || OUT_RETURN_CODE );
COMMIT;
END;

Revalidate Jobs
Revalidate jobs execute the validation logic/rules for records that have been
modified since the initial validation during the load process. Run Revalidate
when records have changed after the initial load process's validation step.
If no records have changed, no records are updated. If some records have
changed and are caught by the existing validation rules, the metrics show
the results. Revalidate is executed manually for base objects using the
Batch Viewer. For more information, see "Running Batch Jobs Using the Batch
Viewer Tool" on page 501.

Note: Revalidate can only be run after an initial load and prior to merge on
base objects that have validate rules setup.

Stored Procedure Definition for Revalidate Jobs


PROCEDURE CMXUT.REVALIDATE_BO(
IN_TABLE_NAME IN VARCHAR2(30)
,OUT_ERROR_MSG OUT VARCHAR2(1024)
,RC OUT INT
)

Sample Job Execution Script for Revalidate Jobs


DECLARE
IN_TABLE_NAME VARCHAR2(30);
OUT_ERROR_MESSAGE VARCHAR2(1024);
RC NUMBER;
BEGIN
IN_TABLE_NAME := UPPER('&TBL');
OUT_ERROR_MESSAGE := NULL;
RC := NULL;

CMXUT.REVALIDATE_BO(IN_TABLE_NAME, OUT_ERROR_MESSAGE, RC);
DBMS_OUTPUT.PUT_LINE ( 'OUT_ERROR_MESSAGE= ' ||
SUBSTR(OUT_ERROR_MESSAGE,1,200) );
DBMS_OUTPUT.Put_Line('RC = ' || TO_CHAR(RC));
COMMIT;
END;

Stage Jobs
Stage jobs copy records from a landing to a staging table. During execution,
Stage jobs optionally cleanse data according to the current cleanse settings.
For more information about Stage jobs and the stage process, see "Stage
Jobs" on page 556.

Identifiers for Executing Stage Jobs

To learn about the identifiers used to execute the stored procedure associated
with this batch job, see "Identifiers in the C_REPOS_TABLE_OBJECT_V View"
on page 562.

Dependencies for Stage Jobs

Each Stage job is dependent on the successful completion of the Extract,
Transform, Load (ETL) process responsible for loading the Landing table used
by the Stage job. There are no dependencies between Stage jobs.

Successful Completion of Stage Jobs

A Stage job must complete with a RUN_STATUS of 0 (Completed Successfully)
or 1 (Completed with Errors) to be considered successful. On successful
completion of a Stage job, the Load job for the target staging table can be run,
provided that all other dependencies for the Load job have been met.

Stored Procedure Definition for Stage Jobs


PROCEDURE CMXCL.START_CLEANSE(
IN_ROWID_TABLE_OBJECT IN VARCHAR2(500) --From the view
,IN_USER_NAME IN VARCHAR2(50)
,OUT_ERROR_MSG OUT VARCHAR2(1024)
,OUT_ERROR_CODE OUT INT
,IN_STG_ROWID_TABLE IN VARCHAR2(500) --rowid_table_object
,IN_RUN_SYNCH IN VARCHAR2(500) --Set to true, else runs asynch
,IN_ROWID_JOB_GRP_CTRL IN CHAR(14) DEFAULT NULL
,IN_ROWID_JOB_GRP_ITEM IN CHAR(14) DEFAULT NULL
)

Sample Job Execution Script for Stage Jobs


DECLARE
IN_STG_ROWID_TABLE VARCHAR2(200);
IN_USER_NAME VARCHAR2(50);
IN_ROWID_TABLE_OBJECT VARCHAR2(200);
IN_RUN_SYNCH VARCHAR2(200);
OUT_ERROR_MSG VARCHAR2(2000);
OUT_ERROR_CODE NUMBER;
BEGIN
 IN_STG_ROWID_TABLE := NULL;
 IN_ROWID_TABLE_OBJECT := NULL;
 IN_USER_NAME := 'ADMIN';
 IN_RUN_SYNCH := 'TRUE';
 OUT_ERROR_MSG := NULL;
 OUT_ERROR_CODE := NULL;
 SELECT A.ROWID_TABLE, A.ROWID_TABLE_OBJECT INTO IN_STG_ROWID_TABLE,
 IN_ROWID_TABLE_OBJECT
 FROM C_REPOS_TABLE_OBJECT_V A, C_REPOS_TABLE B
 WHERE A.OBJECT_NAME = 'CMX_CLEANSE.EXE'
 AND B.ROWID_TABLE = A.ROWID_TABLE
 AND B.TABLE_NAME = 'C_HMO_ADDRESS'
 AND A.VALID_IND = 1;
 -- Arguments are passed in the order shown in the procedure definition.
 CMXCL.START_CLEANSE ( IN_ROWID_TABLE_OBJECT, IN_USER_NAME, OUT_ERROR_MSG,
 OUT_ERROR_CODE, IN_STG_ROWID_TABLE, IN_RUN_SYNCH );
 DBMS_OUTPUT.Put_Line('OUT_ERROR_MSG = ' || OUT_ERROR_MSG);
 DBMS_OUTPUT.Put_Line('OUT_ERROR_CODE = ' || TO_CHAR(OUT_ERROR_CODE));
COMMIT;
END;

Synchronize Jobs
You must run the Synchronize job after any changes are made to the schema
trust settings. The Synchronize job is created when any changes are made to
the schema trust settings, as described in "Batch Jobs That Are Created When
Changes Occur" on page 500. For more information, see "Configuring Trust for
Source Systems" on page 344.

Running Synchronize Jobs

To run the Synchronize job, navigate to the Batch Viewer, find the correct
Synchronize job for the base object, and run it. Informatica MDM Hub updates
the metadata for the base objects that have trust enabled after initial load has
occurred. For more information, see "Synchronize Jobs" on page 557.

Stored Procedure Definition for Synchronize Jobs


PROCEDURE CMXUT.SYNC(
IN_ROWID_TABLE IN CHAR(14)
,IN_USER_NAME IN VARCHAR2(50)
,OUT_ERROR_MSG OUT VARCHAR2(1024)
,OUT_RETURN_CODE OUT INT
,IN_JOB_GRP_CTRL IN CHAR(14) DEFAULT NULL
,IN_JOB_GRP_ITEM IN CHAR(14) DEFAULT NULL
)

Sample Job Execution Script for Synchronize Jobs


DECLARE
V_ROWID_TABLE CHAR( 14 );
OUT_ERROR_MESSAGE VARCHAR2( 1024 );
OUT_RETURN_CODE INTEGER;
BEGIN
SELECT ROWID_TABLE
INTO V_ROWID_TABLE
FROM C_REPOS_TABLE
WHERE TABLE_NAME = 'C_CUSTOMER';

CMXUT.SYNC( V_ROWID_TABLE, 'ADMIN', OUT_ERROR_MESSAGE,
OUT_RETURN_CODE );
DBMS_OUTPUT.PUT_LINE( 'RETURN MESSAGE: ' || SUBSTR(
OUT_ERROR_MESSAGE, 1, 255 ));
DBMS_OUTPUT.PUT_LINE( 'RETURN CODE: ' || OUT_RETURN_CODE );
COMMIT;
END;

Executing Batch Groups Using Stored Procedures
This section describes how to execute batch groups for your Informatica MDM
Hub implementation.

About Executing Batch Groups


A batch group is a collection of individual batch jobs (for example, Stage,
Load, and Match jobs) that can be executed with a single command, some
sequentially and some in parallel according to the configuration. When one
job has an error, the group stops: no more jobs are started, but jobs that
are already running run to completion. To learn important background
information about batch groups, see "Running Batch Jobs Using the Batch
Group Tool" on page 512.

This section describes how to execute batch groups using stored procedures
and job scheduling software (such as Tivoli, CA Unicenter, and so on).
Informatica MDM Hub provides stored procedures for managing batch groups.
For more information, see "Stored Procedures for Batch Groups" on page 599.

You can also use the Batch Group tool in the Hub Console to configure and run
batch groups. However, to schedule batch groups, you need to do so using
stored procedures, as described in this section. For more information about
the Batch Group tool, see "Running Batch Jobs Using the Batch Group Tool" on
page 512.

Note: If a batch group fails and you do not click either the Set to Restart
button (see "Restarting a Batch Group That Failed Execution" on page 526) or
the Set to Incomplete button (see "Handling Incomplete Batch Group
Execution" on page 527) in the Logs for My Batch Group list, Informatica MDM
Hub restarts the batch job from the prior failed level.

Stored Procedures for Batch Groups
Informatica MDM Hub provides the following stored procedures for managing
batch groups:
• CMXBG.EXECUTE_BATCHGROUP: Performs an HTTP POST to the SIF
ExecuteBatchGroupRequest. For more information, see
"CMXBG.EXECUTE_BATCHGROUP" on page 599.
• CMXBG.RESET_BATCHGROUP: Performs an HTTP POST to the SIF
ResetBatchGroupRequest. For more information, see
"CMXBG.RESET_BATCHGROUP" on page 601.
• CMXBG.GET_BATCHGROUP_STATUS: Performs an HTTP POST to the SIF
GetBatchGroupStatusRequest. For more information, see
"CMXBG.GET_BATCHGROUP_STATUS" on page 602.

In addition to using parameters that are associated with the corresponding SIF
request, these stored procedures require the following parameters:
• URL of the Hub Server (for example, http://localhost:7001/cmx/request)
• username and password
• target ORS

Note: These stored procedures construct an XML message, perform an HTTP
POST to a server URL using SIF, and return the results.

CMXBG.EXECUTE_BATCHGROUP

Execute Batch Group jobs execute a batch group. Execute Batch Group jobs
have an option to execute asynchronously, but cannot receive a JMS response
for asynchronous execution. If you need to use asynchronous execution and
need to know when execution is finished, poll with the CMXBG.GET_
BATCHGROUP_STATUS stored procedure. Alternatively, if you need to receive a
JMS response for asynchronous execution, execute the batch group
directly in an external application (instead of a job execution script) by
invoking the SIF ExecuteBatchGroup request, which is described in the
Informatica MDM Hub Services Integration Framework Guide.
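
A polling loop along the following lines can follow an asynchronous
execution. This is a sketch only: it treats a populated OUT_END_RUNDATE as
the completion signal, which is an assumption (the authoritative status
codes are those shown in the Batch Group tool), and DBMS_LOCK.SLEEP
requires execute permission on the DBMS_LOCK package:

DECLARE
 OUT_ROWID_BATCHGROUP CMXLB.CMX_SMALL_STR;
 OUT_ROWID_BATCHGROUP_LOG CMXLB.CMX_SMALL_STR;
 OUT_START_RUNDATE CMXLB.CMX_SMALL_STR;
 OUT_END_RUNDATE CMXLB.CMX_SMALL_STR;
 OUT_RUN_STATUS CMXLB.CMX_SMALL_STR;
 OUT_STATUS_MESSAGE CMXLB.CMX_SMALL_STR;
 OUT_ERROR_MSG CMXLB.CMX_SMALL_STR;
 RET_VAL INT;
BEGIN
 LOOP
  RET_VAL := CMXBG.GET_BATCHGROUP_STATUS(
  'HTTP://LOCALHOST:7001/CMX/REQUEST/PROCESS/'
  , 'ADMIN', 'ADMIN', 'LOCALHOST-MRM-XU_3009'
  , 'BATCH_GROUP.MYBATCHGROUP', NULL
  , OUT_ROWID_BATCHGROUP, OUT_ROWID_BATCHGROUP_LOG
  , OUT_START_RUNDATE, OUT_END_RUNDATE
  , OUT_RUN_STATUS, OUT_STATUS_MESSAGE, OUT_ERROR_MSG );
  -- Stop when the call fails or the group reports an end run date.
  EXIT WHEN RET_VAL <> 0 OR OUT_END_RUNDATE IS NOT NULL;
  DBMS_LOCK.SLEEP( 30 ); -- wait 30 seconds between polls
 END LOOP;
 CMXLB.DEBUG_PRINT('FINAL STATUS=' || OUT_RUN_STATUS ||
 ' MESSAGE=' || OUT_STATUS_MESSAGE);
END;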

Signature
FUNCTION CMXBG.EXECUTE_BATCHGROUP(
IN_MRM_SERVER_URL IN VARCHAR2(500)
, IN_USERNAME IN VARCHAR2(500)
, IN_PASSWORD IN VARCHAR2(500)
, IN_ORSID IN VARCHAR2(500)
, IN_BATCHGROUP_UID IN VARCHAR2(500)
, IN_RESUME IN VARCHAR2(500)
, IN_ASYNCRONOUS IN VARCHAR2(500)
, OUT_ROWID_BATCHGROUP_LOG OUT VARCHAR2(500)
, OUT_ERROR_MSG OUT VARCHAR2(500)
) RETURN NUMBER --Return the error code

Parameters
IN_MRM_SERVER_URL: Hub Server SIF URL.
IN_USERNAME: User account with role-based permissions to execute batch
groups.
IN_PASSWORD: Password for the user account with role-based permissions to
execute batch groups.
IN_ORSID: ORS ID as shown in Console > Configuration > Databases. For more
information, see "Configuring Operational Record Stores" on page 62.
IN_BATCHGROUP_UID: Informatica MDM Hub Object UID of the batch group to
execute.
IN_RESUME: One of the following values:
• true: if previous execution failed, resume at that point
• false: regardless of previous execution, start from the beginning
IN_ASYNCRONOUS: Specifies whether to execute asynchronously or
synchronously. One of the following values:
• true: start execution and return immediately (asynchronous execution)
• false: return when group execution is complete (synchronous execution)

Returns
OUT_ROWID_BATCHGROUP_LOG: c_repos_job_group_control.rowid_job_group_control
OUT_ERROR_MSG: Error message text.
NUMBER: Error code. If zero (0), then the stored procedure completed
successfully. If one (1), then the stored procedure returns an explanation
in out_error_msg.

Sample Job Execution Script for Execute Batch Group Jobs


DECLARE
 OUT_ROWID_BATCHGROUP_LOG CMXLB.CMX_SMALL_STR;
 OUT_ERROR_MSG CMXLB.CMX_SMALL_STR;
 RET_VAL INT;
BEGIN
RET_VAL := CMXBG.EXECUTE_BATCHGROUP(
'HTTP://LOCALHOST:7001/CMX/REQUEST/PROCESS/'
, 'ADMIN'
, 'ADMIN'
, 'LOCALHOST-MRM-XU_3009'
, 'BATCH_GROUP.MYBATCHGROUP'
, 'TRUE' -- OR 'FALSE'
, 'TRUE' -- OR 'FALSE'
, OUT_ROWID_BATCHGROUP_LOG
, OUT_ERROR_MSG
);

CMXLB.DEBUG_PRINT('EXECUTE_BATCHGROUP:' || ' CODE=' || RET_VAL ||
' MESSAGE=' || OUT_ERROR_MSG ||
' | OUT_ROWID_BATCHGROUP_LOG=' || OUT_ROWID_BATCHGROUP_LOG);
COMMIT;
END;

CMXBG.RESET_BATCHGROUP

Reset Batch Group jobs reset a batch group.

Note: In addition to this stored procedure, there are Java API requests and
the SOAP and HTTP XML protocols available using Services Integration
Framework (SIF). The Reset Batch Group Status job has the following SIF API
requests available: ResetBatchGroup. For more information about this SIF API
request, see the Informatica MDM Hub Services Integration Framework
Guide.

Signature
FUNCTION CMXBG.RESET_BATCHGROUP(
IN_MRM_SERVER_URL IN VARCHAR2(500)
, IN_USERNAME IN VARCHAR2(500)
, IN_PASSWORD IN VARCHAR2(500)
, IN_ORSID IN VARCHAR2(500)
, IN_BATCHGROUP_UID IN VARCHAR2(500)
, OUT_ROWID_BATCHGROUP_LOG OUT VARCHAR2(500)
, OUT_ERROR_MSG OUT VARCHAR2(500)
) RETURN NUMBER --Return the error code

Parameters
IN_MRM_SERVER_URL: Hub Server SIF URL.
IN_USERNAME: User account with role-based permissions to execute batch
groups.
IN_PASSWORD: Password for the user account with role-based permissions to
execute batch groups.
IN_ORSID: ORS ID as specified in the Database tool in the Hub Console. For
more information, see "Configuring Operational Reference Stores" on page 55.
IN_BATCHGROUP_UID: Informatica MDM Hub Object UID of the batch group to
reset.

Returns
OUT_ROWID_BATCHGROUP_LOG: c_repos_job_group_control.rowid_job_group_control
OUT_ERROR_MSG: Error message text.
NUMBER: Error code. If zero (0), then the stored procedure completed
successfully. If one (1), then the stored procedure returns an explanation
in out_error_msg.

Sample Job Execution Script for Reset Batch Group Jobs


DECLARE
 OUT_ROWID_BATCHGROUP_LOG CMXLB.CMX_SMALL_STR;
 OUT_ERROR_MSG CMXLB.CMX_SMALL_STR;
 RET_VAL INT;
BEGIN
 RET_VAL := CMXBG.RESET_BATCHGROUP(
 'HTTP://LOCALHOST:7001/CMX/REQUEST/PROCESS/'
 , 'ADMIN'
 , 'ADMIN'
 ,'LOCALHOST-MRM-XU_3009'
 , 'BATCH_GROUP.MYBATCHGROUP'
 , OUT_ROWID_BATCHGROUP_LOG
 , OUT_ERROR_MSG
 );
 CMXLB.DEBUG_PRINT('RESET_BATCHGROUP: CODE=' || RET_VAL || ' MESSAGE='
|| OUT_ERROR_MSG || ' OUT_ROWID_BATCHGROUP_LOG=' || OUT_ROWID_BATCHGROUP_
LOG);
END;
/

CMXBG.GET_BATCHGROUP_STATUS

Get Batch Group Status jobs return the batch group status.

Note: In addition to this stored procedure, there are Java API requests and
the SOAP and HTTP XML protocols available using Services Integration
Framework (SIF). The Get Batch Group Status job has the following SIF API
requests available: GetBatchGroupStatus. For more information about this SIF
API request, see the Informatica MDM Hub Services Integration Framework
Guide.

Signature
FUNCTION CMXBG.GET_BATCHGROUP_STATUS(
IN_MRM_SERVER_URL IN VARCHAR2(500)
, IN_USERNAME IN VARCHAR2(500)
, IN_PASSWORD IN VARCHAR2(500)
, IN_ORSID IN VARCHAR2(500)
, IN_BATCHGROUP_UID IN VARCHAR2(500)
, IN_ROWID_BATCHGROUP_LOG IN VARCHAR2(500)
, OUT_ROWID_BATCHGROUP OUT VARCHAR2(500)
, OUT_ROWID_BATCHGROUP_LOG OUT VARCHAR2(500)
, OUT_START_RUNDATE OUT VARCHAR2(500)
, OUT_END_RUNDATE OUT VARCHAR2(500)
, OUT_RUN_STATUS OUT VARCHAR2(500)
, OUT_STATUS_MESSAGE OUT VARCHAR2(500)
, OUT_ERROR_MSG OUT VARCHAR2(500)
) RETURN NUMBER --Return the error code

Parameters
IN_MRM_SERVER_URL: Hub Server SIF URL.
IN_USERNAME: User account with role-based permissions to execute batch
groups.
IN_PASSWORD: Password for the user account with role-based permissions to
execute batch groups.
IN_ORSID: ORS ID as specified in the Database tool in the Hub Console. For
more information, see "Configuring Operational Reference Stores" on page 55.
IN_BATCHGROUP_UID: Informatica MDM Hub Object UID of the batch group to get
the status of. If IN_ROWID_BATCHGROUP_LOG is null, the most recent log for
this group is used.
IN_ROWID_BATCHGROUP_LOG: c_repos_job_group_control.rowid_job_group_control.
Either IN_BATCHGROUP_UID or IN_ROWID_BATCHGROUP_LOG is required.

Returns
OUT_ROWID_BATCHGROUP: c_repos_job_group.rowid_job_group
OUT_ROWID_BATCHGROUP_LOG: c_repos_job_group_control.rowid_job_group_control
OUT_START_RUNDATE: Date/time when this batch job started.
OUT_END_RUNDATE: Date/time when this batch job ended.
OUT_RUN_STATUS: Job execution status code that is displayed in the Batch
Group tool. For more information, see "Executing Batch Groups Using the
Batch Group Tool" on page 522.
OUT_STATUS_MESSAGE: Job execution status message that is displayed in the
Batch Group tool. For more information, see "Executing Batch Groups Using
the Batch Group Tool" on page 522.
OUT_ERROR_MSG: Error message text for this stored procedure call, if
applicable.
NUMBER: Error code. If zero (0), then the stored procedure completed
successfully. If one (1), then the stored procedure returns an explanation
in out_error_msg.

Sample Job Execution Script for Get Batch Group Status Jobs
DECLARE
 OUT_ROWID_BATCHGROUP CMXLB.CMX_SMALL_STR;
 OUT_ROWID_BATCHGROUP_LOG CMXLB.CMX_SMALL_STR;
 OUT_START_RUNDATE CMXLB.CMX_SMALL_STR;
 OUT_END_RUNDATE CMXLB.CMX_SMALL_STR;
 OUT_RUN_STATUS CMXLB.CMX_SMALL_STR;
 OUT_STATUS_MESSAGE CMXLB.CMX_SMALL_STR;
 OUT_ERROR_MSG CMXLB.CMX_SMALL_STR;
 RET_VAL INT;
BEGIN
 RET_VAL := CMXBG.GET_BATCHGROUP_STATUS(
 'HTTP://LOCALHOST:7001/CMX/REQUEST/PROCESS/'
 , 'ADMIN'
 , 'ADMIN'
 ,'LOCALHOST-MRM-XU_3009'
 , 'BATCH_GROUP.MYBATCHGROUP'
 , NULL
 , OUT_ROWID_BATCHGROUP
 , OUT_ROWID_BATCHGROUP_LOG
 , OUT_START_RUNDATE
 , OUT_END_RUNDATE
 , OUT_RUN_STATUS
 , OUT_STATUS_MESSAGE
 , OUT_ERROR_MSG
 );
 CMXLB.DEBUG_PRINT('GET_BATCHGROUP_STATUS: CODE='|| RET_VAL || '
MESSAGE='|| OUT_ERROR_MSG || ' STATUS=' || OUT_STATUS_MESSAGE || ' | OUT_
ROWID_BATCHGROUP_LOG='|| OUT_ROWID_BATCHGROUP_LOG);
END;
/

Developing Custom Stored Procedures for Batch Jobs
This section describes how to create and register custom stored procedures
for batch jobs that can be added to batch groups for your Informatica MDM
Hub implementation.

About Custom Stored Procedures


Informatica MDM Hub also allows you to create and run custom stored
procedures for batch jobs. After developing the custom stored procedure, you
must register it in order to make it available to users as batch jobs in the
Batch Viewer and Batch Group tools in the Hub Console. For more information
about these tools, see "Using Batch Jobs " on page 496.

Required Execution Parameters for Custom Batch Jobs
The following parameters are required for custom batch jobs. During its
execution, a custom batch job can call cmxut.set_metric_value to register
metrics.

Signature
PROCEDURE EXAMPLE_JOB(
IN_ROWID_TABLE_OBJECT IN CHAR(14) --C_REPOS_TABLE_OBJECT.ROWID_TABLE_OBJECT, result of CMXUT.REGISTER_CUSTOM_TABLE_OBJECT
,IN_USER_NAME IN VARCHAR2(50) --Username calling the function
,IN_ROWID_JOB IN CHAR(14) --C_REPOS_JOB_CONTROL.ROWID_JOB, for reference; do not update status
,OUT_ERR_MSG OUT VARCHAR --Message about success or error
,OUT_ERR_CODE OUT INT -- >=0: Completed successfully. <0: Error
)

Parameters
in_rowid_table_object (cmxlb.cmx_rowid): c_repos_table_object.rowid_table_
object; result of cmxut.REGISTER_CUSTOM_TABLE_OBJECT.
in_user_name (cmxlb.cmx_user_name): User name calling the function.

Returns
out_err_msg: Error message text.
out_err_code: Error code.

Registering a Custom Stored Procedure


You must register a custom stored procedure with Informatica MDM Hub in
order to make it available to users in the Batch Group tool in the Hub Console.
You can register the same custom job multiple times for different tables (in_
rowid_table). To register a custom stored procedure, call the following
stored procedure, which records the job in c_repos_table_object:
CMXUT.REGISTER_CUSTOM_TABLE_OBJECT

Signature
PROCEDURE REGISTER_CUSTOM_TABLE_OBJECT(
IN_ROWID_TABLE IN CHAR(14)
, IN_OBJ_FUNC_TYPE_CODE IN VARCHAR
, IN_OBJ_FUNC_TYPE_DESC IN VARCHAR
, IN_OBJECT_NAME IN VARCHAR
)

Parameters
IN_ROWID_TABLE (CMXLB.CMX_ROWID): Foreign key to c_repos_table.rowid_table.
When the Hub Server calls the custom job in a batch group, this value is
passed in.
IN_OBJ_FUNC_TYPE_CODE: Job type code. Must be 'A' for batch group custom
jobs.
IN_OBJ_FUNC_TYPE_DESC: Display name for the custom batch job in the Batch
Groups tool in the Hub Console.
IN_OBJECT_NAME: package.procedure name of the custom job.

Example
BEGIN
cmxut.REGISTER_CUSTOM_TABLE_OBJECT (
'SVR1.RS1B ' -- c_repos_table.rowid_table
,'A' -- Job type, must be 'A' for batch group
,'CMXBG_EXAMPLE.UPDATE_TABLE EXAMPLE' -- Display name
,'CMXBG_EXAMPLE.UPDATE_TABLE' -- Package.procedure
);
END;

Registering a Custom Index


User-defined indexes are sometimes created without being registered in the
repository. If you create your own indexes, you should register them in the
repository. Some batch processes drop and recreate indexes based on the
repository information, so if your indexes are not registered, you run the
risk of having them dropped.

Example
DECLARE
IN_ROWID_TABLE CHAR(14);
IN_ROWID_COL_LIST VARCHAR2(2000);
IN_USER_NAME VARCHAR2(50);
IN_INDEX_TYPE VARCHAR2(200);
BEGIN
-- rowid_table from c_repos_table where table_name = 'your table name'
IN_ROWID_TABLE := '<ROWID_TABLE>';

-- List of rowid_column values from c_repos_column where rowid_table =
-- '<rowid_table value for your table>'
-- Notes:
-- 1. Trailing spaces in the rowid_column values are significant.
-- 2. Separate each rowid_column with a ~ character and end the list
--    with a ~ character, e.g. '123 ~456 ~'
IN_ROWID_COL_LIST := NULL;

-- Your name / identifier; does not have to be an Informatica MDM Hub
-- user name
IN_USER_NAME := NULL;

-- FK, PK, NI (non-unique index), UI (unique index). You should ONLY
-- create and register indexes of type NI.
IN_INDEX_TYPE := NULL;

CMXUT.REGISTER_CUSTOM_INDEX ( IN_ROWID_TABLE, IN_ROWID_COL_LIST,
IN_USER_NAME, IN_INDEX_TYPE );
COMMIT;
END;

Removing Data from a Base Object and Supporting
Metadata Tables
Use the CMXUT.CLEAN_TABLE procedure to remove all data from a base object
and its supporting metadata tables. If a base object is referenced by a foreign
key in another base object, then the referencing base object must be empty
before you run cmxut.clean_table for the referenced base object.

Example
DECLARE
IN_TABLE_NAME VARCHAR2(30);
OUT_ERROR_MESSAGE VARCHAR2(1024);
RC NUMBER;
BEGIN
IN_TABLE_NAME := 'C_BO_TO_CLEAN'; --Name of the BO table
OUT_ERROR_MESSAGE := NULL; --Return msg; output parameter
RC := NULL; --Return code; output parameter
CMXUT.CLEAN_TABLE ( IN_TABLE_NAME, OUT_ERROR_MESSAGE, RC );
COMMIT;
END;
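
For example, when one base object references another, clean the referencing
(child) base object first. A minimal sketch, using the hypothetical base
object names C_ADDRESS (which references C_PARTY):

DECLARE
OUT_ERROR_MESSAGE VARCHAR2(1024);
RC NUMBER;
BEGIN
-- In production, check RC and OUT_ERROR_MESSAGE after each call
CMXUT.CLEAN_TABLE ( 'C_ADDRESS', OUT_ERROR_MESSAGE, RC ); -- referencing BO first
CMXUT.CLEAN_TABLE ( 'C_PARTY', OUT_ERROR_MESSAGE, RC ); -- referenced BO second
COMMIT;
END;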

Writing Messages to the Informatica MDM Hub
Database Debug Log

Use the CMXLB.DEBUG_PRINT procedure to write your own messages to the
Informatica MDM Hub database debug log file. The message is written to the
log only if logging is enabled and configured correctly; for details, see the
Informatica MDM Hub Installation Guide.

Example
DECLARE
IN_DEBUG_TEXT VARCHAR2(32000);
BEGIN
IN_DEBUG_TEXT := 'Custom job started'; --String that you want to print in the log file
CMXLB.DEBUG_PRINT ( IN_DEBUG_TEXT );
COMMIT;
END;

Example Custom Stored Procedure


CREATE OR REPLACE PACKAGE CMXBG_EXAMPLE
AS

PROCEDURE UPDATE_TABLE(
IN_ROWID_TABLE_OBJECT IN CMXLB.CMX_ROWID
,IN_USER_NAME IN CMXLB.CMX_USER_NAME
,IN_ROWID_JOB IN CMXLB.CMX_ROWID
,OUT_ERR_MSG OUT VARCHAR
,OUT_ERR_CODE OUT INT

);
END CMXBG_EXAMPLE;
/
CREATE OR REPLACE PACKAGE BODY CMXBG_EXAMPLE
AS
PROCEDURE UPDATE_TABLE(
IN_ROWID_TABLE_OBJECT IN CMXLB.CMX_ROWID
,IN_USER_NAME IN CMXLB.CMX_USER_NAME
,IN_ROWID_JOB IN CMXLB.CMX_ROWID
,OUT_ERR_MSG OUT VARCHAR
,OUT_ERR_CODE OUT INT
)
AS
BEGIN
DECLARE
CUTOFF_DATE DATE;
RECORD_COUNT INT;
RUN_STATUS INT;
STATUS_MESSAGE VARCHAR2 (2000);
START_DATE DATE := SYSDATE;
MRM_ROWID_TABLE CMXLB.CMX_ROWID;
OBJ_FUNC_TYPE CHAR (1);
JOB_ID CHAR (14);
SQL_STMT VARCHAR2 (2000);
TABLE_NAME VARCHAR2(30);
RET_CODE INT;
REGISTER_JOB_ERR EXCEPTION;
BEGIN
SQL_STMT :=
'ALTER SESSION SET NLS_DATE_FORMAT=''DD MON YYYY
HH24:MI:SS''';

EXECUTE IMMEDIATE SQL_STMT;


CMXLB.DEBUG_PRINT ('START OF CUSTOM BATCH JOB...');
OBJ_FUNC_TYPE := 'A';

SELECT ROWID_TABLE
INTO MRM_ROWID_TABLE
FROM C_REPOS_TABLE_OBJECT
WHERE ROWID_TABLE_OBJECT = IN_ROWID_TABLE_OBJECT;

SELECT START_RUN_DATE
INTO CUTOFF_DATE
FROM C_REPOS_JOB_CONTROL
WHERE ROWID_JOB = IN_ROWID_JOB;

IF CUTOFF_DATE IS NULL THEN


CUTOFF_DATE := SYSDATE - 7;
END IF;
SELECT TABLE_NAME
INTO TABLE_NAME
FROM C_REPOS_TABLE RT, C_REPOS_TABLE_OBJECT RTO
WHERE RTO.ROWID_TABLE_OBJECT = IN_ROWID_TABLE_OBJECT
AND RTO.ROWID_TABLE = RT.ROWID_TABLE;

-- THE REAL WORK!


SQL_STMT :=
'UPDATE ' || TABLE_NAME || ' SET ZIP4 = ''0000'', LAST_
UPDATE_DATE = '''
|| CUTOFF_DATE
|| ''''
|| ' WHERE ZIP4 IS NULL';
CMXLB.DEBUG_PRINT (SQL_STMT);
EXECUTE IMMEDIATE SQL_STMT;

RECORD_COUNT := SQL%ROWCOUNT;
COMMIT;
-- For testing, sleep to make the procedure take longer
-- dbms_lock.sleep(5);
-- Set zero or many metrics about the job
CMXUT.SET_METRIC_VALUE (IN_ROWID_JOB, 1, RECORD_COUNT,
OUT_ERR_CODE, OUT_ERR_MSG);
COMMIT;

IF RECORD_COUNT <= 0 THEN


OUT_ERR_MSG := 'FAILED TO UPDATE RECORDS.';
OUT_ERR_CODE := -1;
ELSE
IF OUT_ERR_CODE >= 0 THEN
OUT_ERR_MSG := 'COMPLETED SUCCESSFULLY.';
END IF;
-- Else keep success code and msg from set_metric_value
END IF;
EXCEPTION
WHEN OTHERS
THEN
OUT_ERR_CODE := SQLCODE;
OUT_ERR_MSG := SUBSTR (SQLERRM, 1, 200);
END;
END;
END CMXBG_EXAMPLE;
/
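
To confirm the registration afterward, you can query the repository. A hedged
sketch; the column names OBJECT_NAME and OBJ_FUNC_TYPE_CODE are assumed
from the parameter names documented above:

SELECT ROWID_TABLE_OBJECT, OBJECT_NAME
FROM C_REPOS_TABLE_OBJECT
WHERE OBJ_FUNC_TYPE_CODE = 'A' -- batch group custom jobs
AND OBJECT_NAME = 'CMXBG_EXAMPLE.UPDATE_TABLE';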

Part 5: Configuring Application Access

Contents
• "Generating ORS-specific APIs and Message Schemas" on page 611
• "Setting Up Security" on page 621
• "Viewing Registered Custom Code" on page 678
• "Auditing Informatica MDM Hub Services and Events" on page 684

Chapter 19: Generating ORS-specific
APIs and Message Schemas

This chapter describes how to use the SIF Manager tool to generate ORS-
specific APIs and how to use the JMS Event Schema Manager tool to generate
ORS-specific JMS Event Message objects.

Chapter Contents
• "Before You Begin" on page 611
• "Generating ORS-specific APIs" on page 611
• "Generating ORS-specific Message Schemas" on page 615

Before You Begin


The SIF SDK requires a Java Development Kit (JDK) and the Apache Jakarta
Ant build system. It can build client applications and custom web services, but
only for supported application servers. Refer to the Informatica MDM Hub
Release Notes for information about the specific versions of JDK, Ant, and
supported application servers. For more information about the SIF SDK, see
the Informatica MDM Hub Services Integration Framework Guide.

Note: Using the ORS-specific APIs does not require the SIF SDK. Alternatively,
you can access the ORS-specific APIs as SOAP web services.

Generating ORS-specific APIs


You use the SIF Manager tool to generate and deploy the code to support SIF
APIs for packages, remote packages, mappings, and cleanse functions in an
ORS database. Once generated, the ORS-specific APIs are available through
SiperianClient (using the client JAR) and as a web service. For more
information about the SiperianClient, see the Informatica MDM Hub Services
Integration Framework Guide.

About ORS-specific Schemas


The ORS-specific message schema is an XML Schema Definition (XSD) file that
defines the structure of the JMS data change event messages. For more
information regarding JMS event messages, see "JMS Message XML Reference"
on page 464.

About the SIF Manager Tool


Use the SIF Manager tool in the Hub Console to produce ORS-specific APIs.

Starting the SIF Manager Tool
To start the SIF Manager tool:
1. In the Hub Console, connect to an Operational Reference Store (ORS). For
more information, see "Changing the Target Database" on page 37.
2. Expand the Informatica Utilities workbench and then click SIF Manager.
The Hub Console displays the SIF Manager tool.

The SIF Manager tool displays the following areas:

SIF ORS-Specific APIs: Shows the logical name, Java name, WSDL URL, and
API generation time for the SIF ORS-specific APIs. Use this function to
generate and deploy SIF APIs for packages, remote packages, mappings, and
cleanse functions in an ORS database. Once generated, the ORS-specific APIs
are available through SiperianClient (using the client JAR) and as a web
service. The logical name is used to name the components of the deployment.

Out of Sync Objects: Shows the database objects in the schema that are out
of sync with the generated schema.

Generating and Deploying ORS-specific SIF APIs


This operation requires access to a Java compiler on the application server
machine. The Java software development kit (SDK) includes a compiler in
tools.jar. The Java runtime environment (JRE) does not contain a compiler. If
the SDK is not available, you will need to add the tools.jar file to the
classpath of the application server.

Note: The following procedure assumes that you have already configured the
base objects and packages of the ORS. If you subsequently change any of
these, regenerate the ORS-specific APIs.

Note: SIF API generation requires that at least one secure package, remote
package, cleanse function, or mapping be defined.

To produce and use ORS-specific APIs:


1. Start the SIF Manager. For more information, see "Starting the SIF
Manager Tool" on page 612.
The Hub Console displays the SIF Manager in the right pane.
2. Acquire a write lock.
In order to make any changes to the schema, you must have a write lock.
For more information, see "Acquiring a Write Lock" on page 36.
3. Enter a value in the Logical Name field.
You can keep the default value, which is the name of the ORS. If you
change the logical name, it must be different from the logical name of any
other ORS registered on this server.
4. Click Generate and Deploy ORS-specific SIF APIs.
SIF Manager generates the APIs. The time this requires depends on the
size of the ORS schema. When the generation is complete, SIF Manager
deploys the ORS-specific APIs and displays their URL. You can use the URL
to access the WSDL descriptions from your development environment.

Note: To prevent running out of heap space for the associated SIF API
Javadoc, you may need to increase the size of the heap. The default heap size
is 256M. You can override this default using the sif.jvm.heap.size parameter
in the cmxserver.properties file.
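
For example, a minimal sketch of the override in cmxserver.properties,
assuming the value takes the same form as the documented 256M default:

sif.jvm.heap.size=512M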

Renaming ORS-specific SIF APIs

To rename the ORS-specific APIs:


1. Start the SIF Manager. For more information, see "Starting the SIF
Manager Tool" on page 612.
The Hub Console displays the SIF Manager in the right pane.
2. Acquire a write lock.
In order to make any changes to the schema, you must have a write lock.
For more information, see "Acquiring a Write Lock" on page 36.
3. Enter a new value in the Logical Name field and save it.
Note: You can keep the default value, which is the display name (and not
the schema name) of the ORS. If you change the logical name, it must be
different from the logical name of any other ORS registered on this server
to prevent duplicates.
4. Click Generate and Deploy ORS-specific SIF APIs.
SIF Manager generates the APIs. The time this requires depends on the
size of the ORS schema. When the generation is complete, SIF Manager
deploys the ORS-specific web services and displays their URL. Note that
these are not required for Java ORS-specific APIs to work. Java and web
services ORS-specific APIs have no dependencies on each other, so you
can use one while the other is not in use.
You can use the resulting URL to access the WSDL descriptions from your
development environment.

Note: To prevent running out of heap space for the associated SIF API
Javadocs, you may need to increase the size of the heap. The default heap
size is 256M. You can also override this default using the sif.jvm.heap.size
parameter.

Downloading ORS-specific Client JAR Files

You can download the ORS-specific client JAR file at any time after the APIs
have been generated.

To download client JAR files:


1. Start the SIF Manager. For more information, see "Starting the SIF
Manager Tool" on page 612.
The Hub Console displays the SIF Manager in the right pane.
2. Click Download ORS-specific Client JAR File.
SIF Manager downloads a file called nameClient.jar, where name is the
logical name you provided when generating the APIs, to a location that you
specify on your local machine. The JAR file includes the classes that
represent your ORS-specific configuration and their Javadoc.
Note: This JAR file must be used in conjunction with the generic client JAR
file in the sifsdk folder.
3. If you are using an integrated development environment (IDE) and have a
project file for building web services, add the JAR file to your build
classpath.
4. Modify the SIF SDK build.xml file so that the build_war macro includes
the JAR file. For more information about the SIF SDK, see the Informatica
MDM Hub Services Integration Framework Guide.

Finding Out-of-Sync Objects

The SIF Manager Find Out of Sync Objects function compares the last
generated APIs to the defined objects in the ORS. The SIF Manager reports
any differences between these. If differences are found, the ORS-specific API
should be regenerated.

To find the out-of-sync objects:


1. Start the SIF Manager. For more information, see "Starting the SIF
Manager Tool" on page 612.
The Hub Console displays the SIF Manager in the right pane.
2. Click Find Out of Sync Objects.
The SIF Manager displays all out-of-sync objects in the lower panel.

Note: Once you have evaluated the impact of the out-of-sync objects, you can
then decide whether or not to re-generate the schema (typically, external
components which interact with the Hub are written to work with a specific
version of the generated schema). If you regenerate the schema, these
external components may no longer work.

Removing ORS-specific APIs

To remove the ORS-specific APIs:


1. Start the SIF Manager. For more information, see "Starting the SIF
Manager Tool" on page 612.
The Hub Console displays the SIF Manager in the right pane.
2. Click Remove ORS-specific SIF APIs.

Generating ORS-specific Message Schemas


Informatica MDM Hub supports two formats for JMS events: the legacy XML
format and the new ORS-specific XML format. By default, the ORS-specific
format is used. You can choose to use the legacy format in the Message
Queues tool.

Note: If your Informatica MDM Hub implementation requires that you use the
legacy XML message format (Informatica MDM Hub XU version) instead of the
current version of the XML message format (described in this section), see
"Legacy JMS Message XML Reference" on page 479 instead.

Use the JMS Event Schema Manager tool to generate and deploy ORS-specific
JMS Event Messages for the current ORS. The XML schema for these messages
can be downloaded or accessed using a URL. For more information about JMS
Event Messages, see "JMS Message XML Reference" on page 464.

About the JMS Event Schema Manager Tool


The JMS Event Schema Manager uses an XML schema that defines the
message structure the Hub uses to generate JMS messages. This XML schema
is included as part of the Informatica MDM Hub Resource Kit. (The ORS-
specific schema is available using a URL or downloadable as a file).

Note: JMS Event Schema generation requires at least one secure package or
remote package be defined.

Important: If two databases have the same schema name (for example,
CMX_ORS), the logical name (which defaults to the schema name) will be
duplicated for JMS Events when the configuration is initially saved. Because
the database display name is unique, use it as the initial logical name instead
of the schema name, to be consistent with the SIF APIs. You must change the
logical name before generating the schema.

Additionally, each ORS has an XSD file specific to the ORS that uses the
elements from the common XSD file (siperian-mrm-events.xsd). The ORS-
specific XSD is named <ors-name>-siperian-mrm-event.xsd. The XSD defines
two objects for each package and remote package in the schema:

[packageName]Event: Complex type containing elements of type
EventMetadata and [packageName].

[packageName]Record: Complex type representing a package and its fields.
Also includes an element of type SipMetadata. This complex type resembles
the package record structures defined in the Informatica MDM Hub Services
Integration Framework (SIF). For more information, refer to the Informatica
MDM Hub Services Integration Framework Guide.

Note: If legacy XML event message objects are to be used, ORS-specific


message object generation is not required.

Starting the JMS Event Schema Manager Tool


To start the JMS Event Schema Manager tool:
1. In the Hub Console, connect to an Operational Reference Store (ORS). For
more information, see "Changing the Target Database" on page 37.
2. Expand the Informatica Utilities workbench and then click SIF Manager.
3. Click the JMS Event Schema Manager tab.

The Hub Console displays the JMS Event Schema Manager tool.

The JMS Event Schema Manager tool displays the following areas:

JMS ORS-specific Event Message Schema: Shows the event message schema
for the ORS. Use this function to generate and deploy ORS-specific JMS Event
Messages for the current ORS. The logical name is used to name the
components of the deployment. The schema can be downloaded or accessed
using a URL. Note: If legacy XML event message objects are to be used,
ORS-specific message object generation is not required.

Out of Sync Objects: Shows the database objects in the schema that are out
of sync with the generated API.

Generating and Deploying ORS-specific Schemas


This operation requires access to a Java compiler on the application server
machine. The Java software development kit (SDK) includes a compiler in
tools.jar. The Java runtime environment (JRE) does not contain a compiler.
If the SDK is not available, you will need to add the tools.jar file to the
classpath of the application server.

Important: If two databases have the same schema name (for example,
CMX_ORS), the logical name (which defaults to the schema name) will be
duplicated for JMS Events when the configuration is initially saved. Because
the database display name is unique, use it as the initial logical name instead
of the schema name, to be consistent with the SIF APIs. You must change the
logical name before generating the schema.

Additional notes:

• The following procedure assumes that you have already configured the
base objects, packages, and mappings of the ORS. If you subsequently
change any of these, regenerate the ORS-specific schemas.
• JMS Event Schema generation requires at least one secure package or
remote package.

To generate and deploy ORS-specific schemas:


1. Start the JMS Event Schema Manager. For more information, see "Starting
the JMS Event Schema Manager Tool" on page 616.
The Hub Console displays the JMS Event Schema Manager tool.
2. Enter a value in the Logical Name field for the event schema.
In order to make any changes to the schema, you must have a write lock.
For more information, see "Acquiring a Write Lock" on page 36.
3. Click Generate and Deploy ORS-specific Schemas.

Note: There must be at least one secure package or remote package
configured to generate the schema. If there are no secure objects to
generate, Informatica MDM Hub generates a runtime error message.

Downloading an XSD File

An XSD file defines the structure of an XML file and can also be used to
validate the XML file. For example, if an XML file contains a reference to an
XSD, an XML validation tool can be used to verify that the tags in the XML
conform to the definitions defined in the XSD.

To download an XSD file:


1. Start the JMS Event Schema Manager. For more information, see "Starting
the JMS Event Schema Manager Tool" on page 616.
The Hub Console displays the JMS Event Schema Manager tool.
2. Acquire a write lock.
In order to make any changes to the schema, you must have a write lock.
For more information, see "Acquiring a Write Lock" on page 36.
3. Click Download XSD File.
Alternatively, you can use the URL specified in the Schema URL to access
the XSD file.

Finding Out-of-Sync Objects

You use Find Out Of Sync Objects to determine if the event schema needs to
be re-generated to reflect changes in the system. The JMS Event Schema
Manager displays a list of packages and remote packages that have changed
since the last schema generation.

Note: The Out of Sync Objects function compares the generated APIs to the
database objects in the schema so both must be present to find the out-of-
sync objects.

To find the out-of-sync objects:


1. Start the JMS Event Schema Manager. For more information, see "Starting
the JMS Event Schema Manager Tool" on page 616.
The Hub Console displays the JMS Event Schema Manager tool.
2. Acquire a write lock.
In order to make any changes to the schema, you must have a write lock.
For more information, see "Acquiring a Write Lock" on page 36.
3. Click Find Out of Sync Objects.
The JMS Event Schema Manager displays all out of sync objects in the
lower panel.

Note: Once you have evaluated the impact of the out-of-sync objects, you can
then decide whether or not to re-generate the schema (typically, external
components which interact with the Hub are written to work with a specific
version of the generated schema). If you regenerate the schema, these
external components may no longer work.

If the JMS Event Schema Manager returns any out-of-sync objects, click
Generate and Deploy ORS-specific Schema to re-generate the event
schema. For more information, see "Generating and Deploying ORS-specific
Schemas" on page 617.

Auto-searching for Out-of-Sync Objects

You can configure Informatica MDM Hub to periodically search for out-of-sync
objects and re-generate the schema as needed. This auto-poll feature
operates within the data change monitoring thread, which runs at a specified
interval (in milliseconds) between polls. You specify this
time frame using the Message Check Interval in the Message Queues tool.
When the monitoring thread is active, this automatic service first checks if the
out-of-sync interval has elapsed and if so, performs the out-of-sync check and
then re-generates the event schema as needed.

To configure the Hub to periodically search for out-of-sync objects:


1. Set the logical name of the schema to be generated in the JMS Event
Schema Manager.
For more information, see "Generating and Deploying ORS-specific
Schemas" on page 617.

Note: If you bypass this step, the Hub issues a warning in the server log
asking you to configure the schema generation.
2. Enable the Queue Status for Data Changes Monitoring message. For more
information, see "Configuring Global Message Queue Settings" on page
451.
3. Select the root node Message Queues and set the Out of sync check
interval (milliseconds). For more information, see "Configuring Global
Message Queue Settings" on page 451.
Since the out-of-sync auto-poll feature effectively depends on the Message
check interval, you should set the Out-of-sync check interval to a value
greater than or equal to that of the Message check interval.
Note: You can disable the out-of-sync check by setting the out-of-sync
check interval to 0.

Chapter 20: Setting Up Security

This chapter describes how to set up security for your Informatica MDM Hub
implementation using the Hub Console. To learn how to configure user access
to the Hub Console, see "About User Access to Hub Console Tools" on page
737.

To learn more about configuring security using the Services Integration


Framework (SIF) instead, see the Informatica MDM Hub Services Integration
Framework Guide.

Chapter Contents
• "About Setting Up Security" on page 621
• "Securing Informatica MDM Hub Resources" on page 629
• "Configuring Roles" on page 638
• "Configuring Informatica MDM Hub Users" on page 646
• "Configuring User Groups" on page 658
• "Assigning Users to the Current ORS Database" on page 661
• "Assigning Roles to Users and User Groups" on page 662
• "Managing Security Providers" on page 664

About Setting Up Security


This section provides an overview of and an introduction to Informatica MDM
Hub security.

Note: Before you begin, you must have:


• installed Informatica MDM Hub and created the Hub Store according to the
instructions in the Informatica MDM Hub Installation Guide
• built the schema; for more information, see "About the Schema" on page
73

Informatica MDM Hub Security Concepts


Security is the ability to protect information privacy, confidentiality, and data
integrity by guarding against unauthorized access to, or tampering with, data
and other resources in your Informatica MDM Hub implementation.

Before setting up security for your Informatica MDM Hub implementation, it is
important for you to understand some key concepts.

Security Access Manager

Informatica MDM Hub Security Access Manager (SAM) is Informatica’s
comprehensive security framework for protecting Informatica MDM Hub
resources from unauthorized access. At run time, SAM enforces your
organization’s security policy decisions for your Informatica MDM Hub
implementation, handling user authentication and access authorization
according to your security configuration.

Note: SAM security applies primarily to users of third-party applications who
want to gain access to Informatica MDM Hub resources. SAM applies only
tangentially to Hub Console users. The Hub Console has its own security
mechanisms to authenticate users and authorize access to Hub Console tools
and resources.

Authentication

Authentication is the process of verifying the identity of a user to ensure that
they are who they claim to be. A user is an individual who wants to access
Informatica MDM Hub resources (see "Configuring Informatica MDM Hub
Users" on page 646). In Informatica MDM Hub, users are authenticated based
on their supplied credentials—user name / password, security payload, or a
combination of both.

Informatica MDM Hub supports the following types of authentication:


Internal: Informatica MDM Hub’s authentication mechanism, in which the
user logs in with a user name and password (see "Starting the Hub Console"
on page 30).

External Directory: User authentication using an external user directory,
with native support for LDAP-enabled directory servers, Microsoft Active
Directory, and Kerberos (see "External User Directory" on page 625).

External Authentication Providers: User authentication using third-party
authentication providers (see "Managing Security Providers" on page 664).
When configuring user accounts, you designate externally-authenticated
users by checking (selecting) the Use external authentication? check box,
as described in "Using External Authentication" on page 650.

Informatica MDM Hub implementations can use each type of authentication
exclusively, or they can use a combination of them. The type of authentication
used in your Informatica MDM Hub implementation depends on how you
configure security, as described in "Security Implementation Scenarios" on
page 625.

Authorization

Authorization is the process of determining whether a user has sufficient
privileges to access a requested Informatica MDM Hub resource.

Informatica MDM Hub provides two types of authorization:


• Internal: Informatica MDM Hub’s internal authorization mechanism, in
which a user’s access to secure resources is determined by the privileges
associated with any roles that are assigned to their user account.
• External: Authorization using third-party authorization providers (see
"Managing Security Providers" on page 664)

Informatica MDM Hub implementations can use either type of authorization
exclusively, or they can use a combination of both. The type of authorization
used in your Informatica MDM Hub implementation depends on how you
configure security, as described in "Security Implementation Scenarios" on
page 625.

Secure Resources and Privileges

Informatica MDM Hub provides general types of resources that you can
configure to be secure resources: base objects, mappings, packages, remote
packages, cleanse functions, match rule sets, batch groups, metadata, content
metadata, Metadata Manager, HM profiles, the audit table, and the users
table. You can configure security for these resources in a highly granular way,
granting access to Informatica MDM Hub resources according to various
privileges (read, create, update, merge, and execute). Resources are either
PRIVATE (the default) or SECURE. Privileges can be granted only to secure
resources. To learn more see "Securing Informatica MDM Hub Resources" on
page 629.

Roles

In Informatica MDM Hub, resource privileges are allocated to roles. A role
represents a set of privileges to access secure Informatica MDM Hub
resources (see "Configuring Roles" on page 638). Users and user groups are
assigned to roles. A user’s resource privileges are determined by the roles to
which they are assigned, as well as by the roles assigned to the user group(s)
to which the user belongs. Security Access Manager enforces resource
authorization for requests from external application users. Administrators and
data stewards who use the Hub Console to access Informatica MDM Hub
resources are less directly affected by resource privileges (see "Privileges" on
page 631).

Access to Hub Console Tools

For users who will be using the Hub Console to access Informatica MDM Hub
resources, you can use the Tool Access tool in the Configuration workbench to
control access privileges to Hub Console tools. For example, data stewards
typically have access to only the Data Manager and Merge Manager tools. For
more information, see "About User Access to Hub Console Tools" on page 737.

How Users, Roles, Privileges, and Resources Are Related
The following diagram shows how users, roles, privileges, and resources are
related in Informatica MDM Hub’s internal security framework.

When configuring security in Informatica MDM Hub:


• a specific resource is configured to be secure (not private).

• a specific role is configured to have access to one or more secure
resources.
• each secure resource is configured with specific privileges (READ, WRITE,
CREATE, and so on) that define that role’s access to the secure resource.
• a user is assigned one or more roles.

At run time, in order to execute a SIF request, the logged-in user must be
assigned a role that has the required privilege(s) to access the resource(s)
involved with the request. Otherwise, the user’s request will be denied.

Security Implementation Scenarios


This section describes a range of high-level scenarios in which security can be
configured in Informatica MDM Hub implementations. Policy decision points
(PDPs) are specific security check points that determine, at run-time, the
validity of a user’s identity (authentication), along with that user’s access to
Informatica MDM Hub resources (authorization). These scenarios vary in the
degree to which PDPs are handled internally by Informatica MDM Hub or
externally by third-party security providers or other security services.

Internal-only PDP

The following figure shows a security deployment in which all PDPs are
handled internally by Informatica MDM Hub.

In this scenario, Informatica MDM Hub makes all policy decisions based on
how users, groups, roles, privileges, and resources are configured using the
Hub Console.

External User Directory

The following figure shows a security deployment in which Informatica MDM
Hub integrates with an external directory.

In this scenario, the external user directory manages user accounts, groups,
and user profiles. The external user directory is able to authenticate users and
provide information to Informatica MDM Hub about group membership and
user profile information.

Users or user groups that are maintained in the external user directory must
still be registered in Informatica MDM Hub. Registration is required so that
Informatica MDM Hub roles—and their associated privileges—can be assigned
to these users and groups.

Roles-based Centralized PDP

The following figure shows a security deployment where role assignment, in
addition to user accounts, groups, and user profiles, is handled externally to
Informatica MDM Hub.

In this scenario, external roles are explicitly mapped to Informatica MDM Hub
roles.

Comprehensive Centralized PDP

The following figure shows a security deployment in which role definition and
privilege assignment—in addition to user accounts, groups, user profiles, and
role assignment—is handled externally to Informatica MDM Hub.

In this scenario, Informatica MDM Hub simply exposes the protected
resources using external proxies, which are synchronized with the internally-
protected resources using SIF requests (RegisterUsers, UnregisterUsers, and
ListSiperianObjects). All policy decisions are external to Informatica MDM
Hub.

Summary of Security Configuration Tasks


To configure security for an Informatica MDM Hub implementation using
Informatica MDM Hub’s internal security framework, you complete the
following minimal tasks using tools in the Hub Console:
1. Define global password policies for all users according to your
organization’s security policies and procedures. For instructions on using
the Users tool to define global password policies, see "Managing the Global
Password Policy" on page 654.
2. Add user accounts for your users. For instructions on using the Users tool
to configure user accounts, see "Configuring Users" on page 648.
3. Provide users with access to the database(s) they need to use. For
instructions on using the Users tool to provide database access, see
"Configuring User Access to ORS Databases" on page 653.
4. Optionally, configure user groups and assign users to them, if applicable.
For instructions on using the Users and Groups tool to configure user
groups, see "Configuring User Groups" on page 658.
5. Configure secure Informatica MDM Hub resources and (optionally)
resource groups. For instructions on using the Secure Resources tool to
configure resources and resource groups, see "Setting the Status of a
Informatica MDM Hub Resource" on page 634.
6. Define roles and assign resource privileges to roles. For instructions on
using the Roles tool to configure roles, see "Configuring Roles" on page
638.
7. Assign roles to users and (optionally) user groups. For instructions on
using the Users and Groups tool to assign roles, see "Assigning Roles to
Users and User Groups" on page 662.

8. For non-administrator users who will interact with Informatica MDM Hub
using the Hub Console, provide them with access to the Hub Console tools
that they will need to use, as described in "Configuring User Access to ORS
Databases" on page 653. For example, data stewards typically need
access to the Merge Manager and Data Manager tools (which are described
in the Informatica MDM Hub Data Steward Guide).

If you are using external security providers instead to handle any portion of
security in your Informatica MDM Hub implementation, you must configure
them in the Hub Console, as described in "Managing Security Providers" on
page 664.

Configuration Tasks For Security Scenarios


The following table shows the security configuration tasks that pertain to each
of the scenarios described in "Security Implementation Scenarios" on page
625. If a scenario is not listed for a task below, the associated task is
handled externally to Informatica MDM Hub.

Users and Groups
• "Configuring Informatica MDM Hub Users" on page 646: Internal-only PDP;
External User Directory
• "Using External Authentication" on page 650: External User Directory
• "Assigning Users to the Current ORS Database" on page 661: Internal-only
PDP; External User Directory
• "Managing the Global Password Policy" on page 654: Internal-only PDP
• "Configuring User Groups" on page 658: Internal-only PDP; External User
Directory

Secure Resources
• "Securing Informatica MDM Hub Resources" on page 629: all four scenarios
• "Setting the Status of a Informatica MDM Hub Resource" on page 634: all
four scenarios

Roles
• "Configuring Roles" on page 638: Internal-only PDP; External User
Directory; Roles-based Centralized PDP
• "Mapping Internal Roles to External Roles" on page 641: Roles-based
Centralized PDP
• "Assigning Resource Privileges to Roles" on page 641: Internal-only PDP;
External User Directory; Roles-based Centralized PDP

Security Providers
• "Managing Security Providers" on page 664: External User Directory;
Roles-based Centralized PDP; Comprehensive Centralized PDP

Role Assignment
• "Assigning Roles to Users and User Groups" on page 662: Internal-only
PDP; External User Directory

Note: This document describes how to configure Informatica MDM Hub’s
internal security framework using the Hub Console. If you are using third-
party security providers to handle any portion of security in your Informatica
MDM Hub implementation, refer to your security provider’s configuration
instructions instead.

Securing Informatica MDM Hub Resources


This section describes how to configure Informatica MDM Hub resources for
your Informatica MDM Hub implementation. The instructions in this section
apply to all scenarios described in "Security Implementation Scenarios" on
page 625.

About Informatica MDM Hub Resources


The Hub Console allows you to expose or hide Informatica MDM Hub resources
to external applications.

Types of Informatica MDM Hub Resources

The following types of Informatica MDM Hub resources can be configured as
secure resources:

BASE_OBJECT: User has access to all secure base objects, columns, and
content metadata. For details, see "Configuring Base Objects" on page 82.

CLEANSE_FUNCTION: User can execute all secure cleanse functions. For
details, see "Using Cleanse Functions" on page 314.

HM_PROFILE: User has access to all secure HM Profiles. For details, see
"Deleting Relationship Types from a Profile" on page 215.

MAPPING: User has access to all secure mappings and their columns. For
details, see "Mapping Columns Between Landing and Staging Tables" on page
286.

PACKAGE: User has access to all secure packages and their columns. For
details, see "Configuring Packages" on page 151.

REMOTE PACKAGE: User has access to all secure remote packages.
Applicable only to Informatica MDM Hub implementations with an Activity
Manager license.

In addition, the Hub Console allows you to protect other resources that are
accessible by SIF requests, including content metadata, match rule sets,
metadata, batch groups, validate metadata, the audit table, and the users
table.

Secure and Private Resources

A protected Informatica MDM Hub resource can be configured as either secure
or private.

SECURE: Exposes this Informatica MDM Hub resource to the Roles tool,
allowing the resource to be added to roles with specific privileges. When a
user account is assigned to a specific role, that user account is authorized
to access the secure resources using SIF requests according to the privileges
associated with that role.

PRIVATE (default): Hides this Informatica MDM Hub resource from the Roles
tool. Prevents access using SIF requests. When you add a new resource in
the Hub Console (such as a new base object), it is designated a PRIVATE
resource by default.

In order for external applications to access an Informatica MDM Hub resource
using SIF requests, that resource must be configured as SECURE. Because all
Informatica MDM Hub resources are PRIVATE by default, you must explicitly
make a resource SECURE after the resource has been added.

There are certain Informatica MDM Hub resources that you might not want to
expose to external applications. For example, your Informatica MDM Hub
implementation might have mappings or packages that are used only in batch
jobs (not in SIF requests), so these could remain private.

Note: Package columns are not considered to be secure resources. They
inherit the secure status and privileges from the parent base object columns.
If package columns are based on system table columns (that is, C_REPOS_
AUDIT), or columns of tables that are not based on the base object (that is,
landing tables), there is no need to set up security for them, since they are
accessible by default.

Privileges

With Informatica MDM Hub internal authorization, each role is assigned one of
the following privileges.
READ: View but not change data.
CREATE: Create data records in the Hub Store.
UPDATE: Update data records in the Hub Store.
MERGE: Merge and unmerge data.
EXECUTE: Execute cleanse functions (see "Using Cleanse Functions" on page
314) and batch groups (see "Running Batch Jobs Using the Batch Group
Tool" on page 512).

Privileges determine the access that external application users have to
Informatica MDM Hub resources. For example, a role might be configured to
have READ, CREATE, UPDATE, and MERGE privileges on particular packages.

Note: Each privilege is distinct and must be explicitly assigned. Privileges do
not aggregate other privileges. For example, having UPDATE access to a
resource does not automatically give you READ access to it as well; both
privileges must be individually assigned.

These privileges are not enforced when using the Hub Console, although the
settings still affect the use of Hub Console to some degree. For example, the
only packages that data stewards can see in the Merge Manager and Data
Manager tools are those packages to which they have READ privileges. In
order for data stewards to edit and save changes to data in a particular
package, they must have UPDATE and CREATE privileges to that package (and
associated columns). If they do not have UPDATE or CREATE privileges, then
any attempts to change the data in the Data Manager will fail. Similarly, a
data steward must have MERGE privileges to merge or unmerge records using
the Merge Manager. To learn more about the Merge Manager and Data
Manager tools, see the Informatica MDM Hub Data Steward Guide.

Resource Groups

A resource group is a logical collection of secure resources. Using the Secure
Resources tool, you can define resource groups, and then assign related
resources to them. Resource groups simplify privilege assignment, allowing
you to assign privileges to multiple resources at once and to easily assign
resource groups to a role.

Resource Group Hierarchies

A resource group can also contain other resource groups, except itself or any
resource group to which it belongs, allowing you to build a hierarchy of
resource groups and to simplify the management of a large collection of
resources.

SECURE Resources Only

Only SECURE resources can belong to resource groups; PRIVATE resources
cannot. If you change the status of a resource to PRIVATE, then the resource
is automatically removed from any resource groups to which it belongs. When
the status of a resource is set to SECURE, the resource is added automatically
to the appropriate resource group (ALL_* resource groups by object type,
which are visible in the Roles tool).

Guidelines for Defining Resource Groups

To simplify administration, consider the implications of creating the following
kinds of resource groups:
• Define an ALL_RESOURCES resource group that contains all secure
resources, which allows you to set minimal privileges globally.
• Define resource groups by resource type (such as PACKAGES_READ) so
that you can easily set minimal privileges to those kinds of resources.
• Define resource groups by functional area (such as TEST_ONLY or
TRAINING_RESOURCES).
• Define a catch-all resource group that can be assigned to many different
roles that have similar privileges.

About the Secure Resources Tool


You use the Secure Resources tool in the Hub Console to manage security for
Informatica MDM Hub resources in a highly granular manner, including setting
the status (secure or private) of any Informatica MDM Hub resource, and
configuring a hierarchy of resources using resource groups. The Secure
Resources tool allows you to expose resources to, or hide resources from, the
Roles tool and SIF requests. To use the tool, you must be connected to an
ORS.

Starting the Secure Resources Tool


To start the Secure Resources tool:
• In the Hub Console, expand the Security Access Manager workbench, and
then click Secure Resources.

The Hub Console displays the Secure Resources tool.

The Secure Resources tool contains the following tabs:

Resources: Used to set the status of individual Informatica MDM Hub
resources (SECURE or PRIVATE). Informatica MDM Hub resources are
organized in a hierarchy that shows the relationships among resources.
Global resources appear at the top of the hierarchy. For details, see
"Configuring Resources" on page 633.

Resource Groups: Used to configure resource groups. For details, see
"Configuring Resource Groups" on page 635.

Configuring Resources
Use the Resources tab in the Secure Resources tool to browse and configure
Informatica MDM Hub resources.

Navigating the Resources Tree

Resources are organized hierarchically in the navigation tree by resource
type.

To expand the hierarchy:


• Click the plus (+) sign next to a resource type or resource.
OR

• Click to expand the entire tree (if you have acquired a write lock).

To hide resources beneath a resource type:


• Click the minus (-) sign next to its resource type.
OR

• Click to collapse the entire tree (if you have acquired a write lock).

Setting the Status of a Informatica MDM Hub Resource

You can configure the resource status (SECURE or PRIVATE) for any concrete
Informatica MDM Hub resource.

Note: This status setting does not apply to resource groups (which contain
only SECURE resources) or to global resources (for example,
BASE_OBJECT.*); it applies only to the resources that they contain.
To set the status of one or more Informatica MDM Hub resources:


1. Start the Secure Resources tool. For more information, see "Starting the
Secure Resources Tool" on page 632.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. On the Resources tab, navigate the Resources tree to find the resource(s)
that you want to configure.
4. Do one of the following:
• Double-click the resource name to toggle between SECURE or
PRIVATE.
OR
• Select one or more resource name(s) (hold down the CTRL key to
select multiple resources at a time) and:

• Click to make all selected resources secure.


OR

• Click to make all selected resources private.

5. Click the Save button to save your changes.

Filtering Resources

To simplify changing the status of a collection of Informatica MDM Hub
resources, especially for an implementation with a large and complex
schema, you can specify a filter that displays only the resources that you want
to change.

To filter Informatica MDM Hub resources:


1. Start the Secure Resources tool. For more information, see "Starting the
Secure Resources Tool" on page 632.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.

3. Click the Filter Resources button.

The Secure Resources tool displays the Resources Filter dialog.

4. Do the following:
• Check (select) the resource type(s) that you want to include in the
filter.
• Uncheck (clear) the resource type(s) that you want to exclude from the
filter.
5. Click OK.
The Secure Resources tool displays the filtered Resources tree.

Configuring Resource Groups


As described in "Resource Groups" on page 631, you can use the Secure
Resources tool to define resources groups and create a hierarchy of
resources. You can then use the Roles tool to assign privileges to multiple
resources in a single operation.

Direct and Indirect Membership

The Secure Resources tool differentiates visually between resources that
belong directly to the current resource group (explicitly added) and resources
that belong indirectly because they are members of a resource group that
belongs to this resource group (implicitly added). For example, suppose you
have two resource groups:
• Resource Group A contains the Consumer base object, which means that
the Consumer base object is a direct member of Resource Group A
• Resource Group B contains the Address base object
• Resource Group A contains Resource Group B, which means that the
Address base object is an indirect member of Resource Group A

While editing Resource Group A, the Address base object is slightly grayed.

In this example, you cannot change the check box for the Address base object
when you are editing Resource Group A. You can change the check box only
when editing Resource Group B.

Adding Resource Groups

To add a resource group:


1. Start the Secure Resources tool. For more information, see "Starting the
Secure Resources Tool" on page 632.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.

3. Click the Resource Groups tab.
The Secure Resources tool displays the Resource Group tab.

4. Click the Add button.


The Secure Resources tool displays the Add Resources to Resource Group
dialog.

5. Enter a unique, descriptive name for the resource group.


6. Click the plus (+) sign to expand the resource hierarchy as needed.
Each resource has a check box indicating membership in the resource
group. If a parent in the tree is selected, all its children are automatically
selected as well. For example, if the Base Objects item in the tree is
selected, then all base objects and their child resources are selected.
7. Check (select) the resource(s) that you want to assign to this resource
group.
8. Click OK.
The Secure Resources tool adds the new resource to the Resource Groups
node.

Editing Resource Groups

To edit a resource group:


1. Start the Secure Resources tool. For more information, see "Starting the
Secure Resources Tool" on page 632.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Resource Groups tab.
4. Select the resource group whose properties you want to edit.

5. Click the Edit button.

The Secure Resources tool displays the Assign Resources to Resource
Group dialog.

6. Edit the resource group name, if you want.


7. Click the plus (+) sign to expand the resource hierarchy as needed.
8. Check (select) the Show Only Resources Selected for this Resource
Group check box, if you want.
9. Check (select) the resources that you want to assign to this resource
group.
10. Uncheck (clear) the resources that you want to remove from this resource
group.
11. Click OK.

Deleting Resource Groups

To delete a resource group:


1. Start the Secure Resources tool. For more information, see "Starting the
Secure Resources Tool" on page 632.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Resource Groups tab.
4. Select the resource group that you want to remove.

5. Click the Remove button.


The Secure Resources tool prompts you to confirm deletion.
6. Click Yes.
The Secure Resources tool removes the deleted resource from the
Resource Groups node.

Refreshing the Resources List


If a resource such as a package or mapping has been recently added, be sure
to refresh the resources list to ensure that you can make it secure.

To refresh the Resources list:


• From the Secure Resources menu, choose Refresh.

The Secure Resources tool updates the Resources list.

Refreshing Other Security Changes


You can also change the refresh interval for all other security changes.

To set the refresh rate for security changes:
• Set the following parameter in the cmxserver.properties file:
cmx.server.sam.cache.resources.refresh_interval

Note: The default refresh interval is 5 minutes if not set.
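
A minimal sketch of the setting in cmxserver.properties; the unit of the
value is an assumption (this guide states only that the default interval is 5
minutes):

cmx.server.sam.cache.resources.refresh_interval=10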

Configuring Roles
This section describes how to configure roles for your Informatica MDM Hub
implementation.

Note: If you are using a Comprehensive Centralized PDP security deployment
(see "Comprehensive Centralized PDP" on page 626), in which users are
authorized externally, you can skip this section if your external authorization
provider does not require you to define roles in Informatica MDM Hub.

About Roles
In Informatica MDM Hub, a role represents a set of privileges to access secure
Informatica MDM Hub resources. In order for a user to view or manipulate a
secure Informatica MDM Hub resource, that user must be assigned a role that
grants them sufficient privileges to access the resource. Roles determine what
a user is authorized to access and do in Informatica MDM Hub. For more
information, see "Authorization" on page 623 and "Privileges" on page 631.

Informatica MDM Hub roles are highly granular and flexible, allowing
administrators to implement complex security safeguards according to your
organization’s unique security policies, procedures, and requirements. Some
users might be assigned to a single role with access to everything (such as an
administrator) or with explicitly-restricted privileges (such as a data
steward), while others might be assigned to multiple roles of varying
privileges.

A role can also have other roles assigned to it, thereby inheriting the access
privileges configured for those roles. Privileges are additive, meaning that,
when roles are combined, their privileges are combined as well. For example,
suppose Role A has READ privileges to an Address base object, and Role B has
CREATE and UPDATE privileges to it. If a user account is assigned Role A and
Role B, then that user account will have READ, CREATE, and UPDATE privileges
to the Address base object. A user account inherits the privileges configured
for any role to which the user account is assigned.

Resource privileges vary depending on the scope of access that is required for
users to do their jobs—ranging from broad and deep access (for example,
super-user administrators) to very narrow, focused access (for example,

READ privileges on one base object). It is generally recommended that you
follow the principle of least privilege: users should be assigned the smallest
set of privileges needed to do their work.

Because Informatica MDM Hub provides you with the ability to vary resource
privileges per role, and because resource privileges are additive, you can
define roles in a highly-granular manner for your Informatica MDM Hub
implementation. For example, you could define separate roles to provide
different access levels to human resources data (such as HRAppReadOnly,
HRAppCreateOnly, and HRAppUpdateOnly), and then combine them into
another aggregate role (such as HRAppAll). You would then assign to various
users just the role(s) that are appropriate for their job function.

Starting the Roles Tool


You use the Roles tool in the Security Access Manager workbench to configure
roles and assign access privileges to Informatica MDM Hub resources.

To start the Roles tool:


• In the Hub Console, expand the Security Access Manager workbench, and
then click Roles.
The Hub Console displays the Roles tool.

The Roles tool contains the following tabs:

Resource Privileges: Used to assign resource privileges to roles. For details,
see "Assigning Resource Privileges to Roles" on page 641.

Roles: Used to assign roles to other roles. For details, see "Assigning Roles
to Other Roles" on page 643.

Report: Used to generate a distilled report of resource privileges granted to
a given role. For details, see "Generating a Report of Resource Privileges for
Roles" on page 644.

Adding Roles
To add a new role:
1. Start the Roles tool. For more information, see "Starting the Roles Tool" on
page 639.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Point anywhere in the navigation pane, right-click the mouse, and choose
Add Role.
The Roles tool displays the Add Role dialog.

4. Specify the following information.

Name: Name of this role. Enter a unique, descriptive name.
Description: Optional description of this role.
External Name: External name (alias) of this role. For more information, see
"Mapping Internal Roles to External Roles" on page 641.
5. Click OK.
The Roles tool adds the new role to the roles list.

Editing Roles
To edit an existing role:
1. Start the Roles tool. For more information, see "Starting the Roles Tool" on
page 639.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Scroll the roles list and select the role that you want to edit.

4. For each property that you want to edit, click the Edit button next to it,
and specify the new value.

5. Click the Save button to save your changes.

Editing Resource Privileges

You can also assign and edit resource privileges for roles. For more
information, see "Assigning Resource Privileges to Roles" on page 641.

Inheriting Privileges

You can also edit the privileges for a specific role to inherit privileges from
other roles. For more information, see "Assigning Roles to Other Roles" on
page 643.

Deleting Roles
To delete an existing role:
1. Start the Roles tool. For more information, see "Starting the Roles Tool" on
page 639.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Scroll the roles list and select the role that you want to delete.
4. Point anywhere in the navigation pane, right-click the mouse, and choose
Delete Role.
The Roles tool prompts you to confirm deletion.
5. Click Yes.
The Roles tool removes the deleted role from the roles list.

Mapping Internal Roles to External Roles


For the Roles-Based Centralized PDP scenario (see "Roles-based Centralized
PDP" on page 626), you need to create a mapping (alias) between the
Informatica MDM Hub internal role and the external role that is managed
separately from Informatica MDM Hub. The external role name used by an
organization (for example, APAppsUser) might be very different from an
internal role name (such as VendorReadOnly) that makes sense in the context
of a Informatica MDM Hub environment.

Configuration details depend on the role mapping implementation of the
security provider. Role mapping is done within a configuration (XML) file. It is
possible to map one external role to more than one internal role.

Note: There is no predefined format for a configuration file. It might not be an
XML file or even a file at all. The mapping is a part of the custom user profile
or authentication provider implementation. The purpose of the mapping is to
populate a user profile object roles list with internal role IDs (rowids).

Assigning Resource Privileges to Roles


To assign resource privileges to a role:
1. Start the Roles tool. For more information, see "Starting the Roles Tool" on
page 639.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Scroll the roles list and select the role for which you want to assign
resource privileges.

4. Click the Resource Privileges tab.
The Roles tool displays the Resource Privileges tab.

This tab contains the following columns:

Resources: Hierarchy of secure Informatica MDM Hub resources. Displays
only those Informatica MDM Hub resources whose status has been set to
SECURE in the Secure Resources tool. For more information, see "Setting the
Status of a Informatica MDM Hub Resource" on page 634.

Privileges: Privileges to assign to secure resources. For more information,
see "Privileges" on page 631.
5. Expand the Resources hierarchy to show the secure resources that you
want to configure for this role.

6. For each resource that you want to configure:


• Check (select) any privilege that you want to grant to this role.
• Uncheck (clear) any privilege that you want to remove from this role.

7. Click the Save button to save your changes.

Assigning Roles to Other Roles


A role can also inherit other roles, except itself or any role to which it belongs.
For example, if you assign Role B to Role A, then Role A inherits Role B’s
access privileges. For more information, see "About Roles" on page 638.
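
The following minimal Java sketch illustrates the inheritance semantics described above: a role's effective privileges are the union of its own privileges and the privileges of the roles assigned to it. The Role class and the privilege names are illustrative only; they are not part of the Informatica MDM Hub API.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative model of role-to-role privilege inheritance.
class Role {
    final String name;
    final Set<String> ownPrivileges = new HashSet<>();
    final List<Role> inheritedRoles = new ArrayList<>();

    Role(String name) { this.name = name; }

    // Effective privileges: own privileges plus everything inherited.
    // The recursion is safe because a role can never inherit itself or
    // a role to which it belongs, so the role graph stays acyclic.
    Set<String> effectivePrivileges() {
        Set<String> all = new HashSet<>(ownPrivileges);
        for (Role role : inheritedRoles) {
            all.addAll(role.effectivePrivileges());
        }
        return all;
    }

    public static void main(String[] args) {
        Role roleB = new Role("RoleB");
        roleB.ownPrivileges.add("READ");
        Role roleA = new Role("RoleA");
        roleA.ownPrivileges.add("CREATE");
        roleA.inheritedRoles.add(roleB); // assign Role B to Role A
        System.out.println(roleA.effectivePrivileges()); // [CREATE, READ]
    }
}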

To assign roles to a role:


1. Start the Roles tool. For more information, see "Starting the Roles Tool" on
page 639.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Scroll the roles list and select the role to which you want to assign other
roles.
4. Click the Roles tab.
The Roles tool displays the Roles tab.

The Roles tool displays any role(s) that can be assigned to the selected
role.
5. Check (select) any role that you want to assign to the selected role.
6. Uncheck (clear) any role that you want to remove from this role.

7. Click the Save button to save your changes.

Generating a Report of Resource Privileges for Roles
You can generate a report that describes only the resource privileges granted
to a given role.

To generate a report of resource privileges for a role:


1. Start the Roles tool. For more information, see "Starting the Roles Tool" on
page 639.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Scroll the roles list and select the role for which you want to generate a
report.
4. Click the Report tab.
The Roles tool displays the Report tab.

5. Click Generate. The Roles tool generates the report and displays it on the
tab.

Clearing the Report Window

To clear the report window:


• Click Clear.

Saving the Generated Report as an HTML File

To save a generated report as an HTML file:


1. Click Save.
The Roles tool prompts you to specify the target location for the saved
report.

2. Navigate to the target location.


3. Click Save.

The Security Access Manager saves the report using the following naming
convention:
<ORS_Name>-<Role_Name>-RolePrivilegeReport.html
where:
• ORS_Name—Name of the target database.
• Role_Name—Role associated with the generated report.
The Roles tool saves the current report as an HTML file in the target
location. You can subsequently display this report using a browser.

Configuring Informatica MDM Hub Users


This section describes how to configure users for your Informatica MDM Hub
implementation. Whenever Informatica MDM Hub internal authorization is
involved in an implementation, users must be registered in the Master
Database.

Before You Begin


Depending on how you have deployed security (see "Security Implementation
Scenarios" on page 625), your Informatica MDM Hub implementation might or
might not require that you add users to the Master Database.

You must configure users in the Master Database if:


• you are using Informatica MDM Hub’s internal authorization (see "Internal-only PDP" on page 625)
• you are using Informatica MDM Hub’s external authorization (see "External User Directory" on page 625)
• multiple users will run the Hub Console using different accounts (for example, administrators and data stewards).

About Configuring Informatica MDM Hub Users


This section provides an overview of configuring Informatica MDM Hub users.
In Informatica MDM Hub, a user is an individual who can access Informatica
MDM Hub resources. For an introduction to Informatica MDM Hub users, see
the Informatica MDM Hub Overview.

How Users Access Informatica MDM Hub Resources

Users can access Informatica MDM Hub resources in the following ways:


• Hub Console: Users who interact with Informatica MDM Hub by logging into the Hub Console and using the tool(s) to which they have access, such as administrators and data stewards.
• Third-Party Applications: Users (called external application users) who interact with Informatica MDM Hub data indirectly using third-party applications that use SIF classes. These users never log into the Hub Console; they log into Informatica MDM Hub using the applications that they use to invoke SIF classes. To learn more about the kinds of SIF requests that developers can invoke, see the Informatica MDM Hub Services Integration Framework Guide.

User Accounts

Users are represented in Informatica MDM Hub by user accounts, which are
defined in the master database in the Hub Store. You use the Users tool in the
Configuration workbench to define and configure user accounts for
Informatica MDM Hub users, as well as to change passwords and enable
external authentication. External applications with sufficient authorization can
also register user accounts using SIF requests, as described in the Informatica
MDM Hub Services Integration Framework Guide. A user needs to be defined
only once, even if the same user will access more than one ORS associated
with the Master Database.

A user account gains access to Informatica MDM Hub resources using the
role(s) assigned to it, inheriting the privileges configured for each role, as
described in "About Roles" on page 638.

Informatica MDM Hub allows for multiple concurrent SIF requests from the same user account. For an external application in which granular auditing and user tracking are not required, multiple users can use the same user account when submitting SIF requests.

Starting the Users Tool
To start the Users tool:
1. In the Hub Console, connect to the master database, if you have not
already done so.
2. Expand the Configuration workbench and click Users.
The Hub Console displays the Users tool.

The Users tool contains the following tabs:


• Users: Displays a list of all users that have been defined, except the default admin user (which is created when Informatica MDM Hub is installed). For more information, see "Configuring Users" on page 648.
• Target Database: Assign users to target databases. For more information, see "Configuring User Access to ORS Databases" on page 653.
• Global Password Policy: Specify global password policies. For more information, see "Managing the Global Password Policy" on page 654.

Configuring Users
This section describes how to configure users in the Users tool. It refers to
functionality that is available on the Users tab of the Users tool.

Adding User Accounts

To add a user account:


1. Start the Users tool. For more information, see "Starting the Users Tool"
on page 648.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Users tab.

4. Click the Add button.
The Users tool displays the Add User dialog.

5. Specify the following settings for this user.

• First name: First name for this user.
• Middle name: Middle name for this user.
• Last name: Last name for this user.
• User name: Name of the user account for this user. This is the name that the user enters to log into the Hub Console.
• Default database: Default database for this user. This is the database that is automatically selected when the user logs into the Hub Console, as described in "Starting the Hub Console" on page 30. If you want to change this database later, see "Configuring User Access to ORS Databases" on page 653.
• Password: Password for this user. If you want to change this password later, see "Changing Password Settings for User Accounts" on page 652.
• Verify password: Type the password again to verify.
• Use external authentication?: Check (select) this option to use external authentication through a third-party security provider instead of Informatica MDM Hub’s default authentication (for more information, see "Managing Security Providers" on page 664), or uncheck (clear) this option to use the default Informatica MDM Hub authentication.
6. Click OK.
The Users tool adds the new user to the list of users on the Users tab.

Editing User Accounts

For each user, you can update their name and default login database, and specify other settings, such as whether Informatica MDM Hub retains a log of user logins/logouts, whether they can log into Informatica MDM Hub, and whether they have administrator-level privileges.

To edit user account settings:


1. Start the Users tool. For more information, see "Starting the Users Tool"
on page 648.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Users tab.
4. Select the user account that you want to configure.

5. To change a name, double-click the cell and type a different name.


6. Select a different login database and server, if you want.
7. Change any of the following settings, if you want.

• Administrator: Check (select) this option to give this user administrative access, which allows them to have access to all Hub Console tools and all databases. Uncheck (clear) this option if you do not want to grant administrative access to this user. This is the default.
• Enable: Check (select) this option to activate this user account and allow this user to log in. Uncheck (clear) this option to disable this user account and prevent this user from logging in.

8. Click the Save button to save your changes.

Using External Authentication

When adding or editing a user account that will be authenticated externally, you need to check (select) the Use External Authentication check box. If unchecked (cleared), then Informatica MDM Hub’s default authentication will be used for this user account instead. For more information, see "Managing Security Providers" on page 664.

Editing Supplemental User Information

In Informatica MDM Hub implementations that are not tied to an external user
directory (see "External User Directory" on page 625), you can use
Informatica MDM Hub to manage supplemental information for each user,
such as their e-mail address and phone numbers. Informatica MDM Hub does
not require that you provide this information, nor does Informatica MDM Hub
use this information in any special way.

To edit supplemental user information:


1. Start the Users tool. For more information, see "Starting the Users Tool"
on page 648.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Users tab.
4. Select the user whose properties you want to edit.

5. Click the Edit button.


The Users tool displays the Edit User dialog.

6. Specify any of the following properties:

• Title: User’s title, such as Dr. or Ms. Click the drop-down list and select a title.
• Initials: User’s initials.
• Suffix: User’s suffix, such as MD or Jr.
• Job title: User’s job title.
• Email: User’s e-mail address.
• Telephone area code: Area code for user’s telephone number.
• Telephone number: User’s telephone number.
• Fax area code: Area code for user’s fax number.
• Fax number: User’s fax number.
• Mobile area code: Area code for user’s mobile phone.
• Mobile number: User’s mobile phone number.
• Login message: Message that the Hub Console displays after this user logs in.
7. Click OK.

8. Click the Save button to save your changes.

Deleting User Accounts

To remove a user:
1. Start the Users tool. For more information, see "Starting the Users Tool"
on page 648.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Users tab.
4. Select the user that you want to remove.

5. Click the Delete button.
The Users tool prompts you to confirm deletion.
6. Click Yes to confirm deletion.
The Users tool removes the deleted user account from the list of users on
the Users tab.

Changing Password Settings for User Accounts

To change password settings for a user:


1. Start the Users tool. For more information, see "Starting the Users Tool"
on page 648.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Users tab.
4. Select the user whose password you want to change.

5. Click the Change Password button.
The Users tool displays the Change Password dialog for the selected user.

6. Specify the new password in both the Password and Verify password fields, if you want.
7. Do one of the following:
• Check (select) the Use external authentication? option to use external authentication through a third-party security provider instead of Informatica MDM Hub’s default authentication. For more information, see "Managing Security Providers" on page 664.
• Uncheck (clear) the Use external authentication? option to use the default Informatica MDM Hub authentication.
8. Click OK.

Configuring User Access to ORS Databases


Once a user account is defined in Informatica MDM Hub, you need to explicitly
provide the account with access to one or more ORS databases.

To configure user access to databases:


1. Start the Users tool. For more information, see "Starting the Users Tool"
on page 648.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Target Database tab.
The Users tool displays the Target Database tab.

4. Expand each database node to see which users can access that database.
5. To change user assignments to a database, right-click on the database
name and choose Assign User.
The Users tool displays the Assign User to Database dialog.

6. Check (select) the names of any users that you want to assign to the
selected database.
7. Uncheck (clear) the names of any users that you want to unassign from
the selected database.
8. Click OK.

Configuring Password Policies


You can define password policies for all users (global password policy) as well
as for individual users (private password policies that override the global
password policy).

Managing the Global Password Policy

The global password policy applies to users who do not have private password
policies specified for them (as described in "Specifying Private Password
Policies for Individual Users" on page 656).

To manage the global password policy:


1. Start the Users tool. For more information, see "Starting the Users Tool"
on page 648.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Global Password Policy tab.
The Global Password Policy window is displayed.

4. Specify the following password policy settings.

• Password Length: Minimum and maximum password length, in characters.
• Password Expiry: Check (select) the Password Expires check box and specify the number of days before the password expires, or uncheck (clear) the Password Expires check box so that the password never expires.
• Login Settings: Number of grace logins and maximum number of failed logins.
• Password History: Number of times that a password can be re-used.
• Password Requirements: Other configuration settings, such as whether to enforce case-sensitivity, enforce password validation, enforce a minimum number of unique characters, and password patterns.

5. Click the Save button to save your global settings.

Specifying Private Password Policies for Individual Users

For any given user, you can specify a private password policy that overrides
the global password policy (see "Managing the Global Password Policy" on
page 654).

Note: For ease of password policy maintenance, it is recommended that, whenever possible, password policies be managed at the global policy level rather than at private policy levels.

To specify the private password policy for a user:


1. Start the Users tool. For more information, see "Starting the Users Tool"
on page 648.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Users tab.
4. Select the user for whom you want to set the private password policy.

5. Click the Private Password Policy button.


The Users tool displays the Private Password Policy window for the
selected user.

6. Check (select) Private password policy enabled.


7. Specify the password policy settings you want for this user, as described in
"Managing the Global Password Policy" on page 654.
8. Click OK.

9. Click the Save button to save your changes.

Configuring Secured JDBC Data Sources


In Informatica MDM Hub implementations, if a JDBC data source has been
secured using application server security, you need to store the application
server’s user name and password for the JDBC data source in the
cmxserver.properties file. Passwords must be encrypted—they cannot be
stored as clear text. To learn more about secured JDBC data sources, see your
application server documentation.

Configuring User Names and Passwords for a Secured JDBC Data Source

To configure user names and passwords for a secured JDBC data source in the
cmxserver.properties file, use the following parameters:
databaseId.username=username
databaseId.password=encryptedPassword

where databaseId is the unique identifier of the JDBC data source.

ORS Database ID for Oracle SID Connection Types

For an Oracle SID connection type, the databaseId consists of the following strings:
<database_hostname>-<Oracle_SID>-<schema_name>

For example, given the following settings:


• <database_hostname> = localhost
• <Oracle_SID> = MDMHUB
• <schema name> = Test_ORS

the username and password properties would be:


localhost-MDMHUB-Test_ORS.username=weblogic
localhost-MDMHUB-Test_ORS.password=9C03B113CD8E4BBFD236C56D5FEA56EB

ORS Database ID for Oracle Service Connection Types

For an Oracle Service connection type, the databaseId consists of the following
strings:
<service_name>-<schema_name>

For example, given the following settings:


• <service_name> = MDM_Service
• <schema name> = Test_ORS

the username and password properties would be:
MDM_Service-Test_ORS.username=weblogic
MDM_Service-Test_ORS.password=9C03B113CD8E4BBFD236C56D5FEA56EB

Database ID for the Master Database

If you want to secure the JDBC data source that accesses the Master
Database, the databaseId is CMX_SYSTEM. In this case, the properties would
be:
CMX_SYSTEM.username=weblogic
CMX_SYSTEM.password=9C03B113CD8E4BBFD236C56D5FEA56EB

Generating an Encrypted Password

To generate an encrypted password, use the following command:

C:\>java -cp siperian-common.jar com.siperian.common.security.Blowfish password
Plaintext Password: password
Encrypted Password: 9C03B113CD8E4BBFD236C56D5FEA56EB

Configuring User Groups


This section describes how to configure user groups in your Informatica MDM
Hub implementation.

About User Groups


A user group is a logical collection of user accounts. User groups simplify
security administration. For example, you can combine external application
users into a single user group, and then grant security privileges to the user
group rather than to each individual user. In addition to users, user groups can
contain other user groups. To learn about users and user accounts, see
"Configuring Informatica MDM Hub Users" on page 646.

You use the Groups tab in the Users and Groups tool in the Security Access Manager workbench to configure user groups and assign user accounts to user groups. To use the Users and Groups tool, you must be connected to an ORS.

Starting the Users and Groups Tool


To start the Users and Groups tool:
1. In the Hub Console, connect to an ORS, if you have not already done so.
2. Expand the Security Access Manager workbench and click Users and
Groups.

The Hub Console displays the Users and Groups tool.

The Users and Groups tool contains the following tabs:


• Groups: Used to define user groups and assign users to user groups. For more information, see "Configuring User Groups" on page 658.
• Users Assigned to Database: Used to associate user accounts with a database. For more information, see "Assigning Users to the Current ORS Database" on page 661.
• Assign Users/Groups to Role: Used to associate users and user groups with roles. For more information, see "Assigning Users and User Groups to Roles" on page 662.
• Assign Roles to User/Group: Used to associate roles with users and user groups. For more information, see "Assigning Roles to Users and User Groups" on page 663.

Adding User Groups


To add a user group:
1. Start the Users and Groups tool. For more information, see "Starting the
Users and Groups Tool" on page 658.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Groups tab.

4. Click the Add button.


The Users and Groups tool displays the Add User Group dialog.

5. Enter a descriptive name for the user group.
6. Optionally, enter a description of the user group.
7. Click OK.
The Users and Groups tool adds the new user group to the list.

Editing User Groups


To edit an existing user group:
1. Start the Users and Groups tool. For more information, see "Starting the
Users and Groups Tool" on page 658.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Groups tab.
4. Scroll the list of user groups and select the user group that you want to
edit.

5. For each property that you want to edit, click the Edit button next to it,
and specify the new value.

6. Click the Save button to save your changes.

Deleting User Groups
To delete a user group:
1. Start the Users and Groups tool. For more information, see "Starting the
Users and Groups Tool" on page 658.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Groups tab.
4. Scroll the list of user groups and select the user group that you want to
delete.

5. Click the Delete button.


The Users and Groups tool prompts you to confirm deletion.
6. Click Yes.
The Users and Groups tool removes the deleted user group from the list.

Assigning Users and User Groups to User Groups


To assign members (users and user groups) to a user group:
1. Start the Users and Groups tool. For more information, see "Starting the
Users and Groups Tool" on page 658.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Groups tab.
4. Scroll the list of user groups and select the user group that you want to edit.
5. Right-click the user group and choose Assign Users and Groups.
The Users and Groups tool displays the Assign to User Group dialog.

6. Check (select) the names of any users and user groups that you want to
assign to the selected user group.
7. Uncheck (clear) the names of any users and user groups that you want to
unassign from the selected user group.
8. Click OK.

Assigning Users to the Current ORS Database


This section describes how to assign users to the currently-targeted ORS
database. To assign user access to other ORS databases, see "Configuring User Access to ORS Databases" on page 653.

To assign users to the current ORS database:


1. Start the Users and Groups tool. For more information, see "Starting the
Users and Groups Tool" on page 658.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Users Assigned to Database tab.

4. Click to assign users to an ORS database.


The Users and Groups tool displays the Assign User to Database dialog.

5. Check (select) the names of any users that you want to assign to the
selected ORS database.
6. Uncheck (clear) the names of any users that you want to unassign from
the selected ORS database.
7. Click OK.

Assigning Roles to Users and User Groups


This section describes how to associate roles with users and user groups. The
Users and Groups tool provides two ways to define the association:
• assigning users and user groups to roles
• assigning roles to users and user groups

You can choose the way that is most expedient for your implementation.

Assigning Users and User Groups to Roles


To assign users and user groups to a role:
1. Start the Users and Groups tool. For more information, see "Starting the
Users and Groups Tool" on page 658.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Assign Users/Groups to Role tab.

4. Select the role to which you want to assign users and user groups.

5. Click the Edit button.


The Users and Groups tool displays the Assign Users to Role dialog.

6. Check (select) the names of any users and user groups that you want to
assign to the selected role.
7. Uncheck (clear) the names of any users and user groups that you want to
unassign from the selected role.
8. Click OK.

Assigning Roles to Users and User Groups


To assign roles to users and user groups:
1. Start the Users and Groups tool. For more information, see "Starting the
Users and Groups Tool" on page 658.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Click the Assign Roles to User/Group tab.

4. Select the user or user group to which you want to assign roles.

5. Click the Edit button.


The Users and Groups tool displays the Assign Roles to User dialog.

6. Check (select) the roles that you want to assign to the selected user or
user group.
7. Uncheck (clear) the roles that you want to unassign from the selected user
or user group.

8. Click OK.

Managing Security Providers


This section describes how to manage security providers in your Informatica
MDM Hub implementation.

About Security Providers


A security provider is a third-party organization that provides security
services for users accessing Informatica MDM Hub. Security providers are
used in certain Informatica MDM Hub security deployment scenarios, as
described in "Security Implementation Scenarios" on page 625.

Types of Security Providers

Informatica MDM Hub supports the following types of security providers:


• Authentication: Authenticates a user by validating their identity. Informs Informatica MDM Hub only that the user is who they claim to be, not whether they have access to any Informatica MDM Hub resources.
• Authorization: Informs Informatica MDM Hub whether a user has the required privilege(s) to access particular Informatica MDM Hub resources.
• User Profile: Informs Informatica MDM Hub about individual users, such as user-specific attributes and the roles to which the user belongs.

Internal Providers

Informatica MDM Hub comes with a set of default internal security providers
(labeled Internal Provider in the Security Providers tool). You can also add
your own third-party security providers. Internal security providers cannot be
removed.

Starting the Security Providers Tool


You use the Security Providers tool in the Configuration workbench to register
and manage security providers for Informatica MDM Hub. To use the Security
Providers tool, you must be connected to the master database.

To start the Security Providers tool:


• In the Hub Console, expand the Configuration workbench, and then click
Security Providers.
The Hub Console displays the Security Providers tool.

In the Security Providers tool, the navigation tree has the following main nodes:
• Provider Files: Expand to display the provider files that have been uploaded in your Informatica MDM Hub implementation. For more information, see "Managing Provider Files" on page 665.
• Providers: Expand to display the list of providers that are defined in your Informatica MDM Hub implementation. For more information, see "Managing Security Provider Settings" on page 668.

Informatica MDM Hub provides a set of default providers:


• Internal providers represent Informatica MDM Hub’s internal implementations for authentication, authorization, and user profile services.
• Super providers always return a positive response for authentication and authorization requests. Super providers are useful in development environments when you do not want to configure users, roles, privileges, and so on. To use them for this purpose, place them first in the adjudication sequence and enable them. Super providers can also be used in a production environment in which security is provided as a layer on top of the SIF requests, for performance gains.

Managing Provider Files


If you want to use your own third-party security providers (in addition to Informatica MDM Hub’s default internal security providers), you must explicitly register them using the Security Providers tool. To register a provider, you upload a provider file that contains the profile information needed for registration.

About Provider Files

A provider file is a JAR file that contains the following information:
• A manifest that describes one or more external security provider(s). Each security provider has the following settings:
• Provider Name
• Provider Description
• Provider Type
• Provider Factory Class Name
• Properties for configuring the provider (a list of name-value pairs: property names with default values)
• One or more JAR files containing the provider implementation and any required third-party libraries.

Sample Provider File

The Informatica sample installer copies a sample implementation of a provider file into the SamSample subdirectory under the target samples directory (such as c:\<infamdm_install_dir>\oracle\sample\SamSample). For more information, see the Informatica MDM Hub Installation Guide.

Provider Files List

The Security Providers tool displays a list of provider files under the Provider
Files node in the left navigation pane. You use right-click menus in the left
navigation pane of the Security Providers tool to upload, delete, and move
provider files in the Provider Files list.

Selecting a Provider File

To select a provider file in the Security Providers tool:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. In the left navigation pane, click the provider file that you want to select.
The Security Providers tool displays the Provider File panel for the
selected provider file.

The Provider File panel contains no editable fields.

Uploading a Provider File

To upload a provider file to add or update provider information:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the left navigation pane, right-click Provider Files and choose Upload
Provider File.

The Security Provider tool prompts you to select the JAR file for this
provider.

4. Specify the JAR file, navigating the file system as needed and selecting the
JAR file that you want to upload.

5. Click Open.

The Security Provider tool checks the selected file to determine whether it
is a valid provider file.
If the provider name from the manifest is the same as the name of an
existing provider file, then the Security Provider tool asks you whether to
overwrite the existing provider file. Click Yes to confirm.
The Security Provider tool uploads the JAR file to the application server,
adds the provider file to the list, populates the Providers list with the
additional provider information, and refreshes the left navigation pane.

Once the file has been uploaded, the original file can be removed from the
file system, if you want. The Security Provider tool has already imported
the information and does not subsequently refer to the original file.

Deleting a Provider File

Note: Internal security providers that are shipped with Informatica MDM Hub
cannot be removed. For internal security providers, there is no separate
provider file under the Provider Files node.

To delete a provider file:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the left navigation pane, right-click the provider file that you want to
delete, and then choose Delete Provider File.
The Security Provider tool prompts you to confirm deletion.
4. Click Yes.
The Security Provider tool removes the deleted provider file from the list.

Managing Security Provider Settings


The Security Providers tool displays a list of registered providers under the Providers node in the left navigation pane. This list is sorted by provider type (Authentication, Authorization, or User Profile provider).

You use right-click menus in the left navigation pane of the Security Providers
tool to move providers up and down in the Providers list.

Sequence of the Providers List

The order of providers in the Providers list represents the order in which they are invoked. For example, when a user attempts to log in and supplies their user name and password, Informatica MDM Hub submits their login credentials to each authentication provider in the Authentication list, proceeding sequentially through the list. If authentication succeeds with one of the providers in the list, then the user is deemed authenticated. If authentication fails with all available authentication providers, then authentication for that user fails. To learn about changing the processing order, see "Moving a Security Provider Up in the Processing Order" on page 676 and "Moving a Security Provider Down in the Processing Order" on page 676.
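
To make this first-match-wins behavior concrete, here is a minimal Java sketch of the adjudication loop. The AuthenticationProvider interface and its method are hypothetical names used only for illustration; they are not the actual Informatica MDM Hub provider SPI.

import java.util.List;

// Hypothetical provider interface; the real SPI is defined by the
// provider file manifest, not by this sketch.
interface AuthenticationProvider {
    boolean authenticate(String username, char[] password);
}

class ProviderChain {
    private final List<AuthenticationProvider> providers;

    ProviderChain(List<AuthenticationProvider> providers) {
        this.providers = providers; // ordered as in the Providers list
    }

    boolean authenticate(String username, char[] password) {
        for (AuthenticationProvider provider : providers) {
            if (provider.authenticate(username, password)) {
                return true;  // first success: the user is deemed authenticated
            }
        }
        return false;         // every provider failed: authentication fails
    }
}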

Selecting a Security Provider

To select a provider in the Security Providers tool:


• In the left navigation pane, click the provider that you want to select.
The Security Providers tool displays the Provider panel for the selected
provider file.

Properties on the Provider Panel

The Provider panel contains the following fields:
• Name: Name of this security provider.
• Description: Description of this security provider.
• Provider Type: Type of security provider. One of the following values: Authentication, Authorization, or User Profile. For more information, see "About Security Providers" on page 664.
• Provider File: Name of the provider file associated with this security provider, or Internal Provider for internal providers. For more information, see "Managing Provider Files" on page 665.
• Enabled: Indicates whether this security provider is enabled (checked) or not (unchecked). Note that internal providers cannot be disabled.
• Properties: Additional properties for this security provider, if defined by the security provider. Each property is a name-value pair. A security provider might require or allow unique properties that you can specify here. For more information, see "Configuring Provider Properties" on page 670.

Configuring Provider Properties

A provider property is a name-value pair that a security provider might require in order to provide its service(s). You can use the Security Providers tool to define these properties.

Adding Provider Properties

To add provider properties:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the left navigation pane, select the authentication provider for which
you want to add properties.

4. Click the Add button.
The Security Providers tool displays the Add Provider Property dialog.
5. Specify the name of the property.
6. Specify the value to assign to this property.
7. Click OK.

Editing Provider Properties

To edit an existing provider property:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the left navigation pane, select the authentication provider for which
you want to edit properties.

4. For each property that you want to edit, click the Edit button next to it,
and specify the new value.

5. Click the Save button to save your changes.

Removing Provider Properties

To remove an existing provider property:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the left navigation pane, select the authentication provider for which
you want to remove properties.
4. Select the property that you want to remove.

5. Click the Delete button.


The Security Providers tool prompts you to confirm deletion.
6. Click Yes.

Custom-added Providers

You can also package custom provider classes in the JAR/ZIP file. Specify the settings for the custom providers in a properties file named providers.properties. You must place this file within the JAR file in the META-INF directory. These settings (that is, the name/value pairs) are then read by the loader and translated to what is displayed in the Hub Console.

Here are the elements of a providers.properties file:

• ProviderList: Comma-separated list of the contained provider names.
• File-Description: Description of the package.
The remaining elements come in groups of five, one group for each name in ProviderList (in the list below, "XXX" represents one of the names specified in ProviderList):
• XXX-Provider-Name: Display name of the provider XXX.
• XXX-Provider-Description: Description of the provider XXX.
• XXX-Provider-Type: Type of the provider XXX. The allowed values are USER_PROFILE_PROVIDER, JAAS_LOGIN_MODULE, and AUTHORIZATION_PROVIDER.
• XXX-Provider-Factory-Class-Name: Implementation class of the provider (contained in the same JAR/ZIP file).
• XXX-Provider-Properties: Comma-separated list of name/value pairs defining provider properties (name1=value1,…).

Note: The provider archive file (JAR/ZIP) must contain all the classes
required for the custom provider to be functional, as well as all of the required
resources. These resources are specific to your implementation.

Example providers.properties File

Note: All of these settings are required except for XXX-Provider-Properties.


ProviderList=ProviderOne,ProviderTwo,ProviderThree,ProviderFour
ProviderOne-Provider-Name: Sample Role Based User Profile Provider
ProviderOne-Provider-Description: Sample User Profile Provider for role-based management
ProviderOne-Provider-Type: USER_PROFILE_PROVIDER
ProviderOne-Provider-Factory-Class-Name: com.siperian.sam.sample.userprofile.SampleRoleBasedUserProfileProviderFactory
ProviderOne-Provider-Properties: name1=value1,name2=value2
ProviderTwo-Provider-Name: Sample Login Module
ProviderTwo-Provider-Description: Sample Login Module
ProviderTwo-Provider-Type: JAAS_LOGIN_MODULE
ProviderTwo-Provider-Factory-Class-Name: com.siperian.sam.sample.authn.SampleLoginModule
ProviderTwo-Provider-Properties:
ProviderThree-Provider-Name: Sample Role Based Authorization Provider
ProviderThree-Provider-Description: Sample Role Based Authorization Provider
ProviderThree-Provider-Type: AUTHORIZATION_PROVIDER
ProviderThree-Provider-Factory-Class-Name: com.siperian.sam.sample.authz.SampleAuthorizationProviderFactory
ProviderThree-Provider-Properties:
ProviderFour-Provider-Name: Sample Comprehensive User Profile Provider
ProviderFour-Provider-Description: Sample Comprehensive User Profile Provider
ProviderFour-Provider-Type: USER_PROFILE_PROVIDER
ProviderFour-Provider-Factory-Class-Name: com.siperian.sam.sample.userprofile.SampleComprehensiveUserProfileProviderFactory
ProviderFour-Provider-Properties:
File-Description=The sample provider files
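
For the JAAS_LOGIN_MODULE provider type, the implementation class is a standard JAAS login module. The following minimal sketch, which is not the shipped sample, shows the general shape of such a class; the package name and the placeholder credential check are assumptions made for illustration.

package com.example.sam; // illustrative package name

import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.login.LoginException;
import javax.security.auth.spi.LoginModule;

// Minimal JAAS login module sketch. A real implementation would validate
// the credentials against an external directory or service.
public class ExampleLoginModule implements LoginModule {
    private CallbackHandler handler;
    private boolean succeeded;

    public void initialize(Subject subject, CallbackHandler callbackHandler,
                           Map<String, ?> sharedState, Map<String, ?> options) {
        this.handler = callbackHandler;
    }

    public boolean login() throws LoginException {
        NameCallback name = new NameCallback("username: ");
        PasswordCallback password = new PasswordCallback("password: ", false);
        try {
            handler.handle(new Callback[] { name, password });
        } catch (Exception e) {
            throw new LoginException("Unable to obtain credentials: " + e);
        }
        // Placeholder check; replace with a call to the external directory.
        succeeded = name.getName() != null && password.getPassword() != null;
        if (!succeeded) {
            throw new LoginException("Authentication failed");
        }
        return true;
    }

    public boolean commit() { return succeeded; }
    public boolean abort()  { succeeded = false; return true; }
    public boolean logout() { succeeded = false; return true; }
}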

Adding a Login Module

Informatica MDM Hub supports the use of external authentication for users
through the Java Authentication and Authorization Service (JAAS). Informatica
MDM Hub provides templates for the following types of authentication
standards:
• Lightweight Directory Access Protocol (LDAP)
• Microsoft Active Directory
• Network authentication using the Kerberos protocol

These templates provide the settings (protocols, server names, ports, and so
on) that are required for these authentication standards. You can use these
templates to add a new login module and provide the settings you need. To
learn more about these authentication standards, see the applicable vendor
documentation.

To add a login module:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the left navigation pane, right-click Authentication Providers (Login
Modules) and choose Add Login Module.
The Security Providers tool displays the Add Login Module dialog box.

4. Click the down arrow and select a template for the login module.

• OpenLDAP-template: Based on LDAP authentication properties.
• MicrosoftActiveDirectory-template: Based on Active Directory authentication properties.
• Kerberos-template: Based on Kerberos authentication properties.
5. Click OK.
The Security Providers tool adds the new login module to the list.

6. In the Properties panel, click the Edit button next to any property that
you want to edit, such as its name and description, and change the setting.
For LDAP, you can specify the following settings:
• java.naming.factory.initial: Required. Java class name of the JNDI implementation for connecting to an LDAP server. Use the following value: com.sun.jndi.ldap.LdapCtxFactory
• java.naming.provider.url: Required. URL of the LDAP server. For example: ldap://localhost:389/
• username.prefix: Optional. Tells Informatica MDM Hub how to parse the LDAP user name. An OpenLDAP user name looks like this: cn=myopenldapuser,dc=siperian,dc=com, where myopenldapuser is the user name, siperian is the domain name, and com is the top-level domain. In this example, the username.prefix is: cn=
• username.postfix: Optional. Used in conjunction with username.prefix. Using the previous example, set username.postfix to: ,dc=siperian,dc=com (note the comma at the beginning of the string). Informatica MDM Hub builds the full LDAP user name by concatenating username.prefix, the login name, and username.postfix.
For Microsoft Active Directory, you can specify the following settings:
• java.naming.factory.initial: Required. Java class name of the JNDI implementation for connecting to an LDAP server. Use the following value: com.sun.jndi.ldap.LdapCtxFactory
• java.naming.provider.url: Required. URL of the LDAP server. For example: ldap://localhost:389/
For Kerberos authentication:
• To set up Kerberos authentication for a user on JBoss and WebLogic using Sun’s JVM, use Sun’s LoginModule (com.sun.security.auth.module.Krb5LoginModule). For more information, see the Kerberos documentation at http://java.sun.com.
• To set up Kerberos authentication for a user on WebSphere using IBM’s JVM, you can use IBM’s LoginModule (com.ibm.security.auth.module.Krb5LoginModule). For more information, see the Kerberos documentation at http://www.ibm.com.
• To use either of these Kerberos implementations, you must configure the JVM of the Informatica MDM Hub application server with winnt\krb5.ini or JAVA_HOME\jre\lib\security\krb5.conf, as shown in the sketch below.
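
For illustration, a minimal krb5.conf might look like the following; the realm and host names are placeholders that you must replace with the values for your own Kerberos environment.

[libdefaults]
    default_realm = EXAMPLE.COM

[realms]
    EXAMPLE.COM = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }

[domain_realm]
    .example.com = EXAMPLE.COM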

7. Click the Save button to save your changes.

Deleting a Login Module

To delete a login module:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the left navigation pane, right-click a login module under Authentication
Providers (Login Modules) and choose Delete Login Module.
The Security Provider tool prompts you to confirm deletion.
4. Click Yes.
The Security Provider tool removes the deleted login module from the list
and refreshes the left navigation pane.

Changing Security Provider Settings

To change the settings for a security provider:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. Select the security provider whose properties you want to change, as
described in "Selecting a Security Provider" on page 669.

4. In the Properties panel, click the Edit button next to any property that
you want to edit.

5. Click the Save button to save your changes.

Enabling and Disabling Security Providers
1. Acquire a write lock, if you have not already done so.
2. Select the security provider that you want to enable or disable, as
described in "Selecting a Security Provider" on page 669.
3. Do one of the following:
• Check the Enabled check box to enable a disabled security provider.
• Uncheck the Enabled check box to disable a security provider.
Once disabled, the provider name appears greyed out at the end of the Providers list. Disabled providers cannot be moved.

4. Click the Save button to save your changes.

Moving a Security Provider Up in the Processing Order

As described in "Sequence of the Providers List" on page 668, Informatica


MDM Hub processes security providers in the order in which they appear in the
Providers list.

To move a security provider up the list:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
3. In the left navigation pane, select the provider (not the first one in the list,
nor any disabled providers) that you want to move up.
4. In the left navigation pane, right-click and choose Move Provider Up.
The Security Provider tool moves the provider ahead of the previous one in
the Providers list, and then refreshes the left navigation pane.

Moving a Security Provider Down in the Processing Order

As described in "Sequence of the Providers List" on page 668, Informatica


MDM Hub processes security providers in the order in which they appear in the
Providers list.

To move a provider down the list:


1. Start the Security Providers tool. For more information, see "Starting the
Security Providers Tool" on page 664.
2. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.

3. In the left navigation pane, click the provider (not the last one in the list,
nor any disabled providers) that you want to move down.
4. In the left navigation pane, right-click and choose Move Provider Down.
The Security Provider tool moves the provider after the subsequent one in
the Providers list and refreshes the left navigation pane.

Chapter 21: Viewing Registered
Custom Code

This chapter describes how to use the User Object Registry tool to view
registered custom code.

Chapter Contents
• "About User Objects" on page 678
• "About the User Object Registry Tool" on page 678
• "Starting the User Object Registry Tool" on page 679
• "Viewing User Exits" on page 679
• "Viewing Custom Stored Procedures" on page 680
• "Viewing Custom Java Cleanse Functions" on page 681
• "Viewing Custom Button Functions" on page 682

About User Objects


User objects are user-defined functions or procedures that are registered with the Informatica MDM Hub to extend its functionality. There are four types of user objects:
• User Exits: A user-customized, unencrypted stored procedure that includes a set of fixed, pre-defined parameters. The procedure is configured, on a per-base object basis, to execute at a specific point during an Informatica MDM Hub batch process run. For more information, see "Viewing User Exits" on page 679.
• Custom Stored Procedures: Stored procedures that are registered in table C_REPOS_TABLE_OBJECT and can be invoked from the Batch Manager. For more information, see "Viewing Custom Stored Procedures" on page 680.
• Custom Java Cleanse Functions: Java cleanse functions that supplement the standard cleanse libraries with custom logic. These functions are packaged as JAR files and stored as BLOBs in the database. For more information, see "Viewing Custom Java Cleanse Functions" on page 681.
• Custom Button Functions: Custom UI functions that supply additional icons and logic in the Data Manager, Merge Manager, and Hierarchy Manager. For more information, see "Viewing Custom Button Functions" on page 682.

About the User Object Registry Tool


The User Object Registry Tool is a read-only tool that keeps track of user
objects that have been developed for use in the Informatica MDM Hub.

Note: To view custom user code in the User Object Registry tool, you must
have registered the following types of objects:
• Custom Stored Procedures; for more information regarding stored
procedures, see "Developing Custom Stored Procedures for Batch Jobs" on
page 604
• Custom Java Cleanse Functions; for more information regarding Java
cleanse functions, see "Using Cleanse Functions" on page 314
• Custom Button Functions; for more information regarding custom buttons,
see "About Custom Buttons in the Hub Console" on page 730

Note: You do not need to pre-configure user exit procedures to view them in
the User Object Registry tool.

Starting the User Object Registry Tool


To start the User Object Registry tool:
1. In the Hub Console, connect to an Operational Reference Store (ORS),
according to the instructions in "Changing the Target Database" on page
37.
2. Expand the Informatica Utilities workbench and then click User Object
Registry.
The Hub Console displays the User Object Registry tool.

The User Object Registry tool displays the following areas:
• Registered User Object Types: Hierarchical tree of the user objects registered in the selected ORS, organized by the following categories: User Exits, Custom Stored Procedures, Custom Java Cleanse Functions, and Custom Button Functions.
• User Object Properties: Properties for the selected user object.

Viewing User Exits


This section describes how to view user exits in the User Object Registry tool.

About User Exits


A user exit is an unencrypted stored procedure that includes a set of fixed, pre-defined parameters. The procedure is configured, on a per-base object basis, to execute at a specific point during an Informatica MDM Hub batch process run. User exits are triggered by the Informatica MDM Hub back-end processes and provide a mechanism to integrate custom operations with Hub Server processes such as POST_LOAD, POST_MERGE, POST_MATCH, and so on. For more information, see "About User Exits" on page 708.

Note: The User Object Registry tool displays the types of pre-existing user
exits.

Viewing User Exits


To view the Informatica MDM Hub user exits in the User Object Registry tool:
1. Start the User Object Registry tool. For more information, see "Starting
the User Object Registry Tool" on page 679.
2. In the list of user objects, select User Exits.
The User Object Registry tool displays the user exits.

Viewing Custom Stored Procedures


This section describes how to view registered custom stored procedures in the
User Object Registry tool.

About Custom Stored Procedures


In the Hub Console, the Informatica MDM Hub Batch Viewer and Batch Group
tools provide simple mechanisms for executing Informatica MDM Hub batch
jobs. To execute and manage jobs according to a schedule, you need to
execute stored procedures that do the work of batch jobs or batch groups. For
more information, see "About Informatica MDM Hub Batch Jobs" on page 496.

Informatica MDM Hub also allows you to create and run custom stored
procedures for batch jobs. For more information, see "Developing Custom
Stored Procedures for Batch Jobs" on page 604. You can also create and run
stored procedures using the SIF API (using Java, SOAP, or HTTP/XML). For
more information, see the Informatica MDM Hub Services Integration
Framework Guide.

How Custom Stored Procedures Are Registered
You must register a custom stored procedure with Informatica MDM Hub in
order to make it available to users in the Batch Viewer and Batch Group tools
in the Hub Console. For more information, see "Registering a Custom Stored
Procedure" on page 605.

Viewing Registered Custom Stored Procedures


To view the registered custom stored procedures in the User Object Registry
tool:
1. Start the User Object Registry tool. For more information, see "Starting
the User Object Registry Tool" on page 679.
2. In the list of user objects, select Custom Stored Procedures.
The User Object Registry tool displays registered custom stored
procedures.

Viewing Custom Java Cleanse Functions


This section describes how to view registered custom Java cleanse functions in
the User Object Registry tool.

About Custom Java Cleanse Functions


The User Object Registry exposes the details of custom cleanse functions that
have been added to Java libraries (not user libraries). In Informatica MDM
Hub, you can build and execute cleanse functions that cleanse data. A cleanse
function is a function that is applied to a data value in a record to standardize
or verify it. For example, if your data has a column for salutation, you could
use a cleanse function to standardize all instances of “Doctor” to “Dr.” You can
apply cleanse functions successively, or simply assign the output value to a
column in the staging table. For more information, see "About Cleanse
Functions" on page 314 and "Configuring Java Libraries" on page 318.

How Custom Java Cleanse Functions Are Registered
Cleanse functions are configured using the Cleanse Functions tool in the Hub
Console. For more information, see "Configuring Java Libraries" on page 318.

Viewing Registered Custom Java Cleanse Functions


To view the registered custom Java cleanse functions in the User Object
Registry tool:
1. Start the User Object Registry tool. For more information, see "Starting
the User Object Registry Tool" on page 679.
2. In the list of user objects, select Custom Java Cleanse Functions.
The User Object Registry tool displays the registered custom Java cleanse
functions.

Viewing Custom Button Functions


This section describes how to view registered custom button functions in the
User Object Registry tool.

About Custom Button Functions


In your Informatica MDM Hub implementation, you can provide Hub Console
users with custom buttons that can be used to extend your Informatica MDM
Hub implementation. Custom buttons can give users the ability to invoke a
particular external service (such as retrieving data or computing results),
perform a specialized operation (such as launching a workflow), and other
tasks. Custom buttons can be added to any of the following tools in the Hub
Console: Merge Manager, Data Manager, and Hierarchy Manager. For more
information, see "About Custom Buttons in the Hub Console" on page 730.

Server and client-based custom functions are visible in the User Object
Registry. For more information, see "Server-Based and Client-Based Custom
Functions" on page 732.

How Custom Button Functions Are Registered
To add a custom button to the Hub Console in your Informatica MDM Hub
implementation, complete the following tasks:
1. Determine the details of the external service that you want to invoke, such
as the format and parameters for request and response messages.
2. Write and package the business logic that the custom button will execute,
as described in "Writing a Custom Function" on page 732.
3. Deploy the package so that it appears in the applicable tool(s) in the Hub
Console, as described in "Deploying Custom Buttons" on page 735.

Viewing Registered Custom Button Functions


To view the registered custom button functions in the User Object Registry
tool:
1. Start the User Object Registry tool. For more information, see "Starting
the User Object Registry Tool" on page 679.
2. Select Custom Button Functions.
The User Object Registry tool displays the registered custom button
functions.

Chapter 22: Auditing Informatica
MDM Hub Services and Events

This chapter describes how to set up auditing and debugging in the Hub
Console.

Chapter Contents
• "About Integration Auditing" on page 684
• "Starting the Audit Manager" on page 686
• "Auditing SIF API Requests" on page 688
• "Auditing Message Queues" on page 689
• "Auditing Errors" on page 690
• "Using the Audit Log" on page 691

About Integration Auditing


Your Informatica MDM Hub implementation has a variety of log files that track
activities in various components: the MRM log, the application server log, the
database server log, and so on. The auditing covered in this chapter is
integration auditing, which tracks activities associated with the exchange of
data between Informatica MDM Hub and external systems. For more
information about the other types of log files, see the Informatica MDM Hub
Installation Guide.

Auditing is configured separately for each Operational Reference Store (ORS)
in your Informatica MDM Hub implementation.

Auditable Events
Integration with external applications often involves complexity. Multiple
applications interact with each other, exchange data synchronously or
asynchronously, use data transformations back and forth, and engage various
business rules to execute business processes across applications.

To expose the details of application integration to application developers and
system integrators, Informatica MDM Hub provides the ability to create an
audit trail whenever:
audit trail whenever:
• an external application interacts with Informatica MDM Hub by invoking a
Services Integration Framework (SIF) request. For more information, see
the Informatica MDM Hub Services Integration Framework Guide.

• Informatica MDM Hub sends a message (using JMS) to a message queue
for the purpose of distributing data changes to other systems. For more
information, see "Configuring the Publish Process" on page 449.

The Informatica MDM Hub audit mechanism is optional and configurable. It
tracks invocations of SIF requests that are audit-enabled, collects data about
what occurred when, and provides some contextual information as to why
certain actions were fired. It stores audit information in an audit log table
(C_REPOS_AUDIT) that you can subsequently view using TOAD or another
compatible, external data management tool.

Note: Auditing is in effect whether metadata caching is enabled (on) or
disabled (off).

Audit Manager Tool


Auditing is configured using the Audit Manager tool in the Hub Console. The
Audit Manager allows administrators to select:
• which SIF requests to audit, and on which systems (Admin, defined source
systems, or no system).
• which message queues to audit (among those assigned for use with
message triggers) as outbound messages are sent to JMS queues

For more information, see "Starting the Audit Manager" on page 686.

Capturing XML for Requests and Responses


For thorough debugging of specific SIF requests or JMS events, users can
optionally capture the request and response XML in the audit log, which can be
especially useful for write operations. Because auditing at this granular level
collects extensive information with a possible performance trade-off, it is
recommended for debugging purposes but not for ongoing use in a production
environment.

Auditing Must Be Explicitly Enabled


By default, the auditing of SIF requests and events is disabled. You must use
the Audit Manager tool to explicitly enable auditing for each SIF request and
event that you want to audit.

Auditing Occurs After Authentication


Any SIF request invocation can be audited once the user credentials
associated with the invocation have been authenticated by the Hub Server.
Therefore, a failed login attempt is not audited. For example, if a third-party
application attempts to invoke a SIF request but provides invalid login

credentials, that information will not be captured in the C_REPOS_AUDIT
table. Auditing begins only after authentication succeeds.

Auditing Occurs for Invocations With Valid, Well-formed XML
Only SIF request invocations with valid and well-formed XML will be audited.
SIF requests with invalid XML or XML that is not well-formed will not be
audited.

Auditing Password Changes


For invocations of the Informatica MDM Hub change password service, the
user’s default database determines whether the SIF request is audited or not.
• If the user’s default database is an Operational Reference Store (ORS),
then the Informatica MDM Hub change password service is audited. For
more information, see "Changing Passwords" on page 67.
• If the user’s default database is the Master Database, then the change
password service invocation is not audited.

Starting the Audit Manager


To start the Audit Manager:
• In the Hub Console, scroll to the Utilities workbench, and then click Audit
Manager.
The Hub Console displays the Audit Manager.

The Audit Manager is divided into two panes.


Navigation pane: Shows (in a tree view) the following information:
• auditing types for this Informatica MDM Hub implementation (see
"Auditable API Requests and Message Queues" on page 686)
• the systems to audit (see "Systems to Audit" on page 687)
• message queues to audit (see "Auditing Message Queues" on page 689)
Properties pane: Shows the properties for the selected auditing type or
system.

Auditable API Requests and Message Queues


In the Audit Manager, the navigation pane displays a list of the following types
of items to audit, along with any available systems.

API Requests: Request invocations made by external applications using the
Services Integration Framework (SIF) Software Development Kit (SDK).
Message Queues: Message queues used for message triggers. For more
information, see "Configuring the Publish Process" on page 449. Note:
Message queues are defined at the CMX_SYSTEM level. These settings apply
only to messages for this Operational Reference Store (ORS).

Systems to Audit
For each type of item to audit, the Audit Manager displays the list of systems
that can be audited, along with the SIF requests that are associated with that
system.

No System: Services that are not (or not necessarily) associated with a
specific system, such as merge operations.
Admin: Services that are associated with the Admin system.
Defined Source Systems: Services that are associated with predefined source
systems. For more information, see "About the Databases Tool" on page 54.

Note: The same API request or message queue can appear in multiple source
systems if, for example, its use is optional on one of those source systems.

Audit Properties
Note: A write lock is not required to configure auditing.

When you select an item to audit, the Audit Manager displays properties in the
properties pane with the following configurable settings.

System Name: Name of the selected system. Read-only.
Description: Description of the selected system. Read-only.
API Request: List of API requests that can be audited.
Message Queue: List of message queues that can be audited.
Enable Audit?: By default, auditing is not enabled.
• Select (check) to enable auditing for the item.
• Clear (uncheck) to disable auditing for the item.
Include XML?: This check box is available only if auditing is enabled for this
item. By default, XML is not captured in the log. For more information, see
"Capturing XML for Requests and Responses" on page 685.
• Check (select) to include XML in the audit log for this item.
• Uncheck (clear) to exclude XML from the audit log for this item.
Note: Passwords are never stored in the audit log. If a password
exists in the XML stream (whether encrypted or not), Informatica
MDM Hub replaces the password with asterisks:
...<get>
 <username>admin</username>
 <password>
 <encrypted>false</encrypted>
 <password>******</password>
 </password>
...
Important: Selecting this option can cause the audit log file to
grow very large rapidly. For more information, see "Periodically
Purging the Audit Log" on page 696.

For the Enable Audit? and Include XML? check boxes, you can use the following
buttons.

Select All: Check (select) all items in the list.
Clear All: Uncheck (clear) all selected items in the list.

Auditing SIF API Requests


You can audit Services Integration Framework (SIF) requests made by
external applications. Once auditing for a particular SIF API request is
enabled, Informatica MDM Hub captures each SIF request invocation and
response in the audit log.

For more information regarding the SIF API requests, see the Informatica
MDM Hub Services Integration Framework Guide.

To audit SIF API requests:


1. Start the Audit Manager. For more information, see "Starting the Audit
Manager" on page 686.
2. In the navigation tree, select a system beneath API Requests.
Select No System to configure global auditing settings across all systems.
In the edit pane, the Audit Manager displays the configurable API requests
for the selected system. For more information, see "Audit Properties" on
page 687.

3. For each SIF request that you want to audit, select (check) the Enable
Audit check box.
4. If auditing is enabled for a particular API request and you also want to
include XML associated with that API request in the audit log, then select
(check) the Include XML check box.

5. Click the Save button to save your changes.


Note: Your saved settings might not take effect in the Hub Server for up to
60 seconds.

Auditing Message Queues


You can configure auditing for message queues for which message triggers
have been assigned. Message queues that do not have configured message
triggers are not available for auditing.

To audit message queues:


1. Start the Audit Manager. For more information, see "Starting the Audit
Manager" on page 686.
2. In the navigation tree, select a system beneath Message Queues.
In the edit pane, the Audit Manager displays the configurable message
queues for the selected system. For more information, see "Audit
Properties" on page 687.

3. For each message queue that you want to audit, select (check) the Enable
Audit check box.
4. If auditing is enabled for a particular message queue and you also want to
include XML associated with that message queue in the audit log, then
select (check) the Include XML check box.

5. Click the Save button to save your changes.


Note: Your saved settings might not take effect in the Hub Server for up to
60 seconds.

Auditing Errors
You can capture error information for any SIF request invocation that triggers
the error mechanism in the Web service—such as syntax errors, run-time
errors, and so on. You can enable auditing for all errors associated with SIF
requests.

Auditing errors is a feature that you enable globally. Even when auditing is not
currently enabled for a particular SIF request, if an error occurs during that
SIF request invocation, then the event is captured in the audit log.

Configuring Global Error Auditing


To audit errors:
1. Start the Audit Manager. For more information, see "Starting the Audit
Manager" on page 686.
2. In the navigation tree, select API Requests to configure auditing for SIF
errors.
In the edit pane, the Audit Manager displays the configuration page for
errors.

3. Do one of the following:
• Select (check) the Enable Audit check box to audit errors.
• Clear (uncheck) the Enable Audit check box to stop auditing errors.
4. If you select Enable Audit and you also want to include XML associated
with errors in the audit log, then select (check) the Include XML check
box.
Note: If you select only Enable Audit, Informatica MDM Hub writes the
associated audit information to C_REPOS_AUDIT.
If you also select Include XML, Informatica MDM Hub additionally populates
the DATA_XML column of C_REPOS_AUDIT with detailed log data for the audit.
If you select both check boxes, when you run an Insert, Update, or Delete
job in the Data Manager, or run the associated batch job, Informatica MDM
Hub includes the audit data in DATA_XML of C_REPOS_AUDIT.

5. Click the Save button to save your changes.

Using the Audit Log


Once you have configured auditing for SIF requests and events, you can use
the populated audit log table (C_REPOS_AUDIT) as needed—for analysis,
exception reporting, debugging, and so on.

About the Audit Log


The C_REPOS_AUDIT table is stored in the Operational Reference Store (ORS).
If auditing is enabled for a given SIF request or event, whenever that SIF
request is invoked or that event is triggered on the Informatica MDM Hub,
then the audit mechanism captures the relevant information and stores it in
the C_REPOS_AUDIT table. For more information about the data stored in this
table, see "Audit Log Table" on page 692.

Note: The SIF Audit request allows an external application to insert new
records into the C_REPOS_AUDIT table. You would use this request to report
activity involving one or more records in Informatica MDM Hub that is at a
higher level, or that carries more information, than the Hub can record on its
own. For example, you could audit an update to a complex object before
transforming and decomposing it into Hub objects. For more information, see
the Informatica MDM Hub Services Integration Framework Guide.

Audit Log Table


The C_REPOS_AUDIT table has the following columns.

Schema for the Audit Log Table (C_REPOS_AUDIT)
(Each entry shows the Oracle type / DB2 type.)

ROWID_AUDIT (CHAR(14) / CHARACTER(14)): Unique ID for this record. Primary key.
CREATE_DATE (DATE / TIMESTAMP): Record creation date. Defaults to the system date.
CREATOR (VARCHAR2(50) / VARCHAR(50)): User associated with the audit event.
LAST_UPDATE_DATE (DATE / TIMESTAMP): Same as CREATE_DATE.
UPDATED_BY (VARCHAR2(50) / VARCHAR(50)): Same as CREATOR.
COMPONENT (VARCHAR2(50) / VARCHAR(50)): Component involved, such as SIF.sif.api.
ACTION (VARCHAR2(50) / VARCHAR(50)): One of the following: the SIF request
name or the message queue name.
STATUS (VARCHAR2(50) / VARCHAR(50)): One of the following values: debug,
info, warn, error, or fatal.
ROWID_OBJECT (CHAR(14) / CHARACTER(14)): The rowid_object, if known.
DATA_XML (CLOB / CLOB): XML associated with the auditable event: request,
response, or JMS message. Populated only if the Include XML option is
enabled (checked). Note: Passwords are never stored in the audit log. If a
password exists in the XML stream (whether encrypted or not), Informatica
MDM Hub replaces the password with the text “******”.
CONTEXT_XML (CLOB / CLOB): XML that might contain contextual information,
such as configuration data, the URL that was invoked, or the trace for the
execution of a match rule. If an error occurs, the request XML is always put in
this column to ensure its capture in case auditing was not enabled for the SIF
request that was invoked. Populated only if the Include XML option is enabled
(checked).
ROWID_AUDIT_PREVIOUS (CHAR(14) / CHARACTER(14)): Reference to the
ROWID_AUDIT of the related previous entry. For example, links a response
entry to its corresponding request entry.
INTERACTION_ID (NUMBER(19) / BIGINT(8)): Interaction ID. May be NULL
because INTERACTION_ID is optional.
USERNAME (VARCHAR2(50) / VARCHAR(50)): User that invoked the SIF request.
Null for message queues.
FROM_SYSTEM (VARCHAR2(50) / VARCHAR(50)): Source system for a SIF
request, or Admin for message queues.
TO_SYSTEM (VARCHAR2(50) / VARCHAR(50)): System to which the audited
event is related. For example, API requests to the Hub set this to “Admin”;
for responses it is the source system, or null if not known. Note: Activity
Manager actions set this value.
TABLE_NAME (VARCHAR2(100) / VARCHAR(100)): Table in the Hub Store that is
associated with this audited event.
CONTEXT (VARCHAR2(255) / VARCHAR(255)): Metadata, for example
pkeySource. This is null for audits from the Hub, but may have values for
Activity Manager and audits done through the SIF API.

Viewing the Audit Log


You can view the audit log using an external data management tool (not
included with Informatica MDM Hub), such as TOAD. For example, you can
browse the contents of the DATA_XML column directly in TOAD.

If available in the data management tool you use to view the log file, you can
focus your viewing by filtering entries—by audit level (view only debug-level
or info-level entries), by time (view entries within the past hour), by operation
success / failure (show error entries only), and so on.

The following SQL statement is just one example:

SELECT ROWID_AUDIT, FROM_SYSTEM, TO_SYSTEM, USERNAME, COMPONENT, ACTION,
STATUS, TABLE_NAME, ROWID_OBJECT, ROWID_AUDIT_PREVIOUS, DATA_XML,
CREATE_DATE
FROM C_REPOS_AUDIT
WHERE CREATE_DATE >= TO_DATE('07/06/2006 12:23:00', 'MM/DD/YYYY HH24:MI:SS')
ORDER BY CREATE_DATE
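
Because each response entry records the ROWID_AUDIT of its originating
request in ROWID_AUDIT_PREVIOUS, you can pair requests with their responses
in a single query. The following self-join is only a sketch; the column aliases
are illustrative:

SELECT req.ROWID_AUDIT AS REQUEST_ID, req.ACTION, req.CREATE_DATE,
       resp.ROWID_AUDIT AS RESPONSE_ID, resp.STATUS AS RESPONSE_STATUS
FROM C_REPOS_AUDIT req
JOIN C_REPOS_AUDIT resp ON resp.ROWID_AUDIT_PREVIOUS = req.ROWID_AUDIT
ORDER BY req.CREATE_DATE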

Sample Audit Log Entries


In a typical C_REPOS_AUDIT view, entries logged without the Include XML
option show only the descriptive columns. When both the Enable Audit and
Include XML check boxes are enabled, the entries also carry the captured
request or response XML in the DATA_XML column.

Periodically Purging the Audit Log
The audit log table can grow very large rapidly, particularly when capturing
XML request and response information (when the Include XML option is
enabled). Using tools provided by your database management system,
consider setting up a scheduled job that periodically deletes records matching
a particular filter (such as entries created more than 60 minutes ago).

The following SQL statement is just one example:

DELETE FROM C_REPOS_AUDIT
WHERE CREATE_DATE < (SYSDATE - 1)
AND STATUS = 'INFO'
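
To automate the purge, you could schedule the statement with your database
scheduler. The following anonymous block is a minimal sketch using Oracle's
DBMS_SCHEDULER package; the job name and the daily 02:00 schedule are
illustrative assumptions, not Informatica defaults:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'PURGE_C_REPOS_AUDIT',  -- hypothetical job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN DELETE FROM c_repos_audit
                        WHERE create_date < (SYSDATE - 1)
                        AND status = ''INFO''; COMMIT; END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY;BYHOUR=2',  -- run once a day at 02:00
    enabled         => TRUE);
END;
/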

Part 6: Appendixes

Contents
• "Configuring International Data Support" on page 698
• "Backing Up and Restoring Informatica MDM Hub" on page 706
• "Configuring User Exits" on page 708
• "Viewing Configuration Details" on page 715
• "Implementing Custom Buttons in Hub Console Tools" on page 730
• "Configuring Access to Hub Console Tools" on page 737
• "Row-level Locking" on page 740

Appendix A: Configuring International Data Support

This appendix explains how to configure character sets in an Informatica MDM
Hub implementation. The database must support the character set you want
to use, the client terminal must be configured to support that character set,
and the NLS_LANG environment variable must include the Oracle name for the
character set used by your client terminal.

Appendix Contents
• "Configuring Unicode in Informatica MDM Hub" on page 698
• "Configuring the ANSI Code Page (Windows Only)" on page 703
• "Configuring NLS_LANG" on page 704

Configuring Unicode in Informatica MDM Hub


This section explains how to configure Informatica MDM Hub to use Unicode
Transformation Format (UTF8) encoding.

Creating and Configuring the Database


The Oracle database used for your Informatica MDM Hub implementation must
be created and configured to support the character set that you want to use. If
your implementation will use mixed locale information (for example, data
from multiple countries with different character sets or display requirements),
in order for match to work correctly, you must set up a UTF8 Oracle database.
If, however, the database will contain data from a single locale, a UTF8
database is probably not required.

To set up a UTF8 Oracle database, complete the following steps:


1. Create a UTF8 database and choose the following settings:
• database character set: AL32UTF8
• national character set: AL16UTF16
Note: Oracle recommends using AL32UTF8 as the database character set
for Oracle 10g. For previous Oracle releases, refer to your Oracle
documentation.
2. Set NLS_LANG on both the server and the client:
AMERICAN_AMERICA.AL32UTF8

Notes:
• The NLS_LANG setting should match the database character set.

• The language_territory portion of the NLS_LANG setting (represented
as “AMERICAN_AMERICA” in the above example) is locale-specific and might
not be suitable for all Informatica MDM Hub implementations. For
example, a Japanese implementation might need to use the following
setting instead:
NLS_LANG=JAPANESE_JAPAN.AL32UTF8

• If you use AL32UTF8 (or even UTF8) as the database character set,
then it is highly recommended that you set NLS_LENGTH_SEMANTICS
to CHAR (in the Oracle init.ora file) when you instantiate the database.
Doing so forces Oracle to default to CHAR (not BYTE) for variable
length definitions. The NLS_LENGTH_SEMANTICS setting affects all
character-related variable types: VARCHAR, VARCHAR2, and CHAR.
3. Ensure that the Regional Font Settings are correctly configured on the
client. For East Asian data, be sure to install East Asian fonts.
4. When editing data, the regional font settings should match the language
being used.
5. If you are using a multi-byte character set in your Oracle database, you
must change the following setting in the C_REPOS_DB_RELEASE table to zero
(0); a SQL sketch follows this list:
column_length_in_bytes_ind = 0

By default, this setting is one (1), which means that column lengths are
declared as byte values. Changing this to zero (0) means that column
lengths are declared as CHAR values in support of Unicode values.
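
A minimal SQL sketch of this change, assuming you are connected to the ORS
schema with SQL*Plus or a similar tool:

update c_repos_db_release set column_length_in_bytes_ind = 0;
COMMIT;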

Configuring Match Settings for Non-US Populations


This section describes how to configure match settings for non-United States
populations. For an introduction, see "Population Sets" on page 248.

Configuring Populations

By default, Informatica MDM Hub supports the population for the United States
(provides a usa.ysp file in the default installation). If your implementation
needs to use a population other than the US population, then additional
analysis of the data is required.
• If the data is exclusively from a different country, and Informatica
provides a population for that country, then use that population. Contact
Informatica Support to obtain the population.ysp file that is appropriate for
your implementation, along with instructions to enable the population.
• If the data is mostly from one country with very small amounts of mixed
data from one or more other populations, consider using the majority
population. Contact Informatica Support to obtain the population.ysp file
for the majority population, along with any instructions.

• If large quantities of data from different countries are mixed, consider
whether it is meaningful to match across such a disparate set of data. If
so, then consider using the “international” population. Contact Informatica
Support to obtain the appropriate population.ysp file and instructions to
enable the population.
• For all other situations, contact Informatica Support.

To configure match settings for UTF8:


1. In the C_REPOS_SSA_POPULATION metadata table, enable the appropriate
SSA_POPULATION.
Contact Informatica Support to obtain the appropriate means to enable the
population you want to use. The SSA_POPULATION defines the Standard
Population Set to use for match purposes. A Standard Population Set
contains the rules that define how the Key Building, Search Strategies, and
Match Purposes operate on a particular population of data. There is one
Standard Population Set for each supported country, language, or
population.
2. Copy the appropriate population.ysp file obtained from Informatica
Support to the following location.
Windows
<infamdm_install_dir>\cleanse\resources\match

For example:
C:\<infamdm_install_dir>\hub\cleanse\resources\match

Unix
<infamdm_install_dir>/hub/cleanse/
Note: Informatica ships the usa.ysp file by default. If you need to use the
population set for a different country, contact Informatica Support to
obtain the population.ysp file that is appropriate for your implementation,
along with instructions to enable the population.

Configuring Encoding for Match Processing

To configure encoding for match processing, edit the cmxcleanse.properties
file and add the following setting:
cmx.server.match.server_encoding = 1
cmx.server.match.server_encoding = 1

This setting helps with the processing of UTF8 characters during match,
ensuring that all data is represented in UTF16 (although its representation in
the database is still UTF8).

Using Multiple Populations Within a Single Base Object

Informatica MDM Hub provides you with the ability to use multiple populations
within a single base object. This is useful if data in a base object comes from
different populations—for example, 70% of the records from the United States
and 30% from China. Populations can vary on a record-by-record basis.

To use multiple (two or more) populations within a base object:


1. Contact Informatica Support to obtain the applicable population.ysp file(s)
for your implementation, along with instructions for enabling the
population.
2. For each population that you want to use, enable it in the
C_REPOS_SSA_POPULATION metadata table
(c_repos_ssa_population.enabled_ind=1); a SQL sketch follows this list.

3. Copy the applicable population.ysp file(s) obtained from Informatica
Support to the following location.
Windows
<infamdm_install_dir>\cleanse\resources\match

For example:
C:\<infamdm_install_dir>\hub\cleanse\resources\match

Unix
<infamdm_install_dir>/hub/cleanse/
4. Restart the application server.
5. In the Schema Manager, add a column to the base object that will contain
the population to use for each record.
This must be a VARCHAR column with the physical name of SIP_POP.

Note: The width of the VARCHAR column must fit the largest population
name in use. A width of 30 is probably sufficient for most
implementations.
6. Configure the match column as an exact match column with the name of
SIP_POP, according to the instructions in "Configuring Match Columns" on
page 387.

7. For each record in the base object that will use a non-default population,
provide (in the SIP_POP column) the name of the population to use
instead.
• You can specify values for the SIP_POP column in any number of
ways: adding the data in the landing tables, using cleanse functions
that calculate the values during the stage process, invoking SIF
requests from external applications—even manually editing the cells
using the Data Manager tool. The only requirement is that the SIP_POP
cells must contain this data for all non-default populations just prior to
executing the Generate Match Tokens process.
• The data in the SIP_POP column can be in any case (upper, lower, or
mixed) because all alphabetic characters will be converted to
lowercase in the match key table. For example, Us, US, and us are all
valid values for this column.
• Invalid values in this column will be processed using the default
population. Invalid values include NULLs, empty strings, and any string
that does not match a population name as defined in c_repos_ssa_
population.population_name.

8. Execute the Generate Match Tokens process on this base object to update
the match key table.
9. Execute the match process on this base object.
Note: The match process compares only records that share the same
population. For example, it will compare Chinese records with Chinese
records, and American records with American records. Any resulting
match pairs will be between records that share the same population.
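
As a sketch of step 2, the following SQL enables one additional population; the
population name 'china' is illustrative only and must match a name supplied
by Informatica Support:

update c_repos_ssa_population set enabled_ind = 1
where population_name = 'china';
COMMIT;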

Cleanse Settings for Unicode


• If you are using the Address Doctor cleanse libraries, ensure that you have
the right database and the unlock code for Address Doctor. You will need to
obtain the Address Doctor database for all countries needed for your
implementation. Contact Informatica Support for details.
• If you are using Trillium, make sure that you use the right template to
create the project. Refer to the Trillium installation documentation to
determine which countries are supported. Obtain country-specific projects
from Trillium directly.

Data in Landing Tables


Make sure that the data that is pushed into the landing table is UTF8. This
should be taken care of during the ETL process.

Hub Console
In the Hub Console, menus, warnings, and so on are in English. Current
Informatica MDM Hub UTF support applies only to business data—not
metadata or the interface. The Hub Console will have UTF8 support in a future
release.

Locale Recommendations for UNIX When Using UTF8
Many UNIX systems use incompatible character encodings to represent their
local alphabets as binary data. This means that, for example, one string of
text written on a Korean system will not work in a Chinese setting. However,
you can make UNIX systems use UTF-8 encoding for any language. UTF-8 text
encoding supports many languages so that one language does not interfere
with another.

You can configure the system locale settings (which define settings for the
system language) to use UTF-8 by completing the following steps:
1. Run the following command:
locale -a

2. Determine whether the output lists a locale for your language with a name
ending in .utf8. If no such locale exists, you can create one; for example:
localedef -f UTF-8 -i en_US en_US.utf8

3. Once you have a locale that allows you to use UTF-8, instruct the UNIX
system to use that locale:
export LC_ALL="en_US.utf8"
export LANG="en_US.utf8"
export LANGUAGE="en_US.utf8"

Configuring the ANSI Code Page (Windows Only)
This section explains how to determine and configure the ANSI code page
(ACP) in Windows.

Determining the ANSI Code Page


Like almost all Windows settings, the ACP is stored in the registry. To
determine the ACP:
1. From the Start menu, choose Run.
2. At the command prompt, type regedit and then click OK.
3. Browse to the following registry entry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage\ACP

Note: There are many registry entries with very similar names, so be sure to
look at the right place in the registry.

Changing the ANSI Code Page


To change the ANSI Code Page in Windows, you need to configure locale and
language settings in the Control Panel. The instructions differ for Windows XP
and Windows 2003 systems. For instructions, refer to your Microsoft Windows
documentation.

Note: On Windows XP systems, you might need to install support for non-
Western languages.

Configuring NLS_LANG
To specify the locale behavior of your client Oracle software, you need to set
your NLS_LANG setting, which specifies the language, territory, and the
character set of your client. This section describes several ways in which to
configure the NLS_LANG setting.

Syntax for NLS_LANG


The NLS_LANG setting uses the following format:
NLS_LANG = LANGUAGE_TERRITORY.CHARACTERSET

where:
LANGUAGE: Specifies the language used for Oracle messages, as well as the
names of days and months.
TERRITORY: Specifies monetary and numeric formats, as well as territory
conventions for calculating week and day numbers.
CHARACTERSET: Controls the character set used by the client application; it
either matches your Windows code page or is set to UTF8 for a Unicode
application.

Note: The character set defined with the NLS_LANG parameter does not change
your client's character set. Instead, it is used to let Oracle know which
character set you are using on the client side so that Oracle can perform the
proper conversion. The character set part of the NLS_LANG parameter is never
inherited from the server.
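
To verify which settings are actually in effect, you can query the standard
Oracle NLS views from any SQL client. NLS_SESSION_PARAMETERS reflects your
client-side NLS_LANG, while NLS_DATABASE_PARAMETERS shows the database
character set:

SELECT * FROM NLS_SESSION_PARAMETERS;
SELECT VALUE FROM NLS_DATABASE_PARAMETERS WHERE PARAMETER = 'NLS_CHARACTERSET';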

Configuring NLS_LANG in the Windows Registry
On Windows systems, make sure that you have set an NLS_LANG registry
subkey for each of your Oracle Homes.

You can modify this subkey using the Windows Registry Editor:
1. From the Start menu, choose Run...
2. At the command prompt, type regedit, and then click OK.
3. Edit the following registry entry:
For Oracle 10g:
HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_<oracle_home_name>

There you should have an entry with the name NLS_LANG.

When starting an Oracle tool (such as sqlplusw), the tool will read the contents
of the oracle.key file located in the same directory to determine which
registry tree will be used (therefore, which NLS_LANG subkey will be used).

Configuring NLS_LANG as an Environment Variable


Although the Windows Registry is the primary repository for settings in
Windows systems, and it is the recommended way to configure NLS_LANG,
there are alternatives. You can set NLS_LANG as a System or User Environment
Variable in the System properties, although this is not the recommended
approach. The configured setting will be used for all Oracle homes.

To check and modify system or user environment variables:


1. Right-click the My Computer icon and choose Properties.
2. Click the Advanced tab.
3. Click Environment Variables.
• The User Variables list contains the settings for the currently logged-in
Windows user.
• The System Variables list contains system-wide variables for all users.
4. Change settings as needed.

Because these environment variables take precedence over the parameters
specified in your Windows Registry, you should not set Oracle parameters at
this location unless you have a very good reason. In particular, note that the
ORACLE_HOME parameter is set on Unix but not on Windows.

Appendix B: Backing Up and Restoring Informatica MDM Hub

This appendix explains how to back up and restore an Informatica MDM Hub
implementation.

Appendix Contents
• "Backing Up Informatica MDM Hub" on page 706
• "Backup and Recovery Strategies for Informatica MDM Hub" on page 706

Backing Up Informatica MDM Hub


This appendix describes backup and recovery strategies for Master Reference
Manager (MRM) tables (permanent Hub tables) that are operated on by logging
or non-logging operations.

Non-logging operations (such as CTAS, Direct Path SQL Load, and Direct Insert)
are occasionally performed on permanent Hub tables to speed up batch
processes. These operations are not recorded in the redo logs and, as such,
are not generally recoverable. However, recovery is possible if a backup is
made immediately after the operations are completed.

Backup and recovery strategies depend on the value of the
GLOBAL_NOLOGGING_IND column (in the C_REPOS_DB_RELEASE table), which
turns non-logging operations on or off.

The GLOBAL_NOLOGGING_IND column has two possible values:
• GLOBAL_NOLOGGING_IND = 1 (default), which indicates that non-logging
operations are enabled.
• GLOBAL_NOLOGGING_IND = 0, which indicates that non-logging operations are
disabled.

Note: GLOBAL_NOLOGGING_IND controls non-logging operations for permanent
Hub tables only, not for transient tables that are used in Hub batch
processes.
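
To check which mode is currently in effect for an ORS, you can query the
column directly:

SELECT global_nologging_ind FROM c_repos_db_release;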

Backup and Recovery Strategies for Informatica MDM Hub
Different backup and recovery strategies are required depending on whether
non-logging operations are occurring, that is, depending on the GLOBAL_
NOLOGGING_IND column value. This section describes the two different kinds of

- 706 -
backup and recovery strategies: backup and recovery with non-logging
operations, and backup and recovery without non-logging operations.

Backup and Recovery With Non-Logging Operations


When non-logging operations on permanent Hub tables are enabled (GLOBAL_
NOLOGGING_IND =1), the following Informatica MDM Hub processes perform
non-logging operations on permanent tables:
• Staging with Delta Detection and Raw Detection
• Tokenization
• Match
• Merge

To be able to recover changes made by non-logging operations, you must
back up the database immediately after the operations complete.

Backup and Recovery Without Non-Logging Operations
If non-logging operations on permanent Hub tables are disabled (GLOBAL_
NOLOGGING_IND = 0), redo logs can be used to ensure database recoverability.

To ensure database recoverability:


1. Log on to SQL*Plus as the ORS user.
2. Update the C_REPOS_DB_RELEASE table to disable non-logging operations:
update c_repos_db_release set GLOBAL_NOLOGGING_IND = 0;
COMMIT;

3. Disable index creation with the non-logging option:
update c_repos_table set NOLOGGING_IND = 0;
COMMIT;

4. Make sure that the database is running in archive log mode (a SQL*Plus
sketch follows this list).
5. Perform a database backup.
6. If recovery is needed, apply redo logs on the backup.
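
For step 4, the following SQL*Plus sketch first reports the current mode and
then, assuming SYSDBA access and a maintenance window, switches a
database that is in NOARCHIVELOG mode:

SELECT log_mode FROM v$database;

SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;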

Appendix C: Configuring User Exits

This appendix provides reference information for the various predefined
Informatica MDM Hub user exit procedures.

Appendix Contents
• "About User Exits" on page 708
• "Types of User Exits" on page 708

About User Exits


A user exit is an unencrypted stored procedure that includes a set of fixed,
pre-defined parameters. The procedure is configured, on a per-base-object
basis, to execute at a specific point during the execution of an Informatica
MDM Hub batch job. For more information on how to view user exits with the
User Object Registry tool, see "Viewing User Exits" on page 679.

Note: The POST_LANDING, PRE_STAGE, and POST_STAGE user exits are only
called from the batch Stage process. For more information, see "Stage Jobs"
on page 556.

Informatica MDM Hub automatically provides the appropriate input parameter
values when it calls a user exit procedure. In addition, Informatica MDM Hub
automatically checks the return code returned by a user exit procedure. A
negative return code causes the Hub process to terminate with an error
condition.

A user exit must perform its own transaction handling. COMMITs / ROLLBACKs
must be explicitly issued for any data manipulation operation(s) in a user exit,
or in stored procedures called from user exits. However, this is not true for
the SIF API requests (for example, Merge, Unmerge, and so on). Transactions
for the API requests are handled by Java code. Any COMMITs / ROLLBACKs in
such a case may cause a Java distributed transaction error.

Note: Dynamic SQL is recommended for all DML/DDL statements, as a user
exit could access objects that only exist at run time.

Note: For Oracle databases, all user exit procedures are located in the cmxue
package.

Types of User Exits


Here are the various types of user exit procedures:

POST_LANDING: Refines data in a landing table after the landing table has
been populated by an ETL process. For more information, see
"POST_LANDING User Exit" on page 709.
PRE_STAGE: Called before data is loaded into a staging table. For more
information, see "PRE_STAGE User Exit" on page 710.
POST_STAGE: Called after a staging table has been populated. For more
information, see "POST_STAGE User Exit" on page 710.
POST_LOAD: Called after a Load batch job and after a Put API call. For more
information, see "POST_LOAD User Exit" on page 711.
PRE_MATCH: Called before a Match batch job.
POST_MATCH: Called after a Match batch job. For more information, see
"POST_MATCH User Exit" on page 712.
PRE_USER_MERGE_ASSIGNMENT: Called just before records to be merged are
assigned to a user. For more information, see
"PRE_USER_MERGE_ASSIGNMENT" on page 714.
POST_MERGE: Called after a Merge or a Multi-Merge batch job and after a
Merge API call. For more information, see "POST_MERGE User Exit" on page
713.
POST_UNMERGE: Called after an Unmerge API call. For more information, see
"POST_UNMERGE User Exit" on page 713.

User Exits for the Stage Process


The POST_LANDING, PRE_STAGE, and POST_STAGE user exits are only called
from the batch Stage process. For more information, see "Stage Jobs" on page
556.

POST_LANDING User Exit

Use a POST_LANDING user exit for custom work on the landing table prior to
delta detection. For example:
• Hard delete detection
• Replace control characters with printable characters
• Perform any special pre-cleansing processes on Addresses

POST_LANDING Parameters

IN_ROWID_JOB: Job ID for the Stage job, as registered in C_REPOS_JOB_CONTROL.
IN_LANDING_TABLE_NAME: Source table for the Stage job.
IN_STAGING_TABLE_NAME: Target table for the Stage job.
IN_PRL_TABLE_NAME: Previous landing table name; that is, the copy of the
source data mapped to the staging table from the previous time the Stage
job ran.
OUT_ERROR_MESSAGE: Error message.
OUT_RETURN_CODE: Return code.

PRE_STAGE User Exit

Use a PRE_STAGE user exit for any special handling of delta processes. For
example, use a PRE_STAGE user exit to check delta volumes and determine
whether they exceed pre-defined allowable delta volume limits (for example,
“stop process if source system is System A and the number of deltas is
greater than 500,000”).

PRE_STAGE Parameters

IN_ROWID_JOB: Job ID for the Stage job, as registered in C_REPOS_JOB_CONTROL.
IN_LANDING_TABLE_NAME: Source table for the Stage job.
IN_STAGING_TABLE_NAME: Target table for the Stage job.
IN_DLT_TABLE_NAME: Delta table name; that is, the table containing the
records identified as deltas.
OUT_ERROR_MESSAGE: Error message.
OUT_RETURN_CODE: Return code.

POST_STAGE User Exit

Use a POST_STAGE user exit for any special processing at the end of a Stage
job. For example, use a POST_STAGE user exit for special handling of rejected
records from the Stage job (for example, to automatically delete rejects for
known, non-critical conditions).

POST_STAGE Parameters

IN_ROWID_JOB: Job ID for the Stage job, as registered in C_REPOS_JOB_CONTROL.
IN_LANDING_TABLE_NAME: Source table for the Stage job.
IN_STAGING_TABLE_NAME: Target table for the Stage job.
IN_PRL_TABLE_NAME: Previous landing table name; that is, the copy of the
source data mapped to the staging table from the previous time the Stage
job ran.
OUT_ERROR_MESSAGE: Error message.
OUT_RETURN_CODE: Return code.
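
As a concrete illustration of the signature above, here is a minimal,
hypothetical PL/SQL sketch of a POST_STAGE user exit. The procedure name,
parameter datatypes, the _REJ reject-table suffix, and the filter condition are
all illustrative assumptions; an actual user exit must follow your Hub's
registration conventions (for Oracle, within the cmxue package):

CREATE OR REPLACE PROCEDURE post_stage_sketch (
  in_rowid_job          IN  VARCHAR2,
  in_landing_table_name IN  VARCHAR2,
  in_staging_table_name IN  VARCHAR2,
  in_prl_table_name     IN  VARCHAR2,
  out_error_message     OUT VARCHAR2,
  out_return_code       OUT NUMBER
)
AS
BEGIN
  -- Dynamic SQL, because stage-related tables may only exist at run time.
  EXECUTE IMMEDIATE
    'DELETE FROM ' || in_staging_table_name || '_REJ'         -- assumed reject-table name
    || ' WHERE reject_reason LIKE ''%known non-critical%''';  -- assumed filter
  COMMIT;                    -- user exits must handle their own transactions
  out_return_code := 0;      -- non-negative return code: success
EXCEPTION
  WHEN OTHERS THEN
    ROLLBACK;
    out_error_message := SUBSTR(SQLERRM, 1, 200);
    out_return_code   := -1; -- a negative return code terminates the Hub job
END post_stage_sketch;
/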

User Exits for the Load Process


POST_LOAD User Exit

Use a POST_LOAD user exit after an update or after an insert from Load.

For the Load process, the IN_ACTION_TABLE has the name of the work table
containing the ROWID_OBJECT values to be inserted/updated.

POST_LOAD Parameters

IN_ROWID_JOB: Job ID for the Load job, as registered in C_REPOS_JOB_CONTROL
(blank for the Put).
IN_TABLE_NAME: Name of the target table (base object or relationship table)
for the Load job.
IN_STAGE_TABLE: Name of the source table for the Load job.
IN_ACTION_TABLE: For the Load job, the name of the table containing the rows
to be inserted or updated (staging_table_name_TINS for inserts,
staging_table_name_TOPT for updates).
OUT_ERROR_MESSAGE: Error message.
OUT_RETURN_CODE: Return code.

User Exits for the Match Process


POST_MATCH User Exit

Use a POST_MATCH user exit for custom work on the match table.

For example, use a POST_MATCH user exit to manipulate matches in the
match queue.

POST_MATCH Parameters

IN_ROWID_JOB: Job ID for the Match job, as registered in C_REPOS_JOB_CONTROL.
IN_TABLE_NAME: Base object that the Match job is running on.
IN_MATCH_SET_NAME: Match rule set.
OUT_ERROR_MESSAGE: Error message.
OUT_RETURN_CODE: Return code.

User Exits for the Merge Process


POST_MERGE User Exit

Use a POST_MERGE user exit to perform custom work after the Merge
process.

For example, use a POST_MERGE user exit to automatically match and merge
child records affected by the match and merge of a parent record.

POST_MERGE Parameters

IN_ROWID_JOB: Job ID for the Merge job, as registered in C_REPOS_JOB_CONTROL.
IN_TABLE_NAME: Base object that the Merge job is running on.
IN_ROWID_OBJECT_TABLE: For a bulk merge, the action table; for an on-line
merge, an inline view.
OUT_ERROR_MESSAGE: Error message.
OUT_RETURN_CODE: Return code.

User Exits for the Unmerge Process


POST_UNMERGE User Exit

Use a POST_UNMERGE user exit for custom work after the Unmerge process.

POST_UNMERGE Parameters

IN_ROWID_JOB: Job ID for the Unmerge transaction, as registered in
C_REPOS_JOB_CONTROL.
IN_TABLE_NAME: Base object that the Unmerge job is running on.
IN_ROWID_OBJECT: Reinstated rowid_object.
OUT_ERROR_MESSAGE: Error message.
OUT_RETURN_CODE: Return code.

Additional User Exits


PRE_USER_MERGE_ASSIGNMENT

Use this user exit to override or extend user assignment lists. This user exit
procedure runs before the user merge assignment is updated. Note that user
assignment lists are stored in C_REPOS_USER_MERGE_ASSIGNMENTS.

Appendix D: Viewing Configuration Details

This appendix explains how to use the Enterprise Manager tool in the Hub
Console to configure and view the various properties, history, and database
log information in an Informatica MDM Hub implementation.

Appendix Contents
• "About the Enterprise Manager" on page 715
• "Starting the Enterprise Manager" on page 715
• "Viewing Enterprise Manager Properties" on page 716
• "Viewing Version History" on page 723
• "Using ORS Database Logs" on page 724

About the Enterprise Manager


The Enterprise Manager tool enables you to view properties, version histories,
and environment reports for the Hub server, the Cleanse servers, the ORS
databases, and the Master database. Enterprise Manager also provides access
for configuring and viewing the database logs for your ORS databases.

Starting the Enterprise Manager


To start the Enterprise Manager tool:
• In the Hub Console, do one of the following:
• Expand the Configuration workbench, and then click Enterprise
Manager.
OR
• In the Hub Console toolbar, click the Enterprise Manager tool quick
launch button.
The Hub Console displays the Enterprise Manager tool.

Viewing Enterprise Manager Properties
This section explains how to choose the different servers or databases to
view, and lists the properties that Enterprise Manager displays for the Hub
server, cleanse server, and Master Database. In addition, the Enterprise
Manager displays version history for each component.

Choosing Properties to View


Before you can choose servers or databases to view, you must first start
Enterprise Manager. See "Starting the Enterprise Manager" on page 715.

In the Enterprise Manager screen, choose the hub component tab for the type
of information you want to view. The following options are available:
• Hub Servers
• Cleanse Servers
• Master database
• ORS databases
• Environment Report

In each of these categories, you can choose to view Properties or Version
History. Enterprise Manager displays the properties (or Environment report)
that are specific to your selection.

Hub Server Properties
When you choose the Hub Server tab, Enterprise Manager displays the Hub
Server properties. For additional information regarding these properties, refer
to the cmxserver.properties file.

In addition, to view more information about each property, hover the cursor
over the property.

The following table describes Hub Server properties that Enterprise Manager
can display in the Properties tab. These properties are found in the
cmxserver.properties file (in the hub server installation directory), and are not
configurable.
• Installation directory: Installation directory of the Hub Server.
Property: cmx.home=C:/<infamdm_install_dir>/hub/server
• Master database type: Type of Master Database.
Property: cmx.server.masterdatabase.type=ORACLE
• Application server type: Type of application server (JBoss, WebSphere, or
WebLogic).
Property: cmx.appserver.type=<application_server_name>
• Application server hostname: Optional property used to deploy MRM into
the EJB cluster.
Property: cmx.appserver.hostname=Clustername
• RMI port: Application server port (depends on the appserver type); default
settings are 2809 for WebSphere, 1099 for JBoss, and 7001 for WebLogic.
Property: cmx.appserver.rmi.port=<port_#>
• Naming protocol: Naming protocol for the application server type (iiop for
WebSphere, jnp for JBoss, t3 for WebLogic).
Property: cmx.appserver.naming.protocol=jnp
• Initial heap size for Java Web Start JVM.
Property: jnlp.initial-heap-size=128m
• Maximum heap size for Java Web Start JVM.
Property: jnlp.max-heap-size=512m
• Refresh interval for SAM resources in clock ticks: Properties specific to the
Security Access Manager component, used to manage cached resources for
user profiles.
Properties:
cmx.server.sam.cache.resources.refresh_interval=5
cmx.server.sam.cache.user_profile.refresh_interval=1
cmx.server.clock.tick_interval=60000
• Refresh interval for SAM user profiles in clock ticks.
Properties:
cmx.server.provider.userprofile.cacheable=false
cmx.server.provider.userprofile.expiration=60000
cmx.server.provider.userprofile.lifespan=60000
• Lookup dropdown limit: Number of entries that will be populated in a
dropdown menu in the Data Manager and Merge Manager tools. There is no
minimum or maximum limit for this value.
Property: sip.lookup.dropdown.limit=100
• Java runtime environment vendor: Sun Microsystems Inc.
• Java runtime environment version: 1.5.0_15

Cleanse Server Properties


When you choose the Cleanse Servers tab, a list of the cleanse servers is
displayed. When you select a specific cleanse server, Enterprise Manager
displays its properties. If you place your mouse over a specific property, the
property values and their source are also displayed.

The following table describes Cleanse Server properties that Enterprise
Manager can display in the Properties tab. These properties are found in the
cmxcleanse.properties file.

• MRM Cleanse properties: Installation directory of the cleanse files.
Properties:
cmx.server.datalayer.cleanse.working_files.location=C:/<infam
cmx.server.datalayer.cleanse.working_files=KEEP
cmx.server.datalayer.cleanse.execution=LOCAL
• Installation directory: Installation directory of the Hub Server.
Property: cmx.home=C:/<infamdm_install_dir>/hub/server
• Application server type: Type of application server (JBoss, WebSphere,
WebLogic).
Property: cmx.appserver.type=<application_server_name>
• Default port: Application server port; 8880 for WebSphere (this property is
not applicable for JBoss and WebLogic).
Property: cmx.appserver.soap.connector.port=<port_#>
• Match properties:
Number of threads used: cmx.server.match.num_of_threads=1
Server encoding: cmx.server.match.server_encoding=0
Number of records per match ranger node (limits memory use):
cmx.server.match.max_records_per_ranger_node=300
Number of threads used during cleansing activities:
cmx.server.cleanse.num_of_threads=1
• Address Doctor properties:
Cleanse library unlock code:
cleanse.library.addressDoctor.property.AddressDoctor.UnlockCo
Cleanse library database path:
cleanse.library.addressDoctor.property.AddressDoctor.Database
Optimization:
cleanse.library.addressDoctor.property.AddressDoctor.Optimiza
Memory setting:
cleanse.library.addressDoctor.property.AddressDoctor.MemoryMB
Correction type:
cleanse.library.addressDoctor.property.AddressDoctor.Correcti
Certified preload part:
cleanse.library.addressDoctor.property.AddressDoctor.
Certified preload full:
cleanse.library.addressDoctor.property.AddressDoctor.PreLoad.
Correction preload part:
cleanse.library.addressDoctor.property.AddressDoctor.PreLoad.
Correction preload full:
cleanse.library.addressDoctor.property.AddressDoctor.PreLoad.
• Trillium properties:
Cleanse library config file 1:
cleanse.library.trilliumDir.property.config.file.1=C:
default_config_Global.txt
Cleanse library config file 2:
cleanse.library.trilliumDir.property.config.file.2=C:
default_config_US_detail.txt
Cleanse library config file 3:
cleanse.library.trilliumDir.property.config.file.3=C:/<infamd
default_config_US_summary.txt

Master Database Properties


When you choose the Master Database tab, Enterprise Manager displays the
Master Database properties. The only database properties displayed are the
database vendor and version.

ORS Database Properties


When you choose the ORS Databases tab, Enterprise Manager displays a list of
the ORS databases. When a specific ORS database is selected, the list of
properties for that ORS database is displayed.

The top panel contains a list of ORS databases that are registered with the
Master Database. The bottom panel displays the properties and the version
history of the ORS database that is selected in the top panel. Properties of the
ORS include the database vendor and version, as well as information from the
C_REPOS_DB_RELEASE table. Version history is also kept in the
C_REPOS_DB_VERSION table.

Note: Enterprise Manager displays only ORS databases that are valid for the
current version of Informatica MDM Hub. If Enterprise Manager cannot obtain
database information for an ORS database (for example, if the ORS database
requires upgrading to the current Informatica MDM Hub version), then
Enterprise Manager displays a message explaining why the ORS database(s) is
not included in the list.

C_REPOS_DB_RELEASE Table

The following table describes the C_REPOS_DB_RELEASE properties that
Enterprise Manager displays for the ORS databases, depending on your
preference.

DEBUG_LEVEL_STR: Debug level for the ORS database. The default option is
DEBUG, and it is the only supported value.
ENVIRONMENT_ID
DEBUG_FILE_PATH: Path to the location of the debug log.
DEBUG_FILE_NAME: Name of the ORS database debug log.
DEBUG_IND: Flag that indicates whether debug is enabled: 0 = debug is not
enabled; 1 = debug is enabled.
DEBUG_LEVEL_NUMBER: Debug level to process (the standard five levels from
DEBUG to FATAL). Note that the default DEBUG is not used by the standard
DEBUG_PRINT procedure.
DEBUG_LOG_FILE_SIZE: Size of the database log file (in MB); the default is 5.
DEBUG_LOG_FILE_NUMBER: Number of log files used for log rolling; the
default is 5.
TNSNAME: TNS name of the ORS database.
CONNECTION_PORT: Port on which the ORS database listens.
ORACLE_SID: Oracle database identifier.
DATABASE_HOST: Host on which the database is installed.
INTER_SYSTEM_TIME_DELTA_SEC: Delta-detection value, in seconds, which
determines if the incoming data is in the future.
COLUMN_LENGTH_IN_BYTES_IND: Flag that the SQL*Loader uses to determine
if the database it is loading into is a UTF-8 database. The default value of 1
means that column lengths are declared as byte values; 0 means that column
lengths are declared as CHAR values in support of Unicode data.
LOAD_TEMPLATE
MTIP_REGENERATION_REQUIRED_IND: Flag that indicates whether the MTIP
views will be regenerated before the match/merge process. The default value
of 0 (zero) means that views will not be regenerated.
GLOBAL_NOLOGGING_IND: Controls whether non-logging operations are used
when tables are created, which affects DB recovery. The default of 1 means
no logging (non-logging operations enabled).
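
If you need to inspect these settings outside the Hub Console, a plain query
against the ORS works; configuration changes, however, are normally made
through Enterprise Manager:

SELECT debug_ind, debug_level_number, debug_file_path, debug_file_name
FROM c_repos_db_release;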

Environment Report
When you choose the Environment Report tab, Enterprise Manager displays a
summary of the properties of all the other choices, along with any associated
error messages.

Saving the Hub Environment Report

To download this report in HTML format to a file system:
1. Click the Save button at the top of the Enterprise Manager.
The Enterprise Manager displays the Save Hub Environment Report dialog.
2. Specify the desired directory location.
3. Click Save.

Viewing Version History


You use the Enterprise Manager tool to display the version history for the Hub
Server, Cleanse Server, and Master Database.

Before you can choose servers or databases to view, you must first start
Enterprise Manager. See "Starting the Enterprise Manager" on page 715.

To view version history:
1. In the Enterprise Manager screen, choose the tab for the type of
information you want to view: Hub Servers, Cleanse Servers, Master
Database, or ORS Databases.
Enterprise Manager displays version history that is specific to your choice.
2. Select the Version History tab.
Enterprise Manager displays version information for that specific
component. Version history is sorted in descending order of install start
time, and the display is similar for all hub components.

Using ORS Database Logs


You use the Enterprise Manager tool to configure and view ORS database logs
for the various SIF API classes and Informatica MDM Hub Batch jobs.

About Database Logs


Database logs for Hub batch jobs and SIF API requests provide a way to keep
track of debug output, warnings, error messages, and other information about
these processes within an ORS database. If the data steward enables database
logging, the Hub appends the associated debugging information to a file stored
on the host machine that runs the Oracle database instance. The name of the
actual debug file is stored in the ORS metadata C_REPOS_DB_RELEASE table.

Database Log File Format

The database log file format is:
<date> <timestamp> [<log_level>] [sid:<session_id>] [<entry_process>;<calling_process_dot_padded><package_name>.<line_num_in_the_package>] <in_debug_text>

Each entry_process is the name of the entry stored procedure (that is, not the
name of the actual batch job or SIF API request). Here is a sample log:

06-JAN-2009 10:12:53.337[DEBUG][sid:103][Init_debug_vars:Init_debug_vars..........
CMXLOG.173] CMXLOG initialization; 28 module records read.
06-JAN-2009 10:12:53.337[DEBUG][sid:103][Task Assignment
Daemon:Application.............. CMXUE.304] Start of cmxue.assign_tasks
06-JAN-2009 10:12:53.337[INFO ][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.237] Start of Task Assignment Daemon

For a more complete sample database log file, see "Sample Database Log
File" on page 729.

Database Log Levels

Enterprise Manager includes a set of log levels to aid in debugging and
information retrieval. The possible database log levels are:

Name      ORS Metadata Table Value   Description
ALL       500                        Logs all associated information.
DEBUG     400                        Debugging messages (the default log level).
INFO      300                        Database log information.
WARNING   200                        Warning messages.
ERROR     100                        Error messages.
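
Because the level names map one-to-one to the numeric values stored in the ORS metadata, a small lookup helper is easy to write. This sketch is illustrative only; the comparison rule in passes() (a message is written when its level value does not exceed the module's configured level) is an assumption inferred from the table, not documented behavior.

/** ORS database log levels and their C_REPOS_LOG_MODULE values (see the table above). */
public enum OrsLogLevel {
    ALL(500), DEBUG(400), INFO(300), WARNING(200), ERROR(100);

    private final int value;

    OrsLogLevel(int value) { this.value = value; }

    public int value() { return value; }

    /** Assumption for illustration: lower-or-equal level values pass the configured threshold. */
    public boolean passes(OrsLogLevel configured) {
        return this.value <= configured.value;
    }
}

For example, OrsLogLevel.INFO.passes(OrsLogLevel.DEBUG) is true, while OrsLogLevel.DEBUG.passes(OrsLogLevel.ERROR) is false.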

C_REPOS_LOG_MODULE Table

During an ORS database installation, the Hub creates a C_REPOS_LOG_MODULE
table to keep track of database log information for the specific modules
(that is, the actual batch jobs or SIF API requests). This table includes
the following information:
Field              Type            Description
MODULE_NAME        VARCHAR2(100)   Name of the batch job or specific SIF API request.
MODULE_TYPE        VARCHAR2(50)    “Batch Job” or “SIF API” module.
LOG_LEVEL          NUMBER          Log level (100-500).
CREATE_DATE        DATE            Date the database log was created.
CREATOR            VARCHAR2(50)    User ID of the log file creator.
LAST_UPDATE_DATE   DATE            Date the log file was last updated.
UPDATED_BY         VARCHAR2(50)    User name of the last updater.

When an ORS database is installed, the Hub adds a row to the C_REPOS_LOG_
MODULE table for each of the entry processes which may run stored
procedures. These module names could be batch jobs or SIF API requests. By
default, the log level for these modules is DEBUG (400).

The Hub stores the ORS debug file information in the C_REPOS_DB_RELEASE table.
For more information, see "C_REPOS_DB_RELEASE Table" on page 721.
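
Because module log levels live in an ordinary repository table, they can be inspected or adjusted with plain SQL. The following JDBC sketch is hypothetical: the connection details, user name, and module name are invented for illustration, and only the table and column names come from the description above. In practice you would normally change levels through the Enterprise Manager screens described below.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class SetModuleLogLevel {
    public static void main(String[] args) throws Exception {
        // Hypothetical ORS connection details.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:orcl", "ors_user", "ors_password")) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE c_repos_log_module "
                  + "SET log_level = ?, last_update_date = SYSDATE, updated_by = ? "
                  + "WHERE module_name = ?")) {
                ps.setInt(1, 100);                          // ERROR (100): quiet a chatty module
                ps.setString(2, "admin");                   // hypothetical user name
                ps.setString(3, "Task Assignment Daemon");  // hypothetical module name
                System.out.println("Rows updated: " + ps.executeUpdate());
            }
            conn.commit();
        }
    }
}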

Procedures for Appending Data to Database Log Files

The following PL/SQL procedures are used by the ORS CMXMIG package, and by
various setup and migration scripts and methods, to append information to
the database log file.

Note: These are external stored procedures for backward compatibility for
DEBUG_PRINT functionality only.
Procedure            Parameters                                     Description
CMXLB.DEBUG_PRINT    in_debug_text, in_offset                       Basic debug procedure used by the ORS database.
CMXMIG.DEBUG_PRINT   in_debug_text, in_debug_level, in_debug_file   Used only by procedures of the CMXMIG package.
DEBUG_PRINT          in_debug_text                                  Standalone procedure used by setup and migration scripts and by methods of the CMX_TABLE object type.
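
These procedures can also be invoked directly, for example to verify that database logging is wired up. A hedged JDBC sketch, assuming the CMXLB.DEBUG_PRINT parameter list shown above (the connection details are invented, and in_offset is assumed to be numeric):

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

public class DebugPrintExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical ORS connection details.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:orcl", "ors_user", "ors_password");
             CallableStatement cs = conn.prepareCall("{call CMXLB.DEBUG_PRINT(?, ?)}")) {
            cs.setString(1, "Smoke test: database logging is reachable"); // in_debug_text
            cs.setInt(2, 0);                                              // in_offset (assumed numeric)
            cs.execute();
        }
    }
}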

Log File Rolling

When a database log file reaches the maximum size (as defined by DEBUG_
LOG_FILE_SIZE in the C_REPOS_DB_RELEASE table), the Enterprise Manager
performs a log rolling procedure to archive the existing log information and to
prevent log files from being overwritten with new database information:
1. The following settings are defined:
• Maximum file size (MaxFileSize)
• Maximum number of files (MaxBackupIndex)
• Name of the log file (debug.log)
2. The Hub logger (Log4J) appends the various database messages to the
debug.log file.
3. When the debug.log file size exceeds MaxFileSize, the Hub activates
the log rolling procedure:
1. The active debug.log file is renamed to <filename>.hold.
2. Every file named <filename>.(n) is renamed to <filename>.(n+1).
3. If n+1 is larger than MaxBackupIndex, the file is deleted instead.
4. For example, when rolling over goes beyond the maximum number of
files, the Hub renames cmx_debug.log to cmx_debug.log.1, renames
cmx_debug.log.1 to cmx_debug.log.2, and so forth, and then
overwrites cmx_debug.log with the new log information.
5. The <filename>.hold file is renamed to <filename>.1.

- 726 -
Note: The Hub also creates a log.rolling file prior to executing a rollover. If
your log files are not rolling as expected, check your log file directory and
remove the log.rolling file so that log rolling can continue.
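
The renaming scheme is easier to follow as code. This is a from-scratch sketch of the procedure described above, not the Hub's own implementation; the directory, file names, and MaxBackupIndex value are assumptions.

import java.io.File;

public class LogRoller {

    /** Roll baseName -> baseName.1 -> baseName.2 ..., deleting files beyond maxBackupIndex. */
    static void roll(File dir, String baseName, int maxBackupIndex) {
        File hold = new File(dir, baseName + ".hold");
        new File(dir, baseName).renameTo(hold);              // park the active file

        // Shift numbered backups up by one, oldest first, dropping any beyond the limit.
        for (int n = maxBackupIndex; n >= 1; n--) {
            File f = new File(dir, baseName + "." + n);
            if (!f.exists()) {
                continue;
            }
            if (n + 1 > maxBackupIndex) {
                f.delete();                                  // rolled past MaxBackupIndex
            } else {
                f.renameTo(new File(dir, baseName + "." + (n + 1)));
            }
        }
        hold.renameTo(new File(dir, baseName + ".1"));       // parked file becomes .1
    }

    public static void main(String[] args) {
        roll(new File("/tmp/orslogs"), "cmx_debug.log", 5);  // assumed location and limit
    }
}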

Configuring ORS Database Logs


Before you can choose servers or databases to view, you must first start
Enterprise Manager. See "Starting the Enterprise Manager" on page 715.

To configure the database logs for an ORS database:


1. In the Enterprise Manager screen, click the ORS databases tab. The
screen displays the various ORS databases that are registered with the
Master Database.
2. Select the desired ORS database from the list of ORS databases.
3. Click the Database Log Configuration tab.
Enterprise Manager displays the various debug file settings, and the SIF
API and Batch Job log level settings for the selected ORS database.

4. Check the Debug Enabled check box in the top pane of the Database
Log Configuration screen.
5. Enter the debug file path, debug file name, and debug file level.

6. Enter the database log file size.
7. Enter the number of database log files.
8. Click Save.
9. Set the associated application server settings for debugging:
IBM WebSphere:
1. Open the following file for editing: log4j.xml
This file is in <infamdm_install_dir>\hub_5910\server\conf.
2. Change default value to DEBUG:
<category name="com.delos">
<priority value="DEBUG"/>
</category>

<category name="com.siperian">
<priority value="DEBUG"/>
</category>

<category name="siperian.performance">
<priority value="INFO"/>
</category>

Repeat the edit for the cleanse server configuration:
1. Open the following file for editing: log4j.xml
This file is in <infamdm_install_dir>\hub_5910\cleanse\conf.
2. Change default value to DEBUG:
<category name="com.delos">
<priority value="DEBUG"/>
</category>

<category name="com.siperian">
<priority value="DEBUG"/>
</category>

<category name="siperian.performance">
<priority value="INFO"/>
</category>

JBoss:
1. Open the following file for editing: jboss-log4j.xml
This file is in jboss-5.1.0.GA_oracle\server\default\conf
2. Change default value to DEBUG:
<category name="com.delos">
<priority value="DEBUG"/>
</category>

<category name="com.siperian">
<priority value="DEBUG"/>
</category>

<category name="siperian.performance">
<priority value="INFO"/>
</category>

Sample Database Log File


Here is a sample database log file:
06-JAN-2009 10:12:53.337[DEBUG][sid:103][Init_debug_vars:Init_debug_vars..........
CMXLOG.173] CMXLOG initializtion; 28 module records read.
06-JAN-2009 10:12:53.337[DEBUG][sid:103][Task Assignment
Daemon:Application.............. CMXUE.304] Start of cmxue.assign_tasks
06-JAN-2009 10:12:53.337[INFO ][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.237] Start of Task Assignment Daemon
06-JAN-2009 10:12:53.337[DEBUG][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.239] **Start of cmxtask.assign_tasks
06-JAN-2009 10:12:53.337[DEBUG][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.240] Maximum number of records to assign per user is 25
06-JAN-2009 10:12:53.337[DEBUG][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.467] assign_tasks ok
06-JAN-2009 10:12:53.337[DEBUG][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.470] End of cmxtask.assign_tasks
06-JAN-2009 10:12:53.337[DEBUG][sid:103][Task Assignment
Daemon:Application.............. CMXUE.308] End of cmxue.assign_tasks
06-JAN-2009 10:13:53.369[DEBUG][sid:103][Init_debug_vars:Init_debug_vars..........
CMXLOG.173] CMXLOG initializtion; 28 module records read.
06-JAN-2009 10:13:53.369[DEBUG][sid:103][Task Assignment
Daemon:Application.............. CMXUE.304] Start of cmxue.assign_tasks
06-JAN-2009 10:13:53.369[INFO ][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.237] Start of Task Assignment Daemon
06-JAN-2009 10:13:53.369[DEBUG][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.239] **Start of cmxtask.assign_tasks
06-JAN-2009 10:13:53.369[DEBUG][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.240] Maximum number of records to assign per user is 25
06-JAN-2009 10:13:53.369[DEBUG][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.467] assign_tasks ok
06-JAN-2009 10:13:53.369[DEBUG][sid:103][Task Assignment Daemon:Task Assignment
Daemon... CMXTASK.470] End of cmxtask.assign_tasks
06-JAN-2009 10:13:53.369[DEBUG][sid:103][Task Assignment
Daemon:Application.............. CMXUE.308] End of cmxue.assign_tasks

Appendix E: Implementing Custom
Buttons in Hub Console Tools

This appendix explains how, in an Informatica MDM Hub implementation, you
can add custom buttons to tools in the Hub Console that allow you to invoke
external services on demand.

Appendix Contents
• "About Custom Buttons in the Hub Console" on page 730
• "Adding Custom Buttons" on page 731

About Custom Buttons in the Hub Console


In your Informatica MDM Hub implementation, you can provide Hub Console
users with custom buttons that can be used to extend your Informatica MDM
Hub implementation. Custom buttons can provide users with on-demand, real-
time access to specialized data services. Custom buttons can be added to
Merge Manager and Hierarchy Manager.

Custom buttons can give users the ability to invoke a particular external
service (such as retrieving data or computing results), perform a specialized
operation (such as launching a workflow), and other tasks. Custom buttons
can be designed to access data services by a wide range of service providers,
including—but not limited to—enterprise applications (such as CRM or ERP
applications), external service providers (such as foreign exchange
calculators, publishers of financial market indexes, or government agencies),
and even Informatica MDM Hub itself (for more information, see the
Informatica MDM Hub Services Integration Framework Guide).

For example, you could add a custom button that invokes a specialized
cleanse function, offered as a Web service by a vendor, that cleanses data in
the customer record that is currently selected in the Merge Manager screen.
When the user clicks the button, the underlying code would capture the
relevant data from the selected record, create a request (possibly including
authentication information) in the format expected by the Web service, and
then submit that request to the Web service for processing. When the results
are returned, the Hub displays the information in a separate Swing dialog (if
you created one and if you implemented this as a client custom function) with
the customer rowid_object from Informatica MDM Hub.

Custom buttons are not installed by default, nor are they required for every
Informatica MDM Hub implementation. For each custom button you need to
implement a Java interface, package the implementation in a JAR file, and

deploy it by running a command-line utility. To control the appearance of the
custom button in the Hub Console, you can supply either text or an icon
graphic in any Swing-compatible graphic format (such as JPG, PNG, or GIF).

What Happens When a User Clicks a Custom Button


When a user selects a customer record and then clicks a custom button in the
Hub Console, the Hub Console invokes the request, passing content and context to
the Java external (custom) service. Examples of the type of data include
record keys and other data from a base object, package information, and so
on. Execution is asynchronous—the user can continue to work in the Hub
Console while the request is processed.

The custom code can process the service response as appropriate—log the
results, display the data to the user in a separate Swing dialog (if custom-
coded and the custom function is client-side), allow users to copy and paste
the results into a data entry field, execute real-time PUT statements of the
data back into the correct business objects, and so on.

How Custom Buttons Appear in the Hub Console


This section shows how custom buttons, once implemented, will appear in the
Merge Manager and Hierarchy Manager tools of the Hub Console.

Custom Buttons in the Merge Manager

Custom buttons are displayed to the right of the top panel of the Merge
Manager, in the same location as the regular Merge Manager buttons. The
following example shows a button called fx.

Custom Buttons in the Hierarchy Manager

Custom buttons are displayed in the top part of the top panel of the Hierarchy
Manager screen, in the same location as other Hierarchy Manager buttons.
The following example shows a button called fx.

Adding Custom Buttons


To add a custom button to the Hub Console in your Informatica MDM Hub
implementation, complete the following tasks:
1. Determine the details of the external service that you want to invoke, such
as the format and parameters for request and response messages.

2. Write and package the business logic that the custom button will execute,
as described in "Writing a Custom Function" on page 732.
3. Deploy the package so that it appears in the applicable tool(s) in the Hub
Console, as described in "Deploying Custom Buttons" on page 735.

Once an external service button is visible in the Hub Console, users can click
the button to invoke the service.

Writing a Custom Function


To build an external service invocation, you write a custom function that
executes the application logic when a user clicks the custom button in the Hub
Console. The application logic implements the following Java interface:

com.siperian.mrm.customfunctions.api.CustomFunction

To learn more about this interface, see the Javadoc that accompanies your
Informatica MDM Hub distribution.

Server-Based and Client-Based Custom Functions

Execution of the application logic occurs on either:


Environment Description
Client UI-based custom function—Recommended when you want to
display elements in the user interface, such as a separate dialog
that displays response information. To learn more, see
"Example Client-Based Custom Function" on page 732.
Server Server-based custom button—Recommended when it is
preferable to call the external service from the server for
network or performance reasons. To learn more, see "Example
Server-Based Function" on page 733.

Example Custom Functions

This section provides the Java code for two example custom functions that
implement the com.siperian.mrm.customfunctions.api.CustomFunction
interface. The code simply prints (on standard error) information to the server
log or the Hub Console log.

Example Client-Based Custom Function

The name of the client function class for the following sample code is
com.siperian.mrm.customfunctions.test.TestFunctionClient.
//=====================================================================
//project: Informatica Master Reference Manager, Hierarchy Manager
//---------------------------------------------------------------------
//copyright: Informatica Corp. (c) 2008-2010. All rights reserved.
//=====================================================================

package com.siperian.mrm.customfunctions.test;

import java.awt.Frame;
import java.util.Properties;

import javax.swing.Icon;

import com.siperian.mrm.customfunctions.api.CustomFunction;

public class TestFunctionClient implements CustomFunction {

    public void executeClient(Properties properties, Frame frame, String username,
            String password, String orsId, String baseObjectRowid, String baseObjectUid,
            String packageRowid, String packageUid, String[] recordIds) {
        System.err.println("Called custom test function on the client with the following parameters:");
        System.err.println("Username/Password: '" + username + "'/'" + password + "'");
        System.err.println("  ORS Database ID: '" + orsId + "'");
        System.err.println("Base Object Rowid: '" + baseObjectRowid + "'");
        System.err.println("  Base Object UID: '" + baseObjectUid + "'");
        System.err.println("    Package Rowid: '" + packageRowid + "'");
        System.err.println("      Package UID: '" + packageUid + "'");
        System.err.println("       Record Ids: ");
        for (int i = 0; i < recordIds.length; i++) {
            System.err.println("    '" + recordIds[i] + "'");
        }
        System.err.println("       Properties: " + properties.toString());
    }

    public void executeServer(Properties properties, String username, String password,
            String orsId, String baseObjectRowid, String baseObjectUid,
            String packageRowid, String packageUid, String[] recordIds) {
        System.err.println("This method will never be called because getExecutionType() returns CLIENT_FUNCTION");
    }

    public String getActionText() { return "Test Client"; }

    public int getExecutionType() { return CLIENT_FUNCTION; }

    public Icon getGuiIcon() { return null; }
}

Example Server-Based Function

The name of the server function class for the following code is
com.siperian.mrm.customfunctions.test.TestFunction.
//=====================================================================
//project: Informatica Master Reference Manager, Hierarchy Manager
//---------------------------------------------------------------------
//copyright: Informatica Corp. (c) 2008-2010. All rights reserved.
//=====================================================================

package com.siperian.mrm.customfunctions.test;

import java.awt.Frame;
import java.util.Properties;

import javax.swing.Icon;

import com.siperian.mrm.customfunctions.api.CustomFunction;

/**
* This is a sample custom function that is executed on the Server.
* To deploy this function, put it in a jar file and upload the jar file
* to the DB using DeployCustomFunction.
*/
public class TestFunction implements CustomFunction {

    public String getActionText() {
        return "Test Server";
    }

    public Icon getGuiIcon() {
        return null;
    }

    public void executeClient(Properties properties, Frame frame, String username,
            String password, String orsId, String baseObjectRowid, String baseObjectUid,
            String packageRowid, String packageUid, String[] recordIds) {
        System.err.println("This method will never be called because getExecutionType() returns SERVER_FUNCTION");
    }

    public void executeServer(Properties properties, String username, String password,
            String orsId, String baseObjectRowid, String baseObjectUid,
            String packageRowid, String packageUid, String[] recordIds) {
        System.err.println("Called custom test function on the server with the following parameters:");
        System.err.println("Username/Password: '" + username + "'/'" + password + "'");
        System.err.println("  ORS Database ID: '" + orsId + "'");
        System.err.println("Base Object Rowid: '" + baseObjectRowid + "'");
        System.err.println("  Base Object UID: '" + baseObjectUid + "'");
        System.err.println("    Package Rowid: '" + packageRowid + "'");
        System.err.println("      Package UID: '" + packageUid + "'");
        System.err.println("       Record Ids: ");
        for (int i = 0; i < recordIds.length; i++) {
            System.err.println("    '" + recordIds[i] + "'");
        }
        System.err.println("       Properties: " + properties.toString());
    }

    public int getExecutionType() {
        return SERVER_FUNCTION;
    }
}

Controlling the Custom Button Appearance


To control the appearance of the custom button in the Hub Console, you
implement one of the following methods in the
com.siperian.mrm.customfunctions.api.CustomFunction interface:

Method         Description
getActionText  Specifies the text for the button label. Uses the default visual appearance for custom buttons.
getGuiIcon     Specifies the icon graphic in any Swing-compatible graphic format (such as JPG, PNG, or GIF). This image file can be bundled with the JAR file for the custom function.

Custom buttons are displayed alphabetically by name in the Hub Console.
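
If you choose an icon rather than text, one common approach is to load the image from the custom function's own JAR. This sketch is an illustration, not product code; it assumes an image file (here fx.png) is packaged next to the class.

import java.net.URL;
import javax.swing.Icon;
import javax.swing.ImageIcon;

public class FxButtonIcons {

    /** Loads a button icon bundled in the custom-function JAR; null falls back to the text label. */
    public static Icon load(Class<?> owner, String resourceName) {
        URL url = owner.getResource(resourceName); // e.g. "fx.png" packaged next to the class
        return (url == null) ? null : new ImageIcon(url);
    }
}

A getGuiIcon() implementation could then simply return FxButtonIcons.load(getClass(), "fx.png").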

Deploying Custom Buttons


Before you can see the custom buttons in the Hub Console, you need to
explicitly add them using the DeployCustomFunction utility from the command
line.

To deploy custom buttons:


1. Open a command prompt.
2. Run the DeployCustomFunction utility, which loads and registers a JAR file
that a user has created.
Note: To run DeployCustomFunction, two JAR files must be in the
CLASSPATH—siperian-server.jar and the JDBC driver (in this case,
ojdbc14.jar)— with directory paths that point to these files.

Specify the following command at the command prompt:

java -cp siperian-server.jar;ojdbc14.jar com.siperian.mrm.customfunctions.dbadapters.DeployCustomFunction

Reply to the prompts based on the configured settings for your Informatica
MDM Hub implementation. For example:
Database Type:oracle
Host:localhost
Port(1521):
Service:orcl
Username:ds_ui1
Password:!!cmx!!
(L)ist, (A)dd, (U)pdate, (C)hange Type, (S)et Properties, (D)elete or
(Q)uit:l
No custom actions
(L)ist, (A)dd, (U)pdate Jar, (C)hange Type, (S)et Properties, (D)elete
or (Q)uit:q

3. At the respective prompts, specify the following database connection
information (based on the configured settings for your Informatica MDM Hub
implementation):
a. Database host
b. Port
c. Service
d. Login username (schema name)
e. Login password
4. The DeployCustomFunction tool displays a menu of the following options:

Label             Description
(L)ist            Displays a list of currently defined custom buttons.
(A)dd             Adds a new custom button. The DeployCustomFunction tool prompts you to specify:
                  • the JAR file for your custom button
                  • the name of the custom function class that implements the com.siperian.mrm.customfunctions.api.CustomFunction interface
                  • the type of the custom button: m—Merge Manager, d—Data Manager, h—Hierarchy Manager (you can specify one or two letters)
(U)pdate          Updates the JAR file for an existing custom button. The DeployCustomFunction tool prompts you to specify:
                  • the rowID of the custom button to update
                  • the JAR file for your custom button
                  • the name of the custom function class that implements the com.siperian.mrm.customfunctions.api.CustomFunction interface
                  • the type of the custom button: m—Merge Manager, h—Hierarchy Manager (you can specify one or two letters)
(C)hange Type     Changes the type of an existing custom button. The DeployCustomFunction tool prompts you to specify:
                  • the rowID of the custom button to update
                  • the type of the custom button: m—Merge Manager and/or h—Hierarchy Manager (you can specify one or two letters)
(S)et Properties  Specifies a properties file, which defines name/value pairs that the custom function requires at execution time (name=value). The DeployCustomFunction tool prompts you to specify the properties file to use.
(D)elete          Deletes an existing custom button. The DeployCustomFunction tool prompts you to specify the rowID of the custom button to delete.
(Q)uit            Exits the DeployCustomFunction tool.
5. When you have finished choosing your actions, choose (Q)uit.
6. Refresh the browser window to display the custom button you just added.
7. Test your custom button to ensure that it works properly.

Appendix F: Configuring Access to
Hub Console Tools

Appendix Contents
• "About User Access to Hub Console Tools" on page 737
• "Starting the Tool Access Tool" on page 737
• "Granting User Access to Tools and Processes" on page 738
• "Revoking User Access to Tools and Processes" on page 739

About User Access to Hub Console Tools


For users who will be using the Hub Console in their jobs, you can control
access privileges to Hub Console tools. For example, data stewards typically
have access to only the Data Manager and Merge Manager tools.

You use the Tool Access tool in the Configuration workbench to configure
access to Hub Console tools. To use the Tool Access tool, you must be
connected to the master database.

Note: The Tool Access tool applies only to Informatica MDM Hub users who
are not configured as administrators (users who do not have the
Administrator check box selected in the Users tool, as described in "Editing
User Accounts" on page 649).

Starting the Tool Access Tool


To start the Tool Access tool:
1. In the Hub Console, connect to the master database, if you have not
already done so.
2. Expand the Configuration workbench and click Tool Access.
The Hub Console displays the Tool Access tool.

In the User list, the cmx_global user account exists only to store the
global password policy, which is described in "Managing the Global Password
Policy" on page 654.

Granting User Access to Tools and Processes


To grant user access to Hub Console tools and processes for a specific
Informatica MDM Hub user:
1. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
2. In the Tool Access tool, scroll the User list and select the user that you
want to configure.
3. Do one of the following:
• In the Available processes list, select a process to which you want to
grant access.
• In the Available workbenches list, select a workbench containing
the tool(s) to which you want to grant access.

4. Click the button.


The Tool Access tool adds the selected tool or process to the Accessible
tools and processes list. Granting access to a process automatically
grants access to any tool that the process uses. Granting access to a tool
automatically grants access to any process that uses the tool.

The user will have access to these processes and tools for every ORS to which
they have access. You cannot give a user access to one tool for one ORS and
another tool for a different ORS.

Note: If you want to grant access to only some of the tools in a workbench,
then expand the associated workbench in the Accessible tools and
processes list, select the tool, and revoke access according to the
instructions in the next section, "Revoking User Access to Tools and
Processes" on page 739.

Revoking User Access to Tools and Processes


To revoke user access to Hub Console tools and processes for a specific
Informatica MDM Hub user:
1. Acquire a write lock. For more information, see "Acquiring a Write Lock"
on page 36.
2. In the Tool Access tool, scroll the User list and select the user that you
want to configure.
3. Scroll the Accessible tools and processes list and select the process,
workbench, or tool to which you want to revoke access.
To select a tool, expand the associated workbench.

4. Click the button.


The Tool Access tool prompts you to confirm that you want to remove the
access.
5. Click Yes.
The Tool Access tool removes the selected item from the Accessible tools
and processes list. Revoking access to a process automatically revokes
access to any tool that the process uses. Revoking access to a tool
automatically revokes access to any process that uses the tool.

Appendix G: Row-level Locking

This appendix describes how to enable and use row-level locking to protect
data integrity when asynchronously running batch and API processes
(including SIF requests) or API/API processes.

Note: You can skip this section if batch jobs and API calls are not run
concurrently in your Informatica MDM Hub implementation.

Appendix Contents
• "About Row-level Locking" on page 740
• "Configuring Row-level Locking" on page 741
• "Locking Interactions Between SIF Requests and Batch Processes" on page
742

About Row-level Locking


Informatica MDM Hub uses row-level locking to manage data integrity when
batch and SIF operations are executing concurrently on records in a base
object. Row-level locking:
• enables concurrent updates of base object data (the same base object but
different records) from both batch and SIF processes
• eliminates conflicts between online and batch processes
• provides a high degree of concurrent access to data, and
• avoids duplication of the hardware necessary for a mirrored environment.

Row-level locking should be enabled if batch/API or API/API processes will
be run asynchronously in your Informatica MDM Hub implementation. Row-level
locking applies to asynchronous SIF and batch processing. Synchronous
batch processing is restricted due to existing application-level locking.

Default Behavior
Row-level locking is disabled by default. While disabled, API processes
(including SIF requests) and batch processes cannot be executed
asynchronously on the same base object at the same time. If you explicitly
enable row-level locking for an ORS (see "Enabling Row-level Locking on an
ORS" on page 741), then Informatica MDM Hub uses the Oracle row-level
locking mechanism to manage concurrent updates for tokenize, match, and
merge processes.

- 740 -
Types of Locks
Data management in Informatica MDM Hub involves the following types of
locks:

Name             Definition
exclusive lock   Prohibits all other jobs (API or batch processes) from processing on the locked base object.
shared lock      Prohibits only certain jobs from running. For example, a batch job could issue a non-exclusive (shared) lock on a base object and, when interoperability is enabled (on), this shared lock would prohibit other batch jobs but would allow API jobs to attempt to process on the base object.
row-level lock   A shared lock that also includes a SELECT FOR UPDATE to lock the affected base object rows.
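
To make the row-level variant concrete: in Oracle terms it pins just the affected rows with SELECT ... FOR UPDATE inside a transaction, leaving the rest of the base object available. The following JDBC sketch is hypothetical; the connection details, table name (c_customer), and row IDs are invented for illustration.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class RowLevelLockSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical ORS connection details.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:orcl", "ors_user", "ors_password")) {
            conn.setAutoCommit(false);
            // Lock only the rows being processed; other rows of the base object stay available.
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT rowid_object FROM c_customer "
                  + "WHERE rowid_object IN (?, ?) FOR UPDATE")) {
                ps.setString(1, "1001");
                ps.setString(2, "1002");
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // ... process the locked rows here ...
                    }
                }
            }
            conn.commit(); // releases the row locks
        }
    }
}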

Considerations for Using Row-level Locking


When using row-level locking in an Informatica MDM Hub implementation,
consider the following issues:
• A batch process can lock individual records for a significant period of time,
which can interfere with SIF access over the web.
• There can be small windows of time during which SIF requests will be
locked out. However, these should be few and limited to less than ten
minutes.
• The Hub Store needs to be sufficiently sized to support the combined
resource demand load of executing batch processes and SIF requests
concurrently.

Note: Interoperability must be enabled if batch jobs are being run together.
If you have multiple parents that attempt to access the same child (or parent)
records when running different batch jobs, one job will fail if it attempts to
lock records being processed by the other batch job and that job holds them
longer than the batch wait time. The maximum wait time is defined in
C_REPOS_TABLE.

Configuring Row-level Locking


This section describes how to configure row-level locking.

Enabling Row-level Locking on an ORS


By default, row-level locking is not enabled. To enable row-level locking for an
ORS:
1. In the Hub Console, start the Databases tool.
2. Select an ORS to configure.

3. Edit the ORS properties according to the instructions in "Editing ORS
Properties" on page 64.
4. Select (check) the Batch API Lock Interoperability check box.
5. Save your changes.

Note: Once you enable row-level locking, the Tokenize on Put property cannot
be enabled.

Configuring Lock Wait Times


Once enabled, you can use the Schema Manager to configure SIF and batch
lock wait times for base objects in the ORS. If the wait time is exceeded,
Informatica MDM Hub displays an error message. For more information, see
"API “lock wait” interval (seconds)" on page 93 and "Batch “lock wait” interval
(seconds)" on page 93.

Locking Interactions Between SIF Requests and Batch Processes

This section describes the locking interactions between SIF requests and batch
processes.

Interactions When API Batch Interoperability is Enabled

The following table shows the interactions between the API and batch
processes when row-level locking is enabled.
Locking Interactions When API and Batch Interoperability is Enabled

New incoming call: Batch – Exclusive Lock
• Existing Batch – Exclusive Lock: Immediately displays an error message.
• Existing Batch – Shared Lock: Immediately displays an error message.
• Existing API Row-level Lock: Waits for Batch_Lock_Wait_Seconds, checking for the existence of the lock. Displays an error message if the lock is not cleared by the wait time. Called for each table to be locked.

New incoming call: Batch – Shared Lock
• Existing Batch – Exclusive Lock: Immediately displays an error message.
• Existing Batch – Shared Lock: Immediately displays an error message.
• Existing API Row-level Lock: Waits for Batch_Lock_Wait_Seconds to apply a row-level lock using SELECT FOR UPDATE. If the table does not manage the lock, displays an error message. Called for each table to be locked.

New incoming call: API Row-level Lock
• Existing Batch – Exclusive Lock: Immediately displays an error message.
• Existing Batch – Shared Lock: Waits for API_Lock_Wait_Seconds to apply a row-level lock using SELECT FOR UPDATE. If the table does not manage the lock, displays an error message.
• Existing API Row-level Lock: Waits for API_Lock_Wait_Seconds to apply a row-level lock using SELECT FOR UPDATE. If the table does not manage the lock, displays an error message. Called for each table to be locked.

Interactions When API Batch Interoperability is Disabled

The following table shows the interactions between the API and batch
processes when API batch interoperability is disabled. In this scenario, batch
processes will issue an exclusive lock, while SIF requests will check for an
exclusive lock but will not issue any locks.
Locking Interactions When API and Batch Interoperability is Disabled

New incoming call: Batch – Exclusive Lock
• Existing Batch: Immediately displays an error message.
• Existing API: See "Default Behavior" on page 740.

New incoming call: API
• Existing Batch: Immediately displays an error message.
• Existing API: See "Default Behavior" on page 740.

Glossary

accept limit

A number that determines the acceptability of a match. The accept limit is
defined by Informatica within a population in accordance with its match
purpose.

active state (records)

This is a state associated with a base object or cross reference record. A base
object record is active if at least one of its cross reference records is active. A
cross reference record contributes to the consolidated base object only if it is
active.

Active records participate in Hub processes by default. These are the records
that are available to participate in any operation. If records are required to go
through an approval process, then these records have been through that
process and have been approved.

Activity Manager

Informatica Activity Manager (AM) evaluates data events, synchronizes
master data, and delivers unified views of reference and activity data from
disparate sources. AM builds upon the extensible, template-driven schema of
Informatica MDM Hub and uses the rules-based, configurable approach to
combining reference, relationship, and activity data. It conducts a rules
evaluation using a combination of reference, transactional, and analytical data
from disparate sources. It also conducts an event-driven, rules-based
orchestration of data writebacks to selected sources, and performs other
event-driven, rules-based actions to centralize data integration and delivery
of relevant data to subscribing users and applications. AM has an intuitive,
powerful UI for defining, designing, delivering, and managing unified views to
downstream applications and systems, as well as built-in lineage, history, and
audit functionality.

Admin source system

Default source system. Used for manual trust overrides and data edits from
the Data Manager or Merge Manager tools. See "source system" on page 779.

administrator

Informatica MDM Hub user who has the primary responsibility for configuring
the Informatica MDM Hub system. Administrators access Informatica MDM
Hub through the Hub Console, and use Informatica MDM Hub tools to configure
the objects in the Hub Store, and create and modify Informatica MDM Hub
security.

auditable events

Informatica MDM Hub provides the ability to create an audit trail for certain
activities that are associated with the exchange of data between Informatica
MDM Hub and external systems. An audit trail can be captured whenever:
• an external application interacts with Informatica MDM Hub by invoking a
Services Integration Framework (SIF) request
• Informatica MDM Hub sends a message (using JMS) to a message queue
for the purpose of distributing data changes to other systems
• Activity Manager invokes an external application.

authentication

Process of verifying the identity of a user to ensure that they are who they
claim to be. In Informatica MDM Hub, users are authenticated based on their
supplied credentials—user name / password, security payload, or a
combination of both. Informatica MDM Hub provides an internal authentication
mechanism and also supports user authentication via third-party
authentication providers. See "credentials" on page 750, "security payload" on
page 778.

authorization

Process of determining whether a user has sufficient privileges to access a
requested Informatica MDM Hub resource. In Informatica MDM Hub, resource
privileges are allocated to roles. Users and user groups are assigned to roles.
A user’s resource privileges are determined by the roles to which they are
assigned, as well as by the roles assigned to the user group(s) to which the
user belongs. See "user" on page 783, "user group" on page 784, "role" on
page 775, "resource" on page 775, and "privilege" on page 771.

automerge

Process of merging records automatically. For merge-style base objects only.
Match rules can result in automatic merging or manual merging. A match rule
that instructs Informatica MDM Hub to perform an automerge will combine
two or more records of a base object table automatically, without manual
intervention. See "manual merge" on page 762, "merge-style base object" on
page 767.

base object

A table that contains information about an entity that is relevant to your
business, such as customer or account.

batch group

A collection of individual batch jobs (for example, Stage, Load, and Match
jobs) that can be executed with a single command. Each batch job in a group
can be executed sequentially or in parallel to other jobs. See also "batch job"
on page 746.

batch job

A program that, when executed, completes a discrete unit of work (a
process). For example, the Match job carries out the match process, checking
the specified match condition for the records of a base object table and then
queueing the matched records for either automerge (Automerge job) or
manual merge (Manual Merge job). See also "batch group" on page 746.

batch mode

Way of interacting with Informatica MDM Hub via batch jobs, which can be
executed in the Hub Console or using third-party management tools to
schedule and execute batch jobs (in the form of stored procedures) on the
database server. See also "real-time mode" on page 773, "batch job" on page
746, "batch group" on page 746, "stored procedure" on page 780.

best version of the truth (BVT)

A record that has been consolidated with the best cells of data from the source
records. Sometimes abbreviated as BVT.
• For merge-style base objects, the base object record is the BVT record,
and is built by consolidating the most-trustworthy cell values from the
corresponding source records.

BI vendor

A company that produces Business Intelligence software products.

bulk merge

See "automerge" on page 745.

BVT

See "best version of the truth (BVT)" on page 746.

cascade delete

When the Delete stored procedure deletes records in the parent object, it also
removes the affected records in the child base object. To enable a cascade
delete operation, set the CASCADE_DELETE_IND parameter to 1. The Delete job
checks each child base object table for related data that should be deleted
given the removal of the parent base object record.

If you do not set this parameter, Informatica MDM Hub generates an error
message if there are child base objects referencing the deleted base object
record; the Delete job fails, and Informatica MDM Hub performs a rollback
operation for the associated data.

cascade unmerge

When records in a parent object are unmerged, Informatica MDM Hub also
unmerges affected records in the child base object.

See also: "linear unmerge" on page 761, "tree unmerge" on page 782.

cell

Intersection of a column and a record in a table. A cell contains a data value or
null.

change list

List of changes to make to a target repository. A change is an operation in the
change list—such as adding a base object or updating properties in a match
rule—that is executed against the target repository. Change lists represent the
list of differences between Hub repositories. See also "creation change list" on
page 750, "comparison change list" on page 748, "Metadata Manager" on page
768.

cleanse

See "data cleansing" on page 751.

cleanse engine

A cleanse engine is a third party product used to perform data cleansing with
the Informatica MDM Hub.

cleanse function

Code changes the incoming data during Stage jobs, converting each input
string to an output string. Typically, these functions are used to standardize
data and thereby optimize the match process. By combining multiple cleanse
functions, you can perform complex filtering and standardization. See also
"data cleansing" on page 751, "internal cleanse" on page 760.

cleanse list

A logical grouping of rules for replacing parts of an input string during the
cleanse process. See "cleanse function" on page 748, "data cleansing" on page
751.

Cleanse Match Server

The Cleanse Match Server run-time component is a servlet that handles
cleanse requests. This servlet is deployed in an application server
environment. The servlet contains two server components:
• a cleanse server that handles data cleansing operations
• a match server that handles match operations

The Cleanse Match Server is multi-threaded so that each instance can process
multiple requests concurrently. It can be deployed on a variety of application
servers.

The Cleanse Match Server interfaces with any of the supported cleanse
engines, such as the Trillium Director cleanse engine. The Cleanse Match
Server and the cleanse engine work to standardize the data. This
standardization works closely with the Informatica Consolidation Engine
(formerly referred to as the Merge Engine) to optimize the data for
consolidation.

column

In a table, a set of data values of a particular type, one for each row of the
table. See "system column" on page 780, "user-defined column" on page 783.

comparison change list

A change list that is the result of comparing the contents of two repositories
and generating the list of changes to make to the target repository.
Comparison change lists are used in Metadata Manager when promoting or
importing design objects. See also "change list" on page 747, "creation change
list" on page 750, "Metadata Manager" on page 768.

complete match tracking

The display of the complete or original match chain that caused two records to
be matched through intermediate records.

conditional mapping

A mapping between a column in a landing table and a staging table that uses a
SQL WHERE clause to conditionally select only those records in the landing
table that meet the filter condition. See "mapping" on page 762, "distinct
mapping" on page 753.

Configuration workbench

Includes tools for configuring a variety of Hub objects, including, the ORS,
users, security, message queues, and metadata validation.

consolidated record

See "master record" on page 763.

consolidation process

Process of merging or linking duplicate records into a single record. The goal
in Informatica MDM Hub is to identify and eliminate all duplicate data and to
merge or link them together into a single, consolidated record while
maintaining full traceability.

consolidation indicator

Represents the consolidation state of a record in a base object. Stored in the
CONSOLIDATION_IND column. The consolidation indicator can have one of the
following values:
Indicator Value   State Name         Description
1                 CONSOLIDATED       This record has been determined to be unique and represents the best version of the truth.
2                 UNMERGED           This record has gone through the match process and is ready to be consolidated.
3                 QUEUED_FOR_MATCH   This record is a match candidate in the match batch that is being processed in the currently executing match process.
4                 NEWLY_LOADED       This record is new (load insert) or changed (load update) and needs to undergo the match process.
9                 ON_HOLD            The data steward has put this record on hold until further notice. Any record can be put on hold regardless of its consolidation indicator value. The match and consolidate processes ignore on-hold records. For more information, see the Informatica MDM Hub Data Steward Guide.

content metadata

Data that describes the business data that has been processed by Informatica
MDM Hub. Content metadata is stored in support tables for a base object,
including cross-reference tables, history tables, and others. Content metadata
is used to help determine where the data in the base object came from, and
how the data changed over time.

control table

A type of system table in an ORS that Informatica MDM Hub automatically
creates for a base object. Control tables are used in support of the load,
merge, and unmerge processes. For each trust-enabled column in a base
object, Informatica MDM Hub maintains a record (the last update date and an
identifier of the source system) in a corresponding control table.

creation change list

A change list that is the result of exporting the contents of a repository.
Creation change lists are used in Metadata Manager for importing design
objects. See also "change list" on page 747, "comparison change list" on page
748, "Metadata Manager" on page 768.

credentials

What a user supplies at login time to gain access to Informatica MDM Hub
resources. Credentials are used during the authentication process to determine
whether a user is who they claim to be. Login credentials might be a user
name and password, a security payload (such as a security token or some
other binary data), or a combination of user name/password and security
payload. See "authentication" on page 745, "security payload" on page 778.

cross-reference table

A type of system table in an ORS that Informatica MDM Hub automatically
creates for a base object. For each record of the base object, the cross-
reference table contains zero to n (0-n) records per source system. This
record contains the primary key from the source system and the most recent
value that the source system has provided for each cell in the base object
table.

Customer Data Integration (CDI)

A discipline within "Master Data Management (MDM)" on page 763 that focuses
on customer master data and its related attributes. See "master data" on page
763.

Data Access Services

These application server level capabilities enable Informatica MDM Hub to
support multiple modes of data access and expose numerous Informatica
MDM Hub data services via the Informatica Services Integration Framework
(SIF). This facilitates both real-time synchronous integration, as well as
asynchronous integration.

database

Organized collection of data in the Hub Store. Informatica MDM Hub supports
two types of databases: a Master Database and an Operational Reference
Store (ORS). See "Master Database" on page 763,
"Operational Reference Store (ORS)" on page 769, and "Hub Store" on page
759.

data cleansing

The process of standardizing data content and layout, decomposing and
parsing text values into identifiable elements, verifying identifiable values
(such as zip codes) against data libraries, and replacing incorrect values with
correct values from data libraries. See "cleanse function" on page 748.

Data Manager

Tool used to review the results of all merges—including automatic merges—
and to correct data content if necessary. It provides you with a view of the
data lineage for each base object record. The Data Manager also allows you to
unmerge previously merged records, and to view different types of history on
each consolidated record.

Use the Data Manager tool to search for records, view their cross-references,
unmerge records, unlink records, view history records, create new records,
edit records, and override trust settings. The Data Manager displays all
records that meet the search criteria you define.

datasource

In the application server environment, a datasource is a JDBC resource that
identifies information about a database, such as the location of the database
server, the database name, the database user ID and password, and so on.
Informatica MDM Hub needs this information to communicate with an ORS.

data steward

Informatica MDM Hub user who has the primary responsibility for data quality.
Data stewards access Informatica MDM Hub through the Hub Console, and use
Informatica MDM Hub tools to configure the objects in the Hub Store.

Data Steward workbench

Part of the Informatica MDM Hub UI used to review consolidated data as well
as matched data queued for exception handling by data analysts or stewards
who understand the data semantics and are guardians of data reliability in an
organization.

Includes tools for using the Data Manager, Merge Manager, and Hierarchy
Manager.

data type

Defines the characteristics of permitted values in a table column—characters,
numbers, dates, binary data, and so on. Informatica MDM Hub uses a common
set of data types for columns that map directly to data types for the database
platform (Oracle or DB2) used in your Informatica MDM Hub implementation.

decay curve

Visually shows the way that trust decays over time. Its shape is determined
by the configured decay type and decay period. See "decay period" on page
752, "decay type" on page 752.

decay period

The amount of time (days, weeks, months, quarters, and years) that it takes
for the trust level to decay from the maximum trust level to the minimum
trust level. See "decay curve" on page 752, "decay type" on page 752.

decay type

The way that the trust level decreases during the decay period. See "linear
decay" on page 761, "RISL decay" on page 775, "SIRL decay" on page 778,
"decay curve" on page 752, "decay period" on page 752.

deleted state (records)

Deleted records are records that are no longer desired to be part of the Hub’s
data. These records are not used in Hub processes (unless specifically requested).
Records can only be deleted explicitly and once deleted can be restored if
desired. When a record that is Pending is deleted, it is permanently deleted
and cannot be restored.

delta detection

During the stage process, Informatica MDM Hub only processes new or
changed records when this feature is enabled. Delta detection can be done
either by comparing entire records or via a date column.

design object

Parts of the metadata used to define the schema and other configuration
settings for an implementation. Design objects include instances of the
following types of Informatica MDM Hub objects: base objects and columns,
landing and staging tables, columns, indexes, relationships, mappings,
cleanse functions, queries and packages, trust settings, validation and match
rules, Security Access Manager definitions, Hierarchy Manager definitions,
and other settings. See "metadata" on page 768, "Metadata Manager" on page
768.

distinct mapping

A mapping between a column in a landing table and a staging table that selects
only the distinct records from the landing table. Using distinct mapping is
useful in situations in which you have a single landing table feeding multiple
staging tables and the landing table is denormalized (for example, it contains
both customer and address data). See "mapping" on page 762, "conditional
mapping" on page 749.

distinct source system

A source system that provides data that gets inserted into the base object
without being consolidated. See "source system" on page 779.

distribution

Process of distributing the master record data to other applications or
databases after the best version of the truth has been established via
reconciliation. See "reconciliation" on page 773, "publish" on page 772.

downgrade

Operation that occurs when inserting or updating data using the load process
or using cleansePut & Put APIs when a validation rule reduces the trust for a
record by a percentage.

duplicate

One or more records in which the data in certain columns (such as name,
address, or organization data) is identical or nearly identical. Match rules
executed during the match process determine whether two records are
sufficiently similar to be considered duplicates for consolidation purposes.

entity

In Hierarchy Manager, an entity is a typed object that can be related to other
entities. Examples of entities are: individual, organization, product, and
household. See "entity type" on page 754.

entity base object

An entity base object is a base object used to store information about
Hierarchy Manager entities. See "entity type" on page 754 and "entity" on
page 754.

entity type

In Hierarchy Manager, entity types define the kinds of objects that can be
related using Hierarchy Manager. Examples are individual, organization,
product, and household. All entities with the same entity type are stored in the
same entity base object. In the HM Configuration tool, entity types are
displayed in the navigation tree under the Entity Object with which the Type is
associated. See "entity" on page 754.

exact match

A match / search strategy that matches only records that are identical. If you
specify an exact match, you can define only exact match columns for this base
object (exact-match base objects cannot have fuzzy match columns). A base
object that uses the exact match / search strategy is called an exact-match
base object. See also "match / search strategy" on page 766, "fuzzy match" on
page 756.

exclusive lock

In the Hub Console, a lock that is required in order to make exclusive changes
to the underlying schema. An exclusive lock prevents all other Hub Console
users from making changes to the target database at the same time. An
exclusive lock must be released by the user with the exclusive lock; it cannot
be cleared by another user. See "write lock" on page 785.

execution path

The sequence in which batch jobs are executed when the entire batch group is
executed in the Informatica MDM Hub. The execution path begins with the
Start node and ends with the End node. The Batch Group tool does not validate
the execution sequence for you—it is up to you to ensure that the execution
sequence is correct.

export process

In Metadata Manager, the process of exporting metadata in a repository to a
portable change list XML file, which can then be used to import design objects
into another repository or to save it in a source control system for archival
purposes. The export process copies all supported design objects to the
change list XML file. See also "Metadata Manager" on page 768, "validation
process" on page 784, "import process" on page 759, "promotion process" on
page 772, "change list" on page 747.

external application user

Informatica MDM Hub user who accesses Informatica MDM Hub data indirectly
via third-party applications.

external cleanse

The process of cleansing data prior to populating the landing tables. External
cleansing is typically performed outside of Informatica MDM Hub using an
extract-transform-load (ETL) tool or some other data cleansing utility. See
also "data cleansing" on page 751, "extract-transform-load (ETL) tool" on page
756, "internal cleanse" on page 760.

external match

Process that allows you to match new data (stored in a separate input table)
with existing data in a fuzzy-match base object, test for matches, and inspect
the results—all without actually changing data in the base object in any way,
or changing the match table associated with the base object.

extract-transform-load (ETL) tool

A software tool (external to Informatica MDM Hub) that extracts data from a
source system, transforms the data (using rules, lookup tables, and other
functionality) to convert it to the desired state, and then loads (writes) the
data to a target database. For Informatica MDM Hub implementations, ETL
tools are used to extract data from source systems and populate the landing
tables. See also "data cleansing" on page 751, "external cleanse" on page 755.

foreign key

In a relational database, a column (or set of columns) whose value
corresponds to a primary key value in another table (or, in rare cases, the
same table). The foreign key acts as a pointer to the other table. For example,
the Department_Number column in the Employee table would be a foreign key
that points to the primary key of the Department table.

function

See "cleanse function" on page 748.

fuzzy match

A match / search strategy that uses probabilistic matching, which takes into
account spelling variations, possible misspellings, and other differences that
can make matching records non-identical. If selected, Informatica MDM Hub
adds a special column (Fuzzy Match Key) to the base object. A base object
that uses the fuzzy match / search strategy is called a fuzzy-match base
object. Using fuzzy match requires a selected population. See "fuzzy match
key" on page 756, "match / search strategy" on page 766, "exact match" on
page 754, and "population" on page 770.

fuzzy match key

Special column in the base object that the Schema Manager adds if a match
column uses the fuzzy match / search strategy. This column is the primary
field used during searching and matching to generate match candidates for
this base object. All fuzzy base objects have one and only one Fuzzy Match
Key. See "fuzzy match" on page 756, "match key" on page 764, "match key
table" on page 764.

global business identifier (GBID)

A column that contains common identifiers (key values) that allow you to
uniquely and globally identify a record based on your business needs.
Examples include:

• identifiers defined by applications external to Informatica MDM Hub, such
as ERP or CRM systems.
• Identifiers defined by external organizations, such as industry-specific
codes (AMA numbers, DEA numbers, and so on), or government-issued
identifiers (social security number, tax ID number, driver’s license
number, and so on).

hard delete

A base object or cross-reference record is physically removed from the
database. See "soft delete" on page 778.

Hierarchies Tool

Informatica MDM Hub administrators use the design-time Hierarchies tool
(was previously the “Hierarchy Manager Configuration Tool”) to set up the
structures required to view and manipulate data relationships in Hierarchy
Manager. Use the Hierarchies tool to define Hierarchy Manager components—
such as entity types, hierarchies, relationships types, packages, and profiles—
for your Informatica MDM Hub implementation. The Hierarchies tool is
accessible via the Model workbench. See "Hierarchy Manager" on page 757.

Hierarchy Manager

Part of the Informatica MDM Hub UI used to set up the structures required to
view and manipulate data relationships. Informatica Hierarchy Manager
(Hierarchy Manager or HM) builds on Informatica Master Reference Manager
(MRM) and the repository managed by Informatica MDM Hub for reference and
relationship data. Hierarchy Manager gives you visibility into how
relationships correlate between systems, enabling you to discover
opportunities for more effective customer service, to maximize profits, or to
enact compliance with established standards.

The Hierarchy Manager tool is accessible via the Data Steward workbench.

hierarchy

In Hierarchy Manager, a set of relationship types. These relationship types are
not ranked based on the place of the entities of the hierarchy, nor are they
necessarily related to each other. They are merely relationship types that are
grouped together for ease of classification and identification. See "hierarchy
type" on page 758, "relationship" on page 774, "relationship type" on page
774.

- 757 -
hierarchy type

In Hierarchy Manager, a logical classification of hierarchies. The hierarchy
type is the general class of hierarchy under which a particular relationship
falls. See "hierarchy" on page 757.

history table

A type of table in an ORS that contains historical information about changes to
an associated table. History tables provide detailed change-tracking options,
including merge and unmerge history, history of the pre-cleansed data,
history of the base object, and history of the cross-reference.

HM package

A Hierarchy Manager package represents a subset of an MRM package and
contains the metadata needed by Hierarchy Manager.

hotspot

In business data, a group of records representing overmatched data—a large
intersection of matches.

Hub Console

Informatica MDM Hub user interface that comprises a set of tools for
administrators and data stewards. Each tool allows users to perform a specific
action, or a set of related actions, such as building the data model, running
batch jobs, configuring the data flow, configuring external
application access to Informatica MDM Hub resources, and other system
configuration and operation tasks.

hub object

A generic term for various types of objects defined in the Hub that contain
information about your business entities. Some examples include: base
objects, cross reference tables, and any object in the hub that you can
associate with reporting metrics.

Hub Server

A run-time component in the middle tier (application server) used for core and
common services, including access, security, and session management.

- 758 -
Hub Store

In an Informatica MDM Hub implementation, the database that contains the
Master Database and one or more Operational Reference Store (ORS)
database. See "Master Database" on page 763, "Operational Reference Store
(ORS)" on page 769.

immutable source

A data source that always provides the best, final version of the truth for a
base object. Records from an immutable source will be accepted as unique
and, once a record from that source has been fully consolidated, it will not be
changed—even in the event of a merge. Immutable sources are also distinct
systems. For all source records from an immutable source system, the
consolidation indicator for Load and PUT is always 1 (consolidated record).

implementer

Informatica MDM Hub user who has the primary responsibility for designing,
developing, testing, and deploying Informatica MDM Hub according to the
requirements of an organization. Tasks include (but are not limited to)
creating design objects, building the schema, defining match rules,
performance tuning, and other activities.

import process

In Metadata Manager, the process of adding design objects from a library or
change list to a repository. The design object does not already exist in the
target repository. See also "Metadata Manager" on page 768, "validation
process" on page 784, "promotion process" on page 772, "change list" on page
747.

incremental load

Any load process that occurs after a base object has undergone its initial data
load. Called incremental loading because only new or updated data is loaded
into the base object. Duplicate data is ignored. See "initial data load" on page
760.

indirect match

See "transitive match" on page 782.

- 759 -
initial data load

The very first time that your data is loaded into an empty base object. During
the initial data load, all records in the staging table are inserted into the base
object as new records.

internal cleanse

The process of cleansing data during the stage process, when data is copied
from landing tables to the appropriate staging tables. Internal cleansing
occurs inside Informatica MDM Hub using configured cleanse functions that
are executed by the Cleanse Match Server in conjunction with a supported
cleanse engine. See also "data cleansing" on page 751, "cleanse engine" on
page 747, "external cleanse" on page 755.

job execution log

In the Batch Viewer and Batch Group tools, a log that shows job completion
status with any associated messages, such as success, failure, or warning.

job execution script

For Informatica MDM Hub implementations, a script that is used in job
scheduling software (such as Tivoli or CA Unicenter) that executes Informatica
MDM Hub batch jobs via stored procedures.

key match job

An Informatica MDM Hub batch job that matches records from two or more
sources when these sources use the same primary key. Key Match jobs
compare new records to each other and to existing records, and then identify
potential matches based on the comparison of source record keys as defined
by the primary key match rules. See "primary key match rule" on page 771,
"match process" on page 765.

key type

Identifies important characteristics about the match key to help Informatica
MDM Hub generate keys correctly and conduct better searches. Informatica
MDM Hub provides the following match key types: Person_Name,
Organization_Name, and Address_Part1. See "match process" on page 765.

key width

Determines how fast searches are during match, the number of
possible match candidates returned, and how much disk space the keys

- 760 -
consume. Key width options are Standard, Extended, Limited, and Preferred.
Key widths apply to fuzzy match objects only. See "match process" on page
765.

land process

Process of populating landing tables from a source system. See "source
system" on page 779, "landing table" on page 761.

landing table

A table where a source system puts data that will be processed by Informatica
MDM Hub.

lineage

Which systems, and which records from those systems, contributed to
consolidated records in the Hub Store.

linear decay

The trust level decreases in a straight line from the maximum trust to the
minimum trust. See "decay type" on page 752, "trust" on page 782.
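
For example, assuming a maximum trust of 80, a minimum trust of 20, and a
decay period of 10 days (all values illustrative), the trust of a value
that is t days old is:

    trust(t) = 80 - (80 - 20) * (t / 10)

so a 4-day-old value would have a trust of 80 - 60 * 0.4 = 56.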

linear unmerge

A base object record is unmerged and taken out of the existing merge tree
structure. Only the unmerged base object record itself will come out of the
merge tree structure, and all base object records below it in the merge tree
will stay in the original merge tree.

See also: "cascade unmerge" on page 747, "tree unmerge" on page 782.

load insert

When records are inserted into the target base object. During the load
process, if a record in the staging table does not already exist in the target
table, then Informatica MDM Hub inserts the record into the target table. See
"load process" on page 761, "load update" on page 762.

load process

Process of loading data from a staging table into the corresponding base
object in the Hub Store. If the new data overlaps with existing data in the Hub
Store, Informatica MDM Hub uses trust settings and validation rules to

- 761 -
determine which value is more reliable. See "trust" on page 782, "validation
rule" on page 785, "load insert" on page 761, "load update" on page 762.

load update

When records are updated in the target base object. During the load
process, if a record in the staging table already exists in the target
table, then Informatica MDM Hub updates the record in the target table
(subject to trust settings and validation rules) rather than inserting a
new record. See "load process" on page 761, "load insert" on page 761.

lock

See "write lock" on page 785, "exclusive lock" on page 755.

lookup

Process of retrieving a data value from a parent table during Load jobs. In
Informatica MDM Hub, when configuring a staging table associated with a base
object, if a foreign key column in the staging table (as the child table) is
related to the primary key in a parent table, you can configure a lookup to
retrieve data from that parent table.
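
Conceptually, the lookup behaves like the following join (a sketch only;
the table and column names are illustrative, and the Hub performs the
lookup internally during the Load job rather than through a literal
query):

    SELECT s.employee_name,
           d.ROWID_OBJECT          -- value retrieved from the parent table
    FROM   C_STG_EMPLOYEE  s       -- staging table (child)
    JOIN   C_DEPARTMENT    d       -- parent base object
      ON   d.DEPARTMENT_NUMBER = s.DEPARTMENT_NUMBER;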

manual merge

Process of merging records manually. Match rules can result in automatic
merging or manual merging. A match rule that instructs Informatica MDM Hub
to perform a manual merge identifies records that have enough points of
similarity to warrant attention from a data steward, but not enough points of
similarity to allow the system to automatically merge the records. See
"automerge" on page 745, "merge-style base object" on page 767.

manual unmerge

Process of unmerging records manually. See "manual merge" on page 762,
"merge-style base object" on page 767.

mapping

Defines a set of transformations that are applied to source data. Mappings are
used during the stage process (or using the SiperianClient CleansePut API
request) to transfer data from a landing table to a staging table. A mapping
identifies the source column in the landing table and the target column to
populate in the staging table, along with any intermediate cleanse functions
used to clean the data. See "conditional mapping" on page 749, "distinct
mapping" on page 753.

- 762 -
master data

A collection of common, core entities—along with their attributes and their
values—that are considered critical to a company's business, and that are
required for use in two or more systems or business processes. Examples of
master data include customer, product, employee, supplier, and location data.
See "Master Data Management (MDM)" on page 763, "Customer Data
Integration (CDI)" on page 751.

Master Data Management (MDM)

The controlled process by which the master data is created and maintained as
the system of record for the enterprise. MDM is implemented in order to
ensure that the master data is validated as correct, consistent, and complete,
and—optionally—circulated in context for consumption by internal or external
business processes, applications, or users. See "master data" on page 763,
"Customer Data Integration (CDI)" on page 751.

Master Database

Database that contains the Informatica MDM Hub environment configuration
settings—user accounts, security configuration, ORS registry, message queue
settings, and so on. A given Informatica MDM Hub environment can have only
one Master Database. The default name of the Master Database is CMX_
SYSTEM. See also "Operational Reference Store (ORS)" on page 769.

master record

Single record in the base object that represents the “best version of the truth”
for a given entity (such as a specific organization or person). The master
record represents the fully-consolidated data for the entity.

Master Reference Manager (MRM)

Master Reference Manager (MRM) is the foundation product of Informatica
MDM Hub. Informatica MRM consists of the following major components:
Hierarchy Manager, Security Access Manager, Metadata Manager, Services
Integration Framework (SIF), and Activity Manager. Its purpose is to build an
extensible and manageable system-of-record for all master reference data. It
provides the platform to consolidate and manage master reference data
across all data sources—internal and external—of an organization, and acts as
a system-of-record for all downstream applications.

- 763 -
match

The process of determining whether two records should be automatically
merged or should be candidates for manual merge because the two records
have identical or similar values in the specified columns. See "match process"
on page 765.

match candidate

For fuzzy-match base objects only, any record in the base object that is a
possible match.

match column

A column that is used in a match rule for comparison purposes. Each match
column is based on one or more columns from the base object. See "match
process" on page 765.

match column rule

Match rule that is used to match records based on the values in columns you
have defined as match columns, such as last name, first name, address1, and
address2. See "primary key match rule" on page 771, "match process" on
page 765.
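
Conceptually, an exact-match column rule on last name and address
resembles the following self-join (illustrative only; the Hub evaluates
match rules through match tokens and the match key table, not through a
literal self-join):

    SELECT a.ROWID_OBJECT, b.ROWID_OBJECT
    FROM   C_CUSTOMER a, C_CUSTOMER b         -- hypothetical base object
    WHERE  a.LAST_NAME    = b.LAST_NAME
    AND    a.ADDRESS1     = b.ADDRESS1
    AND    a.ROWID_OBJECT < b.ROWID_OBJECT;   -- report each pair once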

match key

Encoded strings that represent the data in the fuzzy match key column of the
base object. Match keys consist of fixed-length, compressed, and encoded
values built from a combination of the words and numbers in a name or
address such that relevant variations have the same match key value. Match
keys are one part of the match tokens that are generated during the tokenize
process, stored in the match key table, and then used during the match
process to identify candidates for matching. See "match token" on page 766,
"fuzzy match key" on page 756, "match key table" on page 764, "tokenize
process" on page 781, "match process" on page 765.

match key table

System table that stores the match tokens (match keys + unencoded, raw
data) that are generated during the tokenize process. This data is used during
the match process to identify candidates for matching, comparing the match
keys according to the match rules that have been defined to determine which
records are duplicates. See "match key" on page 764, "match token" on page
766, "tokenize process" on page 781, "match process" on page 765.

- 764 -
match list

Define custom-built standardization lists. Functions are pre-defined functions
that provide access to specialized cleansing functionality such as address
verification or address decomposition. See "match process" on page 765.

match path

Allows you to traverse the hierarchy between records—whether that hierarchy
exists between base objects (inter-table paths) or within a single base object
(intra-table paths). Match paths are used for configuring match column rules
involving related records in either separate tables or in the same table.

match process

Process of comparing two records for points of similarity. If sufficient points
of similarity are found to indicate that two records probably are duplicates of
each other, Informatica MDM Hub flags those records for merging.

match purpose

For fuzzy-match base objects, defines the primary goal behind a match rule.
For example, if you're trying to identify matches for people where address is
an important part of determining whether two records are for the same
person, then you would use the Match Purpose called Resident. Each match
purpose contains knowledge about how best to compare two records to
achieve the purpose of the match. Informatica MDM Hub uses the selected
match purpose as a basis for applying the match rules to determine matched
records. The behavior of the rules is dependent on the selected purpose. See
"match process" on page 765.

match rule

Defines the criteria by which Informatica MDM Hub determines whether
records might be duplicates. Match columns are combined into match rules to
determine the conditions under which two records are regarded as being
similar enough to merge. Each match rule tells Informatica MDM Hub the
combination of match columns it needs to examine for points of similarity.
See "match process" on page 765.

match rule set

A logical collection of match rules that allow users to execute different sets of
rules at different stages in the match process. Match rule sets include a search
level that dictates the search strategy, any number of automatic and manual
match rules, and optionally, a filter that allows you to selectively include or

- 765 -
exclude records during the match process. Match rule sets are used to
execute match column rules but not primary key match rules. See "match
process" on page 765.

match subtype

Used with base objects that contain different types of data, such as an
Organization base object containing customer, vendor, and partner records.
Using match subtyping, you can apply match rules to specific types of data
within the same base object. For each match rule, you specify an exact match
column that will serve as the “subtyping” column to filter out the records that
you want to ignore for that match rule. See "match process" on page 765.

match table

Type of system table, associated with a base object, that supports the match
process. During the execution of a Match job for a base object, Informatica
MDM Hub populates its associated match table with the ROWID_OBJECT values
for each pair of matched records, as well as the identifier for the match rule
that resulted in the match, and an automerge indicator. See "match process"
on page 765.

match token

Strings that represent both encoded (match key) and unencoded (raw) values
in the match columns of the base object. Match tokens are generated during
the tokenize process, stored in the match key table, and then used during the
match process to identify candidates for matching. See "match key" on page
764, "match key table" on page 764, "match process" on page 765, "tokenize
process" on page 781.

match type

Each match column has a match type that determines how the match column
will be tokenized in preparation for the match comparison. See "match
process" on page 765.

match / search strategy

Specifies the reliability of the match versus the performance you require:
fuzzy or exact. An exact match / search strategy is faster, but an exact match
will miss some matches if the data is imperfect. See "fuzzy match" on page
756, "exact match" on page 754., "match process" on page 765.

- 766 -
maximum trust

The trust level that a data value will have if it has just been changed. For
example, if source system A changes a phone number field from 555-1234 to
555-4321, the new value will be given system A’s maximum trust level for the
phone number field. By setting the maximum trust level relatively high, you
can ensure that changes in the source systems will usually be applied to the
base object.

merge process

Process of combining two or more records of a base object table because they
have the same value (or very similar values) in the specified match columns.
See "consolidation process" on page 749, "automerge" on page 745, "manual
merge" on page 762, "manual unmerge" on page 762.

merge-style base object

Type of base object that is used with Informatica MDM Hub’s match and merge
capabilities. See "merge process" on page 767.

Merge Manager

Tool used to review and take action on the records that are queued for manual
merging.

message

In Informatica MDM Hub, refers to a Java Message Service (JMS) message. A
message queue server handles two types of JMS messages:
• inbound messages are used for the asynchronous processing of
Informatica MDM Hub service invocations
• outbound messages provide a communication channel to distribute data
changes via JMS to source systems or other systems.

message queue

A mechanism for transmitting data from one process to another (for example,
from Informatica MDM Hub to an external application).

message queue rule

A mechanism for identifying base object events and transferring the affected
records to the internal system for update. Message queue rules are supported
for updates, merges, and records accepted as unique.

- 767 -
message queue server

In Informatica MDM Hub, a Java Message Service (JMS) server, defined in
your application server environment, that Informatica MDM Hub uses to
manage incoming and outgoing JMS messages.

message trigger

A rule that gets fired when a particular action occurs within Informatica
MDM Hub. When an action occurs for which a rule is defined, a JMS message is
placed in the outbound message queue. A message trigger identifies the
conditions which cause the message to be generated (what action on which
object) and the queue on which messages are placed.

metadata

Data that is used to describe other data. In Informatica MDM Hub, metadata is
used to describe the schema (data model) that is used in your Informatica
MDM Hub implementation, along with related configuration settings. See also
"Metadata Manager" on page 768, "design object" on page 753, "schema" on
page 776.

Metadata Manager

The Metadata Manager tool in the Hub Console is used to validate metadata for
a repository, promote design objects from one repository to another, import
design objects into a repository, and export a repository to a change list. See
also "metadata" on page 768, "design object" on page 753, "validation
process" on page 784, "import process" on page 759, "promotion process" on
page 772, "export process" on page 755, "change list" on page 747.

metadata validation

See "validation process" on page 784.

minimum trust

The trust level that a data value will have when it is “old” (after the decay
period has elapsed). This value must be less than or equal to the maximum
trust. If the maximum and minimum trust are equal, the decay curve is a flat
line and the decay period and decay type have no effect. See also "decay
period" on page 752.

- 768 -
Model workbench

Part of the Informatica MDM Hub UI used to configure the solution during
deployment by the implementers, and for on-going configuration by data
architects of the various types of metadata and rules in response to changing
business needs.

Includes tools for creating query groups, defining packages and other schema
objects, and viewing the current schema.

non-contributing cross reference

A cross-reference (XREF) record that does not contribute to the BVT (best
version of the truth) of the base object record. As a consequence, the values
in the cross-reference record will never show up in the base object record.
Note that this is for state-enabled records only.

non-equal matching

When configuring match rules, prevents equal values in a column from
matching each other. Non-equal matching applies only to exact match
columns.

null value

The absence of a value in a column of a record. Null is not the same as blank
or zero.

operation

Deprecated term. See "batch job" on page 746.

Operational Reference Store (ORS)

Database that contains the rules for processing the master data, the rules for
managing the set of master data objects, along with the processing rules and
auxiliary logic used by the Informatica MDM Hub in defining the BVT. An
Informatica MDM Hub configuration can have one or more ORS databases.
The default name of an ORS is CMX_ORS. See also "Master Database" on page
763.

overmatching

For fuzzy-match base objects only, a match that results in too many matches,
including matches that are not relevant. When configuring match, the goal is

- 769 -
to find the optimal number of matches for your data. See "undermatching" on
page 783.

package

A package is a public view of one or more underlying tables in Informatica
MDM Hub. Packages represent subsets of the columns in those tables, along
with any other tables that are joined to the tables. A package is based on a
query. The underlying query can select a subset of records from the table or
from another package.

password policy

Specifies password characteristics for Informatica MDM Hub user accounts,
such as the password length, expiration, login settings, password re-use, and
other requirements. You can define a global password policy for all user
accounts in an Informatica MDM Hub implementation, and you can override
these settings for individual users.

path

See "match path" on page 765.

pending state (records)

Pending records are records that have not yet been approved for general
usage in the Hub. These records can have most operations performed on
them, but operations have to specifically request Pending records. If records
are required to go through an approval process, then these records have not
yet been approved and are in the midst of an approval process.

policy decision points (PDPs)

Specific security check points that determine, at run time, the validity of a
user’s identity ("authentication" on page 745), along with that user’s access to
Informatica MDM Hub resources ("authorization" on page 745).

policy enforcement points (PEPs)

Specific security check points that enforce, at run time, security policies for
authentication and authorization requests.

population

Defines certain characteristics about data in the records that you are
matching. By default, Informatica MDM Hub comes with the US population, but

- 770 -
Informatica provides a standard population per country. Populations account
for the inevitable variations and errors that are likely to exist in name,
address, and other identification data; specify how Informatica MDM Hub
builds match tokens; and specify how search strategies and match purposes
operate on the population of data to be matched. Used only with the Fuzzy
match/search strategy.

primary key

In a relational database table, a column (or set of columns) whose value
uniquely identifies a record. For example, the Department_Number column
would be the primary key of the Department table.

primary key match rule

Match rule that is used to match records from two systems that use the same
primary keys for records. See also "match column rule" on page 764.

private resource

An Informatica MDM Hub resource that is hidden from the Roles tool,
preventing its access via Services Integration Framework (SIF) operations.
When you add a new resource in Hub Console (such as a new base object), it
is designated a PRIVATE resource by default. See also "secure resource" on
page 777, "resource" on page 775.

privilege

Permission to access an Informatica MDM Hub resource. With Informatica MDM
Hub internal authorization, each role is assigned one of the following
privileges.
Privilege   Allows the User To...
READ        View data.
CREATE      Create data records in the Hub Store.
UPDATE      Update data records in the Hub Store.
MERGE       Merge and unmerge data.
EXECUTE     Execute cleanse functions and batch groups.

Privileges determine the access that external application users have to
Informatica MDM Hub resources. For example, a role might be configured to
have READ, CREATE, UPDATE, and MERGE privileges on particular packages
and package columns. These privileges are not enforced when using the Hub
Console, although the settings still affect the use of Hub Console to some
degree. See "secure resource" on page 777, "role" on page 775.

- 771 -
profile

In Hierarchy Manager, describes what fields and records an HM user may
display, edit, or add. For example, one profile can allow full read/write access
to all entities and relationships, while another profile can be read-only (no add
or edit operations allowed).

promotion process

Meaning depends on the context:
• Metadata Manager: Process of copying changes in design objects from
one repository to another. Promotion is used to copy incremental changes
between repositories. See also "Metadata Manager" on page 768,
"validation process" on page 784, "import process" on page 759, "change
list" on page 747.
• State Management: Process of changing the system state of individual
records in Informatica MDM Hub (for example from PENDING state to
ACTIVE state).

provider

See "security provider" on page 778.

provider property

A name-value pair that a security provider might require in order to
provide the service(s) that it offers.

publish

Process of submitting an Informatica MDM Hub message to a message queue
for distribution to other applications, databases, and so on. See "distribution"
on page 753.

query

A request to retrieve data from the Hub Store. Informatica MDM Hub allows
administrators to specify the criteria used to retrieve that data. Queries can
be configured to return selected columns, filter the result set with a WHERE
clause, use complex query syntax (such as GROUP BY, ORDER BY, and
HAVING clauses), and use aggregate functions (such as SUM, COUNT, and
AVG).
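
For example, a query of this kind might resemble the following SQL (the
table and column names are illustrative):

    SELECT   city, COUNT(*) AS num_customers
    FROM     C_CUSTOMER
    WHERE    country = 'US'
    GROUP BY city
    HAVING   COUNT(*) > 100
    ORDER BY num_customers DESC;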

- 772 -
query group

A logical group of queries. A query group is simply a mechanism for
organizing queries. See "query" on page 772.

raw table

A table that archives data from a landing table.

real-time mode

Way of interacting with Informatica MDM Hub via third-party applications,
which invoke Informatica MDM Hub operations via the Services Integration
Framework (SIF) interface. SIF provides operations for various services, such
as reading, cleansing, matching, inserting, and updating records. See also
"batch mode" on page 746, "Services Integration Framework (SIF)" on page
778.

reconciliation

For a given entity, Informatica MDM Hub obtains data from one or more
source systems, then reconciles “multiple versions of the truth” to arrive at
the master record—the best version of the truth—for that entity.
Reconciliation can involve cleansing the data beforehand to optimize the
process of matching and consolidating records for a base object. See
"distribution" on page 753.

record

A row in a table that represents an instance of an object. For example, in an
Address table, a record contains a single address. See also "source record" on
page 779, "consolidated record" on page 749.

referential integrity

Enforcement of parent-child relationship rules among tables based on
configured foreign key relationship.

regular expression

A computational expression that is used to match and manipulate text data
according to commonly-used syntactic conventions and symbolic patterns. In
Informatica MDM Hub, a regular expression function allows you to use regular
expressions for cleanse operations. To learn more about regular expressions,
including syntax and patterns, refer to the Javadoc for java.util.regex.Pattern.
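
For example, the following generic SQL call (purely illustrative, not a
Hub cleanse function) uses a regular expression to strip all non-digit
characters from a phone number:

    SELECT REGEXP_REPLACE('(555) 123-4567', '[^0-9]', '') FROM dual;
    -- returns 5551234567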

- 773 -
reject table

A table that contains records that Informatica MDM Hub could not insert into a
target table, such as:
• staging table (stage process) after performing the specified cleansing on a
record of the specified landing table
• Hub store table (load process)

A record could be rejected because the value of a cell is too long, or because
the record’s update date is later than the current date.

relationship

In Hierarchy Manager, describes the affiliation between two specific entities.
Hierarchy Manager relationships are defined by specifying the relationship
type, hierarchy type, attributes of the relationship, and dates for when the
relationship is active. See "relationship type" on page 774, "hierarchy" on
page 757.

relationship base object

A relationship base object is a base object used to store information about
Hierarchy Manager relationships.

relationship type

Describes general classes of relationships. The relationship type defines:
• the types of entities that a relationship of this type can include
• the direction of the relationship (if any)
• how the relationship is displayed in the Hub Console

See "relationship" on page 774, "hierarchy" on page 757.

repository

An Operational Reference Store (ORS). The ORS stores metadata about its
own schema and related property settings. In Metadata Manager, when
copying metadata between repositories, there is always a source repository
that contains the design object to copy, and the target repository that is
the destination for the design object. See also "Metadata Manager" on page 768,
"validation process" on page 784, "import process" on page 759, "promotion
process" on page 772, "export process" on page 755, "change list" on page
747.

- 774 -
request

Informatica MDM Hub request (API) that allows external applications to access
specific Informatica MDM Hub functionality using the Services Integration
Framework (SIF), a request/response API model.

resource

Any Informatica MDM Hub object that is used in your Informatica MDM Hub
implementation. Certain resources can be configured as secure resources:
base objects, mappings, packages, remote packages, cleanse functions, HM
profiles, the audit table, and the users table. In addition, you can configure
secure resources that are accessible by SIF operations, including content
metadata, match rule sets, metadata, batch groups, the audit table, and the
users table. See "private resource" on page 771, "secure resource" on page
777, "resource group" on page 775.

resource group

A collection of secure resources that simplify privilege assignment, allowing
you to assign privileges to multiple resources at once, such as easily assigning
resource groups to a role. See "resource" on page 775, "privilege" on page
771.

Resource Kit

The Informatica MDM Hub Resource Kit is a set of utilities, examples, and
libraries that provide examples of Informatica MDM Hub functionality that can
be expanded on and implemented.

RISL decay

Rapid Initial Slow Later decay puts most of the decrease at the beginning of
the decay period. The trust level follows a concave parabolic curve. If a
source system has this decay type, a new value from the system will probably
be trusted but this value will soon become much more likely to be overridden.
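
For illustration only, one quadratic curve with this shape (the exact
equation used by the product is an assumption here), where T is the decay
period and 0 <= t <= T, is:

    trust(t) = minimum trust + (maximum trust - minimum trust) * (1 - t/T)^2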

role

Defines a set of privileges to access secure Informatica MDM Hub resources.
See "user" on page 783, "user group" on page 784, "privilege" on page 771.

row

See "record" on page 773.

- 775 -
rule

See "match rule" on page 765.

rule set

See "match rule set" on page 765.

rule set filtering

Ability to exclude records from being processed by a match rule set. For
example, if you had an Organization base object that contained multiple types
of organizations (customers, vendors, prospects, partners, and so on), you
could define a match rule set that selectively processed only vendors. See
"match process" on page 765.

schema

The data model that is used in a customer’s Informatica MDM Hub
implementation. Informatica MDM Hub does not impose or require any
particular schema. The schema is independent of the source systems.

Schema Manager

The Schema Manager is a design-time component in the Hub Console used to
define the schema, as well as define the staging and landing tables. The
Schema Manager is also used to define rules for match and merge, validation,
and message queues.

Schema Viewer Tool

The Schema Viewer tool is a design-time component in the Hub Console used
to visualize the schema configured for your Informatica MDM Hub
implementation. The Schema Viewer is particularly helpful for visualizing a
complex schema.

search levels

Defines how stringently Informatica MDM Hub searches for matches: narrow,
typical, exhaustive, or extreme. The goal is to find the optimal number of
matches for your data—not too few (undermatching), which misses significant
matches, or too many (overmatching), which generates too many matches,
including insignificant ones. See "overmatching" on page 769,
"undermatching" on page 783.

- 776 -
secure resource

A protected Informatica MDM Hub resource that is exposed to the Roles tool,
allowing the resource to be added to roles with specific privileges. When a
user account is assigned to a specific role, then that user account is authorized
to access the secure resources via SIF according to the privileges associated
with that role. In order for external applications to access an Informatica MDM
Hub resource using SIF operations, that resource must be configured as
SECURE. Because all Informatica MDM Hub resources are PRIVATE by default,
you must explicitly make a resource SECURE after the resource has been
added. See also "private resource" on page 771, "resource" on page 775.
Status Setting  Description

SECURE          Exposes this Informatica MDM Hub resource to the Roles
                tool, allowing the resource to be added to roles with
                specific privileges. When a user account is assigned to a
                specific role, then that user account is authorized to
                access the secure resources using SIF requests according
                to the privileges associated with that role.

PRIVATE         (Default.) Hides this Informatica MDM Hub resource from
                the Roles tool and prevents its access via Services
                Integration Framework (SIF) operations. When you add a new
                resource in Hub Console (such as a new base object), it is
                designated a PRIVATE resource by default.

security

The ability to protect information privacy, confidentiality, and data integrity by
guarding against unauthorized access to, or tampering with, data and other
resources in your Informatica MDM Hub implementation. See also
"authentication" on page 745, "authorization" on page 745, "privilege" on page
771, "resource" on page 775.

Security Access Manager (SAM)

Security Access Manager (SAM) is Informatica MDM Hub’s comprehensive
security framework for protecting Informatica MDM Hub resources from
unauthorized access. At run-time, SAM enforces your organization’s security
policy decisions for your Informatica MDM Hub implementation, handling user
authentication and access authorization according to your security
configuration.

Security Access Manager workbench

Includes tools for managing users, groups, resources, and roles.

- 777 -
security provider

A third-party application that provides security services (authentication,
authorization, and user profile services) for users accessing Informatica MDM
Hub.

security payload

Raw binary data supplied to an Informatica MDM Hub operation request that
can contain supplemental data required for further authentication and/or
authorization.

segment matching

Way of limiting match rules to specific subsets of data. For example, you
could define different match rules for customers in different countries by using
segment matching to limit certain rules to specific country codes. Segment
matching is configured on a per-rule basis and applies to both exact-match
and fuzzy-match base objects.

Services Integration Framework (SIF)

The part of Informatica MDM Hub that interfaces with client programs.
Logically, it serves as a middle tier in the client/server model. It enables you
to implement the request/response interactions using any of the following
architectural variations:
• Loosely coupled Web services using the SOAP protocol.
• Tightly coupled Java remote procedure calls based on Enterprise
JavaBeans (EJBs) or XML.
• Asynchronous Java Message Service (JMS)-based messages.
• XML documents going back and forth via Hypertext Transfer Protocol
(HTTP).

SIRL decay

Slow Initial Rapid Later decay puts most of the decrease at the end of the
decay period. The trust level follows a convex parabolic curve. If a source
system has this decay type, it will be relatively unlikely for any other system
to override the value that it sets until the value is near the end of its decay
period.

soft delete

A base object or a cross-reference record is marked as deleted in a user
attribute or in the HUB_STATE_IND. See "hard delete" on page 757.
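
For example, assuming the conventional HUB_STATE_IND values (1 = ACTIVE,
0 = PENDING, -1 = DELETED), a soft delete amounts to an update such as the
following sketch (the table name is illustrative; in practice the state is
changed through Hub tools and SIF requests rather than by direct SQL):

    UPDATE C_CUSTOMER
    SET    HUB_STATE_IND = -1          -- mark the record as soft-deleted
    WHERE  ROWID_OBJECT  = '42';       -- placeholder key value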

- 778 -
source record

A raw record from a source system. See also "record" on page 773, "source
system" on page 779.

source system

An external system that provides data to Informatica MDM Hub. See "distinct
source system" on page 753, "source record" on page 779.

stage process

Process of reading the data from the landing table, performing any configured
cleansing, and moving the cleansed data into the corresponding staging table.
If you enable delta detection, Informatica MDM Hub only processes new or
changed records. See "staging table" on page 779, "landing table" on page
761.

staging table

A table where cleansed data is temporarily stored before being loaded into
base objects via load jobs. See "stage process" on page 779, "load process" on
page 761.

state-enabled base object

A base object for which state management is enabled.

state management

The process for managing the system state of base object and cross-reference
records to affect the processing logic throughout the MRM data flow. You can
assign a system state to base object and cross-reference records at various
stages of the data flow using the Hub tools that work with records. In addition,
you can use the various Hub tools for managing your schema to enable state
management for a base object, or to set user permissions for controlling who
can change the state of a record.

State management is limited to the following states: ACTIVE, PENDING, and
DELETED.

state transition rules

Rules that determine whether and when a record can change from one state to
another. State transition rules differ for base object and cross-reference
records.

- 779 -
stored procedure

A named set of Structured Query Language (SQL) statements that are
compiled and stored on the database server. Informatica MDM Hub batch jobs
are encoded in stored procedures so that they can be run using job execution
scripts in job scheduling software (such as Tivoli or CA Unicenter).

stripping

Deprecated term. See "tokenize process" on page 781.

strip table

Deprecated term. See "match key table" on page 764.

surviving cell data

When evaluating cells to merge from two records, Informatica MDM Hub
determines which cell data should survive and which one should be discarded.
The surviving cell data (or winning cell) is considered to represent the better
version of the truth between the two cells. Ultimately, a single, consolidated
record contains the best surviving cell data and represents the best version of
the truth.

survivorship

Determination made by Informatica MDM Hub when evaluating cells to merge
from two records. Informatica MDM Hub determines which cell data should
survive and which one should be discarded. Survivorship applies to both trust-
enabled columns and columns that are not trust enabled. When comparing
cells from two different records, Informatica MDM Hub determines
survivorship based on properties of the data. For example, if the two columns
are trust-enabled, then the cell with the highest trust score wins. If the trust
scores are equal, then the cell with the most recent LAST_UPDATE_DATE wins.
If the LAST_UPDATE_DATE is equal, Informatica MDM Hub uses other criteria
for determining survivorship.

system column

A column in a table that Informatica MDM Hub automatically creates and
maintains. System columns contain metadata. Common system columns for a
base object include ROWID_OBJECT, CONSOLIDATION_IND, and LAST_
UPDATE_DATE. See "column" on page 748, "user-defined column" on page
783.
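
For example, an illustrative query over these system columns (the base
object name C_CUSTOMER is hypothetical):

    SELECT ROWID_OBJECT, CONSOLIDATION_IND, LAST_UPDATE_DATE
    FROM   C_CUSTOMER;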

- 780 -
system state

Describes how base object records are supported by Informatica MDM Hub.
The following states are supported: ACTIVE, PENDING, and DELETED. See
"state management" on page 779.

Systems and Trust Tool

Systems and Trust tool is a design-time tool used to name the source systems
that can provide data for consolidation in Informatica MDM Hub. You use this
tool to define the trust settings associated with each source system for each
trust-enabled column in a base object.

table

In a database, a collection of data that is organized in rows (records) and
columns. A table can be seen as a two-dimensional set of values
corresponding to an object. The columns of a table represent characteristics of
the object, and the rows represent instances of the object. In the Hub Store,
the Master Database and each Operational Reference Store (ORS) represents
a collection of tables. Base objects are stored as tables in an ORS.

target database

In the Hub Console, the Master Database or an Operational Reference Store
(ORS) that is the target of the current tool. Tools that manage data stored in
the Master Database, such as the Users tool, require that your target database
is the Master Database. Tools that manage data stored in an ORS require that
you specify which ORS to use as the target database.

tokenize process

Specialized form of data standardization that is performed before the match
comparisons are done. For the most basic match types, tokenizing simply
removes “noise” characters like spaces and punctuation. The more complex
match types result in the generation of sophisticated match codes—strings of
characters representing the contents of the data to be compared—based on
the degree of similarity required. See also "match key" on page 764, "match
key table" on page 764, "match token" on page 766.

token table

Deprecated term. See "match key table" on page 764.

- 781 -
traceability

The maintenance of data so that you can determine which systems—and which
records from those systems—contributed to consolidated records.

transactional data

Represents the actions performed by an application, typically captured or
generated by an application as part of its normal operation. It is usually
maintained by only one system of record, and tends to be accurate and
reliable in that context. For example, your bank probably has only one
application for managing transactional data resulting from withdrawals,
deposits, and transfers made on your checking account.

transitive match

During the Build Match Group (BMG) process, a match that is made indirectly
due to the behavior of other matches. For example, if record 1 matches to
record 2, record 2 matches to record 3, and record 3 matches to record 4,
after the BMG process removes redundant matches, it might produce results
in which records 2, 3, and 4 match to record 1. In this example, there was no
explicit rule that matched record 4 to record 1. Instead, the match was made
indirectly.

tree unmerge

Unmerge a tree of merged base object records as an intact sub-structure. A
sub-tree having an unmerged base object record as its root will come out of
the original merge tree structure. (For example, merge a1 and a2 into a,
then merge b1 and b2 into b, and then finally merge a and b into c. If you
then perform a tree unmerge on a, the sub-tree consisting of a, a1, and a2
comes out of the original tree c, with a as the root of the unmerged
sub-tree.)

See also: "cascade unmerge" on page 747, "linear unmerge" on page 761.

trust

Mechanism for measuring the confidence factor associated with each cell
based on its source system, change history, and other business rules. Trust
takes into account the age of data, how much its reliability has decayed over
time, and the validity of the data.

- 782 -
trust level

For a source system that provides records to Informatica MDM Hub, a number
between 0 and 100 that assigns a level of confidence and reliability to that
source system, relative to other source systems. The trust level has meaning
only when compared with the trust level of another source system.

trust score

The current level of confidence in a given record. During load jobs,
Informatica MDM Hub calculates the trust score for each record. If validation
rules are defined for the base object, then the Load job applies these
validation rules to the data, which might further downgrade trust scores.
During the consolidation process, when two records are candidates for merge
or link, the values in the record with the higher trust score win. Data
stewards can manually override trust scores in the Merge Manager tool.

undermatching

For fuzzy-match base objects only, a match that results in too few matches,
which misses relevant matches. When configuring match, the goal is to find
the optimal number of matches for your data. See "overmatching" on page
769.

unmerge

Process of unmerging previously-merged records. For merge-style base
objects only. See "manual unmerge" on page 762, "merge-style base object"
on page 767.

user

An individual (person or application) who can access Informatica MDM Hub
resources. Users are represented in Informatica MDM Hub by user accounts,
which are defined in the Master Database. See "user group" on page 784,
"Master Database" on page 763.

user-defined column

Any column in a table that is not a system column. User-defined columns are
added in the Schema Manager and usually contain business data. See
"column" on page 748, "system column" on page 780.

- 783 -
user exit

An unencrypted stored procedure that includes a set of fixed, pre-defined
parameters. The procedure is configured, on a per-base object basis, to
execute at a specific point during an Informatica MDM Hub batch process run.

Developers can extend Informatica MDM Hub batch processes by adding
custom code to the appropriate user exit procedure for pre- and post-batch
job processing. See "stored procedure" on page 780.
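
A hypothetical skeleton, for orientation only (the actual user exit names
and parameter lists are fixed by Informatica MDM Hub; the names below are
illustrative, not the product's API):

    CREATE OR REPLACE PROCEDURE my_post_load_user_exit (
      in_rowid_table  IN  VARCHAR2,   -- illustrative parameter
      out_error_msg   OUT VARCHAR2    -- illustrative parameter
    ) AS
    BEGIN
      out_error_msg := NULL;          -- custom pre-/post-batch logic here
    END;
    /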

user group

A logical collection of user accounts. See "user" on page 783.

user object

User-defined functions or procedures that are registered with the Informatica
MDM Hub to extend its functionality. There are four types of user objects:
User Object     Description

User Exits      A user-customized, unencrypted stored procedure that
                includes a set of fixed, pre-defined parameters. The
                procedure is configured, on a per-base object basis, to
                execute at a specific point during an Informatica MDM Hub
                batch process run.

Custom Stored   Stored procedures that are registered in table
Procedures      C_REPOS_TABLE_OBJECT and can be invoked from Batch
                Manager.

Custom Java     Java cleanse functions that supplement the standard
Cleanse         cleanse libraries with customer logic. These functions are
Functions       basically Jar files and stored as BLOBs in the database.

Custom Button   Custom UI functions that supply additional icons and logic
Functions       in Data Manager, Merge Manager and Hierarchy Manager.

Utilities workbench

Includes tools for auditing application events, configuring and running batch
groups, and generating the SIF APIs.

validation process

Process of verifying the completeness and integrity of the metadata that
describes a repository. The validation process compares the logical model of a
repository with its physical schema. If any issues arise, the Metadata Manager
generates a list of issues requiring attention. See also "Metadata Manager" on
page 768.

- 784 -
validation rule

Rule that tells Informatica MDM Hub the condition under which a data value is
not valid. When data meets the criteria specified by the validation rule, the
trust value for that data is downgraded by the percentage specified in the
validation rule. If the Reserve Minimum Trust flag is set for the column, then
the trust cannot be downgraded below the column’s minimum trust.
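
For example, if a value's trust score is 80 and a matching validation rule
specifies a 50% downgrade, the trust drops to 80 * (1 - 0.5) = 40; if the
column's minimum trust is 45 and Reserve Minimum Trust is set, the trust
is held at 45 instead. (All numbers are illustrative.)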

workbench

In the Hub Console, a mechanism for grouping similar tools. A workbench is a
logical collection of related tools. For example, the Cleanse workbench
contains cleanse-related tools: Cleanse Match Server, Cleanse Functions, and
Mappings.

write lock

In the Hub Console, a lock that is required in order to make changes to the
underlying schema. All non-data steward tools (except the ORS security tools)
are in read-only mode unless you acquire a write lock. Write locks allow
multiple, concurrent users to make changes to the schema. See "exclusive
lock" on page 755.

- 785 -
Index: Accept Non-Matched Records As Unique jobs – base objects

Index errors 690

events 684

log entries, examples of 694


A message queues 689
Accept Non-Matched Records As Unique jobs532
password changes 686
,
purging the audit log
568 696

ACTIVE system state, about 160 systems to audit 687

Address match purpose 415 viewing the audit log 693

Address_Part1 key type 392 XML 685

Admin source system authentication

about the Admin source system 265 about authentication 622

renaming 267 external authentication providers 622

allow null foreign key 280 external directory authentication 622

allow null update 279 internal authentication 622

ANSI Code Page 703 authorization

asynchronous batch jobs 563 about authorization 623

Audit Manager external authorization 623

about the Audit Manager 685 internal authorization 623

starting 686 Auto Match and Merge jobs 532, 570

types of items to audit 686 Autolink jobs 532, 570

audit trails, configuring 300 Automerge jobs 534, 571

auditing Auto Match and Merge jobs 534

about integration auditing 684 B

API requests 688 base object style 93

audit log 691 base objects

audit log table 692 adding columns 81

Audit Manager tool 685 converting to entity base objects 184

authentication and 685 creating 95

configurable settings 687 defining 83

enabling 685 deleting 101

- 786 -
Index: batch groups – batch jobs

described 74 ,
568
editing 95
asynchronous execution 563
exact match base objects 247
Auto Match and Merge jobs 532,
fuzzy match base objects 248
570
history table 89
Autolink jobs 532, 570
impact analysis 100
automatically-created batch jobs 499
load inserts 233
Automerge jobs 534, 571
load updates 234
BVT Snapshot jobs 535
overview of 83
C_REPOS_JOB_CONTROL table 565
record survivorship, state management162
C_REPOS_JOB_METRIC table 565
relationship base objects 374
C_REPOS_JOB_METRIC_TYPE table565
reserved suffixes 77
C_REPOS_JOB_STATUS_TYPEC table565
reverting from relationship base objects200
C_REPOS_TABLE_OBJECT_V table563
style 93
clearing history 512
system columns 84
command buttons 504
batch groups
configurable options 504
about batch groups 598
configuring 496
adding 514
design considerations 498
cmxbg.execute_batchgroup stored procedure599
executing 505
cmxbg.get_batchgroup_status stored procedure602
executing, about 559
cmxbg.reset_batchgroup stored procedure601
execution scripts 560
deleting 515
External Match jobs 535, 572
editing 515
foreign key relationships and 498
executing 522
Generate Match Token jobs 573
executing with stored procedures 598
Generate Match Tokens jobs 540
levels, configuring 516
Hub Delete jobs 541
stored procedures for 599
job execution logs 506
batch jobs
job execution status 506
about batch jobs 496
Key Match jobs 541, 579
Accept Non-Matched Records As Unique 532
Load jobs 542, 580

- 787 -
Index: Batch Viewer tool – cleanse functions

Manual Link jobs 545 starting 501

Manual Merge jobs 545 best version of the truth 259

Manual Unlink jobs 546 best version of the truth (BVT) 82,
261
Manual Unmerge jobs 546
build match groups (BMGs) 249
Match Analyze jobs 550, 588
build_war macro 614
Match for Duplicate Data jobs 552,
589 BVT Snapshot jobs 535

Match jobs 547, 586 C

Migrate Link Style to Merge Style jobs552 C_REPOS_AUDIT table 692

Multi Merge jobs 552 C_REPOS_JOB_CONTROL table 565

Promote jobs 552, 592 C_REPOS_JOB_METRIC table 565

properties of 503 C_REPOS_JOB_METRIC_TYPE table 565

refreshing the status 505 C_REPOS_JOB_STATUS_TYPEC table565

rejected records 510 C_REPOS_SYSTEM table 87, 265

Reset Links jobs 555 C_REPOS_TABLE_OBJECT_V

Reset Match Table jobs 555 about 560

results monitoring 563 C_REPOS_TABLE_OBJECT_V table 561,


563
Revalidate jobs 556, 595
cascade delete, about 575
running manually 502
cascade unmerge 446, 583
scheduling 559
cell update 279
selecting 502
cleanse functions
sequencing batch jobs 497
about cleanse functions 314
setting job status to incomplete 505
aggregation 288
Stage jobs 556, 596
availability of 315
status, setting 505, 527
Cleanse Functions tool 315
supporting tables 497
cleanse lists 333
Synchronize jobs 352, 557, 597
conditional execution components331
Unmerge jobs 583
configuration overview 317
when changes occur 500
constants 328
Batch Viewer tool
decomposition 287
about 501

- 788 -
Index: Cleanse Functions tool – cleansing data

  function modes 327
  graph functions 321
  inputs 328
  Java libraries 318
  libraries 315, 317
  logging 327
  mappings 287
  outputs 329
  properties of 316
  regular expression functions 320
  secure resources 315
  testing 330
  types 316
  types of 315
  user libraries 317
  using 314
  workspace buttons 327
  workspace commands 327
Cleanse Functions tool
  starting 315
  workspace buttons 327
  workspace commands 327
cleanse lists
  about cleanse lists 333
  adding 333
  defaultValue property 336
  editing 336
  exact match 338
  Input string property 335
  match output strings, importing 341
  match strings, importing 338
  matched property 336
  matchFlag property 336
  output property 336
  properties of 334
  regular expression match 338
  replaceAllOccurrences property 335
  searchType property 335
  SQL match 338
  stopOnHit property 335
  string matches 338
  Strip property 335
Cleanse Match Servers
  about Cleanse Match Servers 308
  adding 312
  batch jobs 309
  Cleanse Match Server tool 310
  cleanse requests 309
  configuring 308
  deleting 313
  distributed 309
  editing 312
  modes 308
  on-line operations 309
  properties of 310
  testing 314
cleansing data
  about cleansing data 307
  setup tasks 308
  Unicode settings 702
clearing history of batch jobs 512
cmxbg.execute_batchgroup 599
cmxbg.get_batchgroup_status 602
cmxbg.reset_batchgroup 601
CMXLB.DEBUG_PRINT 726
CMXMIG.DEBUG_PRINT 726
cmxue package
  user exits for Oracle databases 708
CMXUT.CLEAN_TABLE
  removing BO data 607
color choices window 203
columns
  adding to tables 102
  data types 102
  properties of 103
  reserved names 78
command buttons 44
complete tokenize ratio 90
concepts 19
conditional execution components
  about conditional execution components 332
  adding 332
  when to use 332
conditional mapping 296
configuration requirements
  User Object Registry, for custom code 679
Configuration workbench 46, 737
consolidated record 82
consolidation
  best version of the truth 259
consolidation indicator
  about the consolidation indicator 219
  sequence 220
  values 219
consolidation process
  data flow 257
  managing 259
  options 258
  overview 255
CONSOLIDATION_IND column 85
constants 328
constraints, allowing to be disabled 90
Contact match purpose 417
control tables 345
Corporate_Entity match purpose 418
CREATE_DATE column 84, 276
CREATOR column 84, 276
cross-reference tables
  about cross-reference tables 86
  columns 87
  defined 75
  described 75
  history table 89
  relationship to base objects 86
  ROWID_XREF 87
custom button functions
  registering 682
  viewing 683
custom buttons
  about custom buttons 730
  adding 736
  appearance of 731
  clicking 731
  custom functions, writing 732
  deploying 735
  examples of 732
  icons 734
  listing 736
  properties file 736
  text labels 734
  type change 736
  updating 736
custom functions
  client-based 732
  deleting 736
  server-based 732
  writing 732
custom indexes
  about custom indexes 97
  adding 98
  deleting 100
  editing 100
  navigating to the node 98
custom Java cleanse functions
  viewing 682
  viewing and registering 681
custom queries
  about custom queries 147
  adding 148
  deleting 150
  editing 149
custom stored procedures
  about 680
  about custom stored procedures 604
  example code 607
  index, registering 606
  parameters of 604
  registering 605
  viewing 681

D
data cleansing
  about data cleansing 307
  Cleanse Match Servers 308
Data Steward workbench 48
data stewards
  tools for 48
data types 102
Database Debug Log
  writing messages to 607
database object name, constraints 77
databases
  database ID 65
  selecting 32
  target database 31
  Unicode, configuring 698
  user access 653
Databases tool 55
  about the Databases tool 54
  starting 55
datasources
  about datasources 71
  creating 71
  JDBC datasources 71
  removing 72
DEBUG_PRINT 726
decay curve 348
decay types
  linear 348
  RISL 348
  SIRL 348
DELETED system state, about 160
DELETED_BY column 85, 88, 276
DELETED_DATE column 85, 88, 276
DELETED_IND column 85, 88, 276
delta detection
  configuring 302
  considerations for using 305
  how handled 304
  landing table configuration 270
DIRTY_IND column 85
display packages 153
distinct mapping 295
distinct source systems 445
Division match purpose 417
documentation 13
DROP_TEMP_TABLES stored procedure 564
duplicate data
  match for 249
duplicate match threshold 91

E
encrypting passwords 68
entities
  about 182
  display options 189
entity base objects
  about 182
  adding 183
  converting from base objects 184
  reverting to base objects 190
entity icons
  adding 180
  deleting 181
  editing 181
  uploading default 179
entity icons, configuring 180
entity objects
  about 182
entity types
  about 182
  adding 186
  assigning HM packages to 210
  deleting 189
  editing 188
errors, auditing 690
exact matches
  exact match / search strategy 409
  exact match base objects 247
  exact match columns 387, 423
  exact match strategy 370
exclusive locks 34
exclusive write lock
  acquiring 36
execution scripts 560
exhaustive search level 402
extended key widths 393
external application users 647
External Match jobs 572
  about External Match jobs 535
  input table 536
  output table 537
  running 539
external match tables
  system columns 536
extreme search level 402

F
Family match purpose 414
Fields match purpose 418
filters
  about filters 384
  adding 385
  deleting 387
  editing 386
  properties of 384
foreign-key relationships
  about foreign-key relationships 113
  adding 115
  deleting 117
  editing 116
  virtual relationships 115
foreign key relationship base object
  creating 196
foreign key relationships
  creating 113-114
  defined 114
  lookups 283
  supported 498
fuzzy matches
  fuzzy match / search strategy 409
  fuzzy match base objects 248, 390
  fuzzy match columns 387
  fuzzy match strategy 370

G
GBID columns 104
Generate Match Tokens jobs 540, 573
Generate Match Tokens on Load 544
generating match tokens on load 92
global
  password policy 654
Global Identifier (GBID) columns 104
glossary 744
graph functions
  about graph functions 322
  adding 322
  adding functions to 323
  conditional execution components 331
  inputs 322
  outputs 322
group execution logs
  status values 525
  viewing 525

H
hierarchies
  about 191
  adding 191
  configuring 191
  deleting 192
  editing 192
Hierarchies tool
  configuration overview 170
  starting 178
Hierarchy Manager
  configuration overview 170
  entity icons, uploading 179
  prerequisites 170
  repository base object tables 178
  sandboxes 214
  upgrading from previous versions 179
highest reserved key 278
history
  enabling 90
history tables
  base object history tables 89
  cross-reference history tables 89
  defined 75
  enabling 96
HM packages
  about HM packages 205
  adding 205
  assigning to entity types 210
  configuring 205
  deleting 209
  editing 209
Household match purpose 413
Hub
  installation details 45
  tools 46
  version information 45
Hub Console
  about the Hub Console 29
  accessing 30
  customizing the interface 45
  login 31
  organization of 32
  Processes view 33
  Quick Launch tab 45
  target database
    connecting to 32
    selecting 31
  toolbar 45
  window sizes and positions 45
  wizard welcome screens 45
Hub Delete jobs 541
  history tables, impact on 576
  IN_OVERRIDE_HISTORY_IND 577
  IN_PURGE_HISTORY_IND 577
  records on hold (CONSOLIDATION_IND=9), impact
  stored procedure, about 575
Hub Store 74
  creating databases in 52
  databases 51
  Master Database 51
  Operational Record Store (ORS) 51
  properties of 65
  schema 73
  table types 74
HUB_STATE_IND column 276
HUB_STATE_IND column, about 160

I
immutable rowid object 443
importing table column definitions 109
IN_OVERRIDE_HISTORY_IND 577
IN_PURGE_HISTORY_IND 577
incremental loads 229
index
  custom, registering 606
Individual match purpose 412
initial data loads (IDLs) 229
inputs 328
integration auditing 684
inter-table paths 374
interaction ID column, about 161
intertable matching
  described 423
intra-table paths 377

J
JAR files
  ORS-specific, downloading 614
Java archive (JAR) files
  tools.jar 612, 617
Java compilers 612, 617
JDBC data sources
  security, configuring 657
JMS Event Schema Manager
  about 616
  auto-searching for out-of-sync objects 619
  finding out-of-sync objects 618
  starting 616
JMS Event Schema Manager tool
  about 615

K
Key Match jobs 541, 579
key types 392
key widths 392

L
land process
  C_REPOS_SYSTEM table 265
  configuration tasks 264
  data flow 222
  external batch process 223
  extract-transform-load (ETL) tool 223
  landing tables 222
  managing 224
  overview 222
  real-time processing (API calls) 223
  source systems 222
  ways to populate landing tables 223
landing tables
  about landing tables 269
  adding 271
  columns 269
  defined 74
  editing 272
  properties of 266, 270
  removing 273
  Unicode 702
LAST_ROWID_SYSTEM column 85
LAST_UPDATE_DATE column 84, 270, 276
limited key widths 393
linear decay 348
linear unmerge 584
Load jobs 542, 580
  forced updates, about 544
  Generate Match Tokens on Load 544
  load batch size 91
  rejected records 510
  rules for running 543
load process
  data flow 227
  load inserts 232
  load updates 232
  overview 227
  steps for managing data 231
  tables, associated 228
loading by rowid 296
loading data 343
  incremental loads 229
  initial data loads (IDLs) 229
locking
  expiration 35
locks
  about locks 34
  types of 34
log file rolling
  about 726
login
  changing 38
  entering 31
logs, ORS database
  about 724
  configuring 727
  format 724
  levels 725
  sample file 729
lookups
  about lookups 283
  configuring 284

M
manual
  batch jobs 502
Manual Link jobs 545
Manual Merge jobs 545
Manual Unlink jobs 546
Manual Unmerge jobs 546
mapping
  between staging and landing tables 74
  diagrams 289
  removing 299
  testing 298
mappings
  about mappings 286
  adding 290
  cleansed 287
  conditional mapping 296
  configuring 286, 291
  copying 290
  distinct mapping 295
  editing 291
  jumping to a schema 297
  loading by rowid 296
  passed through 287
  properties of 289
  query parameters 294
Mappings tool 288, 556
Master Database 51
  creating 52
  password, changing 67
match
  accept all unmatched rows as unique 369
  check for missing children 381
  child records 423
  dynamic match analysis threshold setting 372
  fuzzy population 370
  Match Analyze jobs 550
  match batch 251
  Match for Duplicate Data jobs 552
  match minutes, maximum elapsed 91
  match only once setting 371
  match only previous rowid objects setting 371
  match output strings, importing 341
  match pool 251
  match strings, importing 338
  match subtype 420
  match table, resetting 555
  match tables 548
  match tokens, generate on PUT 92
  match/search strategy 370
  maximum matches for manual consolidation 368
  non-equal matching 421
  NULL matching 421
  path 373
  populations 699
  populations for fuzzy matches 370
  properties
    about match properties 367
    setting 366, 380
  segment matching 422
  strategy
    exact matches 370
    fuzzy matches 370
  string matches in cleanse lists 338
Match Analyze jobs 588
match column rules
  adding 424
  deleting 429
  editing 428
match columns
  about match columns 387
  exact match base objects 396
  exact match columns 387
  fuzzy match base objects 390
  fuzzy match columns 387
  key widths 392
  match key types 392
  missing children 381
Match for Duplicate Data jobs 589
Match jobs 547, 586
  state-enabled BOs 548
match key table 240
match key tables
  defined 75
match link
  Autolink jobs 532
  Manual Link jobs 545
  Manual Unlink jobs 546
  Migrate Link Style to Merge Style jobs 552
  Reset Links jobs 555
match paths
  about match paths 373
  inter-table paths 374
  intra-table paths 377
  relationship base objects 374
match process
  build match groups (BMGs) 249
  data flow 246
  exact match base objects 247
  execution sequence 251
  fuzzy match base objects 248
  managing 254
  match key table 248
  match pairs 252
  match rules 247
  match table 253
  match tables 248
  overview 245
  populations 248
  support tables 248
  transitive matches 249
match purposes
  field types 388
match rule sets
  about match rule sets 399
  adding 405
  deleting 407
  editing 405
  editing the name 406
  filters 403
  properties of 401
  search levels 401
match rules
  about match rules 247
  accept limit 419
  defining 407
  exact match columns 423
  match / search strategy 409
  match levels 418
  match purposes
    about 410
    Address match purpose 415
    Contact match purpose 417
    Corporate_Entity match purpose 418
    Division match purpose 417
    Family match purpose 414
    Fields match purpose 418
    Household match purpose 413
    Individual match purpose 412
    Organization match purpose 416
    Person_Name match purpose 412
    Resident match purpose 412
    Wide_Contact match purpose 418
    Wide_Household match purpose 415
  primary key match rules
    about 434
    adding 434
    editing 436-437
  properties of 408
  Reset Match Table jobs 555
  types of 247
match subtype 420
matching
  duplicate data 249
maximum trust 347
merge
  Manual Merge jobs 545
  Manual Unmerge jobs 546
Merge Manager tool 256
message queue servers
  about message queue servers 452
  adding 452
  deleting 453
  editing 453
message queue triggers
  enabling for state changes 164
message queues
  about message queues 261, 454
  adding 454
  auditing 689
  deleting 456
  editing 455
  message check interval 451
  Message Queues tool 450
  properties of 454
  receive batch size 451
  receive timeout 451
  status of 451
message schema
  ORS-specific, generating 615
  ORS-specific, generating and deploying 617
message triggers
  about message triggers 456
  adding 458
  considerations for 458
  deleting 463
  editing 462
  types of 457
messages
  elements in 464
  examples
    accept as unique message 465
    AmRule message 466
    BoDelete message 467
    BoSetToDelete message 468
    delete message 469
    insert message 470
    merge message 471, 485
    merge update message 471
    no action message 472
    PendingInsert message 473
    PendingUpdate message 474
    PendingUpdateXref message 475
    unmerge message 476
    update message 476
    update XREF message 477
    XRefDelete message 478
    XRefSetToDelete message 479
  examples (legacy)
    accept as unique message 481
    bo delete message 481
    bo set to delete message 482
    delete message 483
    insert message 484
    merge update message 486
    pending insert message 487
    pending update message 488
    pending update XREF message 488
    unmerge message 491
    update message 489
    update XREF message 490
    XREF delete message 492
    XREF set to Delete message 493
  filtering 465, 480
  message fields 480
metadata
  synchronizing 111
  trust 111
Migrate Link Style to Merge Style jobs 552
minimum trust 347
missing children, checking for 381
Model workbench 47
Multi Merge jobs 552

N
narrow search level 402
New Query Wizard 131, 148
NLS_LANG 704
non-equal matching 421
non-exclusive locks 34
NULL matching 421
null values
  allowing null values in a column 103

O
OBJECT_FUNCTION_TYPE_DESC
  list of values 562
Operational Reference Stores (ORS)
  about ORSs 51
  assigning users to 661
  configuring 55
  connection testing 67
  creating 52
  editing 64
  editing registration properties 62
  GETLIST limit (rows) 66
  JNDI data source name 65
  password, changing 68
  registering 56
  unregistering 70
operational reference stores (ORSs)
  logical names 611, 615
Oracle databases
  user exits located in cmxue package 708
Organization match purpose 416
Organization_Name key type 392
ORS-specific operations
  using SIF Manager tool 611
ORS database logs
  about 724
  configuring 727
  format 724
  levels 725
  log file rolling 726
  procedures for appending data 726
  sample file 729
OUT_TMP_TABLE_LIST return parameter 564
outputs 329
overview 10

P
Package Wizard 154
packages
  about packages 152
  creating 154
  deleting 158
  display packages 153
  HM packages 205
  join queries 157
  Package Wizard 154
  properties of 155
  PUT-enabled packages 153
  queries and packages 127
  refreshing after query change 157
  when to create 153
parallel degree 91
password policies
  global password policies 654
  private password policies 656
passwords
  changing 38
  encrypting 68
  global password policy 654
  private passwords 656
path components
  adding 382
  deleting 383
  editing 383
  properties of 381
PENDING system state
  enabling match 163
PENDING system state, about 160
Person_Name key type 392
Person_Name match purpose 412
PKEY_SRC_OBJECT column 87, 275
populations
  configuring 699
  multiple populations 701
  non-US populations 699
  selecting 370
POST_LANDING
  parameters 709
  user exit, using 709
POST_LOAD
  user exit, using 711
POST_MATCH
  parameters 712
  user exit, using 712
POST_MERGE
  parameters 713
  user exit, using 713
POST_STAGE
  parameters 711
  user exit, using 710
POST_UNMERGE
  parameters 714
  user exit, using 713
PRE_STAGE
  parameters 710
  user exit, using 710
PRE_USER_MERGE_ASSIGNMENT
  user exit, using 714
preferred key widths 393
preserving source system keys 277
primary key match rules
  about 434
  adding 434
  deleting 437
  editing 436
Private password policy 656
Processes view 33
profiles
  about profiles 211
  adding 211
  copying 215
  deleting 215
  editing 212
  validating 213
Promote batch job
  promoting records 166
Promote jobs 552
  about 592
providers
  custom-added 671
providers.properties file
  example 672
publish process
  distribution flow 261
  managing 263
  message queues 261
  message triggers 261
  optional 261
  ORS-specific schema file 262
  overview 260
  run-time flow 262
  XSD file 262
purposes, match 410
PUT-enabled packages 153
PUT_UPDATE_MERGE_IND column 88

Q
queries
  about queries 128
  adding 130
  columns 136
  conditions 139
  custom queries 147
  deleting 151
  editing 132
  impact analysis, viewing 150
  join queries 157
  New Query Wizard 131, 148
  overview of 128
  packages and queries 127
  Queries tool 128-129, 153
  results, viewing 150
  sort order for results 142
  SQL, viewing 147
  tables 134
Queries tool 129, 153
query groups
  about query groups 129
  adding 129
  deleting 130
  editing 130

R
rapid slow initial later (RISL) decay 348
regular expression functions
  about regular expression functions 320
  adding 320
reject tables 288
relationship base objects 374
  about 193
  converting to 198
  creating 194
  reverting to base objects 200
relationship types
  about 194
  adding 201
  deleting 204
  editing 204
relationships
  about relationships 193
  foreign key relationships 113
repository base object (RBO) tables 178
requeue on parent merge 91
Reset Links jobs 555
Reset Match Table jobs 555
Resident match purpose 412
resource groups
  adding 635
  deleting 637
  editing 636
resource privileges, assigning to roles 641
Revalidate jobs 556, 595
roles
  about roles 638
  adding 639
  assigning resource privileges to roles 641
  deleting 641
  editing 640
Roles tool 639
row-level locking
  about row-level locking 740
  configuring 741
  considerations for using 741
  default behavior 740
  enabling 741
  locks, types of 741
  wait times 742
ROWID_OBJECT column 84, 87, 276
ROWID_SYSTEM column 87
ROWID_XREF column 87

S
sandboxes 214
schema
  ORS-specific, generating 611
Schema Manager
  adding columns to tables 102
  base objects 82
  filtering items 39
  foreign key relationships 113
  searching for items 41
  show public system tables 40
  sorting display names 38
  starting 81
schema match columns 555
schema objects 77
schema trust columns 557
Schema Viewer
  column names 124
  command buttons 119
  context menu 123
  Diagram pane 119
  hierarchic view 121
  options 123
  orientation 124
  orthogonal view 122
  Overview pane 119
  panes 119
  printing 125
  saving as JPG 124
  starting 119
  toggling views 122
  zooming all 121
  zooming in 120
  zooming out 121
schemas
  about schemas 73
search levels for match rule sets 401
security
  authentication 622
  authorization 623
  concepts 621
  configuring 621
  JDBC data sources, configuring 657
  roles 638
  tools 48
Security Access Manager (SAM) 622
Security Access Manager workbench 48
security provider files
  about security provider files 665
  deleting 668
  list of provider files 666
  selecting 666
  uploading 667
Security Providers tool
  about security providers 664
  provider files 665
  starting 664
segment matching 422
sequencing batch jobs 497
SIF API
  ORS-specific, generating 611
  ORS-specific, removing 615
  ORS-specific, renaming 613
SIF Manager
  generating ORS-specific APIs 611
  out-of-sync objects, finding 615
SIF Manager tool
  about 611
source systems
  about source systems 265
  adding 266
  Admin source system 265
  defining 264
  distinct source systems 445
  highest reserved key 278
  immutable source systems 443
  preserving keys 277
  removing 268
  renaming 267
  system repository table (C_REPOS_SYSTEM) 265
  Systems and Trust tool, starting 265
SRC_LUD column 87
SRC_ROWID column 276
Stage jobs 556, 596
  lookups 283
  rejected records 510
stage process
  data flow 225
  managing 226
  overview 224
  tables, associated 225
Stage process
  user exits 709
staging data
  prerequisites 274
  setup tasks 274
staging tables
  about staging tables 275
  adding 280
  allow null foreign key 280
  allow null update 279
  cell update 279
  column properties 279
  columns 106, 275
  columns, creating 106
  defined 74
  editing 281
  highest reserved key 278
  jumping to source system 283
  lookups 283
  preserve source system keys 277
  properties of 277
  removing 286
standard key widths 393
state management
  about 159
  base object record survivorship 162
  enabling 163
  enabling match on pending records 163
  history of XREF promotion, enabling 163
  HUB_STATE_IND column 160
  interaction ID column 161
  Load jobs 542
  Match jobs 548
  message queue triggers, enabling 164
  modifying record states 164
  Promote batch job 166
  promoting records 165
  rules for loading data 168
  state transition rules, about 161
stored procedures
  batch groups 599
  batch jobs, list 566
  C_REPOS_TABLE_OBJECT_V, about 560
  custom stored procedures 604
  DROP_TEMP_TABLES 564
  executing batch groups 598
  Hub Delete jobs 575
  OBJECT_FUNCTION_TYPE_DESC 562
  removing BO data 607
  temporary tables 564
  transaction management 564
survivorship 221
Synchronize jobs 352, 557, 597
synchronizing metadata 111
system columns
  base objects 84
  described 102, 113
  external match tables 536
  system repository table 265
system states
  about 160
system tables, showing 40
Systems and Trust tool 265

T
table columns
  about table columns 102
  adding 108
  deleting 112
  editing 110
  Global Identifier (GBID) columns 104
  importing from another table 109
  staging tables 106
tables
  adding columns to 102
  base objects 74
  C_REPOS_AUDIT table 692
  C_REPOS_JOB_CONTROL table 565
  C_REPOS_JOB_METRIC table 565
  C_REPOS_JOB_METRIC_TYPE table 565
  C_REPOS_JOB_STATUS_TYPE table 565
  C_REPOS_TABLE_OBJECT_V table 561
  control tables 345
  cross-reference tables 75
  history tables 75
  Hub Store 74
  landing tables 74
  match key table 240
  match key tables 75
  reject tables 288
  staging tables 74
  supporting tables used by batch process 497
  system repository table (C_REPOS_SYSTEM) 265
target database
  changing 37
  selecting 31
temporary tables 564
tokenization process
  DIRTY_IND column 243
  when to execute 242
tokenize process
  about the tokenize process 240
  data flow 242
  key concepts 242
  match key table 240
  match keys 240
  match tokens 240
Tool Access tool 737
tools
  Batch Viewer tool 501
  Cleanse Functions tool 315
  Data Steward tools 48
  Databases tool 55
  described 46
  Mappings tool 556
  Merge Manager tool 256
  Queries tool 128-129, 153
  Schema Manager 81
  security tools 48
  Tool Access tool 737
  user access to 737
  Users tool 648
  utilities tools 48
  write locks 35
traceability 257
training 14
tree unmerge 584
trust 348
  about trust 344
  assigning trust levels 349
  calculations 345
  considerations for setting 349
  decay curve 348
  decay graph types 348
  decay periods 344
  defining 349
  enabling 349
  levels 344
  maximum trust 347
  minimum trust 347
  properties of 347
  slow initial rapid later (SIRL) decay 348
  synchronizing trust settings 557
  Systems and Trust tool 265
  trust levels, defined 344
typical search level 402

U
Unicode
  ANSI Code Page 703
  cleanse settings 702
  configuring 698
  Hub Console 703
  NLS_LANG 704
  Unix and locale recommendations 703
unmerge
  cascade unmerge 446
  manual unmerge 546
  unmerge child when parent unmerges 446
Unmerge jobs 583
  cascade unmerge 583
  linear unmerge 584
  tree unmerge 584
  unmerge all 583
UPDATED_BY column 84, 276
user exits
  about 679, 708
  cmxue package (Oracle) 708
  POST_LANDING 709
  POST_LOAD 711
  POST_MATCH 712
  POST_MERGE 713
  POST_STAGE 710
  POST_UNMERGE 713
  PRE_STAGE 710
  PRE_USER_MERGE_ASSIGNMENT 714
  Stage process 709
  types 708
  viewing 680
user groups
  about user groups 658
  adding 659
  assigning users to 661
  deleting 661
  editing 660
User Object Registry
  about 678
  configuration requirements for custom code 679
  custom button functions, viewing 683
  custom Java cleanse functions, viewing 682
  custom stored procedures, viewing 681
  starting 679
  user exits, viewing 680
user objects
  about 678
users
  about users 647
  adding 648
  assigning to Operational Record Stores (ORS) 661
  database access 653
  deleting 652
  editing 649
  external application users 647
  global password policies 654
  password settings 652
  private password policies 656
  properties of 648
  supplemental information 651
  tool access 737
  types of users 647
  user accounts 647
  Users and Groups tool 658
  Users tool 648
utilities tools 48
Utilities workbench 48

V
validation checks 353
validation rules
  about validation rules 353
  adding 359
  custom validation rules 356, 358
  defined 353
  defining 353
  domain checks 356
  downgrade percentage 357
  editing 360, 362
  enabling columns for validation 354
  examples of 357
  execution sequence 355
  existence checks 356
  pattern validation 356
  properties of 356
  referential integrity 356
  removing 362
  required columns 354
  reserve minimum trust 357
  rule column properties 357
  rule name 356
  rule SQL 357
  rule types 356
  state-enabled base objects 354
  validation checks 353

W
Web Services Description Language (WSDL)
  ORS-specific APIs 613
Wide_Contact match purpose 418
Wide_Household match purpose 415
Workbenches view 33
workbenches, defined 33
write lock
  acquiring 36
  releasing 36
  tools that require 35
write locks
  exclusive locks 34
  non-exclusive locks 34

X
XSD file
  downloading 618