
Session ID: MDM202 Configuration of Matching-Strategies for SAP Master Data Management

Contributing Speaker(s)
Christian Behre
NetWeaver Product Management, SAP AG

Remo Durante
Solution Architect, SAP Deutschland AG & Co.KG

SAP AG 2004, SAP TechEd / Session MDM202 / 2

Learning Objectives

As a result of this workshop, you will be able to:

- Understand the role of Master Data Management
- Understand the architecture of the Content Integrator
- Understand the matching process
- Explain the Content Integrator settings
- Understand the meaning of Normalization and Matching Algorithm and how to adapt them


Agenda: Overview, Architecture, Normalization, Matching Algorithm, Implementation Considerations

MDM: A Key Capability of SAP NetWeaver 04

SAP NetWeaver MDM
- Enables companies to store, augment, and consolidate master data with high quality
- Ensures consistent distribution to all applications and systems within a heterogeneous IT landscape
- Leverages existing IT investments in business-critical data
- Delivers vastly reduced TCO through effective master data management that ensures cross-system data consistency
- Accelerates and improves the execution of business processes

[Figure: SAP NetWeaver stack ("One Platform, One Product"). People Integration: Multi-channel access, Portal, Collaboration. Information Integration: Business Intelligence, Master Data Management, Knowledge Management. Process Integration: Integration Broker, Business Process Management. Application Platform: J2EE, ABAP, DB and OS Abstraction. Alongside the layers: Composite Application Framework, Life Cycle Management.]


SAP NetWeaver: Delivering on the promise of ESA

[Figure: the SAP NetWeaver stack (People Integration, Information Integration with Master Data Management, Process Integration, Application Platform) supporting cross-system processes such as Procurement, Sales, and Shipment on top of R/3, SRM, Siebel, i2, legacy, and other systems.]


Process Overview of SAP MDM: Loading Master Data

[Figure: SAP NetWeaver landscape with Enterprise Portal, Master Data Management (including Staging), BI, Knowledge Management, and Exchange Infrastructure, connected to ERP, CRM, legacy, and third-party systems. Numbered arrows mark the two loading variants I and II.]

I. Loading Master Data with Extraction: MDM triggers the load (PULL mechanism)
II. Loading Master Data with Periodic Inbound Collector: the client triggers the load (PUSH principle)


Process Overview of SAP MDM: Consolidating Master Data

[Figure: the same landscape, with the Content Integrator (CI), Staging, and Database inside Master Data Management. Numbered arrows (1a, 1b, 2, 4 to 7) trace the consolidation flow.]

1a. Consolidating Master Data from Extraction
1b. Consolidating Master Data from Staging (PIC)
The remaining steps are valid for both process variants.


Process Overview of SAP MDM: Consolidating Master Data - Example

[Figure: the same landscape, with similar business partner records from different client systems being compared ("?" or "=") in Master Data Management. Shown as a demo.]

Record fragments shown in the figure:
- Brown & Partner Inc., 248 Meadow Lane Drive, Sacramento, Ca. 95816
- Further names: Brownes & Partner; Brown Inc.
- Further addresses: MEADOW Lane, San Diego, CA 93860; 248 Meadow Lane Drive, Sacramento, Ca. 95819

Consolidating Master Data: Challenges & Solutions

Challenges:
- How can the data from different systems be compared at all?
- How can duplicates or identicals be determined?
- How can certain, possible, and most-probably-not hits be distinguished?
- How can the system itself decide which cleansing cases to
  - automatically confirm as duplicates or identicals?
  - present to the master data specialist for a decision?
  - automatically confirm as non-duplicates or non-identicals?

MDM cleansing capabilities provide:
- Normalization capabilities to store unaligned data in a comparable format
- Object-type-related matching algorithms to compare data sets
- Ranking of matching results to calculate matching scores
- Lower and upper score thresholds to pre-decide on cleansing cases


Architecture

SAP MDM 3.00 - Architecture

[Figure: component diagram of SAP MDM 3.00.]

- BI (BW 3.5)
- EP 6.0
- Master Data Engine 3.00: master data administration, process control (process chains), inbound staging, UIs, logical routing
- Master Data Clients (e.g. SAP ERP, CRM, SRM)
- XI 3.00: technical routing, structure mapping, key-mapping cache (new in 3.0)
- Content Integrator 3.00: matching engine, key-mapping administration
- SAP Solution Manager 3.1


Content Integrator - Process Overview

[Figure: Master Data Clients (MDC) deliver objects (Business Partner 1 to 5, Product 1 to 4) to the Master Data Server (MDS). The MDS staging area passes them to the Content Integrator (CI) as an XML file or via RFC.]

Processing steps in the Content Integrator:
1. Upload and validation
2. Normalization: Global Normalizer, GetNormalizedAttributeKeys, NormalizeObjects
3. Matching: GetMatchingCandidates, CalculateScore
4. Cleansing, resulting in the ID-Mapping

The explanation can be found in the notes.


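Content Integrator: Process Details

[Figure: data flow between the Master Data Server (MDS) and the Content Integrator (CI). The MDS staging area delivers data as an XML file or via RFC/API.]

1. Upload and validation
2. Normalization
   - The Global Normalizer writes to the Object Store Table
   - GetNormalizedAttributeKeys reads from the Object Store Table
   - NormalizeObjects writes to the Index Table
3. Matching
   - GetMatchingCandidates reads from the Index Table
   - CalculateScore writes to the Score Table
4. Cleansing
   - Comparison and Mapping read from the Score Table and write to the ID-Mapping Table, which is accessible through an API
   - Generation of MDS-Objects writes to the MDS

Normalization is triggered again for the generated MDS object (it stops after writing into the Index Table).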


Overview: Table View

Object Store Table
- During the process: MDC-Objects with attributes
- At the end of the process: MDS-Objects with attributes; MDC-Objects without attributes (they exist only to hold the references)

Index Table
- During the process: normalized and indexed MDC-Data
- At the end of the process: normalized and indexed MDS-Data

Score Table
- During the process: MDC-Data that lies above the lower threshold, with the related calculated score
- At the end of the process: the content is overwritten for every new process

ID-Mapping Table
- During the process: mapped MDC-Objects
- At the end of the process: MDS-Objects with the information about which MDC-Objects they are mapped to


Matching Strategy (Context)

The Matching Strategy
- contains settings for the Normalizing Algorithm and the Matching Algorithm
- is defined depending on the particular business objects and the requirements of the customer
- is used within a Matching Context (e.g. Global Spend)


Normalization

Normalization: The Process (Generally)

[Figure: excerpt of the process diagram. Step 1, Upload: the MDS staging area delivers data to the CI as an XML file or via RFC. Step 2, Normalization: the Global Normalizer writes to the Object Store Table, GetNormalizedAttributeKeys reads from it, and NormalizeObjects writes to the Index Table.]

Global Normalizer
- Makes the data comparable
- Particular settings: ignore blanks, special characters, or several punctuation marks; upper and lower case

Normalizing Algorithm
- GetNormalizedAttributeKeys: finds the indices that are defined within the coding
- NormalizeObjects: indexes the data and writes it into the Index Table

Settings for the Global Normalizer

Object Type

- Upper or lower case
- Include or exclude the characters
- Insert the character you want to include or exclude
- Enter the Source Path / Target Path
- Enter the Reference Path

Settings for the Global Normalizer


- Upper or lower case: this applies to the whole content of the field.
- Include or exclude the characters: if you include a certain set of characters, all other characters that are not explicitly listed are automatically excluded.
- Insert the character you want to include or exclude: be careful with the <blank> character. It must not stand in the last position when several characters are listed. The characters must also be listed without any separator.
- Enter the Source Path / Target Path: it corresponds to the field name of the source/target table. You can get it from the Development Guide on the Service Marketplace.
- Enter the Reference Path: you can get it from the object schema displayed in the Development Guide on the Service Marketplace.
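The effect of the case and include/exclude settings can be pictured with a minimal Java sketch. The class `GlobalNormalizerSketch`, its constructor parameters, and the order of operations (filter first, then change case) are assumptions made for illustration; this is not the Content Integrator implementation.

```java
import java.util.Locale;

/** Hypothetical stand-in for the Global Normalizer settings; not the SAP implementation. */
final class GlobalNormalizerSketch {
    private final boolean upperCase;   // true: upper case, false: lower case
    private final boolean include;     // true: keep only listed characters, false: drop them
    private final String characters;   // listed without any separator, e.g. " .-"

    GlobalNormalizerSketch(boolean upperCase, boolean include, String characters) {
        this.upperCase = upperCase;
        this.include = include;
        this.characters = characters;
    }

    /** Filters the characters first, then applies the case setting (the order is an assumption). */
    String normalize(String value) {
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean listed = characters.indexOf(c) >= 0;
            if (listed == include) {
                kept.append(c);
            }
        }
        String result = kept.toString();
        return upperCase ? result.toUpperCase(Locale.ROOT) : result.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Exclude blank, dot, and hyphen; convert to upper case.
        GlobalNormalizerSketch n = new GlobalNormalizerSketch(true, false, " .-");
        System.out.println(n.normalize("Brown & Partner Inc.")); // prints BROWN&PARTNERINC
    }
}
```

In include mode, every character that is not listed is dropped, which mirrors the rule that unlisted characters are automatically excluded.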


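Global Normalizer: Object model diagram of the Content Integrator Business Partner

[Figure: object model diagram of the Content Integrator Business Partner, annotated to show where the Reference Path and the Source Path / Target Path are read from. Source: Development Guide for Matching Strategies.]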


Normalizing Algorithm: GetNormalizedAttributeKeys

The indices can be defined within the Java coding. You can define as an index:
- one field
- several fields
- a certain part of a field (e.g. the first 8 characters)

Because you work directly in the coding, the specification of the indices is very flexible.
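The three kinds of index mentioned here (one field, several fields, part of a field) could be sketched in Java as follows. The class `IndexKeySketch`, the field names `name`, `city`, and `postalCode`, and the key labels are hypothetical; the real interface is described in the Matching Strategy Development Guide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of index definitions; field names and key labels are assumptions. */
final class IndexKeySketch {
    /** Builds index keys from one field, from several fields, and from a prefix of a field. */
    static List<String> attributeKeys(Map<String, String> record) {
        String name = record.getOrDefault("name", "");
        String city = record.getOrDefault("city", "");
        String postalCode = record.getOrDefault("postalCode", "");

        List<String> keys = new ArrayList<>();
        keys.add("POSTALCODE=" + postalCode);                 // one field
        keys.add("NAME_CITY=" + name + "|" + city);           // several fields combined
        String prefix = name.substring(0, Math.min(8, name.length()));
        keys.add("NAME8=" + prefix);                          // first 8 characters of a field
        return keys;
    }
}
```

A short prefix index like `NAME8` trades precision for recall: more records share a key, so more candidates reach the scoring step.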


Normalizing Algorithm: NormalizeObjects

The data is indexed and written into the Index Table, depending on the previous coding settings. The rows written for one object have these columns:
- Client
- Object ID
- Consecutive numbering of data records having the same Object ID
- Matched Entity Object ID
- Column Key/Index
- Column Value
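As a rough picture of how one object turns into several Index Table rows, consider the Java sketch below. The class `IndexRowSketch` and its field names are hypothetical, numbering from 1 is an assumption, and the Matched Entity ObjectID column is left empty because its exact content is not described here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory picture of Index Table rows; column names follow the slide. */
final class IndexRowSketch {
    final String client;
    final String objectId;
    final int sequence;                  // consecutive number within one object ID
    final String matchedEntityObjectId;  // left null in this sketch
    final String columnKey;
    final String columnValue;

    IndexRowSketch(String client, String objectId, int sequence,
                   String matchedEntityObjectId, String columnKey, String columnValue) {
        this.client = client;
        this.objectId = objectId;
        this.sequence = sequence;
        this.matchedEntityObjectId = matchedEntityObjectId;
        this.columnKey = columnKey;
        this.columnValue = columnValue;
    }

    /** Creates one row per index value of a single object, numbered from 1 (assumed start). */
    static List<IndexRowSketch> rowsFor(String client, String objectId,
                                        Map<String, String> indexValues) {
        List<IndexRowSketch> rows = new ArrayList<>();
        int sequence = 1;
        for (Map.Entry<String, String> entry : indexValues.entrySet()) {
            rows.add(new IndexRowSketch(client, objectId, sequence++, null,
                                        entry.getKey(), entry.getValue()));
        }
        return rows;
    }
}
```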


NameNormalizer Customizing

In the SAP Enterprise Portal, a Normalizing Algorithm Customizing in XML format can be uploaded.

It may contain:
- The definition of NonNameTokens
- The substitution of several characters or a special character string
- The adding/truncation of several characters or a special character string


Settings for Normalizing Algorithm

Java Classname for Normalizing Algorithm

Customizing XML file according to the Normalizing Algorithm.


Example Normalization

Input: "München Traffic Corp."

1. Remove fill signs like "# _ - / \ . ,": "München Traffic Corp"
2. Convert to upper case: "MÜNCHEN TRAFFIC CORP"
3. Normalize special characters according to a predefined list (replace Ä by AE): MUENCHEN TRAFFIC CORP
4. (Tokenize) Cut the name into tokens using a list of token separators (<blank>, "-"): MUENCHEN, TRAFFIC, CORP
5. Check and mark tokens against a predefined list of non-name tokens (like CORP, BANK): MUENCHEN, TRAFFIC, CORP
6. Check for minimum length: MUENCHEN, TRAFFIC, CORP

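The six normalization steps of the example can be traced with a minimal Java sketch. The class `NameNormalizerSketch` is hypothetical. The umlaut replacement list beyond what the example shows, the minimum length of 2, and the choice to drop tokens that are too short are assumptions; the delivered algorithm reads such settings from its XML customizing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of the six normalization steps; not the delivered NameNormalizer. */
final class NameNormalizerSketch {
    static final String FILL_SIGNS = "#_-/\\.,";
    static final Set<String> NON_NAME_TOKENS = Set.of("CORP", "BANK");
    static final int MIN_LENGTH = 2; // assumed value

    static final class Token {
        final String text;
        final boolean nonName;
        Token(String text, boolean nonName) {
            this.text = text;
            this.nonName = nonName;
        }
    }

    static List<Token> normalize(String name) {
        // Step 1: remove fill signs.
        StringBuilder cleaned = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (FILL_SIGNS.indexOf(c) < 0) {
                cleaned.append(c);
            }
        }
        // Step 2: convert to upper case.
        String s = cleaned.toString().toUpperCase(Locale.ROOT);
        // Step 3: replace special characters (A-, O-, U-umlaut; list is an assumption).
        s = s.replace("\u00C4", "AE").replace("\u00D6", "OE").replace("\u00DC", "UE");
        // Step 4: tokenize on blank and hyphen.
        List<Token> tokens = new ArrayList<>();
        for (String t : s.split("[ -]+")) {
            // Step 6: minimum length check (dropping short tokens is an assumption).
            if (t.length() < MIN_LENGTH) {
                continue;
            }
            // Step 5: mark non-name tokens.
            tokens.add(new Token(t, NON_NAME_TOKENS.contains(t)));
        }
        return tokens;
    }
}
```

Called with the example name, the sketch returns the tokens MUENCHEN, TRAFFIC, and CORP, with only CORP marked as a non-name token.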

Matching Algorithm

Matching: The Process (Generally)

GetMatchingCandidates
- Reads the content of the Index Table

CalculateScore
- Calculates the score between the indexed data records
- The score is calculated with the help of the settings in the XML file
- Depending on the defined upper and lower thresholds, the results are written into the Score Table
- Objects with a score below the lower threshold are not written into the Score Table

[Figure: excerpt of the process diagram. Step 3, Matching: GetMatchingCandidates reads from the Index Table and CalculateScore writes to the Score Table, which feeds step 4, Cleansing.]
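How the two thresholds pre-decide a cleansing case can be sketched in Java. The class `ThresholdSketch` is hypothetical, and treating a score equal to the upper threshold as an automatic match is an assumption; the thresholds themselves are configured per Matching Context.

```java
/** Hypothetical sketch of the threshold decision; boundary handling is an assumption. */
final class ThresholdSketch {
    enum Decision { NO_MATCH, CLEANSING_CASE, AUTOMATIC_MATCH }

    static Decision decide(int score, int lowerThreshold, int upperThreshold) {
        if (score < lowerThreshold) {
            return Decision.NO_MATCH;        // not written into the Score Table
        }
        if (score >= upperThreshold) {
            return Decision.AUTOMATIC_MATCH; // confirmed without manual decision
        }
        return Decision.CLEANSING_CASE;      // presented to the master data specialist
    }
}
```

With assumed thresholds of 50 and 90, for example, a score of 40 is no match, 60 becomes a cleansing case, and 91 is matched automatically.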


Matching Algorithm: CalculateScore

Calculates the score between the indexed data records with the help of the settings in the XML file.

The Matching Algorithm contains:
- the definition of the score for the particular attributes
- conditions for the matching check (e.g. check in a certain order with a special dependency)

The score values in an existing XML file can be replaced without changing the Matching Algorithm.
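The separation between score values and algorithm can be illustrated with a small Java sketch that reads the points per attribute from XML and applies them in unchanged code. The XML layout (`<scores>` with `<score attribute="..." points="..."/>` elements) and the class `ScoreConfigSketch` are invented for this illustration; the real customizing format is defined by the delivered Matching Strategies.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: score values live in XML, the comparison logic stays fixed. */
final class ScoreConfigSketch {
    /** Reads attribute-to-points pairs from an assumed XML layout. */
    static Map<String, Integer> load(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("score");
            Map<String, Integer> scores = new HashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                scores.put(e.getAttribute("attribute"), Integer.parseInt(e.getAttribute("points")));
            }
            return scores;
        } catch (Exception e) {
            throw new IllegalStateException("Invalid score customizing", e);
        }
    }

    /** Adds the configured points for every attribute whose values are equal. */
    static int score(Map<String, Integer> scores, Map<String, String> a, Map<String, String> b) {
        int sum = 0;
        for (Map.Entry<String, Integer> e : scores.entrySet()) {
            String left = a.get(e.getKey());
            if (left != null && left.equals(b.get(e.getKey()))) {
                sum += e.getValue();
            }
        }
        return sum;
    }
}
```

Changing `points="15"` to another value in the XML changes the result of `score` without touching the Java code, which is the point made above.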

Matching Strategy BP_300_Token_cust - Organizations

Scoring by source of information:

Partner/companyName
- +25 for the first match of a NameToken
- +10 for each additional match of a NameToken
- +10 for each match of a NonNameToken

Partner/SecondaryIDs/PartnerIdentification (only available for ABA BuPa)
- +50 if secID and secIDType match, else +0
- +20 for each additional match of secID / secIDType pairs
- -10 if secID does not match, but secIDType matches

Partner/SecondaryIDs/TaxNumber
- +50 if secID and taxType match, else +0
- +20 for each additional match of secID / taxType pairs
- -10 if secID does not match, but taxType matches

Partner/address/countryCode: -30 if no match, else +0
Partner/address/postalCode: +15 if match, else +0
Partner/address/region: +5 if match, else +0
Partner/address/city: +10 if match, else +0

Partner/address/street
- +15 for the first match of a NameToken, else +0
- +5 for each additional match of a NameToken
- +5 for each match of a NonNameToken

Partner/address/houseNumber: +5 for each match of HouseNumber

Score: sum up the scoring points, then

$$\text{score} = \begin{cases} \text{sum} & \text{if } \text{sum} \le 90 \\ 90 + (\text{sum} - 90)/10 & \text{if } 90 < \text{sum} \le 190 \\ 100 & \text{if } \text{sum} > 190 \end{cases}$$
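The final score rule can be written as a few lines of Java, assuming it reads as follows: sums up to 90 are kept, sums above 90 are compressed to 90 + (sum - 90)/10, and sums above 190 are capped at 100. The class `FinalScoreSketch` is hypothetical, and the use of integer division is an assumption.

```java
/** Hypothetical sketch of the final score rule; integer division is an assumption. */
final class FinalScoreSketch {
    static int finalScore(int sum) {
        if (sum <= 90) {
            return sum;               // small sums are used as they are
        }
        if (sum > 190) {
            return 100;               // hard cap
        }
        return 90 + (sum - 90) / 10;  // compress the range 91..190 into 90..100
    }
}
```

The compression keeps the final score on a 0 to 100 scale even when many attributes match, so the thresholds can be defined on one scale.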


Matching Algorithm (Example)

Reference Object
- Original data: Name: München Trafic Corporation; Street: Brezelstrasse 7; City: München; Postal Code: 80331; DUNS: 4711
- Normalized data: nameToken MUENCHEN; nameToken TRAFIC; non-nameToken CORP; street Brezelstrasse; street number 7; city MUENCHEN; postal code 80331; DUNS 4711

Matching Candidate 1
- Original data: Name: München Rück AG; Street: Brezelstrasse 17; City: München; Postal Code: 61525; DUNS: 4512
- Normalized values (score): MUENCHEN (25); RUECK (0); AG (0); Brezelstrasse (15); 17 (0); MUENCHEN (10); 61525 (0); 4512 (-10)
- Calculated score: 40; final score: 40; result against the thresholds: No Match

Matching Candidate 2
- Original data: Name: Münchener Brötchen Corp.; Street: Brezelstrasse 17; City: München; Postal Code: 61525; DUNS: -
- Normalized values (score): MUENCHEN (25); BROETCHEN (0); CORP (10); Brezelstrasse (15); 17 (0); MUENCHEN (10); 61525 (0); no DUNS (0)
- Calculated score: 60; final score: 60; result against the thresholds: Cleansing Case

Matching Candidate 3
- Original data: Name: München Trafic Corporation; Street: Brezelgasse 17; City: München; Postal Code: 80331; DUNS: 4711
- Normalized values (score): MUENCHEN (25); TRAFIC (10); CORP (10); Brezelgasse (0); 17 (0); MUENCHEN (10); 80331 (15); 4711 (50)
- Calculated score: 120; final score: 91; result against the thresholds: Automatic Match
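The name-token part of this example can be reproduced with a short Java sketch. It uses the companyName scoring of BP_300_Token_cust (+25 for the first matching name token, +10 for each additional one, +10 for each matching non-name token). The class `NameTokenScoreSketch` and its method signature are hypothetical, and repeated tokens within one name are not handled.

```java
import java.util.List;

/** Hypothetical sketch of the companyName scoring only; other attributes are left out. */
final class NameTokenScoreSketch {
    static int score(List<String> refNameTokens, List<String> refNonNameTokens,
                     List<String> candNameTokens, List<String> candNonNameTokens) {
        int score = 0;
        int nameMatches = 0;
        for (String token : candNameTokens) {
            if (refNameTokens.contains(token)) {
                score += (nameMatches == 0) ? 25 : 10; // first match counts more
                nameMatches++;
            }
        }
        for (String token : candNonNameTokens) {
            if (refNonNameTokens.contains(token)) {
                score += 10;
            }
        }
        return score;
    }
}
```

For the reference tokens MUENCHEN and TRAFIC with non-name token CORP, this gives 25 for candidate 1, 35 for candidate 2, and 45 for candidate 3, in line with the per-token scores listed for the name.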


Settings for Matching Algorithm

Java Classname for the Matching Algorithm

Customizing XML file according to the Matching Algorithm


Implementation Considerations

Find Ways to Identify Duplicates/Identicals

How to find records representing the same business object?
1. Using the name: Name1, Name2, ...
2. Using identifying characteristics: D&B no., tax no., manufacturer part no.
3. Using other attributes: street, zip code, place
4. Build a decision matrix and verify it with the business using real-life examples

[Figure: decision matrix with the columns Name, Street, House No., Zip Code, City, Account Group, and Duplicate? (values: No, No?, Yes?, Yes).]


Know about the Quality of your Data

Examples of questions about the quality of master data:
- Which identifying characteristics are available in your systems and are correctly maintained there?
- Are these characteristics always available?
- What are the insignificant parts of a name for which a matching strategy is not required?
- Which data might contain typing errors?
- How big is the volume of data in the different systems?
- What level of similarity is required for a valid duplicate proposal?
- Is there a level of similarity above which a duplicate is certain to be found?
- Are the existing matching strategies sufficient to meet the requirements?


Be Aware of Exceptional Cases

Reality check: Example - Wal-Mart as a customer
If the matching strategy is based on the name, there will be a number of superfluous duplicate proposals, and a large amount of effort will be required for the manual decision (clearing). This will also have a negative effect on system performance.

Possible solutions:
- Adjust the limit values for generating a duplicate proposal
- Make "Wal-Mart" an insignificant name part
- Adjust the matching strategy: are there other identifying attributes or keys that can be taken into account?


Use all Configuration Possibilities (before Developing)

Matching results can be influenced on 3 levels:
1. Portal-based configuration
   - build your Matching Context
   - choose from the out-of-the-box Normalization/Matching Algorithms
   - define your thresholds (lower/upper)
   - configure the Global Normalizer
2. XML-file-based configuration
   - normalization (e.g. define additional non-name tokens)
   - alter the scores for the particular attributes
3. Adapt a standard algorithm or develop your own algorithms
   - develop your own Java coding for normalization and/or matching
   - define your own customizing objects for your new algorithms
   - follow the Matching Strategy Development Guide


Developing a New Matching Strategy

Steps of the development process:
1. Prerequisites and configuration of the development environment
   - Install and configure the required software: SAP MDM 3.0 SP02 or higher; NetWeaver Developer Studio 2.0 or higher
2. Building your application logic
   - Define your project using NetWeaver Developer Studio
   - Import the Matching API and perform the recommended settings for your project
   - Design your matching and normalization strategy and implement the logic in Java classes using the interfaces provided by the Matching API (see the Matching Strategy Development Guide)
3. Customizing and deploying a Matching Strategy
   - Define new customizing object types for the matching and normalization algorithms and upload them to the SAP Enterprise Portal. Customization files are maintained in XML format.
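To give an idea of what step 2 amounts to, here is a toy strategy in Java. The interfaces `NormalizingAlgorithmSketch` and `MatchingAlgorithmSketch` are hypothetical and only named after the functions shown in the process diagrams; the actual interfaces and signatures of the Matching API are documented in the Matching Strategy Development Guide and will differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical interface named after the slide functions; not the real Matching API. */
interface NormalizingAlgorithmSketch {
    List<String> getNormalizedAttributeKeys();
    Map<String, String> normalizeObject(Map<String, String> object);
}

/** Hypothetical interface named after the slide functions; not the real Matching API. */
interface MatchingAlgorithmSketch {
    List<Map<String, String>> getMatchingCandidates(Map<String, String> normalized,
                                                    List<Map<String, String>> index);
    int calculateScore(Map<String, String> reference, Map<String, String> candidate);
}

/** Toy strategy that indexes and compares on the postal code only. */
final class PostalCodeStrategySketch
        implements NormalizingAlgorithmSketch, MatchingAlgorithmSketch {

    @Override
    public List<String> getNormalizedAttributeKeys() {
        return List.of("postalCode");
    }

    @Override
    public Map<String, String> normalizeObject(Map<String, String> object) {
        Map<String, String> normalized = new HashMap<>();
        for (String key : getNormalizedAttributeKeys()) {
            normalized.put(key, object.getOrDefault(key, "").trim().toUpperCase(Locale.ROOT));
        }
        return normalized;
    }

    @Override
    public List<Map<String, String>> getMatchingCandidates(Map<String, String> normalized,
                                                           List<Map<String, String>> index) {
        String code = normalized.getOrDefault("postalCode", "");
        List<Map<String, String>> candidates = new ArrayList<>();
        for (Map<String, String> entry : index) {
            if (code.equals(entry.get("postalCode"))) {
                candidates.add(entry);
            }
        }
        return candidates;
    }

    @Override
    public int calculateScore(Map<String, String> reference, Map<String, String> candidate) {
        String code = reference.get("postalCode");
        return code != null && code.equals(candidate.get("postalCode")) ? 15 : 0;
    }
}
```

A real strategy would combine several attributes and read its score values from the XML customizing instead of hard-coding them.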


Fine-Tune the Normalization/Matching Algorithms

Test different Matching Strategies:
- In MDM 3.00, multiple Matching and Normalization Algorithms can be applied to objects within one object type.
- The data is loaded once, but can be used for multiple normalization and matching processes according to different matching contexts.
- Generated ID-Mappings are saved separately, marked with the relevant Matching Context.
- One Matching Context can be marked for productive usage.
- The whole process can be reset: all relevant table content will be deleted.
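The bookkeeping described here (mappings kept per Matching Context, one context marked productive, full reset) can be pictured with a small in-memory Java sketch. The class `MatchingContextsSketch` and its methods are hypothetical and say nothing about how MDM stores this data.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory picture of ID-Mappings kept separately per Matching Context. */
final class MatchingContextsSketch {
    private final Map<String, Map<String, String>> idMappingsByContext = new HashMap<>();
    private String productiveContext;

    /** Stores a mapping from an MDC object to an MDS object under the given context. */
    void saveMapping(String context, String mdcObjectId, String mdsObjectId) {
        idMappingsByContext.computeIfAbsent(context, k -> new HashMap<>())
                           .put(mdcObjectId, mdsObjectId);
    }

    /** Marks exactly one context for productive usage. */
    void markProductive(String context) {
        productiveContext = context;
    }

    /** Looks up a mapping in the productive context; returns null if there is none. */
    String productiveMapping(String mdcObjectId) {
        Map<String, String> mappings = idMappingsByContext.get(productiveContext);
        return mappings == null ? null : mappings.get(mdcObjectId);
    }

    /** Resets the whole process: all stored content is deleted. */
    void reset() {
        idMappingsByContext.clear();
        productiveContext = null;
    }
}
```

Keeping the mappings separate per context is what allows two strategies to be compared on the same loaded data before one of them is marked productive.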


Summary

- Content Integrator (CI) is a component of SAP Master Data Management (MDM) that consolidates business data in a heterogeneous system landscape.
- The matching process is the core process to identify identical or similar business data objects.
- The Matching Context provides the logic and conditions for matching processes.
- Some Matching Strategies are delivered with MDM.
- Matching Strategies can be configured and/or developed.


Further Information

Public Web:
- www.sap.com
- SAP Developer Network: www.sdn.sap.com, Master Data Management
- SAP Customer Services Network: www.sap.com/services/

Related SAP Education Training Opportunities:
- http://www.sap.com/education/
- http://service.sap.com/okp

Related MDM Documentation:
- http://service.sap.com/instguides, under Installation and Upgrade Guides, SAP Components, SAP MDM (Configuration Guides, Matching Strategy Development Guide, Description of standard Matching Strategies)

Related Workshops/Lectures at SAP TechEd 2004:
- MDM201: How to Use Key-Mapping in SAP MDM for Reporting (Lecture)
- MDM202: Configuration of Matching-Strategies for SAP MDM (Lecture)
- MDM203: Customizing ID Mapping in SAP Solution Manager (Lecture)
- MDM252: Connecting a New Client-System to SAP MDM (Hands-on)
- MDM253: SAP MDM: How to Model a New MDM Object Type (Hands-on)


SAP Developer Network


Look for SAP TechEd 04 presentations and videos on the SAP Developer Network. Coming in December. http://www.sdn.sap.com/


Questions?

Q&A

Feedback
Please complete your session evaluation. Be courteous: deposit your trash, and do not take the handouts for the following session.

Thank You !


Appendix

Matching Strategies

Matching Strategies in CI 3.0, each with its Normalization Algorithm and Matching Algorithm (both in com.sap.ci.strategies):

- Mat_300_GTIN: normalization Mat_300_GTIN_N; matching Mat_300_GTIN_A
- Mat_300_GTIN_SPN_MPN: normalization Mat_300_GTIN_SPN_MPN_N; matching Mat_300_GTIN_SPN_MPN_A
- Mat_300_ShorttextCategory_cust: normalization Mat_300_ShorttextCategory_cust_N; matching Mat_300_ShorttextCategory_cust_A
- Org_300_Token_cust: normalization Org_300_Token_cust_N; matching Org_300_Token_cust_A
- BP_300_Token_cust: normalization BP_300_Token_cust_N; matching BP_300_Token_cust_A
- TA_300_ManufacturerInfo: normalization TA_300_ManufacturerInfo_N; matching TA_300_ManufacturerInfo_A
- Organization_By_Token: normalization MDMPartnerStrategyByTokens; matching MDMPartnerStrategyByTokens
- Material_by_GTIN: normalization DummyMaterialNormalizationAlgo; matching MDMProductStrategyByGTIN
- Material_by_GTIN_MPN_SPN: normalization MaterialNormalizationByGTIN_SPN_MPN; matching MDMProductStrategyByGTIN_SPN_MPN
- TechAsset_by_ManuInfo: normalization TechnicalAssetNormalization; matching MDMTechnicalAssetStrategyByManufacturerInfo

Please note that this document is subject to change and may be changed by SAP at any time without notice. The document is not intended to be binding upon SAP to any particular course of business, product strategy and/or development.

Cleansing: How to Merge Manually

1. Enter one of the data cleansing case numbers.
2. In the list of cleansing cases, the mapped data records are displayed.
3. To compare the two records, select them and click the button <Compare Objects>.
4. To merge the two objects:
   - Set one of the records to <Duplicate> and the other to <Target>
   - Set the status from <New> to <Released>
   - Click the button <Save>
5. Additionally, you have to enter the other cleansing case number.

Copyright 2004 SAP AG. All Rights Reserved


No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. Microsoft, Windows, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation. IBM, DB2, DB2 Universal Database, OS/2, Parallel Sysplex, MVS/ESA, AIX, S/390, AS/400, OS/390, OS/400, iSeries, pSeries, xSeries, zSeries, z/OS, AFP, Intelligent Miner, WebSphere, Netfinity, Tivoli, and Informix are trademarks or registered trademarks of IBM Corporation in the United States and/or other countries. Oracle is a registered trademark of Oracle Corporation. UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group. Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc. HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C, World Wide Web Consortium, Massachusetts Institute of Technology. Java is a registered trademark of Sun Microsystems, Inc. JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape. MaxDB is a trademark of MySQL AB, Sweden. SAP, R/3, mySAP, mySAP.com, xApps, xApp, SAP NetWeaver and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary. These materials are subject to change without notice. 
These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.
