
First-hand knowledge.

Reading Sample
This sample provides an explanation of what enterprise information
management really is, and what it means for any organization. You'll
also find a chapter that introduces a tool that will help you manage your
information: SAP PowerDesigner, a modeling and design-time metadata
management platform for information management designs.

Introduction

Introducing Enterprise Information Management

SAP PowerDesigner

Contents

Index

The Authors

Brague, Dichmann, Keller, Kuppe, On


Enterprise Information Management with SAP
605 Pages, 2014, $69.95/69.95
ISBN 978-1-4932-1045-9

www.sap-press.com/3666
Introduction

Welcome to the second edition of Enterprise Information Management with SAP!


The goal of this book continues to be to introduce readers to the concepts of
Enterprise Information Management (EIM), provide examples of how SAP's EIM
solutions are used today, and offer technical instructions on performing some of
the most common EIM tasks in SAP. The second edition includes updates to chapters
on SAP Data Services, SAP HANA, SAP Information Steward, SAP Master Data
Governance, SAP Information Lifecycle Management, and SAP Extended Enterprise
Content Management by OpenText, which are based on recent releases, as
well as some new chapters on SAP Rapid Deployment solutions, SAP PowerDesigner,
and SAP HANA Cloud Integration.

Target Groups of the Book


This book is intended for both experienced practitioners and those who are new
to managing, governing, and maximizing the use of information that impacts
enterprises. Specifically, it will be of use to business process experts, architects,
data stewards, data owners, business process owners, analysts, and developers
who are new to the topic of EIM in SAP. While there are several specific "how to
build" and "how this works" sections, the book content requires no previous
knowledge of EIM or SAP's solutions for EIM.

This book is also intended for existing information management experts who
need to expand their skills from a specific EIM domain to broader information
management strategies. This target group won't need to reference all chapters,
but will be interested in new capability information provided in many (e.g., the
latest release information and new products available).

Structure of the Book


This book is divided into two parts:

Part I: SAP's Enterprise Information Management Strategy and Portfolio
This part of the book starts by introducing EIM and its main concepts, including
information governance and big data. After you understand the ideas behind EIM,
we move on to an overview of the solutions for EIM within SAP's portfolio,
offering brief explanations of the main EIM solutions, as well as the rapid
deployment paradigm for those solutions. Finally, Part I concludes with real-life
examples of how SAP's EIM solutions are used by several different customers.

Part II: Working with SAP's Enterprise Information Management Solutions
This part of the book focuses on how to get started using SAP's solutions for
EIM. Part II includes product details on topics ranging from understanding the
current state of your data, to managing unstructured content and getting
started with master data governance. This section focuses on select parts of
SAP's EIM offerings with the goal of providing practical examples and
step-by-step instructions for key SAP capabilities. You'll learn how to model
your information landscape (SAP PowerDesigner), get started assessing and
monitoring your data (SAP Information Steward), integrate both on-premise and
cloud data sources (SAP Data Services and SAP HANA Cloud Integration), use data
quality transforms (SAP Data Services), turn text data into data points (SAP
Data Services), govern your master data (SAP Master Data Governance), manage
structured and unstructured content that impacts business processes (SAP
Extended Enterprise Content Management by OpenText), and set retention rules
and retire information (SAP Information Lifecycle Management).

With the division of the book into two major parts, you can read the different
parts as you need them. Part I is critical to understanding EIM and the role it plays
in SAP's strategy and portfolio. In Part II, you can access information and insight
about the EIM capabilities that are most applicable to your projects, planning, and
information management strategy.

More specifically, the book consists of the following chapters:

Chapter 1: Introducing Enterprise Information Management
This chapter provides an introduction to the concept of EIM. It defines EIM,
discusses common use cases and business drivers for EIM, discusses the impact
of big data on EIM, explains SAP's strategy for EIM, and discusses common
user roles of people and organizations that are normally involved in EIM. You'll
also get an introduction to NeedsEIM Inc., which is the fictional company used
as a basis for examples throughout the book.

Chapter 2: Introducing Information Governance
Information governance is the practice of overseeing the management of your
enterprise's information. It touches all aspects of EIM and must be considered in
any EIM strategy. This chapter provides tips for developing your governance
standards and processes, and maps governance activities to technology enablers
for these standards and processes.

Chapter 3: Big Data with SAP HANA, Hadoop, and EIM
This chapter introduces Big Data in the context of SAP's solutions for EIM.
Specifically, it focuses on the role of SAP HANA and Hadoop.

Chapter 4: SAP's Solutions for Enterprise Information Management
This chapter describes SAP's solutions for EIM, introducing and providing
overviews of specific products. After reading this chapter, you will be able to
quickly identify which chapters in Part II are of the most interest to you.

Chapter 5: Rapid-Deployment Solutions for Enterprise Information Management
This chapter explains the rapid-deployment paradigm for EIM solutions with
predefined best practices, setting a foundation for the deployment of SAP EIM
solutions.

Chapter 6: Practical Examples of EIM
This chapter discusses specific examples of EIM application by various customers.
Content discussed includes recommendations for your EIM architecture
(written by Procter & Gamble), the evolution of SAP Data Services (written by
National Vision), and tips for successful Enterprise Content Management
projects (written by Belgian Railways). In addition, there are other
customer-written sections on data migration, managing master data, data archiving
strategy recommendations, and recommendations for positioning different SAP
tools for data and process integration.

Chapter 7: SAP PowerDesigner
This chapter focuses on the discipline of enterprise information architecture,
and how SAP PowerDesigner enables you to understand your current information
landscape, align business information with technical implementation, and
plan for change.

Chapter 8: SAP HANA Cloud Integration
Chapter 8 introduces SAP HANA Cloud Integration as SAP's solution for
delivering integration between on-premise and cloud applications.

Chapter 9: SAP Data Services
Chapter 9 introduces SAP Data Services as a data foundation for EIM. It
describes the components and architecture of SAP Data Services and walks you
through specific examples of how to start doing data integration, data quality,
and text data processing with SAP Data Services.

Chapter 10: SAP Information Steward
This chapter introduces SAP Information Steward, which can be used for
profiling and getting to know the current state of your data. This chapter discusses
cataloging your data assets, performing data profiling, and monitoring your
data quality over time.

Chapter 11: SAP Master Data Governance
Chapter 11 describes how to get started using SAP Master Data Governance for
your master data governance initiatives. It includes a description of
SAP-provided master data governance processes and explains how to create custom
governance processes. It also describes the use of SAP Business Workflow and
BRFplus for governing master data. Finally, the chapter gives an example of
using SAP Information Steward in conjunction with SAP Master Data Governance
for monitoring and remediating master data.

Chapter 12: SAP Information Lifecycle Management
Chapter 12 provides background information on the concept of information
lifecycle management. It then specifically introduces SAP Information Lifecycle
Management, offering discussions of retention management, system
decommissioning, and how SAP Information Lifecycle Management works to support
the lifecycle of information.

Chapter 13: SAP Extended Enterprise Content Management by OpenText
Chapter 13 discusses the major features of SAP Extended Enterprise Content
Management by OpenText, how it uses SAP ArchiveLink, and how it works
with the SAP Business Suite.

Online Appendices
There are several appendices to assist you: Appendix A covers advanced data
quality capabilities, Appendix B provides details on SAP's migration content,
and Appendix C provides tips for your first data archiving projects. The
appendices and an example spreadsheet for monitoring your data migration projects
can be downloaded from the book's website at http://www.sap-press.com/3666.

Acknowledgments
This second edition would not have been possible without the incredible efforts
of a diverse set of authors that contributed to the first edition of this book, guided
to success by the spirited leadership of Ginger Gatling. They laid down a solid
foundation to build upon.

The second edition brought back some familiar faces as well as some new, aspiring
authors. Without exception, each brought fresh energy and commitment to
provide valuable updates and new content to the book. It was a pleasure to work
with each and every one of them, and I feel extremely appreciative for the extra
time many put forth to make their updates meaningful and to keep the book on
track. In addition, there were many other people that took time out of their
already busy schedules to provide a fresh perspective or critical eye to the
material. A special thank you to John Schitka, Ken Beutler, Marie Goodell, Connie
Chan, Yingwu Gao, Bharath Ajendla, Anthony Hill, Michael Hill, and Niels
Weigel: your willingness to contribute and provide feedback was truly appreciated.
Finally, I would like to acknowledge my manager, Subha Ramachandran, for
supporting this project as a priority for me and others in the organization.

All of the royalties from this book will continue to be donated to Doctors Without
Borders (Médecins Sans Frontières). Your purchase of this book helps us support an
international medical humanitarian organization that delivers emergency aid in
many countries. Thank you for enabling us to provide financial support to this
important organization and its critical mission.

I hope the book becomes a valuable resource to you and your understanding of
Enterprise Information Management with SAP. Enjoy!

Corrie Brague
Enterprise Information Management Product Management
SAP Labs, LLC
This chapter introduces Enterprise Information Management, including
common use cases and big data. It also provides an overview of SAP's
strategy for Enterprise Information Management.

1 Introducing Enterprise
Information Management

Cloud, big data, and social media are powering new opportunities for companies
that can leverage information-driven insights in real time to respond to customer
preferences, identify operational efficiencies, and in some cases, create completely
new business models. To achieve transformative business results, best-run
businesses treat information as a corporate asset. It's carefully managed,
thoughtfully governed, strategically used, and sensibly controlled.

Effective management of enterprise information can help your organization run
faster. As a result, you can achieve new business outcomes: understanding and
retaining your customers, getting the most from your suppliers, ensuring
compliance without increasing your risk, and providing internal transparency to drive
operational and strategic decisions.

SAP helps businesses run better and more simply by enabling IT to more easily
manage and optimize enterprise information. SAP solutions for Enterprise
Information Management (EIM) provide the critical capabilities to architect, integrate,
improve, manage, associate, and archive all information. This chapter introduces
EIM and explains what it is, why it's important to organizations, how it fits into
SAP's strategy, and some typical user roles. Finally, the chapter concludes by
introducing NeedsEIM Inc., a fictional company that we'll use throughout the
book to illustrate EIM principles.

1.1 Defining Enterprise Information Management


On Gartner's IT glossary page, Enterprise Information Management is defined as
"an integrative discipline for structuring, describing and governing information
assets across organizational and technological boundaries to improve efficiency,
promote transparency, and enable business insight." [1]

[1] Source: http://www.gartner.com/it-glossary/enterprise-information-management-eim/

EIM involves a strategic and governed execution of the following disciplines:
enterprise architecture, data integration, data quality, master data management, content
management, and lifecycle management. It addresses the management of all types
of information, including traditional structured data, semi-structured and
unstructured data, and content such as documents, emails, audio, video, and so forth.

To optimize the use and cost of managing information, we must first understand its
lifecycle. The active management and governance of information helps in avoiding
the costs that are associated with blind information hoarding. The risk of having
too much information is just as real as not having enough when you need it.

Figure 1.1 shows a typical spend on information over time. This is a technology
and resources spend curve. What may be surprising for most organizations is the
increase in spend during the off-boarding phase. Many companies spend a lot of
money maintaining information that is out of control. Is the information still used?
In what systems? Can you decommission those systems? Are you managing pieces
of information that are no longer used?

[Figure 1.1: chart of spend (vertical axis) over time (horizontal axis) across
three phases: on-boarding, active use, and off-boarding]
Figure 1.1 Typical Spend on Information Over Time

As illustrated in Figure 1.1, there's an associated cost in bringing information into
an organization, using the information, and hopefully retiring the information
after it's no longer producing value. The idea that organizations really just do
three things with information (on-board, actively use, and then off-board) is
powerful when thinking about EIM solutions.

After information is brought into your organization, it's required for many uses
beyond its original purpose. Hence, it's advantageous to prepare the information
for these manifold uses. That way, the effort to repurpose information during the
active-use phase is greatly reduced. When the information is no longer required, it
should be off-boarded or retired in a manner that meets your organization's legal
and business requirements. The truth is that most organizations don't proactively
consider the reuse and eventual off-boarding of information, which ends up
costing millions in IT resources due to maintaining systems that are no longer used.

If you adopt an information strategy, the spend changes to what is shown in
Figure 1.2.

[Figure 1.2: chart of spend (vertical axis) over time (horizontal axis) across
three phases. On-boarding: creation, migration, import, retention policy
planning. Active use: preparedness, migration, import. Off-boarding: archival,
deletion, decommissioning.]
Figure 1.2 Spend on Information with an Enterprise Information Management Strategy

Figure 1.2 also provides detailed examples of the types of activities involved
with EIM across the typical lifecycle of information. In the on-boarding phase,
activities include the creation of information through online user creation,
integration of processes that involves the creation of new information, import
of information, and migration of information. Additionally, the on-boarding
phase should include lifecycle planning (e.g., how long the information should
be retained). Implementing governance and retention policies as the information
is on-boarded dramatically lowers the cost of information over its effective
lifetime. Notice that there is still some spend increase as information is
actively used. This is from incrementally improving, enriching, and preparing
information for alternative uses. The key to bending the cost curve down is
understanding that information has tremendous value beyond its original
purpose and proactively planning for that in your EIM strategy. The result is that
the spend curve goes down over time in the active-use phase as information is
simply repurposed. Again, this can be achieved because the incremental cost is
just the provisioning of existing known and trusted information, as opposed
to starting over for each new information initiative.

Next, we'll look at an example of information flow through a company and then
discuss how this relates to information management.

1.1.1 Example of Information Flow through a Company
NeedsEIM Inc. is a fictional company that's based on real customer examples.
We'll explain NeedsEIM Inc. in detail in Section 1.7 and again throughout Part II
of the book when we describe how to use various EIM capabilities. For now, we
want to introduce NeedsEIM and the types of information it must deal with,
including how information flows through the company. This leads to a discussion
about the types of information included in EIM.

Figure 1.3 depicts the business processes of NeedsEIM. It manufactures retail
durable goods, and the majority of its manufacturing is outsourced. This business
model results in a complex and diverse supplier network that impacts most
departments. The major departments include finance, which must deal with
supplier payments, and the engineering and contracts department, which must
coordinate contracts and technical spec drawings with the manufacturers.

The IT department must deal with diverse systems, including SAP and non-SAP
systems. The procurement department is responsible for the supplier
relationships and ensuring the company gets the most from its suppliers. The sales
department is always looking for new and creative sales channels, including
opportunities in the supplier population.

[Figure 1.3: NeedsEIM, Inc. manufactures retail durable goods. Departments shown:
Finance, Contracts, IT, Procurement, Engineering, and Sales. Outsourced
manufacturing: highly diverse and complex supplier network.]
Figure 1.3 NeedsEIM Inc.

As an example of information flow through NeedsEIM Inc., let's look at the
process of contract negotiations with a supplier:

1. The proposed supplier must be researched for due diligence, including type of
   products or services provided, similar customers serviced, reference calls with
   current customers, quality history, financial and credit ratings, reliability and
   trustworthiness, and general reputation.
      This involves emails, online research, and getting information from external
      sources such as Dun & Bradstreet.
      This information is shared among the finance, engineering, procurement,
      and contracts departments.

2. Assuming the due diligence indicates that the supplier is approved, the supplier
   master data needs to be created and distributed to related systems. The scope,
   projects, pricing, contracts, and legal documents must be created.
      This involves most departments and includes sales if the durable goods price
      point might be impacted.
      The supplier sends and receives legal, technical, financial, and other
      information.

3. After the contracts are negotiated, the supplier requires ongoing
   communication, including technical drawings, bills of materials, and other information
   required to do the work. In addition, financial documents such as invoices,
   purchase orders, and so on are exchanged.
      This includes a lot of collaboration among engineering, contracts,
      procurement, and the supplier.

Figure 1.4 shows the information as it needs to flow through each department.
Departments use the information with their perspective in mind: They store it,
update it, download it, and ensure that it meets the requirements for their
department's role with the supplier.

[Figure 1.4: NeedsEIM, Inc. (manufactures retail durable goods) with information
flowing among the Finance, Contracts, IT, Procurement, Engineering, and Sales
departments]
Figure 1.4 Example Information Flow for NeedsEIM Inc.

As you can see in Figure 1.4, the reality is that information is often required by
many departments. Sometimes when the information doesn't move from one
department to another due to application, political, and/or departmental silos,
departments create their own "tribal" versions of the information, and each
department has a different sense of its ownership of the information. (We'll talk
more about tribal information in Section 1.3.2.)

Earlier, we mentioned several kinds of information needed for negotiations with
a supplier. This includes detailed information on the supplier, external references
for the supplier, pricing and detailed contract information, engineering
documents of what the supplier will provide or build for NeedsEIM Inc., as well as
billing, invoicing, and all the typical supplier interactions. The next section will break
this down further into types of information that are required and how this
information is included in EIM.

1.1.2 Types of Information Included in Enterprise Information Management
Figure 1.5 shows the types of information that are included in SAP solutions for
EIM that will be covered in this book.

[Figure 1.5: information types managed under information governance from
creation to retirement, including a sample text comment, "The car should
self-drive on the highway"]
Figure 1.5 Types of Information Included in Enterprise Information Management

These information types are relevant for most companies, including NeedsEIM
Inc. The following provides more information about these types:

1. Structured data
   This includes the familiar data that's used within an application system (e.g.,
   customers, products, and sales orders); for example, supplier information such
   as name, address, credit information, contact information, and so on. This also
   includes all purchase orders, sales orders, and other data that's related to this
   supplier.

2. Desktop documents
   These include Microsoft Word, Microsoft Excel, Adobe Acrobat, and other
   desktop application documents. This data is stored across the enterprise on
   shared drives and laptops, which means that much of it isn't controlled at an
   enterprise level. This content may be critical to the application data, so you
   need to manage it with the same importance as the structured data in the
   database. Examples include purchasing documents (e.g., invoices), contracts with
   suppliers, legal documents, résumés, and HR documents, to name a few.

3. Pictures, scanned documents, videos, and other images
   These could be scanned invoices, videos, pictures of products that are sold in a
   catalog, and drawings of products that are being designed and built. These
   become part of the content that needs to be managed and related to the
   structured data when required. Managing content that's associated with a core
   business process is becoming increasingly important to process efficiency and
   regulatory compliance. Examples of such content include engineering documents
   that are to be shared with suppliers, pictures of raw materials, routine
   maintenance records in asset management, invoices, and expense report receipts.

4. Semi-structured data
   This is information such as RSS feeds, blogs and posts, emails associated with
   purchasing documents, and other semi-structured information that's important
   to the enterprise.

5. Text data
   In Figure 1.5, the piece of information that reads "The car should self-drive on
   the highway" may come from a survey or be a comment on a social media or
   other website and, by itself, might not be important. However, if you're
   looking at car design over the next five years, and 60% of the comments you receive
   include something about self-driving, this comment warrants further
   investigation. Information management includes looking into text you receive and
   analysis to determine sentiment, feedback, input, or actions that should be taken
   based on comments. Examples of text data include comments from suppliers
   about the manufacturing process, feedback from internal departments, and
   comments on surveys and service tickets.

As you can see, EIM includes the support of traditional structured data and
unstructured information, from the moment of creation through retirement. The
retirement of data and information has the same value as creation. After
information is no longer needed, it becomes a liability: a legal liability, a cost liability, or
some other kind of liability. The entire life span of the data and information, and
the governance of that information, is covered in EIM.

1.2 Common Use Cases for EIM

There are many use cases for EIM solutions. Three of the primary scenarios include
the support of operational, analytical, and information governance initiatives.

1.2.1 EIM for Operational Initiatives
This scenario covers the use of EIM in the operation and execution of business
processes and tasks that happen throughout the day. It has very broad
applications, from ensuring that material replenishment data is set correctly, to customer
data quality management, to migrating new data from a merger, to ensuring that
all contracts and documents are available for the business process, to removing
data that is no longer required.

SAP solutions for EIM provide trusted data to drive and deliver best practice
business processes. This value includes the ability to holistically manage data within
business processes, ensuring the quality and ability to reuse the data.

Here are a few examples of operational uses of EIM:

Cloud integration
As more business applications are running in the cloud, organizations need a
way to integrate business processes and data between on-premise and cloud
systems.

Data migration due to mergers, acquisitions, and global implementations
across all industries
Information management lowers the risk of business and application
disruption during mergers, acquisitions, and new application implementations.

Harmonized master data across lines of business
Harmonized master data across disparate applications enables a single view of
master data across the enterprise.

Compliance and regulations in the financial industry
The financial industry has requirements for financial risk-related data analysis.
All data must meet quality levels and industry standards, and all associated
content (e.g., documents and invoices) must be correctly associated with financial
contracts.

Suspect tracking in public safety organizations
Federal, local, and state agencies must share information on criminal activity and
suspect tracking. Information management ensures that each new suspect is
compared to others to confirm that it's a unique suspect. Data quality rules can
ensure that the most up-to-date information is available for suspect tracking.

Retaining and deleting information in the pharmaceutical industry
During the development of new medicines, all documents and government
standards must be adhered to through various stages of research, development,
trial, and release. When the compliance period has ended, information should
be removed unless it's required for a legal hold.

Fraud detection in telecommunication and other industries
Telecommunications, media, high tech, and utilities share similar requirements
for capturing, addressing, and mitigating fraudulent activity. Large volumes of
data and real-time transactions place these industries at increased risk, as
perpetrators can be on and gone before they are caught using traditional
time-consuming software reporting methods provided by vendors today.
Information management enables the filtering of diverse data to determine where the
company is losing money across a broad spectrum of applications and business
processes.

Plant maintenance compliance and data assessment
Ensuring that the virtual plant aligns with the physical plant, information
management ensures that maintenance plans and documents are associated with
each asset, asset tags are accurate, functional location information is complete,
and all asset documents and maintenance guides are available on the plant
floor.

Data quality and data assessment in the retail industry
The retail industry requires high data quality; for instance, retailers must know
article data throughout all stores where the articles are sold. For retailers, data
quality and assessment is an ongoing business process; it includes, for example,
tracking articles that have not been maintained in required stores, articles
missing valid sales price conditions, articles missing required procurement data,
and articles with duplicate EAN codes.

Notice that many of these examples are focused on ensuring that information is
managed, is available, is reliable, and serves the operational business process; the
list can go on and on.

Chapter 6 provides more detailed real-world and practical scenarios for EIM.

1.2.2 EIM for Analytical Use Cases
EIM has a long history in business intelligence (BI) and analytics. If you look at
some definitions of EIM online, you'll see statements saying that EIM drives
decision-making analytics. Many of the operational use cases mentioned
previously also fit into operational reporting and have some reuse for strategic
reporting and analytics as well. Some examples of EIM for BI and analytics include the
following:

Big data analysis
To unlock the potential of big data sources, EIM provides the capabilities to
access and understand data from any source and variety, including Hadoop,
and integrates it with existing data for better analysis of customer sentiment,
fraud detection, new innovation opportunities, and competitive insights.

Analysis of supplier spend
Analysis of who are the top suppliers, how much they spend, and payment and
credit issues can only be done if supplier records are transparent and
harmonized, cleansed, and de-duplicated. When making decisions that are related to
the supply network, the supplier data must be accurate and trusted.

True cost assessment of manufacturing goods in the manufacturing industry
Analyze total costs for making and delivering products. Crossing multiple
business domains, data must be cleansed, duplicates removed, and correlations
created to ensure that analysis provides accurate information.

Bring together timely, accurate, and actionable data to provide insights into
the factors impacting sales and customer behavior
Silos of data sources and applications, limited business user access, and
dependence on IT to create reports limit the ability of a business to gain insights on
1 Introducing Enterprise Information Management Common Drivers for EIM 1.3

sales and customer behavior. Information management brings together the data
and provides data lineage and analysis so the users can create reports and know
where the data is coming from.

Text mining to understand opinion and sentiment
Text and rich media content that's accessible on the web or on social media
sites contains a lot of information that can be analyzed and used for sentiment
analysis to get a better understanding of consumer opinions about a product or
idea.

1.2.3 EIM for Information Governance
A primary use case for EIM is the management and governance of information
as a strategic asset, usually referred to as information governance. Information
governance is a discipline that oversees the management of your enterprise's
information. Without it, there is no EIM. Information governance involves people,
processes, policies, and technologies in support of managing information
across the organization. It's advisable to have some degree of information
governance in place for any EIM use case, analytical or operational, as this provides
a framework for the enterprise to reuse policies, standards, and organizational
best practices.

Information governance is the linchpin of EIM that empowers business users to
own and manage data as a strategic asset, governs data in the business process to
optimize operational performance and ensure compliance, and establishes trust in
structured and unstructured information by ensuring data quality throughout its
lifecycle.

Information governance will be a common thread throughout the book and will
be covered in more detail in Chapter 2.

1.3 Common Drivers for EIM
Information can be a strategic weapon if an organization manages it the way it
manages other enterprise assets, such as capital. Treating information as an
organizational asset recognizes that it moves from a single-purpose use to
something that must be managed for multiple uses.

Common business problems that require an EIM strategy often may not have the
words information, enterprise, management, data, or governance in them. The
business issues driving initiatives for EIM include (but are not limited to) trucks going
out at the wrong weight, deliveries to the wrong location, hazardous products not
in compliance with government standards, customer satisfaction issues, incorrect
billing, misunderstood supplier networks, services that don't align with customer
demand, lack of compliance with a government mandate that impacts payments
or revenue, and so on. Many process issues are the result of a lack of an
information management strategy: from poor-quality data to master data not being
updated correctly, to not having the documents required for order processing, to
financial documents not aligned with sales documents, to different parts of the
organization using similar terms in different ways.

Adoption of EIM capabilities is usually driven by a few fundamental needs:
responding to a growing set of compliance requirements, improved operational
efficiency, and the strategic application of information to better manage your
organization and gain competitive advantage.

Next, we discuss specific examples of issues as drivers of EIM adoption.

1.3.1 Operational Efficiency as a Driver of EIM
Operational efficiency includes many moving parts to ensure the company has an
improved operational margin. From the EIM perspective, operational efficiency
includes the provisioning and preparation of data so that it can be used to keep
the business running well. The following subsections describe typical operational
efficiency scenarios and the role of EIM.

Improving Payment Processing
The time that's taken to collect payments and the improvement of payment
processing are critical in all industries, and are heavily impacted by the quality of the data.
One example is the healthcare industry, in which it's critical to ensure that hospitals
collect what they should from government agencies such as Medicare and Medicaid
in the United States. Effectively provisioning data from disparate systems ensures
data compliance with U.S. laws for Medicare and Medicaid and enables hospitals to
receive their payments, having an impact in the millions of dollars.

Ensuring a Successful SAP ERP Go-Live
An SAP customer was implementing a new SAP system and had to migrate data
from many non-SAP data sources. The customer was concerned about the large
volumes of data to migrate from both the parent company and a variety of
subsidiaries. It was critical that the entire business not be on hold during the
migration, and the data from the migration had to be loaded accurately and safely. The
requirement from the customer was a single, integrated application providing a
high degree of visibility that could be easily rolled out to multiple subsidiaries,
eliminating hours of custom coding to load data into the SAP system. SAP's EIM
solutions were used to extract data from third-party applications and support a
smooth transfer to a new environment. This automated approach saved valuable
resources and expedited data migration processes, resulting in a smooth (and
on-time) go-live, reducing the overall cost of the implementation.

Consolidating Systems to Improve Information Management and Reduce IT Spend
An SAP customer ran 80% of its business with several SAP systems and wanted
to reduce IT costs and improve transparency of information across the systems.
The company had 8 SAP systems when only 3 were needed and more than 400
non-SAP systems, most of which could be retired. EIM's role in this included
the assessment, alignment, migration, and retirement of data and legacy
systems and ensuring that the 3 remaining SAP systems had accurate and timely
information.

Speaking the Same Language to Increase Operational Efficiency
Another SAP customer had issues where no one spoke the same language. For
example, the term margin covered different realities depending on the
department and employees concerned. To set things right, the company specified four
objectives for itself: to centralize its data in a common environment; to secure the
data; to make the data more reliable, especially for management access; and to
standardize its vocabulary for indicators. EIM accelerates employee access to
information and, as a result, saves significantly on the amount of time required to
perform routine tasks. Teams made enormous gains in responsiveness. Where it
previously took one week for data to be available after accounts were closed, the
operation is now instantaneous.

1.3.2 Information as an Organizational Asset
All organizations have assets (capital, employees, materials, brands, and physical
and intellectual property) that are all managed carefully. Information is similar,
as it, too, is an asset that must be managed and protected. With the right EIM
strategy, information can be leveraged and used as an organizational asset. We'll
now discuss some specific examples of how actual companies use information as
an organizational asset.

Improving Patient Care and Payer Response
An SAP customer is a large hospital conglomerate focused on first-class patient care
and creating innovative ways to improve care. First-class patient care requires the
management of information in large volumes and with daunting complexity. EIM
was used to extract, transform, integrate, cleanse, load, and correlate patient
records from many diverse systems for analysis by doctors and line managers. The
cleansed and aligned data enabled line managers to improve operational efficiency
(including aligning information across multiple hospitals). The project extended
the use of information such that doctors now have the ability to slice and dice
information as needed on patient groups and to provide recommended treatments
and wellness programs based on trends, including re-admittance trends, long-term
performance of different treatments, and so on. The other focus of the project was
to ensure a high quality level of data provided to and by patients. The improved
data quality improved patient service, which led to improved payments by payers,
resulting in the collection of several million outstanding dollars.

Growing Past Tribal Knowledge to Enterprise Information
A large SAP customer had a wealth of information that was vitally needed across
departmental lines, but the information (documents, spreadsheets, manuals)
was locked up in information silos. Shared (or nonshared) hard drives, separate
portals, and multiple content repositories held the data, with no central search or
access capability. This resulted in tribal knowledge; the different departments
could usually find the information that their employees created, but this
information wasn't effectively shared with other departments. By implementing a
strategic Enterprise Content Management (ECM) and global search capability, the
customer was able to create a single enterprise information store that all employees
could search and use, regardless of department.

Improving Data Quality for Customer Interactions
Another SAP customer had a goal to create a 360-degree view of customer data for
sales, marketing, and service. EIM was used to consolidate heterogeneous data
into a single database; integrate structures and processes across sales, marketing,
and service; and ensure a systematic information exchange between field sales
and sales support. Improved customer data quality strengthened dialogue with
these customers and systematized customer-related processes across sales,
marketing, and service. The data quality improvement and improved transparency
drove a new structured quotation process. The new process provides time savings
and fewer errors when creating quotations.

1.3.3 Compliance as a Driver of EIM
As governmental regulations and controls increase, and the cost of legal issues due
to data issues rises, compliance plays a key role in most industries. Every company
and its network of suppliers that produce a durable good that ends up in a
shopping cart have compliance requirements. Other organizations, such as utilities,
government agencies, security, and financial service providers, are also subject to
regulatory and compliance issues. In addition to industries, countries have import
and export regulations that impact the ability to do business globally.

In the following subsections, we discuss some general examples of compliance
issues that indicate a need for an information management strategy.

Keeping Data Too Long
For regulatory compliance, companies must ensure that they keep
retention-relevant data for a minimum period of time, as defined by retention laws. They must
also ensure that certain data is purged from the system. For example, data privacy
laws mandate the destruction of person-related data after a specified period of
time. In Germany, companies must delete data from rejected job applicants in
their HR systems not earlier than 6 months, but not later than 12 months, after
the applicant was rejected. Failing to comply with these regulations may result in
large fines for companies. Another example is a pharmaceutical company that
must keep information related to a new clinical trial for a number of years. After
that time has passed, the information should be deleted. Not deleting sensitive
information after the required retention period increases the risk from potential
lawsuits.

Building According to Specs
Remember the Mars probe? Launched in December 1998 and lost on arrival at Mars
in 1999, the Mars Climate Orbiter was designed to gather amazing data to help
scientists better understand the universe. However, none of that amazing data was
gathered from the $125 million venture because groups working across the globe
failed to operate under similar units of measure. Specifically, values produced in
American (imperial) units were never converted to the metric units expected for
operation. Core information management principles could have helped alleviate
this risk by documenting the data definitions and outlining use of that data
throughout the data's lifecycle.

Maintaining Industry Standards for Data
Some external standards apply to entire industries. For example, global standards
(GS1 standards) aim to help companies exchange information in the same format,
thereby increasing the efficiency and visibility of supply and demand chains
globally [2]. To participate, however, you not only need to understand the relevant GS1
standard, but you must also fully understand your data, the data model, and
current data quality levels. Without this baseline understanding, your use of GS1
would be flawed at best, and you would miss golden optimization opportunities.

[2] Source: http://www.GS1.org/about/overview

1.4 Impact of Big Data on EIM
It's well documented that the volume of data created in organizations is large and
growing at an unprecedented velocity. Organizational datastores are now
commonly measured in terabytes or even petabytes. There are many reasons for the
unparalleled growth in datastores: social media, compliance and regulatory
requirements, transactional data, sensory data (such as data from real-time shop
floor sensors), multimedia content, mobile devices, RFID-enabled devices, the
internet of things (connected devices), the never-ending quest to improve
organizational effectiveness, and the list goes on. The fact is that data creation has
become a by-product of nearly all individual and organizational activities.

Moreover, the reason data is preserved and reused is that it has value well beyond
its original use. We dare to say that the value of the data created to automate
business processes may in some cases be greater than the process itself. Today, the
market has christened the phenomenon of organizations' desire to harness the
great torrent of data, as well as the velocity, variety, and variability of information,
as big data. Figure 1.6 is a representation of the volume, velocity, variety,
and variability of data. It remains to be seen if the term big data will stick.
However, as long as organizations can create value through data, the growth
and importance of data will continue. Fortunately, advancements in
computational power, storage capacity, information access and management, and
analytics are progressing at an equally impressive rate. Two such advancements are
Hadoop and SAP HANA (to be discussed further in Chapter 3). The combination
of massively greater amounts of data with the tools and talent to analyze it
promises to launch the next wave of innovation and productivity and even spawn new
business models.

(Figure 1.6 labels: Mobile, Inventory, CRM Data, GPS, Emails, Planning Demand,
Tweets, Speed, Instant Messages, Opportunities, Velocity, Customer, Things,
Service Calls, Sales Orders, Transactions)
Figure 1.6 Information Growing in Volume, Velocity, Variety, and Variability

One example of new innovation provided by the ability to manage and analyze
big data is in the healthcare industry, specifically related to the area of cancer
treatment. The human genome contains 6 billion DNA base pairs; as the genome
sequence for each patient will be decoded in the near future, these billions of
data points must be managed. Add to that documentation and features such as
speech recognition, and you'll end up with 20 terabytes per patient.

The velocity of data collection is building daily, and you must manage and make
sense of your data on the fly. You need to remain flexible through instability and
change. You can't underestimate the pace of innovation, and you don't want to be
playing catch-up with your competitors. If planning and implementing a coherent
data management strategy seems daunting when your organization owns a few
terabytes of data, how difficult will it be when you own thousands of terabytes?

The best way to realize the promise of big data, today and in the future, is to
develop and adopt an EIM strategy. This strategy should cover your entire
enterprise to take advantage of the benefits of sharing information and aggregating
data across your organization. Typical topics that must be considered for an
effective EIM strategy include interoperable data models, architectures for
analytical and transactional data, integration architecture, analytical architecture,
and information security and compliance. The goal is to have data that is
shareable and can be leveraged over time within and across business units. The
deployment of SAP solutions for EIM within a defined EIM strategy is a key
starting point. The alternative is to have massive amounts of disintegrated and
unreliable data analyzed fast.

"Garbage in, garbage out" is one of the oldest adages in information processing;
when the volume of data reaches the big data stage, getting productive use of
poorly managed information becomes the equivalent of searching for a priceless
antique in a landfill.

1.5 SAP's Strategy for EIM
SAP recognizes the importance of maximizing the value of enterprise information
in support of any data-driven analytical, operational, or governance initiatives. To
achieve this, organizations need a comprehensive suite of solutions providing the
capabilities from architect to archive. Figure 1.7 shows SAP solutions for EIM.

SAP solutions for EIM are comprehensive in functionality, including capabilities to
support enterprise architecture, data integration, data quality, master data
management, enterprise content management, and information lifecycle management.

This chapter introduces SAP PowerDesigner as a modeling and
design-time metadata management platform for information
management designs.

7 SAP PowerDesigner

All enterprises today are or will be faced with a transformative event, such as reg-
ulation changes, merger and acquisition activity, or enablement of new business
models from new technologies (e.g., cloud and in-memory). You need to be able
to treat information as a corporate asset to succeed with such business transfor-
mation. This chapter focuses on the discipline of enterprise information architec-
ture (EIA) as part of SAP Enterprise Information Management (EIM), and how
tools such as SAP PowerDesigner, a modeling and design-time metadata manage-
ment platform, enable you to understand your current information landscape,
align business information with technical implementation, and plan for change.

Architecture is about planning for, designing, and executing change. The value of
SAP PowerDesigner (hereafter PowerDesigner) is best realized when we use the
current-state information models, captured and documented in the tool, to help us
plan the next-generation business. Transformation needs a plan, and designing
future-state versions of data models, aligned to the current conceptual data model
(CDM) and business glossary, ensures we make a united step forward at every stage
along the way.

Adding technical details in logical data models (LDMs) and physical data models
(PDMs), together with specialized analytics models, ensures that we can
communicate details to the responsible database development teams. PowerDesigner's
unique Link and Sync technology streamlines impact analysis and design-time
change management, reducing the time, cost, and risk associated with change.
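
To make the idea of design-time impact analysis more concrete, the following
Python sketch treats design objects as nodes in a dependency graph and walks
downstream from a changed object. It is only an illustration under stated
assumptions: the object names (CDM.Customer, LDM.Customer, and so on), the
DOWNSTREAM table, and the impacted_by helper are all hypothetical, and the
sketch does not use or represent PowerDesigner's Link and Sync technology or
any PowerDesigner API.

```python
from collections import deque

# Hypothetical links: each key is a design object, each value lists the
# objects generated from it (conceptual -> logical -> physical).
DOWNSTREAM = {
    "CDM.Customer": ["LDM.Customer"],
    "LDM.Customer": ["PDM.CRM.CUSTOMER", "PDM.DWH.DIM_CUSTOMER"],
    "CDM.Order": ["LDM.Order"],
    "LDM.Order": ["PDM.CRM.ORDERS"],
}


def impacted_by(changed, links):
    """Return every object reachable downstream of `changed`, breadth-first."""
    seen = []
    queue = deque(links.get(changed, []))
    while queue:
        obj = queue.popleft()
        if obj in seen:
            continue  # already visited through another path
        seen.append(obj)
        queue.extend(links.get(obj, []))
    return seen


if __name__ == "__main__":
    # List what would need review if the conceptual Customer entity changed.
    for obj in impacted_by("CDM.Customer", DOWNSTREAM):
        print(obj)
```

In this toy graph, a change to the conceptual Customer entity touches one
logical model and two physical models, while a change to a physical table has
no downstream objects at all.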

In this chapter, we'll explore enterprise information architecture, including the
different model types, the core components of each, and how they work together
to make a complete view of information for designers. This chapter will also
cover how the repository helps with tasks such as managing model-to-model
dependencies and impact analysis. You'll learn the value that architecting, or
planning, provides to all organizations that are faced with managing complex
change in information systems.

7.1 SAP PowerDesigner in the SAP Landscape
PowerDesigner provides architecture and modeling capabilities to all
organizations and is uniquely integrated into many SAP products. PowerDesigner is
integrated with SAP Business Suite and the SAP HANA Cloud Platform (HCP). Within
the EIM landscape, PowerDesigner is integrated with SAP Information Steward
(hereafter Information Steward), SAP BusinessObjects, and SAP Replication
Server (hereafter Replication Server). PowerDesigner is also a key element of
Intelligent Business Operations powered by SAP.

7.1.1 SAP Business Suite
PowerDesigner can connect to the SAP Business Suite and create a PDM
representing the data dictionary by reading the business and technical metadata from
SAP Business Suite. This is very useful when looking at SAP Business Suite as the
standard definition for any homemade applications built around common data
sets, or for when preparing for an enterprise data warehouse and extracting data
from SAP Business Suite to populate the warehouse as one of the key sources.

7.1.2 SAP HANA Cloud Platform
SAP HANA has a repository that's used for the development and implementation
of data structures that is optimized for helping developers get the most out of SAP
HANA's unique in-memory capability. PowerDesigner can write to the SAP
HANA repository or read from it. Reading the SAP HANA repository creates or
updates a PDM in PowerDesigner. PowerDesigner can also take a PDM that
includes SAP HANA-specific attribute and analytic views and create new, or
merge with existing, repository objects.

7.1.3 SAP Information Steward, SAP BusinessObjects Universes, and Replication
PowerDesigner's repository is read by Information Steward, enabling people to
read metadata from PowerDesigner's PDMs, LDMs, and CDMs. This allows all the
known metadata, both operational and architectural, to be visible to the data
steward as they manage the quality of information sources in operation.

PowerDesigner's dimensional diagram can create SAP BusinessObjects universes.
PowerDesigner can read a universe and create a new, or merge with an existing,
dimensional diagram.

PowerDesigner can reverse engineer Replication Server's catalog to create or
merge with an existing data movement model. This data movement model can
generate new replication definitions. Special patterns exist to streamline use cases
of replication and SAP Data Services (Data Services) together to implement
real-time loading and other scenarios.

7.2 Defining and Describing Business Information with the Enterprise Glossary
An enterprise glossary helps everyone define and describe information assets and
related technology. It lists business terms in business language, independent of
any data characteristics. One term can relate to multiple data items (atomic data
elements), and a data item can have multiple terms associated with it.

Example
NeedsEIM Inc. defines its information model to have a customer entity that can have a
customer address attribute, which combines the terms customer and address
to make up its name.

In PowerDesigner, the enterprise glossary is a global service provided by the
repository that is available to all users. It contains all terms, synonyms, and
related terms, grouped by nested term categories. A glossary term identifies the
term (Name) and provides a standard abbreviation for the term (Code) and a
definition (Description). The glossary term will be created within a category folder
(Category) and may also be further defined in an external system and referenced
via a URL (Reference URL). As you can see in Figure 7.1 in the next subsection, the
business term commission is defined, and every time the word commission
appears in the design (such as a table or column name), the standard abbreviation
of CMSN will be used in the name. You can also see that this term is approved
in the Status box, so you know it's the right definition for this term.

PowerDesigner's glossary is meant to be a direct reflection of the business
glossary in Information Steward. Information Steward is used to capture, define, and
manage the glossary terms and relate them to the metadata of operational
systems, while in PowerDesigner, the same terms can be imported and then used to
standardize names for all new information assets that are defined in any model.

7.2.1 Glossary Terms for Naming Standards Enforcement
Using a common business language ensures that when users collaborate across
business units, or outside the company, they're all using the same concepts in the
same way. This is a critical part of establishing enterprise information architecture
and a key component of any data dictionary. The enterprise glossary (see Figure
7.1) can be used to manage naming standards for all design models in
PowerDesigner. The Name field is used for name lookup, and any name that matches a term
is linked to that term. If there are any aliases associated, when you begin to type
the alias, PowerDesigner detects the use of an alias and indicates that there is a
preferred term to use in lieu of the alias. This helps establish the enterprise use of
the preferred term and further increases understandability and readability of all
models as everyone will be using the standard terms.

Figure 7.1 A Glossary Term in SAP PowerDesigner

7.2.2 Naming Standards Definitions
PowerDesigner can be configured to use the glossary to ensure all names used
throughout a model are found within the list of terms. To configure
PowerDesigner to use the enterprise glossary, follow these steps:

1. Select Tools > Model Options, and then select Naming Convention.
2. Check Enable glossary for autocompletion and compliance checking.
3. Select the Name to Code tab, and set Conversion Table to glossary terms.

You can combine multiple terms into one name (e.g., Customer Address using
terms Customer and Address).

You can also enable automatic conversions of names to implementation concept
Code values. In PowerDesigner, the Name field is the business language
descriptor, while the Code field represents the name used for the object when converted
into any sort of implementation code (e.g., when used in a CREATE TABLE
statement).

7.3 The Conceptual Data Model
PowerDesigner supports the definition of a CDM. For an organization to treat
information as a corporate asset, all information sources should be derived from
a common definition, or a core concept. A CDM is meant to model a single
definition of any data asset, independent of both the storage paradigm (relational,
hierarchical) and the physical characteristics of the systems that will ultimately
store them.

The enterprise CDM also represents the sum of all use cases for a given data
concept. Any entity defined in the enterprise CDM will have all the attributes needed
for all processes or all applications. For example, the enterprise CDM entity for
customer will have all attributes together, whether used for order, relationship,
support management, and more; while LDMs and PDMs that represent the
individual systems will have their own subset of these attributes. This will help ensure
that any attributes that are shared between implementations follow a common
standard and will reduce the impedance mismatches found when you later need
to integrate these data sets together.
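
The relationship between the enterprise CDM and the system-level models can
be pictured as a simple subset check. The sketch below is a plain Python
illustration, not PowerDesigner functionality: the attribute names, the two
example systems, and the attributes_outside_cdm helper are all assumptions
made for the example.

```python
# Hypothetical enterprise CDM entity: the union of attributes across all uses.
ENTERPRISE_CUSTOMER = {
    "Customer Identifier",
    "Customer Name",
    "Customer Address",
    "Credit Limit",
    "Support Tier",
}

# Hypothetical system-level models, each using its own subset of attributes.
SYSTEM_MODELS = {
    "order_management": {"Customer Identifier", "Customer Name", "Credit Limit"},
    "support_desk": {
        "Customer Identifier",
        "Customer Name",
        "Support Tier",
        "Nickname",
    },
}


def attributes_outside_cdm(enterprise, systems):
    """Map each system to the sorted attributes it uses that the CDM lacks."""
    return {
        name: sorted(attrs - enterprise)
        for name, attrs in systems.items()
        if attrs - enterprise
    }


if __name__ == "__main__":
    # Systems whose attributes are all in the enterprise entity are omitted.
    print(attributes_outside_cdm(ENTERPRISE_CUSTOMER, SYSTEM_MODELS))
```

Here the support_desk model is flagged because Nickname has no counterpart in
the enterprise entity, which is the kind of drift a single shared definition is
meant to surface before the data sets have to be integrated.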

Let's review the core components of an enterprise CDM by looking at elements,
attributes, data items, and domains in the following sections.

7.3.1 Conceptual Data Elements, Attributes, and Data Items
PowerDesigner manages enterprise CDM concepts such as entities, attributes,
data items, and domains. These four concepts make up the core of the CDM, and
we'll discuss them in more detail in the following subsections.

Entities
Entities are structured elements that define a core business concept that you need to
keep account of, such as product, customer, or delivery. Anything the business as a
whole needs to account for and keep records of should be represented by an entity
in the CDM. A CDM's entity should represent a single global view of all possible
attributes that the concept may need for any given use case or business process.

Attributes and Data Items
In PowerDesigner, entity attributes and data items are separate but tightly related
concepts. Data items in PowerDesigner represent a unique data cell: a single
value of a specific type for a specific purpose. Examples of data items are
Customer Name, Delivery Date, Product Description, or Phone Number.

Because data items exist independent of the entity attributes they represent, you
can use them as a data dictionary, or list of all atomic data managed in the
enterprise. This list of data items, or the data dictionary, is useful to communicate with
the data stewards to ensure you have the right definition for the data independent
of any use in an entity or any physical implementation in a database.

Entity attributes are a relationship, or link, between an entity and a data item. For
example, when the Customer entity is related to the Customer Name data item,
the Customer entity will have an attribute called Customer Name. Any changes
made to the data item will be reflected in the attribute as well.

Domains
Domains provide another level of data standardization. A domain is a named set of
common data characteristics for any number of data items (and therefore all
attributes using that data item). For example, a domain called Name can define the
data type, length, and other common characteristics of any name type of data item
in the model. Anything using Name (e.g., Product Name, Customer Name, or
Company Name) that is also using the Name domain will share this common
characteristic. The key difference between a domain and a data item in PowerDesigner
is that the data item is a direct representation of an attribute on one or more
entities and carries a name representing a cell of information, while the domain is a
common set of data characteristics used by one or more data items and doesn't
represent a cell of information itself, just its common structural characteristics.

7.3.2 Separation of Domains, Data Items, and Entity Attributes
The key advantages to this separation of entities, data items, and domains are
freedom of expression and improved standardization.

Domains standardize common data characteristics for any information you need
to manage for the business, regardless of what you call it. This ensures a consistent
use of data structures for all attributes that are of a common concept, such as
money, name, or phone number. When data items follow a common standard
domain like this, comparing and integrating data is a lot easier. You won't need to
create complex transformation code to make the two different data elements
match in form and structure, so you can get right to comparing values.

7.3.3 Entity Relationships
The enterprise CDM would not be complete without the relationships that are
defined between the entities. The CDM is essentially an Entity-Relationship
Diagram (ERD). The relationships between the entities complete the understanding
of the business data the CDM represents. There are two major types of
relationships in the CDM: the ones that represent how two entities are connected to each
other, and the ones that represent entities that are, in essence, a specialization of
another.

Relationships that represent the connections between two entities carry
cardinality; that is, the frequency of the instances of each side. You can define relationships
of cardinality types zero- or one-to-many, many-to-many, and one-to-one (see
Figure 7.2, showing a one-to-many between Customer and Order and a
many-to-many between Items and Order). Relationships representing a supertype/subtype,
also known as an is-a relationship, may also be defined in the CDM using the inheritance object. When you define an inheritance, or is-a relationship, all attributes of the parent are available attributes of each child.

To define a relationship in PowerDesigner, use the Relationship tool from the tool palette. Follow these steps:

1. Select the Relationship tool, click on one of the entities, and drag to the second entity to link.
2. To change the cardinality settings, double-click on the relationship line, and you can change the following:
   - Cardinalities: One to Many, Many to Many, or One to One
   - The Role name (in both directions) to label the relationship, typically with a verb
   - Mandatory (on each end), determining whether a parent can exist without any children, and whether a child can exist without a parent

Figure 7.2 An Example CDM (Employee supertype with Is A subtypes Stock Clerk, Shipper, and Sales; Customer and Order entities)

7.3.4 Best Practices for Building and Maintaining an Enterprise CDM

Business details are discovered over time, not all at once. The definitions of business terms evolve as the business evolves. New terms are discovered, old terms become obsolete, and existing terms are redefined. In the following subsections, we'll discuss what to keep in mind when defining an enterprise CDM.

Version Terms and the Enterprise CDM
Different versions of the enterprise CDM will be attributed to different projects at different stages in their lifecycle. You can do this in PowerDesigner by setting up a configuration in the repository. Configurations are defined in the Repository menu, under Configurations. You can create a new configuration and then add specific model versions to it from a select list. Using a PowerDesigner configuration, you can indicate which specific versions of the enterprise CDM are related to which versions of the logical and physical models representing projects and implemented systems.

Don't Overload a Single Concept
Let each data item represent a single concept. For example, break address concepts into their lowest levels of detail (street number, street name, city, etc.). You do this manually in PowerDesigner by creating additional data items for the more granular elements and removing the complex one. This way, the language that's used to identify the data item and the meaning of the information it represents will be crisp and clear.

Keep Definitions Granular
If you need too many examples and too many sentences to describe a single business information concept, then it may be too complex for a single entity or data item to represent. You should consider simplifying the concept to a common denominator or finding some way to separate it into multiple discrete concepts. In PowerDesigner, you simply create additional entities and attributes to define these more granular concepts.

Use Synonyms Where Possible
Make sure a common concept shares a common language. Assign synonyms to a common term in the enterprise glossary so that the preferred term is always known. You do this in PowerDesigner by double-clicking the term in the glossary browser and selecting the Synonyms tab. Any word you enter in the Synonyms list will be an alternate term defining the same concept as the term itself (now known as the preferred term). This way, you don't mistake a different name for a completely different concept.
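The relationship between synonyms and a preferred term can be pictured as a simple lookup. The following is a minimal sketch only, not how PowerDesigner stores its glossary; the `GLOSSARY` contents and the function names are hypothetical and chosen for illustration.

```python
# Hypothetical glossary: preferred term -> list of synonyms (illustrative data only).
GLOSSARY = {
    "Customer": ["Client", "Buyer"],
    "Order": ["Purchase Order"],
}


def build_synonym_index(glossary):
    """Map every term and synonym (case-insensitive) to its preferred term."""
    index = {}
    for preferred, synonyms in glossary.items():
        for name in [preferred, *synonyms]:
            key = name.casefold()
            # A synonym that points at two preferred terms would be ambiguous.
            if key in index and index[key] != preferred:
                raise ValueError(f"{name!r} is claimed by two preferred terms")
            index[key] = preferred
    return index


def preferred_term(name, index):
    """Return the preferred term for a name, or None if it is not in the glossary."""
    return index.get(name.casefold())


INDEX = build_synonym_index(GLOSSARY)
```

Looking up `preferred_term("client", INDEX)` resolves to the preferred term `Customer`, while an unknown word returns `None`, which is a hint that a new term may need to be defined.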


Keep Obsolete Concepts
If you have a concept that's no longer needed, it's better to leave the definition in the enterprise CDM, marked as obsolete. You can do this in PowerDesigner by unchecking the Generate checkbox, which prevents the concept from moving forward into LDMs and PDMs. This way, any new concepts that are similar won't reuse the old terms and entities, but create new ones. This ensures that there will be no future confusion with older systems using the original definition of that concept.

Don't Redefine and Reuse
This complements the idea that you should keep obsolete concepts around. If something has really changed enough that the definition of the concept deviates from the original idea, then a new term, new data item, or new entity should be defined, and the original one should be kept around for legacy reasons. In PowerDesigner, you can mark the old term as Legacy in the Stereotype field, and uncheck the Generate checkbox. A good test of this is whether the original concept fits within the new definition, or whether the data sets managed by the concept would have to be deliberately segregated to keep them understood.

7.4 Detailing Information Systems with Logical and Physical Data Models

The PowerDesigner LDM and PDM represent the Relational Database Management Systems (RDBMSs) that implement the data concepts from the enterprise CDM. These models differ fundamentally from the enterprise CDM in three key ways: scope, structure, and technical considerations.

7.4.1 Scope
LDMs and PDMs are slivers of the enterprise, representing a specific subset of the concept to be implemented. These models represent a given functional area of the business and its one or more physical databases. While the enterprise CDM has a single namespace (a name can only be used once for the entire enterprise CDM), the logical and physical layers allow for multiple namespaces, each one constrained by a given system boundary.

Example
In identifying and managing customer metadata, NeedsEIM Inc. creates an entity for the customer concept that has all attributes, including customer name, address, gender, age range, income bracket, and more. The LDM for the order-to-cash functional area will only take the name and address attributes. A completely separate LDM for customer relationship management will take only the demographic attributes.

LDMs and PDMs design information structures within a given storage paradigm. When targeting an RDBMS, the LDM represents the relational structures and includes relational concepts such as migrated foreign keys. The PDM adds the vendor- and version-specific RDBMS details such as physical data types, triggers and procedures, and more. Other types of LDMs exist, such as a hierarchical representation in canonical data models (XML structures) or an object-oriented representation targeting object-oriented systems design.

7.4.2 Structure and Technical Considerations
LDMs and PDMs contain structure definitions that have nothing to do with business data definitions, and everything to do with technical considerations for implementation. As shown in Figure 7.3, details such as foreign keys to define how relationships will be stored, or link entities storing the keys of many-to-many relationships, are foreign to the business; they have no meaning when trying to understand a business concept. PDMs may involve denormalizing; for example, combining multiple tables or duplicating columns in more than one table to reduce the number of joins needed in a query and improve application performance.

The LDM helps us prepare for physical implementation, and represents the data structures for a given functional area. It may represent multiple databases, from multiple vendor/version RDBMSs. The LDM is an abstraction from the actual details of a physical implementation and is useful for application designers and developers to know what information is available. The PDM is there to develop the actual database and adds details such as indexes, views, referential integrity constraints, triggers, stored procedures, and more.

Each PDM is tightly related to a specific relational database vendor and version and is intended to be a 1:1 representation of the actual physical database. The PDM can be created by reverse engineering an existing running database. Any PDM can be used to generate new Data Definition Language (DDL) files to create


a new database, or can be compared to an existing database and used to update it with DDL and Data Manipulation Language (DML) statements that change the schema while keeping the existing data in place.
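As a rough illustration of the structures discussed in Section 7.4.2 (migrated foreign keys and a link entity for a many-to-many relationship), the sketch below creates a small schema in an in-memory SQLite database. The table and column names are assumptions loosely based on the entities in Figure 7.3; this is not DDL generated by PowerDesigner.

```python
import sqlite3

# Illustrative DDL only; names are assumptions loosely based on Figure 7.3.
DDL = """
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE items (
    item_id     INTEGER PRIMARY KEY,
    description TEXT
);
-- One-to-many: the customer key migrates into the order table as a foreign key.
CREATE TABLE orders (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
    description  TEXT
);
-- Many-to-many: a link table stores the keys of both sides.
CREATE TABLE order_items (
    order_id INTEGER NOT NULL REFERENCES orders(order_number),
    item_id  INTEGER NOT NULL REFERENCES items(item_id),
    PRIMARY KEY (order_id, item_id)
);
"""


def build_schema():
    """Create the sketch schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


def foreign_keys(conn, table):
    """Return sorted (column, referenced_table) pairs declared on a table."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    # Each row is (id, seq, table, from, to, on_update, on_delete, match).
    return sorted((row[3], row[2]) for row in rows)
```

Calling `foreign_keys(build_schema(), "order_items")` lists the two migrated keys of the link table. None of these columns carries business meaning of its own, which is why they appear in the logical and physical models rather than in the CDM.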
Figure 7.3 Logical Data Model with Migrated Foreign Keys (entities Order, Order Items, Customer, and Items)

7.5 Canonical Data Models, XML Structures, and Other Datastores

Enterprise information architecture goes beyond relational databases and includes information in all structures within the enterprise. One common representation of information in nonrelational structures is the XML formatted messages used to communicate between systems. XML Schema Definitions (XSDs) represent the messages and the message structure.

PowerDesigner has a special XML model, shown in Figure 7.4, that represents an XSD directly and can map that model to one or more PDMs to show where the data in messages is read from or written to.

Figure 7.4 XML Model in SAP PowerDesigner Showing Complex Type Reuse (Customer and Client elements sharing CustomerType)

Many organizations have worked to standardize the structures of message formats by using a Canonical Data Model, which is an XML model that gathers all the elements of all the messages together and creates a series of XML complex types to define commonly reused data structures. This Canonical Data Model is a sort of data dictionary for the messages themselves.

In PowerDesigner, mappings can be created between the complex type definitions and the data model representing how message content can be stored in one or more physical databases (see Figure 7.5).

Figure 7.5 XML Model Mappings with a PDM
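To make the idea of complex type reuse in a canonical model concrete, here is a minimal sketch that builds a tiny XSD with Python's standard library. The element and field names echo Figure 7.4 but are assumptions; a real canonical model would be considerably richer, and PowerDesigner produces such schemas through its own XML model rather than through code like this.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def build_schema():
    """Build a tiny XSD in which two elements reuse one complex type."""
    schema = ET.Element(f"{{{XS}}}schema")

    # Reusable complex type (field names are illustrative).
    party = ET.SubElement(schema, f"{{{XS}}}complexType", name="CustomerType")
    seq = ET.SubElement(party, f"{{{XS}}}sequence")
    for field in ("Identifier", "Name", "Address", "Phone"):
        ET.SubElement(seq, f"{{{XS}}}element", name=field, type="xs:string")

    # Two message elements that share the same type definition.
    for element_name in ("Customer", "Client"):
        ET.SubElement(schema, f"{{{XS}}}element",
                      name=element_name, type="CustomerType")
    return schema


def elements_using(schema, type_name):
    """List the top-level element names declared with the given type."""
    return [
        el.get("name")
        for el in schema.findall(f"{{{XS}}}element")
        if el.get("type") == type_name
    ]
```

`ET.tostring(build_schema())` yields the schema text, and `elements_using(...)` shows which messages would be affected if the shared type changed.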


Use the Mapping Editor from the Tools menu to define mappings. Then, create the mapping definitions by dragging the data elements from the left and dropping them onto the XML structures on the right.

In PowerDesigner, you can also create a library of commonly reused complex types and then use shortcuts to reuse these in any number of XML models representing different sets of messages. To do this, create a new XML model in PowerDesigner, and either reverse engineer an existing XSD with the complex types defined, or use the palette to create new complex types in the model. When you check the model into the repository, click the Advanced button, and select Library in the Folder option.

7.6 Data Warehouse Modeling: Movement and Reporting

When you start trying to define and describe the data warehouse and business analytics systems, you need to understand data in motion between source systems and analytics stores. You also want to know the relationship between analytics systems and the underlying data warehouse database. This helps ensure that you've identified the right data sources, that you can answer the business questions needed to help in decision making, and that you know what parts of the system will be affected when changes happen to any given component of the environment.

PowerDesigner data mappings are captured using the Mapping Editor for easy, drag-and-drop identification of the dependencies between transactional systems and analytics systems. Follow these steps:

1. Select Mapping Editor from the Tools menu. If this is the first time you've started the Mapping Editor, you'll be prompted to complete a wizard to identify the sources for the mappings.
2. You may identify one or more PDMs to represent the source for the data warehouse or master datastore.
3. Create mappings by dragging a source data element (table or column) from the left-hand side to the destination (table or column) on the right. You can also define mappings between an enterprise data warehouse and a series of data marts.

PowerDesigner table definitions allow you to mark tables as a Fact or Dimension. To do this, go to the General tab, and select the option from the Dimensional type dropdown. At the physical table level, this helps report designers know which tables contain the different types of information, which ones represent things the business will measure, and the variables by which those measures are partitioned.

In PowerDesigner, you may select Multidimensional Objects > Retrieve Multidimensional Objects from the Tools menu and automatically detect the dimension type based on key structures of each table. For tables that have a compound primary key made up of foreign keys migrated from other tables, the logic determines that it's a likely fact table, and for all other key structures, the table is determined to be a dimension.

Dimensional Modeling

In PowerDesigner, dimensional models represent the analytic reports themselves. The dimensional model is a graphical representation of fact and dimension objects. As shown in Figure 7.6, fact objects represent one or more fact tables coming together to make a single fact concept. Dimension objects represent the dimension tables collapsed into a simpler representation, complete with multiple hierarchies representing drill-up and drill-down opportunities within the attributes.

Figure 7.6 Dimensional Model with Facts, Dimensions, and Hierarchies (Order fact with Time, Location, Product, and Customer dimensions)
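The key-structure rule for detecting fact and dimension tables can be sketched in a few lines. This is only an approximation of the rule as stated in the text, under the assumption that "made up of foreign keys" means every primary key column is a migrated foreign key; PowerDesigner's actual detection logic may differ, and the table definitions below are hypothetical.

```python
def classify_table(primary_key, foreign_keys):
    """Classify a table as 'fact' or 'dimension' from its key structure.

    primary_key  -- list of column names in the primary key
    foreign_keys -- set of column names that are migrated foreign keys
    """
    is_compound = len(primary_key) > 1
    # Assumption: every primary key column must be a migrated foreign key.
    all_migrated = all(col in foreign_keys for col in primary_key)
    return "fact" if is_compound and all_migrated else "dimension"


# Hypothetical tables, loosely following Figure 7.6.
TABLES = {
    "order_fact": (["customer_id", "item_id", "date_id"],
                   {"customer_id", "item_id", "date_id"}),
    "customer": (["customer_id"], set()),
    "time": (["time_id"], set()),
}

CLASSIFIED = {name: classify_table(pk, fks) for name, (pk, fks) in TABLES.items()}
```

A heuristic like this only suggests a likely classification; a table flagged this way should still be reviewed, since the result depends entirely on how keys were modeled.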


These models are created either by selecting New Dimensional Diagram from the PDM's context menu, or running the wizard from Tools > Multidimensional Objects > Generate Cube.

Note
While it's useful to mark tables as fact and dimension in order to identify where in the database the structures for analytics systems will likely be finding information, it's not a description of a specific report.

7.7 Link and Sync for Impact Analysis and Change Management

PowerDesigner uses the dependencies that are tracked and managed between models to help facilitate impact analysis and change management. This is known as PowerDesigner's Link and Sync technology. This allows CDMs, LDMs, and PDMs to remain synchronized through iterations of change without requiring designers, architects, and developers to redo their work.

Link and Sync captures the cross-domain dependencies, such as data used by a process step or flow, or the applications that access certain data assets. You can show all business tasks and all applications that interact with enterprise data.

In the following sections, we'll discuss how PowerDesigner can be used to create links between any objects in any models, and how it automatically manages model-to-model synchronization through the model generation engine.

7.7.1 Link and Sync Technology
From the name, you see that Link and Sync has two parts: the Link part and the Sync part.

Linking
Linking is when a modeler recognizes a dependency between any two things in PowerDesigner and creates the link. You can create links between any PowerDesigner models, including models that aren't directly used for data modeling but found in information and enterprise architecture, such as the requirements model or the business process model. Linking between such models happens naturally for the most part; for example, attaching a list of data elements to a process.

When you define a CRUD matrix in a PowerDesigner Business Process Model (BPM) referencing data in a CDM, you're creating links. When you create any type of dependency by drawing a reference, relationship, or inheritance, you're creating a link. You can also create links by opening any object's property sheet, going to the Traceability Links tab, and clicking the New button to select any object in any model.

You also establish links when binding requirements to any object through the requirements traceability matrix. This is easily done in PowerDesigner by simply opening the requirements traceability matrix, selecting any empty cell, and pressing the (Space) bar. To remove a link, select a cell that contains a checkmark (identifying the presence of a link), and press the (Space) bar. You can create dependencies between any two objects in PowerDesigner using the dependencies matrix, which looks and operates nearly identically to the requirements traceability matrix, but can be established between any two objects, in the same or in different models. To create a new dependency matrix, simply select New Traceability Matrix from the model's pop-up menu in the object browser, and specify the object types to use for the rows and columns. You can also select which attribute will be used to identify the link, if more than one way to combine these objects is possible (e.g., reference or inheritance on an entity in a CDM).

Synching
The synchronizing part in PowerDesigner Link and Sync is when one model is generated from another. PowerDesigner keeps track of the transformed objects and their source. When you generate a model from another (for example, when creating a PDM from an LDM), the sync technology remembers everything. If you then make changes to the original model, the second generation isn't a new creation of a new PDM, but a write into the existing one generated the first time. Sync technology publishes only the changes made in the LDM since the last generation. This way, any changes made to the PDM in areas not affected by the LDM change will be preserved.

To initiate a synch process, use the model generator from the Tools menu. For example, to synchronize an LDM to a PDM, open the LDM first, and select


Generate Physical Data Model from the Tools menu. This initiates the sync compare and presents the Compare/Merge dialog. After you accept the changes you want to synchronize, PowerDesigner automatically applies them to the selected PDM and opens the PDM model when complete.

PowerDesigner's Merge Models dialog, shown in Figure 7.7, allows you to manually override any preserved changes if needed, simply by checking the empty checkbox next to the detected difference. This is sometimes useful when implementation starts to deviate too far from the original concept, and a reset in a precise area is needed to get the database design back on track.

Figure 7.7 Compare/Merge Showing Preserved Differences

Synchronization ensures that models derived from each other remain aware of each other and that dependencies can be tracked at the smallest level. This Sync technology makes it natural and easy for business analysts, technical analysts, architects, designers, and developers to remain in lockstep while managing continuous change at any level of abstraction.

7.7.2 Impact Analysis Reporting
The most important use case for keeping all these models linked and synchronized together is so that you can determine what will happen if you change anything. The Impact Analysis feature in PowerDesigner produces a list of impacted objects with a tree-like structure. Filters and other tools help scope the analysis to areas of interest. To begin an impact analysis in PowerDesigner, follow these steps:

1. Either select Impact Analysis from the Tools menu or right-click on any object in the browser or diagram area, and select Impact and Lineage Analysis from the pop-up menu.
2. Generate a diagram view from the tree view by clicking the Generate Diagram button on the Impact and Lineage Analysis dialog box, as shown in Figure 7.8. This diagram is very useful to collaborate with others in an easy-to-view format (see Figure 7.9).

Figure 7.8 SAP PowerDesigner Impact Analysis Dialog
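Conceptually, an impact analysis walks the recorded links outward from the object you intend to change. The sketch below shows that idea as a breadth-first traversal over a small dependency map. It is an illustration under stated assumptions rather than PowerDesigner's implementation; the object names in `DEPENDENTS` are hypothetical.

```python
from collections import deque

# Hypothetical dependency links: object -> objects that depend on it.
DEPENDENTS = {
    "CDM.Customer": ["LDM.CUST", "BPM.Process Order"],
    "LDM.CUST": ["PDM_MSSQL.Customer", "PDM_Oracle.Customer"],
    "PDM_MSSQL.Customer": ["PDM_MSSQL.V_Orders"],
}


def impacted_objects(start, dependents):
    """Return (object, depth) pairs reachable from start, breadth-first."""
    seen = {start}
    queue = deque([(start, 0)])
    result = []
    while queue:
        node, depth = queue.popleft()
        for child in dependents.get(node, []):
            if child not in seen:  # guard against cycles and repeated objects
                seen.add(child)
                result.append((child, depth + 1))
                queue.append((child, depth + 1))
    return result
```

The depth value corresponds to the nesting level in a tree-like result list: depth 1 holds the directly dependent objects, and deeper levels hold the downstream objects that would need to be changed, tested, and verified.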


Figure 7.9 SAP PowerDesigner Impact Analysis Diagram (Customer data traced through the Order Management process model, the Corporate Conceptual Data Model, the Order Management Relational Logical Data Model, and the MS SQL and Oracle 11g physical data models)

Impact analysis makes sure you won't forget that certain dependencies exist and will take them into consideration on each and every change request from business or technical stakeholders. Downstream, you can see what objects will need to be changed, tested, and verified based on this change. Because you know what databases, applications, and systems will be affected, you can get all the right people involved, and when the change is made to the operational systems, it's done in a way that minimizes any surprises and minimizes the risk of any unplanned downtime.

7.8 Comparing Models

Modeling is a great way to communicate and collaborate with different people on any complex project. To communicate effectively, it's not always practical to open a modeling tool, navigate through multiple models, and read screens. To help share information in any model, PowerDesigner has ways to analyze and report on that information and then share it with all nonmodelers in the enterprise.

Model comparison is used whenever changes are made to a model and the model is checked into the repository. To initiate a compare in PowerDesigner, open the model you want to compare, and select Tools > Compare from the menu. You must choose the other model to compare this one to, and select OK to run the comparison.

Figure 7.10 shows a typical Compare Models dialog for two CDMs. This comparison feature is also used when generating changes from one model into another when using Preserve Modifications. Compare Between can also be run at any time.

Figure 7.10 SAP PowerDesigner Compare Dialog

Model comparison is useful for several reasons. It's a great way to see if there are any similarities in models from completely different sources. It's also a great way to see what changes are made between two different versions of a model, or for understanding the gap between current and desired future state.
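The idea behind comparing two model versions, and then carrying only the detected changes into a generated model while leaving its other edits alone, can be sketched with plain dictionaries. This is a deliberately simplified illustration and not PowerDesigner's compare or merge algorithm; the model contents are hypothetical, and a real generation would also translate logical data types into physical ones.

```python
def diff_models(old, new):
    """Compare two model versions given as {object_name: definition} dicts."""
    return {
        "added": {k: new[k] for k in new.keys() - old.keys()},
        "removed": sorted(old.keys() - new.keys()),
        "changed": {k: new[k] for k in old.keys() & new.keys() if old[k] != new[k]},
    }


def apply_preserving(target, changes):
    """Apply source-side changes to a target, leaving its other edits alone."""
    merged = dict(target)
    merged.update(changes["added"])
    merged.update(changes["changed"])
    for name in changes["removed"]:
        merged.pop(name, None)
    return merged


# Hypothetical models: object name -> definition.
ldm_v1 = {"customer_id": "Identifier", "name": "Name"}
ldm_v2 = {"customer_id": "Identifier", "name": "Name", "phone": "Phone"}
pdm = {"customer_id": "int", "name": "varchar(80)", "idx_name": "INDEX(name)"}
```

Here `diff_models(ldm_v1, ldm_v2)` reports only the added `phone` attribute, so `apply_preserving(pdm, ...)` adds it to the physical model while the physical data types and the index that exist only in the PDM stay untouched.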


Options allow you to narrow the scope of the compare by excluding comments, data types, or other elements. You can force a compare between two objects that were not found to be the same by using the Manual Synchronization function.

Yellow and red flags indicate differences, bold and grayed out indicate presence
and absence of whole objects, and the detailed compare window at the bottom
shows the exact difference. The compare preview allows you to save the compar-
ison as a Microsoft Excel spreadsheet for further analysis.

7.9 Summary
In this chapter, you learned that using PowerDesigner as an integral part of the SAP EIM solution gives you the power to successfully navigate the pitfalls of business transformation. PowerDesigner provides the right tools to manage information as a corporate asset today and into the future. PowerDesigner's unique integration into the SAP landscape means designs in the models can easily translate directly to physical artifacts in databases, data movement, and reporting technologies.

In the next chapter, we'll discuss SAP HANA Cloud Integration capabilities to connect databases and applications on-premise and in the cloud.

Contents

Introduction ..................................................................................................... 17

PART I SAP's Enterprise Information Management Strategy and Portfolio

1 Introducing Enterprise Information Management ................... 25


1.1 Defining Enterprise Information Management ............................... 25
1.1.1 Example of Information Flow through a Company ............ 28
1.1.2 Types of Information Included in Enterprise
Information Management ................................................. 31
1.2 Common Use Cases for EIM .......................................................... 33
1.2.1 EIM for Operational Initiatives ......................................... 33
1.2.2 EIM for Analytical Use Cases ............................................ 35
1.2.3 EIM for Information Governance ...................................... 36
1.3 Common Drivers for EIM ............................................................... 36
1.3.1 Operational Efficiency as a Driver of EIM .......................... 37
1.3.2 Information as an Organizational Asset ............................ 39
1.3.3 Compliance as a Driver of EIM ......................................... 40
1.4 Impact of Big Data on EIM ............................................................ 41
1.5 SAP's Strategy for EIM ................................................................... 43
1.6 Typical User Roles in EIM .............................................................. 44
1.7 Example Company: NeedsEIM Inc. ................................................ 45
1.7.1 CFO Issues ....................................................................... 46
1.7.2 Purchasing Issues ............................................................. 47
1.7.3 Sales Issues ...................................................................... 47
1.7.4 Engineering and Contracts Issues ...................................... 47
1.7.5 Information Management Challenges
Facing NeedsEIM Inc. ...................................................... 47
1.8 Summary ....................................................................................... 48

2 Introducing Information Governance ........................................ 49


2.1 Introduction to Information Governance ....................................... 50
2.2 Evaluating and Developing Your Information Governance Needs
and Resources ............................................................................... 52
2.2.1 Evaluating Information Governance .................................. 53
2.2.2 Developing Information Governance ................................ 58


2.3 Optimizing Existing Infrastructure and Resources ........................... 59
2.4 Establishing an Information Governance Process: Examples ........... 60
2.4.1 Example 1: Creating a New Reseller ................................. 62
2.4.2 Example 2: Supplier Registration ...................................... 63
2.4.3 Example 3: Data Migration ............................................... 66
2.5 Rounding Out Your Information Governance Process .................... 70
2.5.1 The Impact of Missing Data .............................................. 70
2.5.2 Gathering Metrics and KPIs to Show Success .................... 72
2.5.3 Establish a Before-and-After View .................................... 76
2.6 Summary ....................................................................................... 76

3 Big Data with SAP HANA, Hadoop, and EIM ........................... 77

3.1 SAP HANA .................................................................................... 77
3.1.1 Business Benefits of SAP HANA ........................................ 78
3.1.2 Basics of SAP HANA ......................................................... 81
3.1.3 SAP HANA Components and Architecture ........................ 82
3.1.4 SAP HANA for Analytics and Business Intelligence ............ 85
3.1.5 SAP HANA as an Application Platform .............................. 86
3.1.6 SAP Business Suite on SAP HANA .................................... 86
3.1.7 SAP HANA and the Cloud ................................................ 87
3.2 SAP HANA and EIM ...................................................................... 89
3.2.1 Data Modeling for SAP HANA .......................................... 89
3.2.2 Data Provisioning for SAP HANA ...................................... 89
3.2.3 Data Quality for SAP HANA ............................................. 94
3.3 Big Data and Hadoop .................................................................... 96
3.3.1 The Rise of Hadoop .......................................................... 96
3.3.2 Introduction to Hadoop ................................................... 98
3.3.3 Hadoop 2.0 Architecture: HDFS, YARN, and MapReduce ............... 99
3.3.4 Hadoop Ecosystem ........................................................... 101
3.3.5 Enterprise Use Cases ........................................................ 105
3.3.6 Hadoop in the Enterprise: The Bottom Line ...................... 107
3.4 SAP HANA and Hadoop ................................................................ 109
3.4.1 The V's: Volume, Variety, Velocity ................................... 109
3.4.2 SAP HANA: Designed for Enterprises ................................ 109
3.4.3 Hadoop as an SAP HANA Extension ................................. 109
3.5 EIM and Hadoop ........................................................................... 110
3.5.1 ETL: Data Services and the Information Design Tool ......... 111
3.5.2 Unsupported: Information Governance and Information Lifecycle Management ...... 111
3.6 Summary ....................................................................................... 112

4 SAP's Solutions for Enterprise Information Management ....... 113

4.1 SAP PowerDesigner ....................................................................... 115
4.2 SAP HANA Cloud Integration ........................................................ 118
4.2.1 SAP HANA Cloud Integration for Process Integration ....... 119
4.2.2 SAP HANA Cloud Integration for Data Services ................ 120
4.3 SAP Data Services .......................................................................... 120
4.3.1 Basics of SAP Data Services .............................................. 121
4.3.2 SAP Data Services Integration with SAP Applications ....... 123
4.3.3 SAP Data Services Integration with Non-SAP Applications ...................... 127
4.3.4 Data Cleansing and Data Validation with SAP Data Services .................... 128
4.3.5 Text Data Processing in SAP Data Services ........................ 130
4.4 SAP Replication Server .................................................................. 133
4.4.1 SAP Replication Server Use Cases ..................................... 133
4.4.2 Basics of SAP Replication Server ....................................... 134
4.4.3 Data Assurance ................................................................ 136
4.4.4 SAP Replication Server Integration with SAP Data Services and SAP PowerDesigner ............. 136
4.5 SAP Data Quality Management, Version for SAP Solutions ............ 137
4.6 SAP Information Steward ............................................................... 139
4.6.1 Data Profiling and Data Quality Monitoring ..................... 141
4.6.2 Cleansing Rules ................................................................ 143
4.6.3 Match Review .................................................................. 146
4.6.4 Metadata Analysis ............................................................ 147
4.6.5 Business Term Glossary .................................................... 148
4.7 SAP NetWeaver Master Data Management and SAP Master Data Governance ........................ 149
4.7.1 SAP NetWeaver Master Data Management ...................... 150
4.7.2 SAP Master Data Governance ........................................... 151
4.8 SAP Solutions for Enterprise Content Management ........................ 154
4.8.1 Overview of SAP's ECM Solutions .................................... 156
4.8.2 SAP Extended Enterprise Content Management by OpenText .................... 160
4.8.3 SAP Document Access by OpenText and SAP Archiving by OpenText .............. 164
4.9 SAP Information Lifecycle Management ......................................... 165
4.9.1 Retention Management .................................................... 169
4.9.2 System Decommissioning ................................................. 170


4.10 Information Governance in SAP ..................................................... 173 6.1.9 Role of the Enterprise Information Architecture
4.10.1 Information Governance Use Scenario Phasing ................. 174 Organization .................................................................... 228
4.10.2 Technology Enablers for Information Governance ............. 176 6.2 Managing Data Migration Projects to Support Mergers and
4.11 NeedsEIM Inc. and SAP's Solutions for EIM ................................... 179 Acquisitions ................................................................................... 228
4.12 Summary ....................................................................................... 181 6.2.1 Scoping for a Data Migration Project ................................ 229
6.2.2 Data Migration Process Flow ............................................ 231
6.2.3 Enrich the Data Using Dun and Bradstreet (D&B) with Data Services .................................................................... 236
5 Rapid-Deployment Solutions for Enterprise Information Management ............................................................................. 183
6.3 Evolution of SAP Data Services at National Vision ......................... 236
5.1 Rapid-Deployment Solutions for Data Migration ........................... 184 6.3.1 Phase 1: The Enterprise Data Warehouse ......................... 236
5.1.1 Introduction to Data Migration ........................................ 185 6.3.2 Phase 2: Enterprise Information Architecture
5.1.2 Data Migration Rapid-Deployment Content ..................... 187 Consolidating Source Data ............................................... 238
5.1.3 Getting Started with Rapid Data Migration 6.3.3 Phase 3: Data Quality and the Customer Hub ................... 239
Rapid-Deployment Content .............................................. 189 6.3.4 Phase 4: Application Integration and Data Migration ....... 242
5.1.4 SAP Accelerator for Data Migration by 6.3.5 Phase 5: Next Steps with Data Services ............................ 242
BackOffice Associates ....................................................... 196 6.4 Recommendations for a Master Data Program ............................... 243
5.2 Rapid-Deployment Solutions for Information Steward ................... 197 6.4.1 Common Enterprise Vision and Goals ............................... 243
5.2.1 Information Steward Rapid-Deployment Solution 6.4.2 Master Data Strategy ........................................................ 243
Content ............................................................................ 198 6.4.3 Roadmap and Operational Phases .................................... 244
5.2.2 Getting Started with Information Steward 6.4.4 Business Process Redesign and Change Management ....... 244
Rapid-Deployment Solution Content ................................ 201 6.4.5 Governance ...................................................................... 244
5.3 Rapid-Deployment Solutions for Master Data Governance ............. 203 6.4.6 Technology Selection ....................................................... 245
5.3.1 Master Data Governance Rapid-Deployment 6.5 Recommendations for Using SAP Process Integration and
Solution Content .............................................................. 204 SAP Data Services .......................................................................... 246
5.3.2 Getting Started with SAP Master Data Governance 6.5.1 A Common Data Integration Problem .............................. 246
Rapid-Deployment Solution Content ................................ 206 6.5.2 A Data Integration Analogy .............................................. 247
5.4 Summary ....................................................................................... 207 6.5.3 Creating Prescriptive Guidance to Help Choose
the Proper Tool ................................................................ 248
6.5.4 Complex Examples in the Enterprise ................................. 249
6 Practical Examples of EIM ........................................................ 209 6.5.5 When All Else Fails ....................................................... 250
6.1 EIM Architecture Recommendations and Experiences by Procter and Gamble ....................................................................... 209
6.6 Ensuring a Successful Enterprise Content Management Project by Belgian Railways ........................................................... 251
6.6.1 Building the Business Case ............................................... 251
6.1.1 Principles of an EIM Architecture ..................................... 210
6.6.2 Key Success Factors for Your SAP Extended Enterprise Content Management by OpenText Project ...................... 257
6.1.2 Scope of an EIM Enterprise Architecture .......................... 212
6.1.3 Structured Data ................................................................ 213
6.7 Recommendations for Creating an Archiving Strategy .................... 261
6.1.4 The Dual Database Approach ........................................... 214
6.7.1 What Drives a Company into Starting a Data Archiving Project? ............................................................ 261
6.1.5 Typical Information Lifecycle ............................................ 216
6.1.6 Data Standards ................................................................. 220
6.7.2 Who Initiates a Data Archiving Project? ........................... 262
6.1.7 Unstructured Data ............................................................ 221
6.7.3 Project Sponsorship .......................................................... 263
6.1.8 Governance ...................................................................... 223
6.8 Summary ....................................................................................... 266


PART II Working with SAP's Enterprise Information Management 8.2.3 Setting Up Your HCI Tenant ............................................. 299
Solutions 8.2.4 Setting Up Your Datastore ................................................ 300
8.2.5 Creating a New Project .................................................... 301
7 SAP PowerDesigner ................................................................... 269 8.2.6 Moving a Task from a Sandbox to a Production
Environment .................................................................... 304
7.1 SAP PowerDesigner in the SAP Landscape ..................................... 270
8.3 Summary ....................................................................................... 305
7.1.1 SAP Business Suite ........................................................... 270
7.1.2 SAP HANA Cloud Platform ............................................... 270
7.1.3 SAP Information Steward, SAP BusinessObjects 9 SAP Data Services ..................................................................... 307
Universes, and Replication ............................................... 270
9.1 Data Integration Scenarios ............................................................. 307
7.2 Defining and Describing Business Information with the Enterprise Glossary ........................................................................ 271
9.2 SAP Data Services Platform Architecture ........................................ 309
9.2.1 User Interface Tier ............................................................ 310
7.2.1 Glossary Terms for Naming Standards Enforcement .......... 272
9.2.2 Server Tier ........................................................................ 313
7.2.2 Naming Standards Definitions .......................................... 273
9.3 SAP Data Services Designer Overview ............................................ 314
7.3 The Conceptual Data Model .......................................................... 273 9.4 Creating Data Sources and Targets ................................................. 318
7.3.1 Conceptual Data Elements, Attributes, and Data Items ..... 274 9.4.1 Connectivity Options for SAP Data Services ...................... 318
7.3.2 Separation of Domains, Data Items, and Entity 9.4.2 Connecting to SAP ........................................................... 321
Attributes ......................................................................... 275 9.4.3 Connecting to Hadoop ..................................................... 323
7.3.3 Entity Relationships .......................................................... 275 9.5 Creating Your First Job .................................................................. 324
7.3.4 Best Practices for Building and Maintaining an 9.5.1 Create the Data Flow ....................................................... 324
Enterprise CDM ............................................................... 276 9.5.2 Add a Source to the Data Flow ......................................... 325
7.4 Detailing Information Systems with Logical and Physical 9.5.3 Add a Query Transform to the Data Flow ......................... 325
Data Models .................................................................................. 278 9.5.4 Add a Target to the Data Flow ......................................... 325
7.4.1 Scope ............................................................................... 278 9.5.5 Map the Source Data to the Target by Configuring
7.4.2 Structure and Technical Considerations ............................ 279 the Query Transform ........................................................ 326
7.5 Canonical Data Models, XML Structures, and Other Datastores ..... 280 9.5.6 Create the Job and Add the Data Flow to the Job ............. 327
7.6 Data Warehouse Modeling: Movement and Reporting .................. 282 9.6 Basic Transformations Using the Query Transform and Functions ... 327
7.7 Link and Sync for Impact Analysis and Change Management ......... 284 9.7 Overview of Complex Transformations .......................................... 330
7.7.1 Link and Sync Technology ................................................ 284 9.7.1 Platform Transformations ................................................. 330
7.7.2 Impact Analysis Reporting ................................................ 287 9.7.2 Data Integrator Transforms ............................................... 332
7.8 Comparing Models ........................................................................ 288 9.8 Executing and Debugging Your Job ............................................... 336
7.9 Summary ....................................................................................... 290 9.9 Exposing a Real-Time Service ......................................................... 337
9.9.1 Create a Real-Time Job ..................................................... 338
9.9.2 Create a Real-Time Service ............................................... 340
8 SAP HANA Cloud Integration .................................................... 291 9.9.3 Expose the Real-Time Service as a Web Service ................ 342
9.10 Data Quality Management ............................................................ 343
8.1 SAP HANA Cloud Integration Architecture .................................... 292 9.10.1 Data Cleansing ................................................................. 345
8.1.1 SAP HANA Cloud Platform ............................................... 294 9.10.2 Data Enhancement ........................................................... 366
8.1.2 Customer Environment On-Premise .................................. 294 9.10.3 Data Matching ................................................................. 369
8.1.3 SAP HANA Cloud Integration User Experience ................. 295 9.10.4 Using Data Quality beyond Customer Data ...................... 386
8.2 Getting Started with SAP HANA Cloud Integration ........................ 297 9.11 Text Data Processing ..................................................................... 388
8.2.1 Blueprinting Phase ........................................................... 297 9.11.1 Introduction to Text Data Processing Capabilities in
8.2.2 Predefined Templates ....................................................... 298 SAP Data Services ............................................................ 389


9.11.2 Entity Extraction Transform Overview ............................... 391


9.11.3 How Extraction Works ...................................................... 392 11 SAP Master Data Governance ................................................... 467
9.11.4 Text Data Processing and NeedsEIM Inc. .......................... 394 11.1 SAP Master Data Governance Overview ........................................ 468
9.11.5 NeedsEIM Inc. Pain Points ............................................... 394 11.1.1 Deployment Options ........................................................ 470
9.11.6 Using the Entity Extraction Transform ............................... 396 11.1.2 Change Request and Staging ............................................ 471
9.12 Summary ....................................................................................... 403 11.1.3 Process Flow in SAP Master Data Governance .................. 473
11.1.4 Use of SAP HANA in SAP MDG ........................................ 475
10 SAP Information Steward .......................................................... 405 11.2 Getting Started with SAP Master Data Governance ........................ 476
11.2.1 Data Modeling ................................................................. 476
10.1 Cataloging Data Assets and Their Relationships ............................. 406 11.2.2 User Interface Modeling ................................................... 478
10.1.1 Configuring a Metadata Integrator Source ........................ 407 11.2.3 Data Quality and Search ................................................... 478
10.1.2 Executing or Scheduling Execution of Metadata 11.2.4 Process Modeling ............................................................. 480
Integration ....................................................................... 409 11.2.5 Data Replication .............................................................. 481
10.2 Establishing a Business Term Glossary ............................................ 410 11.2.6 Key and Value Mapping ................................................... 481
10.3 Profiling Data ................................................................................ 413 11.2.7 Data Transfer ................................................................... 483
10.3.1 Configuration and Setup of Connections and Projects ....... 414 11.2.8 Activities beyond Customizing .......................................... 483
10.3.2 Getting Basic Statistical Information about the 11.3 Governance for Custom-Defined Objects: Example ........................ 484
Data Content ................................................................... 417 11.3.1 Plan and Create Data Model ............................................ 484
10.3.3 Identifying Cross-Field or Cross-Column 11.3.2 Define User Interface ....................................................... 489
Data Relationships ........................................................... 422 11.3.3 Create a Change Request Process ..................................... 494
10.4 Assessing the Quality of Your Data ................................................ 425 11.3.4 Assign Processors to the Workflow ................................... 495
10.4.1 Defining Validation Rules Representing Business 11.3.5 Test the New Airline Change Request User Interface ........ 496
Requirements ................................................................... 427 11.4 Rules-Based Workflows in SAP Master Data Governance ............... 497
10.4.2 Binding Rules to Data Sources for Data Quality 11.4.1 Classic Workflow and Rules-Based Workflow Using
Assessment ...................................................................... 431 SAP Business Workflow and BRFplus ................................ 498
10.4.3 Executing Rule Tasks and Viewing Results ........................ 433 11.4.2 Designing Your First Rules-Based Workflow in
10.5 Monitoring with Data Quality Scorecards ...................................... 437 SAP Master Data Governance ........................................... 505
10.5.1 Components of a Data Quality Scorecard ......................... 439 11.5 NeedsEIM Inc.: Master Data Remediation ..................................... 508
10.5.2 Defining and Setting Up a Data Quality Scorecard ............ 441 11.6 Summary ....................................................................................... 511
10.5.3 Viewing the Data Quality Scorecard ................................. 448
10.5.4 Identifying Data Quality Impact and Root Cause .............. 452
10.5.5 Performing Business Value Analysis .................................. 454 12 SAP Information Lifecycle Management ................................... 513
10.6 Quick Starting Data Quality ........................................................... 461 12.1 The Basics of Information Lifecycle Management ........................... 515
10.6.1 Assess the Data Using Column, Advanced, and 12.1.1 External Drivers ................................................................ 516
Content Type Profiling ...................................................... 462 12.1.2 Internal Drivers ................................................................ 516
10.6.2 Receive Validation and Cleansing Rule 12.2 Overview of SAP Information Lifecycle Management ..................... 516
Recommendations ............................................................ 462 12.2.1 Cornerstones of SAP ILM ................................................. 517
10.6.3 Tune the Cleansing and Matching Rules 12.2.2 Data Archiving Basics ....................................................... 518
Using Data Cleansing Advisor ........................................... 464 12.2.3 ILM-Aware Storage .......................................................... 523
10.6.4 Publish the Cleansing Solution ......................................... 465 12.2.4 Architecture Required to Run SAP ILM ............................ 527
10.7 Summary ....................................................................................... 465


12.3 Managing the Lifecycle of Information in Live Systems .................. 529


12.3.1 Audit Area ....................................................................... 529
12.3.2 Data Destruction .............................................................. 532
12.3.3 Legal Hold Management .................................................. 532
12.4 Managing the Lifecycle of Information from Legacy Systems .......... 534
12.4.1 Preliminary Steps .............................................................. 534
12.4.2 Steps Performed in the Legacy System .............................. 536
12.4.3 Steps Performed in the Retention Warehouse System ....... 537
12.4.4 Handling Data from Non-SAP Systems During
Decommissioning ............................................................. 539
12.4.5 Streamlined System Decommissioning and Reporting ....... 539
12.5 System Decommissioning: Detailed Example ................................. 542
12.5.1 Data Extraction ................................................................ 543
12.5.2 Data Transfer and Conversion ........................................... 548
12.5.3 Reporting ......................................................................... 555
12.5.4 Data Destruction .............................................................. 559
12.6 Summary ....................................................................................... 562

13 SAP Extended Enterprise Content Management by OpenText ................................................................................... 563
13.1 Capabilities of SAP Extended ECM ................................................. 565
13.1.1 Data and Document Archiving ......................................... 566
13.1.2 Records Management ...................................................... 567
13.1.3 Content Access ................................................................. 568
13.1.4 Document-Centric Workflow ............................................ 568
13.1.5 Document Management ................................................... 568
13.1.6 Capture ............................................................................ 569
13.1.7 Collaboration and Social Media ........................................ 569
13.2 How SAP Extended ECM Works with the SAP Business Suite ......... 570
13.3 Integration Content for SAP Business Suite and
SAP Extended ECM ....................................................................... 572
13.3.1 SAP ArchiveLink ............................................................... 572
13.3.2 Content Management Interoperability Standard and
SAP ECM Integration Layer .............................................. 574
13.3.3 SAP Extended ECM Workspaces ....................................... 575
13.4 Summary ....................................................................................... 582

The Authors ..................................................................................................... 583


Index................................................................................................................. 591

Index

A Archiving object
definition, 519
Accelerated reporting, 542 SAP ILM-enabled, 527, 548
Access server, 313 specific customizing, 554
Active area, 473 work center, 520
Address Assessment, 59
cleansing, 125, 128 Asynchronous replication, 135
cleansing/enhancement, 137 Atomic data, 218
correction, 366 Attributes, 274
directories, 362, 363 Audit area
information, 55 definition, 529
parsing, 353 demo, 530
profiling, 423 product liability, 530
validated, 137 set up, 548
Address cleanse, 350, 352 tax, 530
transform, 344 Audit package
Advanced profiling, 422 create, 556
results, 424 extract to BI, 557
AIS, 559 Auditing, 172
Alias, 272 Automated electronic discovery, 169
All-world address directory, 362
Ambari, 104
Analytical use, 309 B
APIs, 337
Application architecture, 117 BAdI, 469
Application integration, 242 change UI for entity type, 478
Application link enabling (ALE), 469, 481 BAPI, 321
Architecture, 209 Bar codes, 573
retention management, 527, 528 Best practices methodology, 183
system decommissioning, 528, 529 Best record strategy, 385
Archival data, 220, 223 BI, 175
Archive, 541, 565 Big data, 41, 42, 43
file, 526 processing and analysis, 323
hierarchy, 525 SAP HANA vs Hadoop, 109
index, 526 Binding, 427, 431, 447, 454
Archive administration data Blueprint, 297
transfer, 552 Break group, 373, 376
Archive Development Kit (ADK), 519 BRFplus, 179, 468, 480, 497, 498, 499, 501
Archive Management, 553, 554 custom validations, 474
Archiving, 165 single value decision table, 502, 503
object, 171, 173 user agent decision table, 502
policies, 176 Bulk data load, 120
scope, 264 Business Address Services, 123, 137
strategy, 261 Business efficiency, 154
using SAP HANA, 262 Business glossary, 115, 148


Business intelligence (see Cluster, 98 Data (Cont.) Data dictionary


BI), 147, 369, 370 CMC, 408 elements, 345 data items, 274
Business process descriptions, 149 connections, 414 enhancement, 366, 372 Data enrichment, 128
Business process manager, 45 internal scheduler, 434 integrated, 218 Data extract browser, 546
Business process owner, 45 set up Data Insight project, 416 integration, 50 Data flow, 238, 317, 322, 335, 385, 396, 397,
Business rule, 232, 425 CMIS, 574 item, 274 400
Business Rules Framework, 468 Collaboration, 566, 569 lineage, 36, 406, 410 add query transform, 325
Business term glossary, 140, 405, 410 Compliance, 37, 60, 62 loading, 127 add source, 325
Business term taxonomy, 406 monitoring, 179 management, 224, 225 add target, 325
Business user, 142 requirements, 65 move/synchronize across enterprise, 133 add to job, 327
Business Value Analysis, 454 Conceptual data model (CDM), 116, 273 owner, 44 create, 303, 324, 339
Business-complete data, 548 Condition alias, 502 parsing, 356, 387 define, 311
Business-incomplete data, 548 Consolidate, 369 planning, 297 example, 302
Content, 32 policies, 51 GUI, 302
access, 565, 568 profiling, 232, 240 move to production, 304
C Content Data Extractor tool, 539 quality, 34, 39, 40 Data governance, 410
Content Management Interoperability Services see CMIS real-time replication, 92 Data Insight, 406, 409, 414
Canonical Data Model, 281 retention, 217 Data Insight project, 415, 437, 439, 448, 454
Capture, 566, 569 Context data, 548 source, 71, 447 add table/file, 416
Data dictionary, 215 CDE, 535, 545
CDE, 535, 545 Context information, 545 standardization, 128, 356 set up, 415
archive data, 545 Correction, 345 standardized, 359, 360
set up connection, 415
extraction services, 548 CRM, 124, 137, 159 standards, 54, 220
Data integration, 185, 246, 307, 312, 389
Centers of excellence, 54 content management, 159 steward, 45
bulk data load, 291
Central Management Console see CMC Cross-domain dependency, 284 stewardship, 209
cloud to cloud, 291
Central Management Server see CMS Culture dimension, 54 synchronization, 127, 248
on-premise to cloud, 291
cFolders, 159 Custom extraction rules, 392 transfer, 483
scenarios, 307
Change management, 244, 284 Customer complaints, 253 transformation, 327
Data integrator transform, 332
Change request, 472, 505, 510 Customer information, 59 validation, 127, 128, 312
Data lifecycle, 520
create, 510 Customer relationship management, 62 Data archiving, 167, 168, 169, 173, 256, 261,
Data load, 228
create process, 494 263, 265, 518, 566
Data mart, 126, 210
process, 468 basics, 518
Data migration, 50, 66, 121, 124, 127, 176,
type, 499, 506 D process, 522
231, 242, 308, 369
UI, 496 Data Assurance, 136
activities, 188
Checksum function, 551 DART Browser, 546 Data cleanse, 345, 353, 356, 366, 378
business rules, 232
Cleansing, 232 Data data correction, 362
data standardization, 356 content, 185, 195
package, 360, 378 administrator, 221
data validation, 364 data enrichment, 236
process, 352 analysis, 80
standardization, 358 process flow, 231
rule, 140, 405, 406 analyst, 45, 97, 103, 410, 413, 415, 416
architect, 233 transform, 143, 344 rapid deployment, 184
Cleansing Package Builder, 143, 144, 387 reasons for, 185
assessment, 34 Data cleansing, 121, 127, 231
Cloud, 118 scope, 229
cleansing, 68 Data Cleansing Advisor, 464
applications, 118 Data model
consolidation, 383 Data destruction, 532, 559
bulk data load, 120 activate, 488
correction, 128, 356, 362 in the live database, 532
rapid-deployment, 183 create, 485
distribution, 224 in the retention warehouse, 559
real-time data access, 120 plan, 485
domain, 174, 442, 450 security considerations, 561
SAP HANA database, 292
domains, 55


Data modeling, 407, 408, 476 Data quality scorecard (Cont.) Dual database strategy, 214 Entity relationship, 275
SAP HANA, 89 view, 448 Dun and Bradstreet, 29, 63, 236 define in PowerDesigner, 276
Data movement model (DMM), 116 view Business Value Analysis, 459 Duplicate check, 479 Entity type, 476, 485
Data profiling, 122, 140, 141, 346, 372, 405, Data replication, 481 Duplicate checking, 137 choose for business object, 488
413, 428, 432 framework, 469 create, 485
basic, 419, 422 Data Services relationships, 477
create validation rule, 427 connect to Hadoop, 323 E ERP, 137, 179
project, 415 connect to SAP BW, 322 ETL, 110, 120, 126, 127, 217, 235, 247, 248,
set up task, 417 connect to SAP ERP, 321 Easy Document Management, 159 307, 406, 407, 408, 438, 452, 539
Data provisioning, 84 connect to SAP HANA, 323 ECM, 43, 50, 113, 154, 156, 162, 165, 251 Executive sponsor, 55
SAP HANA, 90 server tier components, 313 integrated, 563 Extract, Transform, Load, 541
Data quality, 67, 68, 127, 131, 132, 138, 139, Data services connectivity, 292 integration layer, 574 Extraction, transformation, and loading see
148, 174, 176, 181, 186, 225, 226, 232, Data Services Workbench, 91 workspace, 575, 576, 578 ETL
239, 240, 307, 312, 378, 387, 389, 392, Data steward, 68, 72, 142, 143, 144, 178, 438, ECMLink, 575 Extractors, 321
414, 436, 437, 451, 452, 453, 465, 478, 443, 448, 502, 507 Editions, 477
479, 509 UI, 311 EIM, 25, 27, 28, 36, 40, 43, 69
assessment, 431 Data warehouse, 126, 214, 218, 308, 333 architecture recommendations, 209 F
dashboard, 435 governance, 220 Hadoop, 110
levels, 51 modeling, 282 strategy, 43 Fact object, 283
management, 343 Database administrators, 262 with SAP HANA, 89 Failed record, 450
measurement, 143 Databases, 126 E-mail Response Management System, 159 Family match, 373
metrics, 226 Datastore, 318, 396 Emails, 563 File source, 320
monitor, 427 create, 300, 319 Enterprise Financial impact, 456
monitoring, 140, 141, 406, 452 import tables, 300 application integration, 217 Financial master data, 152
process, 130 Decision tables, 499 search, 160 Floor Plan Manager (FPM), 489
requirements, 225, 405 Decommission, 26 services, 469 Flume, 102
root-cause analysis, 453 Decommissioning, 168, 539 workspace, 159 Form UIBB, 491
score, 431, 435, 437 De-duplication, 125, 231, 234, 235, 240, 365, Enterprise CDM, 273
scorecard, 426, 442, 446, 448, 452 370, 372 best practices, 276
scores, 448 Demographic data, 368 concepts, 274 G
telephone patterns, 241 Dependency profiling, 423 obsolete definition, 278
Data Quality Advisor, 145, 461 Derivation, 478 versions, 277 Generic object services, 582
Data quality dimension, 438, 443, 445, 448, Digital asset management, 155 Enterprise Content Management see ECM Geo directories, 368
450 Dimension object, 283 Enterprise data warehouse, 236, 344 Geocode, 240, 241
accuracy, 443, 445 Dimensional model, 283 Enterprise glossary, 271 Geocoding, 367, 370
completeness, 443 Direct linkage, 453 naming standards, 272 Geolocation, 219
conformity, 443, 446, 448 Direct marketing, 370 synonyms, 277 Geospatial, 366
consistency, 444 Discovery, 59 Enterprise information architecture, 269 data, 367
integrity, 444 Discrete format, 347 consolidate source data, 238 Global address cleanse, 350, 353, 357
timeliness, 444 Document, 563 details, 210 parse data, 350
uniqueness, 444 archiving, 565, 566 role of, 228 Global address cleansing, 423, 425
Data quality scorecard, 141, 437 management, 155, 566, 568 scope, 212 Global data manager, 45
bind data sources to, 447 Document-centric workflow, 566, 568 Enterprise Information Management see Global standards, 41
components, 439 Domain, 274 EIM Governmental regulations, 40
drill into details, 449 Drawing management, 252 Entity, 274, 391 Governmental standards, 55
key data domain, 442 DRF, 469 attribute, 274 Grammatical parsing, 390
tile, 439 DSO, 557 data item, 274 GS1, 144
extraction, 391, 397 Guidelines, 227


H HCI Agent, 294 Information governance (Cont.) L


HCP customized, 153
Hadoop, 42, 96, 100, 323 service layers, 292 develop, 58 LDM
as SAP HANA extension, 109 Hive, 102 establish process, 60 structure definition, 279
bulk data transfer, 101 evaluate, 53 Legacy System Migration Workbench
cluster, 98 framework, 69 (LSMW), 186
collect logs, 102 I preventative, 61 Legacy systems, 172
common use, 105 technology enablers, 177 extract data, 173
ecosystem, 101 IDEA, 559 Information lifecycle management also see Legal compliance, 261
HBase, 103 IDoc, 316, 321, 469, 481, 483 SAP ILM Legal hold, 34, 170, 179
HDFS, 99 ILM, 165, 168, 210 Information lifecycle management see ILM management, 169
Hive, 102, 108, 109 definition, 515 Information management, 220 setting, 533
in the enterprise, 107 drivers and pain points, 515 scope, 212 Legal hold management
introduction, 98 external drivers, 516 Information management strategy, 37, 154, overview, 532
machine learning libraries, 104 for legacy data, 534 165, 166 Legal requirements, 51, 168
Mahout, 104 in live systems, 529 Information platform services (IPS), 313 Lifecycle, 27
MapReduce, 99, 106 internal drivers, 516 Information Steward Link and Sync, 284
master node, 98 work centers, 519 Data Insight module, 145 parts, 284
online archive, 106 ILM object metadata, 140 technology, 269
Pig, 103, 106, 108 definition, 519 Metapedia, 410 Linking, 284
SAP HANA, 109 ILM-aware storage, 523 In-memory cloud platform, 292 Local reporting, 555, 559
scripting, 103 system, 523 In-memory computing, 78 Logical data model (LDM), 278
SQL interface, 102 ILM-BC 3.0 Integrated ECM, 563
strengths and weaknesses, 108 certification, 523 Integration flow
Images, 32 web-based UI, 291
Tez, 100
Impact analysis, 228, 284, 410, 453 Integration Platform as a Service (iPaaS), 118
M
worker node, 98
Hadoop Distributed File System (HDFS), 99 reporting, 287 Intelligent Driver Assistant, 254 Maintenance notification, 162
HANA Cloud Integration for data integration, Implementation methodologies, 183 iPaaS, 93 Management reporting, 214
120 Individual match, 373 IPS, 313
Management reporting and analytics, 217
Industry standards, 34 IRM, 530
HCI Managing content, 32
InfoCube, 557 IT administrator, 414, 416
blueprinting phase, 297 Manual rule binding, 431
connectivity, 291 Information
Map reduce, 100
create project, 302 access, 51
Mapping, 326
create task, 301 discovery, 50, 76, 175 J Mapping Editor, 282
lifecycle, 216
data flow editor, 296 MapReduce
platform services, 312, 314 Java Message Service, 316
datastore, 300 text data processing, 323
policies, 52 Job
define data extraction, 294 Master data, 29, 34, 37, 64, 68, 149, 151, 176,
retention manager, 530 create, 324, 327, 340
integration steps, 297 177, 181, 215, 216, 477
security, 209, 227 execute/debug, 336
logs, 305 consolidation, 150
strategy, 54 real-time, 338
on-premise component (HCI Agent), 294 customer, 152
Information asset
Predefined template, 298 export, import, convert, 483
Data Quality Advisor, 145
set up prerequisites, 299 harmonization, 150
set up tenant, 299
Information governance, 26, 28, 33, 49, 52, K manage centrally, 224
55, 58, 63, 67, 68, 76, 108, 110, 114, 139,
set up user roles, 299 management, 213, 308, 370
173, 177, 189, 209, 223, 225, 232, 239, Key mapping, 481, 482
transform type, 303 material, 153
244, 260 Key words, 412
tutorial, 297 program recommendations, 243
committee, 68 Knowledge worker, 72
user experience, 295 strategy, 243
council, 67 KPI, 73


Master data governance, 499 Monitoring, 437 Physical data models, 116 Rapid-deployment solutions
application framework, 469 Multiline data, 349 Pig, 103 data migration, 184
Master record, 374 Multiline format, 347 scripts, 323 Information Steward, 197
Match, 369 Multiline hybrid format, 347 Platform transformation, 330 SAP MDG, 203
comparison options, 373 PLM, 162 Real-time data replication and synchronization, 133
configuration, 374 Point-of-interest, 368
criteria, 373, 380 N Policies, 227 Real-time service, 340
group, 374, 382, 384 Policy expose as web service, 342
level, 373 No-match thresholds, 383 define, 549 Records management, 155, 179, 261, 565, 567
performance, 377 Nondiscrete data components, 349 definition, 178 Redundancy profiling, 423
scenario, 373 Nonparty data, 387 engine, 169 Reference data, 213, 215
score, 381 Nonrelational data, 109 implementation, 178 Regulatory compliance, 40
set, 373 Non-SAP systems, 534 set status to live, 550 Replication Server
standards, 378 NoSQL, 96, 103 Policy category Data Assurance, 136
threshold, 374 residence rules, 549 Reporting
datastore, 103
Match Criteria Editor, 381 retention rules, 549 increase performance, 542
Match Editor, 374, 380 Portal Site Management, 157 local, 559
Match method PowerDesigner models, 116 Repository tier, 314
weighted scoring, 375 O Predefined template, 298 Requirements traceability matrix, 285
Match transform, 344 Predictive analytics, 52, 175 Residence time
OLAP, 78, 97, 126
Match Wizard, 374, 379, 380, 381 algorithm, 80 definition, 521
OLTP, 78, 97
Matching, 128, 129, 132 Pre-parsed data, 353 Retention, 176
On-premise
process, 368 Principle, 211 limits, 220
rapid-deployment, 183
routine, 240 Print list, 531 management, 168, 169
Oozie, 104
score, 479 retrieve, 524 policies, 28, 171, 173, 178, 261
Open hub, 322
standards, 361 Procedures, 227 time unit, 550
OpenText, 113, 156, 157, 565
strategy, 371, 372 Process modeling, 480 Retention management, 256, 518
OpenText Knowledge center, 580
techniques, 372 Procurement, 162 capabilities, 529
Operational analytics, 214
Matching method Product liability, 173 unstructured data, 531
Operational data, 215, 216
combination, 374, 376 Product lifecycle management (see Retention Management Cockpit
Operational efficiencies, 37, 39, 245
rule-based, 374 PLM), 159 Administrator, 537
Operational master data management, 150
weighted scoring, 374 Product quality, 394 Line of Business, 537
MDG communicator, 493 Operational reporting, 214
Operational use, 308 Profiling task, 418 Retention period
Mergers and acquisitions, 228, 516 view results, 421 definition, 521
Metadata, 147, 213, 215, 221, 252, 551 Optical character recognition (OCR), 254
Organizational change management, 175 Project, 302 maximum, 550
analysis, 140, 147 minimum, 550
apply, 222 Organizational ownership, 221
Output management, 155 Retention rules, 223
management, 139, 147, 452, 453
Metadata integration Output schema, 400 Q basics, 550
Retention warehouse, 170, 172, 542
execution, 409 set up, 173
Quality, 37
Metadata integrator, 407, 409 Retirement, 33
dimension, 440, 509
configure, 407 P Row data
Query transform, 325, 326, 327
Metadata management, 405, 406, 413
PaaS, 88 report discrepancies, 136
Metapedia, 148, 149, 410
Parallel processing architecture, 96 Rule binding, 441, 448, 450
synonym/keyword, 413
techniques, 411 Parsed data, 350, 353, 360 R Rule tasks
execute, 433
Migration, 28, 38 Parsed output, 352, 356
Rapid Data Migration Rule-based, 374
Missing data, 72 Parsing, 345
content, 189 Rules-based workflow, 496, 499
Model comparison, 289 Physical data model (PDM), 270, 278
Rapid Mart, 452 design, 505
Monitor, 139 structure definition, 279


S SAP Data Services (Cont.) SAP Data Services Designer, 310 SAP HANA (Cont.)
architecture, 309 SAP Digital Asset Management, 157 native advanced features, 85
SAP Accelerator for Data Migration by BackOffice Associates, 196 batch jobs, 316 SAP Document Access, 157, 164, 169 real-time trigger-based replication, 92
breakpoints, 336 SAP Document Access by OpenText, 514, 524 SAP Business Suite, 86
SAP ArchiveLink, 165, 169, 514, 572, 575 built-in functions, 327 SAP Document Presentment, 157 the cloud, 87
attachments, 551 call as external service, 480 SAP ECC, 124 with EIM, 89
documents, 524 central repository, 314 SAP Employee Management, 157 with SAP MDG, 475
SAP Archiving by OpenText, 157, 164, 165, cleansing transformation, 233 SAP Enterprise Asset Management, 162 XS server, 84
167, 169, 173, 514 CMC, 408 SAP Enterprise Portal, 157, 159, 161, 164, SAP HANA Cloud Integration (HCI), 93, 118
SAP Audit Format, 559 470, 483, 575
connect to file source, 320 SAP HANA Cloud Integration for process integration, 119
SAP Business Process Management, 61, 179 SAP ERP, 121, 123, 152, 170, 172, 173, 190
data enhancement, 366
SAP Business Suite, 153, 162, 252, 255, 571
data quality, 240 document access, 164 SAP HANA Cloud Platform (HCP), 118, 294
standard business processes, 258
data validation, 364 migrate data to, 193 SAP HANA Enterprise Cloud, 87
validations, 474
Designer, 314 migration content, 190 SAP HANA One, 88
SAP Business Suite on SAP HANA, 86
enrich data, 236 SAP Extended ECM, 64, 66, 158, 161, 162, SAP HANA Studio, 84
SAP Business Warehouse (SAP BW), 67, 123,
125, 126, 127, 169, 170, 322, 407, 452, 536 ETL, 237 164, 165, 178, 179, 251, 252, 254, 256, SAP Identity Management, 121
connect to retention warehouse, 173 ETL capabilities, 539 260, 565, 569, 571 SAP ILM
reporting, 556 evolution, 236 ArchiveLink, 572 architecture, 527
SAP Business Workflow, 66, 74, 151, 179, extract legacy data, 242 capture, 569 cockpit roles, 537
259, 468, 494, 497, 498, 499, 573 function categories, 328 customer complaints, 254 conversion, 537, 551
configuration, 505 functions, 327 customize workspace, 579 conversion, replace old sessions, 554
SAP BusinessObjects BI, 147 history preservation, 333 integration with the SAP Business Suite, 570 cornerstones, 517
platform, 121, 125, 126, 312, 408 job, 316, 327, 336 metadata, 576 data archiving, 518
SAP BusinessObjects Business Intelligence, job server, 311, 313 migrate invoices to, 257 database storage option, 525
126, 179, 407 lineage analysis, 312 OpenText, 574 object, 548
SAP BusinessObjects Business Intelligence Local Object Library, 315 printout, 253 retention management, 518
(SAP BusinessObjects BI), 74 UI options, 259
local repository, 314 retention rules, 531
SAP BusinessObjects universe, 271 WebGUI, 578
lookup function, 328 Store Browser, 561
SAP BusinessObjects Web Intelligence, 124, workspace types, 577
major components, 309 system decommissioning, 518
189 SAP Extended Enterprise Content Management by OpenText
management console UI, 311 SAP IMG, 476
SAP Cloud Operations, 299
SAP Content Server, 160 mappings, 311 SAP Information Lifecycle Management (ILM),
metadata, 311 success factors, 257 113, 159, 164, 165, 166, 168, 169, 170,
SAP CRM, 123, 138, 344
migration content, 190 SAP Folders Management, 159 173, 178, 179, 182, 186, 256, 567
Customer Interaction Center, 259
object types, 316 SAP GUI, 579 legacy functions, 173
document access, 164
overlap with SAP PI, 249 SAP HANA, 42, 48, 77, 106, 123, 125, 126, retention warehouse, 172
SAP Customer Relationship Management (SAP
parsing, 350 127, 170, 309, 323 SAP Information Steward, 61, 74, 113, 122,
CRM), 121, 190, 253
SAP Data Quality Management, 95, 128 Project Area, 315 analytics and BI, 85 139, 143, 147, 148, 149, 150, 153, 178,
SDK, 127 query transform, 325 archiving, 262 179, 186, 188, 232, 233, 235, 309, 312,
version for SAP solutions, 137 Rapid Data Migration, 187 as an application platform, 86 387, 407, 409, 416, 427, 431, 447, 466,
SAP Data Services, 61, 64, 66, 67, 74, 120, real-time job, 316, 338 basics, 81 508, 509, 510
121, 122, 124, 125, 126, 127, 129, 150, business benefits, 78
real-time service, 337, 340 Business Value Analysis, 456
151, 153, 168, 178, 181, 193, 229, 246, components and architecture, 82
SAP HANA, 90 CMC, 408
307, 318, 322, 325, 360, 363, 367, 368, data modeling, 89
server tier, 313 Data Insight project, 414
372, 387, 394, 407, 409, 423, 425, 452, data provisioning, 84, 89
tool palette, 315 hyperlinked numbers, 420
454, 467, 514 data quality, 94
update source system, 234 metadata management, 410
address check, 137 Hadoop, 109
use Hadoop, 111 Quality Dimension attribute, 444
administration, 311 index server, 83


SAP Information Steward (Cont.) SAP NetWeaver Master Data Management Scripting, 304 System decommissioning (Cont.)
rapid-deployment solutions, 197 (SAP NW MDM) (Cont.) Semantic disambiguation, 390 enable system for SAP ILM, 535
read repository, 270 trigger workflow, 66 Sentiment, 32 extract data, 543
SAP HANA, 94 UI modeling, 478 Sentiment analysis, 131 non-SAP systems, 539
statistical information, 413 with SAP HANA, 475 Service-level agreement, 72, 414 preliminary steps, 534
UI, 311 SAP Plant Maintenance (PM), 162 normal, reverse, 72 report on legacy data, 537
SAP Invoice Management, 157, 257, 259 SAP Portal Content Management, 157 Similarity scoring, 372 reporting, 555
SAP IQ SAP Portal Content Management by OpenText, 255 Single Instruction, Multiple Data (SIMD), 81 set up audit areas and rules, 548
store archive file, 526 Single-object maintenance transfer and convert files, 551
store archive index, 526 SAP PowerDesigner, 115 UI, 489 transfer archive administration data, 552
SAP Landscape Transformation, 186 compare dialog, 286 Slowly changing dimensions, 333 transfer data, 537
SAP Landscape Transformation Replication data mapping, 282 SN_META System Decommissioning Cockpit
define relationship, 276 file, 551
Server, 92 Administrator, 538
dimensional modeling, 283 Snapshot, 524, 545
SAP LT Replication Server, 514, 539 Line of Business, 538
glossary, 272 Social media, 41, 569
SAP Master Data Governance System landscape harmonization, 516
glossary, configure, 273 SPRO, 476
rapid-deployment solution, 203 System of record, 216
impact analysis, 287 Sqoop, 101
SAP HANA, 95
library, complex types, 282 SRM, 62
SAP NetWeaver Application Server ABAP,
Link and Sync technology, 284 SRS, 528
137, 186, 580 linking, 284 T
SAP NetWeaver Business Client, 470, 483, Staging, 472
mapping, 281 area, 473
537, 538, 548, 551, 575 Task, 301
model compare, 289 Standardization, 345
SAP ILM cockpits, 539 move to production, 304
realize value, 269 rules, 387
SAP NetWeaver Master Data Management SAP Business Suite, 270 start with web service, 305
Standards, 227 template, 302
(MDM), 467 SAP HANA, 270
Step type, 506, 507 Tax
SAP NetWeaver Master Data Management synchronizing, 285
Storage, 168, 172 audit, 166
(SAP NW MDM), 63, 66, 113, 123, 125, table definition, 282
Storage and retention service, 528 reporting, 173
137, 150, 151, 153, 178, 179, 181, 467, XML model, 280
Storage system Technical requirement, 425
468, 470, 508 SAP Process Integration (PI), 246
ILM-aware, 523 Tenant, 292
assign processors to workflow, 495 SAP Process Orchestration, 64, 65, 66, 74,
Structured data, 32, 33, 213 set up, 299
business activity, 480 246, 343
Subordinate record, 374 Term
change request ID, 66 SAP Rapid Data Migration, 186
Supplier, 28 hierarchies, 412
configuration steps, 476 SAP Rapid Deployment solutions, 183
Sybase, 318 related, 412
custom-defined object, 484 SAP Replication Server, 92, 114, 133
Sybase IQ, 127 Text
data quality, 138 Integration with SAP Data Services, 136
Synchronizing, 285 analytics, 388
define UI, 489 Integration with SAP PowerDesigner, 136
Synonym, 412 data, 32, 394, 395
flex mode, 473 SAP River, 81
SAP Smart Business, 476 assign to common term, 277 mining, 36
generic workflow template, 499 System consolidation, 186
SAP solutions for information lifecycle management, 513 Text data processing, 121, 130, 131, 132, 181,
import master data, 483 System decommissioning, 43, 169, 170, 172,
307, 389, 390, 393, 394, 399, 400
maintain SAP ERP attributes, 65 518, 539
overview, 513 dictionary, 393
master data changes, 234 archive transactional data, 543
SAP StreamWork, 159 entity, 392
master data hub, 471 configure retention warehouse system, 535
SAP Travel Receipt Management, 157 entity types, 399
multi-attribute drill-down, 475 convert data, 537
SAPUI5, 84 extraction, 392
process flow, 473 data analysis, 534
Scaling, 96 rule, 393
reuse mode, 473 data transfer, 551
Scanned invoice, 169 transform configuration, 396
rules-based workflow, 497 data transfer and conversion, 548
Schema, 338 use cases, 388
run on SAP ERP, 471 Scope, 278 define audit areas, 537 Time reference, 550
searches, 479 Scorecard, 139 detailed example, 542


TOAx Unstructured data, 105, 109, 168, 212, 401 X Y


tables, 524 lifecycle, 221
Transaction retention management, 531 XML YARN, 99
ILM, 545 text, 389 data archiving service, 528
ILM_DESTRUCTION, 559 turn into structured data, 110 export/import master data, 483
ILM_TRANS_ADMIN_ONLY, 552 Unstructured information, 33, 563 schema, 338, 342 Z
IRM_CUST, 550 User interface, 571 XML DAS, 528
IRMPOL, 526, 548 XML Schema Definition (XSD), 280 ZooKeeper, 104
SARA, 543
TAANA, 535 V
Transactional application, 218
Transactional data, 213, 543 Validation, 471, 478
Transform, 318, 397, 399 rule, 509, 510
address cleanse, 363, 365, 367 transform, 364
case, 331 Validation rule, 139, 142, 405, 406, 420, 425,
data cleanse, 233, 353, 360, 367, 387 427, 432, 433, 441, 443, 446, 448, 450,
entity extraction, 389, 393, 394, 396 452, 454
geocoder, 367, 368 add, 445
global address cleanse, 354 associate with data source, 447
history preserving, 334 create in rule editor, 429
key generation, 335 test, 430
Map_Operation, 332 Value mapping, 481, 482
match, 372, 373, 380, 385
merge, 331
query, 331 W
Row_Generation, 332
Web content management, 155
SQL, 331
WebDAV
table comparison, 334
ILM-enhanced interface, 523
transform configuration, 397
Weight scoring, 374
user defined, 332
Weighting, 446, 448
validation, 332, 437
What-if analysis, 460
Transformation
Work center
complex, 330
archiving, 520
reporting, 520
Work order, 163
U Workflow, 316, 469
distribute data maintenance, 471
UI
Hadoop, 104
building blocks (UIBBs), 489
rules-based, 496
configuration, 494
Workspaces, 161, 162, 163, 254, 256, 257,
Unified business language, 115
315, 581
Uniqueness profiling, 423
binder workspace, 577
Universal data cleanse, 241 business workspace, 577
Universe, 271 case workspace, 577
UNSPSC, 144 Write program
Unstructured content, 212, 213, 571 log, 544


Corrie Brague is the director of Data Quality Product
Management for SAP, where she defines software solutions
that help businesses assess, improve, and monitor their data
quality.

David Dichmann is director of product management for
SAP's enterprise architecture and modeling tool, PowerDesigner.

George Keller has more than 20 years of experience in
the field of information management, having worked within
engineering, business applications, and product management
organizations. He has also served as a professional delivery
project manager for a number of Fortune 100 clients.

Markus Kuppe is vice president and chief solution architect
for SAP Master Data Governance. He has led various programs
across the SAP Business Suite in areas such as analytics, user
experience, and architecture. He is a frequent author
and speaker at business events.

Phillip On is an Enterprise Information Management industry
veteran with more than 13 years of experience in the field,
having worked for SAP, Business Objects, and Oracle.

Brague, Dichmann, Keller, Kuppe, On
Enterprise Information Management with SAP
605 Pages, 2014, $69.95/69.95
ISBN 978-1-4932-1045-9

www.sap-press.com/3666

We hope you have enjoyed this reading sample. You may recommend
or pass it on to others, but only in its entirety, including all pages. This
reading sample and all its parts are protected by copyright law. All usage
and exploitation rights are reserved by the author and the publisher.
