
Implementing Data Governance at Grifols: Best Practices and Lessons Learned
Praneeth Padmanabhuni, Grifols Inc.
Richard Hauser, Decision First

Discuss how SAP Information Steward can assist in
establishing a Data Governance program
Enable power users in the business to own data
processing and be responsible for data quality
Remove manual steps to automate data processing as
much as possible
Extend the out-of-the-box visualizations available in
Information Steward scorecards by utilizing repository
Involve data stewards directly in de-duplication efforts
via Match Review Tasks in Information Steward

Who Is Grifols?
International healthcare company based in
Barcelona, Spain, with offices in Raleigh, NC,
as well as Los Angeles
Develop and distribute life-saving protein therapies
derived from human plasma
Have experienced rapid growth over the past few
years as a result of mergers and acquisitions

Challenges as a Growing Company

70+ files to collect data from on a monthly basis
70+ varying degrees of data quality!
Only want to count sales at the closest point to an actual sale
Data warehouse had previously been outsourced,
but volumes had reached a point where insourcing
became a more attractive option
Data cleansing was being performed manually via
Excel files, but a tool to process large volumes
became a necessity

Decision First Technologies

Who we are
Atlanta-based SAP Business Objects specialists
Partnered with SAP
7x Business Objects Partner of the Year
SAP Business Objects, SAP EIM, and SAP HANA

What we do
Strategize and implement Data Governance solutions
BI Nirvana: 90-day Business Intelligence on HANA
Full lifecycle data warehouse implementations
Data visualizations and standard reporting

Data Governance Defined

Core business process that ensures data is treated
as a corporate asset and is formally managed
throughout the enterprise
Marriage of the following programs:
Data Quality
Information Management policies
Business process management
Risk management

Information Steward
Information Steward was chosen as the tool to
help implement initial DG policies
Integrates nicely with Data Services (DS), which was already in use
Gives visibility into data quality issues
Easy for business users to pick up and run with
Not a full-blown master data solution; more of a lightweight governance tool

Challenges at the Time of Enlisting DFT

Cluttered ETL environment
Many manual steps needed for weekly processes
Data issues popping up weeks after loading of flat files
Users did not trust the account master data

Solutions Put Forward

Implement best practices in ETL environment
Multiple developer repositories, central repositories, and
best practices naming conventions

Combine and automate common ETL jobs to the
fullest extent possible
Give visibility to data quality by developing an
Information Steward scorecard
Improve the customer account matching process
and utilize DS cleansing transforms to build user
trust in the data warehouse

ETL Coding Best Practices

Multiple repos and landscapes
Previously just PRD
One repository per developer
Fully fleshed-out DEV, QA, and PRD to properly test changes

Central repo for each environment
Allows for versioning and rollback in the event of unintended changes
Moving objects to central forces developers to fully
understand the impact they are having on all objects

Naming standards
Objects properly named to indicate the data being sourced
from or written to, initial/delta load, and number in
sequence if applicable

ETL Automation
Combine objects into jobs, workflows, etc.
Went from 15 steps down to 3-4 depending on the data

Code objects for reusability, not one-off executions

Standardize variables across all jobs and conform
to a template job format
Job Execution Table, Job Start Script
Give power users authority to process data when
ready by allowing them to run certain DS jobs that
they are responsible for
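The template-job pattern above (Job Execution Table plus Job Start Script) can be sketched roughly as follows. This is a hypothetical illustration using SQLite; the table name, columns, and function names are assumptions for the example, not the actual Data Services implementation:

```python
import sqlite3
from datetime import datetime

# Illustrative "Job Execution Table" pattern: every job begins with a
# start script that records the run and fetches the last successful end
# time, so delta loads know where to pick up. Schema is invented here.

def init_schema(conn):
    conn.execute("""CREATE TABLE IF NOT EXISTS job_execution (
        job_name TEXT, start_time TEXT, end_time TEXT, status TEXT)""")

def job_start(conn, job_name):
    """Log a new run and return the last successful end time (delta watermark)."""
    row = conn.execute(
        "SELECT MAX(end_time) FROM job_execution "
        "WHERE job_name = ? AND status = 'SUCCESS'", (job_name,)).fetchone()
    watermark = row[0] or '1900-01-01T00:00:00'
    conn.execute("INSERT INTO job_execution VALUES (?, ?, NULL, 'RUNNING')",
                 (job_name, datetime.now().isoformat()))
    return watermark

def job_end(conn, job_name, status='SUCCESS'):
    """Close out the running entry for this job."""
    conn.execute("""UPDATE job_execution SET end_time = ?, status = ?
        WHERE job_name = ? AND status = 'RUNNING'""",
        (datetime.now().isoformat(), status, job_name))
```

Standardizing every job on one start/end script like this is what makes it safe to hand power users a run button: the bookkeeping is identical no matter who triggers the job.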

DQ Visualizations
Needed a way to assess DQ before it became an issue
IS Data Insight was the best solution for our purposes
Same data validation rules could be applied to all files
Limit the data being analyzed to only the most recent month
Built an event-based process chain in the CMC to
seamlessly integrate this step into the normal
weekly ETL jobs
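As a rough illustration of the Data Insight idea, here is a minimal Python sketch: the same validation rules run over every source, restricted to the most recent month, producing a pass rate per rule. The rules and field names are invented for the example, not Grifols' actual rule set:

```python
from datetime import date

# Hypothetical validation rules; each takes a record dict and returns
# True (pass) or False (fail).
RULES = {
    'zip_is_5_digits': lambda r: len(str(r.get('zip', ''))) == 5,
    'qty_positive':    lambda r: (r.get('qty') or 0) > 0,
}

def score_month(records, month):
    """Pass rate (%) per rule, limited to records from (year, month)."""
    recent = [r for r in records if (r['date'].year, r['date'].month) == month]
    return {name: (sum(rule(r) for r in recent) / len(recent) * 100
                   if recent else None)
            for name, rule in RULES.items()}
```

Restricting the scored set to the latest month keeps the scorecard pointed at what just arrived rather than diluting it with history.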

Original Sales Staging Process

New Sales Staging process with DQ

DQ Reporting Enhancements
Extract data from the appropriate tables/views in the IS
repository database every time new DQ data is generated
Historical scores are readily available from repository
database views containing:
Project names, among many other things
Key Data Domain descriptions
Historical scores for every active quality object
Quality dimension descriptions
Scores for KDDs, QDs, Rules, and Bindings, stored by key data domain
Column to select score type:
TOTL = Key Data Domain Score
KDDQ = Quality Dimension Score
KDDR = Rule Score
KDDB = Rule Binding Score

Information Steward Repo Joins

Join on MAIN_ID where score_type_cd = TOTL (for KDD score)
or score_type_cd = KDDQ (for Quality Dimension score)
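A minimal sketch of pulling historical scores out of the repository by score type, shown here with SQLite. The table and column names (score_history, main_id, score_type_cd, score, score_date) are stand-ins; consult the actual IS repository views in your environment. The score_type_cd values come straight from the slide above:

```python
import sqlite3

# TOTL = Key Data Domain, KDDQ = Quality Dimension,
# KDDR = Rule, KDDB = Rule Binding.

def kdd_score_trend(conn, main_id):
    """Historical Key Data Domain scores for one domain, oldest first."""
    return conn.execute(
        "SELECT score_date, score FROM score_history "
        "WHERE main_id = ? AND score_type_cd = 'TOTL' "
        "ORDER BY score_date", (main_id,)).fetchall()
```

Feeding a query like this into a Webi report is what extends the out-of-the-box scorecard visuals with trend history.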

Automated DQ Chain



Scorecard Drilldown

DQ Webi Report

DQ Webi Report Drilldown

Account Master Cleanup Requirements

Needed to prove to the business that account
master data was trustworthy
Too many overmatch and undermatch scenarios existed
in the old account master

Could not start from scratch because internal data
had been matched to an external data source by a
third party
Needed the cleanup effort to have data steward
input for uncertain matches
As little impact as possible on all current processes

Account Master Cleanup, Step 1

Identify overmatch scenarios, i.e. accounts that
had been incorrectly matched together
Run all current accounts with their children
through a data quality match transform
Break key is the Data Warehouse ID
Children can only match to their own parent, not to other parents
Pass all potential overmatches to a review task in
Information Steward for data steward input
Use the data steward's input to determine how to
handle the record
Leave alone or create a new account master
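The overmatch check above can be sketched as follows. This is a simplified Python stand-in for the DS match transform: the break key (Data Warehouse ID) guarantees children are only compared against their own parent, never across groups, and low-similarity pairs are flagged for steward review. The similarity measure and threshold are illustrative assumptions:

```python
from difflib import SequenceMatcher

def similarity(a, b):
    # Crude stand-in for a real match-transform score.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_overmatches(groups, threshold=0.8):
    """groups: {dw_id: {'parent': name, 'children': [names]}}.
    Returns (dw_id, child, score) for children that look wrongly attached."""
    flagged = []
    for dw_id, grp in groups.items():
        for child in grp['children']:
            score = similarity(grp['parent'], child)
            if score < threshold:
                flagged.append((dw_id, child, round(score, 2)))
    return flagged
```

Only the flagged pairs go to the Match Review task, which is what keeps the steward workload manageable.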

Account Master Overmatch Cleanup

Account Master Cleanup, Step 2

Improve the current delta matching logic that was
part of the weekly sales data warehouse load
Should see a gradual decrease in the number of new
accounts created over time
3K per week initially
New children accounts must be matched first
against existing account masters; only after that
can they be considered a match with each other
Account master data was frozen for one month
to accomplish this task
Short enough timeline not to have a critical impact on
business decisions
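The delta-matching order described above (new accounts are matched against existing masters first; only the leftovers are then considered matches with each other) can be sketched in Python. The match function and account representation are assumptions for illustration:

```python
def delta_match(new_accounts, existing_masters, match_fn):
    """Match new accounts to existing masters first; leftovers are then
    matched among themselves, with the first occurrence becoming a new master."""
    matched, leftovers = {}, []
    for acct in new_accounts:
        hit = next((m for m in existing_masters if match_fn(acct, m)), None)
        if hit is not None:
            matched[acct] = hit          # attach to an existing master
        else:
            leftovers.append(acct)
    new_masters = []
    for acct in leftovers:
        hit = next((m for m in new_masters if match_fn(acct, m)), None)
        if hit is None:
            new_masters.append(acct)     # becomes a brand-new master
        else:
            matched[acct] = hit          # duplicate within the delta itself
    return matched, new_masters
```

Checking existing masters first is what drives the gradual decline in new-account creation: duplicates get absorbed instead of spawning masters.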

Account Delta Process

Account Master Cleanup, Step 3

Identify undermatched accounts
Accounts that should be merged together but haven't
been for whatever reason
Run all existing account master records through a
DS match dataflow to determine if they should be
merged into one
If a potential match is found between 2 or more
accounts, pass this match group along to an IS
Match Review task for data steward review
Utilize data stewardship results to determine a
winning account master and deprecate the
others in the group
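A rough sketch of that survivorship step: the data steward's choice wins; absent one, a simple completeness heuristic picks the surviving master and the rest are deprecated. The field names and fallback heuristic are illustrative assumptions, not the actual dataflow logic:

```python
def resolve_undermatch(match_group, steward_choice=None):
    """Pick the surviving master for a group of duplicate accounts.
    match_group: list of account dicts; steward_choice: one of them, or None."""
    if steward_choice is not None:
        winner = steward_choice
    else:
        # Fallback heuristic: most populated (truthy) fields survives.
        winner = max(match_group,
                     key=lambda a: sum(1 for v in a.values() if v))
    deprecated = [a['id'] for a in match_group if a is not winner]
    return winner, deprecated
```

Deprecating rather than deleting the losers preserves the audit trail the third-party match history depends on.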

Account Master Undermatch Cleanup


Ultimately would like to associate CRM data with actual sales data
Provides backward-looking analysis of sales rep performance
Capability to start performing some predictive analysis
Find more ideal customers
Identify prototypical customers
Focus on these accounts to grow the business

Foundation is now in place to be in compliance
with the Sunshine Act when it goes into effect


Yearly savings resulting from initial DW project:
Savings resulting from reduced time to process weekly
records: $13,000/month or $156,000/year
Customer targeting and predictive analytics is next
No upper bound on revenue potential

Involve the business often to showcase improvements
and ask for further suggestions
Necessary for all DG/DQ projects

Keep history of IS Match Review results

OK to leave in the same table in 4.1; issues have been found in
early versions of 4.2
Fine to move to another table if it becomes too confusing

Have separate Reviewer and Approver roles for
Match Review tasks
Easy to get fatigued when going through hundreds or
thousands of records
Also a good idea to allow a few days to pass between review
and approval

SAP Information Steward can assist in establishing a
Data Governance program and gaining momentum
within your organization
Empower your power users to own data processing
and be responsible for data quality. Actively involve
business users in all steps of the process
Eliminate manual intervention to automate data
processing as much as possible. This is where a large
portion of ROI can be found


Praneeth Padmanabhuni
Rich Hauser



