
White Paper

The Genesis of Data Quality: The Emergent Data Steward


By: Cheri Mallory, Data Quality Consultant, Business Objects

Introduction

There has been a recent surge of interest in data integration. Organizations are pursuing
lofty goals, such as customer relationship management (CRM), customer data integration
(CDI), master data management (MDM) and the plethora of data integrations required for
data warehousing, business intelligence (BI) and data migrations (often driven by mergers
and acquisitions (M&A)). IT professionals are being asked to migrate, re-use, aggregate and
share data at unprecedented rates.
At the same time, current research, statistics, and certainly experience illustrate the
notion that the quality of data is impacting the success of these efforts. In 2005, Gartner
suggested that more than 50 percent of data warehouse projects will have limited
acceptance or will be failures through 2007.
As organizations pursue new goals such as MDM, CDI, data migrations and compliance, one
thing becomes very clear: these are more than technology issues. These efforts will not
succeed without addressing data quality issues. Data quality can only be addressed with
an in-depth personal understanding of data. This understanding can be facilitated with
data profiling tools, but data profiling tools are only as effective as the data analyst using
them. Who in your organization knows enough about the data to support enterprise data
integration goals?
Numerous white papers highlight emergent technologies; this paper covers the nascent
shift in data stewardship. In this age of data integration, the role of the data steward
is growing. The data steward's responsibilities are moving from a single application or
database focus to a broader, enterprise-wide, collaborative change management focus. How
does this impact data quality? For any data quality effort to be effective there must be
recognition and acknowledgment of the data steward, the person who understands the
complexities and abstractions of corporate data.
Organizations are rapidly moving through data quality maturity, from addressing ad
hoc issues to funding projects, to establishing data governance programs or centers of
excellence. The mature data steward can be exceedingly effective as a team member
addressing any data quality project need or issue. The vision of robust data integration and
data quality is accomplished by leveraging the knowledge and talents of data stewards.
These are the folks who are getting the work done. Their evolution directly impacts an
organization's ability to improve the quality of data.

From an information management and data quality perspective, an organization needs to know:

• who the data steward is, both past and present,

• what the boundaries are (if any) of this newly evolving role, and

• in what ways changes in data stewardship can be leveraged going forward.

Like so many terms used in IT, the role of data steward has had a diverse array of meanings
for many years. In the past few decades, it has been defined as the person responsible for
data content and quality, primarily from an administration and maintenance perspective. The
question on everyone's mind was, "Who owns the data?" For instance, who is responsible
for maintaining obscure reference models and monitoring data loads? In IT, the objective was to
push data stewardship into the business, in effect, to move responsibility for the data quality
and content out of IT's domain. Has this worked? Are your business users capable of data
management that meets your industry standards and enterprise goals?
Now, years later, it is clear that there cannot be one person who is accountable for all data
content and processes. The issues are too complex. Why is this? Let's take a relatively
simple example.
A small cell phone company (let's call it VillageCall) offered three months of free
Internet access to new customers. The business processes involved billing the
customer each month, while providing an adjustment equal to the billed charge and
applying that to the account for the first three months.
This was a customer acquisition program. The number of customers who
participated in the program, and the increased cost of customer acquisition, were
reported daily to customer service and marketing executives. Unfortunately, a
problem arose, and billing adjustments for a subset of new customers were not
applied on the second billing cycle.
Was this a data quality issue? Yes, in this case, a key data element in the
adjustment application was incorrectly defined. What makes this story interesting
is that the issue was identified not by the applications team in IT, but by the
customer service escalation team, which is not an uncommon occurrence in the world of
data quality. There were enough customer service calls and escalations to warrant
a review by the application team in IT. Their review revealed that several thousand
adjustments were missed.
The applications team in IT addressed the data issue, applied the adjustments to
the next billing cycle, and seeded the interactive voice response (IVR) system to
proactively inform affected customers, whenever they called, that the issue had been
identified and rectified. In addition, IT and the customer service team worked closely
together to develop monitoring reports so that any subsequent issues would be
identified earlier.
This was a significant issue for several reasons. One issue was the obvious
displeasure of new VillageCall customers and the resulting decline of quality of care
performance measures. Also, both from an accounting and marketing perspective,
the numbers didn't add up. Adjustments that should have been applied to the
second billing month were actually applied in the third billing month. Finally, the
marketing reports that illustrated the effectiveness of the new customer offer were
skewed. The first month showed the number of customers and the cost of the
adjustment offer; the second month showed a significantly lower cost per customer,
and the third month showed a significantly higher cost per customer.
There were several teams with some level of responsibility for fixing the data and
implementing a monitoring process going forward. The customer service team, with
the help of IT analysts, had incorrectly defined the adjustment date calculation, and
they were the team that identified the error. Unfortunately, they did so only when customers
started calling to complain. The applications team fixed the code and applied new
account adjustments. The IVR team set up a special message group to automatically
respond to customer complaint calls. The IT team assisted in developing and
reviewing the monitoring reports. The CIO reviewed the monitoring reports on
a daily basis for a long period of time, realizing the materiality of the issue. The
accountants and marketing staff needed to understand and restate their reporting to
explain the skewed dollar trends.
In this example, did one team or one person own the data? Not really. There was not
a single, distinct team at VillageCall responsible for the data. The lifecycle of the data
consisted of sales, billing, customer service, monitoring, reporting and analysis. No one
person can feasibly own the data, and this is a relatively simple example.
Consider your medical history as another example. Who has a stake in that data? You, your
doctors, nurses, radiologists, insurance provider, actuaries, your employer's insurance plan
administrator, and the list goes on. There are many more examples.
Everyone in the organization bears some responsibility for the health and well-being of
enterprise data. In the arena of data management, cooperation and collaboration across
many groups and individuals are imperative to success, both within IT and across the enterprise.

The Data Steward in the Context of Information Quality Maturity


If the data steward is not the person who owns the data, then the question becomes, who is the
data steward? The answer has several layers. Really, nothing in IT can be simple. Identifying the
data steward, and his or her role, is specific to how you are managing data in your organization
today. What are your data-related issues? How are you addressing data quality?

Frank Dravis, the Vice President of Information Quality at Business Objects and a prolific
researcher and writer in the data quality field, has introduced the concept of an information
quality maturity model similar to the Capability Maturity Model that exists for software
development. Every organization approaches data quality and data management differently
and with differing levels of commitment. This is not good or bad; it is just evolutionary.
Most IT organizations have some level of data management, but it is varied and may or may
not address data quality specifically. This is an important point to consider when identifying
the vision for data stewardship; before defining where you want to be, define where you are
today.
Within this context of information quality maturity, there are different kinds of data
stewards. One of the primary distinctions between the various phases of maturity is the
behavior, and recognition, of the data stewards. Dravis has identified five distinct levels of
maturity, but for the sake of simplicity and brevity, let's coalesce them into three categories:

• Ignorant/Ad Hoc

• Project/Process

• Information Center of Excellence

Ignorant/Ad Hoc
All organizations have data quality issues. Few argue that point, but even so most
organizations do not approach data quality as an integral part of their data management
strategy. Data quality projects are often one-off initiatives based on the occasional issue
that must be resolved, either as part of an internal project in IT, or based on a complaint
from the business. Within the information quality maturity model, these organizations are
considered to be in the "ignorant" or "ad hoc" phase.
Prior to its acquisition by Business Objects, Firstlogic had the opportunity to survey
approximately 130 data management professionals about data stewardship in February
2006. The research indicates that more than 50 percent of organizations either do not
consciously manage the quality of data, or do so only as issues arise. When faced with
problematic data, the developer (and yes, it is often a lost, uninformed developer) may be
lucky enough to know the business user, or an expert in IT, who has knowledge of the data
content. This could very well be the person who originally raised the data quality issue. It
may be a data modeler or analyst within IT, or it may be a reporting analyst working within
the business (accounting and customer service are always likely candidates). Usually, it will
be a person who understands the data from a content and usage perspective. For the issue,
or fix, at hand, it is this person who will verbalize requirements and may even accomplish
testing and validation of the fix. This is your data steward. In this case, this is the incognito
data steward, who has another job title and another full-time job.

In the VillageCall adjustment application example, the data quality issue was discovered on
an ad hoc basis, and was initially addressed as a "production broke" issue, managed by the
operational support team. The incognito data stewards were the customer escalation team
members who recognized a trend in data errors and reported it to the application manager
in IT. This is a very common approach to information management. Does it sound familiar?

Project/Process
With the current environment of CRM implementations, pressures on data warehouses to
validate incoming data, and the genesis of CDI and MDM, there are many opportunities to
identify data quality as a recurring issue. It is addressed at the project level where business
processes are identified that impact the quality of data. At this point, the concept of data
quality starts to take on a life of its own and becomes a topic of discourse throughout the
organization. At the project level, it is likely that your data quality issues are quite serious
indeed, and demand attention from business executives in addition to IT management. It is
the serious issues that demand project level management. This is the project or process level
of maturity. At this level the data steward is still tasked to work on a project as a temporary
arrangement. There is not yet a formal structure or commitment for these key resources.
For each project that comes up, whether a data quality project, new custom development
or an off-the-shelf implementation, there is more involvement by the data steward. This
organic (or homegrown) data steward will increasingly become an in-demand, over-booked
resource. The data steward is at every design meeting, and he or she is intimately involved
in validating corporate data. These are the folks who know enough about the data content
and data usage to identify the associated business rules, define valid domains, and
communicate reporting impacts.
In the VillageCall example there was a clear shift in information quality maturity. As
adjustment application issues recurred, a project team was formed. The focus of the
team was twofold. First, the team was responsible for ensuring that the data that drove
the adjustment application was correct, and second, the team was responsible for setting
up data monitoring applications. Once monitoring was in place, there were project team
members, in addition to the CIO, who were responsible for data quality monitoring, and
reviewing daily reporting. These team members were acting as data stewards.
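
The kind of monitoring report described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (BillingRecord and its fields), the promotion length constant and the function names are hypothetical, and every record passed in is assumed to belong to a customer enrolled in the offer. The check flags any bill inside the three-month promotional window whose adjustment does not offset the billed charge.

```python
from collections import Counter
from dataclasses import dataclass

PROMO_CYCLES = 3  # the example offer: three months of free Internet access


@dataclass(frozen=True)
class BillingRecord:
    """Hypothetical billing row for a customer enrolled in the offer."""
    account_id: str
    cycle: int              # billing cycle number since activation (1, 2, 3, ...)
    internet_charge: float  # amount billed for Internet access
    adjustment: float       # credit applied to the account; 0.0 when none


def missed_adjustments(records):
    """Return promo-window records whose credit does not offset the charge."""
    return [
        r for r in records
        if r.cycle <= PROMO_CYCLES
        and abs(r.internet_charge - r.adjustment) > 0.005
    ]


def missed_by_cycle(records):
    """Count missed adjustments per billing cycle, for a daily summary."""
    counts = Counter(r.cycle for r in missed_adjustments(records))
    return dict(sorted(counts.items()))
```

A report like this, run after each billing cycle, would surface a missed second-cycle adjustment before customers begin to call. The data steward's contribution is the rule itself: knowing that the credit must equal the charge, and for how many cycles.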

Information Center of Excellence


Some (albeit few) organizations have established formalized data management and have
dedicated staff assigned to data definition, quality and architecture. These organizations
have a Center of Excellence or established data governance. These organizations create data
quality requirements for data feeds into their enterprise and can actually impact the data
quality of external organizations. Research shows that this is happening at fewer than 10
percent of organizations that have an interest in data quality.

The formal data governance organization assigns dedicated staff to data stewardship and
data quality. The information management group spans across IT and the business, with a
wealth of knowledge about tools, best practices, and corporate data content. The function
of the data steward is an accepted part of the data management group. There is dedicated
staff in IT, and business staff that are formally acknowledged as decision makers. Finally, this
group has the authority to approve and implement new projects and changes, supported by
executives or a data governance organization.

Evolution of Relationships and Reporting Structure in the Maturity Model
The evolution from one phase to another can be expressed in terms of the data steward's
relationship to the rest of the organization. Initially, for an ad hoc project, the developer and
data steward will work together, informally. They have an association, each aware of the other's
knowledge and capabilities. It is a supportive relationship. At the project level of maturity,
there is a more formal structure, at least for the length of the project. This is an alliance
between the data stewards and other project team members. The alliance is defined by the
project objectives, a common purpose. Finally, for the center of excellence level of information
quality maturity, the data stewards have a syndicate relationship to the rest of the organization.
They now have associations, a clear purpose, and the authority to enforce data requirements.

Figure 1: The data steward within the evolving organization

As the organization matures, so do the organizational structures around data stewardship.
Below are a few examples of the reporting structures for each maturity level.

Figure 2: Possible organization structure for Center of Excellence

Figure 3: Possible project organization structure

Figure 4: Data stewards organized via unstructured associations

Who Are The Data Stewards?


There are many incognito data stewards in your enterprise; they just need to be recognized.
Data stewards can be found in any area of the organization, including, but not limited to,
IT. There are some likely places to look, such as the accounting department as a whole,
any business intelligence group, and the customer service issue escalation team. These
are the folks who maintain configuration data for your CRM or financials application. They
are actuaries and application managers. The data stewards are out there, likely staring at a
spreadsheet at this very moment.
The same Firstlogic survey of data management professionals that was referenced earlier
also revealed that 94 percent of organizations have data stewards, and that 71 percent of
these report that they have incognito data stewards.
In the VillageCall example, the data stewards were all incognito. They included customer
service staff, reporting analysts, a development manager, a data architect and a CIO!
Regardless of your information quality maturity level, you have data stewards in your
organization. How you manage the stewardship function, and how you define your
information management strategy as a whole, is directly impacted by the maturity of your
organization. Early on, you need to know enough to recognize the existing data steward
as a valuable resource. As data quality projects recur and eventually turn into data quality
programs, you need to manage the changing definition of the data steward.

What Traits Should You Seek in Order to Find a Good Data Steward?
There are six key attributes of a good data steward. Organizations can use these traits of the
data steward to identify high-potential resources, to create a plan for a data management
team, or to evaluate new hires. Good data stewards:


• have an innate sense of data structure, data flow or data management concepts
(sometimes without any formal data management education). At the same time, they
are capable of very detailed research. This combination is rare and powerful.

• can speak to IT and business requirements, easily verbalizing topics including
storage, reporting (replication and aggregation), and the life cycle of information
(such as the downstream business impacts of data storage or content changes).

• realize the full business impact of poor data quality and care about the issues. They
see not only the immediate issue but also the downstream impacts of the issue. The
real clue that you are talking to a data steward is that they are so often the voice of
doom. A data steward asks tough questions.

• are known throughout the organization for their understanding of a set of content;
the data steward is a walking encyclopedia about one set of content. For example,
these are the people who must be at an application design meeting, or who are the
only people who can validate a report. The data steward is a true knowledge worker,
and carries a wealth of information about data content and relationships that
cannot be fully captured by any model or dictionary.

• ideally know data modeling or have some basic SQL skills, but this is not
imperative. There is no better match than a data steward and a robust data
profiling tool. You should attempt to provide data profiling capabilities and
secure access to the data (either production or replication) to ensure that the data
steward's skills are used to the fullest.

• are diplomats. Especially as the organization matures and interaction
increases between IT and the business, and as issues turn into projects that
turn into programs, the data steward can be a facilitator. You want someone who can
communicate complex issues to management, both in IT and on the business side.
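
To make the data profiling point above concrete, here is a minimal sketch of what a first profiling pass over one column might look like. It is not a substitute for a robust profiling tool; the function name, the dictionary-per-row input format and the treatment of blank strings as missing are all assumptions made for illustration.

```python
from collections import Counter


def _is_missing(value):
    """Treat None and blank strings as missing (an assumption of this sketch)."""
    return value is None or str(value).strip() == ""


def profile_column(rows, column):
    """Summarize one column: row count, missing count, distinct values, top values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if not _is_missing(v)]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "top_values": counts.most_common(3),
    }
```

The output is only raw material. It takes someone who knows the content to say whether "CO" and "co" are the same state, or whether a blank value is an error or a legitimate "not applicable".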

What Is The Role of The Data Steward?


As an organization moves through the various phases of information quality maturity, and
as its data management capabilities grow, the changes in data stewardship are distinct.
The first apparent change is the shift in authority. The incognito data steward is frustrated
and unsupported, but as the information management organization matures, the data
steward develops an increased level of responsibility and authority. A formal information
management organization will be governed by a team of executives who will ensure that the
entire life cycle of data is considered in the decision-making process. As the role becomes
more formalized, so do the respect and value that other team members perceive in the
contribution of the data steward.
The role of the data steward will shift from one of annoyance to one of inspiring ambition.
There is nothing like a bit of authority to make a job attractive. Even more likely, the
information management team will have increasing scope over time, as they prove their
success in facilitating decisions and action across diverse organizational groups. The
objective of the highly evolved data management team is to overcome the damaging
impacts of working within silos. It will not be long before it is apparent that this approach
can apply to other initiatives.
Within either a project structure or a data governance structure, the data steward is allowed to
be, and recognized as, the authority on the data. Additionally, the data is recognized to be a
valuable asset within the organization. The data steward can have these responsibilities:


• Defining the valid data content criteria (requirements). This may include metadata,
the definition of the corporate data ontology, data registry, or online technical
dictionary. Data definition is the genesis of data quality. The data definition drives
the data quality requirement. If these two do not align, there is a problem.

• Understanding and communicating the full life cycle of the data as it is used as
information. This is an enterprise-wide role, and would require that the data
steward be included in a diverse set of business groups and initiatives.

• Resolving issues surrounding the best source of data or the system of record.

• Monitoring data quality, including developing requirements for data monitoring
reports or software.

• Developing requirements, testing and giving approval for data fixes and data cleansing.

• Providing data security analysis and recommendations.

• Offering recommendations for data compliance (Sarbanes-Oxley (SOX), Basel II,
Federal Information Processing Standards (FIPS), Federal Information Security
Management Act (FISMA), the Data Quality Act, etc.).
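
The first responsibility above, that the data definition drives the data quality requirement, can be illustrated with a small sketch. The field names, valid domains and function name below are hypothetical and are not taken from any real system; the point is only that once a steward has written down the valid domain for each field, the quality check follows mechanically from that definition.

```python
# Hypothetical data definition: each field's valid domain, as a steward might record it.
DATA_DEFINITION = {
    "account_status": {"required": True, "allowed": {"ACTIVE", "SUSPENDED", "CLOSED"}},
    "adjustment_cycle": {"required": True, "allowed": {1, 2, 3}},
}


def violations(record, definition=DATA_DEFINITION):
    """Return (field, problem) pairs where a record departs from the definition."""
    problems = []
    for field, rule in definition.items():
        value = record.get(field)
        if value is None:
            if rule["required"]:
                problems.append((field, "missing"))
            continue
        if value not in rule["allowed"]:
            problems.append((field, "not in valid domain"))
    return problems
```

If the definition and the check are maintained separately, they can drift apart, which is the misalignment the first bullet warns about. Deriving the check from the definition keeps the two aligned.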

As the organization moves through maturity levels, so does the nature of the data
stewardship roles. The table below illustrates these changes.
Data steward role by IQ maturity level (Ad hoc, Project, Center of Excellence)

Data definition
  Ad hoc: Knows valid data definitions, may have some documentation, participates in data
    modeling exercises, assists developers in defining data integration and reporting
    requirements
  Project: As integral part of team, documents data integration and reporting requirements,
    participates in analysis and design of data fixes, by providing, and sometimes
    documenting, valid data criteria
  Center of Excellence: Captures metadata, facilitates the infrastructure required for
    sharing information within and between enterprises, participates in creation of an
    open technical dictionary, ontology or data registry

Full life cycle analysis
  Ad hoc: Tasked after the fact to assist with solving data issues caused by a lack of
    understanding of downstream impacts
  Project: Reviews project documentation and design to ensure that downstream impacts are
    considered and understood
  Center of Excellence: Participates in team reviews, by subject area, to ensure that all
    data content and data model changes are made with consideration of the full life cycle
    of the data

Issue resolution
  Ad hoc: Responds to urgent issues as they occur, assists the developers with data fixes,
    often will be the person who raises a data issue
  Project: Assists project teams in resolving issues, such as identifying the system of
    record, where data errors are occurring
  Center of Excellence: Determines the materiality of data issues, has the authority to
    initiate projects to resolve issues

Data quality monitoring
  Ad hoc: Reviews and validates reports and data extracts, requests additional information
    about data quality
  Project: On a project level, assists in defining data quality monitoring reports, with
    ongoing monitoring difficult to maintain
  Center of Excellence: Tracks data quality dashboards and reports, assists in defining
    data quality monitoring reports and applications

Testing and approval
  Ad hoc: Assists development and reporting teams in understanding data
  Project: Responsible for data content testing against project level data quality
    requirements
  Center of Excellence: Tests and validates all new data integrations and applications,
    and based on organization policy, has the authority to veto new software

Data security
  Ad hoc: Has reactionary response to data security lapses
  Project: Evaluates data security needs project by project
  Center of Excellence: Defines, evaluates and approves data security policies

Data compliance
  Ad hoc: Has reactionary, one time effort
  Project: Is member of project team, such as SOX compliance project
  Center of Excellence: Is accountable for data management practices and policies being
    compliant

Once you start considering your incognito data stewards as part of the overall information
management environment, it becomes clear that a great deal of data stewardship work is
being done. The Firstlogic primary research also indicates that 70 percent of organizations
have data stewards that are monitoring data quality, providing information about data
content to project team members, and participating in the design of applications. The same
Firstlogic research reports that only 18 percent of organizations have data stewards with
the authority to set data-related policies and make decisions and that 43 percent of projects
have a data steward assigned who is actively participating in data management work.
Finally, the question turns to how to manage data stewardship. This can be done in three
distinct ways:
1. The data stewardship function is entirely a business function and is managed
within the business. This tends to be a diverse and fragmented collection of staff,
each focused on a particular subject area. Business-managed stewardship alone will not work for
a mature data quality organization that needs to make data management decisions
at an enterprise level.
2. Data stewards are only recognized as such if they are part of IT and the
function is managed within the data management organization within IT. This
is not uncommon. However, the IT group could miss out on some very powerful
opportunities if the business data stewards are not recognized or supported.
3. The IT data management organization includes data stewards that work closely
with the business data stewards. The IT data management organization also
leverages the business data stewards, formalizing and recognizing their input.

Which of these models will work best for you? It is closely dependent upon the information
quality maturity of your organization and your project goals. Ideally, you should strive to have
some version of the third model, where you leverage the data stewardship expertise from the
business as you manage a project or build a data management organization within IT. The
data stewardship function falls within a data management organization, and is managed
in conjunction with data architects and database administrators. The objectives of this
organization are to manage the content of information and to manage the quality of the data
that is provided by and provided to external organizations.
For a mature organization, IT should take the lead on each and every aspect of data
stewardship, with varying degrees of involvement by the business data stewards. This does
not mean that IT owns data quality. It means that IT should own the business processes
associated with data management. For example, on issues of understanding the life cycle of
data, the business should be the primary source of information, with IT coordinating the effort
and providing necessary profiling or monitoring tools. For the data security analysis, it is likely
that IT will take the lead, with approval and support coming from the business.
Firstlogic's primary research also indicates that 70 percent of organizations have data
stewards in both IT and in the business. Only 15 percent have data stewards exclusively in IT,
and 15 percent exclusively in the business. Consider managing to reality and take your data
stewards wherever you can find them.

How to Leverage This Growth Going Forward


Organizations are faced with information management and data quality challenges from every
corner. The data steward, whether incognito or part of a formal team, can provide tremendous
support to a project or program manager attempting to resolve issues. Some of the benefits are
high-level, such as a better chance of correctly understanding your data content. Others are
very specific:


• In these days of tight budgets, identifying the incognito data steward will save the
cost of hiring and training a new resource. Bring the business user on board, include
the expert in IT, and you will have instant expertise, buy-in, and priority. For the
business data steward, it is important that you have buy-in from their management, so
that they have the ability to focus some amount of their time on your project.

• A business data steward will communicate to business managers and executives both
the importance of the project and the success of the project. This is one of your best
sales tools.

• When your team is more inclusive, you reduce the risk of having only a small number
of IT staff who understand the data processes, and staff turnover will have less
negative impact.

• Monitoring data quality and running occasional fixes will be much more
straightforward if the business teams are working with IT directly. They know what is
right and wrong, what the downstream impacts are, and the most appropriate way to
run fixes.

• As you grow a more formal data management team, a business data steward is one of
your best new hires.

• The data steward, with an understanding of downstream impacts of data changes, will
often divert the project team from making bad data decisions. They can effectively
save you from yourself. For example, a data steward may be able to alert the team if a
proposed source system change would negatively affect business intelligence efforts.
Knowing this before the changes are rolled into production can prevent serious
problems later on.

Firstlogic's research indicates that 76 percent of organizations are not leveraging their existing
data stewardship resources as a way to communicate project justifications and to advertise
project successes. It is time to start thinking outside the box. These are existing, talented
resources available to assist you immediately.

Conclusion
This ongoing evolution of data stewardship parallels the growth of data management and
governance as a whole. The role is becoming more prevalent and more important. The
majority of the people who understand the data at an enterprise level are generally the
business users because they see the business impacts. If you are a project manager in
IT facing a data quality issue, then you will need to increasingly rely upon data stewards
to provide direction and requirements. If those resources are primarily coming from the
business, you are faced with a very clear opportunity. Yes, an opportunity. How better to
communicate the priority of your project, and ultimately the success of your project, than
to have business users included? The last, and perhaps most opportune, role for the data
steward, is to communicate to his or her business management and executives how effectively
IT is meeting the needs of the organization.
This is a new model in data stewardship. The two critical points are that 1) you have data
stewards in your organization today, whether or not you recognize them as such, and 2) a
mature data quality organization has a collaborative approach around data management, and
one of the key factors is leveraging the talents of your data stewards.
These resources have very distinct personalities, and can be extremely effective in supporting
information management objectives. The roles change as your organization matures in its approach
to data management. However, you do not have to wait to become a highly evolved organization
in order to take advantage of the talents of your data stewards. Each individual can be a valued,
recognized part of any data quality or information management endeavor. You can access data
stewards in your organization now and you can leverage their skills, talent, and commitment to make
your projects and information management efforts as successful as possible.

About the Author


Cheri Mallory is a Strategic Data Quality Consultant for Business Objects, concentrating
on government and financial solutions. She provides data quality analysis, strategy and
consultation for an extensive list of industry-leading clientele. Cheri researches data quality
trends, data stewardship and best practices, presenting her findings at industry educational
events and contributing articles and white papers to the genre. Previously the IT Manager of
Data Quality at EchoStar Satellite, LLC (Dish Network), she brings a wealth of knowledge and
experience in telecommunications and satellite broadcasting, in addition to background in
state and federal government, manufacturing and healthcare from a combined 17 years in
data quality, data management and data warehousing. She holds a BS in Computer Science
from the University of Maryland.

About Business Objects


Business Objects is the world's leading business intelligence (BI) software company. With
more than 35,000 customers worldwide, including over 80 percent of the Fortune 500,
Business Objects helps organizations gain better insight into their business, improve
decision making, and optimize enterprise performance. The company's business intelligence
platform, BusinessObjects XI, offers the BI industry's most advanced and complete platform
for performance management, planning, reporting, query and analysis, and enterprise
information management. BusinessObjects XI includes Crystal Reports, the industry
standard for enterprise reporting. Business Objects has built the industry's strongest and
most diverse partner community, and also offers consulting and education services to help
customers effectively deploy their business intelligence projects. More information about
Business Objects can be found at www.businessobjects.com.
