
MISY 930 - Business Information Systems & Technologies

Fall 2011
Instructor: Dr. Ramesh Konda
E-mail: kramesh@tu.edu
Office Hours: By appointment (please send me an email to schedule an appointment)
Required Text Book:
Management Information Systems, 12/E
Ken Laudon & Jane Laudon
ISBN-10: 0132142856
ISBN-13: 9780132142854
Publisher: Prentice Hall; Copyright: 2012
Class Schedule:
Sept 24 & 25, 2011 (Sat & Sun)
Oct 22 & 23, 2011 (Sat & Sun)
Nov 26 & 27, 2011 (Sat & Sun)
Overview of the Course
This course provides integrative coverage of essential new technologies, information system
applications, and their impact on business models and managerial decision making. We will
discuss Information Systems and Global E-Business and Collaboration. As part of this, we will
cover Information Systems, Organizations, and Strategy. We will also cover key aspects of
Information Technology Infrastructure, Emerging Technologies, Foundations of Business
Intelligence, Achieving Operational Excellence, and Building and Managing Systems.
Learning Objectives
This course provides you with the opportunity to:
1. Understand the insights of Information Systems, Organizations, and Strategy
2. Define Information Systems in Global Business Today
3. Identify strategies for Global E-Business and Collaboration
4. Clearly define Information Technology Infrastructure along with Emerging Technologies
5. Lay Foundations of Business Intelligence in terms of Databases and Information
Management
6. Understand the importance of Telecommunications, the Internet, and Wireless Technology
7. Define strategy for achieving Operational Excellence and Customer Intimacy using Digital
Markets, Digital Goods
8. Define and manage knowledge and enhance data-driven decision making
Key Factors for an Effective Quality Assurance in Data Warehousing
9. Build and manage systems in terms of Information Systems and Projects
Class Schedule:
Weekend class   10am-12:30pm    1:30pm-6pm         Assignments
24-Sept         Chap 1, 2       Chap 3, 4
25-Sept         Chap 5, 6       Review             CSLO I (three questions)
                                                   Due: CSLO I (11th Oct)
22-Oct          Chap 7, 8       Chap 9, 10         CSLO II (three questions)
23-Oct          Chap 11, 12     Review             Mid-term (online/take home)
                                                   Due: Mid-term (6th Nov)
                                                   Due: CSLO II (6th Nov)
26-Nov          Chap 13, 14     Chap 15, Review    Due: Class Project (12th Nov)
27-Nov          Final Exam (In-class)


Grading Assignments
CSLOs (Three CSLOs)                  30%
Mid-term                             15%
Class Project/Individual Project     20%
Final Exam                           30%
Attendance & Class Participation      5%
Class Grading Criteria:
1. CSLO (Three CSLO assignments; each will have four essay questions) - approx. 30%
2. Mid-term Exam (Approximately 20-30 multiple choice questions - take home exam) -
approx. 15%
3. Individual Project (Students submit a paper on a specific topic from the class syllabus,
minimum 8 pages (double-spaced) long; it must use at least 5 references and follow APA
format) - approx. 20%
4. Final Exam (Approximately 40-50 multiple choice questions - exam will be given in the
class - students must attend the class to take this exam) - approx. 30%
5. Class Participation (Based on the attendance as well as participation in class discussions)
- approx. 5%
Grading Formula
A     95 - 100
A-    90 - 94
B+    87 - 89
B     83 - 86
B-    80 - 82
C+    77 - 79
C     73 - 76
C-    70 - 72
D     60 - 69
F     59 or <
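As an illustration only, the weights and grade bands above can be combined as follows. The function names and the sample scores are hypothetical, and the sketch assumes that a fractional percentage falls into the band whose lower bound it reaches, since the syllabus does not state a rounding policy.

```python
# Weights and band lower bounds are taken from the syllabus;
# the helper names and sample scores are hypothetical.
WEIGHTS = {"cslo": 0.30, "midterm": 0.15, "project": 0.20,
           "final": 0.30, "participation": 0.05}

# Lower bound of each letter band, highest first.
CUTOFFS = [(95, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
           (77, "C+"), (73, "C"), (70, "C-"), (60, "D")]


def weighted_score(scores):
    """Combine per-component percentages (0-100) into a course percentage."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percent):
    """Return the letter for the first band whose lower bound is reached."""
    for lower, letter in CUTOFFS:
        if percent >= lower:
            return letter
    return "F"


example = {"cslo": 90, "midterm": 80, "project": 85,
           "final": 88, "participation": 100}
total = weighted_score(example)
print(round(total, 2), letter_grade(total))
```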
Class Participation
Class participation will be based on the value you add to the class through your questions,
statements, and comments. The quality of these contributions is more important than the
quantity.
Attendance
Attendance is mandatory and will be checked for each class session. In addition, an
unexcused class absence will affect your class participation grade. Please sign the
attendance sheet every time the class meets.
Academic Misconduct
Academic misconduct or cheating will not be tolerated. The following definition of
academic misconduct has been developed by ITU:
Academic misconduct is defined as receipt or transmission of unauthorized aid on
assignments or examinations, plagiarism, unauthorized use of examination materials, or
other forms of dishonesty in academic matters. Academic misconduct is a major offense
at ITU because it diminishes the quality of scholarship in our academic community and
cheats those who may eventually depend upon our knowledge and integrity.
Special Circumstances
If you have a documented disability and wish to discuss academic accommodations,
please contact me as soon as possible.
Class Project Guidelines
The purpose of the class project is to demonstrate the application of knowledge that you have
gained from the class. You can pick any topic from the class curriculum and work on it using
one of the following options. (A sample paper is attached towards the end of this section; you
can use it as a reference for the format.)
Option 1: (Literature review)
Literature review on any specific topic from the class curriculum (Example: importance of
Information Systems in healthcare, Information Systems metrics in healthcare, Information
Systems project scope management, Emerging technology in Information Systems, etc.).
You should find and study a minimum of 5 papers from your topic area. Once you review the
papers, please write up the knowledge that you have gained from them in a paper format as follows.
The length of your paper should be 8 pages, double-spaced (please follow APA format).
Format for the paper:
a) Abstract: Describe at a high level the information that you are going to present in
the paper.
b) Introduction: Introduce your topic and explain why it is important/interesting
to study, along with its applications/challenges.
c) Literature review: Provide information that is available in the literature and is related
to your topic.
d) Significant findings/learning from the Literature review: Describe the knowledge that
you have gained from the literature review and make any arguments and discussion.
e) Summary and potential topics for future research: Provide your comments on how the
above study (that you have found in the literature) is useful and how it could have been
done better.
Option 2: Practical Work Project
You may choose a project from your work experience that is relevant to the class
curriculum. Please write up this project in the following format. It is recommended that you use
appropriate references from the literature.
Also, list the challenges and how you could have addressed them now that you have more
knowledge of the topic from this class. Try to use published papers to support your
arguments/discussion. The length of your paper should be 8 pages, double-spaced (please follow
APA format).
Format for the paper: (See Option 1 for a description of some of the following)
a) Abstract
b) Description
c) Plan/Steps/Methodology followed
d) Significant findings/learning from the Project
e) Summary and lessons learned and potential topics for future research
Detailed chapters:
------------------------------------------------
Part 1: Organizations, Management, and the Networked Enterprise
Chapter 1: Information Systems in Global Business Today
Chapter 2: Global E-Business and Collaboration
Chapter 3: Information Systems, Organizations, and Strategy
Chapter 4: Ethical and Social Issues in Information Systems
Part 2: Information Technology Infrastructure
Chapter 5: IT Infrastructure and Emerging Technologies
Chapter 6: Foundations of Business Intelligence: Databases and Information
Management
Chapter 7: Telecommunications, the Internet, and Wireless Technology
Chapter 8: Securing Information Systems
Part 3: Key System Applications for the Digital Age
Chapter 9: Achieving Operational Excellence and Customer Intimacy: Enterprise
Applications
Chapter 10: E-Commerce: Digital Markets, Digital Goods
Chapter 11: Managing Knowledge
Chapter 12: Enhancing Decision Making
Part 4: Building and Managing Systems
Chapter 13: Building Information Systems
Chapter 14: Managing Projects
Chapter 15: Managing Global Systems
------------------------------------------------
Sample Paper
Key Factors for an Effective Quality Assurance
in Data Warehousing
by
Ramesh Konda (1), Rao R. Nemani (2), and Jamuna R. Nemani (3)

(1) Nova Southeastern University
The Graduate School of Computer and Information Sciences
Fort Lauderdale, FL 33314
Konda1221@yahoo.com

(2) Opus College of Business
University of St. Thomas
1000 LaSalle Avenue, TMH 400
Minneapolis, MN 55403
Nema8811@stthomas.edu

(3) Primetherapeutics LLC
1305 Corporate Center Dr,
Eagan, MN 55121
Nemani2@Yahoo.com

Silicon Valley American Society for Quality Conference (October 2009)
Written on: 22 July 2009
Abstract
Strategic and data-driven decision making in turbulent environments has been
pushing organizations to build business-related Data Warehouse (DW) environments to
store and manage vast amounts of data. The main premise of having a DW is to provide a
single point of truth and coherent data in one place. A DW can be defined as a collection of
subject-oriented, integrated, non-volatile data that supports the management decision
process. Successful DW implementations have helped businesses to store, analyze, and
share critical and confidential data on-line among their business partners and customers.
However, inefficiencies in data quality within a given DW have been the main concern of
business users and have not been addressed adequately. Unless a defined and planned
approach for data quality is followed during the different phases of DW, the
organizations may suffer from data quality issues; consequently, any efforts to fix the
Data Quality (DQ) issues become very expensive and time consuming. DQ can be
attributed to several factors such as data accuracy, completeness, timeliness, coherency,
consistency, conformity, and record duplication. This paper presents a successful
approach for implementing key quality factors in a DW during the development and
deployment phases. Examples from business are used to demonstrate the practical aspects
of the proposed approach that yield positive results in DW development and deployment.
Keywords
Data Warehousing, Data Quality, Quality Assurance, Quality Assurance Testing,
Quality Assurance Planning, Quality Assurance Deployment
1. Introduction
Jean-Pierre (2004) believes that an unclear definition of DQ itself leads to a lack of solid
methodology to deal with DQ. Quality is a relative statement and varies by individuals based
upon their perceptions. In simplistic terms DQ is perceived as "true and accurate". This makes
DQ hard to define and measure. To understand how to tackle the problem, DQ needs to be
understood thoroughly from the organizational point of view, and then a process can be established
to deal with DQ within the organization. In simplistic terms, DQ can be defined as an absence of
undesirable characteristics or the presence of desirable characteristics in the data.
Buckley and Poston (1984) defined software quality assurance (SQA) as a planned and
systematic pattern of all actions necessary to provide adequate confidence that the software
conforms to the defined requirements. Chow (1985) argued that failure to pay enough attention
to SQA has often resulted in schedule delays, budget overruns, and failure to achieve customer
satisfaction. Abdel-Hamid (1988) articulated in his research that SQA not only holds the key to
customer satisfaction, but also has a direct impact on the cost and the scheduling of a project.

There has been great progress and improvement in the core technology of DW; however, the
DQ aspects are among the crucial issues that have not been adequately addressed. In a survey by
Friedman, Nelson, and Radcliffe (2004), it was stated that 75 percent of survey respondents
reported significant problems stemming from defective and fragmented data, over 50 percent had
incurred costs for data reconciliations, and 33 percent had IT systems delayed owing to data
quality problems. A second survey by Ambler (2006) reported several metrics from the responses
that indicate Data Quality has been a major issue and requires considerable attention to solve.
For example, the following chart illustrates that only 2 percent of the respondents feel
good about the data quality in their data warehousing, and the remaining 98 percent indicate some
kind of data quality issue that needs to be addressed.
Figure 1. Current State of Data Quality (Ambler, 2006)
One of the major factors influencing DQ is user perception. Furthermore, if user
assumptions or perceptions are unchecked, then over time they start to become 'the truth' whether
or not they have an objective or factual basis, from both business and technical perspectives (Bryan,
2002). DQ indicates how well enterprise data matches up with the real world at any given time.
There are many sources of "dirty data". These sources consist of a) poor data entry, which
includes misspellings, typographical errors and transpositions, and variations in spelling or
naming, b) data missing from database fields, c) lack of company-wide or industry-wide data
coding standards, d) multiple databases scattered throughout different departments or
organizations, with the data in each structured according to the rules of that particular database,
and e) older systems that contain poorly documented or obsolete data (Andrea & Miriam, 2005).
Nord (2005) mentions that DQ has become an increasingly critical concern and has been
rated as a top concern to data consumers in many organizations. Nord (2005) goes on to state
that data quality is gaining importance within research and among consumer
organizations.
Ensuring a high level of DQ is one of the most expensive and time-consuming tasks to
perform in data warehousing projects. Many data warehouse projects have failed halfway
through due to poor DQ. This is often because DQ problems do not become apparent until the
project is underway. Any changes to the DW at the implementation stage are extremely costly and
may push project budget limits. If all the considerations are examined thoroughly at the strategy
and design stage of the DW, plans and controls for DQ can be built into the design that can
decrease operational costs, increase customer satisfaction, and improve decision-making
and employee confidence in using the data (Andrea & Miriam, 2005). The quality of information
systems (IS) is critically important for companies to derive a return on their investments.
Therefore, developing good quality in Data Warehousing that meets user needs is becoming a
critical theme for information technology management (Guimaraes, Staples & McKeen, 2007).
English (2001) listed several examples in his paper that draw attention to the negative impact of
DQ issues in DW. Some of them include errors in students' Basic Standards Test scores,
pension withholdings, invoicing, and food processing that led to the loss of billions of dollars as
well as loss of reputation for those businesses.

Section 2 of this paper presents a brief literature review. In section 3, the authors examine
the process and critical factors of DQ. Then, the following section discusses the current practices
of DQ in DW, and proposes a solution based on the practitioner's point of view for improving the
quality of data. The last section summarizes the paper.
2. Literature Review
Software quality assurance (QA) is one of the critical functions in the
development and maintenance of software systems. Because QA is a rigorous function that adds
significant effort and cost to the total software development cost, the QA process is often
compromised during software development. However, the concern has not been adequately
addressed in the literature. There are many facets of QA in a DW project; this paper is primarily
intended to focus on the QA process and the factors involved in a DW's Data Quality aspects. As
quality assurance aims to detect systematic risks in order to avoid them, the authors will discuss
various quality assurance factors in this paper. As QA aims at systematic coverage from business
requirements to system requirements to test plan and test execution, the process ensures data
quality is achieved to an acceptable level.

Iain & Don (2005) argue that in order to tackle this difficult issue, organizations need
both a top-down approach to DQ sponsored by the most senior levels of management and a
comprehensive bottom-up analysis of data sourcing, usage and content, including an assessment
of the enterprise's capabilities in terms of data management, relevant tools, and people skills. Xu,
Nord, Brown, and Nord (2002) believe that for organizations considering implementing a DW,
it is essential that DQ issues be thoroughly understood and that the organizations obtain
knowledge of the critical success factors essential to ensure DQ during the implementation
process. The main components of data that determine DQ are completeness,
appropriateness, accuracy, grouping accuracy, access, confidence, currency, regulators, legal
compliance, and meta-linking. Data interface, data replication, and data migration and movement
all share common characteristics such as volume of data, timeliness of movement and
processing, and direction of flow between sources and targets (Bryan, 2002).
DQ tools generally fall into one of three categories: auditing, cleansing and migration.
Data auditing tools apply predefined business rules against a source database. These tools
enhance the accuracy and correctness of the data at the source. Some data cleansing tools
compare the data against an independent source, e.g. US Postal Codes, to verify the data.
Data is typically moved from the source to an intermediate staging area where the data cleansing
activities are performed.

Data migration is an activity where data is extracted and transported from one source to
another. Data migration tools perform the extraction, transportation and mapping of
data from one platform to another. Poor DQ impacts the typical enterprise in many ways, such as
customer dissatisfaction, increased cost, and lowered employee job satisfaction. The slightest
suspicion of poor DQ often hinders managers from reaching any decision. In order to ensure DQ
assessment, Hufford (1996) proposed a model which consists of defining DQ expectations and
metrics, identifying and assessing risks, mitigating risks, and monitoring and evaluating results
on an on-going basis.
3. Process and the Critical Factors of DQ
Data warehousing depends on integrating data quality assurance into all warehousing
phases: planning, implementation, and maintenance (Ballou and Tayi, 1999). Experts in quality
control methodology always recommend addressing the "root cause", duly considering the
following quality expectations:
1) Accuracy
2) Completeness
3) Timeliness
4) Integrity
5) Consistency
6) Conformity
7) Record Duplication
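Two of these expectations, completeness and record duplication, lend themselves to a simple measurement. The following is a minimal sketch, not the authors' method: the function names, field names, and sample rows are hypothetical, and "complete" is assumed to mean that every required field is present and non-empty.

```python
from collections import Counter


def completeness(records, required_fields):
    """Fraction of records in which every required field is present and non-empty."""
    if not records:
        return 1.0
    complete = sum(
        all(rec.get(field) not in (None, "") for field in required_fields)
        for rec in records
    )
    return complete / len(records)


def duplicate_count(records, key_fields):
    """Number of surplus records that share a key with an earlier record."""
    keys = Counter(tuple(rec.get(field) for field in key_fields) for rec in records)
    return sum(n - 1 for n in keys.values() if n > 1)


# Hypothetical sample data for illustration.
rows = [
    {"customer_id": 1, "name": "Ann", "zip": "55403"},
    {"customer_id": 2, "name": "", "zip": "55121"},
    {"customer_id": 1, "name": "Ann", "zip": "55403"},
    {"customer_id": 3, "name": "Raj", "zip": None},
]
print(completeness(rows, ["customer_id", "name", "zip"]))
print(duplicate_count(rows, ["customer_id"]))
```

The other expectations (accuracy, timeliness, conformity) generally need a reference source or business rule to compare against, so they are not covered by this sketch.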
Nemani and Konda (2009) have presented an extended version of the Data Warehouse
Development Life Cycle (DWDLC) Layers, which lists comprehensive phases and links the Data
Quality factors to them. The major theme in each of the DWDLC layers can be described as
follows:
Figure 2. Data Warehouse Development Life Cycle (DWDLC) Layers, adopted
from Nemani and Konda (2009)
1) Planning: Apart from DQ project success, it is evident that defining and managing the
project scope influences the project's overall success. Every DW project requires a careful
balance in which data sources, processes, procedures, and other factors are scoped commensurate
with the project's size, complexity, and importance.
2) Analysis: In this layer, one should consider analyzing the data from the various available
data sources. In this phase it is recommended to perform data profiling.
3) Requirements: In this layer, the DW professional collaborates with the business
stakeholders to understand the business problem by defining and documenting the required data
quality factors for the DW project.
4) Develop: In this phase, the DW professional will develop and test the DW solution,
keeping in mind the DQ factors defined in the requirements phase.
5) Implement: In this phase, the DQ solution will be implemented after being duly signed off by
the quality assurance team.
6) Measure: In this phase, data sampling is done and a measure of current
process capability is worked out for the DQ factors defined in the requirements phase. This activity
helps to minimize data quality problems.
As a basic process for practicing Data Quality, organizations need to understand and
define the process of data flow, data transformation and data storage. The process should consist
of the source, staging area, data processing, data transformation, and data storage as follows:
Figure 3. Foundational Data Warehouse Load Process Stages

We have identified four different kinds of DQ assessment classifications: Data Source,
Data Load Process, Data Transformation and Data Load to Target Tables. We further defined
multiple DQ assessment criteria for each DQ assessment class. These DQ assessment
criteria have been linked to a quality assurance method from a practitioner's perspective and
summarized in Table 1 below.
[Figure 3 stages, in order: Data Source, Staging Area, Data Load Process, Data
Transformation, Data Load to target tables. Notes attached to the stages: prevention of
data corruption should be the focus; ensure all the files are being processed, reprocess the
failed ones, and log and discard the corrupt ones; apply the business logic to the data to
meet the desired form; repair and reload the data as needed; audit/inspect the data using
business rules for data quality; build, execute, and report the DQ rules/metrics.]
Table 1. Classification of DQ assessment class, criteria and method
DQ Assessment Class          DQ Assessment Criterion         Quality Assurance Method
Data Source                  Source Location                 Validate Source Location
                             Sequence of Data Files          Validate Sequence of Files
                             Senders Address                 Confirm Senders Address
                             File Size                       Validate File Size
                             File Receipt Acknowledgement    Confirm File Receipt
                             Record Count                    Reconcile with Source
Data Load Process            Loading Time                    Verify Load Time
                             Notify Load Process             Verify Load Status Notification
                             File Status Tracking            Validate Action
                             Data Load                       Verify Complete Load
                             Track Failed Data Load          Reconcile Failed Data Load
Data Transformation          Business Rules                  Validate Business Rules
                             Track Failed Business Rules     Reconcile Failed Business Rules
                             Target Data Loads               Validate Data Loads
Data Load to Target Tables   Data Types                      Validate Consistent Data Types
                             Business Rules                  Validate Target Data
                             Data Completeness               Validate Data Completeness
                             Data Load                       Verify Complete Load
                             Track Failed Data Load          Reconcile Failed Data Load
                             Notify Load Process             Verify Load Status Notification
The source can be defined as the origin of the data. For example, if an organization has
several locations where data is being captured, then each of the locations will become a source.
In the process of loading the source data into the DW, the data will be held in a staging area of
the DW. Home-grown or off-the-shelf software can be used to load data from the staging area into
the DW storage tables. Typically within the process, the source data is transformed to meet the
business logic prior to loading into the target tables of the DW. Once the data is transformed into
target tables / storage, an audit/inspection plan must be devised based on the business rules that
are based on the desirable characteristics of the final data. The desirable characteristics can be
verified by building the DQ rules, executing the tests, and reporting any issues with data
characteristics for correcting/repairing and for preventive action in the upstream process.
4. Classification of DQ-Criteria for a QA solution
Several research projects have tackled the problem of assessing scores for information
quality criteria. Naumann & Rolker (2000) present a classification of IQ criteria which is
designed to help organizations assess the status of their organizational information quality and
monitor their IQ improvements. The authors of this paper have extended Naumann & Rolker's
(2000) classification of IQ criteria to the DQ criteria. Nemani and Konda's (2009) Data
Warehouse Development Life Cycle (DWDLC) Layers model was also leveraged to further
strengthen the actionable and detailed tasks to accomplish data quality. Quality Assurance is,
in a broad sense, a framework that encompasses understanding, planning, and execution of test
plans before software applications are deployed for intended use during each phase of the
DWDLC. Typically, the process starts with studying the business requirements and systems
requirements documents to understand the overall scope of the application/software as well as to
define the scope of the QA. The next step in the process is to devise the test strategy, test plan and
test cases. One of the key tasks in developing test cases is to understand the business and systems
requirements, and formulate the test cases for each of the requirements. The test case developer
must also include information about the test environment, and the before and after results from
the test. In Table 1 above, we have defined the QA assessment criteria and the respective
quality methodology. Below, we provide a brief definition of each and the respective
details on QA methodology.
Data Source: The major emphasis during this phase is to check and validate that the files are
being received from the pre-identified locations, and from the designated sender. Files are also
cross-validated to ensure that the file size and the sequence of dimension and fact data are in order.
The following detailed QA steps will ensure the above objective can be met.
Validate Source Location: Identify and validate the source location; for example,
files must be received only from the pre-identified locations and in a pre-identified format
such as DB2, SQL, AS400 or any other external source file.
Validate Sequence of Files: Validate the logical sequence of all the identified data
files. For example, the member file needs to be loaded prior to processing
any claims.
Confirm Senders Address: Verify the sender's address; it is critical to know the
source sender information for tracking and feedback purposes.
Validate File Size: Cross-verify the size of the received files against the source files to ensure
that the entire expected file has been received.
Confirm File Receipt: Verify that a receipt acknowledgment confirmation is sent to the
source for reconciliation/tracking purposes.
Reconcile with Source: Verify that all the records in all files are processed by validating
the following three steps:
File header validation: Verify that the header displays the record type, for example
'0' for header, along with a date and time stamp.
File detail validation: Verify that the total # of records displayed in the trailer count
equals the total records that exist in the detail segment, and that the record type is '2'.
File trailer validation: Validate the record type, e.g. the trailer record displays '8',
and also displays the total count of detail records, excluding the header and
trailer records.
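The three reconciliation steps above can be sketched as a single check over a received file. This is an illustration under stated assumptions, not the authors' implementation: the pipe-delimited layout, the function name, and the sample lines are hypothetical, and the record type codes are passed as parameters so they can be set to whatever the file specification uses.

```python
def reconcile_file(lines, header_type="0", detail_type="2", trailer_type="8", sep="|"):
    """Return a list of problems found; an empty list means the file reconciles.

    Assumed (hypothetical) layout: one record per line, fields separated by `sep`,
    record type in the first field, header date/time stamp and trailer count
    in the second field.
    """
    if len(lines) < 2:
        return ["file must contain at least a header and a trailer"]
    problems = []
    header, *details, trailer = [line.rstrip("\n").split(sep) for line in lines]

    # Header: expected record type plus a non-empty date/time stamp.
    if header[0] != header_type:
        problems.append("header has unexpected record type")
    if len(header) < 2 or not header[1].strip():
        problems.append("header is missing its date/time stamp")

    # Detail: every record between header and trailer carries the detail type.
    bad = [n for n, rec in enumerate(details, start=2) if rec[0] != detail_type]
    if bad:
        problems.append(f"unexpected record type on lines {bad}")

    # Trailer: expected record type and a count equal to the number of detail records.
    if trailer[0] != trailer_type:
        problems.append("trailer has unexpected record type")
    declared = trailer[1].strip() if len(trailer) > 1 else ""
    if not declared.isdigit() or int(declared) != len(details):
        problems.append(
            f"trailer count {declared!r} does not match {len(details)} detail records")
    return problems


# Hypothetical sample files for illustration.
good = ["0|2009-07-22 10:00", "2|M001|claim-a", "2|M002|claim-b", "8|2"]
short = good[:-1] + ["8|3"]
print(reconcile_file(good))
print(reconcile_file(short))
```

Returning a list of problems rather than raising on the first one lets the QA report show every discrepancy in a file at once.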
Data Load Process: The emphasis during this phase is to ensure that the files received have
been processed. Within this, the key measures include the loading time, user notification, file
status tracking, and reconciling the failed data load. The following detailed QA steps will ensure
the above objective can be met.
Verify Load Time: Validate that the actual data load time has not greatly exceeded
the estimated time.
Verify Load Status Notification: Validate that the email notification process is functioning as
expected. Email/status notifications are sent periodically indicating the status of the load.
Validate Action: File status tracking validation verifies that failed data loads are
re-processed within the stipulated time, after the issue has been identified and corrected.
Verify Complete Load: A critical validation is the completeness of the data load. Make sure
that all the fields, with the specified size and criteria, have loaded successfully.
Reconcile Failed Data Load: Verify that records that failed during the data load process
are being tracked, validated, reviewed, updated and re-processed if necessary.
Data Transformation: Typically, the source data is transformed in order to meet the
business needs as well as standardization across the database. In this phase of QA, the emphasis
is to verify that the business rules are being applied, and to reconcile the failed data loads. The
following detailed QA steps will ensure the above objective can be met.
Validate Business Rules: Validate the business rules transformation; for example, if the
source table requires changing the data value from "Female" to the number "2" in
transformation, verify that all "Female" values are transformed to numeric "2".
Reconcile Failed Business Rules: Verify that any failed or unidentified
business rule transformations are captured, revalidated, re-considered and
changed per business requirements.
Validate Data Loads: Re-validate that the data transformation file is successfully loaded
and matches the identified counts in source and target.
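As a rough sketch of the business-rule validation and reconciliation steps above, the check below compares source and transformed rows pairwise. The function name, the `gender` field, and the "Male" rule are hypothetical; only the "Female" to "2" rule comes from the text.

```python
def check_transformation(source_rows, target_rows, field, mapping):
    """Compare source and transformed rows pairwise; return (matched, failures)."""
    if len(source_rows) != len(target_rows):
        raise ValueError("source and target row counts differ")
    matched, failures = 0, []
    for index, (src, tgt) in enumerate(zip(source_rows, target_rows)):
        expected = mapping.get(src[field])
        if expected is None:
            # Value with no rule: capture it so the rule set can be revisited.
            failures.append((index, src[field], "no business rule for this value"))
        elif tgt[field] != expected:
            failures.append(
                (index, src[field], f"expected {expected!r}, found {tgt[field]!r}"))
        else:
            matched += 1
    return matched, failures


# "Female" -> "2" is the example from the text; "Male" -> "1" is hypothetical.
rules = {"Female": "2", "Male": "1"}
source = [{"gender": "Female"}, {"gender": "Male"},
          {"gender": "Female"}, {"gender": "Unknown"}]
target = [{"gender": "2"}, {"gender": "1"},
          {"gender": "Female"}, {"gender": "Unknown"}]
matched, failures = check_transformation(source, target, "gender", rules)
print(matched)
for failure in failures:
    print(failure)
```

The row-count check at the top corresponds to matching counts between source and target; the failure list is what would feed the reconciliation step.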
Data Load to Target Tables: This is the critical phase where one can verify and validate
the final data. This will include validating the consistent usage of data types, data completeness,
the right data in the right target tables, and reconciliation of failed data loads. The following
detailed QA steps will ensure the above objective can be met.
Validate Consistent Data Types: Verify that the data field types in the targeted tables
are consistent throughout the database. For example, CustomerID is a number
datatype across all tables wherever customerid is used.
Validate Target Data: Verify that the business rules are current and producing the
required data in the target tables.
Validate Data Completeness: Verify data completeness, which is to ensure that the
right data is loaded into the right target tables.
Verify Complete Load: Validate that the targeted data load is complete in size and correct in
data field values when compared against the source and transformation files.
Reconcile Failed Data Load: Verify that the failed data load is reconsidered for
re-processing after the required changes are made in the original data file.
Verify Load Status Notification: Verify that the status of the load process has been
published periodically during the data loading process; for example, if the job aborts, "File
ABC aborted during identified time", or else "Load successfully completed (including the
record count)".
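The "Validate Consistent Data Types" step above can be illustrated with a small catalog query. This sketch assumes a SQLite database purely for the sake of a self-contained example (the paper does not name a database product); the function name and the two sample tables are hypothetical, and column names are compared case-insensitively as in the CustomerID/customerid example.

```python
import sqlite3
from collections import defaultdict


def inconsistent_columns(conn):
    """Map each column name declared with more than one type to {type: [tables]}."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    seen = defaultdict(lambda: defaultdict(list))
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, *_rest in conn.execute(
                f'PRAGMA table_info("{table}")'):
            seen[name.lower()][col_type.upper()].append(table)
    return {col: dict(types) for col, types in seen.items() if len(types) > 1}


# Hypothetical target tables: customerid is declared with two different types.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (CustomerID INTEGER, name TEXT);
    CREATE TABLE orders (order_id INTEGER, customerid TEXT);
""")
print(inconsistent_columns(conn))
```

In another database product the same idea would apply to that product's own catalog views rather than to `sqlite_master` and `PRAGMA table_info`.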
5. Conclusions
In this paper, the authors have extended the DWDLC (Nemani & Konda, 2009) approach
by combining subjective and objective DQ assessments which are applied in practice. The main
objective of any DW is to provide decision makers a "single version of the truth" of high quality
data. This enables managers and employees to make informed and better decisions.
Data Quality (DQ) can be attributed to several factors such as data accuracy, completeness,
timeliness, coherency, consistency, conformity, and record duplication. However, low quality
data has severe effects on an organization's performance. Unless a defined and planned approach
for data quality is followed during the different phases of DW, the organizations may suffer from
data quality issues, and any efforts to fix the DQ issues become very expensive and time
consuming. In this paper, we have identified the DQ Assessment Classes, DQ Assessment
Criteria, and the respective Quality Assurance Methods. A detailed explanation is provided for
each DQ Assessment Criterion, with the related QA Method and QA test cases that will ensure
the achievement of the expected quality level in DW development and deployment.
References
Abdel-Hamid, T. (1988). The Economics of Software Quality Assurance: A Simulation-Based
Case Study. MIS Quarterly, 12(3), 395-411.
Ambler, S. W. (2006). Data Quality Survey Results.
http://www.ambysoft.com/downloads/surveys/DataQuality200609.ppt, accessed on July
20, 2009.
Andrea, R., & Miriam, C. (2005). Invisible Data Quality Issues in a CRM Implementation.
Journal of Database Marketing & Customer Strategy Management, Vol. 12, No. 4, pp.
305-314.
Ballou, D., & Tayi, G. (1999). Enhancing Data Quality in Data Warehouse Environments.
Communications of the ACM, 42(1), pp. 73-78.
Bryan, F. (2002). Managing The Quality and Completeness of Customer Data. Journal of
Database Management, Vol. 10, No. 2, pp. 139-158.
Buckley, F. and Poston, R. (1984). "Software Quality Assurance," IEEE Transactions on
Software Engineering, pp. 31-41.
English, L.P. (2001). Information Quality Management: The Next Frontier. Annual Quality
Congress Proceedings, American Society for Quality, Milwaukee, WI, pp. 529-33.
Chow, T.S. (ed.) (1985). Software Quality Assurance: A Practical Approach, IEEE Computer
Society Press, Silver Spring, MD.
Friedman, Nelson, and Radcliffe (2004). CRM Demands Data Cleansing. Gartner Research.
Guimaraes, T., Staples, D.S., & McKeen, J.D. (2007). Assessing the Impact from Information
Systems Quality. Quality Management Journal, Vol. 14, No. 1, pp. 30-44.
Hufford, D. (1996). Data Warehouse Quality, Data Management Review, Feb/Mar.
Iain, H., & Don, M. (2005). Prioritizing and Deploying Data Quality Improvement Activity.
Journal of Database Marketing & Customer Strategy Management, Vol. 12, No. 2, pp.
113.
Jean-Pierre, D. (2004). Integrating DQ into Your Data Warehouse Architecture. Business
Intelligence Journal, 9(2), 18.
Naumann, F. & Rolker, C. (2000). Assessment Methods for Information Quality Criteria. In:
Proceedings of the 2000 Conference on Information Quality, Cambridge, MA 1222, pp.
148-162.
Nemani, R. R., & Konda, R. (2009). A Framework for Data Quality in Data Warehousing.
Proceedings of the third International United Information Systems Conference,
UNISCON, Sydney, Australia, 20(1), pp. 292-297.
Nord, G. D. (2005). An Investigation of the Impact of Organization Size on Data Quality Issues.
Journal of Database Management, Vol. 16, No. 3, pp. 58-71.
Xu, H., Nord, J.H., Brown, N., Nord, G.D. (2002). Data Quality Issues in Implementing an ERP.
Industrial Management & Data Systems, Vol. 102, No. 1, pp. 47-60.