Data Warehousing
Modern Database
Management
8th Edition
Jeffrey A. Hoffer, Mary B. Prescott,
Fred R. McFadden
2007 by Prentice Hall
Objectives
Definition of terms
Reasons for information gap between information needs and availability
Reasons for need of data warehousing
Describe three levels of data warehouse architectures
List four steps of data reconciliation
Describe two components of star schema
Estimate fact table size
Design a data mart
Chapter 11
Definition
Data Warehouse:
Data Mart:
Data Warehouse
Architectures
One companywide warehouse, fed by extract (E), transform (T), and load (L)
Periodic extraction: data is not completely current in warehouse
Data marts:
Mini-warehouses, limited in scope
Separate ETL for each independent data mart
Near real-time ETL for Data Warehouse
Figure 11-7: Example of DBMS log entry
Data Characteristics
Status vs. Event Data
Status
Event = a database action (create/update/delete) that results from a transaction
Status
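The relationship between status and event can be sketched in a few lines of Python. This is only an illustration: the `Event` class, the `apply_event` function, and the account data are hypothetical names and values, not taken from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A database action resulting from a transaction (hypothetical structure)."""
    action: str          # "create", "update", or "delete"
    key: str
    value: object = None


def apply_event(status: dict, event: Event) -> dict:
    """Return the status that results from applying one event to a prior status."""
    new_status = dict(status)  # copy, so the prior status is left untouched
    if event.action in ("create", "update"):
        new_status[event.key] = event.value
    elif event.action == "delete":
        new_status.pop(event.key, None)
    else:
        raise ValueError(f"unknown action: {event.action}")
    return new_status


# Status before the event, the event itself, and the status after it.
before = {"acct-1": 750}
event = Event("update", "acct-1", 700)
after = apply_event(before, event)
```

Here `before` and `after` are status data, while `event` records the action that led from one to the other.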
Figure 11-8: Transient operational data
Data Characteristics
Transient vs. Periodic Data
With transient data, changes to existing records are written over previous records, thus destroying the previous data content
Figure 11-9: Periodic warehouse data
Data Characteristics
Transient vs. Periodic Data
Periodic data are never physically altered or deleted once they have been added to the store
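The contrast between transient and periodic data can be illustrated with a minimal Python sketch. The store layouts, function names, and account values below are assumptions made for the example, not the structures shown in Figures 11-8 and 11-9.

```python
from datetime import date

# Transient store: one slot per key, so each write destroys the previous value.
transient = {}


def transient_write(key, value):
    transient[key] = value


# Periodic store: append-only, every version is kept with a date stamp.
periodic = []


def periodic_write(key, value, as_of):
    periodic.append({"key": key, "value": value, "as_of": as_of})


def periodic_value_on(key, day):
    """Return the most recent value recorded for key on or before day, or None."""
    rows = [r for r in periodic if r["key"] == key and r["as_of"] <= day]
    return max(rows, key=lambda r: r["as_of"])["value"] if rows else None


# The same two updates are sent to both stores.
for day, balance in [(date(2007, 1, 5), 750), (date(2007, 1, 9), 700)]:
    transient_write("acct-1", balance)
    periodic_write("acct-1", balance, day)
```

After both updates, `transient` holds only the latest balance, while `periodic_value_on("acct-1", date(2007, 1, 6))` can still recover the earlier one.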
Transient: not historical
Not normalized (perhaps due to denormalization for performance)
Restricted in scope: not comprehensive
Sometimes poor quality: inconsistencies and errors
Extract
Scrub or data cleansing
Transform
Load and Index
ETL = Extract, transform, and load
Figure 11-10: Steps in data reconciliation
Static extract = capturing a snapshot of the source data at a point in time
Incremental extract = capturing changes that have occurred since the last static extract
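The two kinds of extract can be sketched as follows. The sketch assumes each source row carries a `modified` timestamp, which is a hypothetical field chosen for the example; real sources may detect changes differently (for instance from a DBMS log), and this simple comparison does not capture deleted rows.

```python
def static_extract(source):
    """Snapshot of every source row at this point in time (rows are copied)."""
    return {key: dict(row) for key, row in source.items()}


def incremental_extract(source, last_extract_time):
    """Only the rows changed after the last extract (assumes a 'modified' field)."""
    return {
        key: dict(row)
        for key, row in source.items()
        if row["modified"] > last_extract_time
    }


source = {
    1: {"name": "Ann", "modified": 10},
    2: {"name": "Bob", "modified": 12},
}

snapshot = static_extract(source)   # initial full snapshot
last_extract_time = 12

# Later changes in the source system: one update, one new row.
source[2] = {"name": "Robert", "modified": 15}
source[3] = {"name": "Cy", "modified": 16}

changes = incremental_extract(source, last_extract_time)
```

Only rows 2 and 3 appear in `changes`; the earlier `snapshot` still shows row 2 as it was at extract time.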
Figure 11-10: Steps in data reconciliation (cont.)
Fixing errors: misspellings, erroneous dates, incorrect field usage, mismatched addresses, missing data, duplicate data, inconsistencies
Also: decoding, reformatting, time stamping, conversion, key generation, merging, error detection/logging, locating missing data
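A few of these error checks (missing data, erroneous dates, duplicate data) together with error logging can be sketched as below. The field names, the ISO date format, and the rule that a duplicate means the same normalized name and date are all assumptions made for the example.

```python
from datetime import datetime


def scrub(rows):
    """Split rows into cleaned rows and logged rejects (row, reason)."""
    clean, rejects, seen = [], [], set()
    for row in rows:
        name = (row.get("name") or "").strip()
        if not name:
            rejects.append((row, "missing name"))
            continue
        try:
            # strptime raises ValueError for impossible dates such as Feb 30.
            day = datetime.strptime(row.get("date") or "", "%Y-%m-%d").date()
        except ValueError:
            rejects.append((row, "erroneous date"))
            continue
        key = (name.lower(), day)
        if key in seen:
            rejects.append((row, "duplicate"))
            continue
        seen.add(key)
        clean.append({"name": name, "date": day})
    return clean, rejects


rows = [
    {"name": " Ann ", "date": "2007-01-05"},
    {"name": "ann", "date": "2007-01-05"},   # duplicate once normalized
    {"name": "Bob", "date": "2007-02-30"},   # erroneous date
    {"name": "", "date": "2007-01-06"},      # missing data
]
clean, rejects = scrub(rows)
```

Rejected rows are kept with a reason rather than silently dropped, which corresponds to the error detection/logging item above.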
Figure 11-10: Steps in data reconciliation (cont.)
Record-level:
Selection: data partitioning
Joining: data combining
Aggregation: data summarization
Field-level:
single-field: from one field to one field
multi-field: from many fields to one, or one field to many
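The three record-level functions can be shown on a tiny in-memory data set. The sales and store data, the selection threshold, and the variable names are hypothetical; in practice these steps would more likely be SQL or an ETL tool.

```python
from collections import defaultdict

sales = [
    {"store_id": 1, "amount": 100},
    {"store_id": 1, "amount": 50},
    {"store_id": 2, "amount": 70},
    {"store_id": 3, "amount": 30},
]
stores = {1: "East", 2: "West", 3: "East"}  # store_id -> region

# Selection (data partitioning): keep only rows meeting a criterion.
selected = [s for s in sales if s["amount"] >= 50]

# Joining (data combining): attach the region from the stores data.
joined = [{**s, "region": stores[s["store_id"]]} for s in selected]

# Aggregation (data summarization): total amount per region.
totals = defaultdict(int)
for row in joined:
    totals[row["region"]] += row["amount"]
```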
Figure 11-10: Steps in data reconciliation (cont.)
Algorithmic transformation uses a formula or logical expression
Table lookup: another approach, uses a separate table keyed by source record code
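Both single-field approaches can be sketched briefly. The temperature conversion and the status codes are illustrative assumptions, not the examples used in the figure.

```python
def fahrenheit_to_celsius(temp_f):
    """Algorithmic transformation: the target value is computed by a formula."""
    return (temp_f - 32) * 5 / 9


# Table lookup: a separate table keyed by the source record code (hypothetical codes).
STATUS_BY_CODE = {"A": "Active", "I": "Inactive"}


def decode_status(code):
    """Return the decoded status, or 'Unknown' when the code is not in the table."""
    return STATUS_BY_CODE.get(code, "Unknown")
```

For example, `fahrenheit_to_celsius(212)` gives `100.0`, and `decode_status("A")` gives `"Active"`.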
1:M (from one source field to many target fields)
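A 1:M transformation can be sketched as splitting one composite source field into several target fields. The code layout (`BRAND-NUMBER`) and the target field names are assumptions for this example only.

```python
def split_product_code(code):
    """Split one source field such as 'ACM-1042' into two target fields."""
    prefix, _, number = code.partition("-")
    # int() raises ValueError if the numeric part is missing or malformed.
    return {"brand_code": prefix, "item_number": int(number)}


target = split_product_code("ACM-1042")
```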
Derived Data
Objectives
Characteristics
Dimension table keys must be surrogate (nonintelligent and non-business related), because:
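What a surrogate key looks like in practice can be sketched as a meaningless sequence number assigned when a dimension row is first loaded. The `CustomerDimension` class, its method, and the customer numbers are hypothetical, and the sketch illustrates the idea only.

```python
import itertools


class CustomerDimension:
    """Hypothetical dimension table that assigns its own surrogate keys."""

    def __init__(self):
        self._next_key = itertools.count(1)   # meaningless sequence numbers
        self._by_business_key = {}            # business key -> surrogate key
        self.rows = {}                        # surrogate key -> dimension row

    def key_for(self, business_key, name):
        """Return the surrogate key for a business key, creating a row if new."""
        if business_key not in self._by_business_key:
            surrogate = next(self._next_key)
            self._by_business_key[business_key] = surrogate
            self.rows[surrogate] = {"customer_no": business_key, "name": name}
        return self._by_business_key[business_key]


dim = CustomerDimension()
k1 = dim.key_for("C-100", "Ann")
k2 = dim.key_for("C-200", "Bob")
k1_again = dim.key_for("C-100", "Ann")

# Fact rows reference the surrogate key, not the business key.
fact = {"customer_key": k1, "amount": 25}
```

The business key (`customer_no`) survives as an ordinary attribute of the dimension row, while fact rows join on the surrogate.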
OLAP Operations
Figure 11-24: Example of drill-down
Starting with summary data, users can obtain details for particular cells
Summary report
Drill-down with color added
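Drill-down can be sketched as re-summarizing the rows behind one summary cell with an extra dimension added. The sales rows, dimension names, and the `summarize` helper are hypothetical and do not reproduce the data in Figure 11-24.

```python
from collections import defaultdict

sales = [
    {"brand": "A", "package_size": "small", "units": 10},
    {"brand": "A", "package_size": "large", "units": 5},
    {"brand": "B", "package_size": "small", "units": 7},
    {"brand": "A", "package_size": "small", "units": 3},
]


def summarize(rows, *dims):
    """Total units for each combination of the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["units"]
    return dict(totals)


# Summary report: one cell per brand.
summary = summarize(sales, "brand")

# Drill-down on the brand "A" cell: same rows, one more dimension.
detail = summarize([r for r in sales if r["brand"] == "A"], "brand", "package_size")
```

The detail cells for brand "A" add up to that brand's summary cell, which is what makes the drill-down consistent with the summary report.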
Techniques