
The Data Warehouse

Chapter 6

6.1 Operational Databases

Data Modeling and Normalization


One-to-One Relationships
One-to-Many Relationships
Many-to-Many Relationships

Data Modeling and Normalization


First Normal Form
Second Normal Form
Third Normal Form

Entity: Vehicle-Type (attributes: Type ID, Make, Year)
Entity: Customer (attributes: Customer ID, Income Range)

Figure 6.1 A simple entity-relationship diagram

The Relational Model

6.2 Data Warehouse Design

The Data Warehouse


A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision making process (W.H. Inmon).

Granularity
Granularity is a term used to describe the level of detail of stored information.
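
A minimal Python sketch of granularity, assuming a handful of made-up purchase records (the field names and values are illustrative, not taken from the slides). The same data is held at transaction-level detail and then summarized to a coarser monthly level.

```python
from collections import defaultdict

# Hypothetical transaction-level records (the finest granularity).
transactions = [
    {"date": "2002-01-05", "category": "Supermarket", "amount": 14.50},
    {"date": "2002-01-05", "category": "Supermarket", "amount": 22.40},
    {"date": "2002-01-06", "category": "Retail", "amount": 8.25},
    {"date": "2002-02-01", "category": "Retail", "amount": 10.00},
]


def summarize_by_month(rows):
    """Coarsen the granularity: one total per (year-month, category)."""
    totals = defaultdict(float)
    for row in rows:
        month = row["date"][:7]  # keep only "YYYY-MM"
        totals[(month, row["category"])] += row["amount"]
    return dict(totals)


monthly = summarize_by_month(transactions)
for key in sorted(monthly):
    print(key, round(monthly[key], 2))
```

Once only the monthly summary is stored, the individual transactions can no longer be recovered, which is why the level of detail has to be chosen when the warehouse is designed.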

Components shown: Operational Database(s), External Data, ETL Routine (Extract/Transform/Load), Data Warehouse, Dependent Data Mart, Independent Data Mart, Extract/Summarize Data, Decision Support System, Report

Figure 6.2 A data warehouse process model

Entering Data into the Warehouse


Independent Data Mart
ETL (Extract, Transform, Load) Routine
Metadata

Structuring the Data Warehouse: Two Methods


Structure the warehouse model using the star schema
Structure the warehouse model as a multidimensional array

The Star Schema


Fact Table
Dimension Tables
Slowly Changing Dimensions

Purchase Dimension
Purchase Key   Category
1              Supermarket
2              Travel & Entertainment
3              Auto & Vehicle
4              Retail
5              Restaurant
6              Miscellaneous
. . .          . . .

Time Dimension
Time Key   Month   Day   Quarter   Year
10         Jan     5     1         2002
. . .      . . .   . . . . . .     . . .

Fact Table
Cardholder Key   Purchase Key   Location Key   Time Key   Amount
1                2              1              10         14.50
15               4              5              11         8.25
1                2              3              10         22.40
. . .            . . .          . . .          . . .      . . .

Cardholder Dimension
Cardholder Key   Name         Gender   Income Range
1                John Doe     Male     50 - 70,000
2                Sara Smith   Female   70 - 90,000
. . .            . . .        . . .    . . .

Location Dimension
Location Key   Street          City         State   Region
10             425 Church St   Charleston   SC      3
. . .          . . .           . . .        . . .   . . .

Figure 6.3 A star schema for credit card purchases
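
The way a fact table refers to its dimension tables can be sketched in Python. This is a minimal illustration rather than a warehouse implementation: the dictionaries hold only the rows visible in Figure 6.3, the fact rows are assumed to read row by row, and the function names `resolve` and `amount_by_category` are hypothetical.

```python
from collections import defaultdict

# Dimension tables keyed by surrogate key (only the rows shown in Figure 6.3).
purchase_dim = {
    1: "Supermarket",
    2: "Travel & Entertainment",
    3: "Auto & Vehicle",
    4: "Retail",
    5: "Restaurant",
    6: "Miscellaneous",
}
time_dim = {10: {"month": "Jan", "day": 5, "quarter": 1, "year": 2002}}
cardholder_dim = {
    1: {"name": "John Doe", "gender": "Male", "income_range": "50 - 70,000"},
    2: {"name": "Sara Smith", "gender": "Female", "income_range": "70 - 90,000"},
}
location_dim = {
    10: {"street": "425 Church St", "city": "Charleston", "state": "SC", "region": 3},
}

# Fact rows, assumed to be
# (cardholder_key, purchase_key, location_key, time_key, amount).
fact_table = [
    (1, 2, 1, 10, 14.50),
    (15, 4, 5, 11, 8.25),
    (1, 2, 3, 10, 22.40),
]


def resolve(fact):
    """Replace each foreign key in a fact row with its dimension row.

    Keys whose dimension rows are elided in the figure resolve to None.
    """
    cardholder_key, purchase_key, location_key, time_key, amount = fact
    return {
        "cardholder": cardholder_dim.get(cardholder_key),
        "category": purchase_dim.get(purchase_key),
        "location": location_dim.get(location_key),
        "time": time_dim.get(time_key),
        "amount": amount,
    }


def amount_by_category(facts):
    """Sum the fact amounts per purchase category."""
    totals = defaultdict(float)
    for _cardholder, purchase_key, _location, _time, amount in facts:
        totals[purchase_dim[purchase_key]] += amount
    return dict(totals)


first = resolve(fact_table[0])
totals = amount_by_category(fact_table)
print(first["cardholder"]["name"], first["category"], first["amount"])
print(totals)
```

The fact table stores only keys and the numeric measure; every descriptive attribute lives once in a dimension table and is reached through a key lookup.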

The Multidimensionality of the Star Schema

Axes: Purchase Key, Location Key, Time Key, shown for cardholder Ci
Labeled point: A(Ci, 1, 2, 10)

Figure 6.4 Dimensions of the fact table shown in Figure 6.3

Additional Relational Schemas


Snowflake Schema
Constellation Schema

Promotion Dimension
Promotion Key   Description   Cost
1               watch promo   15.25
. . .           . . .         . . .

Time Dimension
Time Key   Month   Day   Quarter   Year
5          Dec     31    4         2001
8          Jan     3     1         2002
10         Jan     5     1         2002
. . .      . . .   . . . . . .     . . .

Promotion Fact Table
Cardholder Key   Promotion Key   Time Key   Response
1                1               5          Yes
2                1               5          No
. . .            . . .           . . .      . . .

Purchase Dimension
Purchase Key   Category
1              Supermarket
2              Travel & Entertainment
3              Auto & Vehicle
4              Retail
5              Restaurant
6              Miscellaneous

Purchase Fact Table
Cardholder Key   Purchase Key   Location Key   Time Key   Amount
1                2              1              10         14.50
15               4              5              11         8.25
1                2              3              10         22.40
. . .            . . .          . . .          . . .      . . .

Cardholder Dimension
Cardholder Key   Name         Gender   Income Range
1                John Doe     Male     50 - 70,000
2                Sara Smith   Female   70 - 90,000
. . .            . . .        . . .    . . .

Location Dimension
Location Key   Street          City         State   Region
5              425 Church St   Charleston   SC      3
. . .          . . .           . . .        . . .   . . .

Figure 6.5 A constellation schema for credit card purchases and promotions
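
Because the two fact tables in Figure 6.5 share the cardholder dimension, a single query can relate promotion responses to purchase totals. The Python sketch below is illustrative only: it uses the rows visible in the figure (assumed to read row by row), and `spending_by_response` is a hypothetical helper name.

```python
# Shared dimension and promotion dimension (rows as read from Figure 6.5).
cardholder_dim = {1: "John Doe", 2: "Sara Smith"}
promotion_dim = {1: {"description": "watch promo", "cost": 15.25}}

# Purchase fact rows, assumed to be
# (cardholder_key, purchase_key, location_key, time_key, amount).
purchase_facts = [
    (1, 2, 1, 10, 14.50),
    (15, 4, 5, 11, 8.25),
    (1, 2, 3, 10, 22.40),
]

# Promotion fact rows, assumed to be
# (cardholder_key, promotion_key, time_key, response).
promotion_facts = [
    (1, 1, 5, "Yes"),
    (2, 1, 5, "No"),
]


def spending_by_response(purchases, promotions, promotion_key):
    """Total purchase amount per cardholder, grouped by promotion response."""
    spent = {}
    for cardholder, _purchase, _location, _time, amount in purchases:
        spent[cardholder] = spent.get(cardholder, 0.0) + amount

    result = {}
    for cardholder, promo, _time, response in promotions:
        if promo == promotion_key:
            # Cardholders with no visible purchases get a total of 0.0.
            result.setdefault(response, {})[cardholder] = spent.get(cardholder, 0.0)
    return result


report = spending_by_response(purchase_facts, promotion_facts, promotion_key=1)
for response, by_cardholder in report.items():
    for key, total in by_cardholder.items():
        print(response, cardholder_dim[key], round(total, 2))
```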

Decision Support: Analyzing the Warehouse Data


Reporting Data
Analyzing Data
Knowledge Discovery

6.3 On-line Analytical Processing

OLAP Operations
Slice: A single-dimension operation
Dice: A multidimensional operation
Roll-up: A higher level of generalization
Drill-down: A greater level of detail
Rotation: View data from a new perspective
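
The operations listed above can be sketched on a tiny cube held in a Python dictionary. Everything here is an illustrative assumption (the dimension order, the function names, and all cell values except the Dec./Vehicle/Two amount of 6,720 quoted with Figure 6.6); real OLAP servers use their own storage and query languages. Drill-down is the inverse of roll-up and needs the finer-grained cells, so it is not shown as a separate function.

```python
from collections import defaultdict

DIMS = ("month", "category", "region")

# cube[(month, category, region)] = amount; mostly made-up values.
cube = {
    ("Oct", "Supermarket", "One"): 100,
    ("Nov", "Supermarket", "One"): 150,
    ("Dec", "Supermarket", "One"): 200,
    ("Dec", "Vehicle", "Two"): 6720,
    ("Jan", "Retail", "Three"): 50,
}

MONTH_TO_QUARTER = {"Jan": "Q1", "Oct": "Q4", "Nov": "Q4", "Dec": "Q4"}


def slice_cube(cube, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cube.items() if k[i] == value}


def dice_cube(cube, **allowed):
    """Dice: restrict two or more dimensions to sets of values."""
    return {
        k: v
        for k, v in cube.items()
        if all(k[DIMS.index(d)] in values for d, values in allowed.items())
    }


def roll_up(cube, dim, mapping):
    """Roll-up: map one dimension to a more general level and sum the cells."""
    i = DIMS.index(dim)
    out = defaultdict(int)
    for k, v in cube.items():
        out[k[:i] + (mapping[k[i]],) + k[i + 1:]] += v
    return dict(out)


def rotate(cube, order):
    """Rotation: reorder the dimensions of every cell key."""
    idx = [DIMS.index(d) for d in order]
    return {tuple(k[j] for j in idx): v for k, v in cube.items()}


region_one = slice_cube(cube, "region", "One")
dec_vehicle = dice_cube(cube, month={"Dec"}, category={"Vehicle"})
quarterly = roll_up(cube, "month", MONTH_TO_QUARTER)
rotated = rotate(cube, ("region", "category", "month"))
print(quarterly[("Q4", "Supermarket", "One")])
```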

Dimensions: Month (Jan. through Dec.), Category (Supermarket, Restaurant, Travel, Vehicle, Retail, Miscellaneous), Region (One, Two, Three, Four)
Highlighted cell: Month = Dec., Category = Vehicle, Region = Two, Amount = 6,720, Count = 110

Figure 6.6 A multidimensional cube for credit card purchases

Concept Hierarchy
A mapping that allows attributes to be viewed from varying levels of detail.

Region

State

City

Street Address

Figure 6.7 A concept hierarchy for location
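
A concept hierarchy can be sketched as an ordered list of levels, so the same location can be viewed at any level of detail. The sketch below assumes the levels of Figure 6.7; the second address, the amounts, and the function names are hypothetical.

```python
# Levels ordered from most specific to most general, as in Figure 6.7.
LEVELS = ("street", "city", "state", "region")

locations = [
    {"street": "425 Church St", "city": "Charleston", "state": "SC", "region": 3},
    # Hypothetical second location, added for illustration only.
    {"street": "10 Main St", "city": "Columbia", "state": "SC", "region": 3},
]
amounts = [14.50, 8.25]  # one made-up purchase amount per location


def view_at(location, level):
    """Describe a location at the given level and every level above it."""
    start = LEVELS.index(level)
    return tuple(location[name] for name in LEVELS[start:])


def total_by_level(locations, amounts, level):
    """Aggregate amounts after generalizing each location to `level`."""
    totals = {}
    for location, amount in zip(locations, amounts):
        key = view_at(location, level)
        totals[key] = totals.get(key, 0.0) + amount
    return totals


by_city = total_by_level(locations, amounts, "city")
by_state = total_by_level(locations, amounts, "state")
print(by_city)
print(by_state)
```

Moving from `city` to `state` merges the two rows into one, which is the roll-up operation expressed through the hierarchy.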

Dimensions: Time (Q1, Q2, Q3, Q4), Category (Supermarket, Restaurant, Travel, Vehicle, Retail, Miscellaneous), Region (One, Two, Three, Four)
Highlighted cell: Month = Oct./Nov./Dec., Category = Supermarket, Region = One

Figure 6.8 Rolling up from months to quarters

6.4 Excel Pivot Tables for Data Analysis

Creating a Simple Pivot Table

Figure 6.9 A pivot table template

Figure 6.10 A summary report for income range

Figure 6.11 A pie chart for income range

Pivot Tables for Hypothesis Testing

Figure 6.12 A pivot table showing age and credit card insurance choice

Figure 6.13 Grouping the credit card promotion data by age

Figure 6.14 PivotTable Layout Wizard

Creating a Multidimensional Pivot Table

Dimensions: Watch Promo (Yes, No), Life Insurance Promo (Yes, No), Magazine Promo (Yes, No)
Highlighted cell: Watch Promo = No, Life Insurance Promo = Yes, Magazine Promo = Yes

Figure 6.15 A credit card promotion cube

Figure 6.16 A pivot table with page variables for credit card promotions
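
The idea behind a multidimensional pivot table with a page variable can be sketched outside Excel. The Python below is an analogy, not Excel's implementation: the customer records are made up, and `promotion_cube` and `page_view` are hypothetical names.

```python
from collections import Counter

# Hypothetical customer records; each promotion flag is "Yes" or "No".
customers = [
    {"watch": "No", "life": "Yes", "magazine": "Yes"},
    {"watch": "No", "life": "Yes", "magazine": "Yes"},
    {"watch": "Yes", "life": "No", "magazine": "Yes"},
    {"watch": "No", "life": "No", "magazine": "No"},
]


def promotion_cube(rows):
    """Count customers in each (watch, life insurance, magazine) cell."""
    return Counter((r["watch"], r["life"], r["magazine"]) for r in rows)


def page_view(cube, watch):
    """Fix the watch promotion as the page variable; return the 2-D table."""
    return {
        (life, magazine): count
        for (w, life, magazine), count in cube.items()
        if w == watch
    }


cube = promotion_cube(customers)
no_watch = page_view(cube, "No")
print(cube[("No", "Yes", "Yes")])
print(no_watch)
```

Choosing a value for the page variable selects one two-dimensional face of the cube, which corresponds to the slice operation of Section 6.3.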
