SS ZG515
BITS Pilani
Pilani Campus
PC Reddy
Guest Faculty WILP, BITS Pilani
Lecture 3 Outline
Review Lecture 3
Dimensional modeling
Fact Table
Measurements associated with a specific business process
Grain: level of detail of the table
Process events produce fact records
Facts (attributes) are usually
Numeric
Additive
Dimension Tables
Entities describing the objects of the process
Conformed dimensions - cross processes
Attributes are descriptive
Text
Numeric
Surrogate keys
1:m with the fact table
Null entries
Date dimensions
Bus Architecture
An architecture that permits aggregating data across multiple marts
Conformed dimensions and attributes
Bus matrix
CustKey | BKCustID | CustName | CommDist | Gender | HomOwn?
1552 | 31421 | Jane Rider | | |
Fact Table
Date | CustKey | ProdKey | Item Count | Amount
1/7/2004 | 1552 | 95 | 1,798.00
3/2/2004 | 1552 | 37 | 27.95
5/7/2005 | 1552 | 87 | 320.26
2/21/2006 | 2387 (replaces 1552) | 42 | 19.95
CustKey | BKCustID | CustName | CommDist | Gender | HomOwn? | Eff | End
1552 | 31421 | Jane Rider | | | | 1/7/2004 | 1/1/2006
2387 | 31421 | Jane Rider | 31 | | | 1/2/2006 | 12/31/9999
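The effective and end dates above determine which customer row a fact record should point to. A minimal sketch of that lookup, assuming the date ranges are inclusive and using hypothetical field and function names (`cust_key`, `lookup_cust_key`) that are not part of the slides:

```python
from datetime import date

# Two versions of the same customer (business key 31421), taken from the
# example above. Field names are illustrative assumptions.
customer_dim = [
    {"cust_key": 1552, "bk_cust_id": 31421, "name": "Jane Rider",
     "eff": date(2004, 1, 7), "end": date(2006, 1, 1)},
    {"cust_key": 2387, "bk_cust_id": 31421, "name": "Jane Rider",
     "eff": date(2006, 1, 2), "end": date(9999, 12, 31)},
]

def lookup_cust_key(dim_rows, bk_cust_id, txn_date):
    """Return the surrogate key whose [eff, end] range covers txn_date."""
    for row in dim_rows:
        if row["bk_cust_id"] == bk_cust_id and row["eff"] <= txn_date <= row["end"]:
            return row["cust_key"]
    return None  # no version covers this date

# Transaction dates from the fact table example.
keys = [lookup_cust_key(customer_dim, 31421, d)
        for d in (date(2004, 3, 2), date(2005, 5, 7), date(2006, 2, 21))]
print(keys)  # [1552, 1552, 2387]
```

The 2/21/2006 transaction resolves to key 2387 because it falls inside the second row's date range, which matches the key change shown in the fact table.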
Original row
ProductKey | Description | Category | SKU
21553 | LeapPad | Education | LP2105
Type 1 (overwrite)
ProductKey | Description | Category | SKU
21553 | LeapPad | Toy | LP2105
Type 2
ProductKey | Description | Category | SKU
21553 | LeapPad | Education | LP2105
44631 | LeapPad | Toy | LP2105
Type 3
ProductKey | Description | Category | OldCat | SKU
21553 | LeapPad | Toy | Education | LP2105
Hybrid
ProductKey | Description | Category | OldCat | SKU
21553 | LeapPad | Education | Electronics | LP2105
44631 | LeapPad | Toy | Education | LP2105
68122 | LeapPad | Education | Electronics | LP2105
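The product examples above can be reproduced with three small functions, one per change-handling technique. This is only a sketch: the function names and dictionary field names are assumptions, and the new surrogate key for Type 2 is passed in explicitly rather than generated.

```python
import copy

def scd_type1(rows, sku, new_category):
    """Overwrite the attribute in place; the old value is lost."""
    out = copy.deepcopy(rows)
    for r in out:
        if r["sku"] == sku:
            r["category"] = new_category
    return out

def scd_type2(rows, sku, new_category, new_key):
    """Keep the old row and append a new row with a new surrogate key."""
    out = copy.deepcopy(rows)
    current = [r for r in out if r["sku"] == sku][-1]
    out.append({**current, "product_key": new_key, "category": new_category})
    return out

def scd_type3(rows, sku, new_category):
    """Keep one row and move the prior value into an 'old' column."""
    out = copy.deepcopy(rows)
    for r in out:
        if r["sku"] == sku:
            r["old_cat"] = r["category"]
            r["category"] = new_category
    return out

original = [{"product_key": 21553, "description": "LeapPad",
             "category": "Education", "sku": "LP2105"}]

print(scd_type1(original, "LP2105", "Toy"))
print(scd_type2(original, "LP2105", "Toy", new_key=44631))
print(scd_type3(original, "LP2105", "Toy"))
```

Only the Type 2 version changes the surrogate key, so only Type 2 lets old fact rows keep pointing at the "Education" version of the product.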
Date Dimensions
One row for every day for which you expect to have data for the fact table (perhaps generated in a spreadsheet and imported)
Usually use a meaningful integer surrogate key (such as yyyymmdd: 20060926 for Sep. 26, 2006). Note: this order sorts correctly.
Include rows for missing or future dates to be added later.
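Instead of a spreadsheet, the date rows could also be generated in code. A minimal sketch, assuming a hypothetical `build_date_dimension` helper and a small illustrative set of attributes (a real date dimension would carry many more):

```python
from datetime import date, timedelta

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

def build_date_dimension(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    d = start
    while d <= end:
        rows.append({
            # yyyymmdd integer key: sorts in calendar order
            "date_key": d.year * 10000 + d.month * 100 + d.day,
            "full_date": d.isoformat(),
            "year": d.year,
            "month": d.month,
            "day_of_week": DAY_NAMES[d.weekday()],
            "is_weekday": d.weekday() < 5,
        })
        d += timedelta(days=1)
    return rows

dim = build_date_dimension(date(2006, 9, 25), date(2006, 9, 27))
print([r["date_key"] for r in dim])  # [20060925, 20060926, 20060927]
```

Extending `end` into the future pre-creates the rows that later fact records will reference.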
Aggregates
Precalculated summary tables
Improve performance
Record data at a coarser granularity
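As an illustration, the four transaction amounts from the customer fact table example earlier can be rolled up to one row per year. This is a sketch only; the yearly grain and the `aggregate_by_year` name are assumptions chosen for the example.

```python
from collections import defaultdict
from decimal import Decimal

# (date, amount) pairs from the earlier fact table example, dates in ISO form.
fact_rows = [
    ("2004-01-07", Decimal("1798.00")),
    ("2004-03-02", Decimal("27.95")),
    ("2005-05-07", Decimal("320.26")),
    ("2006-02-21", Decimal("19.95")),
]

def aggregate_by_year(rows):
    """Precompute total amount per year (a coarser grain than per transaction)."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for iso_date, amount in rows:
        totals[iso_date[:4]] += amount
    return dict(totals)

yearly = aggregate_by_year(fact_rows)
print(yearly["2004"])  # 1825.95
```

Queries that only need yearly totals can read the three-row summary instead of scanning every transaction.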
Fact Tables
Transaction
Track processes at discrete points in time when they occur
Periodic snapshot
Cumulative performance over specific time intervals
Accumulating snapshot
Constantly updated over time. May include multiple dates representing stages.
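The difference between a transaction table and an accumulating snapshot can be sketched as follows. The order-fulfilment stages (`ordered`, `shipped`, `delivered`) and all names are hypothetical, chosen only to show one row being revisited as each stage is reached.

```python
from datetime import date

STAGES = ("ordered", "shipped", "delivered")  # assumed pipeline stages

def record_stage(snapshot, order_id, stage, when):
    """Update the single row for order_id in place, filling in one stage date."""
    row = snapshot.setdefault(order_id, {s: None for s in STAGES})
    row[stage] = when
    return row

snapshot = {}  # order_id -> one accumulating row
record_stage(snapshot, 1001, "ordered", date(2006, 9, 1))
record_stage(snapshot, 1001, "shipped", date(2006, 9, 3))

print(len(snapshot))              # 1
print(snapshot[1001]["shipped"])  # 2006-09-03
```

A transaction fact table would instead have appended two separate rows, one per event.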
Case Study:
Retail Grocery Store
Process: Retail Sales
Grain: POS line item
Dimensions: Date, Store, Product, Promotion
DATE
DateKey
Attributes: Fiscal week, Year, Month, Fiscal year, Holiday?, Holiday name, Day of holiday, Weekday?, Selling season, Major event, etc.
STORE
StoreKey
Attributes: Region, Floor plan type, Photo processing type, Financial service type, Square footage, Selling square footage, First open date, Last remodel date, etc.
POS FACT
DateKey
ProductKey
StoreKey
PromotionKey
POSTransactionNumber
SalesQuantity
SalesDollarAmount
CostDollarAmount
GrossProfitDollarAmount
PRODUCT
ProductKey
Attributes: Weight units of measure, Storage type, Shelf unit type, Shelf width, Shelf height, Shelf depth, etc.
PROMOTION
PromotionKey
Attributes
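The retail sales model can be tried out as a small star schema in an in-memory SQLite database. This is a sketch under several assumptions: each dimension carries only one or two attributes, the sample rows are invented for illustration, and gross profit is assumed to be sales amount minus cost amount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim      (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE store_dim     (store_key INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE product_dim   (product_key INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE promotion_dim (promotion_key INTEGER PRIMARY KEY, promotion_name TEXT);
CREATE TABLE pos_fact (
    date_key      INTEGER REFERENCES date_dim(date_key),
    product_key   INTEGER REFERENCES product_dim(product_key),
    store_key     INTEGER REFERENCES store_dim(store_key),
    promotion_key INTEGER REFERENCES promotion_dim(promotion_key),
    pos_transaction_number INTEGER,
    sales_quantity INTEGER,
    sales_dollar_amount REAL,
    cost_dollar_amount REAL,
    gross_profit_dollar_amount REAL
);
""")

# Invented sample data, for illustration only.
conn.execute("INSERT INTO date_dim VALUES (20060926, 2006, 9)")
conn.executemany("INSERT INTO store_dim VALUES (?, ?)", [(1, "East"), (2, "West")])
conn.executemany("INSERT INTO product_dim VALUES (?, ?)", [(1, "Milk"), (2, "Bread")])
conn.execute("INSERT INTO promotion_dim VALUES (1, 'No promotion')")

# One fact row per POS line item: (date, product, store, promo, txn, qty, sales, cost)
line_items = [
    (20060926, 1, 1, 1, 100, 2, 6.0, 4.0),
    (20060926, 2, 1, 1, 100, 1, 3.0, 2.0),
    (20060926, 1, 2, 1, 200, 3, 12.0, 8.0),
]
conn.executemany(
    "INSERT INTO pos_fact VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [item + (item[6] - item[7],) for item in line_items],  # profit = sales - cost
)

# Constrain and group by a dimension attribute, sum the additive facts.
rows = conn.execute("""
    SELECT s.region, SUM(f.sales_dollar_amount), SUM(f.gross_profit_dollar_amount)
    FROM pos_fact f JOIN store_dim s ON s.store_key = f.store_key
    GROUP BY s.region ORDER BY s.region
""").fetchall()
print(rows)  # [('East', 9.0, 3.0), ('West', 12.0, 4.0)]
```

Because the facts are additive, the same fact table answers questions by product, date, or promotion simply by joining a different dimension.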
Conformed Dimensions:
Inventory Snapshot Model
Process: Store inventory
Grain: Daily inventory by product and store
Dimensions: Date, product, store
Fact: quantity-on-hand
Dimensional Model
DATE
DateKey
Attributes
Inventory Fact
ProductKey
DateKey
StoreKey
QuantityOnHand
QuantitySold
ValueAtCost
ValueAtSellingPrice
PRODUCT
ProductKey
Attributes
STORE
StoreKey
Attributes
Conformed Dimensions
Common dimensions for different processes should be the same.
Note: Dimensions for roll-up or aggregated fact tables may add or eliminate attributes based on the aggregation.
Where attributes apply, they should mean the same thing.
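Conformed dimensions are what make it possible to combine results from the retail sales and store inventory models. A minimal sketch, assuming both fact tables share the same date, product, and store keys; the sample rows and the `drill_across` name are invented for illustration.

```python
def summarize(rows, date_key, measure):
    """Total one measure per (product_key, store_key) for a single date."""
    out = {}
    for r in rows:
        if r["date_key"] == date_key:
            key = (r["product_key"], r["store_key"])
            out[key] = out.get(key, 0) + r[measure]
    return out

def drill_across(sales_rows, inventory_rows, date_key):
    """Line up measures from two fact tables on their shared dimension keys."""
    sold = summarize(sales_rows, date_key, "sales_quantity")
    on_hand = summarize(inventory_rows, date_key, "quantity_on_hand")
    return {k: {"sold": sold.get(k, 0), "on_hand": on_hand.get(k, 0)}
            for k in sorted(set(sold) | set(on_hand))}

# Invented sample rows; keys mean the same thing in both tables.
sales = [
    {"date_key": 20060926, "product_key": 1, "store_key": 1, "sales_quantity": 2},
    {"date_key": 20060926, "product_key": 1, "store_key": 1, "sales_quantity": 3},
    {"date_key": 20060926, "product_key": 2, "store_key": 1, "sales_quantity": 1},
]
inventory = [
    {"date_key": 20060926, "product_key": 1, "store_key": 1, "quantity_on_hand": 40},
    {"date_key": 20060926, "product_key": 2, "store_key": 1, "quantity_on_hand": 12},
    {"date_key": 20060926, "product_key": 3, "store_key": 1, "quantity_on_hand": 7},
]

report = drill_across(sales, inventory, 20060926)
print(report[(1, 1)])  # {'sold': 5, 'on_hand': 40}
```

If the two models used different product or store keys, the merge on `(product_key, store_key)` would silently pair unrelated rows, which is the problem conformed dimensions avoid.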
Bus matrix
Process (rows): Retail Sales, Retail Inventory, Retail Deliveries, Warehouse Inventory, Warehouse Deliveries, Purchase Orders
Dimensions (columns): Product, Store, Promotion, Warehouse, Vendor, Contract, Shipper