
Data Vault

RMOUG Training Days


2006
Colorado Convention Center
Denver, Colorado
February 15-16
Data Vault;
What's the Combination?
Jeff Meyer
Enterprise Data Integration Oracle DBA
Department of Technology Services
Denver Public Schools
Data Vault
Who are we?
DBAs
Managers
Analysts
Enterprise Data Warehouse Projects
Currently in process
Planned
Data Marts
Data Vault
Brief History and Revisit Some
Definitions
Three Basic Building Blocks of the
Data Vault
Advanced Features
Questions
Data Vault
Brief History and Revisit Some Definitions

1970: Dr. E.F. Codd of IBM publishes the relational model of data
1979: First commercially available relational database, Oracle v2, from Relational Software, Incorporated
1991: William H. Inmon publishes Building the Data Warehouse
Data Vault
Brief History and Revisit Some Definitions

Legacy System
any system that has been put into production.
(paraphrased from W.H. Inmon)
Operational Data Store
a subject-oriented, integrated, volatile, current
or near current collection of operational data.
W.H. Inmon

Data Vault
Brief History and Revisit Some Definitions

Data Warehouse
a subject-oriented, integrated, time-variant, non-volatile
collection of data designed for support of business decisions
W.H. Inmon
Data Vault
a detail-oriented, historical tracking and uniquely linked set
of normalized tables that support one or more functional
areas of business.
Dan Linstedt
Data Vault
Brief History and Revisit Some Definitions

Data Mart
a subset of a data warehouse, for use by a single
department or function.
www.e-formation.co.nz/glossary.asp
Corporate Information Factory
the framework that exists that surrounds the data
warehouse; typically contains an ODS, a data warehouse, data
marts, DSS applications, exploration warehouses, and so forth.
W.H. Inmon
Data Vault
Brief History and Revisit Some Definitions

[Figure: Corporate Information Factory diagram]
* Source: Bill Inmon and Claudia Imhoff
Data Vault Why?
Why do we need it?
We finally have a Data Model that will work for small, medium, or large businesses.
Anyone building a Data Warehouse can use these techniques.
We've got issues in constructing the data warehouse from 3rd normal form or star schema form.
There are inherent roadblocks to each method that we must solve technically through our Data Model.
Data Vault
Three Basic Building Blocks
Hub: stand-alone table; list of unique business keys; used for business identification
Satellite: descriptive data; historical data; used for descriptive information for the HUB or LINK
Link: associative table; list of unique relationships between keys; used for relationships between HUBs and LINKs
Data Vault Three Basic Building Blocks
Preview
[Diagram: Hub Employees, Hub Schools, and Hub Students, surrounded by boxes labeled ELA, Name, EEOC, Dates, Shots, Addrs, Assign, and Enrollments]
Data Vault Three Basic Building Blocks
HUB
Primary Key
<Business Key>
Load DTS
Record Source

Sample Data Set CUSTOMER

ID  CUSTOMER #   LOAD DTS    RCRD SRC
1   ABC123456    10-12-2000  MANUFACT
2   ABC925_24FN  10-2-2000   CONTRACTS
3   DKEF         1-25-2000   CONTRACTS
4   KKO92854_dd  3-7-2000    CONTRACTS
5   LLOA_82J5J   6-4-2001    SALES
6   HUJI_BFIOQ   8-3-2001    SALES
7   PPRU_3259    2-2-2000    FINANCE
8   PAFJG2895    2-2-2000    CONTRACTS
9   929ABC2985   2-2-2000    CONTRACTS
10  93KFLLA      2-2-2000    CONTRACTS

A Hub is a list of unique business keys.
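
The hub layout above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name hub_customer, the column names, the load_hub helper, the ISO date strings, and the later SALES feed are assumptions made for the example and are not taken from the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE hub_customer (
        id          INTEGER PRIMARY KEY,   -- surrogate primary key
        customer_no TEXT NOT NULL UNIQUE,  -- business key, one row per key
        load_dts    TEXT NOT NULL,         -- when the key was first loaded
        rcrd_src    TEXT NOT NULL          -- system that first supplied it
    )
""")

def load_hub(rows):
    """Insert only business keys the hub has not seen before (hypothetical helper)."""
    conn.executemany(
        "INSERT OR IGNORE INTO hub_customer (customer_no, load_dts, rcrd_src) "
        "VALUES (?, ?, ?)",
        rows,
    )

load_hub([("ABC123456", "2000-10-12", "MANUFACT"),
          ("ABC925_24FN", "2000-10-02", "CONTRACTS")])
# A later (made-up) feed repeats a known key; the hub keeps the original row.
load_hub([("ABC123456", "2000-11-01", "SALES")])

hub_rows = conn.execute(
    "SELECT id, customer_no, load_dts, rcrd_src FROM hub_customer ORDER BY id"
).fetchall()
for row in hub_rows:
    print(row)
```

The UNIQUE constraint on the business key is what keeps the hub a list of unique keys; the repeated key is ignored rather than updated, so the first load date and record source are preserved.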
Data Vault Three Basic Building Blocks
SATELLITE
Primary Key
Load DTS
Detail
Business Data
Aggregation Data
{Update User}
{Update DTS}
Record Source

Customer Hub

ID  CUSTOMER #   LOAD DTS    RCRD SRC
1   ABC123456    10-12-2000  MANUFACT
2   ABC925_24FN  10-2-2000   CONTRACTS

CUSTOMER NAME SATELLITE

CSID  LOAD DTS    NAME                          RCRD SRC
1     10-12-2000  ABC Suppliers                 MANUFACT
1     10-14-2000  ABC Suppliers, Inc            MANUFACT
1     10-31-2000  ABC Worldwide Suppliers, Inc  MANUFACT
1     12-2-2000   ABC DEF Incorporated          CONTRACTS
2     10-2-2000   WorldPart                     CONTRACTS
2     10-14-2000  Worldwide Suppliers Inc       CONTRACTS

A Satellite is a time-dimensional table housing detailed
information about the hub's business keys.
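
A minimal sketch of how such a satellite might accumulate history, using Python's sqlite3 module. The insert-only-on-change rule, the load_name helper, the table and column names, the ISO date strings, and the unchanged 2000-10-20 delivery are all assumptions for illustration; the slides do not describe the loading logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sat_customer_name (
        csid     INTEGER NOT NULL,  -- key of the parent hub row
        load_dts TEXT    NOT NULL,  -- ISO date, so text order is date order
        name     TEXT    NOT NULL,  -- descriptive attribute being tracked
        rcrd_src TEXT    NOT NULL,
        PRIMARY KEY (csid, load_dts)
    )
""")

def load_name(csid, load_dts, name, rcrd_src):
    """Append a row only when the name differs from the latest stored one."""
    latest = conn.execute(
        "SELECT name FROM sat_customer_name "
        "WHERE csid = ? ORDER BY load_dts DESC LIMIT 1",
        (csid,),
    ).fetchone()
    if latest is None or latest[0] != name:
        conn.execute(
            "INSERT INTO sat_customer_name VALUES (?, ?, ?, ?)",
            (csid, load_dts, name, rcrd_src),
        )

load_name(1, "2000-10-12", "ABC Suppliers", "MANUFACT")
load_name(1, "2000-10-14", "ABC Suppliers, Inc", "MANUFACT")
load_name(1, "2000-10-20", "ABC Suppliers, Inc", "MANUFACT")  # unchanged, skipped
load_name(1, "2000-10-31", "ABC Worldwide Suppliers, Inc", "MANUFACT")

history = conn.execute(
    "SELECT load_dts, name FROM sat_customer_name WHERE csid = 1 ORDER BY load_dts"
).fetchall()
current_name = history[-1][1]  # the latest row is the current description
print(history)
print(current_name)
```

Rows are never updated in place; each change becomes a new row keyed by the hub key and load date, which is what makes the table time-dimensional.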
Data Vault Three Basic Building Blocks

[Diagram: Hub Employees with satellites ELA, Name, EEOC, and Dates]
Employees HUB and some of its Satellites
Data Vault Three Basic Building Blocks
LINK
Primary Key
Load DTS
Record Source

Customer Hub

ID  CUSTOMER #   LOAD DTS    RCRD SRC
1   ABC123456    10-12-2000  MANUFACT
2   ABC925_24FN  10-2-2000   CONTRACTS

Contact Hub

ID   CONTACT #  LOAD DTS    RCRD SRC
100  CONT212    10-14-2000  FINANCE
101  CONT259    10-14-2000  FINANCE

Link Table

CSID  CONTACT ID  LOAD DTS    RCRD SRC
1     100         10-14-2000  FINANCE
2     101         10-14-2000  FINANCE

A Link is an associative or intersection table, representing the
connection between business elements.
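
The two hubs and the link table from this slide can be sketched as follows with Python's sqlite3 module. The table and column names, the ISO date strings, and the composite primary key on the link are assumptions for the example; the slide lists a separate Primary Key column for the link, which is omitted here for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (
        id INTEGER PRIMARY KEY, customer_no TEXT UNIQUE,
        load_dts TEXT, rcrd_src TEXT);
    CREATE TABLE hub_contact (
        id INTEGER PRIMARY KEY, contact_no TEXT UNIQUE,
        load_dts TEXT, rcrd_src TEXT);
    CREATE TABLE lnk_customer_contact (
        csid       INTEGER NOT NULL REFERENCES hub_customer(id),
        contact_id INTEGER NOT NULL REFERENCES hub_contact(id),
        load_dts   TEXT NOT NULL,
        rcrd_src   TEXT NOT NULL,
        PRIMARY KEY (csid, contact_id)  -- each relationship is listed once
    );
    INSERT INTO hub_customer VALUES
        (1, 'ABC123456',   '2000-10-12', 'MANUFACT'),
        (2, 'ABC925_24FN', '2000-10-02', 'CONTRACTS');
    INSERT INTO hub_contact VALUES
        (100, 'CONT212', '2000-10-14', 'FINANCE'),
        (101, 'CONT259', '2000-10-14', 'FINANCE');
    INSERT INTO lnk_customer_contact VALUES
        (1, 100, '2000-10-14', 'FINANCE'),
        (2, 101, '2000-10-14', 'FINANCE');
""")

# Resolve each relationship back to the business keys held in the hubs.
pairs = conn.execute("""
    SELECT c.customer_no, k.contact_no
    FROM lnk_customer_contact AS l
    JOIN hub_customer AS c ON c.id = l.csid
    JOIN hub_contact  AS k ON k.id = l.contact_id
    ORDER BY c.id
""").fetchall()
print(pairs)
```

The link holds only keys, a load date, and a record source; all descriptive data stays in satellites, so a new relationship type can be added as a new link without altering either hub.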
Data Vault Three Basic Building Blocks

[Diagram: Hub Employees with satellites ELA, Name, EEOC, and Dates; Hub Schools with satellites Geo Cd, Addr, Floor, and Bldg; the Assign link, with its own satellite, connects the two hubs]
Hub and Satellites
Link and Satellites
Data Vault Advanced Features
Point-In-Time
A structure which sustains integrity of joins across time to all
the SATELLITES that are connected to the HUB or LINK.
Bridge
A single row table that contains the latest Load Date Time
Stamp (DTS). Similar to Point-In-Time except it spans a
subject-area or a schema.
User Grouping Link
The information provides the user with a customized view
from a reporting standpoint and does not affect the
underlying information.
Data Vault Advanced Features
Point-In-Time (PIT)

Customer Hub

ID  CUSTOMER #  LOAD DTS    RCRD SRC
1   ABC123456   10-12-2000  MANUFACT

Customer Name Satellite

CSID  LOAD DTS    NAME
1     10-31-2000  ABC Worldwide Suppliers, Inc
1     12-2-2000   ABC DEF Incorporated

Customer Address Satellite

CSID  LOAD DTS    ADDRESS
1     10-14-2000  123 World Dr
1     12-5-2000   123 World Drive

Point-In-Time Table

CSID  LOAD DTS    NAME_LOAD_DTS  ADDRESS_LOAD_DTS
1     10-14-2000  10-14-2000     10-14-2000
1     10-31-2000  10-31-2000     10-14-2000
1     12-2-2000   12-2-2000      10-14-2000
1     12-5-2000   12-2-2000      12-5-2000

A structure which sustains integrity of joins across time to
all the satellites that are connected to the hub.

Point-In-Time structure
Hub Key
Load Date
{Sat Load DTS}
{Sat Load DTS}
{Rec Source}
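
One way a Point-In-Time table like this could be derived from its satellites is sketched below with Python's sqlite3 module. The table names, the ISO date strings, and the derivation query are assumptions for illustration, not the presenter's method; the 2000-10-14 name row is taken from the Customer Name Satellite shown earlier in the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sat_name    (csid INTEGER, load_dts TEXT, name TEXT);
    CREATE TABLE sat_address (csid INTEGER, load_dts TEXT, address TEXT);
    INSERT INTO sat_name VALUES
        (1, '2000-10-14', 'ABC Suppliers, Inc'),
        (1, '2000-10-31', 'ABC Worldwide Suppliers, Inc'),
        (1, '2000-12-02', 'ABC DEF Incorporated');
    INSERT INTO sat_address VALUES
        (1, '2000-10-14', '123 World Dr'),
        (1, '2000-12-05', '123 World Drive');

    -- One PIT row per load date seen in any satellite; each row records
    -- the satellite row that was current on that date.
    CREATE TABLE pit_customer AS
    SELECT d.csid,
           d.load_dts,
           (SELECT MAX(n.load_dts) FROM sat_name AS n
             WHERE n.csid = d.csid AND n.load_dts <= d.load_dts) AS name_load_dts,
           (SELECT MAX(a.load_dts) FROM sat_address AS a
             WHERE a.csid = d.csid AND a.load_dts <= d.load_dts) AS address_load_dts
    FROM (SELECT csid, load_dts FROM sat_name
          UNION
          SELECT csid, load_dts FROM sat_address) AS d;
""")

pit = conn.execute(
    "SELECT csid, load_dts, name_load_dts, address_load_dts "
    "FROM pit_customer ORDER BY load_dts"
).fetchall()

# With the PIT table in place, an as-of query needs only equality joins.
snapshot = conn.execute("""
    SELECT n.name, a.address
    FROM pit_customer AS p
    JOIN sat_name    AS n ON n.csid = p.csid AND n.load_dts = p.name_load_dts
    JOIN sat_address AS a ON a.csid = p.csid AND a.load_dts = p.address_load_dts
    WHERE p.csid = 1 AND p.load_dts = '2000-10-31'
""").fetchone()
print(pit)
print(snapshot)
```

The derived rows match the Point-In-Time table on the slide, with dates written in ISO form. The design trade-off is that the range logic runs once, when the PIT table is built, instead of in every reporting query.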
Data Vault Advanced Features
Bridge
A single row table that contains the latest
Load DTS with multiple columns. A Bridge
is not a helper table.
Similar to a PIT Table except it spans or
applies to a subject-area or schema. A PIT
Table is HUB (LINK) and SATELLITE
specific.
Data Vault Advanced Features
User Grouping Link
Primary Key
Load DTS
Record Source

User Grouping

ID  Grouping Label   LOAD DTS    RCRD SRC
1   Big Customers    10-12-2000  EXCEL
2   Small Customers  10-2-2000   EXCEL

Base table (customer hub)

ID   Customer #  LOAD DTS    RCRD SRC
100  ABC295882   10-14-2000  FINANCE
101  ABC-1       10-14-2000  FINANCE

User Grouping Link

Grp#  Customer #  LOAD DTS    RCRD SRC
1     100         10-14-2000  EXCEL
1     101         10-14-2000  EXCEL

The User Grouping Link allows users to state how they want
roll-ups to occur in situations where source data doesn't exist.
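
A user-defined roll-up of this kind might be queried as in the sketch below, written with Python's sqlite3 module. The table and column names and the ISO date strings are assumptions for the example; the row values come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (id INTEGER PRIMARY KEY, customer_no TEXT);
    CREATE TABLE user_group   (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE lnk_user_group (
        grp_id      INTEGER REFERENCES user_group(id),
        customer_id INTEGER REFERENCES hub_customer(id),
        load_dts    TEXT,
        rcrd_src    TEXT
    );
    INSERT INTO hub_customer VALUES (100, 'ABC295882'), (101, 'ABC-1');
    INSERT INTO user_group VALUES (1, 'Big Customers'), (2, 'Small Customers');
    -- The grouping is supplied by users (record source EXCEL), not by a
    -- source system, and lives entirely in the link.
    INSERT INTO lnk_user_group VALUES
        (1, 100, '2000-10-14', 'EXCEL'),
        (1, 101, '2000-10-14', 'EXCEL');
""")

# Roll customers up by the user-defined label; empty groups report zero.
rollup = conn.execute("""
    SELECT g.label, COUNT(l.customer_id)
    FROM user_group AS g
    LEFT JOIN lnk_user_group AS l ON l.grp_id = g.id
    GROUP BY g.id, g.label
    ORDER BY g.id
""").fetchall()
hub_count = conn.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0]
print(rollup)
print(hub_count)
```

Regrouping customers means changing rows in the link only; the hub rows are untouched, which matches the point that the grouping does not affect the underlying information.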
Data Vault How is DPS using DV
Hub_Students: Student_ID, SIS_Code, Load_DTS, Rec_SRC
Hub_Schools: School_ID, School_Number, Load_DTS, Rec_SRC
Hub_Employees: Employee_ID, HR_Emp_ID, DPSID, Load_DTS, Rec_SRC
Lnk_School_Enrollments: Sch_Enr_ID, School_ID, Student_ID, Grade_Name, Load_DTS, Rec_SRC
Lnk_Teacher_Schools: Teacher_School_ID, School_ID, Employee_ID, Load_DTS, Rec_SRC
The direction of the arrows equates to crow's feet.
Data Vault Why is DPS using DV
Storage considerations.
Vertical partitioning of data (rate of
change).
All the FACTS all the TIME.
Scalability and Extensibility.

Data Vault What was not covered.
How to apply Data Vault Modeling.
Best practices.
Lessons Learned.
Dan Linstedt's use of DECODE in
determining changed data capture.
Whose data is it? SLAs?
The new regulations / compliance that will
affect all of us.
Data Vault Questions?
Data Vault - References
DATA VAULT OVERVIEW: THE NEXT EVOLUTION IN DATA MODELING
Dan Linstedt - Core Integration Partners, Inc.
http://www.tdan.com/i021hy01.htm
DATA VAULT OVERVIEW THE NEXT EVOLUTION IN DATA MODELING SERIES 2
Dan Linstedt - Core Integration Partners, Inc.
http://www.tdan.com/i023hy02.htm
DATA VAULT - SERIES 3 END-DATES AND BASIC JOINS
Dan Linstedt - Core Integration Partners
http://www.tdan.com/i024hy02.htm
DATA VAULT - SERIES 4 LINK TABLES
Dan Linstedt - Core Integration Partners
http://www.tdan.com/i027ht04.htm
DATA VAULT™ OVERVIEW THE NEXT EVOLUTION IN DATA MODELING SERIES 5 LOADING TABLES
Dan Linstedt - Core Integration Partners
http://www.tdan.com/i027ht04.htm
Data Vault Modeling Class Materials and Notes; copyright 2002-2003
Dan Linstedt Core Integration Partners
http://www.coreintegration.com
Home of the Data Vault; www.danlinsedt.com
Audit the Data or Else. Un-audited Data Access Puts Business at High Risk; Bloor, Robin
and Baroudi, Carol; Lumigent, Inc.; copyright 2004
Data Vault Contact Information
JEFFREY MEYER
jeffrey_meyer@dpsk12.org
