
Ralph Kimball

In simplest terms, a data warehouse can be defined as a collection of data marts.
- Data mart: a subject-oriented collection of data.

Bill Inmon
A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process.

ERP runs the business, the way tyres run a car.
BI (reports, data mining, dashboards, KPIs) helps you take business decisions based on your historical data, the way the steering wheel, mirrors, brakes, and dashboard help you drive the car smoothly and reach the destination.

How a data warehouse helps a business

Say a producer wants to know:
Which are our lowest/highest margin customers?
What is the most effective distribution channel?
Who are my customers and what products are they buying?
What product promotions have the biggest impact on revenue?
What impact will new products/services have on revenue and margins?
Which customers are most likely to go to the competition?

Data, Data everywhere yet ...


I can't find the data I need
- data is scattered over the network
- many versions, subtle differences

I can't get the data I need
- need an expert to get the data

I can't understand the data I found
- available data is poorly documented

I can't use the data I found
- results are unexpected
- data needs to be transformed from one form to another

A single, complete and consistent store of data obtained from a variety of different sources, made available to end users in a way they can understand and use in a business context.

[Barry Devlin]

What are the users saying...


Data should be integrated across the enterprise
Summary data has a real value to the organization
Historical data holds the key to understanding data over time
What-if capabilities are required

Information

A process of transforming data into information and making it available to users in a timely enough manner to make a difference
[Forrester Research, April 1996]

Data

Data Warehousing: it is a process

A technique for assembling and managing data from various sources for the purpose of answering business questions, thus making decisions that were not previously possible.
A decision support database maintained separately from the organization's operational database.

Data Mining works with Warehouse Data


Data Warehousing provides the Enterprise with a memory

Data Mining provides the Enterprise with intelligence



We want to know ...


Given a database of 100,000 names, which persons are the least likely to default on their credit cards?
Which types of transactions are likely to be fraudulent, given the demographics and transactional history of a particular customer?
If I raise the price of my product by Rs. 2, what is the effect on my ROI?
If I offer only 2,500 airline miles as an incentive to purchase rather than 5,000, how many lost responses will result?
If I emphasize ease-of-use of the product as opposed to its technical capabilities, what will be the net effect on my revenues?
Which of my customers are likely to be the most loyal?

Data Mining helps to extract such information

Price comparison (the name of the first product is not shown in the extracted text)

Add-on cost per feature area:

Feature area            First product   Oracle 10g                                       IBM DB2
Base Product            $25K            $40K                                             $25K
Manageability           (included)      Tuning $3K, Diagnostics $3K, Partitioning $10K   Performance Expert $10K
Business Intelligence   (included)      OLAP $20K, Mining $20K, BI Bundle $20K           DB2 OLAP $35K, DB2 Warehouse $75K, Cube Views $9.5K
High Availability       (included)      Data Guard $116K                                 Recovery Expert $10K
Multi-core              (included)      $116K / $232K                                    $164.5K

Running total after each feature area:

Feature area            First product   Oracle 10g      IBM DB2
Base Product            $25K            $40K            $25K
Manageability           $25K            $56K            $35K
Business Intelligence   $25K            $116K           $154.5K
High Availability       $25K            $232K           $164.5K
Multi-core              $25K            $348K / $464K   $329K
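The running totals in this comparison are plain cumulative sums of the add-on prices. The sketch below recomputes them from the figures on the slide, in thousands of dollars. The key `first_product` is a placeholder, since the text does not name the first product, and the multi-core step is left out because its figures are ambiguous in the extracted text.

```python
# Prices in thousands of US dollars, taken from the slide.
# "first_product" is a placeholder key: the text does not name that product.
BASE = {"first_product": 25.0, "oracle_10g": 40.0, "ibm_db2": 25.0}

# Each stage lists the add-on prices per product; an empty list means "(included)".
ADD_ONS = [
    ("Manageability", {
        "first_product": [],
        "oracle_10g": [3.0, 3.0, 10.0],   # Tuning, Diagnostics, Partitioning
        "ibm_db2": [10.0],                # Performance Expert
    }),
    ("Business Intelligence", {
        "first_product": [],
        "oracle_10g": [20.0, 20.0, 20.0],  # OLAP, Mining, BI Bundle
        "ibm_db2": [35.0, 75.0, 9.5],      # DB2 OLAP, DB2 Warehouse, Cube Views
    }),
    ("High Availability", {
        "first_product": [],
        "oracle_10g": [116.0],            # Data Guard
        "ibm_db2": [10.0],                # Recovery Expert
    }),
]


def running_totals(base, add_ons):
    """Return [(stage, totals_by_product), ...] after each feature area."""
    totals = dict(base)
    stages = [("Base Product", dict(totals))]
    for stage, prices in add_ons:
        for product, items in prices.items():
            totals[product] += sum(items)
        stages.append((stage, dict(totals)))
    return stages


stages = running_totals(BASE, ADD_ONS)
for stage, totals in stages:
    print(stage, totals)
```

The values after each stage match the running totals shown on the slide ($56K, $116K, $232K for Oracle 10g and $35K, $154.5K, $164.5K for IBM DB2).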

[Figure: questions answered by BI, plotted against the axes "Additional Benefit" and "Number of Users": What happened? Why did it happen? What happened, why and how? What will happen?]

OLTP: Online Transaction Processing
OLAP: Online Analytical Processing
MOLAP: Multidimensional OLAP
ROLAP: Relational OLAP
HOLAP: Hybrid OLAP
Dimensions: de-normalized master tables
Attributes: columns of dimensions
Hierarchies: sequential order of attributes
Facts (measure group): transaction tables in the DWH
Fact (measures): numeric fields of a fact table
Cubes: multidimensional storage of data
KPIs: key performance indicators
Dashboards: combination of reports, KPIs, and charts
Data Marts: subject-oriented collection of data
SCDs: slowly changing dimensions
Perspectives: child cube

Data Analysis: Reporting, OLAP, Data Mining
Data Storage: Repository
Data Migration: Middleware (population tools)
Operational Data Sources

[Figure: architecture diagram. Labels: OLTP, Stage DB (optional), Data Marts, Cube (ROLAP, MOLAP), SSIS (Integration Services), SSAS (Analysis Services), SSRS (Reporting Services).]

OLTP
1. OLTP (on-line transaction processing)
2. Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
3. The tables are in normalized form.
4. The storage objects are called tables, i.e., all the masters and the transactions are stored in tables.
5. OLTP is designed using data modeling.

OLAP
1. OLAP (on-line analytical processing)
2. Data analysis and decision making
3. The tables are in de-normalized form.
4. The storage objects are called dimensions and facts, i.e., all the masters are dimensions and the transactions are facts.
5. OLAP is designed using dimensional modeling.
OLAP is classified into two types: MOLAP and ROLAP.

Normalized Tables
Product: Prod_Id, Prod_Name, Base_Rate, Cat_Id
Category: Cat_Id, Cat_Name, Cat_Desc, Group_Id
Group: Group_Id, Group_Name, Group_Desc

De-Normalized Tables
Product_Dim: Prod_Id, Prod_Name, Base_Rate, Cat_Name, Cat_Desc, Group_Name, Group_Desc

Topics we will cover later
1. Types of dimensions
2. Slowly changing dimensions
3. Hierarchies
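Flattening the normalized Product, Category, and Group tables into a single Product_Dim amounts to following the Cat_Id and Group_Id keys and copying the descriptive columns onto each product row. The sketch below shows this with plain Python dictionaries. The sample rows and the function name `build_product_dim` are invented for illustration; only the column names come from the slide.

```python
# Sample rows are invented for illustration; column names follow the slide.
products = [
    {"Prod_Id": 1, "Prod_Name": "Laptop", "Base_Rate": 900.0, "Cat_Id": 10},
    {"Prod_Id": 2, "Prod_Name": "Desk", "Base_Rate": 150.0, "Cat_Id": 20},
]
categories = {
    10: {"Cat_Name": "Computers", "Cat_Desc": "Computing devices", "Group_Id": 100},
    20: {"Cat_Name": "Furniture", "Cat_Desc": "Office furniture", "Group_Id": 200},
}
groups = {
    100: {"Group_Name": "Electronics", "Group_Desc": "Electronic goods"},
    200: {"Group_Name": "Office", "Group_Desc": "Office supplies"},
}


def build_product_dim(products, categories, groups):
    """Resolve Cat_Id and Group_Id and return flat Product_Dim rows."""
    dim_rows = []
    for product in products:
        category = categories[product["Cat_Id"]]
        group = groups[category["Group_Id"]]
        dim_rows.append({
            "Prod_Id": product["Prod_Id"],
            "Prod_Name": product["Prod_Name"],
            "Base_Rate": product["Base_Rate"],
            "Cat_Name": category["Cat_Name"],
            "Cat_Desc": category["Cat_Desc"],
            "Group_Name": group["Group_Name"],
            "Group_Desc": group["Group_Desc"],
        })
    return dim_rows


product_dim = build_product_dim(products, categories, groups)
for row in product_dim:
    print(row)
```

The surrogate keys Cat_Id and Group_Id disappear from the flattened rows, which is why category and group text is repeated for every product that shares them.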

SalesOrderDetails (OLTP table)
Cust_Id, SalesPerson, Prod_Id, Order_Date, Booked_Date, Delivery_Date, Unit_Price, Qty, Tax, Created_By

SalesOrder_Fact
Cust_Id, Prod_Id, Order_Date, Delivery_Date: reference keys of dimensions
Unit_Price, Qty, Tax, Total_Amount: numeric fields, called facts or measures

Qty * Unit_Price + Tax = Total_Amount
Usually all calculations are performed before the data is stored in OLAP.
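The formula Qty * Unit_Price + Tax = Total_Amount is applied while loading, so the stored fact row already carries the result. A minimal sketch of that step follows. The function name `to_fact_row` and the sample order are assumptions made for illustration, and `Decimal` is used here only to keep currency arithmetic exact.

```python
from decimal import Decimal


def to_fact_row(order):
    """Keep the dimension keys and measures, and precompute Total_Amount."""
    total = order["Qty"] * order["Unit_Price"] + order["Tax"]
    return {
        "Cust_Id": order["Cust_Id"],
        "Prod_Id": order["Prod_Id"],
        "Order_Date": order["Order_Date"],
        "Delivery_Date": order["Delivery_Date"],
        "Unit_Price": order["Unit_Price"],
        "Qty": order["Qty"],
        "Tax": order["Tax"],
        "Total_Amount": total,
    }


# Invented sample row shaped like SalesOrderDetails.
order = {
    "Cust_Id": 101,
    "SalesPerson": "A. Rao",
    "Prod_Id": 1,
    "Order_Date": "2024-03-02",
    "Booked_Date": "2024-03-02",
    "Delivery_Date": "2024-03-09",
    "Unit_Price": Decimal("19.99"),
    "Qty": 3,
    "Tax": Decimal("4.50"),
    "Created_By": "system",
}

fact = to_fact_row(order)
print(fact["Total_Amount"])
```

Columns such as SalesPerson, Booked_Date, and Created_By are dropped because they do not appear in SalesOrder_Fact on the slide.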

STAR Schema

SalesOrder_Fact: Cust_Id, Prod_Id, Order_Date, Delivery_Date, Org_Id, Unit_Price, Qty, Total_Amount, Tax
Prod_Dim: Prod_Id
Org_Dim: Org_Id
Cust_Dim: Cust_Id
Time_Dim: Date, Year, Month
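In a star schema, a report query joins the fact table directly to each dimension it needs and aggregates the measures. The sketch below builds a reduced version of the schema in an in-memory SQLite database and totals sales by year and product. SQLite, the sample rows, and the Prod_Name column on Prod_Dim are assumptions chosen for illustration; the slide lists only the key columns of the dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Prod_Dim (Prod_Id INTEGER PRIMARY KEY, Prod_Name TEXT);
CREATE TABLE Time_Dim ("Date" TEXT PRIMARY KEY, Year INTEGER, Month INTEGER);
CREATE TABLE SalesOrder_Fact (
    Cust_Id INTEGER,
    Prod_Id INTEGER REFERENCES Prod_Dim(Prod_Id),
    Order_Date TEXT REFERENCES Time_Dim("Date"),
    Unit_Price REAL,
    Qty INTEGER,
    Tax REAL,
    Total_Amount REAL
);
""")

# Invented sample data.
conn.executemany("INSERT INTO Prod_Dim VALUES (?, ?)",
                 [(1, "Laptop"), (2, "Desk")])
conn.executemany("INSERT INTO Time_Dim VALUES (?, ?, ?)",
                 [("2023-01-15", 2023, 1), ("2024-03-02", 2024, 3)])
conn.executemany("INSERT INTO SalesOrder_Fact VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (101, 1, "2023-01-15", 900.0, 2, 90.0, 1890.0),
    (102, 2, "2023-01-15", 150.0, 1, 7.5, 157.5),
    (101, 1, "2024-03-02", 900.0, 1, 45.0, 945.0),
])

# One join per dimension, then aggregate the precomputed measure.
rows = conn.execute("""
    SELECT t.Year, p.Prod_Name, SUM(f.Total_Amount)
    FROM SalesOrder_Fact AS f
    JOIN Prod_Dim AS p ON f.Prod_Id = p.Prod_Id
    JOIN Time_Dim AS t ON f.Order_Date = t."Date"
    GROUP BY t.Year, p.Prod_Name
    ORDER BY t.Year, p.Prod_Name
""").fetchall()

for row in rows:
    print(row)
```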

Product_Dim: Prod_Id, Prod_Name, Base_Rate, Cat_Name, Cat_Desc, Group_Name, Group_Desc
SalesOrder_Fact: Cust_Id, Prod_Id, Order_Date, Delivery_Date, Unit_Price, Qty, Total_Amount, Tax

Star schema
1. Dimensions relate only to the fact table. (De-normalized model)
2. One-to-many or one-to-one relations occur.
3. Performance is fast, but it requires a large amount of storage space.

Snowflake schema
1. Dimensions relate to tables other than the fact table. (Normalized model)
2. Used for many-to-many relations.
3. Performance is slower, but it requires less storage space.
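The performance and storage trade-off between the two layouts shows up in the number of joins a query needs. The sketch below stores the same product data both ways in an in-memory SQLite database and runs the same report against each. SQLite and the sample rows are assumptions for illustration, and the group table is named `Grp` because GROUP is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized layout: the product dimension is split across three tables.
CREATE TABLE Grp (Group_Id INTEGER PRIMARY KEY, Group_Name TEXT);
CREATE TABLE Category (Cat_Id INTEGER PRIMARY KEY, Cat_Name TEXT, Group_Id INTEGER);
CREATE TABLE Product (Prod_Id INTEGER PRIMARY KEY, Prod_Name TEXT, Cat_Id INTEGER);

-- De-normalized layout: one flat dimension table.
CREATE TABLE Product_Dim (
    Prod_Id INTEGER PRIMARY KEY, Prod_Name TEXT, Cat_Name TEXT, Group_Name TEXT
);

CREATE TABLE SalesOrder_Fact (Prod_Id INTEGER, Total_Amount REAL);

INSERT INTO Grp VALUES (100, 'Electronics'), (200, 'Office');
INSERT INTO Category VALUES (10, 'Computers', 100), (20, 'Furniture', 200);
INSERT INTO Product VALUES (1, 'Laptop', 10), (2, 'Desk', 20);
INSERT INTO Product_Dim VALUES
    (1, 'Laptop', 'Computers', 'Electronics'),
    (2, 'Desk', 'Furniture', 'Office');
INSERT INTO SalesOrder_Fact VALUES (1, 1890.0), (2, 157.5), (1, 945.0);
""")

# Flat dimension: a single join reaches Group_Name.
flat_sql = """
    SELECT d.Group_Name, SUM(f.Total_Amount)
    FROM SalesOrder_Fact AS f
    JOIN Product_Dim AS d ON f.Prod_Id = d.Prod_Id
    GROUP BY d.Group_Name
    ORDER BY d.Group_Name
"""

# Split dimension: three joins are needed to reach the same column.
split_sql = """
    SELECT g.Group_Name, SUM(f.Total_Amount)
    FROM SalesOrder_Fact AS f
    JOIN Product AS p ON f.Prod_Id = p.Prod_Id
    JOIN Category AS c ON p.Cat_Id = c.Cat_Id
    JOIN Grp AS g ON c.Group_Id = g.Group_Id
    GROUP BY g.Group_Name
    ORDER BY g.Group_Name
"""

flat_rows = conn.execute(flat_sql).fetchall()
split_rows = conn.execute(split_sql).fetchall()
print(flat_rows)
print(split_rows)
```

Both queries return the same totals. The flat table repeats category and group text on every product row, which costs storage, while the split tables store each name once but need extra joins at query time.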
