
Data Warehousing

Vidhya Pillai
A producer wants to know...
• Which are our lowest/highest margin customers?
• What is the most effective distribution channel?
• Who are my customers and what products are they buying?
• What product promotions have the biggest impact on revenue?
• Which customers are most likely to go to the competition?
• What impact will new products/services have on revenue and margins?
Data, Data everywhere yet ...
• I can’t find the data I need
  • data is scattered over the network
  • many versions, subtle differences
• I can’t get the data I need
  • need an expert to get the data
• I can’t understand the data I found
  • available data poorly documented
• I can’t use the data I found
  • results are unexpected
  • data needs to be transformed from one form to another
What are the users saying...
• Data should be integrated across the enterprise
• Summary data has a real value to the
organization
• Historical data holds the key to understanding
data over time
• What-if capabilities are required

Data warehouse
• Data warehouses began to be developed in the 1980s.
• They typically contain data extracted from various sources, including external
sources such as the Internet, organized in such a way as to facilitate decision
making.
Data Warehouse
A single, complete and consistent store of
data obtained from a variety of different
sources made available to end users in a
way they can understand and use in a
business context.

[Barry Devlin]
Data Warehousing features
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile
collection of data in support of management's decision-making process.
SITN
• Subject oriented
• Integrated
• Time variant
• Non-volatile
Features
• Subject oriented- data is stored by subject not by application
Eg- ecommerce company

Applications        Subject
Order processing    Sales
Customer billing    Customer
Features
• Integrated data- data from different sources is transformed and integrated
to remove inconsistencies in codes and measurements (eg- gender stored as
F/M in one source and female/male in another)
• Time variant- it holds historic data, not just current data (supports analysis
of the past, relates information to the present and enables forecasts for the
future)
• Nonvolatile data- data is not updated or deleted once it is in the data warehouse
Data warehousing processes
• 1. Data Cleaning (filling in missing values, resolving inconsistencies)
• 2. Data Integration ( integration of multiple databases or files)
• 3. Data Transformation (Convert data from host format to warehouse
format)
• 4. Data Loading
• 5. Periodic data refreshing
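
The five steps above can be sketched end to end in a few lines. This is a minimal illustration, not a production pipeline: the two source lists, the `sales_fact` table, the `GENDER_CODES` mapping and the choice of 0.0 as the fill value for missing amounts are all assumptions made for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source rows: two systems that code gender differently
# and sometimes leave the amount missing.
source_a = [
    {"customer_id": 1, "gender": "F", "amount": "120.50"},
    {"customer_id": 2, "gender": "M", "amount": None},
]
source_b = [
    {"customer_id": 3, "gender": "female", "amount": "80"},
    {"customer_id": 4, "gender": "male", "amount": "45.25"},
]

# Assumed mapping from source codes to one warehouse code.
GENDER_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def clean(row, default_amount=0.0):
    """Step 1: fill in missing values and resolve inconsistent codes."""
    amount = row["amount"] if row["amount"] is not None else default_amount
    gender = GENDER_CODES.get(str(row["gender"]).lower(), "U")
    return {"customer_id": row["customer_id"], "gender": gender, "amount": amount}


def transform(row):
    """Step 3: convert source (text) values to the warehouse format."""
    return (row["customer_id"], row["gender"], float(row["amount"]))


def load(conn, rows):
    """Step 4: append rows to the warehouse table (existing rows are untouched)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(customer_id INTEGER, gender TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, *sources):
    # Step 2: integrate several sources into one stream of rows.
    integrated = [row for source in sources for row in source]
    rows = [transform(clean(row)) for row in integrated]
    load(conn, rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = run_etl(conn, source_a, source_b)
total = conn.execute("SELECT SUM(amount) FROM sales_fact").fetchone()[0]
print(loaded, total)
```

Step 5, periodic refreshing, would amount to calling `run_etl` again on a schedule with newly extracted rows; because `load` only appends, earlier data stays in place.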
DATA WAREHOUSE USAGE
Three kinds of data warehouse applications
• 1) Information processing:- Supports querying, basic statistical analysis, and
reporting using crosstabs, tables, charts and graphs.
• 2) Analytical processing:- Multidimensional analysis of data warehouse data.
Supports basic OLAP operations: slice-dice, drilling, pivoting.
• 3) Data mining:- Knowledge discovery from hidden patterns. Supports
associations, constructing analytical models, performing classification and
prediction, and presenting the mining results using visualization.
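
The OLAP operations named under analytical processing can be illustrated on a tiny fact table. The sketch below uses plain Python rather than an OLAP engine; the `facts` rows, the three dimensions and the function names are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales)
facts = [
    ("North", "Laptop", "Q1", 100),
    ("North", "Phone", "Q1", 150),
    ("North", "Laptop", "Q2", 120),
    ("South", "Laptop", "Q1", 90),
    ("South", "Phone", "Q2", 200),
]
DIMS = {"region": 0, "product": 1, "quarter": 2}


def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[DIMS[dim]] == value]


def dice(rows, **allowed):
    """Dice: keep rows whose dimensions fall in the given value sets."""
    return [
        r for r in rows
        if all(r[DIMS[d]] in values for d, values in allowed.items())
    ]


def roll_up(rows, dim):
    """Roll-up: aggregate sales to the level of one dimension."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[DIMS[dim]]] += r[3]
    return dict(totals)


def pivot(rows, row_dim, col_dim):
    """Pivot: cross-tabulate sales by two dimensions."""
    table = defaultdict(dict)
    for r in rows:
        cell = table[r[DIMS[row_dim]]]
        key = r[DIMS[col_dim]]
        cell[key] = cell.get(key, 0) + r[3]
    return dict(table)


print(roll_up(facts, "region"))
print(pivot(facts, "region", "quarter"))
```

Drilling down is the reverse of `roll_up`: starting from a regional total, `slice_(facts, "region", "North")` returns the detailed rows behind it.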
Components of Data warehouse
[Figure: data warehouse components]
• Source Data: production data, internal data, external data, archived data
• Data Staging: extract, transform and loading
• Data Storage: data warehouse, multidimensional databases, data marts
• Information Delivery: report/query, OLAP, data mining
• Management and Control


Data Mining
Data Mining works with Warehouse Data
• Data Warehousing provides the Enterprise with a memory
• Data Mining provides the Enterprise with intelligence

We want to know ...
• Given a database of 100,000 names, which persons are the least likely to default on their credit cards?
• Which types of transactions are likely to be fraudulent given the demographics and transactional history of a
particular customer?
• If I raise the price of my product by Rs. 2, what is the effect on my ROI?
• If I offer only 2,500 airline miles as an incentive to purchase rather than 5,000, how many lost responses will
result?
• If I emphasize ease-of-use of the product as opposed to its technical capabilities, what will be the net effect on
my revenues?
• Which of my customers are likely to be the most loyal?

Data Mining helps extract such information


Data mining
• Data mining refers to extracting or “mining” knowledge from large
amounts of data.
• Also referred to as Knowledge Discovery in Databases. It is a process of
discovering interesting knowledge from large amounts of data stored either
in databases, data warehouses, or other information repositories.
Data Mining Architecture
Data Mining

• Data in data warehouses are analyzed to reveal hidden patterns and trends
• Market-basket analysis to identify new product bundles
• Find root cause of quality or manufacturing problems
• Prevent customer attrition
• Acquire new customers
• Cross-sell to existing customers
• Profile customers with more accuracy
Application Areas

Industry                  Application
Finance                   Credit card analysis
Insurance                 Claims, fraud analysis
Telecommunication         Call record analysis
Transport                 Logistics management
Consumer goods            Promotion analysis
Data service providers    Value added data
Utilities                 Power usage analysis

Data Mining in Retail
Conduct shopping cart analysis
• understand customer purchasing preferences
• develop more pointed actions designed to cross-sell products that are
frequently purchased together and up-sell customers at check-out
• optimize assortment planning and validate promotions
• guide marketing and sales activities to hit the right customers with the
right offers at the right times
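
Shopping cart (market-basket) analysis is usually expressed through support and confidence of item pairs. The sketch below is a minimal pair-only version, not a full Apriori implementation; the `baskets` data and the `min_support` threshold of 0.4 are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (lhs, rhs, support, confidence) for frequently co-purchased pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs in basket | lhs in basket)
            rules.append((lhs, rhs, support, confidence))
    # Highest confidence first; ties broken alphabetically.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


for lhs, rhs, support, confidence in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as `milk -> butter` with high confidence suggests a cross-sell offer for butter when milk is in the cart.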
Data Mining in Retail
Learn who your best customers are
• know who the best customers are, what pushes them to shop, how frequently they
buy, how much they spend per order
• divide customers into high-spend, medium-spend and low-spend customer
segments
• understand the spending patterns, communication preferences, and merchandising
preferences of customers
• increase profit and revenue by understanding which customers and
products really drive the business
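
The customer measures listed above (purchase frequency, spend per order, spend segment) can be computed from an order history as follows. This is a rough sketch: the `orders` data and the segment thresholds of 400 and 100 are assumptions, and a real retailer would choose cut-offs from its own spend distribution.

```python
from collections import defaultdict

# Hypothetical order history: (customer, order amount)
orders = [
    ("alice", 250.0), ("alice", 300.0),
    ("bob", 40.0),
    ("carol", 120.0), ("carol", 90.0),
    ("dave", 15.0), ("dave", 20.0),
]


def customer_profiles(orders):
    """Per customer: number of orders, total spend and average spend per order."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for customer, amount in orders:
        totals[customer] += amount
        counts[customer] += 1
    return {
        c: {
            "orders": counts[c],
            "total": totals[c],
            "avg_order": totals[c] / counts[c],
        }
        for c in totals
    }


def segment(profiles, high=400.0, medium=100.0):
    """Assign each customer to a spend segment using assumed thresholds."""
    segments = {}
    for customer, profile in profiles.items():
        if profile["total"] >= high:
            segments[customer] = "high-spend"
        elif profile["total"] >= medium:
            segments[customer] = "medium-spend"
        else:
            segments[customer] = "low-spend"
    return segments


profiles = customer_profiles(orders)
print(segment(profiles))
```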
Data Mining in Retail
Measure marketing campaign effectiveness
• Track all marketing campaigns and promotions to see which
ones have the biggest return
• Analyze customer profitability to determine which marketing campaigns
bring in higher-spending customers
• Assess how much sales increased over a promotional period
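
One simple way to compare campaigns is to compute the sales lift over a baseline period and the return on the campaign's cost. The figures and field names below are invented for illustration, and the sketch treats the sales lift as the campaign's return, which ignores product margins.

```python
# Hypothetical campaign figures: cost, sales in a comparable baseline
# period, and sales during the promotional period.
campaigns = {
    "email": {"cost": 1000, "baseline_sales": 20000, "promo_sales": 24000},
    "flyer": {"cost": 2000, "baseline_sales": 20000, "promo_sales": 23000},
    "tv": {"cost": 10000, "baseline_sales": 20000, "promo_sales": 28000},
}


def campaign_report(campaigns):
    """Rank campaigns by return on cost, highest first."""
    report = []
    for name, c in campaigns.items():
        lift = c["promo_sales"] - c["baseline_sales"]
        report.append({
            "campaign": name,
            # Percentage increase in sales over the promotional period.
            "sales_lift_pct": 100 * lift / c["baseline_sales"],
            # Return per unit of campaign cost (assumes lift == return).
            "roi": (lift - c["cost"]) / c["cost"],
        })
    return sorted(report, key=lambda r: r["roi"], reverse=True)


for row in campaign_report(campaigns):
    print(row)
```

In this made-up data the TV campaign has the largest sales lift but a negative return, which is why lift and return are worth reporting side by side.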
Data Mining can be used in the following
manufacturing domains
• Data Mining in product design
• Data Mining in manufacturing lead time estimation
• Data Mining in quality
• Data Mining in supply chain management
• Data Mining in Just In Time manufacturing environment
Data Mining in Bank
(1) Risk Management
• Data mining techniques help to distinguish borrowers who repay loans promptly
from those who don’t.
• They also help to predict when a borrower is likely to default and whether
providing a loan to a customer will result in a bad loan.
• Bank executives can also use data mining to analyze the behavior and
reliability of customers when selling credit cards.
• It also helps to analyze whether a customer will pay promptly or late if
a credit card is sold to them.
Data Mining in Bank
(2) Marketing
• Bank analysts can analyze past trends, determine present demand and
forecast customer behavior for various products and services in order to capture
more business opportunities and anticipate behavior patterns.
• Data mining techniques also help to distinguish profitable customers from
non-profitable ones.
• Another major area of development in banking is cross-selling, i.e. a bank makes an
attractive offer to its customers to buy an additional product or service.
Data Mining in Bank
(3) Fraud Detection
• In the banking sector, the demographics and transaction history of some
customers indicate that they are likely to defraud the bank.
• Data mining techniques help to analyze the patterns and transactions that lead to
fraud.
• The banking sector puts considerable effort into fraud detection. Fraud management is a
knowledge-intensive activity. The key task in fraud detection is finding
which transactions are not ones that the user would normally make.
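
The idea of finding transactions a user would not normally make can be illustrated with a very simple outlier check on amounts. This is only a sketch: real fraud systems use many more features than the amount, and the z-score rule, the threshold of 3.0 and the sample data are assumptions for the example.

```python
from statistics import mean, stdev


def unusual_transactions(history, new_amounts, threshold=3.0):
    """Flag new amounts that lie far from the customer's usual spending.

    An amount is flagged when it is more than `threshold` sample standard
    deviations away from the mean of the customer's past amounts.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts are identical: flag anything that differs.
        return [a for a in new_amounts if a != mu]
    return [a for a in new_amounts if abs(a - mu) / sigma > threshold]


# Hypothetical card history for one customer, followed by new transactions.
history = [42.0, 55.0, 38.0, 60.0, 47.0, 51.0, 44.0, 58.0]
flagged = unusual_transactions(history, [52.0, 65.0, 900.0])
print(flagged)
```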
Data Mining in Bank
(4) Automatic Credit Approval
(5) Customer Retention
• Today, customers have many options for where they can choose to do
their business. Executives in the banking industry, therefore, must be aware
that if they are not giving each customer their full attention, the customer
can simply find another bank that will. Early data analysis techniques were
oriented toward extracting quantitative and statistical data characteristics.
• Data mining can help in targeting new customers for products and services
and in discovering a customer’s previous purchasing patterns so that the
bank will be able to retain existing customers by offering incentives that are
individually tailored to each customer’s needs.
• Churn in the banking sector is a major problem today. Losing customers
can be very expensive because acquiring a new customer is costly.
• Predictive data mining techniques are useful for converting meaningful data
into knowledge.
