
Ajeet Kumar Yadav

Anshul Singh

Which are our lowest/highest margin customers?
Who are my customers and what products are they buying?
What is the most effective distribution channel?
Which customers are most likely to go to the competition?
What product promotions have the biggest impact on revenue?
What impact will new products/services have on revenue and margins?

A single, complete and
consistent store of data
obtained from a variety of
different sources, made
available to end users in a
way they can understand and
use in a business context.

• The means to retrieve and analyze data, to extract, transform and
load data, and to manage the data dictionary are also
considered essential components of a data warehousing system.
• An expanded definition for data warehousing includes business
intelligence tools, tools to extract, transform, and load data into
the repository, and tools to manage and retrieve metadata.
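
The extract, transform, and load steps named above can be sketched in a few lines. This is only an illustration under stated assumptions: the source rows, the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse repository.

```python
import sqlite3

# Hypothetical raw rows, as they might arrive from two operational systems.
SOURCE_ROWS = [
    {"customer": " Acme Corp ", "amount": "120.50", "date": "2009-01-15"},
    {"customer": "acme corp", "amount": "79.50", "date": "2009-01-16"},
    {"customer": "Globex", "amount": "200.00", "date": "2009-01-16"},
]


def extract():
    """Extract: read raw rows from the source (here, an in-memory list)."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: normalize customer names and convert amounts to numbers."""
    return [
        (row["customer"].strip().title(), float(row["amount"]), row["date"])
        for row in rows
    ]


def load(conn, rows):
    """Load: write the cleaned rows into the (assumed) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))

# After cleaning, both spellings of "Acme Corp" roll up to one customer.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Keeping the three steps as separate functions mirrors the way the slide lists them as separate tools; a real pipeline would read from the operational systems instead of a list.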

• Data should be integrated across the
enterprise.
• Summary data has a real value to the
organization.
• Historical data holds the key to
understanding data over time.
• Data is non-volatile.
• Data is subject-oriented.

• The concept of data warehousing dates back to the late 1980s.
• IBM researchers Barry Devlin and Paul Murphy developed the
"business data warehouse".
• The data warehousing concept was intended to provide an
architectural model for the flow of data from operational systems
to decision support environments.
• The process of gathering, cleaning and integrating data from
various sources, usually long-existing operational systems (usually
referred to as legacy systems), was typically in part replicated for
each environment.

Key developments in early years of data warehousing
• 1960s - General Mills and Dartmouth College, in a joint research project,
develop the terms dimensions and facts.
• 1970s - ACNielsen and IRI provide dimensional data marts for retail sales.
• 1983 - Teradata introduces a database management system specifically
designed for decision support.
• 1988 - Barry Devlin and Paul Murphy publish the article "An architecture for a
business and information system" in IBM Systems Journal, where they
introduce the term "business data warehouse".
• 1990 - Red Brick Systems introduces Red Brick Warehouse, a database
management system specifically for data warehousing.

• 1991 - Prism Solutions introduces Prism Warehouse Manager, software for
developing a data warehouse.
• 1991 - Bill Inmon publishes the book Building the Data Warehouse.
• 1995 - The Data Warehousing Institute, a for-profit organization that
promotes data warehousing, is founded.
• 1996 - Ralph Kimball publishes the book The Data Warehouse Toolkit.
• 1997 - Oracle 8, with support for star queries, is released.

A process of transforming
data into information and
making it available to users
in a timely enough manner to
make a difference.

[Figure: data warehouse architecture. Labels: relational databases, ERP
systems, purchased data, and legacy data (sources); extraction, cleansing,
optimized loader; data warehouse engine; metadata repository; analyze, query.]

Operational database layer
The source data for the data warehouse - An organization's ERP systems
fall into this layer.

Data access layer
The interface between the operational and informational access layers - Tools
to extract, transform, and load data into the warehouse fall into this layer.

Metadata layer
The data directory - This is usually more detailed than an operational
system data directory. There are dictionaries for the entire warehouse and
sometimes dictionaries for the data that can be accessed by a particular
reporting and analysis tool.

Informational access layer
The data accessed for reporting and analyzing and the tools for reporting
and analyzing data - Business intelligence tools fall into this layer.

Dimensional
• Transaction data are partitioned into
facts and dimensions.
• A key advantage of a dimensional
approach is that the data warehouse is
easier for the user to understand and to
use.
• Because the integrity of facts and
dimensions must be maintained, loading the
data warehouse with data from different
operational systems is complicated.
• It is difficult to modify the data
warehouse structure.
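
As a rough illustration of the split into facts and dimensions, the sketch below builds a tiny star-shaped layout in an in-memory SQLite database. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are all assumptions made for the example, not something the slides specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimensions: reference data that gives context to the facts.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- Facts: numeric transaction data, keyed by the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        quantity INTEGER,
        revenue REAL
    );

    INSERT INTO dim_product VALUES
        (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');
    INSERT INTO dim_date VALUES (10, 2008, 12), (11, 2009, 1);
    INSERT INTO fact_sales VALUES
        (1, 10, 2, 20.0), (2, 11, 1, 15.0), (3, 11, 4, 40.0), (1, 11, 3, 30.0);
    """
)

# A typical query joins the fact table to its dimensions and aggregates.
revenue_by_category_year = conn.execute(
    """
    SELECT p.category, d.year, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
    """
).fetchall()
print(revenue_by_category_year)
```

The query reads close to the business question ("revenue by category and year"), which is the ease-of-use advantage the slide mentions. The foreign keys hint at the cost: every load has to keep facts and dimensions consistent.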

A decision tree is one of
the most systematic tools
of decision-making theory
and practice.
Trees are particularly
helpful in situations of
complex multistage
decision problems.
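
One common way to evaluate a multistage decision tree is to "roll back" expected values from the leaves to the root. The sketch below assumes that approach; the node layout (`kind`, `branches`, `options`), the option labels, and the payoffs are hypothetical and chosen only to show two decision stages.

```python
def rollback(node):
    """Return the expected value of a node by working back from the leaves."""
    kind = node["kind"]
    if kind == "outcome":
        return node["value"]
    if kind == "chance":
        # Probability-weighted average over the possible outcomes.
        return sum(p * rollback(child) for p, child in node["branches"])
    if kind == "decision":
        # The decision maker picks the option with the best expected value.
        return max(rollback(child) for _, child in node["options"])
    raise ValueError(f"unknown node kind: {kind}")


def best_option(node):
    """Return the label of the best option at a decision node."""
    return max(node["options"], key=lambda option: rollback(option[1]))[0]


# Stage 2: only reached if the launch succeeds.
after_success = {
    "kind": "decision",
    "options": [
        ("expand", {
            "kind": "chance",
            "branches": [
                (0.5, {"kind": "outcome", "value": 200.0}),
                (0.5, {"kind": "outcome", "value": 50.0}),
            ],
        }),
        ("hold", {"kind": "outcome", "value": 100.0}),
    ],
}

# Stage 1: launch now (uncertain) or wait (certain, smaller payoff).
tree = {
    "kind": "decision",
    "options": [
        ("launch", {
            "kind": "chance",
            "branches": [
                (0.5, after_success),
                (0.5, {"kind": "outcome", "value": -40.0}),
            ],
        }),
        ("wait", {"kind": "outcome", "value": 20.0}),
    ],
}

print(best_option(tree), rollback(tree))
```

Because the second-stage decision is nested inside the first-stage chance node, the recursion resolves the later choice before the earlier one, which is what makes the tree useful for multistage problems.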

• An OLAP (online analytical
processing) cube is a data
structure that allows fast
analysis of data.
• The arrangement of data into
cubes overcomes a limitation of
relational databases.
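
To make the cube idea concrete, here is a minimal sketch that indexes a measure by its dimension coordinates and then rolls it up or slices it. The dimensions (`region`, `product`, `year`), the sample facts, and the helper names are assumptions for illustration; real OLAP engines use far more elaborate storage.

```python
from collections import defaultdict

DIMENSIONS = ("region", "product", "year")

# Hypothetical fact rows: (region, product, year, sales)
FACTS = [
    ("North", "Widget", 2008, 10),
    ("North", "Gadget", 2008, 5),
    ("South", "Widget", 2008, 7),
    ("North", "Widget", 2009, 12),
    ("South", "Gadget", 2009, 8),
]


def build_cube(facts):
    """Index the measure by its full coordinate, one cell per combination."""
    cube = defaultdict(int)
    for *coords, sales in facts:
        cube[tuple(coords)] += sales
    return dict(cube)


def roll_up(cube, keep):
    """Aggregate the cube down to the named dimensions."""
    positions = [DIMENSIONS.index(name) for name in keep]
    result = defaultdict(int)
    for coords, sales in cube.items():
        result[tuple(coords[i] for i in positions)] += sales
    return dict(result)


def slice_cube(cube, dimension, value):
    """Fix one dimension to a single value and keep the matching cells."""
    position = DIMENSIONS.index(dimension)
    return {c: s for c, s in cube.items() if c[position] == value}


cube = build_cube(FACTS)
by_region = roll_up(cube, ["region"])
north_by_year = roll_up(slice_cube(cube, "region", "North"), ["year"])
print(by_region, north_by_year)
```

Every question here is answered by walking cells that are already keyed by dimension, rather than by joining and scanning tables.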

MOLAP
• MOLAP stands for Multidimensional Online Analytical
Processing.
• MOLAP differs significantly from relational OLAP (ROLAP) in that it
requires the pre-computation and storage of information in the cube,
an operation known as processing.
• MOLAP stores this data in an optimized multi-dimensional
array storage, rather than in a relational database.
• Fast query performance due to optimized storage,
multidimensional indexing and caching.
• Smaller on-disk size of data compared to data stored in a
relational database, due to compression techniques.
• Automated computation of higher-level aggregates of the data.
• It is very compact for low-dimension data sets.
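
The "processing" step can be pictured as filling a dense array with every cell, including all higher-level aggregates, before any query runs. The sketch below assumes a tiny three-dimensional cube stored as a flat list in row-major order; the axes, the `ALL` marker, and the function names are hypothetical, and no compression or caching is attempted.

```python
from itertools import product

ALL = "*"  # marker meaning "aggregated over this dimension" (an assumption)

# Each axis lists its members plus the ALL member for pre-computed totals.
AXES = [
    ["North", "South", ALL],    # region
    ["Widget", "Gadget", ALL],  # product
    [2008, 2009, ALL],          # year
]

# Hypothetical fact rows: (region, product, year, sales)
FACTS = [
    ("North", "Widget", 2008, 10),
    ("North", "Gadget", 2008, 5),
    ("South", "Widget", 2008, 7),
    ("North", "Widget", 2009, 12),
    ("South", "Gadget", 2009, 8),
]


def offset(coords):
    """Map a coordinate to its position in the flat array (row-major order)."""
    position = 0
    for axis, value in zip(AXES, coords):
        position = position * len(axis) + axis.index(value)
    return position


def process(facts):
    """Pre-compute every cell, including all higher-level aggregates."""
    size = 1
    for axis in AXES:
        size *= len(axis)
    cells = [0] * size
    for *coords, sales in facts:
        # Each fact feeds 2**3 cells: its own cell and every roll-up of it.
        for target in product(*[(value, ALL) for value in coords]):
            cells[offset(target)] += sales
    return cells


def lookup(cells, region=ALL, item=ALL, year=ALL):
    """Answer a query with a single array read; no aggregation at query time."""
    return cells[offset((region, item, year))]


cells = process(FACTS)
print(lookup(cells, region="North"), lookup(cells, item="Widget", year=2008))
```

The trade-off is visible even at this size: queries become single reads, but the array holds 27 cells for 5 facts, which is why this layout stays compact only for low-dimension data sets.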
Offline Operational Database
Data warehouses in this initial stage are developed by simply copying the
data off an operational system to another server where the processing load
of reporting against the copied data does not impact the operational
system's performance.

Offline Data Warehouse
Data warehouses at this stage are updated from data in the operational
systems on a regular basis and the data warehouse data is stored in a data
structure designed to facilitate reporting.

Real-Time Data Warehouse
Data warehouses at this stage are updated every time an operational system
performs a transaction.

Integrated Data Warehouse
Data warehouses at this stage are updated every time an
operational system performs a transaction. The data
warehouses then generate transactions that are passed back into
the operational systems.
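
The difference between these stages is mostly about when the warehouse is updated and whether anything flows back. The toy sketch below contrasts a scheduled refresh, a per-transaction update, and an integrated warehouse that sends a transaction back. All class names, the listener mechanism, and the reorder rule are assumptions invented for the example.

```python
class OperationalSystem:
    """Stand-in for a source system; notifies listeners on each transaction."""

    def __init__(self):
        self.transactions = []
        self.listeners = []

    def record(self, txn):
        self.transactions.append(txn)
        for listener in self.listeners:
            listener(txn)


class OfflineWarehouse:
    """Offline stage: refreshed on a schedule by copying from the source."""

    def __init__(self):
        self.rows = []

    def refresh(self, source):
        self.rows = list(source.transactions)


class RealTimeWarehouse:
    """Real-time stage: updated every time the source performs a transaction."""

    def __init__(self, source):
        self.rows = []
        source.listeners.append(self.rows.append)


class IntegratedWarehouse:
    """Integrated stage: also passes generated transactions back to the source."""

    def __init__(self, source, reorder_after):
        self.rows = []
        self.source = source
        self.reorder_after = reorder_after  # hypothetical business rule
        source.listeners.append(self.on_transaction)

    def on_transaction(self, txn):
        self.rows.append(txn)
        if txn["type"] != "sale":
            return  # only sales can trigger a reorder, so no feedback loop
        sold = sum(
            row["qty"] for row in self.rows
            if row["type"] == "sale" and row["item"] == txn["item"]
        )
        if sold >= self.reorder_after:
            self.source.record({"type": "reorder", "item": txn["item"], "qty": sold})


ops = OperationalSystem()
offline = OfflineWarehouse()
realtime = RealTimeWarehouse(ops)
integrated = IntegratedWarehouse(ops, reorder_after=5)

ops.record({"type": "sale", "item": "Widget", "qty": 3})
offline.refresh(ops)  # the scheduled copy happens here
ops.record({"type": "sale", "item": "Widget", "qty": 2})

print(len(offline.rows), len(realtime.rows), ops.transactions[-1]["type"])
```

After the second sale the offline copy is already stale, the real-time warehouse has seen everything, and the integrated warehouse has pushed a reorder back into the operational system.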


• A data warehouse provides a common data model for all data of
interest regardless of the data's source. This makes it easier to report
and analyze information than it would be if multiple data models
were used to retrieve information such as sales invoices, order
receipts, general ledger charges, etc.

• Prior to loading data into the data warehouse, inconsistencies are
identified and resolved. This greatly simplifies reporting and
analysis.

• Information in the data warehouse is under the control of data
warehouse users so that, even if the source system data is purged over
time, the information in the warehouse can be stored safely for
extended periods of time.

Since data warehouses are separate from operational
systems, they provide retrieval of data without slowing
down operational systems.
Data warehouses can work in conjunction with and,
hence, enhance the value of operational business
applications, notably customer relationship
management (CRM) systems.
Data warehouses facilitate decision support system
applications such as trend reports (e.g., the items with the
most sales in a particular area within the last two years),
exception reports, and reports that show actual
performance versus goals.
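
The trend report given as an example above (items with the most sales in a particular area within the last two years) might look like the following query. The `sales` table, its columns, the sample rows, and the `top_items` helper are hypothetical; SQLite's `date()` function is used to compute the two-year window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (item TEXT, area TEXT, sale_date TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('Widget', 'North', '2008-03-01', 50.0),
        ('Widget', 'North', '2009-02-01', 70.0),
        ('Gadget', 'North', '2008-06-15', 90.0),
        ('Gadget', 'North', '2006-01-10', 500.0),
        ('Manual', 'South', '2009-01-20', 300.0);
    """
)


def top_items(conn, area, as_of, limit=3):
    """Items with the highest total sales in one area over the last two years."""
    return conn.execute(
        """
        SELECT item, SUM(amount) AS total
        FROM sales
        WHERE area = ?
          AND sale_date > date(?, '-2 years')
          AND sale_date <= ?
        GROUP BY item
        ORDER BY total DESC, item
        LIMIT ?
        """,
        (area, as_of, as_of, limit),
    ).fetchall()


report = top_items(conn, "North", "2009-06-30")
print(report)
```

The large 2006 sale is excluded because it falls outside the window, so the ranking reflects the recent trend rather than the all-time total. Dates are stored as ISO 8601 text so that plain string comparison orders them correctly.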
• Over their life, data warehouses can have high costs. The data
warehouse is usually not static. Maintenance costs are high.
• Data warehouses can get outdated relatively quickly. There is a
cost of delivering suboptimal information to the organization.
• There is often a fine line between data warehouses and
operational systems. Duplicate, expensive functionality may be
developed. Or, functionality may be developed in the data
warehouse that, in retrospect, should have been developed in
the operational systems, and vice versa.

Some of the applications data warehousing can be used for
are:
Insurance fraud analysis
Call record analysis
Logistics management

A 2009 Gartner Group paper predicted these
developments in the business intelligence/data warehousing
market:
Because of lack of information, processes, and tools,
through 2012, more than 35 per cent of the top 5,000
global companies will regularly fail to make insightful
decisions about significant changes in their business and
markets.
By 2012, business units will control at least 40 per cent
of the total budget for business intelligence.

By 2010, 20 per cent of organizations will have an
industry-specific analytic application delivered
via software as a service as a standard component of
their business intelligence portfolio.
In 2009, collaborative decision making will emerge as
a new product category that combines social software
with business intelligence platform capabilities.
By 2012, one-third of analytic applications applied to
business processes will be delivered through coarse-grained
application.
