Session Objectives
People. Passion. Excellence
Objectives: At the end of this session, you will be able to:
> Define On Line Analytical Processing (OLAP)
> Understand the need for OLAP and the applications of OLAP in BI
> Describe the various OLAP solutions and architectures
> Compare the different OLAP architectures
> Identify the evaluation parameters to consider when selecting an OLAP tool
What is OLAP?
> OLAP (On Line Analytical Processing) applications are designed for online ad-hoc data access and analysis.
> Data is organized into multiple dimensions.
> Access to analytical content such as time series and trend analysis views and summary-level information.
> A set of functionality that attempts to facilitate multidimensional analysis.
> Offers drill-down, drill-across and slice-and-dice capabilities.
> How many dimensions can we think in? E.g. analysis by branch, product, agent, year: 2 or 3.
> How many types of values can we handle? E.g. Sales, Profit, Cost: 1 or 2.
> How many levels can we handle? E.g. the number of products we can analyze.
> Many parameters affect a measure (value), e.g. Sales is influenced by product, region, time, distribution channel, etc.
> Linear analysis = reports
> Many totals are at one level
> Difficult to identify the key parameters
OLAP in an Enterprise
Uses of OLAP
Analytical Capabilities:
> Used by analysts and managers.
> Offers an aggregated view of the data, such as total revenues by customer profile, by product line, by geographical region.
> Provides the decision support front-end for data warehousing.
> Advanced statistical, financial, and analytical calculations.
> Appropriate tools to access data from a relational database.
> Appropriate tools to access or manage multidimensional data.
OLAP analytical features:
> Multi-dimensional views of data
> Calculation-intensive capabilities
> Time intelligence
The calculation engine in OLAP tools has a wide range of built-in calculations, such as:
> Ratios
> Time calculations
> Statistics
> Ranking
> Custom formulas/algorithms
> Forecasting and modeling
Evolution of OLAP
Star Schema
> A Star Schema is a dimensional model created by mapping data entities from operational systems.
> It has a central table (the fact table) that links all the other tables (the dimension tables) together.
> Dimension: a category of information. For example, year, month, day, and week are all part of the Time dimension.
> Measure: a property that can be summed or averaged, often using precomputed aggregates.
> Facts or measures are the Key Performance Indicators of an enterprise
> Factual data about the subject area
> Numeric, summarized
Dimension
What was sold? Whom was it sold to? When was it sold? Where was it sold?
> Dimensions put measures in perspective
> They are the what, when and where qualifiers of the measures
> Dimensions could be products, customers, time, geography, etc.
Star Schema
CUBE
Multidimensional databases store information in the form of cubes. A cube is a collection of facts and related dimensions stored together in arrays.
[Figure: cube diagram with the labels Geography, Time, Product, Sales and HR]
> Hierarchy: A hierarchy defines the navigation path for drilling up and drilling down. All attributes in a hierarchy belong to the same dimension.
> Levels: These are organized into one or more hierarchies, typically from a coarse-grained level (for example, Year) down to the most detailed one (for example, Day).
> Members: The individual category values (for example, 2002 or 21Jan2002).
> Measures: These are the data values that are summarized and analyzed. Examples of measures are sales figures or operational costs.
> Cells: These are the intersection of one member for every dimension and store the data for measures.
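The terms above can be made concrete with a small sketch. It is only an illustration, not how a multidimensional database stores data internally: the dictionary layout, the dimension names and the helper `cell` are assumptions, and the figures are borrowed from the sale example used later in this deck.

```python
# Dimensions and their members (values borrowed from the deck's sale example).
dimensions = {
    "date": [1, 2],
    "store": ["s1", "s2", "s3"],
    "product": ["p1", "p2"],
}

# Each cell sits at the intersection of one member per dimension
# and stores the measure (here: amt). Empty cells are simply absent.
cells = {
    (1, "s1", "p1"): 12,
    (1, "s1", "p2"): 11,
    (1, "s3", "p1"): 50,
    (1, "s2", "p2"): 8,
    (2, "s1", "p1"): 44,
    (2, "s2", "p1"): 4,
}

def cell(date, store, product):
    """Return the measure stored in one cell, or None if the cell is empty."""
    return cells.get((date, store, product))

print(cell(1, "s3", "p1"))  # 50
print(cell(2, "s3", "p2"))  # None (no sale recorded for that combination)
```

Keying cells by a tuple of members keeps the example short; a real cube would typically use dense or compressed arrays.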
[Figure: Time hierarchy. Year-level members (1999, 2001) expand into QUARTER-level members (Q1, Q2, Q3, Q4).]
Aggregates
> Add up amounts for day 1
> In SQL: SELECT sum(amt) FROM SALE WHERE date = 1

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1      8
p1      s1       2     44
p1      s2       2      4

Result: 81
Aggregates
> Add up amounts by day
> In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1      8
p1      s1       2     44
p1      s2       2      4

ans
date  sum
1     81
2     48
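The two aggregate queries above can be tried out with a short script. This is a minimal sketch that assumes an in-memory SQLite database and a column layout guessed from the slide (text ids, integer date and amt); the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# The six rows of the sale table from the slide: (prodId, storeId, date, amt).
rows = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)"
)
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", rows)

# Add up amounts for day 1.
day1_total = conn.execute(
    "SELECT sum(amt) FROM sale WHERE date = 1"
).fetchone()[0]

# Add up amounts by day.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date"
).fetchall()

print(day1_total)  # 81
print(by_day)      # [(1, 81), (2, 48)]
```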
Another Example
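> Add up amounts by day, product
> In SQL: SELECT prodId, date, sum(amt) FROM SALE GROUP BY date, prodId

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1      8
p1      s1       2     44
p1      s2       2      4

ans
prodId  date  amt
p1      1     62
p2      1     19
p1      2     48

Rollup: grouping by fewer columns (date only) gives coarser totals. Drill-down: grouping by more columns (date, prodId) gives finer detail.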
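Rollup and drill-down between the two groupings can be sketched as follows. The script assumes an in-memory SQLite database, selects prodId alongside date so that the product appears in the result, and rolls up by re-aggregating the finer result in Python; this is one illustrative way to do it, not the only one.

```python
import sqlite3
from collections import defaultdict

rows = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)"
)
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", rows)

# Drill-down level: totals by day and product.
by_day_product = conn.execute(
    "SELECT prodId, date, sum(amt) FROM sale "
    "GROUP BY date, prodId ORDER BY date, prodId"
).fetchall()

# Rollup: drop the product column by summing the finer totals per day.
by_day = defaultdict(int)
for prod_id, date, amt in by_day_product:
    by_day[date] += amt

print(by_day_product)  # [('p1', 1, 62), ('p2', 1, 19), ('p1', 2, 48)]
print(dict(by_day))    # {1: 81, 2: 48}
```

Rolling up from an already aggregated result works here because sum is distributive; an operator such as median could not be rolled up this way.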
Aggregates
> Operators: sum, count, max, min, median and avg
> Having clause
> Using the dimension hierarchy, e.g. average by region (within store), maximum by month (within date)
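A short sketch of several of these operators and of the HAVING clause, run against the deck's sale table in an in-memory SQLite database. The threshold of 20 is an arbitrary value chosen for illustration, and median is left out because it is not among SQLite's core aggregate functions.

```python
import sqlite3

rows = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)"
)
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", rows)

# Several aggregate operators at once, per store.
stats = conn.execute(
    "SELECT storeId, sum(amt), count(*), max(amt), min(amt), avg(amt) "
    "FROM sale GROUP BY storeId ORDER BY storeId"
).fetchall()

# HAVING filters groups after aggregation (WHERE filters rows before it).
# The threshold 20 is an assumption made for this example.
big_stores = conn.execute(
    "SELECT storeId, sum(amt) FROM sale "
    "GROUP BY storeId HAVING sum(amt) > 20 ORDER BY storeId"
).fetchall()

print(stats[1])    # ('s2', 12, 2, 8, 4, 6.0)
print(big_stores)  # [('s1', 67), ('s3', 50)]
```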
Multi-dimensional cube:
        p1   p2
s1      12   11
s2            8
s3      50

dimensions = 2
3-D Cube
Multi-dimensional cube:
day 1
        p1   p2
s1      12   11
s2            8
s3      50

day 2
        p1   p2
s1      44
s2       4
s3

dimensions = 3
Example
Dimensions: Time, Product, Store
Attributes: Product (upc, price, ...), Store ...
Hierarchies:
> Product -> Brand -> ... (roll-up to brand)
> Day -> Week -> Quarter (roll-up to week)
> Store -> Region -> Country (roll-up to region)
Example cell (cube axes include Product and Time): 56 units of bread sold in LA on M
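Cube summed over all days, with row and column totals:

        p1   p2   sum
s1      56   11    67
s2       4    8    12
s3      50         50
sum    110   19   129

Rollup moves from the cells to the totals; drill-down moves from the totals back to the cells.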
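The row, column and grand totals of this 2-D cube can be reproduced with a few lines. This is a minimal sketch; the dictionary representation and the variable names are assumptions made for illustration.

```python
from collections import defaultdict

# 2-D cube (store, product) -> amt, already summed over all days.
cube = {
    ("s1", "p1"): 56,
    ("s1", "p2"): 11,
    ("s2", "p1"): 4,
    ("s2", "p2"): 8,
    ("s3", "p1"): 50,
}

by_store = defaultdict(int)    # roll up the product dimension
by_product = defaultdict(int)  # roll up the store dimension
for (store, product), amt in cube.items():
    by_store[store] += amt
    by_product[product] += amt

grand_total = sum(cube.values())  # roll up both dimensions

print(dict(by_store))    # {'s1': 67, 's2': 12, 's3': 50}
print(dict(by_product))  # {'p1': 110, 'p2': 19}
print(grand_total)       # 129
```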
Rolling up the 3-D cube (day 1 and day 2) from store to region gives:

        region A   region B
p1         56         54
p2         11          8
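A roll-up along the Store -> Region hierarchy can be sketched as below. The slide does not state which store belongs to which region; the mapping used here (s1 in region A, s2 and s3 in region B) is an assumption inferred from the totals 56, 54, 11 and 8.

```python
from collections import defaultdict

# 3-D cube: (date, store, product) -> amt, from the deck's example.
cells = {
    (1, "s1", "p1"): 12,
    (1, "s1", "p2"): 11,
    (1, "s3", "p1"): 50,
    (1, "s2", "p2"): 8,
    (2, "s1", "p1"): 44,
    (2, "s2", "p1"): 4,
}

# Assumed hierarchy level: store -> region (inferred, not given in the source).
region_of = {"s1": "region A", "s2": "region B", "s3": "region B"}

# Roll up: replace each store by its region and sum over all days.
by_region_product = defaultdict(int)
for (date, store, product), amt in cells.items():
    by_region_product[(region_of[store], product)] += amt

print(by_region_product[("region A", "p1")])  # 56
print(by_region_product[("region B", "p1")])  # 54
print(by_region_product[("region A", "p2")])  # 11
print(by_region_product[("region B", "p2")])  # 8
```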
Slicing
Slicing the 3-D cube (day 1 and day 2) at TIME = day 1 gives:

        p1   p2
s1      12   11
s2            8
s3      50
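Slicing fixes one member of a dimension and keeps the remaining dimensions. A minimal sketch, assuming the cube is held as a dictionary keyed by (date, store, product); `slice_by_date` is a hypothetical helper name.

```python
# 3-D cube: (date, store, product) -> amt, from the deck's example.
cells = {
    (1, "s1", "p1"): 12,
    (1, "s1", "p2"): 11,
    (1, "s3", "p1"): 50,
    (1, "s2", "p2"): 8,
    (2, "s1", "p1"): 44,
    (2, "s2", "p1"): 4,
}

def slice_by_date(cube, date):
    """Fix the time member and return the remaining 2-D (store, product) cube."""
    return {
        (store, product): amt
        for (d, store, product), amt in cube.items()
        if d == date
    }

day1 = slice_by_date(cells, 1)
print(day1)
# {('s1', 'p1'): 12, ('s1', 'p2'): 11, ('s3', 'p1'): 50, ('s2', 'p2'): 8}
```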
OLAP - Classification
> Relational OLAP (ROLAP)
> Multidimensional OLAP (MOLAP)
> Hybrid OLAP (HOLAP)
MOLAP
> MOLAP stands for Multi-dimensional OLAP.
> MOLAP is a technology that uses a multi-dimensional database, which stores data as an n-dimensional cube.
[Figure: cube diagram with the labels Brand and Geography]
Architecture of MOLAP
[Figure: MOLAP architecture. Labels: Data Mart Server, RDBMS, Connectivity Middleware, MOLAP Server, MDDBMS/Data Cube, MOLAP Application, MOLAP Client Tools, Desktop Systems, LAN, Router, Firewall, Intranet, Internet, Thin Clients, WWW Browser. Non-live connection: used for updating the MOLAP data cube only.]
Issues:
> Size of data cube
> Cubes deployment
> Size of update data set
MOLAP Products
Architecture of ROLAP
[Figure: ROLAP architecture. Surviving labels: LAN, Router / Firewall, Intranet, Internet, Thin Clients, WWW Browser.]
ROLAP Products
Business Objects
Metacube
DSS Server
Information Advantage
Architecture of HOLAP
HOLAP Products
> SAS
MOLAP Vs ROLAP
Comparison of Architectures
Architectural feature                             MOLAP             ROLAP
Number of dimensions                              Ten or less       Unlimited
Support for large number of users / scalability   Limited support   Good
Complex multidimensional analysis                 Good              Normal
Volume of data storage                            Up to 50 GB       NA
Storage of information                            Through cubes     SQL
Parameters

MOLAP:
> Application design: essentially the definition of the dimensional model and calculation rules.
> Aggregation techniques: measures are precalculated and stored at each hierarchy summary level during load time.
> Drill down, drill up, drill across and slicing/dicing.
> Instant response.
> Supports complex functions such as % change and ranking, calculated from cubes.

ROLAP:
> Application design: uses two-dimensional tables stored in an RDBMS (data is stored in a star schema or snowflake schema).
> Aggregation techniques: summary tables are implemented in the relational database.
> Drill down, drill up, slicing and dicing.
> Slower response.
> Limited value-added functions.
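The contrast between precalculating measures at load time (MOLAP) and aggregating relational rows at query time (ROLAP) can be illustrated with a toy sketch. It is not how any real OLAP server is implemented: the function names `precompute` and `query_on_the_fly`, the dictionary layout and the choice of sum as the only measure are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

DIMS = ("prodId", "storeId", "date")

# Base rows of the deck's sale table.
rows = [
    {"prodId": "p1", "storeId": "s1", "date": 1, "amt": 12},
    {"prodId": "p2", "storeId": "s1", "date": 1, "amt": 11},
    {"prodId": "p1", "storeId": "s3", "date": 1, "amt": 50},
    {"prodId": "p2", "storeId": "s2", "date": 1, "amt": 8},
    {"prodId": "p1", "storeId": "s1", "date": 2, "amt": 44},
    {"prodId": "p1", "storeId": "s2", "date": 2, "amt": 4},
]

def query_on_the_fly(rows, group):
    """ROLAP-style: scan the base rows and aggregate at query time."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in group)] += row["amt"]
    return dict(totals)

def precompute(rows):
    """MOLAP-style: at load time, store sum(amt) for every subset of dimensions."""
    return {
        group: query_on_the_fly(rows, group)
        for k in range(len(DIMS) + 1)
        for group in combinations(DIMS, k)
    }

aggregates = precompute(rows)          # paid once, when data is loaded
print(aggregates[("date",)])           # {(1,): 81, (2,): 48}  (a lookup, no scan)
print(aggregates[()])                  # {(): 129}
print(query_on_the_fly(rows, ("date",)))  # same answer, computed by scanning rows
```

Three dimensions already produce 8 stored groupings, which hints at why cube size and update cost appear among the MOLAP issues, while the on-the-fly path stays cheap to update but repeats the scan for every query.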
Parameter                                       MOLAP                                  ROLAP
Processing overhead for large input data sets   High                                   Low
Support for frequent updates                    Cannot handle frequent cube updates    Suitable for frequent updates
Resource requirements                           High                                   Low
Industry standard                               No current standards                   SQL standard
Access to the database through ODBC             No; the databases have proprietary     Yes
                                                APIs
Session Summary
In this session, we have:
> Understood the need for OLAP and the significance of multidimensional analysis in a data warehouse.
> Discussed the evolution of OLAP.
> Explained the architectures, characteristics, merits and demerits of various OLAP solutions.
Thank you