Sie sind auf Seite 1von 9

Dimensional modeling

(DM)
Overview of Data
Warehouse
• Abbreviated DW, a collection of data designed to
support management decision making. Data
warehouses contain a wide variety of data that
present a coherent picture of business conditions at
a single point in time.

• A database where data is collected for the purpose of


being analyzed.

• A data warehouse collects, organizes, and makes data


available for the purpose of analysis - to give
management the ability to access and analyze
information about its business. This type of data can
be called "informational data".

• The term data warehousing generally refers to the


Data Warehouse
Query, Reporting
& Analysis
Tools

External Data Data


Sources Periodic
Updates
Warehouse

Periodic
Updates
Periodic
Updates
Operational Systems
Extract/Transform/
Load

Periodic
Updates

Other Internal
Data Sources
Dimensional modeling
(DM)
• Dimensional modeling (DM) is the name of a
set of techniques and concepts used in
data warehouse design. It is considered to be
different from entity-relationship modeling
(ER). Dimensional Modeling does not
necessarily involve a relational database. The
same modeling approach, at the logical level,
can be used for any physical form, such as
multidimensional database or even flat files.
According to Dr. Kimball, DM is a design
technique for databases intended to support
end-user queries in a data warehouse. It is
oriented around understandability and
performance. According to him, although
transaction-oriented ER is very useful for the
transaction capture, it should be avoided for
end-user delivery.
Dimensional modeling
(DM)
• Dimensional Model is a logical design
technique that seeks to present the data
in a standard, intuitive framework that
allows for high-performance access. It is
inherently dimensional, and it adheres to
a discipline that uses the relational model
with some important restrictions.
• Every dimensional model is composed of
one table with a multi-part key, called the
fact table, and a set of smaller tables
called dimension tables. Each
dimension table has a single-part primary
key that corresponds exactly to one of the
components of the multi-part key in the
fact table.This characteristic 'star-like'
structure is often called a star join.
• A fact table, because it has a multi-part primary key made
up of two OR more foreign keys, always expresses a
many-to-many relationship. The most useful fact tables
also contain one OR more numerical measures, OR
'facts,' that occur for the combination of keys that define
each record. In Figure, the facts are Units_Sold,
Dollars_Sold, and Avg_sales. The most useful facts in a
fact table are numeric and additive. Additivity is crucial
because data warehouse applications almost never
retrieve a single fact table record; rather, they fetch back
hundreds, thousands, OR even millions of these records
at a time, and the only useful thing to do with so many
records is to add them up.
• Dimension tables, by contrast, most often contain
descriptive textual information, and the attributes (also
called classification attributes), which are used for
analysis. Dimension attributes are used as the source of
most of the interesting constraints in data warehouse
queries, and they are virtually always the source of the
row headers in the SQL answer set.
Dimensional modeling
(DM)
Dimensional modeling
(DM)
• Fact Table- The central linkage in
Dimensional Modeling
• A fact table contains the value of all the
measures linked to the set of
dimensions linked to the FACT table. It
contains the measure values for the
combination of lowest level of
granularity of dimensions. The
measures are typically numeric, which
can undergo mathematical
aggregation and analysis.

Dimensional modeling
(DM)
• Dimension Table- What does and should it contain
• The dimension table contains all the information on the dimension. This
includes:
• a. The primary key (Equivalent foreign key in the Fact Table).
• b. All attributes of the dimension. These include:
• The hierarchy attributes- Consider a business hierarchy-- pin-code to city to
district to state to country for location dimension. This means that each
hierarchy element will be an attribute.

• Textual as well as the code attributes- Location code as well as the name
of the location. This is required, because both could be used for different
reasons by different users. A power user could be looking for location code
(NY01), whereas an end user could be looking for more explicit header (New
Jersey).

• Include all parallel hierarchies – A product could be having different


hierarchies, depending upon if CFO OR Head of sales is looking at it. This
enables the done on all hierarchies as well as cross-hierarchies.

• Production Primary Key Refer Surrogate primary key link to FACT


table– These keys are used because the production keys could change OR
could be reused. For example a bill number could be reused after 5 years,

Das könnte Ihnen auch gefallen