On-line analytical processing (OLAP) systems provide support for the storage
and manipulation of data adopting the multidimensional data model.
As already seen, the multidimensional data model presents data in a standard,
intuitive format: a measure value related to members of n dimensions is
visualized as being stored in the appropriate cell of an n-dimensional cube.
For example, with the Suppliers-Parts-Projects database, the fact that 200 parts
P1 are supplied to project J1 by supplier S1 would be represented as follows.
OLAP OPERATIONS
OLAP cube operations are applied to a multidimensional cube to produce
another cube as a result. Common operations are slice, dice, roll-up,
drill-down and pivot.
The slice operation produces a subcube by performing a selection on one
dimension, e.g. produce a suppliers-parts-projects cube for those suppliers
based in London.
The dice operation produces a subcube by performing a selection on more
than one dimension, e.g. produce a suppliers-parts-projects cube for those
suppliers, parts and projects based in London.
The roll-up operation aggregates measure values within a cube, either by
aggregating at a higher level in the hierarchy underpinning a dimension, or
by removing one or more dimensions.
The drill-down operation performs the reverse of a roll-up: measure values
within a cube are aggregated at a lower level in the hierarchy underpinning
a dimension, or one or more dimensions are added.
For example, consider the following cube, and assume that country is within the
supplier dimension at a higher level than city.
This cube shows the effect of a roll-up operation with aggregation on supplier
country rather than city.
This cube shows the effect of a roll-up operation with removal of the part
dimension from the previous cube.
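The slice, dice and roll-up operations described above can be sketched in a few lines of Python. This is only an illustration, not how an OLAP server stores a cube: the cube is a plain dictionary keyed by (supplier city, part, project), the cell values and the city-to-country mapping are illustrative assumptions, and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative cube: (supplier city, part, project) -> quantity.
# The cell values are assumptions chosen for the example.
cube = {
    ("LEEDS", "P1", "J1"): 200, ("LONDON", "P1", "J1"): 100, ("LILLE", "P1", "J1"): 400,
    ("LEEDS", "P2", "J1"): 300, ("LONDON", "P2", "J1"): 200, ("LILLE", "P2", "J1"): 100,
    ("LEEDS", "P3", "J1"): 200, ("LONDON", "P3", "J1"): 200, ("LILLE", "P3", "J1"): 400,
}

# Assumed hierarchy on the supplier dimension: country sits above city.
CITY_COUNTRY = {"LEEDS": "UK", "LONDON": "UK", "LILLE": "FRANCE"}


def slice_cube(cube, axis, keep):
    """Slice: selection on ONE dimension (position `axis` of the key)."""
    return {key: qty for key, qty in cube.items() if keep(key[axis])}


def dice_cube(cube, predicates):
    """Dice: selection on several dimensions; `predicates` maps axis -> test."""
    return {key: qty for key, qty in cube.items()
            if all(test(key[axis]) for axis, test in predicates.items())}


def roll_up(cube, axis, to_parent):
    """Roll-up by aggregating one dimension at a higher hierarchy level."""
    result = defaultdict(int)
    for key, qty in cube.items():
        result[key[:axis] + (to_parent(key[axis]),) + key[axis + 1:]] += qty
    return dict(result)


def remove_dimension(cube, axis):
    """Roll-up by removing one dimension altogether."""
    result = defaultdict(int)
    for key, qty in cube.items():
        result[key[:axis] + key[axis + 1:]] += qty
    return dict(result)


london = slice_cube(cube, 0, lambda city: city == "LONDON")
london_p1_p2 = dice_cube(cube, {0: lambda c: c == "LONDON",
                                1: lambda p: p in ("P1", "P2")})
by_country = roll_up(cube, 0, CITY_COUNTRY.get)   # city -> country
no_part = remove_dimension(by_country, 1)         # drop the part dimension

print(by_country)
print(no_part)
```

Each function returns a new dictionary, mirroring the point that an OLAP operation takes a cube and produces another cube.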
S#    P#    J#    MONTH  QUANTITY
----- ----- ----- ------ --------
S1    P1    J1    200302      100
S1    P1    J1    200305      100
S1    P1    J4    200301      100
S1    P1    J4    200305      300
S1    P1    J4    200306      200
S1    P1    J4    200309      100
S2    P3    J1    200302      100
S2    P3    J1    200306      100
S2    P3    J1    200311      200
S2    P3    J2    200305      100
S2    P3    J2    200308      100
S2    P3    J3    200303      100
S2    P3    J3    200304      100
S2    P3    J4    200307      100
S2    P3    J4    200312      400
S2    P3    J5    200302      100
S2    P3    J5    200307      300
S2    P3    J5    200310      200
S2    P3    J6    200301      300
S2    P3    J6    200312      100
S2    P3    J7    200301      200
S2    P3    J7    200303      100
S2    P3    J7    200305      100
S2    P3    J7    200311      200
S2    P3    J7    200312      200
S2    P5    J2    200308      100
S3    P3    J1    200310      200
S3    P4    J2    200308      100
S3    P4    J2    200309      300
S3    P4    J2    200310      100
S4    P6    J3    200305      100
S4    P6    J3    200306      200
S4    P6    J7    200307      200
S4    P6    J7    200310      100
S5    P2    J2    200304      200
S5    P2    J4    200312      100
S5    P5    J5    200301      100
S5    P5    J5    200302      100
S5    P5    J5    200303      300
S5    P5    J7    200307      100
S5    P6    J2    200302      200
S5    P1    J4    200311      100
S5    P3    J4    200304      200
S5    P4    J4    200303      600
S5    P4    J4    200307      200
S5    P5    J4    200309      100
S5    P5    J4    200311      300
S5    P6    J4    200304      500
The ROLLUP option to GROUP BY creates extra rows in the result compared
with normal GROUP BY. For example, consider (with results below):
SELECT S#, P#, J#, SUM(QUANTITY)
FROM SUPPLY_MONTHLY
GROUP BY ROLLUP (S#, P#, J#)
S#    P#    J#    SUM(QUANTITY)
----- ----- ----- -------------
S1    P1    J1              200
S1    P1    J4              700
S1    P1                    900
S1                          900
S2    P3    J1              400
S2    P3    J2              200
S2    P3    J3              200
S2    P3    J4              500
S2    P3    J5              600
S2    P3    J6              400
S2    P3    J7              800
S2    P3                   3100
S2    P5    J2              100
S2    P5                    100
S2                         3200
S3    P3    J1              200
S3    P3                    200
S3    P4    J2              500
S3    P4                    500
S3                          700
S4    P6    J3              300
S4    P6    J7              300
S4    P6                    600
S4                          600
S5    P1    J4              100
S5    P1                    100
S5    P2    J2              200
S5    P2    J4              100
S5    P2                    300
S5    P3    J4              200
S5    P3                    200
S5    P4    J4              800
S5    P4                    800
S5    P5    J4              400
S5    P5    J5              500
S5    P5    J7              100
S5    P5                   1000
S5    P6    J2              200
S5    P6    J4              500
S5    P6                    700
S5                         3100
                           8500
Whereas a normal GROUP BY would produce a result row for each group of
rows with a particular (S#, P#, J#) combination of values, GROUP BY
ROLLUP additionally produces a row for each (S#, P#) group, a row for each
(S#) group and a row for the table as a whole. Such extra rows are known as
ordered super-aggregate rows.
The effect is to produce extra result rows corresponding to multidimensional
cube roll-up operations with dimensions being removed one at a time from right
to left following the ordering of columns specified in the ROLLUP clause.
In a super-aggregate result row, a null value is returned for each column not
participating in the aggregation at that level.
Hence, for a result row corresponding to a (S#, P#) aggregation, the J# value
will be null, while for a result row corresponding to a (S#) aggregation, both P#
and J# will be null, and all of S#, P#, J# are null for the whole table
aggregation.
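The way the super-aggregate rows arise can be made concrete by spelling ROLLUP out as a UNION ALL of ordinary GROUP BY queries, one per aggregation level. The sketch below is an emulation only, run through Python's sqlite3 module because SQLite has no ROLLUP option. The column names s_no, p_no and j_no stand in for S#, P# and J# (an assumption to avoid quoting), and only the S1 and S3 rows of SUPPLY_MONTHLY are loaded to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supply_monthly "
    "(s_no TEXT, p_no TEXT, j_no TEXT, month INTEGER, quantity INTEGER)"
)
# Subset of SUPPLY_MONTHLY: the rows for suppliers S1 and S3 only.
conn.executemany(
    "INSERT INTO supply_monthly VALUES (?, ?, ?, ?, ?)",
    [
        ("S1", "P1", "J1", 200302, 100), ("S1", "P1", "J1", 200305, 100),
        ("S1", "P1", "J4", 200301, 100), ("S1", "P1", "J4", 200305, 300),
        ("S1", "P1", "J4", 200306, 200), ("S1", "P1", "J4", 200309, 100),
        ("S3", "P3", "J1", 200310, 200), ("S3", "P4", "J2", 200308, 100),
        ("S3", "P4", "J2", 200309, 300), ("S3", "P4", "J2", 200310, 100),
    ],
)

# One SELECT per aggregation level; columns dropped from the grouping
# (right to left) are returned as NULL, as in a super-aggregate row.
rollup_sql = """
SELECT s_no, p_no, j_no, SUM(quantity) FROM supply_monthly GROUP BY s_no, p_no, j_no
UNION ALL
SELECT s_no, p_no, NULL, SUM(quantity) FROM supply_monthly GROUP BY s_no, p_no
UNION ALL
SELECT s_no, NULL, NULL, SUM(quantity) FROM supply_monthly GROUP BY s_no
UNION ALL
SELECT NULL, NULL, NULL, SUM(quantity) FROM supply_monthly
"""
rollup_rows = conn.execute(rollup_sql).fetchall()
for row in rollup_rows:
    print(row)  # NULL appears as None
```

The rows for S1 and S3 agree with the corresponding rows of the ROLLUP result above; the final whole-table row differs (1600 rather than 8500) only because three suppliers were left out of the sample.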
Multiple roll-up expressions may be specified in a single GROUP BY as well as
combinations of roll-up and non-roll-up expressions.
It has been seen that ROLLUP has an inherent ordering: in the example above,
no super-aggregate result rows were produced for (P#, J#) groups, for
instance.
GROUP BY CUBE produces super-aggregate rows for all combinations of
columns.
SELECT S#, P#, J#, SUM(QUANTITY)
FROM SUPPLY_MONTHLY
GROUP BY CUBE (S#, P#, J#)
S#    P#    J#    SUM(QUANTITY)
----- ----- ----- -------------
                           8500
            J1              800
            J2             1200
            J3              500
            J4             3300
            J5             1100
            J6              400
            J7             1200
      P1                   1000
      P1    J1              200
      P1    J4              800
      P2                    300
      P2    J2              200
      P2    J4              100
      P3                   3500
      P3    J1              600
      P3    J2              200
      P3    J3              200
      P3    J4              700
      P3    J5              600
      P3    J6              400
      P3    J7              800
      P4                   1300
      P4    J2              500
      P4    J4              800
      P5                   1100
      P5    J2              100
      P5    J4              400
      P5    J5              500
      P5    J7              100
      P6                   1300
      P6    J2              200
      P6    J3              300
      P6    J4              500
      P6    J7              300
S1                          900
S1          J1              200
S1          J4              700
S1    P1                    900
S1    P1    J1              200
S1    P1    J4              700
S2                         3200
S2          J1              400
S2          J2              300
S2          J3              200
S2          J4              500
S2          J5              600
S2          J6              400
S2          J7              800
S2    P3                   3100
S2    P3    J1              400
S2    P3    J2              200
S2    P3    J3              200
S2    P3    J4              500
S2    P3    J5              600
S2    P3    J6              400
S2    P3    J7              800
S2    P5                    100
S2    P5    J2              100
S3                          700
S3          J1              200
S3          J2              500
S3    P3                    200
S3    P3    J1              200
S3    P4                    500
S3    P4    J2              500
S4                          600
S4          J3              300
S4          J7              300
S4    P6                    600
S4    P6    J3              300
S4    P6    J7              300
S5                         3100
S5          J2              400
S5          J4             2100
S5          J5              500
S5          J7              100
S5    P1                    100
S5    P1    J4              100
S5    P2                    300
S5    P2    J2              200
S5    P2    J4              100
S5    P3                    200
S5    P3    J4              200
S5    P4                    800
S5    P4    J4              800
S5    P5                   1000
S5    P5    J4              400
S5    P5    J5              500
S5    P5    J7              100
S5    P6                    700
S5    P6    J2              200
S5    P6    J4              500
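Where ROLLUP drops grouping columns one at a time from the right, CUBE groups by every subset of the listed columns. The sketch below emulates that by generating one GROUP BY query per subset with itertools.combinations and running each through sqlite3. It is an emulation under assumptions, not the database's own CUBE implementation: SQLite has no CUBE option, the column names s_no, p_no and j_no stand in for S#, P# and J#, and the table holds only the (S#, P#, J#) totals for suppliers S1 and S3.

```python
import sqlite3
from itertools import combinations

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supply (s_no TEXT, p_no TEXT, j_no TEXT, quantity INTEGER)")
# Totals per (supplier, part, project) for suppliers S1 and S3 only.
conn.executemany(
    "INSERT INTO supply VALUES (?, ?, ?, ?)",
    [
        ("S1", "P1", "J1", 200), ("S1", "P1", "J4", 700),
        ("S3", "P3", "J1", 200), ("S3", "P4", "J2", 500),
    ],
)

COLUMNS = ("s_no", "p_no", "j_no")


def cube_rows(conn):
    """Run one GROUP BY per subset of COLUMNS and collect all result rows."""
    results = []
    for size in range(len(COLUMNS), -1, -1):
        for kept in combinations(COLUMNS, size):
            # Columns outside the current grouping are returned as NULL.
            select_list = ", ".join(c if c in kept else "NULL" for c in COLUMNS)
            group_by = " GROUP BY " + ", ".join(kept) if kept else ""
            sql = f"SELECT {select_list}, SUM(quantity) FROM supply{group_by}"
            results.extend(conn.execute(sql).fetchall())
    return results


cube_result = cube_rows(conn)
for row in cube_result:
    print(row)
```

With three columns there are 2^3 = 8 groupings, including the (P#, J#) grouping that ROLLUP never produces.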
SUM(QUANTITY)
P#)
SUM(QUANTITY)
J#)
Here, there is a query expression involving the join of SUPPLY and PART and
the calculation of WEIGHT * QUANTITY in both the outer and inner levels of
the query.
Use of the WITH clause enables the common query expression to be factored
out with the table resulting from the query expression being used in the
subsequent query each time it is referenced.
WITH TOTAL_WEIGHT_SUPPLY AS
  (SELECT SPJ.S#, SPJ.P#, SPJ.J#,
          P.WEIGHT * SPJ.QUANTITY AS TOTAL_WEIGHT
   FROM SUPPLY SPJ, PART P
   WHERE SPJ.P# = P.P#)
SELECT S#, P#, J#, TOTAL_WEIGHT
FROM TOTAL_WEIGHT_SUPPLY
WHERE TOTAL_WEIGHT =
      (SELECT MAX(TOTAL_WEIGHT)
       FROM TOTAL_WEIGHT_SUPPLY)
The query expression factored out in this way can contain all the constructs
which may be found in a subquery, such as GROUP BY and HAVING.
In addition to making the query simpler to write, with less chance of errors being
introduced in query expressions which should be the same, there is also a
potential performance advantage, since the table which results from the query
expression need only be evaluated once and can then be reused in each place
in the subsequent query where it is referenced.
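The WITH query above can be tried out on a small scale with Python's sqlite3 module, since SQLite also supports the WITH clause. The following is a sketch under assumptions: the column names s_no, p_no and j_no replace S#, P# and J#, and the part weights and supply rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE part   (p_no TEXT PRIMARY KEY, weight INTEGER);
CREATE TABLE supply (s_no TEXT, p_no TEXT, j_no TEXT, quantity INTEGER);
-- Made-up sample data for illustration only.
INSERT INTO part VALUES ('P1', 12), ('P3', 17), ('P4', 14);
INSERT INTO supply VALUES
  ('S1', 'P1', 'J4', 700),
  ('S2', 'P3', 'J7', 800),
  ('S5', 'P4', 'J4', 800);
""")

# The join and the weight calculation are written once, then referenced
# twice: in the outer query and in the MAX subquery.
query = """
WITH total_weight_supply AS
  (SELECT spj.s_no, spj.p_no, spj.j_no,
          p.weight * spj.quantity AS total_weight
   FROM supply spj, part p
   WHERE spj.p_no = p.p_no)
SELECT s_no, p_no, j_no, total_weight
FROM total_weight_supply
WHERE total_weight =
      (SELECT MAX(total_weight)
       FROM total_weight_supply)
"""
heaviest = conn.execute(query).fetchall()
print(heaviest)
```

Whether the factored-out expression is actually evaluated only once is up to the query optimizer of the database in use; the sketch shows the syntax, not the performance behaviour.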
The WHEN conditions are evaluated in order to find the first one which is true,
and the corresponding THEN expression is returned. Again, if no condition is
true, the ELSE expression is returned, or NULL if the ELSE is omitted.
An example of use of a searched CASE is:
SELECT S#,
       SNAME,
       CASE
         WHEN ... THEN ...
         WHEN ... THEN ...
         ELSE ...
       END AS ...
FROM SUPPLIER
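A complete searched CASE can be run with Python's sqlite3 module to see the evaluation rules in action. Everything specific in this sketch is an assumption made for illustration: the STATUS column, its thresholds, the band names and the sample suppliers are hypothetical and are not taken from the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (s_no TEXT, sname TEXT, status INTEGER);
-- Hypothetical sample rows; S4 has an unknown (NULL) status.
INSERT INTO supplier VALUES
  ('S1', 'Smith', 20), ('S2', 'Jones', 10),
  ('S3', 'Blake', 30), ('S4', 'Clark', NULL);
""")

# WHEN conditions are tested in order; the first true one wins, so a
# status of 30 yields 'HIGH' even though it also satisfies status >= 20.
with_else = conn.execute("""
SELECT s_no, sname,
       CASE
         WHEN status >= 30 THEN 'HIGH'
         WHEN status >= 20 THEN 'MEDIUM'
         ELSE 'LOW'
       END AS status_band
FROM supplier
ORDER BY s_no
""").fetchall()

# Same expression with the ELSE omitted: unmatched rows get NULL (None).
without_else = conn.execute("""
SELECT s_no,
       CASE
         WHEN status >= 30 THEN 'HIGH'
         WHEN status >= 20 THEN 'MEDIUM'
       END AS status_band
FROM supplier
ORDER BY s_no
""").fetchall()

print(with_else)
print(without_else)
```

Note that for S4 neither condition is true, because a comparison with NULL is unknown, so the ELSE branch applies in the first query and NULL is returned in the second.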
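P#    J#    SCITY  QUANTITY
----- ----- ------ --------
P1    J1    LEEDS       200
P1    J1    LONDON      100
P1    J1    LILLE       400
P2    J1    LEEDS       300
P2    J1    LONDON      200
P2    J1    LILLE       100
P3    J1    LEEDS       200
P3    J1    LONDON      200
P3    J1    LILLE       400

P#    J#    LEEDS_QTY LONDON_QTY LILLE_QTY
----- ----- --------- ---------- ---------
P1    J1          200        100       400
P2    J1          300        200       100
P3    J1          200        200       400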
Assuming the first table is named PIVOT_EXAMPLE, the second table can be
produced by executing the query:
SELECT P#,
       J#,
       MAX(CASE SCITY WHEN 'LEEDS' THEN QUANTITY ELSE NULL END)
         AS LEEDS_QTY,
       MAX(CASE SCITY WHEN 'LONDON' THEN QUANTITY ELSE NULL END)
         AS LONDON_QTY,
       MAX(CASE SCITY WHEN 'LILLE' THEN QUANTITY ELSE NULL END)
         AS LILLE_QTY
FROM PIVOT_EXAMPLE
GROUP BY P#, J#
ORDER BY P#, J#
Here a GROUP BY is used to produce a single row result for each combination
of a P# and J# value.
The CASE expression is then used to generate the correct QUANTITY value
depending on the SCITY value in the row of the original table.
Use of the MAX function ensures that the resulting expression is single-valued
per group. If no function were used, an error would result.
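The pivot query can be checked end to end with Python's sqlite3 module. This is a sketch under assumptions: p_no and j_no stand in for P# and J#, and the assignment of the nine quantities to parts P1, P2 and P3 (three city rows per part) is assumed for the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pivot_example (p_no TEXT, j_no TEXT, scity TEXT, quantity INTEGER)"
)
# Assumed layout: one row per (part, project, supplier city).
conn.executemany(
    "INSERT INTO pivot_example VALUES (?, ?, ?, ?)",
    [
        ("P1", "J1", "LEEDS", 200), ("P1", "J1", "LONDON", 100), ("P1", "J1", "LILLE", 400),
        ("P2", "J1", "LEEDS", 300), ("P2", "J1", "LONDON", 200), ("P2", "J1", "LILLE", 100),
        ("P3", "J1", "LEEDS", 200), ("P3", "J1", "LONDON", 200), ("P3", "J1", "LILLE", 400),
    ],
)

# Within each (p_no, j_no) group, each CASE yields the quantity for one city
# and NULL for the others; MAX ignores the NULLs and keeps the single value.
pivot_sql = """
SELECT p_no,
       j_no,
       MAX(CASE scity WHEN 'LEEDS'  THEN quantity ELSE NULL END) AS leeds_qty,
       MAX(CASE scity WHEN 'LONDON' THEN quantity ELSE NULL END) AS london_qty,
       MAX(CASE scity WHEN 'LILLE'  THEN quantity ELSE NULL END) AS lille_qty
FROM pivot_example
GROUP BY p_no, j_no
ORDER BY p_no, j_no
"""
pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

SQLite happens to accept non-aggregated columns in a grouped query, so the error mentioned above for a missing aggregate function would not be reproduced by this sketch.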
The RANK function enables ranking of rows, which is useful for top-n queries.
Suppose that it is required to retrieve the rows with the 10 highest quantities
from the SUPPLY table.
SELECT S#, P#, J#, QUANTITY,
       QUANTITY_RANK, QUANTITY_DENSE_RANK
FROM (SELECT S#, P#, J#, QUANTITY,
             RANK() OVER
               (ORDER BY QUANTITY DESC)
               AS QUANTITY_RANK,
             DENSE_RANK() OVER
               (ORDER BY QUANTITY DESC)
               AS QUANTITY_DENSE_RANK
      FROM SUPPLY)
WHERE QUANTITY_RANK <= 10
S#    P#    J#    QUANTITY QUANTITY_RANK QUANTITY_DENSE_RANK
----- ----- ----- -------- ------------- -------------------
S2    P3    J7         800             1                   1
S5    P4    J4         800             1                   1
S1    P1    J4         700             3                   2
S2    P3    J5         600             4                   3
S2    P3    J4         500             5                   4
S3    P4    J2         500             5                   4
S5    P6    J4         500             5                   4
S5    P5    J5         500             5                   4
S2    P3    J1         400             9                   5
S2    P3    J6         400             9                   5
S5    P5    J4         400             9                   5

11 rows selected.
Two functions have been used in the query, RANK and DENSE_RANK. These
differ in their handling of duplicate values in the ranked column.
As can be seen in the example, two rows have the highest value for quantity of
800. Whereas RANK ranks the row with the next highest value for quantity as 3,
i.e. a sparse ranking, DENSE_RANK ranks the row with the next highest value
for quantity as 2, i.e. a dense ranking with no gaps in the ranking order.
Because the three rows tied at rank 9 all satisfy QUANTITY_RANK <= 10, the
query returns 11 rows rather than 10.
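The difference between the two ranking functions can be reproduced with Python's sqlite3 module. The sketch assumes an SQLite library of version 3.25 or later (the first with window functions) and loads only a quantity column: the eleven highest quantities from the example plus two smaller ones, so that the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supply (quantity INTEGER)")
# The eleven highest quantities from the example, plus two rows of 300.
quantities = [800, 800, 700, 600, 500, 500, 500, 500, 400, 400, 400, 300, 300]
conn.executemany("INSERT INTO supply VALUES (?)", [(q,) for q in quantities])

# The ranks are computed in an inner query because a window function
# cannot be referenced directly in the WHERE clause of the same query.
rank_sql = """
SELECT quantity, quantity_rank, quantity_dense_rank
FROM (SELECT quantity,
             RANK()       OVER (ORDER BY quantity DESC) AS quantity_rank,
             DENSE_RANK() OVER (ORDER BY quantity DESC) AS quantity_dense_rank
      FROM supply)
WHERE quantity_rank <= 10
ORDER BY quantity_rank
"""
ranked = conn.execute(rank_sql).fetchall()
for row in ranked:
    print(row)
```

The two rows with quantity 300 would receive rank 12 and are filtered out, while all three rows tied at rank 9 are kept.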
S#    MONTH  3 SUPPLY TOTAL
----- ------ --------------
S1    200301            100
S1    200302            200
S1    200305            300
S1    200305            500
S1    200306            600
S1    200309            600
S2    200301            300
S2    200301            500
S2    200302            600
S2    200302            400
S2    200303            300
S2    200303            300
S2    200304            300
S2    200305            300
S2    200305            300
S2    200306            300
S2    200307            300
S2    200307            500
S2    200308            500
S2    200308            500
S2    200310            400
S2    200311            500
S2    200311            600
S2    200312            800
S2    200312            700
S2    200312            700
S3    200308            100
S3    200309            400
S3    200310            600
S3    200310            600
S4    200305            100
S4    200306            300
S4    200307            500
S4    200310            500
S5    200301            100
S5    200302            200
S5    200302            400
S5    200303            600
S5    200303           1100
S5    200304           1100
S5    200304           1000
S5    200304            900
S5    200307            800
S5    200307            800
S5    200309            400
S5    200311            400
S5    200311            500
S5    200312            500
S#    MONTH  3 MONTH SUPPLY TOTAL
----- ------ --------------------
S1    200301                  100
S1    200302                  200
S1    200305                  400
S1    200305                  400
S1    200306                  600
S1    200309                  100
S2    200301                  500
S2    200301                  500
S2    200302                  700
S2    200302                  700
S2    200303                  900
S2    200303                  900
S2    200304                  500
S2    200305                  500
S2    200305                  500
S2    200306                  400
S2    200307                  700
S2    200307                  700
S2    200308                  700
S2    200308                  700
S2    200310                  400
S2    200311                  600
S2    200311                  600
S2    200312                 1300
S2    200312                 1300
S2    200312                 1300
S3    200308                  100
S3    200309                  400
S3    200310                  700
S3    200310                  700
S4    200305                  100
S4    200306                  300
S4    200307                  500
S4    200310                  100
S5    200301                  100
S5    200302                  400
S5    200302                  400
S5    200303                 1300
S5    200303                 1300
S5    200304                 2100
S5    200304                 2100
S5    200304                 2100
S5    200307                  300
S5    200307                  300
S5    200309                  400
S5    200311                  500
S5    200311                  500
S5    200312                  500
The RANGE clause specifying a logical range of rows can only be used with an
ordering column that is of a numeric data type, a datetime type or an interval
type.
A further restriction of the RANGE clause is that only one ordering column may
be specified in the ORDER BY clause.
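The contrast between a physical window (ROWS) and a logical window (RANGE) can be seen by running both frames over the six SUPPLY_MONTHLY rows for supplier S1. The sketch uses Python's sqlite3 module and assumes SQLite 3.28 or later, which is needed for RANGE frames with a numeric offset. The column name s_no stands in for S#, and the extra ordering on quantity in the ROWS query is an assumption added only to make the order of the two 200305 rows deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supply_monthly (s_no TEXT, month INTEGER, quantity INTEGER)")
# The six rows for supplier S1.
conn.executemany(
    "INSERT INTO supply_monthly VALUES (?, ?, ?)",
    [
        ("S1", 200302, 100), ("S1", 200305, 100), ("S1", 200301, 100),
        ("S1", 200305, 300), ("S1", 200306, 200), ("S1", 200309, 100),
    ],
)

# ROWS: the current row plus the two physically preceding rows.
rows_sql = """
SELECT month,
       SUM(quantity) OVER (PARTITION BY s_no
                           ORDER BY month, quantity
                           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM supply_monthly
ORDER BY month, quantity
"""

# RANGE: all rows whose month value lies within 2 of the current row's
# month, so rows for the same month always receive the same total.
range_sql = """
SELECT month,
       SUM(quantity) OVER (PARTITION BY s_no
                           ORDER BY month
                           RANGE BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM supply_monthly
ORDER BY month, quantity
"""

rows_totals = [total for _, total in conn.execute(rows_sql)]
range_totals = [total for _, total in conn.execute(range_sql)]
print(rows_totals)
print(range_totals)
```

Because MONTH is held here as a plain YYYYMM integer, the RANGE offset is simple integer arithmetic and would not span a year boundary correctly (200401 minus 2 is 200399, not 200311); a proper date or interval type avoids this.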
The examples seen above have only used a window specified in terms of
ROWS/RANGE preceding the current row. More generally, a window frame may be
specified by reference to rows/range both preceding and following the current
row, e.g.
ROWS BETWEEN 3 PRECEDING AND 2 FOLLOWING
RANGE BETWEEN UNBOUNDED PRECEDING AND 2 FOLLOWING
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
If neither a window frame nor a window order is specified, the effect is that of
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
If a window order but no window frame is specified, the effect is that of
RANGE UNBOUNDED PRECEDING
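The two default cases can be observed directly with Python's sqlite3 module, assuming SQLite 3.25 or later. The table name t and its four rows are made-up sample data; two rows share the month 200305 to show how the default RANGE frame treats rows with equal ordering values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (month INTEGER, quantity INTEGER)")
# Made-up sample rows; the last two share the same month.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(200301, 100), (200302, 100), (200305, 100), (200305, 300)],
)

default_sql = """
SELECT month,
       quantity,
       -- no order, no frame: the window is the whole partition
       SUM(quantity) OVER ()               AS whole_partition,
       -- order but no frame: RANGE UNBOUNDED PRECEDING
       SUM(quantity) OVER (ORDER BY month) AS running_total
FROM t
ORDER BY month, quantity
"""
default_rows = conn.execute(default_sql).fetchall()
for row in default_rows:
    print(row)
```

Both 200305 rows show a running total of 600: under a RANGE frame, rows with the same ordering value as the current row fall inside its window, so the running total jumps once per distinct month rather than once per row.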