
OLAP

On-line analytical processing (OLAP) systems provide support for the storage
and manipulation of data adopting the multidimensional data model.
As already seen, the multidimensional data model presents data in a standard,
intuitive format with a measure value related to members of n dimensions being
visualized as being stored in the appropriate cell of an n-dimensional cube.
For example, with the Suppliers-Parts-Projects database, the fact that 200 parts
P1 are supplied to project J1 by supplier S1 would be represented as follows.

OLAP OPERATIONS
OLAP cube operations are applied to a multidimensional cube to produce
another cube as a result. Common operations are slice, dice, roll-up, drill
down and pivot.
The slice operation produces a subcube by performing a selection on one
dimension, e.g. produce a suppliers-parts-projects cube for those suppliers
based in London.
The dice operation produces a subcube by performing a selection on more
than one dimension, e.g. produce a suppliers-parts-projects cube for those
suppliers, parts and projects based in London.
The roll-up operation aggregates measure values within a cube either by
aggregating at a higher level in the hierarchy underpinning a dimension, or by
removing one or more dimensions.
The drill-down operation performs the reverse of a roll-up: measure values
within a cube are presented at a lower, more detailed level in the hierarchy
underpinning a dimension, or one or more dimensions are added.
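The operations above can be sketched on a toy cube held as a Python dictionary. This is only an illustration under stated assumptions: the cell values other than (S1, P1, J1) = 200, the supplier_city mapping and all function names are hypothetical and do not come from the Suppliers-Parts-Projects database itself.

```python
from collections import defaultdict

# Hypothetical toy cube: (supplier, part, project) -> quantity.
cube = {
    ("S1", "P1", "J1"): 200,
    ("S1", "P2", "J1"): 100,
    ("S2", "P1", "J1"): 300,
    ("S2", "P1", "J2"): 400,
}
# Hypothetical one-level hierarchy on the supplier dimension.
supplier_city = {"S1": "London", "S2": "Paris"}


def slice_by_city(cube, city):
    """Slice: selection on one dimension (suppliers based in `city`)."""
    return {k: v for k, v in cube.items() if supplier_city[k[0]] == city}


def dice(cube, suppliers, projects):
    """Dice: selection on more than one dimension at once."""
    return {k: v for k, v in cube.items()
            if k[0] in suppliers and k[2] in projects}


def roll_up_remove_part(cube):
    """Roll-up by removing the part dimension (sum over all parts)."""
    out = defaultdict(int)
    for (s, _p, j), qty in cube.items():
        out[(s, j)] += qty
    return dict(out)


def roll_up_to_city(cube):
    """Roll-up along the supplier hierarchy: supplier -> city."""
    out = defaultdict(int)
    for (s, p, j), qty in cube.items():
        out[(supplier_city[s], p, j)] += qty
    return dict(out)


print(slice_by_city(cube, "London"))
print(roll_up_remove_part(cube))
```

Each function returns another cube of the same shape (or with one dimension fewer), which mirrors the point that an OLAP operation takes a cube and produces a cube.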

For example, consider the following cube, and assume that country is within the
supplier dimension at a higher level than city.

This cube shows the effect of a roll-up operation with aggregation on supplier
country rather than city.

This cube shows the effect of a roll-up operation with removal of the part
dimension from the previous cube.

The pivot operation rotates a cube to present the dimensions in a different
orientation. For example, the cube below on the left is pivoted to produce
the cube on the right.

The pivot operation is often used in a tabular representation to rotate rows
into columns.

P#    J#   SCITY   QUANTITY
----- ---- ------- --------
P1    J1   LEEDS        200
P1    J1   LONDON       100
P1    J1   LILLE        400
P2    J1   LEEDS        300
P2    J1   LONDON       200
P2    J1   LILLE        100
P3    J1   LEEDS        200
P3    J1   LONDON       200
P3    J1   LILLE        400

P#    J#   LEEDS_QTY  LONDON_QTY  LILLE_QTY
----- ---- ---------- ----------- ---------
P1    J1          200         100       400
P2    J1          300         200       100
P3    J1          200         200       400

OLAP SUPPORT IN SQL:1999


SQL as defined in versions earlier than SQL:1999 had only basic facilities for
analyzing data, namely the aggregate functions COUNT, MAX, MIN, SUM and
AVG, as well as the GROUP BY and HAVING clauses enabling those functions to
be applied over groups of rows within a table.
SQL:1999 introduced the CUBE, ROLLUP and GROUPING SETS constructs
which provide the basis for more general grouping capabilities.
These are now illustrated with a modified version of the SUPPLY table seen
previously. This table, SUPPLY_MONTHLY, has an extra column MONTH which
records in which month a particular supply instance occurs.
The result of SELECT * FROM SUPPLY_MONTHLY follows.

S#   P#    J#   MONTH  QUANTITY
---- ----- ---- ------ --------
S1   P1    J1   200302      100
S1   P1    J1   200305      100
S1   P1    J4   200301      100
S1   P1    J4   200305      300
S1   P1    J4   200306      200
S1   P1    J4   200309      100
S2   P3    J1   200302      100
S2   P3    J1   200306      100
S2   P3    J1   200311      200
S2   P3    J2   200305      100
S2   P3    J2   200308      100
S2   P3    J3   200303      100
S2   P3    J3   200304      100
S2   P3    J4   200307      100
S2   P3    J4   200312      400
S2   P3    J5   200302      100
S2   P3    J5   200307      300
S2   P3    J5   200310      200
S2   P3    J6   200301      300
S2   P3    J6   200312      100
S2   P3    J7   200301      200
S2   P3    J7   200303      100
S2   P3    J7   200305      100
S2   P3    J7   200311      200
S2   P3    J7   200312      200
S2   P5    J2   200308      100
S3   P3    J1   200310      200
S3   P4    J2   200308      100
S3   P4    J2   200309      300
S3   P4    J2   200310      100
S4   P6    J3   200305      100
S4   P6    J3   200306      200
S4   P6    J7   200307      200
S4   P6    J7   200310      100
S5   P2    J2   200304      200
S5   P2    J4   200312      100
S5   P5    J5   200301      100
S5   P5    J5   200302      100
S5   P5    J5   200303      300
S5   P5    J7   200307      100
S5   P6    J2   200302      200
S5   P1    J4   200311      100
S5   P3    J4   200304      200
S5   P4    J4   200303      600
S5   P4    J4   200307      200
S5   P5    J4   200309      100
S5   P5    J4   200311      300
S5   P6    J4   200304      500

The ROLLUP option to GROUP BY creates extra rows in the result compared
with normal GROUP BY. For example, consider (with results below):
SELECT S#, P#, J#, SUM(QUANTITY)
FROM SUPPLY_MONTHLY
GROUP BY ROLLUP (S#, P#, J#)
S#   P#    J#   SUM(QUANTITY)
---- ----- ---- -------------
S1   P1    J1             200
S1   P1    J4             700
S1   P1                   900
S1                        900
S2   P3    J1             400
S2   P3    J2             200
S2   P3    J3             200
S2   P3    J4             500
S2   P3    J5             600
S2   P3    J6             400
S2   P3    J7             800
S2   P3                  3100
S2   P5    J2             100
S2   P5                   100
S2                       3200
S3   P3    J1             200
S3   P3                   200
S3   P4    J2             500
S3   P4                   500
S3                        700
S4   P6    J3             300
S4   P6    J7             300
S4   P6                   600
S4                        600
S5   P1    J4             100
S5   P1                   100
S5   P2    J2             200
S5   P2    J4             100
S5   P2                   300
S5   P3    J4             200
S5   P3                   200
S5   P4    J4             800
S5   P4                   800
S5   P5    J4             400
S5   P5    J5             500
S5   P5    J7             100
S5   P5                  1000
S5   P6    J2             200
S5   P6    J4             500
S5   P6                   700
S5                       3100
                         8500


Whereas a normal GROUP BY would produce a result row for each group of
rows with a particular (S#, P#, J#) combination of values, GROUP BY
ROLLUP additionally produces a row for each (S#, P#) group, a row for each
(S#) group and a row for the table as a whole. Such extra rows are known as
ordered super-aggregate rows.
The effect is to produce extra result rows corresponding to multidimensional
cube roll-up operations with dimensions being removed one at a time from right
to left following the ordering of columns specified in the ROLLUP clause.
In a super-aggregate result row, a null value is returned for each column not
participating in the aggregation at that level.
Hence, for a result row corresponding to a (S#, P#) aggregation, the J# value
will be null, while for a result row corresponding to a (S#) aggregation, both P#
and J# will be null, and all of S#, P#, J# are null for the whole table
aggregation.
Multiple roll-up expressions may be specified in a single GROUP BY as well as
combinations of roll-up and non-roll-up expressions.

In addition to a null value being used to represent a column not participating in
the grouping at a particular level, a grouping column may itself contain null
values. Hence, groups arise in the normal way in respect of rows with null
values.
In order to make it possible to distinguish between result rows containing null
values corresponding to nulls in the table, and null values corresponding to
columns not participating in the grouping, a function GROUPING is used. For
example, consider
SELECT S#, P#, J#, SUM(QUANTITY), GROUPING(J#)
FROM SUPPLY_MONTHLY
GROUP BY ROLLUP (S#, P#, J#)
For each result row, GROUPING(J#) will return an integer value 1 if J# is null
because the row is a super-aggregate row with J# not participating in the
aggregation, otherwise an integer value 0 will be returned.
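The behaviour of ROLLUP and GROUPING can be emulated in a few lines of Python. This is a sketch of the semantics only, not of how a database evaluates the query; the function name rollup and the shortened row list (seven SUPPLY_MONTHLY rows with MONTH omitted) are assumptions made for illustration.

```python
from collections import defaultdict

# A few SUPPLY_MONTHLY rows as (S#, P#, J#, QUANTITY); MONTH is omitted.
rows = [
    ("S1", "P1", "J1", 100), ("S1", "P1", "J1", 100),
    ("S1", "P1", "J4", 100), ("S1", "P1", "J4", 300),
    ("S1", "P1", "J4", 200), ("S1", "P1", "J4", 100),
    ("S3", "P3", "J1", 200),
]


def rollup(rows, n_keys=3):
    """Emulate GROUP BY ROLLUP over the first n_keys columns.

    Each result row is (key columns..., total, grouping flags), where a
    flag of 1 marks a column not participating at that level, which is
    the value GROUPING(column) would return.
    """
    result = []
    # Levels (S#, P#, J#), (S#, P#), (S#), (): columns drop off right to left.
    for level in range(n_keys, -1, -1):
        totals = defaultdict(int)
        for row in rows:
            totals[row[:level]] += row[n_keys]
        for key, total in totals.items():
            padded = key + (None,) * (n_keys - level)
            flags = (0,) * level + (1,) * (n_keys - level)
            result.append(padded + (total, flags))
    return result


result = rollup(rows)
for line in result:
    print(line)
```

The flags make the difference explicit: a None in the J# position with flag 1 is a super-aggregate row, whereas a None with flag 0 would be a genuine null stored in the table.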


It has been seen that ROLLUP has an inherent ordering: in the example above,
no super-aggregate result rows were produced in respect of (P#, J#) groups
for example.
GROUP BY CUBE produces super-aggregate rows for all combinations of
columns.
SELECT S#, P#, J#, SUM(QUANTITY)
FROM SUPPLY_MONTHLY
GROUP BY CUBE (S#, P#, J#)

S#   P#    J#   SUM(QUANTITY)
---- ----- ---- -------------
                         8500
           J1             800
           J2            1200
           J3             500
           J4            3300
           J5            1100
           J6             400
           J7            1200
     P1                  1000
     P1    J1             200
     P1    J4             800
     P2                   300
     P2    J2             200
     P2    J4             100
     P3                  3500
     P3    J1             600
     P3    J2             200
     P3    J3             200
     P3    J4             700
     P3    J5             600
     P3    J6             400
     P3    J7             800
     P4                  1300
     P4    J2             500
     P4    J4             800
     P5                  1100
     P5    J2             100
     P5    J4             400
     P5    J5             500
     P5    J7             100
     P6                  1300
     P6    J2             200
     P6    J3             300
     P6    J4             500
     P6    J7             300
S1                        900
S1         J1             200
S1         J4             700
S1   P1                   900
S1   P1    J1             200
S1   P1    J4             700
S2                       3200
S2         J1             400
S2         J2             300
S2         J3             200
S2         J4             500
S2         J5             600
S2         J6             400
S2         J7             800
S2   P3                  3100
S2   P3    J1             400
S2   P3    J2             200
S2   P3    J3             200
S2   P3    J4             500
S2   P3    J5             600
S2   P3    J6             400
S2   P3    J7             800
S2   P5                   100
S2   P5    J2             100
S3                        700
S3         J1             200
S3         J2             500
S3   P3                   200
S3   P3    J1             200
S3   P4                   500
S3   P4    J2             500
S4                        600
S4         J3             300
S4         J7             300
S4   P6                   600
S4   P6    J3             300
S4   P6    J7             300
S5                       3100
S5         J2             400
S5         J4            2100
S5         J5             500
S5         J7             100
S5   P1                   100
S5   P1    J4             100
S5   P2                   300
S5   P2    J2             200
S5   P2    J4             100
S5   P3                   200
S5   P3    J4             200
S5   P4                   800
S5   P4    J4             800
S5   P5                  1000
S5   P5    J4             400
S5   P5    J5             500
S5   P5    J7             100
S5   P6                   700
S5   P6    J2             200
S5   P6    J4             500

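The difference from ROLLUP is that CUBE aggregates over every subset of the grouping columns, not just the prefixes. A minimal Python sketch of that idea follows; the function name cube_totals is hypothetical, and the three input rows are the pre-aggregated (S#, P#, J#) totals for part P1 taken from the ROLLUP result above.

```python
from collections import defaultdict
from itertools import combinations

# Pre-aggregated totals for part P1: (S#, P#, J#, QUANTITY).
rows = [
    ("S1", "P1", "J1", 200),
    ("S1", "P1", "J4", 700),
    ("S5", "P1", "J4", 100),
]


def cube_totals(rows, n_keys=3):
    """Emulate GROUP BY CUBE: one grouping per subset of the key columns.

    Columns outside the chosen subset are shown as None. The sketch
    assumes the data itself contains no nulls in the key columns.
    """
    totals = defaultdict(int)
    for size in range(n_keys + 1):
        for kept in combinations(range(n_keys), size):
            for row in rows:
                key = tuple(row[i] if i in kept else None
                            for i in range(n_keys))
                totals[key] += row[n_keys]
    return dict(totals)


result = cube_totals(rows)
print(result[(None, "P1", None)])   # total for part P1 across all suppliers and projects
print(result[(None, "P1", "J4")])   # total for part P1 on project J4
```

With three grouping columns there are 2^3 = 8 groupings, which is why the CUBE result is so much longer than the ROLLUP result, which has only four.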

Sometimes it is useful to combine different grouping operations in a single query.
This could be used, for example, to combine grouping operations on different
dimensions in a cube.
GROUP BY GROUPING SETS produces result rows for each grouping
operation as normal, but then combines those into a single result. For example:
SELECT S#, P#, J#, SUM(QUANTITY)
FROM SUPPLY_MONTHLY
GROUP BY GROUPING SETS
(ROLLUP (S#, P#),
ROLLUP (S#, J#))
is equivalent to:
SELECT S#, P#, NULL, SUM(QUANTITY)
FROM SUPPLY_MONTHLY
GROUP BY ROLLUP (S#, P#)
UNION ALL
SELECT S#, NULL, J#, SUM(QUANTITY)
FROM SUPPLY_MONTHLY
GROUP BY ROLLUP (S#, J#)

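The UNION ALL reading of GROUPING SETS can be tried out with Python's built-in sqlite3 module. Several assumptions apply: SQLite is used only because it ships with Python and is assumed here not to support GROUPING SETS or ROLLUP, so only the UNION ALL side is executed; the sketch covers the simpler case GROUPING SETS ((S#, P#), (S#, J#)) rather than the nested ROLLUP form; and the columns are renamed S_NO, P_NO and J_NO to avoid quoting the # character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SUPPLY_MONTHLY "
    "(S_NO TEXT, P_NO TEXT, J_NO TEXT, QUANTITY INTEGER)"
)
# The four supplier S3 rows of SUPPLY_MONTHLY (MONTH omitted).
conn.executemany(
    "INSERT INTO SUPPLY_MONTHLY VALUES (?, ?, ?, ?)",
    [("S3", "P3", "J1", 200), ("S3", "P4", "J2", 100),
     ("S3", "P4", "J2", 300), ("S3", "P4", "J2", 100)],
)

# One plain GROUP BY per grouping set, padded with NULL and concatenated.
union_query = """
SELECT S_NO, P_NO, NULL AS J_NO, SUM(QUANTITY) AS TOTAL
FROM SUPPLY_MONTHLY
GROUP BY S_NO, P_NO
UNION ALL
SELECT S_NO, NULL, J_NO, SUM(QUANTITY)
FROM SUPPLY_MONTHLY
GROUP BY S_NO, J_NO
"""
grouping_sets_rows = conn.execute(union_query).fetchall()
for row in grouping_sets_rows:
    print(row)
```

Each branch of the UNION ALL contributes the rows of one grouping set, with NULL standing in for the column that does not participate.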

Another feature introduced in SQL:1999 which can be useful in OLAP queries
is the WITH clause.
This enables a developer to introduce a name for a query expression which
would otherwise be repeated multiple times in the associated query: instead the
name can be used to reference the query expression whenever it is needed in
the query.
For example, recall the following query from the SQL DML note.
Find the supplier number, part number, project number and total weight of the
parts supplied where the total weight is the maximum for any supply instance.
SELECT SPJ.S#, SPJ.P#, SPJ.J#,
P.WEIGHT * SPJ.QUANTITY AS TOTAL_WEIGHT
FROM SUPPLY SPJ, PART P
WHERE SPJ.P# = P.P#
AND P.WEIGHT * SPJ.QUANTITY =
(SELECT MAX (P2.WEIGHT * SPJ2.QUANTITY)
FROM SUPPLY SPJ2, PART P2
WHERE SPJ2.P# = P2.P#)

Here, there is a query expression involving the join of SUPPLY and PART and
the calculation of WEIGHT * QUANTITY in both the outer and inner levels of
the query.
Use of the WITH clause enables the common query expression to be factored
out with the table resulting from the query expression being used in the
subsequent query each time it is referenced.
WITH TOTAL_WEIGHT_SUPPLY AS
(SELECT SPJ.S#, SPJ.P#, SPJ.J#,
P.WEIGHT * SPJ.QUANTITY AS TOTAL_WEIGHT
FROM SUPPLY SPJ, PART P
WHERE SPJ.P# = P.P#)
SELECT S#, P#, J#, TOTAL_WEIGHT
FROM TOTAL_WEIGHT_SUPPLY
WHERE TOTAL_WEIGHT =
(SELECT MAX (TOTAL_WEIGHT)
FROM TOTAL_WEIGHT_SUPPLY)


The query expression factored out in this way can contain all the constructs
which may be found in a subquery, such as GROUP BY and HAVING.
In addition to making the query simpler to write, with less chance of errors being
introduced in query expressions which should be the same, there is also a
potential performance advantage, since the table which results from the query
expression may be computed once and then reused in each place in the
subsequent query where it is referenced.
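The WITH form of the maximum total weight query can be run as written against a small in-memory database. The following is a minimal sketch using Python's sqlite3 module; the part weights and supply rows are invented for illustration, and the columns are renamed S_NO, P_NO and J_NO to avoid quoting the # character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PART (P_NO TEXT, WEIGHT INTEGER);
CREATE TABLE SUPPLY (S_NO TEXT, P_NO TEXT, J_NO TEXT, QUANTITY INTEGER);
-- Hypothetical sample data.
INSERT INTO PART VALUES ('P1', 12), ('P2', 17);
INSERT INTO SUPPLY VALUES
    ('S1', 'P1', 'J1', 200),
    ('S2', 'P2', 'J1', 100),
    ('S2', 'P1', 'J2', 300);
""")

# The join and the WEIGHT * QUANTITY calculation are written once,
# then referenced twice under the name TOTAL_WEIGHT_SUPPLY.
with_query = """
WITH TOTAL_WEIGHT_SUPPLY AS
  (SELECT SPJ.S_NO, SPJ.P_NO, SPJ.J_NO,
          P.WEIGHT * SPJ.QUANTITY AS TOTAL_WEIGHT
   FROM SUPPLY SPJ, PART P
   WHERE SPJ.P_NO = P.P_NO)
SELECT S_NO, P_NO, J_NO, TOTAL_WEIGHT
FROM TOTAL_WEIGHT_SUPPLY
WHERE TOTAL_WEIGHT =
  (SELECT MAX(TOTAL_WEIGHT)
   FROM TOTAL_WEIGHT_SUPPLY)
"""
heaviest = conn.execute(with_query).fetchall()
print(heaviest)
```

With this data the three total weights are 2400, 1700 and 3600, so only the S2, P1, J2 supply instance is returned.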


A feature introduced in SQL-92 which can be useful in OLAP queries is a CASE
expression.
A CASE expression enables IF..THEN..ELSE.. logic to be used in SQL
statements. It has two forms, simple and searched.
An example of a simple CASE expression is:
CASE CITY
  WHEN 'LONDON' THEN 'UK'
  WHEN 'PARIS' THEN 'EUROPE CENTRAL'
  WHEN 'ATHENS' THEN 'EUROPE SOUTH'
  WHEN 'ROME' THEN 'EUROPE SOUTH'
  WHEN 'OSLO' THEN 'EUROPE NORTH'
  ELSE 'NOT KNOWN'
END


Each WHEN expression is checked as to whether it is equal to the initial
expression: when the first match is found the corresponding THEN expression is
returned. If there are no matching expressions, the ELSE expression is
returned, or NULL if the ELSE is omitted.
A CASE can be used wherever an expression can be used in SQL, such as
in a SELECT or WHERE clause, for example:
SELECT S#,
       SNAME,
       CASE CITY
         WHEN 'LONDON' THEN 'UK'
         WHEN 'PARIS' THEN 'EUROPE CENTRAL'
         WHEN 'ATHENS' THEN 'EUROPE SOUTH'
         WHEN 'ROME' THEN 'EUROPE SOUTH'
         WHEN 'OSLO' THEN 'EUROPE NORTH'
         ELSE 'NOT KNOWN'
       END AS REGION
FROM SUPPLIER


An example of a searched CASE expression is:
CASE
  WHEN STATUS < 15 THEN 'LOW'
  WHEN STATUS > 25 THEN 'HIGH'
  ELSE 'MEDIUM'
END

The expression is evaluated to find the first WHEN condition which is true and the
corresponding THEN expression is returned. Again, if there are no matching
expressions, the ELSE expression is returned, or NULL if the ELSE is omitted.
An example of use of a searched CASE is:
SELECT S#,
       SNAME,
       CASE
         WHEN STATUS < 15 THEN 'LOW'
         WHEN STATUS > 25 THEN 'HIGH'
         ELSE 'MEDIUM'
       END AS STATUS_CLASS
FROM SUPPLIER

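Both forms of CASE can be checked against a small table. The sketch below uses Python's sqlite3 module; the SUPPLIER rows are hypothetical, the city list is shortened, and S_NO stands in for S# to avoid quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SUPPLIER (S_NO TEXT, SNAME TEXT, STATUS INTEGER, CITY TEXT)"
)
# Hypothetical suppliers chosen to exercise every branch.
conn.executemany(
    "INSERT INTO SUPPLIER VALUES (?, ?, ?, ?)",
    [("S1", "Smith", 20, "LONDON"),
     ("S2", "Jones", 10, "PARIS"),
     ("S3", "Blake", 30, "ATHENS"),
     ("S4", "Clark", 20, "MADRID")],
)

case_query = """
SELECT S_NO,
       CASE CITY                          -- simple form: equality tests
         WHEN 'LONDON' THEN 'UK'
         WHEN 'PARIS'  THEN 'EUROPE CENTRAL'
         WHEN 'ATHENS' THEN 'EUROPE SOUTH'
         ELSE 'NOT KNOWN'
       END AS REGION,
       CASE                               -- searched form: arbitrary conditions
         WHEN STATUS < 15 THEN 'LOW'
         WHEN STATUS > 25 THEN 'HIGH'
         ELSE 'MEDIUM'
       END AS STATUS_CLASS
FROM SUPPLIER
ORDER BY S_NO
"""
classified = conn.execute(case_query).fetchall()
for row in classified:
    print(row)
```

Supplier S4 shows both ELSE branches at work: MADRID matches no WHEN value, and a STATUS of 20 satisfies neither condition.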

One situation where a CASE can be useful is in pivot queries.
Standard SQL does not support a PIVOT operation, though many relational
implementations including Oracle and SQL Server do.
The effect of a pivot can often be achieved using a CASE expression. Consider
again the example:
P#    J#   SCITY   QUANTITY
----- ---- ------- --------
P1    J1   LEEDS        200
P1    J1   LONDON       100
P1    J1   LILLE        400
P2    J1   LEEDS        300
P2    J1   LONDON       200
P2    J1   LILLE        100
P3    J1   LEEDS        200
P3    J1   LONDON       200
P3    J1   LILLE        400

P#    J#   LEEDS_QTY  LONDON_QTY  LILLE_QTY
----- ---- ---------- ----------- ---------
P1    J1          200         100       400
P2    J1          300         200       100
P3    J1          200         200       400


Assuming the first table is named PIVOT_EXAMPLE, the second table can be
produced by executing the query:
SELECT P#,
J#,
MAX(CASE SCITY WHEN 'LEEDS' THEN QUANTITY ELSE NULL END)
AS LEEDS_QTY,
MAX(CASE SCITY WHEN 'LONDON' THEN QUANTITY ELSE NULL END)
AS LONDON_QTY,
MAX(CASE SCITY WHEN 'LILLE' THEN QUANTITY ELSE NULL END)
AS LILLE_QTY
FROM PIVOT_EXAMPLE
GROUP BY P#, J#
ORDER BY P#, J#

Here a GROUP BY is used to produce a single row result for each combination
of a P# and J# value.
The CASE expression is then used to generate the correct QUANTITY value
depending on the SCITY value in the row of the original table.
Use of the MAX function ensures that the resulting expression is single-valued
per group. If no function were used, an error would result.
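The pivot query can be checked end to end with Python's sqlite3 module. This is a sketch under two assumptions: the table is loaded with the nine PIVOT_EXAMPLE rows shown earlier, and the columns are renamed P_NO and J_NO to avoid quoting the # character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PIVOT_EXAMPLE "
    "(P_NO TEXT, J_NO TEXT, SCITY TEXT, QUANTITY INTEGER)"
)
conn.executemany(
    "INSERT INTO PIVOT_EXAMPLE VALUES (?, ?, ?, ?)",
    [("P1", "J1", "LEEDS", 200), ("P1", "J1", "LONDON", 100),
     ("P1", "J1", "LILLE", 400), ("P2", "J1", "LEEDS", 300),
     ("P2", "J1", "LONDON", 200), ("P2", "J1", "LILLE", 100),
     ("P3", "J1", "LEEDS", 200), ("P3", "J1", "LONDON", 200),
     ("P3", "J1", "LILLE", 400)],
)

# Each CASE yields QUANTITY for one city and NULL otherwise; MAX ignores
# the NULLs and so picks out the single matching value per group.
pivot_query = """
SELECT P_NO, J_NO,
       MAX(CASE SCITY WHEN 'LEEDS'  THEN QUANTITY ELSE NULL END) AS LEEDS_QTY,
       MAX(CASE SCITY WHEN 'LONDON' THEN QUANTITY ELSE NULL END) AS LONDON_QTY,
       MAX(CASE SCITY WHEN 'LILLE'  THEN QUANTITY ELSE NULL END) AS LILLE_QTY
FROM PIVOT_EXAMPLE
GROUP BY P_NO, J_NO
ORDER BY P_NO, J_NO
"""
pivoted = conn.execute(pivot_query).fetchall()
for row in pivoted:
    print(row)
```

Because each (P#, J#, SCITY) combination occurs only once in this table, MAX, MIN and SUM would all give the same answer; SUM would be the natural choice if a city could appear more than once per group.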

OLAP SUPPORT IN SQL/OLAP


More extensive analysis capabilities are defined within SQL/OLAP, which was
added in 2000 as an amendment to SQL:1999. In particular the following were
introduced:
- Functions supporting statistical analyses (standard deviations, correlation
  coefficients etc.).
- RANK functions enabling ranking of rows, useful for top-n queries.
- The WINDOW clause, enabling a more general partitioning of rows than
  possible with a GROUP BY clause. In particular, window sizes may be
  specified relative to a particular row, enabling moving window queries to
  be easily specified.
- The ROW_NUMBER function, which assigns a number to each row based on
  its position within the window.


The RANK function enables ranking of rows, useful for top-n queries.
Suppose that it is required to retrieve the rows with the 10 highest quantities
from the SUPPLY table.
SELECT S#, P#, J#, QUANTITY,
QUANTITY_RANK, QUANTITY_DENSE_RANK
FROM (SELECT S#, P#, J#, QUANTITY,
RANK() OVER
(ORDER BY QUANTITY DESC)
AS QUANTITY_RANK,
DENSE_RANK() OVER
(ORDER BY QUANTITY DESC)
AS QUANTITY_DENSE_RANK
FROM SUPPLY)
WHERE QUANTITY_RANK <= 10

The following result is produced.

S#   P#    J#     QUANTITY QUANTITY_RANK QUANTITY_DENSE_RANK
---- ----- ---- ---------- ------------- -------------------
S2   P3    J7          800             1                   1
S5   P4    J4          800             1                   1
S1   P1    J4          700             3                   2
S2   P3    J5          600             4                   3
S2   P3    J4          500             5                   4
S3   P4    J2          500             5                   4
S5   P6    J4          500             5                   4
S5   P5    J5          500             5                   4
S2   P3    J1          400             9                   5
S2   P3    J6          400             9                   5
S5   P5    J4          400             9                   5

11 rows selected.

Two functions have been used in the query, RANK and DENSE_RANK. These
differ in their handling of duplicate values in the ranked column.
As can be seen in the example, two rows have the highest value for quantity of
800. Whereas RANK ranks the row with the next highest value for quantity as 3
(a sparse ranking, with gaps after ties), DENSE_RANK ranks the row with the
next highest value for quantity as 2 (a dense ranking, with no gaps in the
ranking order). Note also that 11 rows are returned for a top-10 request,
because the three rows tied at rank 9 all satisfy QUANTITY_RANK <= 10.
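The two ranking rules can be stated precisely in a few lines of Python. This is a sketch of the semantics, not of any database's implementation; the function name is hypothetical and the input is the QUANTITY column of the result above.

```python
def rank_and_dense_rank(values):
    """Return (value, rank, dense_rank) triples, highest value first.

    RANK is the 1-based position of the first row of each tie group,
    so it leaves gaps after ties. DENSE_RANK counts distinct values,
    so it leaves no gaps.
    """
    ordered = sorted(values, reverse=True)
    result = []
    rank = dense = 0
    previous = object()  # sentinel that never equals a real value
    for position, value in enumerate(ordered, start=1):
        if value != previous:
            rank = position
            dense += 1
            previous = value
        result.append((value, rank, dense))
    return result


quantities = [800, 800, 700, 600, 500, 500, 500, 500, 400, 400, 400]
ranked = rank_and_dense_rank(quantities)
for triple in ranked:
    print(triple)
```

The output reproduces the QUANTITY_RANK and QUANTITY_DENSE_RANK columns, including the jump from rank 5 to rank 9 after the four rows tied at 500.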

The WINDOW clause is illustrated below.
Suppose that it is required to retrieve from the SUPPLY_MONTHLY table in
respect of each supply instance the following: the supplier involved, the month,
and the total supplied by the supplier in that and the two previous occasions
that the supplier has supplied. This is an example of a moving window query.
SELECT S#, MONTH, SUM(QUANTITY)
OVER W AS "3 SUPPLY TOTAL"
FROM SUPPLY_MONTHLY
WINDOW W AS (PARTITION BY S#
ORDER BY MONTH
ROWS 2 PRECEDING)
or, with the window specified inline (the only form supported by Oracle at the time of writing):
SELECT S#, MONTH, SUM(QUANTITY)
OVER (PARTITION BY S#
ORDER BY MONTH
ROWS 2 PRECEDING)
AS "3 SUPPLY TOTAL"
FROM SUPPLY_MONTHLY


The clause defines a moving window by specifying:
- separate groups of rows (partitions) through which the window will move, by
  use of the PARTITION BY clause; in the example, partitioning is on the
  basis of S# values, so a partition consists of all the rows for a particular
  supplier;
- an ordering of rows within each partition, with the ORDER BY clause;
- which rows relative to a row participate in the window (the window frame);
  in the example it is the current row and the two preceding rows, specified
  with ROWS 2 PRECEDING.
This results in the table conceptually being partitioned into groups of rows, with
the rows in each partition being ordered, and a window frame being defined
consisting of a number of rows.
Then, for each row in the table, the function is evaluated in respect of that row
and the other rows in the window frame which belong to the same partition, and
a result row is produced.
The following result is produced.

S#    MONTH      3 SUPPLY TOTAL
----- ---------- --------------
S1    200301                100
S1    200302                200
S1    200305                300
S1    200305                500
S1    200306                600
S1    200309                600
S2    200301                300
S2    200301                500
S2    200302                600
S2    200302                400
S2    200303                300
S2    200303                300
S2    200304                300
S2    200305                300
S2    200305                300
S2    200306                300
S2    200307                300
S2    200307                500
S2    200308                500
S2    200308                500
S2    200310                400
S2    200311                500
S2    200311                600
S2    200312                800
S2    200312                700
S2    200312                700
S3    200308                100
S3    200309                400
S3    200310                600
S3    200310                600
S4    200305                100
S4    200306                300
S4    200307                500
S4    200310                500
S5    200301                100
S5    200302                200
S5    200302                400
S5    200303                600
S5    200303               1100
S5    200304               1100
S5    200304               1000
S5    200304                900
S5    200307                800
S5    200307                800
S5    200309                400
S5    200311                400
S5    200311                500
S5    200312                500


A result row has been produced in respect of every row in SUPPLY_MONTHLY.
In each case the result row contains the result of evaluating the function over
the row together with the other rows in the window frame, ensuring only rows in
the same partition are included in the window.
Suppose that it is required to retrieve from the SUPPLY_MONTHLY table in
respect of each supply instance the following: the supplier involved, the month,
and the total supplied by the supplier in that and the two previous months.
ROWS 2 PRECEDING will not work: in some months a supplier may have
many supply instances, or no supply instances at all. Hence, window size cannot
be on the basis of number of rows.
An alternative basis for specifying window size is by a logical range of values, as
in RANGE 2 PRECEDING.
This is used in the following example to define a window frame consisting of the
rows with a particular month value and the preceding 2 month values.


SELECT S#, MONTH, SUM(QUANTITY)
OVER W AS "3 MONTH SUPPLY TOTAL"
FROM SUPPLY_MONTHLY
WINDOW W AS (PARTITION BY S#
ORDER BY MONTH
RANGE 2 PRECEDING)
or, with the window specified inline (the only form supported by Oracle at the time of writing):
SELECT S#, MONTH, SUM(QUANTITY)
OVER (PARTITION BY S#
ORDER BY MONTH
RANGE 2 PRECEDING)
AS "3 MONTH SUPPLY TOTAL"
FROM SUPPLY_MONTHLY
The following result is produced.


S#    MONTH      3 MONTH SUPPLY TOTAL
----- ---------- --------------------
S1    200301                      100
S1    200302                      200
S1    200305                      400
S1    200305                      400
S1    200306                      600
S1    200309                      100
S2    200301                      500
S2    200301                      500
S2    200302                      700
S2    200302                      700
S2    200303                      900
S2    200303                      900
S2    200304                      500
S2    200305                      500
S2    200305                      500
S2    200306                      400
S2    200307                      700
S2    200307                      700
S2    200308                      700
S2    200308                      700
S2    200310                      400
S2    200311                      600
S2    200311                      600
S2    200312                     1300
S2    200312                     1300
S2    200312                     1300
S3    200308                      100
S3    200309                      400
S3    200310                      700
S3    200310                      700
S4    200305                      100
S4    200306                      300
S4    200307                      500
S4    200310                      100
S5    200301                      100
S5    200302                      400
S5    200302                      400
S5    200303                     1300
S5    200303                     1300
S5    200304                     2100
S5    200304                     2100
S5    200304                     2100
S5    200307                      300
S5    200307                      300
S5    200309                      400
S5    200311                      500
S5    200311                      500
S5    200312                      500

The RANGE clause specifying a logical range of rows can only be used with an
ordering column that is of a numeric data type, a datetime type or an interval
type.
A further restriction of the RANGE clause is that only one ordering column may
be specified in the ORDER BY clause.
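The difference between a physical (ROWS) and a logical (RANGE) frame can be made concrete by emulating both in Python over one partition. This is a sketch of the frame semantics only; the function names are hypothetical, and the data is the supplier S1 partition of SUPPLY_MONTHLY, already ordered by MONTH.

```python
# Supplier S1's rows as (MONTH, QUANTITY), ordered by MONTH.
s1 = [(200301, 100), (200302, 100), (200305, 100),
      (200305, 300), (200306, 200), (200309, 100)]


def rows_preceding(partition, n):
    """SUM over the current row and the n rows physically before it."""
    qty = [q for _, q in partition]
    return [sum(qty[max(0, i - n): i + 1]) for i in range(len(qty))]


def range_preceding(partition, n):
    """SUM over all rows whose MONTH lies in [current MONTH - n, current MONTH]."""
    return [sum(q for m, q in partition if month - n <= m <= month)
            for month, _ in partition]


print(rows_preceding(s1, 2))   # matches the "3 SUPPLY TOTAL" column for S1
print(range_preceding(s1, 2))  # matches the "3 MONTH SUPPLY TOTAL" column for S1
```

Two details are worth noting. With RANGE, rows sharing the same MONTH value always receive the same total, which is why the two 200305 rows both show 400. Also, because MONTH is a plain number of the form YYYYMM, subtracting 2 corresponds to "two months earlier" only within a single year, as in this 2003 data; a real date or interval type would be needed for data spanning a year boundary.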

The examples seen above have only used a window specified in terms of
ROWS/RANGE preceding the current row. More generally, a window frame may be
specified by reference to rows/range both preceding and following the current
row, e.g.
ROWS BETWEEN 3 PRECEDING AND 2 FOLLOWING
RANGE BETWEEN UNBOUNDED PRECEDING
AND 2 FOLLOWING
ROWS BETWEEN CURRENT ROW
AND UNBOUNDED FOLLOWING
If neither a window frame nor an order is specified at all, the effect is that of
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
If a window order but no window frame is specified, the effect is that of
RANGE UNBOUNDED PRECEDING


Other functions which can be used with windows include:
- LAG/LEAD for specifying a target row at a given offset from the current row;
  an order must be specified for the window.
- FIRST_VALUE, LAST_VALUE, NTH_VALUE for selecting a value from
  the first row etc.
- ROW_NUMBER for assigning a unique number to each row within its partition.
- NTILE(expr) for dividing the ordered rows into expr buckets; an order
  must be specified for the window.
ROW_NUMBER() evaluated over a window win is equivalent to:
COUNT(*) OVER (win ROWS UNBOUNDED PRECEDING)
NTILE(expr) allows the ordered rows of a window to be divided into n
buckets where n is the result of evaluating expr.
NTILE attempts to produce buckets with the same number of rows in each, but
that is not always possible, in which case the bucket sizes differ by at most
one row.

For example, ROW_NUMBER is used below to generate identifying numbers for
the rows in each partition of a query seen earlier.
For each supply instance, the following are generated: the supplier involved, the
month, an identifying number for each row for that supplier, and the total
supplied by the supplier in that and the two previous occasions that the supplier
has supplied.
NTILE(4) is used below to generate quartile information for the total supplied
each month.
For each month, the following are generated: the month, the total supplied that
month, and the quartile which that total quantity falls within.


SELECT S#, MONTH,
       ROW_NUMBER()
         OVER (PARTITION BY S#
               ORDER BY MONTH)
         AS "3 SUPPLY NUMBER",
       SUM(QUANTITY)
         OVER (PARTITION BY S#
               ORDER BY MONTH
               ROWS 2 PRECEDING)
         AS "3 SUPPLY TOTAL"
FROM SUPPLY_MONTHLY
ORDER BY S#, MONTH

S#    MONTH      3 SUPPLY NUMBER 3 SUPPLY TOTAL
----- ---------- --------------- --------------
S1    200301                   1            100
S1    200302                   2            200
S1    200305                   3            300
S1    200305                   4            500
S1    200306                   5            600
S1    200309                   6            600
S2    200301                   1            300
S2    200301                   2            500
S2    200302                   3            600
S2    200302                   4            400
S2    200303                   5            300
etc.


SELECT MONTH, SUM(QUANTITY) AS "SUPPLY TOTAL",
       NTILE(4)
         OVER (ORDER BY SUM(QUANTITY))
         AS QUARTILE
FROM SUPPLY_MONTHLY
GROUP BY MONTH
ORDER BY MONTH

MONTH      SUPPLY TOTAL   QUARTILE
---------- ------------ ----------
200301              700          2
200302              600          2
200303             1100          4
200304             1000          4
200305              700          3
200306              500          1
200307              900          4
200308              300          1
200309              500          1
200310              600          2
200311              800          3
200312              800          3

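The bucket assignment performed by NTILE can be emulated in Python. This is a sketch of the rule only: the function name ntile is hypothetical, the input is the monthly totals from the result above, and ties are broken by month order through Python's stable sort, whereas SQL leaves the order of tied rows unspecified.

```python
def ntile(sorted_items, buckets):
    """Assign 1-based bucket numbers to already-ordered items.

    Bucket sizes differ by at most one; when the rows do not divide
    evenly, the earlier buckets take the extra rows.
    """
    base, extra = divmod(len(sorted_items), buckets)
    result = []
    index = 0
    for bucket in range(1, buckets + 1):
        size = base + (1 if bucket <= extra else 0)
        for item in sorted_items[index:index + size]:
            result.append((item, bucket))
        index += size
    return result


# (MONTH, SUPPLY TOTAL) pairs taken from the query result.
month_totals = [(200301, 700), (200302, 600), (200303, 1100), (200304, 1000),
                (200305, 700), (200306, 500), (200307, 900), (200308, 300),
                (200309, 500), (200310, 600), (200311, 800), (200312, 800)]

ordered = sorted(month_totals, key=lambda mt: mt[1])  # ORDER BY SUM(QUANTITY)
quartile = {month: bucket for (month, _total), bucket in ntile(ordered, 4)}
for month, total in month_totals:
    print(month, total, quartile[month])
```

Twelve months divide evenly into four buckets of three. The months 200301 and 200305 both total 700 yet fall in different quartiles, which shows that NTILE splits purely by position and may separate tied values.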

REFERENCE
J. Melton & A. R. Simon, SQL:1999: Understanding Relational Language
Components, Morgan Kaufmann, 2002.
J. Melton, Advanced SQL:1999: Understanding Object-Relational and Other Advanced
Features, Morgan Kaufmann, 2003.
Oracle Database Data Warehousing Guide, Chapter 18 - SQL for Analysis and
Reporting.
Oracle Database Data Warehousing Guide, Chapter 19 - SQL for Aggregation in
Data Warehouses.
