By Alex Kozak
20 April 2006
All four functions have "partition by" and "order by" clauses, which
makes them very flexible and useful. However, there is one nuance in
the syntax that deserves your attention: the "order by" clause is not
optional.
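As a minimal sketch of that syntax (assuming the RankingFunctions table
used later in this article, with its single orderID column; the column
aliases are illustrative), each of the four functions takes its
"order by" inside OVER():

```sql
-- Sketch only: the aliases RowNum, Rnk, DenseRnk and Tile are illustrative.
SELECT orderID,
       ROW_NUMBER() OVER (ORDER BY orderID) AS RowNum,
       RANK()       OVER (ORDER BY orderID) AS Rnk,
       DENSE_RANK() OVER (ORDER BY orderID) AS DenseRnk,
       NTILE(2)     OVER (ORDER BY orderID) AS Tile
FROM RankingFunctions;

-- "partition by" may be left out, but "order by" may not:
-- ROW_NUMBER() OVER () is rejected by the parser.
```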
If you check the execution plan for that query (see Figure 1), you will
find that the Sort operator is very expensive: it accounts for 78
percent of the query cost.
Since the parser doesn't allow you to avoid the "order by" clause,
maybe you can force the query optimizer to stop using the Sort
operator. For example, you could create a computed column that
consists of a simple integer, 1, and then use that virtual column in the
"order by" clause (Listing 2):
If you check the execution plans now (see Figure 2), you will find that
the query optimizer doesn't use the Sort operator anymore. Both queries
generate the row numbers and return the orderID values in the
original order.
RowNum orderID
1 7
2 11
3 4
4 21
5 15
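Listing 2 itself is not reproduced here. A sketch of what the
constant-column trick might look like is given below, once with a
derived table and once with a CTE. The names const, t and WithConst are
assumptions made for illustration, and the absence of a Sort operator
is the behavior reported in this article for SQL Server 2005, not
something guaranteed on every version.

```sql
-- Sketch 1 (hypothetical names): the constant column lives in a derived table.
SELECT ROW_NUMBER() OVER (ORDER BY const) AS RowNum, orderID
FROM (SELECT orderID, 1 AS const FROM RankingFunctions) AS t;

-- Sketch 2 (hypothetical names): the same idea expressed as a CTE.
WITH WithConst AS
(
    SELECT orderID, 1 AS const
    FROM RankingFunctions
)
SELECT ROW_NUMBER() OVER (ORDER BY const) AS RowNum, orderID
FROM WithConst;
```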
The "order by" clause accepts expressions. An expression can be a
constant, a variable, a column, and so on, and simple expressions can
be combined into complex ones.
No, you can't bypass the parser. You will get an error:
O-o-o-p-s, here's the hint! The expression (or in our case, the
subquery) has to produce a single value.
Now you can write an expression in the "order by" clause that returns
a single value, forcing the query optimizer to refrain from using a sort
operation.
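One commonly used form of such a single-value expression is a scalar
subquery that returns a constant. The following is a minimal sketch
against the RankingFunctions table, not necessarily the exact
expression used in the original listing:

```sql
-- The scalar subquery (SELECT 1) yields exactly one value, so the parser
-- accepts it, and the optimizer has nothing meaningful to sort by.
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS RowNum, orderID
FROM RankingFunctions;
```

Because every row gets the same sort key, the numbering follows
whatever order the rows happen to be read in.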
If you check the execution plans (see Figure 4), you will find that the
first query in Listing 4 requires a lot of resources for sorting. The
second query doesn't have a Sort operator. So the queries behave as
expected.
However, when you run the queries, the second result will be wrong:
Even though the expressions in the "order by" clause help to skip
sorting, they can't be applied to the RANK() and DENSE_RANK()
functions. Apparently, these ranking functions must have a sorted
input to produce the correct result.
Now that you know how to avoid sorting in ranking functions, you can
test their performance.
Let's insert more rows into the RankingFunctions table (Listing 6):
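SET NOCOUNT ON
-- Test table with a single integer column and no index
CREATE TABLE RankingFunctions(orderID int NOT NULL);
-- Five seed rows, inserted in unsorted order
INSERT INTO RankingFunctions VALUES(7);
INSERT INTO RankingFunctions VALUES(11);
INSERT INTO RankingFunctions VALUES(4);
INSERT INTO RankingFunctions VALUES(21);
INSERT INTO RankingFunctions VALUES(15);
-- Replace every orderID that is a multiple of 5 with orderID/5
-- (integer division)
UPDATE RankingFunctions
SET orderID = orderID/5
WHERE orderID%5 = 0;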
5 * POWER(2,19) = 2,621,440
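The step that grows the five seed rows to this row count is not shown
above. One approach consistent with the figure 5 * POWER(2,19) is to
double the table 19 times after the seed INSERT statements. The loop
below is a sketch under that assumption (the variable @i is
hypothetical) and may differ from the original Listing 6.

```sql
-- Assumption: each pass re-inserts every existing row, doubling the table.
DECLARE @i int;
SET @i = 1;

WHILE @i <= 19
BEGIN
    INSERT INTO RankingFunctions (orderID)
    SELECT orderID FROM RankingFunctions;

    SET @i = @i + 1;
END;

-- Starting from 5 rows, 19 doublings give 5 * POWER(2,19) rows.
SELECT COUNT(*) AS TotalRows FROM RankingFunctions;
```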
Deleting every Nth row or removing duplicates from a table is a common
task for a DBA or database programmer. In Listing 8, I used a CTE
(common table expression) to delete every fifth row in the
RankingFunctions table.
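A sketch of how such a delete might be written is shown below, in a
variant without sorting and a variant with sorting. The CTE name
Numbered and the alias RowNum are assumptions, and the original
Listing 8 may differ. Each variant is meant to be run on its own against
a freshly populated table.

```sql
-- Variant without sorting: number the rows using a single-value expression.
WITH Numbered AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS RowNum
    FROM RankingFunctions
)
DELETE FROM Numbered
WHERE RowNum % 5 = 0;

-- Variant with sorting: number the rows by orderID.
WITH Numbered AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY orderID) AS RowNum
    FROM RankingFunctions
)
DELETE FROM Numbered
WHERE RowNum % 5 = 0;
```

Deleting through the CTE removes the underlying rows of
RankingFunctions whose row number is a multiple of five.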
Test Results
Here are the results that I got on a regular Pentium 4 desktop
computer with 512 MB RAM running Windows 2000 Server and
Microsoft SQL Server 2005 Developer Edition:
DELETE each 5th row     Time
without sorting         5 sec.
with sorting            24 sec.
Take any table with many columns and rows (or create and populate
one using the technique from Listing 6). Then create different indexes
and test the ranking functions. You will find that for covered queries
the optimizer won't use a Sort operator. This is what makes the
ranking function as fast as, or even faster than, the functions with an
expression in an "order by" clause.
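As a minimal sketch of that experiment on the RankingFunctions table
(the index name is hypothetical), an index on orderID covers a query
that references only that column, so the optimizer should be able to
read the rows in index order instead of sorting them:

```sql
-- Hypothetical index name; the index covers the query below
-- because orderID is the only column the query touches.
CREATE INDEX IX_RankingFunctions_orderID
ON RankingFunctions (orderID);

-- Check the execution plan of this query for a Sort operator.
SELECT ROW_NUMBER() OVER (ORDER BY orderID) AS RowNum, orderID
FROM RankingFunctions;
```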
Conclusion
This article explains ranking functions and helps you understand how
they work. The techniques shown here, in some situations, can
increase the performance of ranking functions 3-5 times. In addition,
this article discusses some common trends in the behavior of an
ORDER BY clause with expressions.