Algorithms, Fourth Edition
http://algs4.cs.princeton.edu
2.3 QUICKSORT
quicksort
selection
duplicate keys
system sorts
Two classic sorting algorithms: mergesort and quicksort
...
...
Quicksort t-shirt
Quicksort
Basic plan.
Shuffle the array.
Partition so that, for some j:
    entry a[j] is in place
    no larger entry to the left of j
    no smaller entry to the right of j
Sort each subarray recursively.

input       Q U I C K S O R T E X A M P L E
shuffle     K R A T E L E P U I M Q C X O S    (partitioning item: K)
partition   E C A I E K L P U T M Q R X O S    (left of K: not greater; right of K: not less)
sort left   A C E E I K L P U T M Q R X O S
sort right  A C E E I K L M O P Q R S T U X
result      A C E E I K L M O P Q R S T U X
Quicksort overview
Tony Hoare
Implemented quicksort.
Tony Hoare, 1980 Turing Award.
[Scanned page: Communications of the ACM (July 1961). Algorithm 64, Quicksort, by C. A. R. Hoare, Elliott Brothers Ltd., Borehamwood, Hertfordshire, Eng. Algorithm 65, Find, by C. A. R. Hoare.]
From the comment in Algorithm 64: "Quicksort is a very fast and convenient method of sorting an array in the random-access store of a computer. The entire contents of the store may be sorted, since no extra space is required. The average number of comparisons made is 2(M-N) ln (N-M), and the average number of exchanges is one sixth this amount. Suitable refinements of this method will be desirable for its implementation on any actual computer."
Bob Sedgewick
[Paper excerpt, "Implementing Quicksort Programs": "... empirical [2, 11, 13, 21] and analytic [9] studies show that Quicksort can be expected to be up to twice as fast as its ..."]
Quicksort partitioning demo
K R A T E L E P U I M Q C X O S    (start: lo and i at the left end, j at the right end)
E C A I E K L P U T M Q R X O S    (partitioned! j marks the partitioning item K, between lo and hi)
The music of quicksort partitioning (by Brad Lyon)
https://googledrive.com/host/0B2GQktu-wcTicjRaRjV1NmRFN1U/index.html
Quicksort: Java code for partitioning
[Figure: a[lo..hi] before, during, and after partitioning, with scan indices i and j]
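The Java code for this slide did not survive extraction. A minimal sketch of two-way partitioning follows, assuming the partitioning item is a[lo] and that both scans stop on keys equal to it. The class name PartitionSketch and the helpers less() and exch() are illustrative names, not taken from the slide.

```java
// Sketch of two-way partitioning on a[lo..hi]; index names follow the slide labels.
// PartitionSketch, less() and exch() are hypothetical names used for illustration.
@SuppressWarnings({"rawtypes", "unchecked"})
class PartitionSketch {
    static boolean less(Comparable v, Comparable w) { return v.compareTo(w) < 0; }

    static void exch(Object[] a, int i, int j) { Object t = a[i]; a[i] = a[j]; a[j] = t; }

    // Rearranges a[lo..hi] around v = a[lo] and returns the final index j of v.
    // Assumes lo < hi.
    static int partition(Comparable[] a, int lo, int hi) {
        Comparable v = a[lo];
        int i = lo, j = hi + 1;
        while (true) {
            while (less(a[++i], v)) if (i == hi) break;  // scan right for an item >= v
            while (less(v, a[--j])) if (j == lo) break;  // scan left for an item <= v
            if (i >= j) break;                           // the scans have crossed
            exch(a, i, j);
        }
        exch(a, lo, j);                                  // put v into its final position
        return j;
    }

    public static void main(String[] args) {
        String[] a = "K R A T E L E P U I M Q C X O S".split(" ");
        int j = partition(a, 0, a.length - 1);
        System.out.println(j + ": " + String.join(" ", a));
        // prints "5: E C A I E K L P U T M Q R X O S"
    }
}
```

On the shuffled example array this reproduces the partitioned row of the overview, with K ending at index 5.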
Q. How many compares (in the worst case) to partition an array of length N ?
A. ~ 1/4 N
B. ~ 1/2 N
C. ~ N
D. ~ N lg N
E. I don't know.
Quicksort: Java implementation
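The implementation itself is missing from this extract. The basic plan (shuffle, partition, sort each subarray recursively) might be sketched as below. The class name QuickSketch is hypothetical, and the Fisher-Yates shuffle built on java.util.Random is an assumption standing in for whatever shuffle routine the slide used.

```java
import java.util.Random;

// Sketch of the basic plan: shuffle, partition, recurse. Names are illustrative.
@SuppressWarnings({"rawtypes", "unchecked"})
class QuickSketch {
    private static final Random RNG = new Random();

    static void sort(Comparable[] a) {
        shuffle(a);                       // random order guards against the quadratic worst case
        sort(a, 0, a.length - 1);
    }

    private static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) return;             // subarrays of size 0 or 1 are already sorted
        int j = partition(a, lo, hi);     // a[j] is now in its final place
        sort(a, lo, j - 1);
        sort(a, j + 1, hi);
    }

    // Two-way partitioning around v = a[lo]; returns the final index of v.
    private static int partition(Comparable[] a, int lo, int hi) {
        Comparable v = a[lo];
        int i = lo, j = hi + 1;
        while (true) {
            while (a[++i].compareTo(v) < 0) if (i == hi) break;
            while (v.compareTo(a[--j]) < 0) if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
        }
        exch(a, lo, j);
        return j;
    }

    // Fisher-Yates shuffle (assumed stand-in for the slide's shuffle step).
    private static void shuffle(Object[] a) {
        for (int i = a.length - 1; i > 0; i--) exch(a, i, RNG.nextInt(i + 1));
    }

    private static void exch(Object[] a, int i, int j) { Object t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        String[] a = "Q U I C K S O R T E X A M P L E".split(" ");
        sort(a);
        System.out.println(String.join(" ", a));  // prints "A C E E I K L M O P Q R S T U X"
    }
}
```

The shuffle changes only the order in which subarrays are processed, so the sorted result is the same on every run.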
Quicksort trace
                 lo   j  hi   0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
initial values                Q U I C K S O R T E X A M P L E
random shuffle                K R A T E L E P U I M Q C X O S
                  0   5  15   E C A I E K L P U T M Q R X O S
                  0   3   4   E C A E I K L P U T M Q R X O S
                  0   2   2   A C E E I K L P U T M Q R X O S
                  0   0   1   A C E E I K L P U T M Q R X O S
                  1       1   A C E E I K L P U T M Q R X O S
                  4       4   A C E E I K L P U T M Q R X O S
                  6   6  15   A C E E I K L P U T M Q R X O S
                  7   9  15   A C E E I K L M O P T Q R X U S
                  7   7   8   A C E E I K L M O P T Q R X U S
                  8       8   A C E E I K L M O P T Q R X U S
                 10  13  15   A C E E I K L M O P S Q R T U X
                 10  12  12   A C E E I K L M O P R Q S T U X
                 10  11  11   A C E E I K L M O P Q R S T U X
                 10      10   A C E E I K L M O P Q R S T U X
                 14  14  15   A C E E I K L M O P Q R S T U X
                 15      15   A C E E I K L M O P Q R S T U X
result                        A C E E I K L M O P Q R S T U X
Rows that list only lo and hi: no partition for subarrays of size 1.
Quicksort animation
50 random items
algorithm position
in order
current subarray
not in order
http://www.sorting-algorithms.com/quick-sort
Quicksort: implementation details
Quicksort: empirical analysis (1961)
Quicksort: empirical analysis
            insertion sort (N^2)              mergesort (N log N)            quicksort (N log N)
computer    thousand  million    billion      thousand  million   billion    thousand  million  billion
home        instant   2.8 hours  317 years    instant   1 second  18 min     instant   0.6 sec  12 min
super       instant   1 second   1 week       instant   instant   instant    instant   instant  instant
Quicksort: worst-case analysis
Quicksort: average-case analysis
$$C_N = (N+1) + \frac{C_0 + C_{N-1}}{N} + \frac{C_1 + C_{N-2}}{N} + \cdots + \frac{C_{N-1} + C_0}{N}$$
Multiply both sides by N and collect terms:
$$N\,C_N = N(N+1) + 2\,(C_0 + C_1 + \cdots + C_{N-1})$$
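The recurrence can be checked numerically. The sketch below evaluates it bottom-up and compares the result with 2 N ln N, the average compare count quoted in the sorting summary. The base case C_0 = C_1 = 0 is an assumption (it does not appear in this extract), and the class and method names are illustrative.

```java
// Sketch: evaluate N*C_N = N(N+1) + 2(C_0 + ... + C_{N-1}) bottom-up.
// Assumption: C_0 = C_1 = 0. CompareRecurrence is a hypothetical name.
class CompareRecurrence {
    // Returns c where c[n] is the value of C_n given by the recurrence, for 0 <= n <= maxN.
    static double[] averageCompares(int maxN) {
        double[] c = new double[maxN + 1];
        double prefix = 0.0;                    // running sum C_0 + ... + C_{n-1}
        for (int n = 2; n <= maxN; n++) {
            prefix += c[n - 1];
            c[n] = (n + 1) + 2.0 * prefix / n;  // the recurrence divided through by n
        }
        return c;
    }

    public static void main(String[] args) {
        double[] c = averageCompares(1_000_000);
        for (int n = 1_000; n <= 1_000_000; n *= 10) {
            double ratio = c[n] / (2.0 * n * Math.log(n));
            System.out.printf("N = %d  C_N = %.0f  C_N / (2 N ln N) = %.3f%n", n, c[n], ratio);
        }
    }
}
```

The ratio approaches 1 slowly because the lower-order term is linear in N, which is why 2 N ln N (about 1.39 N lg N) is stated only as a leading-term approximation.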
Quicksort properties
i  j    a[0..3]
        B1 C1 C2 A1
1  3    B1 C1 C2 A1
1  3    B1 A1 C2 C1
0  1    A1 B1 C2 C1
Quicksort: practical improvements
Quicksort: practical improvements
Median of sample.
Best choice of pivot item = median.
Estimate true median by taking median of sample.
Median-of-3 (random) items.
~ 12/7 N ln N compares (14% fewer)
~ 12/35 N ln N exchanges (3% more)
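One way to apply the median-of-3 idea is to sample three entries of the subarray and move their median to a[lo] before the partitioning step. In the sketch below the sample positions (first, middle, last) are an assumption; after a shuffle they behave like random items. The class and method names are illustrative.

```java
// Sketch of median-of-3 pivot selection. MedianOf3Sketch and choosePivot() are hypothetical names.
@SuppressWarnings({"rawtypes", "unchecked"})
class MedianOf3Sketch {
    static boolean less(Comparable v, Comparable w) { return v.compareTo(w) < 0; }

    // Returns whichever of the indices i, j, k holds the median of a[i], a[j], a[k].
    static int medianOf3(Comparable[] a, int i, int j, int k) {
        return less(a[i], a[j])
             ? (less(a[j], a[k]) ? j : less(a[i], a[k]) ? k : i)
             : (less(a[k], a[j]) ? j : less(a[k], a[i]) ? k : i);
    }

    // Moves the median of the first, middle and last entries of a[lo..hi] to a[lo],
    // so that a partition step using a[lo] as the partitioning item splits near the middle.
    static void choosePivot(Comparable[] a, int lo, int hi) {
        int m = medianOf3(a, lo, lo + (hi - lo) / 2, hi);
        Comparable t = a[lo]; a[lo] = a[m]; a[m] = t;
    }

    public static void main(String[] args) {
        Integer[] a = {9, 3, 7, 1, 5};
        choosePivot(a, 0, a.length - 1);           // samples 9, 7, 5; the median is 7
        System.out.println(java.util.Arrays.toString(a));  // prints "[7, 3, 9, 1, 5]"
    }
}
```

A quicksort would call choosePivot(a, lo, hi) immediately before partitioning a[lo..hi].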
Selection
Applications.
Order statistics.
Find the "top k."
Use theory as a guide.
Easy N log N upper bound. How?
Easy N upper bound for k = 1, 2, 3. How?
Easy N lower bound. Why?
Which is true?
N log N lower bound? Is selection as hard as sorting?
Quick-select
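Partition the array as in quicksort, then repeat in one subarray, depending on j:
if a[k] is to the left of j, set hi to j-1; if a[k] is to the right of j, set lo to j+1; finished when j equals k.

public static Comparable select(Comparable[] a, int k)
{
    StdRandom.shuffle(a);
    int lo = 0, hi = a.length - 1;
    while (hi > lo)
    {
        int j = partition(a, lo, hi);
        if      (j < k) lo = j + 1;    // a[k] is to the right of j
        else if (j > k) hi = j - 1;    // a[k] is to the left of j
        else            return a[k];
    }
    return a[k];
}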
Quick-select: mathematical analysis
Pf sketch.
Intuitively, each partitioning step splits array approximately in half:
$$N + N/2 + N/4 + \cdots + 1 \sim 2N \text{ compares.}$$
Formal analysis similar to quicksort analysis yields:
$$C_N = 2N + 2k \ln\frac{N}{k} + 2(N-k) \ln\frac{N}{N-k}$$
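As a rough check on this formula, the sketch below counts key compares in a self-contained quick-select and prints the measured average next to the predicted value. Every name here is illustrative, and the java.util.Random shuffle is an assumption replacing the library shuffle. For k = N/2 the formula evaluates to (2 + 2 ln 2) N, about 3.39 N.

```java
import java.util.Random;

// Sketch: quick-select with a compare counter, set against the formula for C_N.
@SuppressWarnings({"rawtypes", "unchecked"})
class QuickSelectSketch {
    static long compares = 0;   // number of key compares since the last reset

    static boolean less(Comparable v, Comparable w) { compares++; return v.compareTo(w) < 0; }

    static void exch(Object[] a, int i, int j) { Object t = a[i]; a[i] = a[j]; a[j] = t; }

    static int partition(Comparable[] a, int lo, int hi) {
        Comparable v = a[lo];
        int i = lo, j = hi + 1;
        while (true) {
            while (less(a[++i], v)) if (i == hi) break;
            while (less(v, a[--j])) if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
        }
        exch(a, lo, j);
        return j;
    }

    // Returns the item of rank k (0-based) after shuffling a with the given generator.
    static Comparable select(Comparable[] a, int k, Random rng) {
        for (int i = a.length - 1; i > 0; i--) exch(a, i, rng.nextInt(i + 1));
        int lo = 0, hi = a.length - 1;
        while (hi > lo) {
            int j = partition(a, lo, hi);
            if      (j < k) lo = j + 1;
            else if (j > k) hi = j - 1;
            else            return a[k];
        }
        return a[k];
    }

    // C_N = 2N + 2k ln(N/k) + 2(N-k) ln(N/(N-k)), for 0 < k < N.
    static double predicted(int n, int k) {
        return 2.0 * n + 2.0 * k * Math.log((double) n / k)
                       + 2.0 * (n - k) * Math.log((double) n / (n - k));
    }

    public static void main(String[] args) {
        int n = 10_000, k = n / 2, trials = 200;
        Random rng = new Random();
        long total = 0;
        for (int t = 0; t < trials; t++) {
            Integer[] a = new Integer[n];
            for (int i = 0; i < n; i++) a[i] = i;
            compares = 0;
            select(a, k, rng);
            total += compares;
        }
        System.out.printf("measured %.0f, predicted %.0f%n", (double) total / trials, predicted(n, k));
    }
}
```

The measured average varies from run to run and should only be expected to land near the prediction, since the formula is an average-case estimate.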
Theoretical context for selection
[Paper: "Time Bounds for Selection", by Manuel Blum, Robert W. Floyd, Vaughan Pratt, Ronald L. Rivest, and Robert E. Tarjan (abstract)]
Duplicate keys
Duplicate keys: stop on equal keys
P G E P A Q B P Y C O U P Z S R
P G E P A Q B P Y C O U P Z S R
Quicksort quiz 2
What is the result of partitioning the following array (skip over equal keys)?
A A A A A A A A A A A A A A A A
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
A. A A A A A A A A A A A A A A A A
B. A A A A A A A A A A A A A A A A
C. A A A A A A A A A A A A A A A A
D. I don't know.
Quicksort quiz 3
What is the result of partitioning the following array (stop on equal keys)?
A A A A A A A A A A A A A A A A
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
A. A A A A A A A A A A A A A A A A
B. A A A A A A A A A A A A A A A A
C. A A A A A A A A A A A A A A A A
D. I don't know.
Partitioning an array with all equal keys
Duplicate keys: partitioning strategies
B A A B A B B B C C C A A A A A A A A A A A
B A A B A B C C B C B A A A A A A A A A A A
A A A B B B B B C C C A A A A A A A A A A A
DUTCH NATIONAL FLAG PROBLEM
input
sorted
Operations allowed.
swap(i, j): swap the pebble in bucket i with the pebble in bucket j.
color(i): color of pebble in bucket i.
Requirements.
Exactly N calls to color().
At most N calls to swap().
Constant extra space.
3-way partitioning
[Figure: 3-way partitioning of a[lo..hi] around v = a[lo]]
before: v at a[lo]
during: a[lo..lt-1] < v, a[lt..i-1] = v, a[i..gt] not yet examined, a[gt+1..hi] > v
after:  a[lo..lt-1] < v, a[lt..gt] = v, a[gt+1..hi] > v
Dutch national flag problem. [Edsger Dijkstra]
Conventional wisdom until mid 1990s: not worth doing.
Now incorporated into C library qsort() and Java 6 system sort.
Dijkstra 3-way partitioning demo
P A B X W P P V P D P C Y Z    (before: lt = i = lo, gt = hi, v = P)
A B C D P P P P P V W Y Z X    (after: a[lt..gt] holds the keys equal to v)
invariant: a[lo..lt-1] < v, a[lt..i-1] = v, a[gt+1..hi] > v
lt   i  gt   0 1 2 3 4 5 6 7 8 9 10 11
 0   0  11   R B W W R W B R R W B R
 0   1  11   R B W W R W B R R W B R
 1   2  11   B R W W R W B R R W B R
 1   2  10   B R R W R W B R R W B W
 1   3  10   B R R W R W B R R W B W
 1   3   9   B R R B R W B R R W W W
 2   4   9   B B R R R W B R R W W W
 2   5   9   B B R R R W B R R W W W
 2   5   8   B B R R R W B R R W W W
 2   5   7   B B R R R R B R W W W W
 2   6   7   B B R R R R B R W W W W
 3   7   7   B B B R R R R R W W W W
 3   8   7   B B B R R R R R W W W W
 3   8   7   B B B R R R R R W W W W
3-way partitioning trace (array contents after each loop iteration)
3-way quicksort: Java implementation
[Figure: 3-way partitioning of a[lo..hi], with indices lt and gt bounding the keys equal to v]
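The Java code for this slide is not present in the extract. A minimal sketch of Dijkstra-style 3-way partitioning and the quicksort built on it follows. Class and method names are illustrative, and the initial shuffle is omitted for brevity, which is an assumption rather than something the slide states.

```java
// Sketch of 3-way quicksort. Quick3waySketch and partition3() are hypothetical names.
@SuppressWarnings({"rawtypes", "unchecked"})
class Quick3waySketch {
    static void exch(Object[] a, int i, int j) { Object t = a[i]; a[i] = a[j]; a[j] = t; }

    // Partitions a[lo..hi] around v = a[lo] and returns {lt, gt} such that
    // a[lo..lt-1] < v, a[lt..gt] = v, a[gt+1..hi] > v.
    static int[] partition3(Comparable[] a, int lo, int hi) {
        Comparable v = a[lo];
        int lt = lo, i = lo, gt = hi;
        while (i <= gt) {
            int cmp = a[i].compareTo(v);
            if      (cmp < 0) exch(a, lt++, i++);  // smaller key joins the left part
            else if (cmp > 0) exch(a, i, gt--);    // larger key joins the right part; a[i] is re-examined
            else              i++;                 // equal key stays in the middle
        }
        return new int[] { lt, gt };
    }

    static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) return;
        int[] r = partition3(a, lo, hi);
        sort(a, lo, r[0] - 1);   // keys less than v
        sort(a, r[1] + 1, hi);   // keys greater than v; equal keys are already in place
    }

    static void sort(Comparable[] a) { sort(a, 0, a.length - 1); }

    public static void main(String[] args) {
        String[] a = "R B W W R W B R R W B R".split(" ");
        int[] r = partition3(a, 0, a.length - 1);
        System.out.println(r[0] + " " + r[1] + ": " + String.join(" ", a));
        // prints "3 7: B B B R R R R R W W W W"
    }
}
```

Run on the twelve-key R/B/W example, a single partitioning pass ends with lt = 3 and gt = 7, matching the last rows of the partitioning trace.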
3-way quicksort: visual trace
Duplicate keys: lower bound
Sorting lower bound. If there are n distinct keys and the ith one occurs x_i times, any compare-based sorting algorithm must use at least
$$\lg\left(\frac{N!}{x_1!\,x_2!\cdots x_n!}\right) \sim -\sum_{i=1}^{n} x_i \lg\frac{x_i}{N}$$
compares. This is N lg N when all keys are distinct, and linear when there is only a constant number of distinct keys.
Sorting summary
algorithm     best        average     worst        remarks
selection     N^2         N^2         N^2          N exchanges
shell         N log3 N    ?           c N^(3/2)    tight code; subquadratic
merge         N lg N      N lg N      N lg N       N log N guarantee; stable
timsort       N           N lg N      N lg N       improves mergesort when preexisting order
3-way quick   N           2 N ln N    N^2          improves quicksort when duplicate keys
Sorting applications
Computational biology.
Load balancing on a parallel computer.
...
War story (system sort in C)
War story (system sort in C)
Bug. A qsort() call that should have taken seconds was taking minutes.
Engineering a system sort (in 1993)
Bentley-McIlroy quicksort.
Cutoff to insertion sort for small subarrays.
Partitioning item: median of 3 or Tukey's ninther (samples 9 items).
Partitioning scheme: Bentley-McIlroy 3-way partitioning (similar to Dijkstra 3-way partitioning, but fewer exchanges when not many equal keys).
Very widely used. C, C++, Java 6, ...

[Paper excerpt: "Engineering a Sort Function"]
SOFTWARE: PRACTICE AND EXPERIENCE, VOL. 23(11), 1249-1265 (NOVEMBER 1993)
JON L. BENTLEY
M. DOUGLAS McILROY
AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, U.S.A.
SUMMARY
We recount the history of a new qsort function for a C library. Our function is clearer, faster and more robust than existing sorts. It chooses partitioning elements by a new sampling scheme; it partitions by a novel solution to Dijkstra's Dutch National Flag problem; and it swaps efficiently. Its behavior was assessed with timing and debugging testbeds, and with a program to certify performance. The design techniques apply in domains beyond sorting.
KEY WORDS: Quicksort; Sorting algorithms; Performance tuning; Algorithm design and implementation; Testing
INTRODUCTION
C libraries have long included a qsort function to sort an array, usually implemented by Hoare's Quicksort. Because existing qsorts are flawed, we built a new one. This paper summarizes its evolution.
Compared to existing library sorts, our new qsort is faster (typically about twice as fast), clearer, and more robust under nonrandom inputs. It uses some standard Quicksort ...
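Tukey's ninther, mentioned above as a partitioning-item choice that samples 9 items, takes the median of three medians of three. The sketch below shows one way to pick the nine sample positions: evenly spaced triples near the start, middle, and end of the subarray. That spacing is an assumption for illustration and is not taken from the paper or the slide; the class and method names are hypothetical.

```java
// Sketch of Tukey's ninther: the median of three medians-of-3 (9 samples in total).
// The sample spacing is an assumption; NintherSketch is a hypothetical name.
@SuppressWarnings({"rawtypes", "unchecked"})
class NintherSketch {
    static boolean less(Comparable v, Comparable w) { return v.compareTo(w) < 0; }

    // Returns whichever of the indices i, j, k holds the median of a[i], a[j], a[k].
    static int medianOf3(Comparable[] a, int i, int j, int k) {
        return less(a[i], a[j])
             ? (less(a[j], a[k]) ? j : less(a[i], a[k]) ? k : i)
             : (less(a[k], a[j]) ? j : less(a[k], a[i]) ? k : i);
    }

    // Returns the index of the ninther of a[lo..hi]; assumes at least 9 items.
    static int ninther(Comparable[] a, int lo, int hi) {
        int n = hi - lo + 1;
        int eps = n / 8;                 // gap between samples within a triple
        int mid = lo + n / 2;
        int m1 = medianOf3(a, lo, lo + eps, lo + 2 * eps);
        int m2 = medianOf3(a, mid - eps, mid, mid + eps);
        int m3 = medianOf3(a, hi - 2 * eps, hi - eps, hi);
        return medianOf3(a, m1, m2, m3);
    }

    public static void main(String[] args) {
        Integer[] a = {8, 7, 6, 5, 4, 3, 2, 1, 0};
        int m = ninther(a, 0, a.length - 1);
        System.out.println("index " + m + ", value " + a[m]);  // prints "index 4, value 4"
    }
}
```

A sort using this would exchange a[lo] with a[ninther(a, lo, hi)] before partitioning large subarrays, and fall back to median-of-3 or insertion sort for smaller ones.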
A beautiful mailing list post (Yaroslavskiy, September 2009)
Hello All,
I'd like to share with you new Dual-Pivot Quicksort which is faster than the
known implementations (theoretically and experimental). I'd like to propose
to replace the JDK's Quicksort implementation by new one.
...
The new Dual-Pivot Quicksort uses *two* pivots elements in this manner:
...
http://mail.openjdk.java.net/pipermail/core-libs-dev/2009-September/002630.html
A beautiful mailing list post (Yaroslavskiy-Bloch-Bentley, October 2009)
Changeset: b05abb410c52
Author: alanb
Date: 2009-10-29 11:18 +0000
URL: http://hg.openjdk.java.net/jdk7/tl/jdk/rev/b05abb410c52
! make/java/java/FILES_java.gmk
! src/share/classes/java/util/Arrays.java
+ src/share/classes/java/util/DualPivotQuicksort.java
http://mail.openjdk.java.net/pipermail/compiler-dev/2009-October.txt
Dual-pivot quicksort
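Use two partitioning items p1 and p2 and partition into three subarrays:
Keys less than p1.
Keys between p1 and p2.
Keys greater than p2.
[Figure: after partitioning, a[lo..lt-1] < p1, a[lt] = p1, a[lt+1..gt-1] between p1 and p2, a[gt] = p2, a[gt+1..hi] > p2]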
Dual-pivot partitioning demo
Initialization.
Choose a[lo] and a[hi] as partitioning items.
Exchange if necessary to ensure a[lo] ≤ a[hi].
S E A Y R L F V Z Q T C M K    (input, before the initial exchange)
Main loop. Repeat until the i and gt pointers cross.
If (a[i] < a[lo]), exchange a[i] with a[lt] and increment both lt and i.
Else if (a[i] > a[hi]), exchange a[i] with a[gt] and decrement gt.
Else, increment i.
K E A F R L M C Z Q T V Y S    (during the main loop)
Finalize.
Exchange a[lo] with a[--lt].
Exchange a[hi] with a[++gt].
C E A F K L M R Q S Z V Y T    (3-way partitioned)
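The initialization, main loop, and finalize steps of the demo can be sketched as follows. This is an illustration of the steps as listed, not the JDK's DualPivotQuicksort code. The class and method names are hypothetical, and no shuffle or cutoff is included.

```java
// Sketch of dual-pivot partitioning as described in the demo steps. Names are illustrative.
@SuppressWarnings({"rawtypes", "unchecked"})
class DualPivotSketch {
    static boolean less(Comparable v, Comparable w) { return v.compareTo(w) < 0; }

    static void exch(Object[] a, int i, int j) { Object t = a[i]; a[i] = a[j]; a[j] = t; }

    // Partitions a[lo..hi] (lo < hi) and returns {lt, gt}, the final positions of p1 and p2.
    static int[] partition(Comparable[] a, int lo, int hi) {
        if (less(a[hi], a[lo])) exch(a, lo, hi);         // initialization: ensure a[lo] <= a[hi]
        int lt = lo + 1, gt = hi - 1, i = lo + 1;
        while (i <= gt) {                                // main loop: until i and gt cross
            if      (less(a[i], a[lo])) exch(a, lt++, i++);
            else if (less(a[hi], a[i])) exch(a, i, gt--);
            else                        i++;
        }
        exch(a, lo, --lt);                               // finalize: move p1 into place
        exch(a, hi, ++gt);                               // finalize: move p2 into place
        return new int[] { lt, gt };
    }

    static void sort(Comparable[] a, int lo, int hi) {
        if (hi <= lo) return;
        int[] r = partition(a, lo, hi);
        sort(a, lo, r[0] - 1);
        if (less(a[r[0]], a[r[1]])) sort(a, r[0] + 1, r[1] - 1);  // skip the middle when p1 equals p2
        sort(a, r[1] + 1, hi);
    }

    public static void main(String[] args) {
        String[] a = "S E A Y R L F V Z Q T C M K".split(" ");
        int[] r = partition(a, 0, a.length - 1);
        System.out.println(r[0] + " " + r[1] + ": " + String.join(" ", a));
        // prints "4 9: C E A F K L M R Q S Z V Y T"
    }
}
```

On the demo input the two partitioning items K and S end at indices 4 and 9, which reproduces the final row of the demo.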
Dual-pivot quicksort
Use two partitioning items p1 and p2 and partition into three subarrays:
Keys less than p . 1
lo lt gt hi
60
Three-pivot quicksort
Use three partitioning items p1, p2, and p3 and partition into four subarrays:
Keys less than p1.
lo a1 a2 a3 hi
A. Fewer compares.
B. Fewer exchanges.
D. I don't know.
Quicksort quiz 4
A. Fewer compares.
1-pivot 2 N ln N 0.333 N ln N 2 N ln N
sorts               algorithms
elementary sorts    insertion sort, selection sort, bubblesort, shaker sort, ...
parallel sorts      bitonic sort, odd-even sort, smooth sort, GPUsort, ...
Which sorting algorithm to use?
Guaranteed performance?
System sort in Java 7
Arrays.sort().
Ineffective sorts
http://xkcd.com/1185