Beruflich Dokumente
Kultur Dokumente
1
Radar
Midterm
Two weeks from this Friday on 3/27
In class
Closed book
2
Improved Sorting Techniques
What we have seen
Bubble, selection and insertion
Easy to implement
Slower: O(n2)
Mergesort
Faster: O(n log n)
More memory (temporary array)
4
Recall Insertion Sort....
A subarray to the left is
‘partially sorted’
Start with the first element
The player immediately to
the right is ‘marked’.
The ‘marked’ player is
inserted into the correct
place in the partially
sorted array
Remove first
Marked player ‘walks’ to
the left
Shift appropriate elements
until we hit a smaller one
5
The problem
If a small item is very
far to the right
Like in this case ->
You must shift many
intervening large items
one space to the right
Almost N copies
Average case N/2
N items, N2/2 copies
Better if:
Move a small item
6
many spaces, without
shifting
Remember
What made insertion
sort the best of the
basic sorts?
If the array is almost
sorted, O(n)
7
The “Almost Sort” step
Say we have a 10-element array:
60 30 80 90 0 20 70 10 40 50
Sort indices 0, 4, and 8:
0 30 80 90 40 20 70 10 60 50
Sort indices 1, 5, and 9:
0 20 80 90 40 30 70 10 60 50
Sort indices 2, 6:
0 20 70 90 40 30 80 10 60 50
Sort indices 3, 7
0 20 70 10 40 30 80 90 60 50
8
The “Almost Sort” step
This is called a “4-sort”:
60 30 80 90 0 20 70 10 40 50
0 30 80 90 40 20 70 10 60 50
0 20 80 90 40 30 70 10 60 50
0 20 70 90 40 30 80 10 60 50
0 20 70 10 40 30 80 90 60 50
Once we’ve done this, the array is almost sorted, and we can run insertion
sort on the whole thing
Should be about O(n) time
9
Interval Sequencing
4-sort was sufficient for a 10-element array
For larger arrays, you’ll want to do many of these to achieve an array that
is really almost sorted. For example for 1000 items:
364-sort
121-sort
40-sort
13-sort
4-sort
insertion sort
10
Knuth’s Interval Sequence
How to determine?
Knuth’s algorithm:
Start at h=1
Repeatedly apply the function h=3*h+1, until you pass the number of items in
the array:
h = 1, 4, 13, 40, 121, 364, 1093….
Thus the previous sequence for a 1000-element array
So what interval sequences should we use for a 10000 element array?
11
Why is it better?
When h is very large, you are sorting small numbers of elements and
moving them across large distances
Efficient
When h is very small, you are sorting large numbers of elements and
moving them across small distances
Becomes more like traditional insertion sort
But each successive sort, the overall array is more sorted
So we should be nearing O(n)
12
Let’s do our own example…
Sort these fifteen elements:
8 10 1 15 7 4 12 13 2 6 11 14 3 9 5
13
Java Implementation, page 322
What are function needs to do:
Initialize h properly
Have an outer loop start at outer=h and count up
Have an inner loop which sorts outer, outer-h, outer-2h, etc.
For example, if h is 4:
We must sort (8, 4, 0), and (9, 5, 1), etc.
14
Other Interval Sequences
Original Shellsort:
h = h / 2
Inefficient, leads to O(n2)
Variation:
h = h / 2.2
Need extra effort to make sure we eventually hit h=1
Flamig’s Approach (yields similar to Knuth)
if (h < 5) h = 1; else h = (5*h-1) / 11;
16
Quicksort
This will be the sort that is the most optimal in terms of performance and
memory
One must exercise care to avoid the worst case, which is still very
expensive
17
Partitioning
Idea: Divide data into two groups, such that:
All items with a key value higher than a specified amount (the pivot) are in
one group
All items with a lower key value are in another
Applications:
Divide employees who live within 15 miles of the office with those who live
farther away
Divide households by income for taxation purposes
Divide computers by processor speed
19
Java Implementation, page 328
Partition function must:
Accept a pivot value, and a starting and ending index
Start an integer left=start-1, and right=end
Increment left until we find an A[left] bigger than the pivot:
Then decrement right until we find an A[right] bigger than the pivot.
Swap A[left] and A[right].
Stop when left and right cross
Then left is the partition
All items smaller are on the left, and larger on the right
There can’t be anything smaller to the right, because we would’ve swapped.
20
Efficiency: Partitioning
O(n) time
left starts at 0 and moves one-by-one to the right
right starts at n-1 and moves one-by-one to the left
When left and right cross, we stop.
So we’ll hit each element just once
Example:
Unpartitioned: 42 89 63 12 94 27 78 10 50 36
Partitioned around Pivot: 3 27 12 36 63 94 89 78 42 50
What does this imply about the pivot element after the partition?
22
Placing the Pivot
Goal: Pivot must be in the leftmost position in the right subarray
3 27 12 36 63 94 89 78 42 50
Our algorithm does not do this currently.
It currently will not touch the pivot
left increments till it finds an element < pivot
right decrements till it finds an element > pivot
So the pivot itself won’t be touched, and will stay on the right:
3 27 12 63 94 89 78 42 50 36
23
Options
We have this:
3 27 12 63 94 89 78 42 50 36
Our goal is the position of 36:
3 27 12 36 63 94 89 78 42 50
We could either:
Shift every element in the right subarray up (inefficient)
Just swap the leftmost with the pivot! Better
We can do this because the right subarray is not in any particular order
3 27 12 36 94 89 78 42 50 63
24
Swapping the Pivot
Just takes one more line to our Java method
Basically, a single call to swap()
Swaps A[end-1] (the pivot) with A[left] (the partition index)
25
Quicksort
The most popular sorting algorithm
For most situations, it runs in O(n log n)
Remember partitioning. It’s the key step. And it’s O(n).
Base case: We partition just one element, which is just the element itself
26
Java Implementation, page 333
We can probably just do this in our heads:
1. Choose the pivot to be the rightmost element
2. Partition the array into left (smaller keys) and right (larger keys) groups
3. Call ourselves and sort the left group
4. Call ourselves and sort the right group
Base case: We partition just one element, which is just the element itself
27
Shall we try it on an array?
10 70 50 30 60 90 0 40 80 20
Let’s go step-by-step on the board
28
Best case…
We partition the array each time into two equal subarrays
Say we start with array of size n = 2i
We recurse until the base case, 1 element
30
The VERY bad case…
In the worst case, we partition every time into an array of 0 elements and
an array of n-1 elements
This yields O(n2) time:
First call: Partition n elements, n operations
Second calls: Partition 0 and n-1 elements, n-1 operations
Third calls: Partition 0 and n-2 elements, n-2 operations
Draw the tree
Yielding:
Operations = n + n-1 + n-2 + … + 1 = n(n+1)/2 -> O(n2)
31
Summary
What caused the problem was “blindly” choosing the pivot from the right
end.
In the case of a reverse sorted array, this is not a good choice at all
32
Median-Of-Three Partitioning
Everytime you partition, choose the median value of the left, center and
right element as the pivot
Example:
44 11 55 33 77 22 00 99 101 66 88
34
partition, you get O(n log n) behavior
One final optimization…
After a certain point, just doing insertion sort is faster than partitioning
small arrays and making recursive calls
Once you get to a very small subarray, you can just sort with insertion sort
You can experiment a bit with ‘cutoff’ values
Knuth: n=9
35
(Time Pending) Java Implementation
QuickSort with maximum optimization
Median-Of-Three Partitioning
Insertion Sort on arrays of size less than 9
36
Operation Count Estimates
For QuickSort
n=8: 30 comparisons, 12 swaps
n=12: 50 comparisons, 21 swaps
n=16: 72 comparisons, 32 swaps
n=64: 396 comparisons, 192 swaps
n=100: 678 comparisons, 332 swaps
n=128: 910 comparisons, 448 swaps
37
Radix Sort
Idea: Disassemble elements into individual digits
Sort digit by digit
38
Assumptions
Assumption
The assumption is: base 10, positive integers!
39
Example: On the Board
421 240 35 532 305 430 124
Remember: 35 has a 100s digit of zero!
40
Operations
n Elements
Copy each element once to a group, and once back again:
2n copies -> O(n)
Then you have to copy k times, where k is the maximum number of digits
in any value
So, 2*k*n copies -> O(kn)
Zero comparisons
42
Java Implementation
Not in the book; will be left as an exercise
This is your two-week assignment! Implement the entire Radix Sort.
43