
Chapter 4, Part II

Sorting Algorithms

Heap Details
A heap is a tree structure in which, for
each subtree, the value stored at the
root is larger than all of the values
stored in the subtree
There is no ordering between the
children of any node other than that
they are smaller than their parent

Heap Details
A heap is also a complete tree, so nodes are
filled in along the bottom of the tree from left
to right and a new level is started only when
the previous level has been filled

The largest value stored in a heap will be in
the root of the heap and the smallest value
will be in one of the leaves

Heap Example

(figure omitted)

Heapsort
Heapsort begins by constructing a heap
The root (the largest value in the heap)
is moved to the last location of the list
The heap is fixed and the process is
repeated

Heap Storage
We can store the heap using an array
For an element at location i, its children will be in
locations 2i and 2i+1

If 2i and 2i+1 are greater than the list size, then the
element at location i is a leaf
If only 2i+1 is greater than the list size, then the
element at location i has just one child, at location 2i
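The index arithmetic above can be checked with a short sketch; the helper name `children` and the sample values are illustrative only, and slot 0 of the array is left unused to keep the slides' 1-based indexing:

```python
# Child index arithmetic for a heap stored in an array.
# The slides use 1-based indexing; here we simulate that by
# ignoring slot 0 (a common trick when porting such pseudocode).

def children(i, size):
    """Return the indices of the children of node i (1-based), if any."""
    return [c for c in (2 * i, 2 * i + 1) if c <= size]

heap = [None, 50, 30, 40, 10, 20, 35]  # slot 0 unused; size = 6
size = 6

assert children(1, size) == [2, 3]   # root has two children
assert children(3, size) == [6]      # node 3 has one child (2*3+1 = 7 > 6)
assert children(4, size) == []       # node 4 is a leaf
```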

Heap Construction Example

(figure omitted: construction starts at the last internal node, i = N/2;
for the 16-element example, i = 8 and 2i = 16, so the first FixHeap call
makes no change)

Final Heapsort Loop

(figures omitted: successive passes of the final heapsort loop)
Heapsort Algorithm
construct the heap
for i = 1 to N do
copy the root to the list
fix the heap
end for

FixHeap Algorithm
FixHeap( list, root, key, bound )
vacant = root
while 2*vacant <= bound do
largerChild = 2*vacant
if (largerChild < bound) and
(list[largerChild+1] > list[largerChild]) then
largerChild = largerChild + 1
end if
if key > list[ largerChild ] then
break
else
list[ vacant ] = list[ largerChild ]
vacant = largerChild
end if
end while
list[ vacant ] = key
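A direct Python rendering of FixHeap, as a sketch; the function name and the 1-based layout with an unused slot 0 are conventions chosen here, not part of the slides:

```python
def fix_heap(lst, root, key, bound):
    """Sift key down from root within the heap lst[1..bound].

    1-based heap layout: slot 0 is unused, children of i are 2i and 2i+1.
    """
    vacant = root
    while 2 * vacant <= bound:
        larger_child = 2 * vacant
        # pick the larger of the two children, if a right child exists
        if larger_child < bound and lst[larger_child + 1] > lst[larger_child]:
            larger_child += 1
        if key > lst[larger_child]:
            break  # key dominates both children; vacant is its home
        lst[vacant] = lst[larger_child]  # promote the larger child
        vacant = larger_child
    lst[vacant] = key

# Example: root value 10 sifts down below 50 and 20.
h = [None, 10, 50, 30, 20, 15]
fix_heap(h, 1, h[1], 5)
assert h[1:] == [50, 20, 30, 10, 15]
```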

Constructing the Heap


for i = N/2 down to 1 do
FixHeap( list, i, list[ i ], N )
end for


Final Heapsort Algorithm


// Constructing the heap from the initial list
for i = N/2 down to 1 do
FixHeap( list, i, list[ i ], N )
end for

// Sorting on the constructed heap
for i = N down to 2 do
max = list[ 1 ]
FixHeap( list, 1, list[ i ], i-1 )
list[ i ] = max
end for
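The two phases above can be combined into a runnable sketch; `fix_heap` is the same sift-down routine as FixHeap, and the shift to a 1-based array (unused slot 0) is an implementation choice, not from the slides:

```python
def fix_heap(lst, root, key, bound):
    # sift key down from root within lst[1..bound] (1-based heap)
    vacant = root
    while 2 * vacant <= bound:
        child = 2 * vacant
        if child < bound and lst[child + 1] > lst[child]:
            child += 1
        if key > lst[child]:
            break
        lst[vacant] = lst[child]
        vacant = child
    lst[vacant] = key

def heapsort(data):
    n = len(data)
    lst = [None] + list(data)  # shift to 1-based indexing
    # construction phase: fix every internal node, bottom-up
    for i in range(n // 2, 0, -1):
        fix_heap(lst, i, lst[i], n)
    # sorting phase: move the root out, shrink the heap, re-fix
    for i in range(n, 1, -1):
        largest = lst[1]
        fix_heap(lst, 1, lst[i], i - 1)
        lst[i] = largest
    return lst[1:]

assert heapsort([3, 1, 4, 1, 5, 9, 2, 6]) == [1, 1, 2, 3, 4, 5, 6, 9]
```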

Worst-Case Analysis
We analyze FixHeap because the rest of the
analysis depends on it
For each level of the heap, FixHeap does two
comparisons: one between the two children and the
other between the new value and the larger child
For a heap with D levels, there will be at most 2D
comparisons


Worst-Case Analysis
During heap construction, FixHeap is called (N / 2)
times
On the first pass, the heap will have depth of 1

On the last pass, the heap will have depth of (lg N)


We need to determine how many nodes there are on
each of the levels


Worst-Case Analysis
For binary trees, we know that there is 1 node on the
first level, 2 nodes on the second level, 4 nodes on
the third level, and so on
The number of nodes on level i is 2^i, and the
maximum depth of the subtree rooted at a node on
level i is D - i, with 2 comparisons done per level
Putting this together gives:

WConstruction( N ) = Σ_{i=0}^{D-1} 2 * ( D - i ) * 2^i
= 4N - 2 lg N - 4, for D = lg N
= O( N )

( see p. 104 for the derivation )
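The construction sum Σ_{i=0}^{D-1} 2(D-i)2^i can be reduced to the closed form 4N - 2 lg N - 4 as follows; this is a sketch of the derivation referenced on p. 104, assuming D = lg N and N = 2^D:

```latex
\begin{aligned}
\sum_{i=0}^{D-1} 2(D-i)\,2^i
  &= 2 \cdot 2^{D} \sum_{j=1}^{D} \frac{j}{2^{j}} \qquad (j = D - i) \\
  &= 2 \cdot 2^{D} \left( 2 - \frac{D+2}{2^{D}} \right) \\
  &= 4 \cdot 2^{D} - 2D - 4 \;=\; 4N - 2\lg N - 4 \;=\; O(N)
\end{aligned}
```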

Worst-Case Analysis
In the second loop, the size of the heap decreases by
one each pass
If there are k nodes left in the heap, then the heap
will have a depth of lg k
This gives:

WLoop( N ) = Σ_{k=1}^{N-1} 2 * lg k = O( N lg N )

Worst-Case Analysis
Overall, the worst case is given by:
W( N ) = WConstruction( N ) + WLoop( N )
= O( N ) + O( N lg N )
= O( N lg N )

Best-Case Analysis
In the best case, the elements will already be in the
array in reverse order
The construction phase will still be O(N)
Once the heap is constructed, the main loop will take
the same O(N lg N) work
So, the best case for heapsort is also O(N lg N)

Average-Case Analysis
Average case must be between the best case and
the worst case
The average case for heapsort must be O(N lg N),
because best and worst case are both O(N lg N)


Merge Sort
If you have two sorted lists, you can create a
combined sorted list by merging the lists
We know that the smallest value will be the first one
in one of the two lists
If we move the smallest value to the new list, we can
repeat the process until the entire combined list is created


Merge Sort Example

(figures omitted)

The Algorithm
if first < last then
middle = ( first + last ) / 2
MergeSort( list, first, middle )
MergeSort( list, middle + 1, last )
MergeLists( list, first, middle, middle + 1, last )
end if

MergeList Algorithm
Part 1
finalStart = start1
finalEnd = end2
indexC = 1
while (start1 <= end1) and (start2 <= end2) do
if list[start1] < list[start2] then
result[indexC] = list[start1]
start1 = start1 + 1
else
result[indexC] = list[start2]
start2 = start2 + 1
end if
indexC = indexC + 1
end while

MergeList Algorithm
Part 2
if start1 <= end1 then
for i = start1 to end1 do
result[indexC] = list[i]
indexC = indexC + 1
end for
else
for i = start2 to end2 do
result[indexC] = list[i]
indexC = indexC + 1
end for
end if


MergeList Algorithm
Part 3
indexC = 1
for i = finalStart to finalEnd do
list[i] = result[indexC]
indexC = indexC + 1
end for

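Parts 1-3, together with the MergeSort driver, can be sketched in Python. The inclusive bounds follow the pseudocode; the 0-based indexing and the use of Python slices for parts 2 and 3 are adaptations:

```python
def merge_lists(lst, start1, end1, start2, end2):
    """Merge two adjacent sorted runs lst[start1..end1] and
    lst[start2..end2] back into lst (inclusive bounds, with a
    separate result buffer as in the slides)."""
    final_start, final_end = start1, end2
    result = []
    # Part 1: take the smaller head element until one run is empty
    while start1 <= end1 and start2 <= end2:
        if lst[start1] < lst[start2]:
            result.append(lst[start1]); start1 += 1
        else:
            result.append(lst[start2]); start2 += 1
    # Part 2: copy whatever remains of the non-empty run
    # (one of these two slices is necessarily empty)
    result.extend(lst[start1:end1 + 1])
    result.extend(lst[start2:end2 + 1])
    # Part 3: copy the merged result back into the original list
    lst[final_start:final_end + 1] = result

def merge_sort(lst, first, last):
    if first < last:
        middle = (first + last) // 2
        merge_sort(lst, first, middle)
        merge_sort(lst, middle + 1, last)
        merge_lists(lst, first, middle, middle + 1, last)

data = [5, 2, 8, 1, 9, 3]
merge_sort(data, 0, len(data) - 1)
assert data == [1, 2, 3, 5, 8, 9]
```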

MergeLists Analysis
The best case is when the elements of one list are
larger than all of the elements of the other list
One worst case is when the elements are interleaved
If each list has N elements, we will do N comparisons
in the best case, and 2N-1 comparisons in the worst
case


MergeSort Analysis
MergeSort divides the list in half each time, so the
difference between the best and worst cases is how
much work MergeList does
In the analysis, we consider that a list of N elements
gets broken into two lists of N/2 elements that are
recursively sorted and then merged together


MergeSort Analysis
The worst case is:
W(N) = 2W(N/2) + N - 1
W(0) = W(1) = 0
which solves to W(N) = O(N lg N)
The best case is:
B(N) = 2B(N/2) + N/2
B(0) = B(1) = 0
which solves to B(N) = O(N lg N)

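The worst-case recurrence W(N) = 2W(N/2) + N - 1 can be solved by repeated expansion; a sketch, assuming N is a power of 2:

```latex
\begin{aligned}
W(N) &= 2\,W(N/2) + N - 1 \\
     &= 2^{k}\,W(N/2^{k}) + kN - (2^{k} - 1) \\
     &= N\lg N - N + 1 \qquad (k = \lg N,\ W(1) = 0) \\
     &= O(N \lg N)
\end{aligned}
```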

Quicksort
In Chapter 3, we saw a partition process
used to help us find the Kth largest element in
a list
We now use the partitioning process to help
us sort a list
This time we apply the process to both parts
of the list instead of just one of them


Quicksort
Quicksort will partition a list into two pieces:
Those elements smaller than the pivot value
Those elements larger than the pivot value
Quicksort is then called recursively on both pieces


Quicksort Example

(figures omitted)

Quicksort Algorithm
if first < last then
pivot = PivotList( list, first, last )
Quicksort( list, first, pivot-1 )
Quicksort( list, pivot+1, last )
end if


Partitioning Process
The algorithm moves through the list
comparing values to the pivot
During this process, the list is divided into
sections: the pivot value, the elements smaller
than the pivot, the elements larger than the
pivot, and the elements not yet examined


PivotList Algorithm
PivotValue = list[ first ]

PivotPoint = first
for index = first+1 to last do
if list[ index ] < PivotValue then
PivotPoint = PivotPoint + 1
Swap( list[ PivotPoint ], list[ index ] )
end if
end for
// move pivot value into correct place
Swap( list[ first ], list[ PivotPoint ] )
return PivotPoint

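PivotList and the Quicksort driver translate almost line for line into Python; the 0-based indexing and the function names are the only adaptations:

```python
def pivot_list(lst, first, last):
    """Partition lst[first..last] around lst[first] and return the
    pivot's final position (the slides' PivotList, 0-based here)."""
    pivot_value = lst[first]
    pivot_point = first
    for index in range(first + 1, last + 1):
        if lst[index] < pivot_value:
            pivot_point += 1
            lst[pivot_point], lst[index] = lst[index], lst[pivot_point]
    # move the pivot value into its correct place
    lst[first], lst[pivot_point] = lst[pivot_point], lst[first]
    return pivot_point

def quicksort(lst, first, last):
    if first < last:
        pivot = pivot_list(lst, first, last)
        quicksort(lst, first, pivot - 1)
        quicksort(lst, pivot + 1, last)

data = [7, 2, 9, 4, 1, 8]
quicksort(data, 0, len(data) - 1)
assert data == [1, 2, 4, 7, 8, 9]
```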

Worst-Case Analysis
In the worst case, PivotList will do N - 1
comparisons, but create one partition that has N - 1
elements and another that has no elements
Because it winds up just reducing the partition by one
element each time, the worst case is given by:
W( N ) = Σ_{i=2}^{N} ( i - 1 ) = N( N - 1 ) / 2 = O( N^2 )

Average-Case Analysis
In the average case, we need to consider all
of the possible places where the pivot point
winds up
Because there are N - 1 comparisons done
to partition the list, and there are N ways this
can be done, we have:

A( N ) = N - 1 + (1/N) * Σ_{i=1}^{N} [ A( i-1 ) + A( N-i ) ]
A( 1 ) = A( 0 ) = 0

Average-Case Analysis
Algebra can be used to simplify this recurrence
relation to:

A( N ) = [ ( N+1 ) * A( N-1 ) + 2N - 2 ] / N
A( 1 ) = A( 0 ) = 0

This will then solve to:

A( N ) ≈ 1.4 * ( N+1 ) * lg N

External Polyphase Merge Sort


Used when the data to be sorted is so large
it will not fit in the computer's memory
External files are used to hold partial results
Read in as many records as possible into
memory and then sort them using one of the
other sorts
Alternate writing these runs of sorted records
to one of two files

External Polyphase Merge Sort


Merge pairs of runs from the two files
into one run that is twice the length
To do this, the runs might have to be
read into memory in pieces, but the
entire two runs must be merged before
moving on to the next pair of runs
This doubles the run length and halves
the number of runs

External Polyphase Merge Sort


The bigger runs are written alternately
between two new files
The process continues to merge pairs of
runs until the entire data set has been
merged back into a single sorted file


Run Creation Algorithm


CurrentFile = A
while not at the end of the input file do
read S records from the input file
sort the S records
write the records to file CurrentFile
if CurrentFile == A then
CurrentFile = B
else
CurrentFile = A
end if
end while

Run Merge Algorithm


Size = S
Input1 = A
Input2 = B
CurrentOutput = C
while not done do
// merge runs (process on next slide)
Size = Size * 2
if Input1 == A then
Input1 = C
Input2 = D
CurrentOutput = A
else
Input1 = A
Input2 = B
CurrentOutput = C
end if
end while

Merge Runs Process


while more runs this pass do
Merge one run of length Size from file Input1
with one run of length Size from file Input2
sending output to CurrentOutput
if CurrentOutput == A then
CurrentOutput = B
elseif CurrentOutput == B then
CurrentOutput = A
elseif CurrentOutput == C then
CurrentOutput = D
elseif CurrentOutput == D then
CurrentOutput = C
end if
end while
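The run creation and run merging phases can be simulated in memory, with Python lists standing in for the external files; this sketch simplifies the four-file juggling of the slides into a single list of runs per pass, so it illustrates the run-doubling idea rather than the exact file protocol. The names `make_runs` and `merge_passes` and the parameter S are illustrative:

```python
def make_runs(records, s):
    """Run creation: sort chunks of S records, alternating
    between two 'files' (lists of runs)."""
    file_a, file_b = [], []
    current, other = file_a, file_b
    for i in range(0, len(records), s):
        current.append(sorted(records[i:i + s]))
        current, other = other, current
    return file_a, file_b

def merge_passes(file_a, file_b):
    """Run merging: repeatedly merge pairs of runs, doubling the
    run length each pass, until one sorted run remains."""
    runs = [r for pair in zip(file_a, file_b) for r in pair]
    runs += file_a[len(file_b):] + file_b[len(file_a):]
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs) - 1, 2):
            a, b, out = runs[i], runs[i + 1], []
            ia = ib = 0
            while ia < len(a) and ib < len(b):
                if a[ia] < b[ib]:
                    out.append(a[ia]); ia += 1
                else:
                    out.append(b[ib]); ib += 1
            merged.append(out + a[ia:] + b[ib:])
        if len(runs) % 2:          # odd run passes through unmerged
            merged.append(runs[-1])
        runs = merged
    return runs[0] if runs else []

data = [9, 4, 7, 1, 8, 2, 6, 3, 5, 0]
a, b = make_runs(data, 3)
assert merge_passes(a, b) == sorted(data)
```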

Run Creation Analysis


This analysis assumes that there are N
elements in the list and that they are
broken down into R runs of S elements
(N = R * S)
If we use an efficient sort to create the
runs, each run will take O(S lg S) and
there will be R of them for a total time of
O(R * S * lg S) = O(N lg S)

Run Merging Analysis


On the first pass, we have R runs of S
elements, so there will be R/2 merges
that can each take up to 2S - 1 comparisons,
which is R/2 * (2S - 1) = R*S - R/2
On the second pass, we will have R/2
runs of 2S elements, so there will be
R/4 merges that can each take up to 4S - 1
comparisons, which is R/4 * (4S - 1) =
R*S - R/4

Run Merging Analysis


There will be lg R passes of the merge
phase, so that the complexity is given
by:
WMerge( N ) = Σ_{i=1}^{lg R} ( R*S - R/2^i ) = N * lg R - R + 1

External Polyphase Merge Sort Analysis
Putting the run creation and run
merging calculations together we find
that the overall complexity is O(N lg N)
