Sorting methods
Comparison-based sorting
  $O(n^2)$ methods, e.g., insertion sort, bubble sort
  $O(n \log n)$ methods
Non-comparison-based sorting
  e.g., radix sort, bucket sort: $O(n)$. When?
Image courtesy: McQuain WD, VA Tech, 2004
(Divide) Split the problem of size n into a fixed number of sub-problems of smaller sizes, and solve each sub-problem recursively
(Conquer) Merge the answers to the sub-problems
Cpt S 223. School of EECS, WSU
Merge sort
Divide is trivial
Merge (i.e., conquer) does all the work
Quick sort
Merge Sort
Main idea:
Dividing is trivial
Merging is non-trivial
[Figure: merge sort recursion tree on the input. Divide: O(lg n) steps, ending in O(n) sub-problems. Conquer: O(lg n) steps. How much work at every step?]
A1: 14 23 32
A2: 10 19
Merged output (indexed by k): 10 14 19 23 32
$\Theta(n)$ time
Do you always need the temporary array B to store the output, or can you do this inplace?
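The merge step and the recursion around it can be sketched as follows. This is a minimal illustration, not the course's reference code: the function names `merge` and `merge_sort` are assumptions, and this version always allocates the temporary output list (called `b` here, matching the array B in the question above).

```python
def merge(a1, a2):
    """Merge two sorted lists into a new sorted list in Theta(n) time."""
    b = []          # temporary output array B
    i = j = 0
    while i < len(a1) and j < len(a2):
        if a1[i] <= a2[j]:      # "<=" keeps equal keys in input order (stable)
            b.append(a1[i])
            i += 1
        else:
            b.append(a2[j])
            j += 1
    b.extend(a1[i:])            # at most one of these two tails is non-empty
    b.extend(a2[j:])
    return b


def merge_sort(s):
    """Divide is trivial (split in half); merge does all the work."""
    if len(s) <= 1:
        return list(s)
    mid = len(s) // 2
    return merge(merge_sort(s[:mid]), merge_sort(s[mid:]))


print(merge([14, 23, 32], [10, 19]))   # [10, 14, 19, 23, 32]
```

Merging without the extra list is possible but considerably more involved, which is why the simple version above uses one.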
$T(n) = 2\,T(n/2) + n$
$T(n) = 2^k\,T(n/2^k) + kn$ (after expanding k times)
At $k = \lg n$, $T(n/2^k) = T(1) = 1$ (termination case)
==> $T(n) = n + n \lg n = \Theta(n \lg n)$
QuickSort
Main idea:
Dividing (partitioning) is non-trivial
Merging is trivial
QuickSort Algorithm
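QuickSort( Array: S)
1. If size of S is 0 or 1, return
2. Pick a pivot element v in S, and partition the remaining elements into:
   $S_1 = \{x \in (S - \{v\}) \mid x < v\}$
   $S_2 = \{x \in (S - \{v\}) \mid x > v\}$
3. Return QuickSort(S1), followed by v, followed by QuickSort(S2)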
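The set-based description of S1 and S2 above translates almost directly into code. The sketch below is only an illustration under two assumptions that the slide does not state: the pivot v is taken to be the first element, and elements equal to v are kept in a separate list so that duplicates are not lost (the strict `<` and `>` in the definitions would otherwise drop them). It is not in place; it builds new lists at every level.

```python
def quicksort(s):
    """Return a sorted copy of s using the S1 / v / S2 decomposition."""
    if len(s) <= 1:
        return list(s)
    v = s[0]                                  # assumed pivot choice: first element
    rest = s[1:]
    s1 = [x for x in rest if x < v]           # S1: elements smaller than the pivot
    equal = [x for x in rest if x == v]       # duplicates of the pivot
    s2 = [x for x in rest if x > v]           # S2: elements larger than the pivot
    return quicksort(s1) + [v] + equal + quicksort(s2)


print(quicksort([8, 1, 4, 9, 0, 3, 5, 2, 7, 6]))
```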
QuickSort Example
Goal: A good pivot is one that creates two even sized partitions
=> The median would be best, but finding the median could be as tough as sorting itself
Pivot = median of array:
8 1 4 9 0 3 5 2 7 6 will result in 1 0 3 2 4 8 9 5 7 6
But median is expensive to calculate
Pivot = median {first, middle, last}:
8 1 4 9 0 3 5 2 7 6 will result in 1 4 0 3 5 2 6 8 9 7
Has been shown to reduce running time (comparisons) by 14%
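Choosing the pivot as the median of the first, middle, and last elements takes only a constant number of comparisons. A small sketch, where the function name `median_of_three` and the choice of the middle index as `(len(s) - 1) // 2` are assumptions for illustration:

```python
def median_of_three(s):
    """Return the median of the first, middle, and last elements of s."""
    first = s[0]
    middle = s[(len(s) - 1) // 2]   # assumed definition of "middle"
    last = s[-1]
    return sorted([first, middle, last])[1]


# For the array used above, first = 8, middle = 0, last = 6
print(median_of_three([8, 1, 4, 9, 0, 3, 5, 2, 7, 6]))   # 6
```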
How to write the partitioning code?
Goal of partitioning:
6 1 4 9 0 3 5 2 7 8 should result in 1 4 0 3 5 2 6 9 7 8
Partitioning strategy
// swap A[i] & A[j] (only if i<j)
Swap ( pivot , S[i] )
This is called in place because all operations are done in place on the input array (i.e., without creating a temp array)
Partitioning Strategy
Swap pivot with last element S[right]
i = left
j = (right - 1)
while (i < j)
Needs handling for a few boundary cases
Partitioning Example
left = first position, right = last position
8 1 4 9 6 3 5 2 7 0   Initial array (pivot = 6)
8 1 4 9 0 3 5 2 7 6   Pivot swapped with the last element; i at 8, j at 7
8 1 4 9 0 3 5 2 7 6   Positioned to swap: i at 8, j at 2
2 1 4 9 0 3 5 8 7 6   swapped
While (i < j) {
  { i++; } until S[i] > pivot
  { j--; } until S[j] < pivot
  If (i < j), then swap( S[i] , S[j] )
}
2 1 4 9 0 3 5 8 7 6   Positioned to swap: i at 9, j at 5
2 1 4 5 0 3 9 8 7 6   swapped
2 1 4 5 0 3 9 8 7 6   i at 9, j at 3: i has crossed j
2 1 4 5 0 3 6 8 7 9   Pivot swapped with S[i]; p marks the pivot's final position
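The partitioning walk-through above can be reproduced with a short in-place routine. This is a sketch rather than the slide's exact code: the name `partition` and the `pivot_index` parameter are assumptions, and the loop bounds are written so that the boundary cases (i running past j, or j running off the left end) are handled explicitly. Both scans stop on elements equal to the pivot.

```python
def partition(s, left, right, pivot_index):
    """Partition s[left..right] in place around s[pivot_index].

    Returns the final position of the pivot.
    """
    # Move the pivot out of the way, to the last position.
    s[pivot_index], s[right] = s[right], s[pivot_index]
    pivot = s[right]
    i, j = left, right - 1
    while True:
        while i <= j and s[i] < pivot:   # skip elements already on the correct side
            i += 1
        while j >= i and s[j] > pivot:
            j -= 1
        if i >= j:                       # i has met or crossed j: scanning is done
            break
        s[i], s[j] = s[j], s[i]          # both are on the wrong side: swap them
        i += 1
        j -= 1
    s[i], s[right] = s[right], s[i]      # put the pivot into its final position
    return i


a = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
p = partition(a, 0, len(a) - 1, 4)       # pivot is the 6 in the middle
print(a, p)                              # [2, 1, 4, 5, 0, 3, 6, 8, 7, 9] 6
```

Because the scans stop on keys equal to the pivot, an input of all-equal keys is split near the middle instead of producing one empty side.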
Handling Duplicates
What happens if all input elements are equal?
Special case: 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
Current approach:
Handling Duplicates
A better code
Special case: 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
What will happen now?
Small Arrays
QuickSort Implementation
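L = left, C = center, R = right, P = pivot
8 1 4 9 6 3 5 2 7 0   L = 8, C = 6, R = 0
6 1 4 9 8 3 5 2 7 0   L = 6, C = 8, R = 0
0 1 4 9 8 3 5 2 7 6   L = 0, C = 8, R = 6
0 1 4 9 6 3 5 2 7 8   L = 0, C = 6, R = 8
0 1 4 9 7 3 5 2 6 8   L = 0, C = 7, P = 6 (placed just before R), R = 8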
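The sequence of array states above is consistent with ordering the left, center, and right elements with three conditional swaps and then parking the pivot next to the right end. The sketch below reproduces those states; the function name `median3` and the exact order of the comparisons are assumptions inferred from the trace.

```python
def median3(a, left, right):
    """Order a[left], a[center], a[right]; move the median to a[right - 1]."""
    center = (left + right) // 2
    if a[center] < a[left]:
        a[left], a[center] = a[center], a[left]
    if a[right] < a[left]:
        a[left], a[right] = a[right], a[left]
    if a[right] < a[center]:
        a[center], a[right] = a[right], a[center]
    # Now a[left] <= a[center] <= a[right]; hide the pivot at right - 1.
    a[center], a[right - 1] = a[right - 1], a[center]
    return a[right - 1]


a = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
pivot = median3(a, 0, len(a) - 1)
print(pivot, a)    # 6 [0, 1, 4, 9, 7, 3, 5, 2, 6, 8]
```

After this step a[left] is no larger than the pivot and a[right] is no smaller, so both ends can act as sentinels during partitioning.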
Assign pivot as median of 3
Partition based on pivot
Swap should be compiled inline.
Recursively sort partitions
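Putting the annotated steps together (median-of-3 pivot, partition, recursive calls) gives a complete in-place QuickSort. The version below is a sketch under assumptions: the cutoff of 10 elements for switching to insertion sort is a hypothetical value, and the helper names are illustrative. The partition loop relies on the sentinels that `median3` leaves at both ends, so it needs no explicit bounds checks.

```python
CUTOFF = 10  # assumed threshold below which insertion sort is used


def insertion_sort(a, left, right):
    """Sort a[left..right] in place; fast for small ranges."""
    for p in range(left + 1, right + 1):
        tmp = a[p]
        j = p
        while j > left and a[j - 1] > tmp:
            a[j] = a[j - 1]
            j -= 1
        a[j] = tmp


def median3(a, left, right):
    """Order left/center/right and park the median (pivot) at right - 1."""
    center = (left + right) // 2
    if a[center] < a[left]:
        a[left], a[center] = a[center], a[left]
    if a[right] < a[left]:
        a[left], a[right] = a[right], a[left]
    if a[right] < a[center]:
        a[center], a[right] = a[right], a[center]
    a[center], a[right - 1] = a[right - 1], a[center]
    return a[right - 1]


def quicksort(a, left=0, right=None):
    """Sort a[left..right] in place."""
    if right is None:
        right = len(a) - 1
    if right - left + 1 <= CUTOFF:           # small array: no recursion
        insertion_sort(a, left, right)
        return
    pivot = median3(a, left, right)          # assign pivot as median of 3
    i, j = left, right - 1
    while True:                              # partition based on pivot
        i += 1
        while a[i] < pivot:                  # stops at right - 1 at the latest
            i += 1
        j -= 1
        while a[j] > pivot:                  # stops at left at the latest
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            break
    a[i], a[right - 1] = a[right - 1], a[i]  # restore pivot
    quicksort(a, left, i - 1)                # recursively sort partitions
    quicksort(a, i + 1, right)


data = [(37 * k) % 101 for k in range(60)]
quicksort(data)
print(data == sorted(data))
```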
Analysis of QuickSort
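$T(N) = T(L) + T(N - L - 1) + cN$, where L is the size of the left partition
$T(L)$: time to sort the left partition
$T(N - L - 1)$: time to sort the right partition
$cN$: time for partitioning at the current recursive step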
Analysis of QuickSort
Worst-case analysis
$T(N) = T(0) + T(N-1) + O(N)$
$T(N) = O(1) + T(N-1) + O(N)$
$T(N) = T(N-1) + O(N)$
$T(N) = T(N-2) + O(N-1) + O(N)$
$T(N) = T(N-3) + O(N-2) + O(N-1) + O(N)$
$T(N) = \sum_{i=1}^{N} O(i) = O(N^2)$
Analysis of QuickSort
Best-case analysis
$T(N) = T(N/2) + T(N/2) + O(N)$
$T(N) = 2\,T(N/2) + O(N)$
$T(N) = O(N \log N)$
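Average case (assuming every left-partition size L = 0, ..., N-1 is equally likely):
=> Avg $T(L)$ = Avg $T(N-L-1)$ = $\frac{1}{N} \sum_{j=0}^{N-1} T(j)$
=> Avg $T(N) = \frac{2}{N} \left[ \sum_{j=0}^{N-1} T(j) \right] + cN$
=> $N\,T(N) = 2 \left[ \sum_{j=0}^{N-1} T(j) \right] + cN^2$ => (1)
Substituting N by N-1
=> $(N-1)\,T(N-1) = 2 \left[ \sum_{j=0}^{N-2} T(j) \right] + c(N-1)^2$ => (2)
(1)-(2) => $N\,T(N) - (N-1)\,T(N-1) = 2\,T(N-1) + c(2N-1)$
=> $N\,T(N) = (N+1)\,T(N-1) + c(2N-1)$
Dividing by $N(N+1)$ and telescoping the resulting sum gives $T(N) = O(N \log N)$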
Comparison Sorting
Sort            Worst Case            Average Case          Best Case             Comments
InsertionSort   $\Theta(N^2)$         $\Theta(N^2)$         $\Theta(N)$           Fast for small N
MergeSort       $\Theta(N \log N)$    $\Theta(N \log N)$    $\Theta(N \log N)$    Requires memory
HeapSort        $\Theta(N \log N)$    $\Theta(N \log N)$    $\Theta(N \log N)$    Large constants
QuickSort       $\Theta(N^2)$         $\Theta(N \log N)$    $\Theta(N \log N)$    Small constants
Can we do better?
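[Figure: decision tree for comparison-based sorting, with labels for each node, each branch, and each leaf. The nodes shown are comparisons such as "IF a<b" and "IF a<c", each with its possible outcomes.]
n! leaves in this tree
Height = $\Omega(\lg n!)$: the worst-case evaluation path for any algorithm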
$N! = \sqrt{2\pi N}\,(N/e)^N\,(1 + \Theta(1/N))$
$N! \ge (N/e)^N$
$\log(N!) \ge N \log N - N \log e = \Theta(N \log N)$
$\log(N!) = \Omega(N \log N)$
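The bound can be checked numerically for small inputs. A decision tree with n! leaves has height at least lg(n!), so any comparison sort needs at least that many comparisons in the worst case. The helper name `min_comparisons` below is an assumption for illustration; it only evaluates the bound and does not claim that the bound is achievable for every n.

```python
import math


def min_comparisons(n):
    """Lower bound on worst-case comparisons for sorting n keys: ceil(lg n!)."""
    return math.ceil(math.log2(math.factorial(n)))


for n in (3, 4, 5, 10):
    # Print the bound next to n*lg(n) to show that both grow at the same rate.
    print(n, min_comparisons(n), round(n * math.log2(n), 1))
```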
Integer Sorting
E.g., sorting an employee database by age of employees
Counting Sort
[Figure: counting sort example with N=10, M=4]
Time = O(N + M)
If (M < N), Time = O(N)
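A minimal counting sort that achieves the O(N + M) bound is sketched below. It assumes, as an illustration only, that the keys are integers in the range 0 to M-1; the function name `counting_sort` and the sample input are hypothetical.

```python
def counting_sort(a, m):
    """Sort a list of integer keys in range(m) in O(N + M) time."""
    counts = [0] * m                 # counts[key] = number of occurrences of key
    for e in a:                      # O(N) pass over the input
        counts[e] += 1
    out = []
    for key in range(m):             # O(M) pass over the key range
        out.extend([key] * counts[key])
    return out


print(counting_sort([1, 2, 3, 2, 3, 2, 1, 0, 0, 3], 4))
```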
[Figure: counting sort input and output arrays for N=10, M=4]
i=0;
while(i<n) {
  e=A[i];
  if c[e] has gone below range, then continue after i++;
  if(i==c[e]) i++;
  tmp = A[c[e]];
  A[c[e]--] = e;
  A[i] = tmp;
}
Note: This code has to keep track of the valid range for each key
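One way to keep track of the valid range for each key is to store, per key, the next free slot and the end of its range, and to swap each misplaced element directly into its own range. The sketch below is a variant of that idea, not a transcription of the pseudocode above: it fills each range from the front rather than from the end point, and all names are assumptions. Keys are assumed to be integers in range(m).

```python
def counting_sort_in_place(a, m):
    """Sort integer keys in range(m) in place, using O(M) extra space."""
    counts = [0] * m
    for e in a:
        counts[e] += 1
    nxt = [0] * m                    # nxt[k]: next unfilled slot in key k's range
    end = [0] * m                    # end[k]: one past the last slot of key k's range
    total = 0
    for k in range(m):
        nxt[k] = total
        total += counts[k]
        end[k] = total
    for k in range(m):
        while nxt[k] < end[k]:
            e = a[nxt[k]]
            if e == k:
                nxt[k] += 1          # already in its own range
            else:
                # Send e to the next free slot of its own range; whatever was
                # there comes back to this slot and is examined next.
                a[nxt[k]], a[nxt[e]] = a[nxt[e]], a[nxt[k]]
                nxt[e] += 1


data = [1, 2, 3, 2, 3, 2, 1, 0, 0, 3]
counting_sort_in_place(data, 4)
print(data)
```

Every swap places one element permanently, so the total work stays O(N + M).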
[Figure: array A and the array C of end points for each key's range]
Bucket sort
assuming each bucket will contain $\Theta(1)$ elements
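A bucket sort that fits this assumption can be sketched as follows. The setup is an illustrative assumption, not taken from the slides: the keys are floats spread roughly uniformly over [0, 1), and there are as many buckets as keys, so each bucket is expected to hold a constant number of elements.

```python
def bucket_sort(a):
    """Sort floats in [0, 1) by distributing them into len(a) buckets."""
    n = len(a)
    if n == 0:
        return []
    buckets = [[] for _ in range(n)]
    for x in a:
        index = min(int(x * n), n - 1)   # clamp guards against rounding at the top end
        buckets[index].append(x)
    out = []
    for bucket in buckets:
        out.extend(sorted(bucket))       # cheap when each bucket holds few elements
    return out


print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))
```

If the keys cluster into a few buckets, the per-bucket sorts dominate and the linear-time behavior is lost.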
Radix Sort
Input (decimal):             4    1    0    10   5    6    1    8
After pass on bit 0 (lsb):   0100 0000 1010 0110 1000 0001 0101 0001
After pass on bit 1:         0100 0000 1000 0001 0101 0001 1010 0110
After pass on bit 2:         0000 1000 0001 0001 1010 0100 0101 0110
After pass on bit 3 (msb):   0000 0001 0001 0100 0101 0110 1000 1010
Output (decimal):            0    1    1    4    5    6    8    10
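The passes in the example above can be reproduced with a least-significant-bit-first radix sort, where each pass is a stable split on one bit. This is a sketch for non-negative integers; the function name `radix_sort` and the `num_bits` parameter are assumptions for illustration.

```python
def radix_sort(a, num_bits):
    """LSD radix sort on non-negative integers, one bit per pass."""
    for bit in range(num_bits):                      # bit 0 (lsb) first, msb last
        zeros = [x for x in a if not (x >> bit) & 1]
        ones = [x for x in a if (x >> bit) & 1]
        a = zeros + ones                             # stable: order within each group is kept
    return a


print(radix_sort([4, 1, 0, 10, 5, 6, 1, 8], 4))   # [0, 1, 1, 4, 5, 6, 8, 10]
```

Stability of each pass is what makes the final order correct: ties on a higher bit are left in the order established by the lower bits.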
External Sorting
External MergeSort
[Figure: disk, CPU, and the array A [ 1 .. N]]
External MergeSort
Approach
Cost annotations on the steps: O(M log M), O(KM log M), O(N log k)
K input buffers, 1 output buffer
Update input buffers one disk-page at a time
Write output buffer one disk-page at a time
How?
K-way merge
Q) How to merge k sorted arrays of total size N in O(N lg k) time?
In memory: sorted lists L1, ..., Lk, with $\sum_{i=1}^{k} |L_i| = M$ (list size labeled r in the figure), merged into one sorted output
[Figure: L1, L2, L3, L4, ..., Lk-1, Lk merged one list at a time into the sorted output; the merged temporary arrays have sizes 2r, 3r, ...]
[Figure: L1, L2, L3, L4, ..., Lk-1, Lk merged in pairs into the sorted output; the merged temporary arrays have sizes 2r, 2r, ..., then 4r, ...]
Run-time Analysis
lg k stages
Total time = $\Theta(N) + \Theta(N) + \dots$ (lg k times) = $O(N \lg k)$
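The staged merging analyzed above can be sketched as repeated pairwise merging: each stage halves the number of lists and touches every element once, giving lg k stages of linear work. The function names below are assumptions for illustration, and everything is kept in memory for simplicity.

```python
def merge(a, b):
    """Merge two sorted lists in time linear in their total size."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def k_way_merge(lists):
    """Merge k sorted lists in about lg k stages of pairwise merges."""
    current = [list(lst) for lst in lists]
    if not current:
        return []
    while len(current) > 1:
        merged = []
        for i in range(0, len(current), 2):
            if i + 1 < len(current):
                merged.append(merge(current[i], current[i + 1]))
            else:
                merged.append(current[i])    # odd list out moves on unchanged
        current = merged                     # one stage done: half as many lists
    return current[0]


print(k_way_merge([[1, 5, 9], [2, 6], [0, 7, 8], [3, 4]]))
```

A min-heap holding the current head of each list reaches the same O(N lg k) bound and is a common alternative when the lists are streamed from disk.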
External MergeSort
P = page size
Accesses = O(N/P)
Sorting: Summary
QuickSort
Optimizations continue
Sort benchmarks
http://sortbenchmark.org/
http://research.microsoft.com/barc/sortbenchmark