
Sorting

Definition
Sorting is a basic operation in
computer science. Sorting refers to
the operation of arranging data in
some given sequence, i.e. increasing
order or decreasing order.

Basic terminology in sorting
Internal and external sorting:
Internal sorting:
An internal sort is any data sorting process that takes place
entirely within the main memory of a computer. This is possible
whenever the data to be sorted is small enough to all be held in
the main memory.
External sorting:
External sorting is required when the data being sorted does
not fit into the main memory of a computing device (usually
RAM) and a slower kind of memory (usually a hard drive)
needs to be used.

Sorting - what for ?

Example: accessing (finding a specific value in)
an unsorted and a sorted array.
Find the name of the person being 10 years old:

Age:   10    36     8     35     1
Name:  Bart  Homer  Lisa  Marge  Maggie

CIS 068

Sorting - what for ?

Unsorted: linear search
Worst case: try n rows => order of magnitude: O(n)
Average case: try n/2 rows => O(n)

Age:   10    36     1       35     8
Name:  Bart  Homer  Maggie  Marge  Lisa

Sorting - what for ?

Sorted: binary search
Worst case: try log(n) <= k <= log(n)+1 rows => O(log n)
Average case: O(log n)
(for a proof see e.g.
http://www.mcs.sdsmt.edu/~ecorwin/cs251/binavg/binavg.htm)

Age:   1       8     10    35     36
Name:  Maggie  Lisa  Bart  Marge  Homer

Sorting - what for ?

Sorting and then accessing is faster than accessing an
unsorted dataset, if multiple (= k) queries occur:

n*log(n) + k*log(n) < k * n    (if k is big enough)

Sorting is crucial to databases, databases are crucial
to data-management, data-management is crucial to
economy, economy is ... sorting seems to be pretty
important!
The question is WHAT (name or age?) and HOW to
sort.


Classification of sorting methods

Comparison-based methods:
  Insertion sorts: insertion sort, Shell sort
  Selection sorts: selection sort, heapsort (tree sorting; in a future lesson)
  Exchange sorts: bubble sort, quick sort
  Merge sorts
Distribution methods: radix sort

Internal sorts: insertion (insertion sort, Shell sort),
selection (selection sort, heapsort),
exchange (bubble sort, quick sort)
External sorts: natural, balanced and polyphase merge

Complexity measure
The complexity of a sorting algorithm
measures the running time as a function
of the number n of items to be sorted.
If a1, a2, ..., an is the set of data to be
sorted and b is an auxiliary location,
the operations counted are:
1) Comparisons: test whether ai < aj or ai < b
2) Interchanges: switch the contents
of ai and aj, or of ai and b
3) Assignments: b = ai, aj = b or aj = ai

Algorithm Measures
Best case: often when the data is already sorted.

Worst case: data completely disorganised, or in reverse
order.

Average case: random order.

Some sorting algorithms are the same for
all three cases; others can be tailored to
suit certain cases.

Quadratic Algorithms

Bubble Sort


Bubble sort
Pass through the array n-1 times,
where n is the number of data items in the
array.
For each pass:
compare each element in the array with its
successor, and interchange the two elements
if they are not in order.

The algorithm:

    void bubble(int x[], int n)
    {
        int i, j, temp;
        for (i = 0; i < n - 1; i++)
            for (j = 0; j < n - 1 - i; j++)  /* the last i elements are already in place */
                if (x[j] > x[j + 1]) {       /* out of order: interchange */
                    temp = x[j];
                    x[j] = x[j + 1];
                    x[j + 1] = temp;
                }
    }

Bubble Sort: Example

The famous method: bubble sort.
[Figure: one pass, and the array after completion of each pass]

Bubble Sort: Algorithm

(Bubble sort) BUBBLE(DATA, N)
DATA is an array with N elements.
1. Repeat Steps 2 and 3 for K = 1 to N-1
2.   Set PTR := 1 [initialize pass pointer PTR]
3.   Repeat while PTR <= N-K: [executes one pass]
       a) If DATA[PTR] > DATA[PTR+1], then
            interchange DATA[PTR] and DATA[PTR+1]
          [End of If]
       b) Set PTR := PTR + 1
     [End of inner loop]
   [End of Step 1 outer loop]
4. Exit.

Bubble Sort: Analysis

Number of comparisons (worst case):
(n-1) + (n-2) + ... + 3 + 2 + 1 => O(n²)

Number of comparisons (best case):
n - 1 => O(n)

Number of exchanges (worst case):
(n-1) + (n-2) + ... + 3 + 2 + 1 => O(n²)

Number of exchanges (best case):
0 => O(1)

Overall worst case: O(n²) + O(n²) = O(n²)

Quadratic Algorithms

Selection Sort


Selection Sort: Example

The brute force method: selection sort.

Selection Sort: Algorithm

last = n - 1

Algorithm:
For i = 0 .. last-1
    find the smallest element M in subarray i .. last
    if M != element at i: swap the elements
Next i

(this is for BASIC-freaks!)

Selection Sort: Analysis

Number of comparisons:
(n-1) + (n-2) + ... + 3 + 2 + 1 =
n * (n-1)/2 =
(n² - n)/2 => O(n²)

Number of exchanges (worst case):
n - 1 => O(n)

Overall (worst case): O(n²) + O(n) = O(n²)
(quadratic sort)

Insertion sort
Insertion Sort(A[MAXSIZE], N)
Let a be an array of n elements, temp a variable used to
interchange two values, k the total number of passes and j
another control variable.
1. Set k = 1.
2. For k = 1 to (n-1):
       set temp = a[k]
       set j = k-1
       while temp < a[j] and (j >= 0) perform the following steps:
           set a[j+1] = a[j]
           set j = j-1
       [End of loop structure]
       Assign the value of temp to a[j+1].
   [End of for loop structure]
3. Exit.

Example

Array: 25, 15, 30, 9, 99, 20, 26

Start:   25 15 30  9 99 20 26

Pass 1: a[1] < a[0], interchange:
         15 25 30  9 99 20 26

Pass 2: a[2] > a[1], remains the same:
         15 25 30  9 99 20 26

Pass 3: a[3] < a[0], a[1] and a[2], so insert a[3] before a[0]:
          9 15 25 30 99 20 26

Pass 4: a[4] > a[3], remains the same:
          9 15 25 30 99 20 26

Pass 5: a[5] < a[2], a[3] and a[4], so insert a[5] before a[2]:
          9 15 20 25 30 99 26

Pass 6: a[6] < a[4] and a[5], so insert a[6] before a[4]:
          9 15 20 25 26 30 99

After this we get the sorted array.

Insertion Sort: Analysis

Number of comparisons (worst case):
(n-1) + (n-2) + ... + 3 + 2 + 1 => O(n²)

Number of comparisons (best case):
n - 1 => O(n)

Number of exchanges (worst case):
(n-1) + (n-2) + ... + 3 + 2 + 1 => O(n²)

Number of exchanges (best case):
0 => O(1)

Overall worst case: O(n²) + O(n²) = O(n²)

Insertion Sort (continued)

Input size: more input means more time.

Running time: the number of primitive operations or
steps executed during a program's execution is the
running time of the algorithm.

Comparison of Quadratic Sorts

                   Comparisons          Exchanges
                   Best      Worst      Best      Worst
Selection Sort     O(n²)     O(n²)      O(1)      O(n)
Bubble Sort        O(n)      O(n²)      O(1)      O(n²)
Insertion Sort     O(n)      O(n²)      O(1)      O(n²)

Result: Quadratic Algorithms

                   advantage                      disadvantage
Selection Sort     if array is in total disorder  if array is presorted
Bubble Sort        if array is presorted          if array is in total disorder
Insertion Sort     if array is presorted          if array is in total disorder

Overall: O(n²) is not acceptable,
since there are n·log(n) algorithms!

n Log(n) Algorithms

Quick Sort

Quick sort

It is also called partition-exchange sort.
In each step, the original sequence is
partitioned into 3 parts:
a. all the items less than the partitioning
element
b. the partitioning element in its final position
c. all the items greater than the partitioning
element
The partitioning process then continues in the left
and right partitions.

OR Quicksort Algorithm
Given an array of n elements (e.g., integers):
If the array only contains one element, return.
Else:
  pick one element to use as the pivot
  partition the elements into two sub-arrays:
    elements less than or equal to the pivot
    elements greater than the pivot
  quicksort the two sub-arrays
  return the results

The partitioning in each step of quicksort

Pick one of the elements as the partitioning
element p, usually the first element of the
sequence.
To find the proper position for p while partitioning
the sequence into 3 parts:
a) quicksort employs two indexes, down and up
b) down goes from left to right to find elements greater
than p
c) up goes from right to left to find elements less than p
d) elements found by up and down are exchanged
e) this continues until up and down meet or pass each
other
f) the final position of p is then pointed to by up
g) exchange p with the element pointed to by up

Pick Pivot Element

There are a number of ways to pick the pivot
element. In this example, we will use the first
element in the array:

 40  20  10  80  60  50   7  30 100
[0] [1] [2] [3] [4] [5] [6] [7] [8]

Partitioning Array
Given a pivot, partition the elements of
the array such that the resulting array
consists of:
1. One sub-array that contains elements <=
pivot
2. Another sub-array that contains elements
> pivot

The sub-arrays are stored in the original
data array.
Partitioning loops through the array, swapping
elements as it goes.

The partition steps:

1. While data[too_big_index] <= data[pivot]:
       ++too_big_index
2. While data[too_small_index] > data[pivot]:
       --too_small_index
3. If too_big_index < too_small_index:
       swap data[too_big_index] and data[too_small_index]
4. While too_small_index > too_big_index, go to 1.
5. Swap data[too_small_index] and data[pivot_index]

Trace (pivot_index = 0, pivot value 40):

 40  20  10  80  60  50   7  30 100    too_big_index stops at 80 ([3]);
                                       too_small_index stops at 30 ([7]); swap

 40  20  10  30  60  50   7  80 100    too_big_index stops at 60 ([4]);
                                       too_small_index stops at 7 ([6]); swap

 40  20  10  30   7  50  60  80 100    too_big_index stops at 50 ([5]);
                                       too_small_index stops at 7 ([4]);
                                       the indexes have crossed, so swap the
                                       pivot with data[too_small_index]

  7  20  10  30  40  50  60  80 100    pivot_index = 4

Partition Result

  7  20  10  30   40   50  60  80 100
[0] [1] [2] [3]  [4]  [5] [6] [7] [8]
 <= data[pivot]  pivot  > data[pivot]

Recursion: Quicksort Sub-arrays

  7  20  10  30       50  60  80 100
[0] [1] [2] [3]      [5] [6] [7] [8]
 <= data[pivot]       > data[pivot]

Quicksort is now applied recursively to each sub-array.

Quicksort Analysis
Assume that keys are random,
uniformly distributed.
What is best case running time?

An example trace of quicksort on
25 57 48 37 12 92 86 33
Subsequent steps (parentheses mark the sub-sequences still to be
partitioned; in each step the pivot is the first element of its
sub-sequence):

 25  57  48  37  12  92  86  33
(12) 25 (48  37  57  92  86  33)     pivot 25 placed
 12  25 (48  37  33  92  86  57)     down and up exchanged 57 and 33
 12  25 (33  37) 48 (92  86  57)     pivot 48 placed
 12  25  33 (37) 48 (92  86  57)     pivot 33 placed
 12  25  33  37  48 (57  86) 92      pivot 92 placed
 12  25  33  37  48  57 (86) 92      pivot 57 placed
 12  25  33  37  48  57  86  92      sorted

Performance considerations of quicksort

Quicksort got its name because it quickly
puts an element into its proper position by
employing two indexes to speed up the
partitioning process and to minimize the
exchanges.
Each pass reduces the comparisons by about
a half; the total number of comparisons is
about O(n·log2 n).
It requires space for the recursive process,
or stacks for an iterative process;
this is about O(log2 n).

Quick Sort: Analysis

Exact analysis is beyond the scope of this course.

The complexity is O(n * log(n)):
  Optimal case: the pivot index splits the array into equal sizes.
  Worst case: size left = 0, size right = n-1 (presorted list).

Interesting case, a presorted list:
  Nothing is done, except (n+1) * n / 2 comparisons.
  Complexity grows up to O(n²)!
  The better the list is presorted, the worse the algorithm
  performs!

The pivot selection is crucial.

In practical situations, a finely tuned
implementation of quicksort beats most sort algorithms,
including sort algorithms whose theoretical complexity
is O(n log n) in the worst case.

Comparison to merge sort:
  Comparable best case performance
  No extra memory needed

Shellsort
We can look at the list as a set of interleaved
sublists
For example, the elements in the even
locations could be one list and the elements in
the odd locations the other list
Shellsort begins by sorting many small lists,
and increases their size and decreases their
number as it continues


Shellsort
One technique is to use decreasing powers
of 2, so that if the list has 64 elements, the
first pass would use 32 lists of 2 elements,
the second pass would use 16 lists of 4
elements, and so on
These lists would be sorted with an
insertion sort


Shellsort Example
8 sublists, 2 elements / sublist, increment = 8
4 sublists, 4 elements / sublist, increment = 4
2 sublists, 8 elements / sublist, increment = 2
1 sublist, 16 elements / sublist, increment = 1


Shellsort Algorithm
passes = floor(lg N)
while (passes >= 1) do
    increment = 2^passes - 1
    for start = 1 to increment do
        InsertionSort(list, N, start, increment)
    end for
    passes = passes - 1
end while

N = 15:
Pass 1: increment = 7, 7 calls, size = 2
Pass 2: increment = 3, 3 calls, size = 5
Pass 3: increment = 1, 1 call, size = 15


Use
Radix sort applies only to integers,
fixed-size strings, floating-point numbers, and
to "less than", "greater than" or
"lexicographic order" comparison
predicates, whereas comparison
sorts can accommodate different
orders.

